Commit 204694c

committed Jan 5, 2023
Optimize out more checks from hot path for BIG5 decoding
This boosts the speed of BIG5 encoding conversion by just 1-2%. I tried various other tweaks to the BIG5 decoding routine to see if I could make it faster at the cost of using a larger conversion table, but at least on the machine I am using for benchmarking, these other changes just made things slower.
1 parent d75c78b commit 204694c
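
The idea in the commit title, moving a rarely true check out of the common path and into the failure branch, can be sketched in isolation. Everything below is a toy and not code from mbstring: `toy_lookup`, `decode_check_early`, `decode_check_late`, `toy_variants_agree`, and `TOY_BAD_INPUT` are hypothetical names, and the sketch assumes the lookup has no mapping under the special lead byte 0xC8, which is what lets the two variants agree. It only covers pairs whose second byte is a plausible trail byte.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_BAD_INPUT 0xFFFFFFFFu /* hypothetical stand-in for MBFL_BAD_INPUT */
#define TOY_RARE_LEAD 0xC8        /* assumed to have no mappings in this toy */

/* Hypothetical lookup: returns 0 when (lead, trail) has no mapping. */
static uint32_t toy_lookup(unsigned char lead, unsigned char trail)
{
    if (lead == TOY_RARE_LEAD) {
        return 0;
    }
    /* Arbitrary toy rule: odd trail bytes are unmapped. */
    return (trail & 1) ? 0 : 0x4E00u + trail;
}

/* Variant 1: the rare lead byte is tested on every two-byte character. */
static uint32_t decode_check_early(const unsigned char *p, size_t *consumed)
{
    unsigned char c = p[0];
    assert(consumed != NULL);
    *consumed = 1;
    if (c <= 0x7F) {
        return c;
    }
    if (c > 0xA0 && c <= 0xF9 && c != TOY_RARE_LEAD) {
        uint32_t w = toy_lookup(c, p[1]);
        *consumed = 2;
        return w ? w : TOY_BAD_INPUT;
    }
    return TOY_BAD_INPUT;
}

/* Variant 2: the rare lead byte is tested only after the lookup has failed. */
static uint32_t decode_check_late(const unsigned char *p, size_t *consumed)
{
    unsigned char c = p[0];
    assert(consumed != NULL);
    *consumed = 1;
    if (c <= 0x7F) {
        return c;
    }
    if (c > 0xA0 && c <= 0xF9) {
        uint32_t w = toy_lookup(c, p[1]);
        *consumed = 2;
        if (!w) {
            if (c == TOY_RARE_LEAD) {
                *consumed = 1; /* give the trail byte back, like p-- in the diff */
            }
            w = TOY_BAD_INPUT;
        }
        return w;
    }
    return TOY_BAD_INPUT;
}

/* Returns 1 if both variants agree for every lead byte and trail bytes 0x40..0x7E. */
static int toy_variants_agree(void)
{
    for (unsigned lead = 0; lead <= 0xFF; lead++) {
        for (unsigned trail = 0x40; trail <= 0x7E; trail++) {
            unsigned char pair[2] = { (unsigned char)lead, (unsigned char)trail };
            size_t n_early, n_late;
            uint32_t early = decode_check_early(pair, &n_early);
            uint32_t late = decode_check_late(pair, &n_late);
            if (early != late || n_early != n_late) {
                return 0;
            }
        }
    }
    return 1;
}
```

In the late variant, characters that map successfully never pay for the comparison against the rare lead byte; only the failure branch does. Whether that is a measurable win depends on the input and the machine, as the commit message itself notes.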

2 files changed, +268 -98 lines changed

‎ext/mbstring/libmbfl/filters/mbfilter_big5.c

+6 -2
@@ -398,7 +398,7 @@ static size_t mb_big5_to_wchar(unsigned char **in, size_t *in_len, uint32_t *buf
 
 		if (c <= 0x7F) {
 			*out++ = c;
-		} else if (c > 0xA0 && c <= 0xF9 && c != 0xC8) {
+		} else if (c > 0xA0 && c <= 0xF9) {
 			/* We don't need to check p < e here; it's not possible that this pointer dereference
 			 * will be outside the input string, because of e-- above */
 			unsigned char c2 = *p++;
@@ -407,8 +407,12 @@ static size_t mb_big5_to_wchar(unsigned char **in, size_t *in_len, uint32_t *buf
 				unsigned int w = (c - 0xA1)*157 + c2 - ((c2 <= 0x7E) ? 0x40 : 0xA1 - 0x3F);
 				ZEND_ASSERT(w < big5_ucs_table_size);
 				w = big5_ucs_table[w];
-				if (!w)
+				if (!w) {
+					if (c == 0xC8) {
+						p--;
+					}
 					w = MBFL_BAD_INPUT;
+				}
 				*out++ = w;
 			} else {
 				*out++ = MBFL_BAD_INPUT;
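
The table index computed in the second hunk can be checked by hand with a small standalone sketch. The helper name `big5_table_index` is hypothetical and does not exist in mbstring. The lead byte range comes from the condition in the diff; the trail byte ranges 0x40..0x7E and 0xA1..0xFE are an assumption based on the usual Big5 layout, since the lines that validate `c2` are not shown in the hunks above. Under that assumption each lead byte owns a row of 63 + 94 = 157 slots, which would explain the multiplier.

```c
#include <assert.h>

/* Hypothetical helper: same arithmetic as the `w = ...` line in the diff.
 * Maps a Big5 (lead, trail) byte pair to a flat table index. */
static unsigned int big5_table_index(unsigned char c, unsigned char c2)
{
    /* Lead byte range taken from the diff's condition (c > 0xA0 && c <= 0xF9). */
    assert(c >= 0xA1 && c <= 0xF9);
    /* Trail byte ranges are an assumption (standard Big5 layout). */
    assert((c2 >= 0x40 && c2 <= 0x7E) || (c2 >= 0xA1 && c2 <= 0xFE));

    /* Low trail bytes 0x40..0x7E fill slots 0..62 of the row.
     * High trail bytes 0xA1..0xFE fill slots 63..156, because
     * c2 - (0xA1 - 0x3F) equals 0x3F (63) when c2 is 0xA1. */
    return (unsigned int)(c - 0xA1) * 157 + c2 - ((c2 <= 0x7E) ? 0x40 : 0xA1 - 0x3F);
}
```

For example, `big5_table_index(0xA1, 0x40)` is 0, `big5_table_index(0xA1, 0xA1)` is 63, and `big5_table_index(0xA2, 0x40)` is 157, the first slot of the next row.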
