Commit 63c50cc

Add AVX2-accelerated version of mb_check_encoding for UTF-8 only
From some local benchmarks which I ran, the AVX2-based version is about 2.8x faster than the SSE2-based version on long (~10,000 byte) strings, 1.6x faster on medium (~100 byte) strings, and just about the same on very short strings.

I followed the example of the code in the 'standard' module, using preprocessor directives so that the code can be compiled in any of 4 ways:

1) With no AVX2 support at all (for example, when PHP is compiled for CPU architectures other than AMD64)
2) For CPUs with AVX2 only (for example, when PHP is built with CCFLAGS='-march=native' on a host which implements AVX2)
3) With runtime detection of AVX2 performed by the dynamic linker; this requires a dynamic linker which supports the STT_GNU_IFUNC symbol type extension to the ELF binary standard. This is true of glibc's dynamic linker, as of late 2009.
4) With runtime detection of AVX2 performed by the module init function. The detection is done by checking the output of CPUID and then a function pointer is set accordingly. In this case, all calls to the UTF-8 validation routine are indirect calls through that function pointer.
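
The fourth build mode (detection in the module init function, then indirect calls through a function pointer) can be sketched roughly as below. This is only an illustration under stated assumptions, not the code from this commit: every name (`utf8_check`, `utf8_check_init`, `utf8_valid_scalar`, `utf8_valid_avx2`, `SKETCH_HAVE_AVX2_DISPATCH`) is hypothetical, the CPUID check is delegated to the GCC/Clang builtin `__builtin_cpu_supports`, and the "AVX2" variant is a deliberately simple stand-in that only skips leading 32-byte blocks of pure ASCII before falling back to the scalar validator.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: GCC or Clang on x86-64; on any other target only the scalar path is built. */
#if defined(__x86_64__) && (defined(__GNUC__) || defined(__clang__))
#  include <immintrin.h>
#  define SKETCH_HAVE_AVX2_DISPATCH 1
#endif

/* Portable scalar UTF-8 validator (hypothetical name, not PHP's). */
static bool utf8_valid_scalar(const unsigned char *s, size_t len)
{
    size_t i = 0;
    while (i < len) {
        unsigned char c = s[i];
        size_t n;          /* number of continuation bytes expected */
        uint32_t min, cp;  /* smallest legal code point for this length; decoded value */

        if (c < 0x80) { i++; continue; }
        else if ((c & 0xE0) == 0xC0) { n = 1; min = 0x80;    cp = c & 0x1F; }
        else if ((c & 0xF0) == 0xE0) { n = 2; min = 0x800;   cp = c & 0x0F; }
        else if ((c & 0xF8) == 0xF0) { n = 3; min = 0x10000; cp = c & 0x07; }
        else return false;               /* stray continuation byte or invalid lead byte */

        if (len - i <= n) return false;  /* sequence truncated at end of input */
        for (size_t k = 1; k <= n; k++) {
            if ((s[i + k] & 0xC0) != 0x80) return false;
            cp = (cp << 6) | (s[i + k] & 0x3F);
        }
        /* Reject overlong forms, values above U+10FFFF, and surrogates. */
        if (cp < min || cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF)) return false;
        i += n + 1;
    }
    return true;
}

#ifdef SKETCH_HAVE_AVX2_DISPATCH
/* Stand-in AVX2 variant: skips leading all-ASCII 32-byte blocks, then validates
 * the remainder with the scalar routine. A real implementation would validate
 * multi-byte sequences with vector instructions as well. */
__attribute__((target("avx2")))
static bool utf8_valid_avx2(const unsigned char *s, size_t len)
{
    size_t i = 0;
    while (len - i >= 32) {
        __m256i v = _mm256_loadu_si256((const __m256i *)(s + i));
        if (_mm256_movemask_epi8(v) != 0) break;  /* some byte has its high bit set */
        i += 32;
    }
    return utf8_valid_scalar(s + i, len - i);
}
#endif

typedef bool (*utf8_check_fn)(const unsigned char *, size_t);

/* Defaults to the portable routine, so calls made before init are still safe. */
static utf8_check_fn utf8_check_impl = utf8_valid_scalar;

/* Plays the role of the module init function: detect AVX2 once, set the pointer. */
void utf8_check_init(void)
{
#ifdef SKETCH_HAVE_AVX2_DISPATCH
    __builtin_cpu_init();
    if (__builtin_cpu_supports("avx2")) {
        utf8_check_impl = utf8_valid_avx2;
    }
#endif
}

/* Every call is an indirect call through the function pointer. */
bool utf8_check(const char *s, size_t len)
{
    assert(s != NULL || len == 0);
    return utf8_check_impl((const unsigned char *)s, len);
}
```

A caller would invoke `utf8_check_init()` once at startup and then use `utf8_check(str, len)` everywhere. The cost of this mode is one indirect call per validation. Mode 3 gets similar dispatch by having the dynamic linker resolve the symbol once at load time, and mode 2 avoids dispatch altogether by compiling only the AVX2 path.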
1 parent d14ed12 commit 63c50cc

1 file changed: +372 -4 lines changed