Commit d835de1

Tony Su authored
[zend_hash]: Use AVX2 instructions for better code efficiency (#10858)

We prefer to use AVX2 instructions for code efficiency improvement
1) Reduce instruction path length
   Generic x86 Instr: 16, SSE2: 6, AVX2: 4
2) Better ICache locality and density

To enable AVX2 instructions, compile with '-mavx2' option via CFLAGS
environment variable or command line argument.

Note: '-mavx' option still leads to using SSE2 instructions.
_mm256_cmpeq_epi64() requires AVX2 (-mavx2).

Testing:
Build with and without '-mavx2', 'make TEST_PHP_ARGS=-j8 test'
presented the same test report.

Signed-off-by: Tony Su <[email protected]>

1 parent cd0c6bc commit d835de1
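
The commit message notes that '-mavx' still ends up on the SSE2 path. A minimal sketch of why, assuming GCC or Clang on x86-64: '-mavx' defines __AVX__ but not __AVX2__, so an "#if defined(__AVX2__) / #elif defined(__SSE2__)" chain falls through to the SSE2 branch. The helper names simd_path() and report_simd_path() are hypothetical and do not exist in php-src.

```c
#include <stdio.h>

/* Hypothetical helper (not part of php-src): names the branch that an
 * "#if defined(__AVX2__) / #elif defined(__SSE2__)" chain selects
 * for the current compiler flags. */
static const char *simd_path(void)
{
#if defined(__AVX2__)
	return "AVX2";
#elif defined(__SSE2__)
	return "SSE2";
#else
	return "generic";
#endif
}

/* Print the selected branch; call this from main() to inspect a build. */
static void report_simd_path(void)
{
	printf("selected path: %s\n", simd_path());
}
```

Building a small program around report_simd_path() once with '-mavx' and once with '-mavx2' should show the difference the note describes.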

1 file changed, +12 -2 lines changed

Zend/zend_hash.c

@@ -26,7 +26,10 @@
 # include <arm_neon.h>
 #endif
 
-#ifdef __SSE2__
+/* Prefer to use AVX2 instructions for better latency and throughput */
+#if defined(__AVX2__)
+# include <immintrin.h>
+#elif defined( __SSE2__)
 # include <mmintrin.h>
 # include <emmintrin.h>
 #endif
@@ -176,7 +179,14 @@ static zend_always_inline void zend_hash_real_init_mixed_ex(HashTable *ht)
 	HT_SET_DATA_ADDR(ht, data);
 	/* Don't overwrite iterator count. */
 	ht->u.v.flags = HASH_FLAG_STATIC_KEYS;
-#ifdef __SSE2__
+#if defined(__AVX2__)
+	do {
+		__m256i ymm0 = _mm256_setzero_si256();
+		ymm0 = _mm256_cmpeq_epi64(ymm0, ymm0);
+		_mm256_storeu_si256((__m256i*)&HT_HASH_EX(data, 0), ymm0);
+		_mm256_storeu_si256((__m256i*)&HT_HASH_EX(data, 8), ymm0);
+	} while(0);
+#elif defined (__SSE2__)
 	do {
 		__m128i xmm0 = _mm_setzero_si128();
 		xmm0 = _mm_cmpeq_epi8(xmm0, xmm0);
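
The AVX2 branch in the hunk above can be tried outside php-src with a sketch like the following. It is an illustration under assumptions, not the php-src implementation: a plain uint32_t array stands in for the region addressed by HT_HASH_EX(), and 4-byte slots are assumed, so two 32-byte stores at indices 0 and 8 cover 16 slots. The names fill_invalid_slots, SLOT_COUNT and INVALID_SLOT are hypothetical. Comparing a zeroed register with itself sets every bit, which gives the all-ones pattern without loading a constant from memory.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#if defined(__AVX2__)
# include <immintrin.h>
#elif defined(__SSE2__)
# include <emmintrin.h>
#endif

/* Hypothetical stand-ins, not names from php-src. */
#define SLOT_COUNT   16u         /* assumed: 16 slots x 4 bytes = 64 bytes */
#define INVALID_SLOT UINT32_MAX  /* all bits set */

/* Set every slot to all-ones using the widest path the build allows. */
static void fill_invalid_slots(uint32_t *slots)
{
	assert(slots != NULL);
#if defined(__AVX2__)
	__m256i ones = _mm256_setzero_si256();
	ones = _mm256_cmpeq_epi64(ones, ones);            /* 0 == 0 in each lane: all bits set */
	_mm256_storeu_si256((__m256i *)&slots[0], ones);  /* slots 0..7 */
	_mm256_storeu_si256((__m256i *)&slots[8], ones);  /* slots 8..15 */
#elif defined(__SSE2__)
	__m128i ones = _mm_setzero_si128();
	ones = _mm_cmpeq_epi8(ones, ones);                /* same trick, 16 bytes at a time */
	_mm_storeu_si128((__m128i *)&slots[0], ones);
	_mm_storeu_si128((__m128i *)&slots[4], ones);
	_mm_storeu_si128((__m128i *)&slots[8], ones);
	_mm_storeu_si128((__m128i *)&slots[12], ones);
#else
	memset(slots, 0xFF, SLOT_COUNT * sizeof(uint32_t)); /* portable fallback */
#endif
}
```

Read this way, the AVX2 path is one zeroing, one compare and two stores, while the SSE2 path needs four stores for the same 64 bytes. That appears to match the 4 versus 6 instruction counts quoted in the commit message, although the message does not spell out how they were counted.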
