Commit d0b29d8

Optimize strspn()
The current implementation uses a nested loop (for + goto), which has complexity O(|s1| * |s2|). If we instead use a lookup table, the complexity drops to O(|s1| + |s2|). This is conceptually the same strategy that common C library implementations such as glibc and musl use. The variation with a bitvector instead of a table also gives a speed-up, but the table variation was about 1.34x faster.

On microbenchmarks this easily gave a 5x speedup.

This can bring a 1.4-1.5% performance improvement in the Symfony benchmark.

Closes GH-12431.
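
The commit message mentions a bitvector variation that was measured but not kept. The commit does not show that code, so the following is only a sketch of what such a variant might look like: one bit per byte value, packed into eight 32-bit words instead of a 256-entry bool table. The name strspn_bitvector and the choice of uint32_t words are assumptions made for illustration and are not taken from php-src.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical name; not part of the commit. Same contract as php_strspn():
 * returns the length of the leading run of haystack bytes that all occur
 * in [characters, characters_end). */
static size_t strspn_bitvector(const char *haystack, const char *characters,
                               const char *haystack_end, const char *characters_end)
{
	assert(haystack <= haystack_end && characters <= characters_end);

	/* 256 bits, one per possible byte value, packed into 8 words of 32 bits. */
	uint32_t bits[8] = {0};

	/* Mark every byte of the character set: word index c / 32, bit index c % 32. */
	for (; characters < characters_end; characters++) {
		unsigned char c = (unsigned char) *characters;
		bits[c >> 5] |= (uint32_t) 1 << (c & 31);
	}

	/* Advance while the current haystack byte has its bit set. */
	const char *ptr = haystack;
	while (ptr < haystack_end) {
		unsigned char c = (unsigned char) *ptr;
		if (!((bits[c >> 5] >> (c & 31)) & 1)) {
			break;
		}
		ptr++;
	}

	return (size_t) (ptr - haystack);
}
```

The bitvector occupies 32 bytes instead of 256, so clearing it is cheaper, but every lookup needs a shift and a mask on top of the load. That extra work per haystack byte is one plausible reason the table version benchmarked faster; the commit itself only reports the measurement.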
1 parent 14404ac commit d0b29d8

2 files changed, +34 -10 lines changed

UPGRADING (+3)
@@ -145,3 +145,6 @@ PHP 8.4 UPGRADE NOTES
 * The performance of DOMNode::C14N() is greatly improved for the case without
   an xpath query. This can give a time improvement of easily two order of
   magnitude for documents with tens of thousands of nodes.
+
+* The performance of strspn() is greatly improved. It now runs in linear time
+  instead of being bounded by quadratic time.

ext/standard/string.c (+31 -10)
@@ -1597,19 +1597,40 @@ PHPAPI char *php_stristr(char *s, char *t, size_t s_len, size_t t_len)
 /* }}} */
 
 /* {{{ php_strspn */
-PHPAPI size_t php_strspn(const char *s1, const char *s2, const char *s1_end, const char *s2_end)
+PHPAPI size_t php_strspn(const char *haystack, const char *characters, const char *haystack_end, const char *characters_end)
 {
-	const char *p = s1, *spanp;
-	char c = *p;
-
-cont:
-	for (spanp = s2; p != s1_end && spanp != s2_end;) {
-		if (*spanp++ == c) {
-			c = *(++p);
-			goto cont;
+	/* Fast path for short strings.
+	 * The table lookup cannot be faster in this case because we not only have to compare, but also build the table.
+	 * We only compare in this case.
+	 * Empirically tested that the table lookup approach is only beneficial if characters is longer than 1 character. */
+	if (characters_end - characters == 1) {
+		const char *ptr = haystack;
+		while (ptr < haystack_end && *ptr == *characters) {
+			ptr++;
 		}
+		return ptr - haystack;
+	}
+
+	/* Every character in characters will set a boolean in this lookup table.
+	 * We'll use the lookup table as a fast lookup for the characters in characters while looping over haystack. */
+	bool table[256];
+	/* Use multiple small memsets to inline the memset with intrinsics, trick learned from glibc. */
+	memset(table, 0, 64);
+	memset(table + 64, 0, 64);
+	memset(table + 128, 0, 64);
+	memset(table + 192, 0, 64);
+
+	while (characters < characters_end) {
+		table[(unsigned char) *characters] = true;
+		characters++;
+	}
+
+	const char *ptr = haystack;
+	while (ptr < haystack_end && table[(unsigned char) *ptr]) {
+		ptr++;
 	}
-	return (p - s1);
+
+	return ptr - haystack;
 }
 /* }}} */
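
One way to check that the rewrite preserves behavior is to lift the old and new logic into a standalone file and compare them. This is a sketch under stated assumptions: span_nested is a goto-free restatement of the removed loop, span_table omits the single-character fast path and the split memset from the patch, and spans_agree is a hypothetical helper. None of these names exist in php-src.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Pre-patch logic, restated without goto: for every haystack byte the
 * character set is scanned from the start, hence O(|s1| * |s2|). */
static size_t span_nested(const char *s1, const char *s2,
                          const char *s1_end, const char *s2_end)
{
	const char *p = s1;
	while (p != s1_end) {
		const char *spanp = s2;
		while (spanp != s2_end && *spanp != *p) {
			spanp++;
		}
		if (spanp == s2_end) {
			break; /* *p does not occur in the set */
		}
		p++;
	}
	return (size_t) (p - s1);
}

/* Post-patch logic: build a 256-entry membership table once, then scan
 * the haystack with one lookup per byte, hence O(|s1| + |s2|). */
static size_t span_table(const char *haystack, const char *characters,
                         const char *haystack_end, const char *characters_end)
{
	bool table[256] = {false};
	for (; characters < characters_end; characters++) {
		table[(unsigned char) *characters] = true;
	}
	const char *ptr = haystack;
	while (ptr < haystack_end && table[(unsigned char) *ptr]) {
		ptr++;
	}
	return (size_t) (ptr - haystack);
}

/* Hypothetical helper: true when both versions return the same span
 * for a pair of NUL-terminated strings. */
static bool spans_agree(const char *haystack, const char *characters)
{
	assert(haystack != NULL && characters != NULL);
	const char *h_end = haystack + strlen(haystack);
	const char *c_end = characters + strlen(characters);
	return span_nested(haystack, characters, h_end, c_end)
		== span_table(haystack, characters, h_end, c_end);
}
```

Calling spans_agree over a set of inputs that includes an empty haystack, an empty character set, and bytes above 0x7F exercises the edge cases where the two loops differ most in structure.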