path: root/regexec.c
Age | Commit message | Author
2024-07-26 | Fix memory leak in String#start_with? when regexp times out | Peter Zhu
[Bug #20653]

This commit refactors how Onigmo handles timeouts. Instead of raising a timeout error, onig_search returns ONIGERR_TIMEOUT, so that the caller can free memory and then raise the timeout error itself.

This fixes a memory leak in String#start_with? when the regexp times out. For example:

    regex = Regexp.new("^#{"(a*)" * 10_000}x$", timeout: 0.000001)
    str = "a" * 1000000 + "x"

    10.times do
      100.times do
        str.start_with?(regex)
      rescue
      end

      puts `ps -o rss= -p #{$$}`
    end

Before:

    33216 51936 71152 81728 97152 103248 120384 133392 133520 133616

After:

    14912 15376 15824 15824 16128 16128 16144 16144 16160 16160

Notes: Merged: https://github.com/ruby/ruby/pull/11247
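The pattern this commit describes, reporting the timeout through a return value so that the caller can clean up before raising, can be sketched in C as follows. This is a minimal illustration under stated assumptions, not the Onigmo or Ruby source: `sketch_search`, `sketch_match`, and the `SKETCH_*` codes are hypothetical names, and the `simulate_timeout` flag stands in for a real timeout check.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical status codes; these are not the real ONIGERR_* values. */
enum { SKETCH_OK = 0, SKETCH_ERR_TIMEOUT = -1, SKETCH_ERR_MEMORY = -2 };

/* Stand-in for a search routine. It reports a timeout through its return
 * value instead of jumping out of the function, as raising would. */
int sketch_search(int simulate_timeout)
{
    return simulate_timeout ? SKETCH_ERR_TIMEOUT : SKETCH_OK;
}

/* Caller pattern: allocate, search, free on every path, then report.
 * *freed is set to 1 once the scratch buffer has been released. */
int sketch_match(int simulate_timeout, int *freed)
{
    assert(freed != NULL);
    *freed = 0;

    char *scratch = malloc(64);
    if (scratch == NULL) return SKETCH_ERR_MEMORY;

    int r = sketch_search(simulate_timeout);

    /* Runs whether or not the search timed out, so nothing leaks. */
    free(scratch);
    *freed = 1;

    /* Only at this point would a caller turn SKETCH_ERR_TIMEOUT into an error. */
    return r;
}
```

Because the search never unwinds past the caller, the cleanup code cannot be skipped, which is the property a raise from inside the search would break.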
2024-07-25 | Fix memory leak in Regexp capture group when timeout | Peter Zhu
[Bug #20650]

The capture group allocates memory that is leaked when it times out. For example:

    re = Regexp.new("^#{"(a*)" * 10_000}x$", timeout: 0.000001)
    str = "a" * 1000000 + "x"

    10.times do
      100.times do
        re =~ str
      rescue Regexp::TimeoutError
      end

      puts `ps -o rss= -p #{$$}`
    end

Before:

    34688 56416 78288 100368 120784 140704 161904 183568 204320 224800

After:

    16288 16288 16880 16896 16912 16928 16944 17184 17184 17200

Notes: Merged: https://github.com/ruby/ruby/pull/11238
2024-04-25 | [Bug #20453] segfault in Regexp timeout | Daniel Colson
https://bugs.ruby-lang.org/issues/20228 started freeing `stk_base` to avoid a memory leak. But `stk_base` is sometimes stack allocated (using `xalloca`), so the free is only valid if the regex stack has grown enough to hit `stack_double` (which uses `xmalloc` and `xrealloc`).

To reproduce the problem on master and 3.3.1:

```ruby
Regexp.timeout = 0.001
/^(a*)x$/ =~ "a" * 1000000 + "x"
```

Some details about this potential fix:

`stk_base == stk_alloc` on [init](https://github.com/ruby/ruby/blob/dde99215f2bc60c22a00fc941ff7f714f011e920/regexec.c#L1153), so if `stk_base != stk_alloc` we can be sure we called [`stack_double`](https://github.com/ruby/ruby/blob/dde99215f2bc60c22a00fc941ff7f714f011e920/regexec.c#L1210) and it's safe to free. It's also safe to free if we've [saved](https://github.com/ruby/ruby/blob/dde99215f2bc60c22a00fc941ff7f714f011e920/regexec.c#L1187-L1189) the stack to `msa->stack_p`, since we do the `stk_base != stk_alloc` check before saving.

This matches the check we do inside [`stack_double`](https://github.com/ruby/ruby/blob/dde99215f2bc60c22a00fc941ff7f714f011e920/regexec.c#L1221).
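The "free only if the buffer has moved off its initial storage" rule can be sketched in C as below. This is an illustration under assumptions, not the code in `regexec.c`: the `SketchStack` type and the `sketch_*` functions are hypothetical, and a caller-owned array plays the role of the `xalloca` buffer so the example stays portable.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define SKETCH_INIT_SIZE 4

typedef struct {
    int *base;     /* current buffer (plays the role of stk_base) */
    int *initial;  /* caller-owned initial buffer (plays the role of stk_alloc) */
    size_t cap;    /* capacity of base, in elements */
} SketchStack;

void sketch_init(SketchStack *s, int *initial_buf)
{
    assert(s != NULL && initial_buf != NULL);
    s->base = s->initial = initial_buf;
    s->cap = SKETCH_INIT_SIZE;
}

/* Doubles the capacity. The first growth copies the initial buffer to the
 * heap; later growths can use realloc because base is already heap memory. */
int sketch_double(SketchStack *s)
{
    size_t new_cap = s->cap * 2;
    int *p;

    if (s->base == s->initial) {
        p = malloc(new_cap * sizeof(int));
        if (p == NULL) return -1;
        memcpy(p, s->base, s->cap * sizeof(int));
    } else {
        p = realloc(s->base, new_cap * sizeof(int));
        if (p == NULL) return -1;
    }
    s->base = p;
    s->cap = new_cap;
    return 0;
}

/* Frees base only when it no longer points at the initial buffer.
 * Returns 1 if heap memory was released, 0 if there was nothing to free. */
int sketch_release(SketchStack *s)
{
    if (s->base == s->initial) return 0;  /* never pass non-heap memory to free */
    free(s->base);
    s->base = s->initial;
    s->cap = SKETCH_INIT_SIZE;
    return 1;
}
```

Comparing the two pointers is enough to tell the cases apart because the only code path that changes `base` is the one that allocates on the heap.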
2024-04-23 | Fix Use-After-Free issue for Regexp | Hiroshi SHIBATA
Co-authored-by: Isaac Peka
2024-04-23 | Fix handling of reg->dmin in Regex matching | Isaac Peka
2024-02-27 | [Bug #20305] Fix matching against an incomplete character | Nobuyoshi Nakada
When matching against an incomplete character, some `enclen` calls are expected not to exceed the limit, while others are expected to return the required length, which is then checked against the limit.
2024-02-07 | [Bug #20239] Fix overflow at down-casting | Nobuyoshi Nakada
2024-02-02 | Fix memory leak in stk_base when Regexp timeout | Peter Zhu
[Bug #20228] If rb_reg_check_timeout raises a Regexp::TimeoutError, then stk_base will leak.
2024-01-29 | Correctly handle consecutive lookarounds (#9738) | Hiroya Fujinami
Fix [Bug #20207] Fix [Bug #20212] The handling of consecutive lookarounds in init_cache_opcodes is buggy and causes the invalid memory accesses reported in [Bug #20207] and [Bug #20212]. This fixes it by using recursive functions to detect lookaround nesting correctly.
2024-01-10 | Fix to work match cache with peek next optimization (#9459) | Hiroya Fujinami
2023-12-30 | Reduce `if` for decreasing counter on OP_REPEAT_INC (#9393) | Hiroya Fujinami
This commit also reduces the warning `'stkp' may be used uninitialized in this function`.
2023-12-29 | Fix [Bug #20098]: set counter value for {n,m} repetition correctly (#9391) | Hiroya Fujinami
2023-12-28 | Fix [Bug #20083]: correct a cache point size for atomic groups (#9367) | Hiroya Fujinami
2023-11-16 | Fix regex match cache out-of-bounds access | Alan Wu
Previously the following read and wrote 1 byte out-of-bounds:

    $ valgrind ruby -e 'p /(\W+)[bx]\?/i.match? "aaaaaa aaaaaaaaa aaaa aaaaaaaa aaa aaaaxaaaaaaaaaaa aaaaa aaaaaaaaaaaa a ? aaa aaaa a ?"' 2> >(grep Invalid -A 30)

Because of the `match_cache_point_index + 1` in memoize_extended_match_cache_point() and check_extended_match_cache_point(), we need one more byte of space.
2023-10-30 | Optimize regexp matching for look-around and atomic groups (#7931) | Hiroya Fujinami
2023-07-27 | Add function rb_reg_onig_match | Peter Zhu
rb_reg_onig_match performs preparation, error handling, and cleanup for matching a regex against a string. This reduces repetitive code and removes the need for StringScanner to access internal data of regex.
Notes: Merged: https://github.com/ruby/ruby/pull/8123
2023-06-30 | Don't check for null pointer in calls to free | Peter Zhu
According to the C99 specification section 7.20.3.2 paragraph 2:

> If ptr is a null pointer, no action occurs.

So we do not need to check that the pointer is a null pointer.

Notes: Merged: https://github.com/ruby/ruby/pull/8004
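A small C sketch of what this guarantee allows in cleanup code: optional fields can be passed to `free` without a guard. The `SketchRecord` type and `sketch_record_clear` function are hypothetical names chosen for illustration and do not come from the commit.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct {
    char *name;    /* may be NULL if never set */
    int  *values;  /* may be NULL if never allocated */
} SketchRecord;

/* Releases whatever the record owns. No `if (ptr)` guards are needed,
 * because free(NULL) is defined to do nothing. */
void sketch_record_clear(SketchRecord *r)
{
    assert(r != NULL);
    free(r->name);
    free(r->values);
    r->name = NULL;
    r->values = NULL;
}
```

Resetting the fields to NULL afterwards also makes a second call to `sketch_record_clear` harmless, for the same reason.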
2023-05-22 | Allow the match cache optimization for atomic groups (#7804) | TSUYUSATO Kitsune
Notes: Merged-By: makenowjust
2023-05-13 | Remove warnings and errors in `regexec.c` with `ONIG_DEBUG_...` macros (#7803) | TSUYUSATO Kitsune
Notes: Merged-By: makenowjust
2023-05-04 | Delay start of the match cache optimization (#7738) | TSUYUSATO Kitsune
Notes: Merged-By: makenowjust
2023-04-19 | Refactor `Regexp#match` cache implementation (#7724) | TSUYUSATO Kitsune
* Refactor Regexp#match cache implementation
  Improved variable and function names
  Fixed [Bug 19537] (Maybe fixed in https://github.com/ruby/ruby/pull/7694)
* Add a comment of the glossary for "match cache"
* Skip to reset match cache when no cache point on null check
Notes: Merged-By: makenowjust
2023-04-16 | Fix `PLATFORM_GET_INC` | Nobuyoshi Nakada
On platforms where unaligned word access is not allowed, and if `sizeof(val)` and `sizeof(type)` differ:

- `val` > `type`: `val` will contain garbage.
- `val` < `type`: memory outside `val` will be clobbered.

Notes: Merged: https://github.com/ruby/ruby/pull/7723
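One way to avoid both failure modes is to copy exactly `sizeof(type)` bytes into a temporary of that type and only then assign to `val`. The macro below is a hedged sketch of that idea, not the actual `PLATFORM_GET_INC` definition: `SKETCH_GET_INC` and `sketch_read_pair` are hypothetical names.

```c
#include <assert.h>
#include <string.h>

/* Reads one `type` from the byte pointer p without requiring alignment,
 * stores it in val (which may be a wider type), and advances p.
 * The copy size comes from `type`, never from `val`. */
#define SKETCH_GET_INC(val, p, type) do { \
    type tmp_;                            \
    memcpy(&tmp_, (p), sizeof(type));     \
    (val) = tmp_;                         \
    (p) += sizeof(type);                  \
} while (0)

/* Reads a short followed by an int from a possibly unaligned buffer.
 * Both destinations are long, so sizeof(val) differs from sizeof(type). */
long sketch_read_pair(const unsigned char *p, long *second)
{
    long first;

    assert(p != NULL && second != NULL);
    SKETCH_GET_INC(first, p, short);
    SKETCH_GET_INC(*second, p, int);
    return first;
}
```

Going through the temporary means the ordinary assignment performs the widening conversion, so no bytes of `val` are left uninitialized and nothing beyond `val` is written.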
2023-04-12 | [Bug #19587] Fix `reset_match_cache` arguments | Nobuyoshi Nakada
Notes: Merged: https://github.com/ruby/ruby/pull/7694
2023-04-12 | Constify | Nobuyoshi Nakada
Notes: Merged: https://github.com/ruby/ruby/pull/7694
2023-04-12 | Extract `bsearch_cache_index` function | Nobuyoshi Nakada
Notes: Merged: https://github.com/ruby/ruby/pull/7694
2023-03-13 | [Bug #19476]: correct cache index computation for repetition (#7457) | TSUYUSATO Kitsune
Notes: Merged-By: makenowjust
2023-03-13 | [Bug #19467] correct cache points and counting failure on `OP_ANYCHAR_STAR_PEEK_NEXT` (#7454) | TSUYUSATO Kitsune
Notes: Merged-By: makenowjust
2022-12-28 | Fix [Bug 19273], set correct value to `outer_repeat` on `OP_REPEAT` (#7035) | TSUYUSATO Kitsune
Notes: Merged-By: makenowjust
2022-12-22 | Adjust style [ci skip] | Nobuyoshi Nakada
2022-12-14 | Add `Regexp.linear_time?` (#6901) | TSUYUSATO Kitsune
Notes: Merged-By: makenowjust
2022-12-12 | Make absent operator work at the end of the input string | Yusuke Endoh
https://bugs.ruby-lang.org/issues/19104#change-100542
Notes: Merged: https://github.com/ruby/ruby/pull/6902
2022-11-17 | Add default cases for cache point finding function | TSUYUSATO Kitsune
Notes: Merged: https://github.com/ruby/ruby/pull/6744
2022-11-17 | Add OP_CCLASS_MB case | TSUYUSATO Kitsune
Notes: Merged: https://github.com/ruby/ruby/pull/6744
2022-11-09 | Reduce warnings | TSUYUSATO Kitsune
2022-11-09 | Use long instead of int | TSUYUSATO Kitsune
2022-11-09 | Check for integer overflow in the allocation of match_cache table | Yusuke Endoh
2022-11-09 | Ensure that the table size for CACHE_MATCH fits with int | Yusuke Endoh
Currently, the keys for CACHE_MATCH are handled as an `int` type, so we should make sure the table size is smaller than the range of `int`.
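A check of this kind can be written without ever performing the multiplication that might overflow. The function below is a minimal sketch under assumptions: `sketch_table_size` and its `rows`/`cols` parameters are hypothetical, and the real table size computation in `regexec.c` may use different factors.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Stores rows * cols in *out if the product fits in an int.
 * Returns 0 on success and -1 if the table would be too large. */
int sketch_table_size(size_t rows, size_t cols, int *out)
{
    assert(out != NULL);

    /* Divide instead of multiplying, so the test itself cannot overflow. */
    if (cols != 0 && rows > (size_t)INT_MAX / cols) return -1;

    *out = (int)(rows * cols);
    return 0;
}
```

A caller would treat the -1 result like any other allocation failure, for example by skipping the cache or returning a memory error.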
2022-11-09 | Prevent GCC warnings | Yusuke Endoh
```
regexec.c: In function ‘reset_match_cache’:
regexec.c:1259:56: warning: suggest parentheses around ‘-’ inside ‘<<’ [-Wparentheses]
 1259 |     match_cache[k1 >> 3] &= ((1 << (8 - (k2 & 7) - 1)) - 1 << ((k2 & 7) + 1)) | ((1 << (k1 & 7)) - 1);
      |                              ~~~~~~~~~~~~~~~~~~~~~~~~~~^~~
regexec.c:1269:60: warning: suggest parentheses around ‘-’ inside ‘<<’ [-Wparentheses]
 1269 |         match_cache[k2 >> 3] &= ((1 << (8 - (k2 & 7) - 1)) - 1 << ((k2 & 7) + 1));
      |                                  ~~~~~~~~~~~~~~~~~~~~~~~~~~^~~
regexec.c: In function ‘find_cache_index_table’:
regexec.c:1192:11: warning: ‘m’ may be used uninitialized [-Wmaybe-uninitialized]
 1192 |   if (!(0 <= m && m < num_cache_table && table[m].addr == p)) {
      |         ~~^~~~
regexec.c: In function ‘match_at’:
regexec.c:1238:12: warning: ‘m1’ is used uninitialized [-Wuninitialized]
 1238 |   if (table[m1].addr < pbegin && m1 + 1 < num_cache_table) m1++;
      |            ^
regexec.c:1218:39: note: ‘m1’ was declared here
 1218 |   int l = 0, r = num_cache_table - 1, m1, m2;
      |                                       ^~
regexec.c:1239:12: warning: ‘m2’ is used uninitialized [-Wuninitialized]
 1239 |   if (table[m2].addr > pend && m2 - 1 > 0) m2--;
      |            ^
regexec.c:1218:43: note: ‘m2’ was declared here
 1218 |   int l = 0, r = num_cache_table - 1, m1, m2;
      |                                           ^~
```
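The `-Wparentheses` warnings arise because binary `-` binds more tightly than `<<` in C, so `x - 1 << n` already means `(x - 1) << n`; GCC only asks for the grouping to be spelled out. The sketch below writes the mask from the warning with explicit parentheses. It is an illustration, not the patched line: `sketch_high_mask` is a hypothetical helper, and it uses unsigned arithmetic where the original uses `int`.

```c
#include <assert.h>

/* Returns an 8-bit mask with every bit above position k set.
 * Example: k = 2 gives 0xF8 (bits 3 through 7). */
unsigned sketch_high_mask(unsigned k)
{
    assert(k < 8);  /* bit position inside one byte */

    /* Explicit parentheses around the subtraction: same value as the
     * unparenthesized form, but no -Wparentheses warning. */
    return ((1u << (8 - k - 1)) - 1) << (k + 1);
}
```

The two uninitialized-variable warnings are a separate class and are typically silenced by giving the variables an initial value at the point of declaration.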
2022-11-09 | Return ONIGERR_MEMORY if it fails to allocate memory for cache_match_opt | Yusuke Endoh
2022-11-09 | Revert "Refactor field names" | TSUYUSATO Kitsune
This reverts commit 1e6673d6bbd2adbf555d82c7c0906ceb148ed6ee.
2022-11-09 | Refactor field names | TSUYUSATO Kitsune
2022-11-09 | Remove debug printf | TSUYUSATO Kitsune
2022-11-09 | Clear cache on OP_NULL_CHECK_END_MEMST | TSUYUSATO Kitsune
2022-11-09 | Support OP_REPEAT and OP_REPEAT_INC | TSUYUSATO Kitsune
2022-11-09 | Reduce warnings | TSUYUSATO Kitsune
2022-11-09 | Fix to compile when USE_CACHE_MATCH_OPT is disabled | TSUYUSATO Kitsune
2022-11-09 | Enable optimization for PUSH_IF/OR opcodes | TSUYUSATO Kitsune
2022-11-09 | Enable optimization for ANYCHAR_STAR opcodes | TSUYUSATO Kitsune
2022-11-09 | Add index to the latest NULL_CHECK_STACK for fast matching | TSUYUSATO Kitsune
2022-11-09 | Add static declaration to new functions | TSUYUSATO Kitsune