path: root/test/ruby/test_regexp.rb
Age | Commit message | Author
2023-01-30 | Fix parsing of regexps that toggle extended mode on/off inside regexp | Jeremy Evans
This was broken in ec3542229b29ec93062e9d90e877ea29d3c19472. That commit didn't handle cases where extended mode was turned on/off inside the regexp. There are two ways to turn extended mode on/off:

```
/(?-x:#y)#z
/x =~ '#y'
/(?-x)#y(?x)#z
/x =~ '#y'
```

These can be nested inside the same regexp:

```
/(?-x:(?x)#x
(?-x)#y)#z
/x =~ '#y'
```

As you can probably imagine, this makes handling these regexps somewhat complex. Due to the nesting inside portions of regexps, the unescape_nonascii function needs to be recursive. In recursive mode, it needs to track both opening and closing parentheses, similar to how it already tracked opening and closing brackets for character classes.

When scanning the regexp and coming to `(?` not followed by `#`, scan for options, and use `x` and `-` to determine whether to turn on or off extended mode. For `:`, which indicates that only the current regexp section should have extended mode switched, recurse with the extended mode set or unset. For `)`, which indicates that the remainder of the regexp (or the current regexp portion if already recursing) should have extended mode turned on or off, just change the extended mode flag and keep scanning.

While testing this, I noticed that `a`, `d`, and `u` are accepted as options, in addition to `i`, `m`, and `x`, but I can't see where those options are documented. I'm not sure whether or not handling `a`, `d`, and `u` as options is a bug.

Fixes [Bug #19379]

Notes: Merged: https://github.com/ruby/ruby/pull/7192
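The scanning rules described in this message can be sketched in plain Ruby. This is only an illustration under stated assumptions: `strip_comments` is a hypothetical helper written for this example, not the C function in `re.c`, and it only removes comments rather than unescaping anything. It recurses on every group so that a `(?x)` or `(?-x)` toggle stays local to the enclosing group, and it tracks character-class depth so that `#` inside brackets is kept.

```ruby
# Hypothetical helper (not Ruby's implementation): removes "#" comments from a
# regexp source while tracking extended mode. Returns [stripped_text, next_pos].
def strip_comments(src, extended, pos = 0, nested = false)
  out = +""
  class_depth = 0
  while pos < src.length
    ch = src[pos]
    if ch == "\\"                          # escaped character: copy verbatim
      out << src[pos, 2]
      pos += 2
    elsif ch == "["                        # character classes may nest
      class_depth += 1
      out << ch
      pos += 1
    elsif ch == "]" && class_depth > 0
      class_depth -= 1
      out << ch
      pos += 1
    elsif class_depth > 0                  # "#" is literal inside a class
      out << ch
      pos += 1
    elsif ch == "#" && extended            # comment runs to end of line
      newline = src.index("\n", pos)
      pos = newline ? newline + 1 : src.length
    elsif ch == "("
      if src[pos, 3] == "(?#"              # inline comment group
        close = src.index(")", pos)
        pos = close ? close + 1 : src.length
      elsif (m = /\G\(\?([a-z]*)(-[a-z]*)?(\)|:)/.match(src, pos))
        mode = extended
        mode = true if m[1].include?("x")
        mode = false if m[2].to_s.include?("x")
        out << m[0]
        pos = m.end(0)
        if m[3] == ":"                     # (?-x:...) affects this group only
          inner, pos = strip_comments(src, mode, pos, true)
          out << inner
        else                               # (?-x) affects the rest of the group
          extended = mode
        end
      else                                 # ordinary group: recurse, same mode
        out << ch
        inner, pos = strip_comments(src, extended, pos + 1, true)
        out << inner
      end
    elsif ch == ")" && nested              # end of the group being scanned
      out << ch
      return [out, pos + 1]
    else
      out << ch
      pos += 1
    end
  end
  [out, pos]
end

puts strip_comments("(?-x:#y)#z\n", true).first              # prints (?-x:#y)
puts strip_comments("(?-x)#y(?x)#z\n", true).first           # prints (?-x)#y(?x)
puts strip_comments("(?-x:(?x)#x\n(?-x)#y)#z\n", true).first # prints (?-x:(?x)(?-x)#y)
```

In all three inputs the `#y` survives because extended mode is off at that point, while `#x` and `#z` are dropped as comments.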
2022-12-28 | Fix [Bug 19273], set correct value to `outer_repeat` on `OP_REPEAT` (#7035) | TSUYUSATO Kitsune
Notes: Merged-By: makenowjust <[email protected]>
2022-12-22 | Always issue deprecation warning when calling Regexp.new with 3rd positional argument | Jeremy Evans
Previously, only certain values of the 3rd argument triggered a deprecation warning. First step for fix for bug #18797. Support for the 3rd argument will be removed after the release of Ruby 3.2.

Fix minor fallout discovered by the tests.

Co-authored-by: Nobuyoshi Nakada <[email protected]>

Notes: Merged: https://github.com/ruby/ruby/pull/6976
2022-12-22 | Share argument parsing in `Regexp#initialize` and `Regexp.linear_time?` | Nobuyoshi Nakada
Notes: Merged: https://github.com/ruby/ruby/pull/6988
2022-12-14 | Add `Regexp.linear_time?` (#6901) | TSUYUSATO Kitsune
Notes: Merged-By: makenowjust <[email protected]>
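A brief usage sketch for `Regexp.linear_time?`, assuming a Ruby that provides it (3.2 or later); the two patterns are illustrative choices, not taken from the commit. A plain repetition can be matched in linear time, whereas a pattern with a backreference is reported as not linear.

```ruby
# Illustrative patterns; the names are made up for this example.
patterns = { simple: /a+b/, backref: /(a)\1/ }

if Regexp.respond_to?(:linear_time?)
  patterns.each do |name, re|
    puts "#{name}: linear_time? = #{Regexp.linear_time?(re)}"
  end
else
  puts "Regexp.linear_time? is not available on Ruby #{RUBY_VERSION}"
end
```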
2022-12-12 | Make absent operator work at the end of the input string | Yusuke Endoh
https://bugs.ruby-lang.org/issues/19104#change-100542

Notes: Merged: https://github.com/ruby/ruby/pull/6902
2022-11-27 | Relax a too strict timeout | Takashi Kokubun
Regexp tests are flaky. http://rubyci.s3.amazonaws.com/s390x/ruby-master/log/20221128T050004Z.fail.html.gz
2022-11-24 | Relax the timeout of TestRegexp#test_cache_optimization_square | Yusuke Endoh
It fails on riscv (QEmu) http://rubyci.s3.amazonaws.com/debian-riscv64/ruby-master/log/20221124T000021Z.fail.html.gz

```
1) Error:
TestRegexp#test_cache_optimization_square:
Regexp::TimeoutError: regexp match timeout
/home/rubyci/chkbuild/tmp/build/20221124T000021Z/ruby/test/ruby/test_regexp.rb:1693:in `<main>'
/home/rubyci/chkbuild/tmp/build/20221124T000021Z/ruby/test/ruby/test_regexp.rb:1688:in `test_cache_optimization_square'
```
2022-11-19 | Avoid a timeout on test_cache_optimization_exponential | Takashi Kokubun
The timeout seems too short for some CIs. http://rubyci.s3.amazonaws.com/debian11-aarch64/ruby-master/log/20221120T012840Z.fail.html.gz
2022-11-11 | Allow a float error for Regexp.timeout | Yusuke Endoh
The tests failed on Windows: https://github.com/ruby/ruby/actions/runs/3440997073/jobs/5740085169#step:18:62

```
1) Failure:
TestRegexp#test_s_timeout [D:/a/ruby/ruby/src/test/ruby/test_regexp.rb:1586]:
<0.30000000000000004> expected but was <0.3>.

2) Failure:
TestRegexp#test_timeout_shorter_than_global [D:/a/ruby/ruby/src/test/ruby/test_regexp.rb:1631]:
<0.30000000000000004> expected but was <0.3>.
```
2022-11-11 | Run EnvUtil.apply_timeout_scale outside of assert_separately | Yusuke Endoh
It does not work well in assert_separately
2022-11-09 | Update timeout seconds for square test | TSUYUSATO Kitsune
2022-11-09 | Update timeout seconds | TSUYUSATO Kitsune
2022-11-09 | Fix and add regexp tests | TSUYUSATO Kitsune
2022-10-24 | Fix per-instance Regexp timeout (#6621) | Yusuke Endoh
Fix per-instance Regexp timeout

This makes it follow what was decided in [Bug #19055]:

* `Regexp.new(str, timeout: nil)` should respect the global timeout
* `Regexp.new(str, timeout: huge_val)` should use the maximum value that can be represented in the internal representation
* `Regexp.new(str, timeout: 0 or negative value)` should raise an error

Notes: Merged-By: mame <[email protected]>
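The rules listed above can be checked with a small sketch, assuming a Ruby that supports the `timeout:` keyword (3.2 or later). `timeout_of` is a hypothetical helper for this example; it reports the error class instead of propagating it, and no slow match is actually run.

```ruby
# Hypothetical helper: returns the per-instance timeout Regexp.new stores for
# +value+, or a string describing the error it raised.
def timeout_of(value)
  Regexp.new("a", timeout: value).timeout
rescue StandardError => e
  "#{e.class}: #{e.message}"
end

if Regexp.method_defined?(:timeout)
  p timeout_of(nil)   # no per-instance timeout; the global one applies
  p timeout_of(1.5)   # a positive value is kept (up to float rounding)
  p timeout_of(0)     # zero is rejected
  p timeout_of(-1)    # negative values are rejected
else
  puts "Regexp#timeout is not available on Ruby #{RUBY_VERSION}"
end
```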
2022-10-10 | Add MatchData#deconstruct/deconstruct_keys | Vladimir Dementyev
Notes: Merged: https://github.com/ruby/ruby/pull/6216
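These two methods are what let a `MatchData` take part in pattern matching. A minimal sketch, assuming Ruby 3.2 or later for the methods themselves; `parse_year_month` and the date format are made up for this example.

```ruby
# Hypothetical helper: extracts [year, month] from a "YYYY-MM" string using
# hash-style pattern matching on the MatchData's named captures.
def parse_year_month(str)
  m = /(?<year>\d{4})-(?<month>\d{2})/.match(str)
  return nil unless m.respond_to?(:deconstruct_keys)  # no match, or older Ruby

  case m
  in { year:, month: }          # uses MatchData#deconstruct_keys
    [year.to_i, month.to_i]
  else
    nil
  end
end

p parse_year_month("2022-10")
p parse_year_month("not a date")
```

Array patterns such as `in [year, month]` would go through `MatchData#deconstruct` instead, which returns the captures as an array.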
2022-06-20 | [Feature #18788] Support options as `String` to `Regexp.new` | Nobuyoshi Nakada
`Regexp.new` now supports passing the regexp flags not only as an `Integer`, but also as a `String`. Unknown flags raise errors.

Notes: Merged: https://github.com/ruby/ruby/pull/6039
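A short sketch of the string form next to the integer form, assuming Ruby 3.2 or later (older versions treat any truthy non-integer second argument as "ignore case"). `options_for` and `string_flags_supported?` are hypothetical helpers for this example.

```ruby
# Hypothetical helper: returns the option bits Regexp.new derives from +flags+,
# or a string describing the ArgumentError it raised.
def options_for(flags)
  Regexp.new("abc", flags).options
rescue ArgumentError => e
  "ArgumentError: #{e.message}"
end

# Assumption: string flags are interpreted letter by letter from Ruby 3.2 on.
def string_flags_supported?
  (RUBY_VERSION.split(".").first(2).map(&:to_i) <=> [3, 2]) >= 0
end

if string_flags_supported?
  p options_for("i")  == options_for(Regexp::IGNORECASE)
  p options_for("mx") == options_for(Regexp::MULTILINE | Regexp::EXTENDED)
  p options_for("z")  # an unknown flag is reported as an error
else
  puts "String flags are not interpreted on Ruby #{RUBY_VERSION}"
end
```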
2022-06-20 | Warn suspicious flag to `Regexp.new` | Nobuyoshi Nakada
Now the second argument should be `true`, `false`, `nil` or an Integer. This flag is sometimes confused with the third argument.

Notes: Merged: https://github.com/ruby/ruby/pull/6039
2022-06-06 | Ignore invalid escapes in regexp comments | Jeremy Evans
Invalid escapes are handled at multiple levels. The first level is in parse.y, so skip invalid unicode escape checks for regexps in parse.y.

Make rb_reg_preprocess and unescape_nonascii accept the regexp options. In unescape_nonascii, if the regexp is an extended regexp, when "#" is encountered, ignore all characters until the end of line or end of regexp.

Unfortunately, in extended regexps, you can use "#" as a non-comment character inside a character class, so also parse "[" and "]" specially for extended regexps, and only skip comments if "#" is not inside a character class. Handle nested character classes as well.

This issue doesn't just affect extended regexps, it also affects "(?#" comments inside all regexps. So for those comments, scan until trailing ")" and ignore content inside.

I'm not sure if there are other corner cases not handled. A better fix would be to redesign the regexp parser so that it unescaped during parsing instead of before parsing, so you already know the current parsing state.

Fixes [Bug #18294]

Co-authored-by: Nobuyoshi Nakada <[email protected]>

Notes: Merged: https://github.com/ruby/ruby/pull/5721 Merged-By: jeremyevans <[email protected]>
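The three kinds of `#` that this message distinguishes can be seen in a few one-liners. The patterns below are illustrative and deliberately avoid invalid escapes, so they should behave the same regardless of whether this fix is present.

```ruby
# "#" inside a character class is a literal, even in extended mode.
in_class = /[#]/x =~ "#"

# "(?#...)" is an inline comment in any mode.
inline = /a(?#note)b/ =~ "ab"

# In extended mode, "#" outside a class starts a comment that runs to the
# end of the line; the whitespace around "b" is ignored as well.
trailing = /a # note
            b/x =~ "ab"

p [in_class, inline, trailing]  # each entry is the match position
```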
2022-04-12 | Just free compiled pattern if no space is used | Nobuyoshi Nakada
https://hackerone.com/reports/1220911

Notes: Merged: https://github.com/ruby/ruby/pull/5793
2022-04-05 | Apply timescale configuration for tests of Regexp.timeout | Yusuke Endoh
2022-03-31 | Return only captured range in `MatchData` [Bug #18670] | Nobuyoshi Nakada
Notes: Merged: https://github.com/ruby/ruby/pull/5740 Merged-By: nobu <[email protected]>
2022-03-31 | re.c: stop a wrong warning of "flags ignored" on Regexp.new(//) | Yusuke Endoh
[Bug #18669]
2022-03-30 | re.c: raise Regexp::TimeoutError instead of RuntimeError | Yusuke Endoh
Notes: Merged: https://github.com/ruby/ruby/pull/5703
2022-03-30 | re.c: Add `timeout` keyword for Regexp.new and Regexp#timeout | Yusuke Endoh
Notes: Merged: https://github.com/ruby/ruby/pull/5703
2022-03-30 | re.c: Add Regexp.timeout= and Regexp.timeout | Yusuke Endoh
[Feature #17837]

Notes: Merged: https://github.com/ruby/ruby/pull/5703
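A minimal sketch of the process-wide setting, assuming Ruby 3.2 or later. `with_regexp_timeout` is a hypothetical helper for this example; it restores the previous value so the global setting does not leak into other code, and it only reads the value back rather than running a slow match.

```ruby
# Hypothetical helper: runs the block with Regexp.timeout set to +seconds+,
# then restores whatever value was configured before.
def with_regexp_timeout(seconds)
  previous = Regexp.timeout
  Regexp.timeout = seconds
  yield
ensure
  Regexp.timeout = previous
end

if Regexp.respond_to?(:timeout=)
  inside = with_regexp_timeout(2.0) { Regexp.timeout }
  puts "inside the block: #{inside.inspect}"
  puts "after the block:  #{Regexp.timeout.inspect}"
else
  puts "Regexp.timeout= is not available on Ruby #{RUBY_VERSION}"
end
```

The value read back is a Float and may differ from the one assigned by a rounding error, which is why the tests in this file compare it with a tolerance.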
2022-03-29 | Fix multiplex backreferences near end of string in regexp match | Jeremy Evans
Idea from Jirka Marsik.

Fixes [Bug #18631]

Notes: Merged: https://github.com/ruby/ruby/pull/5710
2022-03-13 | add some tests for Unicode Version 14.0.0 | Martin Dürst
2022-02-19 | Add String#byteindex, String#byterindex, and MatchData#byteoffset (#5518) | Shugo Maeda
* Add String#byteindex, String#byterindex, and MatchData#byteoffset [Feature #13110] Co-authored-by: NARUSE, Yui <[email protected]> Notes: Merged-By: shugo <[email protected]>
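The difference between character offsets and byte offsets only shows up with multibyte text. A small sketch, assuming Ruby 3.2 or later and UTF-8 strings; the sample word is an arbitrary choice in which "é" occupies two bytes.

```ruby
word = "h\u00E9llo"   # "héllo": the second character is two bytes in UTF-8

if word.respond_to?(:byteindex)
  m = /l+/.match(word)
  puts "index:       #{word.index("l")}"       # character offset
  puts "byteindex:   #{word.byteindex("l")}"   # byte offset of the same "l"
  puts "byterindex:  #{word.byterindex("l")}"  # byte offset of the last "l"
  puts "offset(0):     #{m.offset(0).inspect}"     # character range of the match
  puts "byteoffset(0): #{m.byteoffset(0).inspect}" # byte range of the match
else
  puts "String#byteindex is not available on Ruby #{RUBY_VERSION}"
end
```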
2021-10-01 | Avoid race condition in Regexp#match | Jeremy Evans
In certain conditions, Regexp#match could return a MatchData with missing captures. This seems to require, at the least, multiple threads calling a method that calls the same block/proc/lambda which calls Regexp#match.

The race condition happens because the MatchData is passed indirectly via the backref, and other threads can modify the backref.

Fix the issue by:

1. Not reusing the existing MatchData from the backref, and always allocating a new MatchData.
2. Passing the MatchData directly to the caller using a VALUE*, instead of indirectly through the backref.

It's likely that variants of this issue exist for other Regexp methods. Anywhere that MatchData is passed implicitly through the backref is probably vulnerable to this issue.

Fixes [Bug #17507]

Notes: Merged: https://github.com/ruby/ruby/pull/4734
2021-09-17 | [Feature #18172] Fix duplicate test name | Nobuyoshi Nakada
2021-09-16 | [Feature #18172] Add MatchData#match_length | Nobuyoshi Nakada
Adds a method that returns the length of the matched substring corresponding to the given argument.

Notes: Merged: https://github.com/ruby/ruby/pull/4851
2021-09-16 | [Feature #18172] Add MatchData#match | Nobuyoshi Nakada
Adds a method that returns the single matched substring corresponding to the given argument.

Notes: Merged: https://github.com/ruby/ruby/pull/4851
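A usage sketch for the two methods, assuming Ruby 3.1 or later. The pattern and input are made up for this example; the third group is optional and does not participate in the match, which shows the `nil` case.

```ruby
m = /(\w+)@(\w+)(\.\w+)?/.match("user@host")

if m.respond_to?(:match_length)
  p m.match(1)          # substring captured by group 1
  p m.match_length(1)   # its length in characters
  p m.match(3)          # nil: the optional group did not participate
  p m.match_length(3)   # nil for the same reason
else
  puts "MatchData#match is not available on Ruby #{RUBY_VERSION}"
end
```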
2021-09-12 | Preserve the encoding of the argument in IndexError [Bug #18160] | Nobuyoshi Nakada
Notes: Merged: https://github.com/ruby/ruby/pull/4822
2021-05-12 | Fix handling of control/meta escapes in literal regexps | Jeremy Evans
Ruby uses a recursive algorithm for handling control/meta escapes in strings (read_escape). However, the equivalent code for regexps (tokadd_escape) did not use a recursive algorithm. Due to this, handling of control/meta escapes in regexps did not have the same behavior as in strings, leading to behavior such as the following returning nil:

```ruby
/\c\xFF/ =~ "\c\xFF"
```

Switch the code for handling \c, \C and \M in literal regexps to use the same code as for strings (read_escape), to keep behavior consistent between the two.

Fixes [Bug #14367]

Notes: Merged: https://github.com/ruby/ruby/pull/4495
2021-03-16 | test/ruby/test_regexp.rb: Avoid "ambiguity between regexp and two divisions" | Yusuke Endoh
2021-03-15 | Check backref number buffer overrun [Bug #16376] | xtkoba (Tee KOBAYASHI)
2021-01-13 | Capture to reserved name variables if already defined [Bug #17533] | Nobuyoshi Nakada
Notes: Merged: https://github.com/ruby/ruby/pull/4059
2020-12-18 | Use category: :deprecated in warnings that are related to deprecation | Jeremy Evans
Also document that both :deprecated and :experimental are supported :category option values.

The locations where warnings were marked as deprecation warnings were previously reviewed by shyouhei.

Comment a couple of locations where deprecation warnings should probably be used but are not currently used because deprecation warning enablement has not occurred at the time they are called (RUBY_FREE_MIN, RUBY_HEAP_MIN_SLOTS, -K).

Add assert_deprecated_warn to test assertions. Use this to simplify some tests, and fix failing tests after marking some warnings with deprecated category.

Notes: Merged: https://github.com/ruby/ruby/pull/3917
2020-12-18 | use eval to create different Regexp objects | Koichi Sasada
Only one warning is shown for the same Regexp object, so create different objects to support repeating tests. http://ci.rvm.jp/results/trunk-repeat20@phosphorus-docker/3290658
2020-12-17 | test/ruby: Check warning messages at a finer granularity | Nobuyoshi Nakada
Check warnings per test instead of suppressing all warnings wholesale in each test script by setting `$VERBOSE` to `nil` in `setup` methods.

Notes: Merged: https://github.com/ruby/ruby/pull/3925 Merged-By: nobu <[email protected]>
2020-12-02 | Do not reduce quantifiers if it affects which text will be matched | Jeremy Evans
Quantifier reduction when using +?)* and +?)+ should not be done as it affects which text will be matched.

This removes the need for the RQ_PQ_Q ReduceType, so remove the enum entry and related switch case.

Test that these are the only two patterns affected by testing all quantifier reduction tuples for both the captured and uncaptured cases and making sure the matched text is the same for both.

Fixes [Bug #17341]

Notes: Merged: https://github.com/ruby/ruby/pull/3808
2020-11-28 | [Feature #17136] Remove special behavior from $KCODE | Nobuyoshi Nakada
Notes: Merged: https://github.com/ruby/ruby/pull/3483
2020-11-27 | Separated tests for $KCODE and $= | Nobuyoshi Nakada
2020-11-24 | Detect the premature end of char property in regexp | Jeremy Evans
Default to ONIGERR_INVALID_CHAR_PROPERTY_NAME in fetch_char_property_to_ctype and only set otherwise if an ending } is found.

Fixes [Bug #17340]

Notes: Merged: https://github.com/ruby/ruby/pull/3807
2020-01-16 | `Regexp` in `MatchData` can be `nil` | Nobuyoshi Nakada
`String#sub` with a string pattern defers creating a `Regexp` until `MatchData#regexp` creates a `Regexp` from the matched string. `Regexp.last_match(group_name)` accessed its content without creating the `Regexp` though. [Bug #16508]
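The deferred creation described here can be observed from Ruby code. In this sketch `regexp_after_string_sub` is a hypothetical helper; it performs a `String#sub` with a string pattern and then asks the resulting `MatchData` for its regexp, which is built at that point from the matched text (with metacharacters quoted).

```ruby
# Hypothetical helper: runs String#sub with a *string* pattern, then returns
# the Regexp that MatchData#regexp builds on demand (nil if nothing matched).
def regexp_after_string_sub(str, pattern, replacement)
  str.sub(pattern, replacement)    # sets the last match for this method's frame
  Regexp.last_match&.regexp
end

re = regexp_after_string_sub("a.b", ".", "-")
puts re.source          # the "." is escaped, so it is matched literally
puts re.match?("a.b")   # true
puts re.match?("axb")   # false
```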
2020-01-15 | Freeze Regexp literals | Jean Boussier
[Feature #8948] [Feature #16377]

Since Regexp literals always reference the same instance, allowing them to be mutated can lead to state leaks.

Notes: Merged: https://github.com/ruby/ruby/pull/2705
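A quick check of the resulting behavior, assuming Ruby 3.0 or later: a literal is frozen, while an object built with `Regexp.new` is not. `try_mutate` is a hypothetical helper that reports the error class instead of raising.

```ruby
# Hypothetical helper: tries to attach state to +re+ and reports what happened.
def try_mutate(re)
  re.instance_variable_set(:@note, "state")
  :mutated
rescue FrozenError => e
  e.class
end

literal = /abc/
dynamic = Regexp.new("abc")

puts "literal frozen? #{literal.frozen?}"
puts "dynamic frozen? #{dynamic.frozen?}"
puts "mutating literal: #{try_mutate(literal)}"
puts "mutating dynamic: #{try_mutate(dynamic)}"
```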
2019-12-04 | Revert "Regexp#match{?} with nil raises TypeError as String, Symbol (#1506)" | NARUSE, Yui
This reverts commit 2a22a6b2d8465934e75520a7fdcf522d50890caf. Revert [Feature #13083]
2019-12-04 | Revert "Revert nil error and adding deprecation message" | NARUSE, Yui
This reverts commit 452bee3ee8d68059fabd9b1c7a75661b14e3933e.
2019-11-06 | Undefine MatchData.allocate [Feature #16294] | Nobuyoshi Nakada
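The effect of this change is easy to observe, assuming Ruby 2.7 or later: `MatchData` objects can only come from an actual match. `try_allocate_match_data` is a hypothetical helper that reports the error class instead of raising.

```ruby
# Hypothetical helper: attempts to create an empty MatchData directly.
def try_allocate_match_data
  MatchData.allocate
rescue NoMethodError => e
  e.class
end

puts "responds to allocate? #{MatchData.respond_to?(:allocate)}"
puts "calling it gives:     #{try_allocate_match_data}"
puts "from a real match:    #{/a/.match("a").class}"
```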