Age | Commit message (Collapse) | Author |
|
This was broken in ec3542229b29ec93062e9d90e877ea29d3c19472. That commit
didn't handle cases where extended mode was turned on/off inside the
regexp. There are two ways to turn extended mode on/off:
```
/(?-x:#y)#z
/x =~ '#y'
/(?-x)#y(?x)#z
/x =~ '#y'
```
These can be nested inside the same regexp:
```
/(?-x:(?x)#x
(?-x)#y)#z
/x =~ '#y'
```
As you can probably imagine, this makes handling these regexps
somewhat complex. Due to the nesting inside portions of regexps,
the unassign_nonascii function needs to be recursive. In
recursive mode, it needs to track both opening and closing
parentheses, similar to how it already tracked opening and
closing brackets for character classes.
When scanning the regexp and coming to `(?` not followed by `#`,
scan for options, and use `x` and `i` to determine whether to
turn on or off extended mode. For `:`, indicting only the
current regexp section should have the extended mode
switched, recurse with the extended mode set or unset. For `)`,
indicating the remainder of the regexp (or current regexp portion
if already recursing) should turn extended mode on or off, just
change the extended mode flag and keep scanning.
While testing this, I noticed that `a`, `d`, and `u` are accepted
as options, in addition to `i`, `m`, and `x`, but I can't see
where those options are documented. I'm not sure whether or not
handling `a`, `d`, and `u` as options is a bug.
Fixes [Bug #19379]
Notes:
Merged: https://github.com/ruby/ruby/pull/7192
|
|
Notes:
Merged-By: makenowjust <[email protected]>
|
|
argument
Previously, only certain values of the 3rd argument triggered a
deprecation warning.
First step for fix for bug #18797. Support for the 3rd argument
will be removed after the release of Ruby 3.2.
Fix minor fallout discovered by the tests.
Co-authored-by: Nobuyoshi Nakada <[email protected]>
Notes:
Merged: https://github.com/ruby/ruby/pull/6976
|
|
Notes:
Merged: https://github.com/ruby/ruby/pull/6988
|
|
Notes:
Merged-By: makenowjust <[email protected]>
|
|
https://bugs.ruby-lang.org/issues/19104#change-100542
Notes:
Merged: https://github.com/ruby/ruby/pull/6902
|
|
Regexp tests are flaky.
http://rubyci.s3.amazonaws.com/s390x/ruby-master/log/20221128T050004Z.fail.html.gz
|
|
It fails on riscv (QEmu)
http://rubyci.s3.amazonaws.com/debian-riscv64/ruby-master/log/20221124T000021Z.fail.html.gz
```
1) Error:
TestRegexp#test_cache_optimization_square:
Regexp::TimeoutError: regexp match timeout
/home/rubyci/chkbuild/tmp/build/20221124T000021Z/ruby/test/ruby/test_regexp.rb:1693:in `<main>'
/home/rubyci/chkbuild/tmp/build/20221124T000021Z/ruby/test/ruby/test_regexp.rb:1688:in `test_cache_optimization_square'
```
|
|
The timeout seems too short for some CIs.
http://rubyci.s3.amazonaws.com/debian11-aarch64/ruby-master/log/20221120T012840Z.fail.html.gz
|
|
The tests failed on windows
https://github.com/ruby/ruby/actions/runs/3440997073/jobs/5740085169#step:18:62
```
1) Failure:
TestRegexp#test_s_timeout [D:/a/ruby/ruby/src/test/ruby/test_regexp.rb:1586]:
<0.30000000000000004> expected but was
<0.3>.
2) Failure:
TestRegexp#test_timeout_shorter_than_global [D:/a/ruby/ruby/src/test/ruby/test_regexp.rb:1631]:
<0.30000000000000004> expected but was
<0.3>.
```
|
|
It does not work well in assert_separately
|
|
|
|
|
|
|
|
Fix per-instance Regexp timeout
This makes it follow what was decided in [Bug #19055]:
* `Regexp.new(str, timeout: nil)` should respect the global timeout
* `Regexp.new(str, timeout: huge_val)` should use the maximum value that
can be represented in the internal representation
* `Regexp.new(str, timeout: 0 or negative value)` should raise an error
Notes:
Merged-By: mame <[email protected]>
|
|
Notes:
Merged: https://github.com/ruby/ruby/pull/6216
|
|
`Regexp.new` now supports passing the regexp flags not only as an
`Integer`, but also as a `String. Unknown flags raise errors.
Notes:
Merged: https://github.com/ruby/ruby/pull/6039
|
|
Now second argument should be `true`, `false`, `nil` or Integer.
This flag is confused with third argument some times.
Notes:
Merged: https://github.com/ruby/ruby/pull/6039
|
|
Invalid escapes are handled at multiple levels. The first level
is in parse.y, so skip invalid unicode escape checks for regexps
in parse.y.
Make rb_reg_preprocess and unescape_nonascii accept the regexp
options. In unescape_nonascii, if the regexp is an extended
regexp, when "#" is encountered, ignore all characters until the
end of line or end of regexp.
Unfortunately, in extended regexps, you can use "#" as a non-comment
character inside a character class, so also parse "[" and "]"
specially for extended regexps, and only skip comments if "#" is
not inside a character class. Handle nested character classes as well.
This issue doesn't just affect extended regexps, it also affects
"(#?" comments inside all regexps. So for those comments, scan
until trailing ")" and ignore content inside.
I'm not sure if there are other corner cases not handled. A
better fix would be to redesign the regexp parser so that it
unescaped during parsing instead of before parsing, so you already
know the current parsing state.
Fixes [Bug #18294]
Co-authored-by: Nobuyoshi Nakada <[email protected]>
Notes:
Merged: https://github.com/ruby/ruby/pull/5721
Merged-By: jeremyevans <[email protected]>
|
|
https://hackerone.com/reports/1220911
Notes:
Merged: https://github.com/ruby/ruby/pull/5793
|
|
|
|
Notes:
Merged: https://github.com/ruby/ruby/pull/5740
Merged-By: nobu <[email protected]>
|
|
[Bug #18669]
|
|
Notes:
Merged: https://github.com/ruby/ruby/pull/5703
|
|
Notes:
Merged: https://github.com/ruby/ruby/pull/5703
|
|
[Feature #17837]
Notes:
Merged: https://github.com/ruby/ruby/pull/5703
|
|
Idea from Jirka Marsik.
Fixes [Bug #18631]
Notes:
Merged: https://github.com/ruby/ruby/pull/5710
|
|
|
|
* Add String#byteindex, String#byterindex, and MatchData#byteoffset [Feature #13110]
Co-authored-by: NARUSE, Yui <[email protected]>
Notes:
Merged-By: shugo <[email protected]>
|
|
In certain conditions, Regexp#match could return a MatchData with
missing captures. This seems to require at the least, multiple
threads calling a method that calls the same block/proc/lambda
which calls Regexp#match.
The race condition happens because the MatchData is passed from
indirectly via the backref, and other threads can modify the
backref.
Fix the issue by:
1. Not reusing the existing MatchData from the backref, and always
allocating a new MatchData.
2. Passing the MatchData directly to the caller using a VALUE*,
instead of indirectly through the backref.
It's likely that variants of this issue exist for other Regexp
methods. Anywhere that MatchData is passed implicitly through
the backref is probably vulnerable to this issue.
Fixes [Bug #17507]
Notes:
Merged: https://github.com/ruby/ruby/pull/4734
|
|
|
|
The method to return the length of the matched substring
corresponding to the given argument.
Notes:
Merged: https://github.com/ruby/ruby/pull/4851
|
|
The method to return the single matched substring corresponding to
the given argument.
Notes:
Merged: https://github.com/ruby/ruby/pull/4851
|
|
Notes:
Merged: https://github.com/ruby/ruby/pull/4822
|
|
Ruby uses a recursive algorithm for handling control/meta escapes
in strings (read_escape). However, the equivalent code for regexps
(tokadd_escape) in did not use a recursive algorithm. Due to this,
Handling of control/meta escapes in regexp did not have the same
behavior as in strings, leading to behavior such as the following
returning nil:
```ruby
/\c\xFF/ =~ "\c\xFF"
```
Switch the code for handling \c, \C and \M in literal regexps to
use the same code as for strings (read_escape), to keep behavior
consistent between the two.
Fixes [Bug #14367]
Notes:
Merged: https://github.com/ruby/ruby/pull/4495
|
|
|
|
|
|
Notes:
Merged: https://github.com/ruby/ruby/pull/4059
|
|
Also document that both :deprecated and :experimental are supported
:category option values.
The locations where warnings were marked as deprecation warnings
was previously reviewed by shyouhei.
Comment a couple locations where deprecation warnings should probably
be used but are not currently used because deprecation warning
enablement has not occurred at the time they are called
(RUBY_FREE_MIN, RUBY_HEAP_MIN_SLOTS, -K).
Add assert_deprecated_warn to test assertions. Use this to simplify
some tests, and fix failing tests after marking some warnings with
deprecated category.
Notes:
Merged: https://github.com/ruby/ruby/pull/3917
|
|
Only one warning is shown for the same Regexp object, so create
different objects to support repeating tests.
http://ci.rvm.jp/results/trunk-repeat20@phosphorus-docker/3290658
|
|
Instead of suppressing all warnings wholly in each test scripts by
setting `$VERBOSE` to `nil` in `setup` methods.
Notes:
Merged: https://github.com/ruby/ruby/pull/3925
Merged-By: nobu <[email protected]>
|
|
Quantifier reduction when using +?)* and +?)+ should not be done
as it affects which text will be matched.
This removes the need for the RQ_PQ_Q ReduceType, so remove the
enum entry and related switch case.
Test that these are the only two patterns affected by testing all
quantifier reduction tuples for both the captured and uncaptured
cases and making sure the matched text is the same for both.
Fixes [Bug #17341]
Notes:
Merged: https://github.com/ruby/ruby/pull/3808
|
|
Notes:
Merged: https://github.com/ruby/ruby/pull/3483
|
|
|
|
Default to ONIGERR_INVALID_CHAR_PROPERTY_NAME in
fetch_char_property_to_ctype and only set otherwise if an ending
} is found.
Fixes [Bug #17340]
Notes:
Merged: https://github.com/ruby/ruby/pull/3807
|
|
`String#sub` with a string pattern defers creating a `Regexp`
until `MatchData#regexp` creates a `Regexp` from the matched
string. `Regexp#last_match(group_name)` accessed its content
without creating the `Regexp` though. [Bug #16508]
|
|
[Feature #8948] [Feature #16377]
Since Regexp literals always reference the same instance,
allowing to mutate them can lead to state leak.
Notes:
Merged: https://github.com/ruby/ruby/pull/2705
|
|
This reverts commit 2a22a6b2d8465934e75520a7fdcf522d50890caf.
Revert [Feature #13083]
|
|
This reverts commit 452bee3ee8d68059fabd9b1c7a75661b14e3933e.
|
|
|