author | Jeremy Evans <[email protected]> | 2023-11-30 09:53:01 -0800 |
---|---|---|
committer | Jeremy Evans <[email protected]> | 2023-11-30 10:40:40 -0800 |
commit | 060f14bf62ad3f426a6666901c45b82d4334fa26 (patch) | |
tree | d663f15cf44ebd1e8626934265b1253014e5eaab /doc/_regexp.rdoc | |
parent | f75fef66221e55ce9e9e302cfd8ee22062527c6c (diff) |
Update documentation for [[:word:]] and \p{Word} in regexps
Onigmo uses Decimal_Number and not Number for these.
Fixes [Bug #19417]
Diffstat (limited to 'doc/_regexp.rdoc')
-rw-r--r-- | doc/_regexp.rdoc | 30 |
1 files changed, 19 insertions, 11 deletions
diff --git a/doc/_regexp.rdoc b/doc/_regexp.rdoc
index ffba14e78f..7b71eee984 100644
--- a/doc/_regexp.rdoc
+++ b/doc/_regexp.rdoc
@@ -838,13 +838,17 @@ These are also commonly used:
 - <tt>/\p{Emoji}/</tt>: Unicode emoji.
 - <tt>/\p{Graph}/</tt>: Non-blank character
   (excludes spaces, control characters, and similar).
-- <tt>/\p{Word}/</tt>: A member of one of the following Unicode character
-  categories (see below):
+- <tt>/\p{Word}/</tt>: A member in one of these Unicode character
+  categories (see below) or having one of these Unicode properties:
 
-  - +Mark+ (+M+).
-  - +Letter+ (+L+).
-  - +Number+ (+N+)
-  - <tt>Connector Punctuation</tt> (+Pc+).
+  - Unicode categories:
+    - +Mark+ (+M+).
+    - <tt>Decimal Number</tt> (+Nd+)
+    - <tt>Connector Punctuation</tt> (+Pc+).
+
+  - Unicode properties:
+    - +Alpha+
+    - <tt>Join_Control</tt>
 
 - <tt>/\p{ASCII}/</tt>: A character in the ASCII character set.
 - <tt>/\p{Any}/</tt>: Any Unicode character (including unassigned characters).
@@ -993,12 +997,16 @@ Ruby also supports these (non-POSIX) bracket expressions:
 
 - <tt>/[[:ascii:]]/</tt>: Matches a character in the ASCII character set.
 - <tt>/[[:word:]]/</tt>: Matches a character in one of these Unicode character
-  categories (see below):
+  categories or having one of these Unicode properties:
+
+  - Unicode categories:
+    - +Mark+ (+M+).
+    - <tt>Decimal Number</tt> (+Nd+)
+    - <tt>Connector Punctuation</tt> (+Pc+).
 
-  - +Mark+ (+M+).
-  - +Letter+ (+L+).
-  - +Number+ (+N+)
-  - <tt>Connector Punctuation</tt> (+Pc+).
+  - Unicode properties:
+    - +Alpha+
+    - <tt>Join_Control</tt>
 
 === Comments
 
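
The distinction this commit documents (Decimal Number rather than the whole Number category) can be checked from Ruby itself. The following is a minimal sketch, not part of the commit: the helper name `word_report` and the sample characters are assumptions chosen for illustration. It compares a decimal digit (+Nd+) with a fraction character (+No+, a Number that is not a Decimal Number).

```ruby
# Hypothetical helper (not from the commit): for each character, report
# whether it matches \p{Word}, [[:word:]], and the full Number category \p{N}.
def word_report(chars)
  chars.to_h do |char|
    [char, {
      p_word:       char.match?(/\p{Word}/),
      bracket_word: char.match?(/[[:word:]]/),
      p_number:     char.match?(/\p{N}/),
    }]
  end
end

# Samples are written as escapes so the source file encoding does not matter.
samples = [
  "a",       # Latin letter (has the Alpha property)
  "_",       # Connector Punctuation (Pc)
  "\u0663",  # ARABIC-INDIC DIGIT THREE, Decimal Number (Nd)
  "\u00BD",  # VULGAR FRACTION ONE HALF, Other Number (No)
]

word_report(samples).each do |char, row|
  puts format("U+%04X %s", char.ord, row.inspect)
end
```

Under these assumptions, the digit U+0663 should be reported as a word character, while U+00BD should match only <tt>\p{N}</tt>, which is the case the old wording (+Number+ (+N+)) described incorrectly.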