author | shyouhei <shyouhei@b2dd03c8-39d4-4d8f-98ff-823fe69b080e> | 2018-11-16 02:34:00 +0000 |
---|---|---|
committer | shyouhei <shyouhei@b2dd03c8-39d4-4d8f-98ff-823fe69b080e> | 2018-11-16 02:34:00 +0000 |
commit | 6732423b5eb7191e81a23fe929926d50e0e4b39f (patch) | |
tree | f154be4a41e08925bad53c9958d0caca67457d61 | |
parent | 3a083985a471ca3d8429146f9f18dead6747c203 (diff) |
enc/unicode.c: 'a' is bigger than 'A'
In ASCII, 'a' is bigger than 'A', which means 'A' - 'a' is a negative
number (-32, to be precise). In C, the type of both 'a' and 'A' is signed
int (cf. ISO/IEC 9899:1990 section 6.1.3.4), so 'A' - 'a' is also a
signed int: `(signed int)-32`.
The problem is that OnigCodePoint is unsigned int. Adding a negative
number to a variable of type OnigCodePoint (`code` here) introduces an
unintended implicit conversion, `(unsigned)(signed)-32`, which is
4,294,967,264 when unsigned int is 32 bits wide. Adding this value to
`code` then wraps around, and the result ends up as the expected codepoint.
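
The conversion described above can be reproduced in isolation. The following is only a sketch, not code from enc/unicode.c: `CodePoint` is a hypothetical stand-in for OnigCodePoint, the helper names are invented for illustration, and ASCII character values are assumed.

```c
#include <assert.h>
#include <limits.h>

/* Hypothetical stand-in for OnigCodePoint (assumed to be unsigned int). */
typedef unsigned int CodePoint;

/* 'A' - 'a' has type int and value -32 in ASCII. Converting it to
 * unsigned int yields UINT_MAX - 31, i.e. 4,294,967,264 when unsigned
 * int is 32 bits wide. */
CodePoint delta_as_unsigned(void)
{
    int delta = 'A' - 'a';
    return (CodePoint)delta;
}

/* Mirrors the pre-patch expression. The int operand is implicitly
 * converted to unsigned int, and the addition wraps modulo UINT_MAX + 1,
 * which is well defined for unsigned types. */
CodePoint upcase_old(CodePoint code)
{
    assert(code >= 'a' && code <= 'z');  /* precondition from the caller */
    code += 'A' - 'a';
    return code;
}
```

Calling `upcase_old('a')` still yields the codepoint for 'A'; the point is only that the intermediate operand is a very large unsigned value rather than -32.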
This series of operations is not a serious problem, but because
`code >= 'a'` holds, we can compute `(code - 'a') + 'A'` instead and avoid it.
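
As a rough check that the two forms agree, the sketch below compares them over the lowercase ASCII range. `CodePoint` and the function names are hypothetical and chosen for illustration; they do not appear in enc/unicode.c.

```c
#include <assert.h>

/* Hypothetical stand-in for OnigCodePoint (assumed to be unsigned int). */
typedef unsigned int CodePoint;

/* Pre-patch form: relies on unsigned wraparound of a converted -32. */
CodePoint upcase_wrapping(CodePoint code)
{
    code += 'A' - 'a';
    return code;
}

/* Post-patch form: every intermediate value stays in range. */
CodePoint upcase_stepwise(CodePoint code)
{
    assert(code >= 'a' && code <= 'z');  /* guarantees code - 'a' >= 0 */
    code -= 'a';   /* now 0..25 */
    code += 'A';   /* now 'A'..'Z' */
    return code;
}

/* Returns 1 if both forms agree for every lowercase ASCII letter. */
int forms_agree(void)
{
    CodePoint c;
    for (c = 'a'; c <= 'z'; c++) {
        if (upcase_wrapping(c) != upcase_stepwise(c))
            return 0;
    }
    return 1;
}
```

Splitting the update into a subtraction followed by an addition keeps both operands non-negative, so no signed-to-unsigned conversion of a negative value takes place.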
See also: https://github.com/k-takata/Onigmo/pull/107
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@65752 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
-rw-r--r-- | enc/unicode.c | 6 |
1 files changed, 4 insertions, 2 deletions
diff --git a/enc/unicode.c b/enc/unicode.c
index 0d692520e8..2c0d91dfea 100644
--- a/enc/unicode.c
+++ b/enc/unicode.c
@@ -683,8 +683,10 @@ onigenc_unicode_case_map(OnigCaseFoldType* flagP,
           MODIFIED;
           if (flags & ONIGENC_CASE_FOLD_TURKISH_AZERI && code == 'i')
             code = I_WITH_DOT_ABOVE;
-          else
-            code += 'A' - 'a';
+          else {
+            code -= 'a';
+            code += 'A';
+          }
         }
       }
       else if (code >= 'A' && code <= 'Z') {