UTF-16 filters do not handle all surrogates gracefully #10118
Comments
From @nwc10

Created by @nwc10

Consider a script written in UTF-16BE, with a character whose surrogate pair

$ ./perl -Ilib -MEncode -we 'print "\xFE\xFF", encode("UTF16-BE", "warn qq[Hello world]; # \x{12800}")' >script.pl
$ ./perl script.pl
Malformed UTF-16 surrogate.

But that isn't true:

$ iconv -f UTF-16BE -t UTF-8 <script.pl | ./perl

The problem is that utf16_textfilter() is reading "line" by "line", assuming

The latter (also) doesn't check for end of buffer when reading the second

UTF-16LE will suffer the same bugs, once the reading-off-by-one bug is fixed

Nicholas Clark

Perl Info
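The report above is truncated, so the following is only an illustration of one property of the test character, not a reconstruction of the missing text. A minimal Python sketch (the helper name `surrogate_pair` is hypothetical) shows that U+12800 encodes in UTF-16BE as the bytes `D8 0A DC 00`, so the high surrogate contains the byte 0x0A. A reader that treats every 0x0A byte as a line terminator would cut the pair in half; whether that is exactly what utf16_textfilter() did is an assumption here.

```python
def surrogate_pair(cp):
    """Return the (high, low) UTF-16 surrogates for a code point above U+FFFF."""
    v = cp - 0x10000
    return 0xD800 + (v >> 10), 0xDC00 + (v & 0x3FF)

high, low = surrogate_pair(0x12800)
raw = "\U00012800".encode("utf-16-be")

# Cut the byte string just after the first 0x0A, as a byte-oriented
# "read up to newline" would do.
cut = raw.index(b"\n") + 1
first, rest = raw[:cut], raw[cut:]

print(f"surrogates: {high:04X} {low:04X}")
print("bytes:", " ".join(f"{b:02X}" for b in raw))
print("after split:", first, rest)
```

The first fragment ends with a lone high surrogate and the second starts with a lone low surrogate, which is the kind of input that would provoke a "Malformed UTF-16 surrogate" complaint from a strict decoder.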
From @jkeenan

On Tue, 02 Feb 2010 08:26:24 GMT, nicholas wrote:

This problem appears to have been corrected somewhere between 5.10.1 and 5.12.5.

#####
$ perlbrew use perl-5.12.5

However, I haven't been able to figure out how to use Porting/bisect.pl to determine the commit at which the program first completed successfully. Suggestions?

Thank you very much.
The RT System itself - Status changed from 'new' to 'open'
From @hvds

On Sun, 26 Feb 2017 19:10:09 -0800, jkeenan wrote:

Verify that the testcase exits non-zero on failure and zero on success:

% perl-5.10 ~/72414-script.pl

Check the docs for an example of "when was this fixed":

% perldoc Porting/bisect-runner.pl | grep -A1 'stop being an error'

Bisect:

% Porting/bisect.pl --expect-fail -- ./perl -Ilib ~/72414-script.pl

S_utf16_textfilter() needs to avoid splitting UTF-16 surrogate pairs.

:040000 040000 00e64049450c3e91b8d09afa4b676520cc75836e f73afa6dfba581efaa53915a40b8c611e07cf23f M t

The bisector could helpfully s/bad commit/good commit/ under expect-fail.

Hugo
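The requirement named in the bisect output, that the filter must not split surrogate pairs, can be illustrated with a small hold-back buffer. This is a hedged Python sketch of the general technique, not the code in perl's toke.c; the function names `split_complete_utf16be` and `decode_chunks` are hypothetical. Each incoming chunk is trimmed so that a dangling odd byte or a trailing high surrogate is carried over to the next chunk instead of being decoded on its own.

```python
def split_complete_utf16be(buf):
    """Split buf into (decodable_prefix, held_back_tail) for UTF-16BE input."""
    end = len(buf) - (len(buf) % 2)          # hold back a dangling odd byte
    if end >= 2:
        unit = (buf[end - 2] << 8) | buf[end - 1]
        if 0xD800 <= unit <= 0xDBFF:         # trailing high surrogate: wait for its partner
            end -= 2
    return buf[:end], buf[end:]

def decode_chunks(chunks):
    """Decode an iterable of UTF-16BE byte chunks without splitting pairs."""
    pending = b""
    out = []
    for chunk in chunks:
        ready, pending = split_complete_utf16be(pending + chunk)
        out.append(ready.decode("utf-16-be"))
    if pending:
        raise ValueError("input ended inside a code unit or surrogate pair")
    return "".join(out)

# The script text from the report, cut right after its first 0x0A byte,
# which falls inside the surrogate pair for U+12800.
text = "warn qq[Hello world]; # \U00012800"
data = text.encode("utf-16-be")
cut = data.index(b"\n") + 1
result = decode_chunks([data[:cut], data[cut:]])
print(result == text)
```

Carrying the incomplete tail forward keeps each decode call on a well-formed boundary, at the cost of delaying at most three bytes until the next read.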
From @jkeenan

On Mon, 27 Feb 2017 12:12:18 GMT, hv wrote:

Bisection confirmed:

#####
# good

Hugo, Tux, alh +++ for assistance in bisection.

Marking ticket Resolved.

Thank you very much.
@jkeenan - Status changed from 'open' to 'resolved'
Migrated from rt.perl.org#72414 (status was 'resolved')
Searchable as RT72414$