mb_decode_numericentity converts entities which immediately follow a valid/invalid entity
Thanks to Kamil Tieleka for suggesting that some of the behaviors of
the legacy implementation which the new mb_decode_numericentity
implementation took care to maintain were actually bugs and should
be fixed. Thanks also to Trevor Rowbotham for providing a link to
the HTML specification, showing how HTML numeric entities should
be interpreted.
mb_decode_numericentity now processes numeric entities in the
following situations where the old implementation would not:
- &<ENTITY> (for example, &&#65;)
- &#<ENTITY>
- &#x<ENTITY>
- <VALID BUT UNTERMINATED DECIMAL ENTITY><ENTITY> (for example, &#65&#65;)
- <VALID BUT UNTERMINATED HEX ENTITY><ENTITY>
- <INVALID AND UNTERMINATED DECIMAL ENTITY><ENTITY> (it does not matter why
the first entity is invalid; the value could be too big, it could have
too many digits, or it could not match the 'convmap' parameter)
- <INVALID AND UNTERMINATED HEX ENTITY><ENTITY>
This is consistent with the way that web browsers process
HTML entities.
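The situations listed above can be tried out with a short script. This is a minimal sketch, not part of the commit: the sample strings, their labels, and the convmap are illustrative assumptions. No expected output is given for the edge cases, because it depends on whether the PHP build in use includes this change.

```php
<?php
// Illustrative only: the sample strings and labels are made up for this sketch.
// The results for every case except the first depend on whether the running
// PHP build includes the change described in this commit.

// One convmap entry: [first code point, last code point, offset, mask].
// This entry (an assumption for the sketch) covers all of Unicode, no offset.
$convmap = [0x0, 0x10FFFF, 0, 0x1FFFFF];

$samples = [
    'terminated entity'                 => '&#65;',
    'stray & before entity'             => '&&#65;',
    'stray &# before entity'            => '&#&#65;',
    'stray &#x before entity'           => '&#x&#65;',
    'unterminated decimal, then entity' => '&#65&#66;',
    'unterminated hex, then entity'     => '&#x41&#66;',
    'over-long decimal, then entity'    => '&#100000000000&#65;',
];

$results = [];
foreach ($samples as $label => $input) {
    // Decode each sample as UTF-8 and record the result for comparison.
    $results[$label] = mb_decode_numericentity($input, $convmap, 'UTF-8');
    printf("%-34s %-20s => %s\n", $label, $input, $results[$label]);
}
```

Running the script on a build from before and after this commit and comparing the printed lines should show which of the cases changed.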
- testNonAscii("Starting entity immediately after valid decimal entity which is just within maximum length", $ucs4_test1, "\x3B\x9A\xCA\x00\x00\x00\x00&\x00\x00\x00#\x00\x00\x006\x00\x00\x005\x00\x00\x00;", [0, 0xFFFFFFFF, 0, 0xFFFFFFFF], 'UCS-4BE');
+ testNonAscii("Starting entity immediately after valid decimal entity which is just within maximum length", $ucs4_test1, "\x3B\x9A\xCA\x00\x00\x00\x00A", [0, 0xFFFFFFFF, 0, 0xFFFFFFFF], 'UCS-4BE');
- testNonAscii("Starting entity immediately after valid hex entity which is just within maximum length", $ucs4_test2, "\x11\x11\x11\x11\x00\x00\x00&\x00\x00\x00#\x00\x00\x006\x00\x00\x005\x00\x00\x00;", [0, 0xFFFFFFFF, 0, 0xFFFFFFFF], 'UCS-4BE');
+ testNonAscii("Starting entity immediately after valid hex entity which is just within maximum length", $ucs4_test2, "\x11\x11\x11\x11\x00\x00\x00A", [0, 0xFFFFFFFF, 0, 0xFFFFFFFF], 'UCS-4BE');
Starting entity immediately after decimal entity which is too long: string(18) "�A" => string(14) "�A" (Good)
Starting entity immediately after hex entity which is too long: string(17) "�A" => string(13) "�" (Good)
- Starting entity immediately after valid decimal entity which is just within maximum length: 000000260000002300000031000000300000003000000030000000300000003000000030000000300000003000000030000000260000002300000036000000350000003b => 3b9aca00000000260000002300000036000000350000003b (Good)
- Starting entity immediately after valid hex entity which is just within maximum length: 0000002600000023000000780000003100000031000000310000003100000031000000310000003100000031000000260000002300000036000000350000003b => 11111111000000260000002300000036000000350000003b (Good)
+ Starting entity immediately after valid decimal entity which is just within maximum length: 000000260000002300000031000000300000003000000030000000300000003000000030000000300000003000000030000000260000002300000036000000350000003b => 3b9aca0000000041 (Good)
+ Starting entity immediately after valid hex entity which is just within maximum length: 0000002600000023000000780000003100000031000000310000003100000031000000310000003100000031000000260000002300000036000000350000003b => 1111111100000041 (Good)