Commit 8995f60

mb_decode_mimeheader obeys RFC 2047 regarding underscores and QPrint encoding
1 parent 157ca65 commit 8995f60

4 files changed: +37 -2 lines changed

NEWS

+6 -1
@@ -52,7 +52,8 @@ PHP NEWS
   . Added json_validate(). (Juan Morales)
 
 - MBString:
-  . mb_detect_encoding is better able to identify the correct encoding for Turkish text. (Alex Dowad)
+  . mb_detect_encoding is better able to identify the correct encoding for
+    Turkish text. (Alex Dowad)
   . mb_detect_encoding's "non-strict" mode now behaves as described in the
     documentation. Previously, it would return false if the very first byte
     of the input string was invalid in all candidate encodings. (Alex Dowad)
@@ -62,6 +63,10 @@ PHP NEWS
     MB_CASE_LOWER_SIMPLE and MB_CASE_TITLE_SIMPLE. (Alex Dowad)
   . mb_detect_encoding is better able to identify UTF-8 and UTF-16 strings
     with a byte-order mark. (Alex Dowad)
+  . mb_decode_mimeheader interprets underscores in QPrint-encoded MIME
+    encoded words as required by RFC 2047; they are converted to spaces.
+    Underscores must be encoded as "=5F" in such MIME encoded words.
+    (Alex Dowad)
 
 - Opcache:
   . Added start, restart and force restart time to opcache's

UPGRADING

+4
@@ -78,6 +78,10 @@ PHP 8.3 UPGRADE NOTES
     casing rules for the Greek letter sigma. For mb_convert_case, conditional
     casing only applies to MB_CASE_LOWER and MB_CASE_TITLE modes, not to
     MB_CASE_LOWER_SIMPLE and MB_CASE_TITLE_SIMPLE. (Alex Dowad)
+  . mb_decode_mimeheader interprets underscores in QPrint-encoded MIME
+    encoded words as required by RFC 2047; they are converted to spaces.
+    Underscores must be encoded as "=5F" in such MIME encoded words.
+    (Alex Dowad)
 
 - Standard:
   . E_NOTICEs emitted by unserialized() have been promoted to E_WARNING.

ext/mbstring/mbstring.c

+4 -1
@@ -5705,7 +5705,10 @@ static unsigned char* mime_header_decode_encoded_word(unsigned char *p, unsigned
 	/* Fill `buf` with bytes from decoding QPrint */
 	while (p < e) {
 		unsigned char c = *p++;
-		if (c == '=' && (e - p) >= 2) {
+		if (c == '_') {
+			*bufp++ = ' ';
+			continue;
+		} else if (c == '=' && (e - p) >= 2) {
 			unsigned char c2 = *p++;
 			unsigned char c3 = *p++;
 			if (qprint_map[c2] >= 0 && qprint_map[c3] >= 0) {
@@ -0,0 +1,23 @@
+--TEST--
+Test mb_decode_mimeheader() function: use of underscores in QPrint-encoded data
+--EXTENSIONS--
+mbstring
+--FILE--
+<?php
+
+// RFC 2047 says that in a QPrint-encoded MIME encoded word, underscores should be converted to spaces
+var_dump(mb_decode_mimeheader("=?UTF-8?Q?abc?="));
+var_dump(mb_decode_mimeheader("=?UTF-8?Q?abc_def?="));
+var_dump(mb_decode_mimeheader("_=?UTF-8?Q?abc_def?=_"));
+var_dump(mb_decode_mimeheader("=?UTF-8?Q?__=E6=B1=89=E5=AD=97__?="));
+
+// This is how underscores should be encoded in MIME encoded words with QPrint
+var_dump(mb_decode_mimeheader("=?UTF-8?Q?=5F?="));
+
+?>
+--EXPECT--
+string(3) "abc"
+string(7) "abc def"
+string(9) "_abc def_"
+string(10) "  汉字  "
+string(1) "_"
