str_getcsv returns null byte for unterminated quoted string #11982


Closed
apeschar opened this issue Aug 15, 2023 · 3 comments


apeschar commented Aug 15, 2023

Description

The following code:

<?php
echo json_encode(str_getcsv("\""));

Resulted in this output:

["\u0000"]

But I expected this output instead:

[""]

https://3v4l.org/TI1vo
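To make the stray byte easier to see than in the `json_encode` output, the fields can be hex-dumped. This is only a sketch: `csv_fields_as_hex` is a hypothetical helper name, not part of PHP, and the result depends on whether the PHP version in use still has the behaviour reported here.

```php
<?php
// Hypothetical helper (not part of PHP): hex-encode every parsed field
// so that control characters such as "\0" become visible.
function csv_fields_as_hex(string $line): array
{
    // All arguments are passed explicitly so the call does not rely on
    // the default $escape value, which newer PHP versions deprecate.
    $fields = str_getcsv($line, ',', '"', '\\');

    return array_map('bin2hex', $fields);
}

// The input from this report: a single, unterminated enclosure character.
// A field holding a null byte shows up as "00"; an empty field as "".
echo json_encode(csv_fields_as_hex('"')), "\n";
```

On an affected version this should print `["00"]`, while the expected result from the report corresponds to `[""]`.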

PHP Version

PHP 5.3.0 to now

Operating System

No response

Member

Girgias commented Aug 15, 2023

This behaviour dates back to the function's introduction in PHP 5.3. At this point I think the ship has sailed on "sensible" behaviour for this function. I really need to go back and work on ext/csv.

But @bukka what do you think about this?

Member

bukka commented Aug 24, 2023

I have done some debugging and I think this might actually be a bug that is worth fixing. I see the point about BC, as some apps might rely on the current behaviour, but I think we could still fix it in 8.3. Specifically, this output really does not make any sense to me:

<?php
var_dump(str_getcsv('"","a'));
var_dump(str_getcsv('"","'));

results in:

array(2) {
  [0]=>
  string(0) ""
  [1]=>
  string(1) "a"
}
array(2) {
  [0]=>
  string(0) ""
  [1]=>
  string(1) ""
}

As can be seen, the current logic is to ignore a missing enclosure character at the end. But in the case of an empty string, it outputs \0 (hence the string(1) in the second result), which seems like an omission in the code to me, as it increments bptr twice without checking for the end.
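Since `var_dump` prints the null byte invisibly, comparing field lengths shows the difference between the cases more directly. This is a minimal sketch, and `last_field_length` is a hypothetical helper name introduced only for illustration.

```php
<?php
// Hypothetical helper (not part of PHP): return the byte length of the
// last parsed field, so an invisible "\0" shows up as a length of 1.
function last_field_length(string $line): int
{
    // Explicit arguments avoid relying on the default $escape value.
    $fields = str_getcsv($line, ',', '"', '\\');

    return strlen((string) end($fields));
}

// 1) properly terminated empty field
// 2) unterminated field with content (the missing enclosure is ignored)
// 3) unterminated empty field (the case discussed in this issue)
foreach (['"",""', '"","a', '"","'] as $line) {
    printf("%-8s => last field length %d\n", $line, last_field_length($line));
}
```

The first two inputs should report lengths 0 and 1. For the third, a length of 1 indicates the stray null byte, and a length of 0 indicates that the missing enclosure was simply ignored, as it is for the non-empty case.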

Member

bukka commented Aug 25, 2023

The fix for this is in GH-12047. It took me quite a bit of time just to figure out how that old parser (created 20 years ago) works. At least I have got some knowledge of it now...
