Decode pyproject.toml as UTF-8 regardless of system locale #233

staticdev · 2021-01-14T06:26:33Z

Passes the file's raw bytes to tomlkit instead of locale-decoded text. This is the only approach that worked on Windows for parsing .toml files containing special characters.

Closes #213

cjolowicz · 2021-01-14T11:46:53Z

Thank you for the report and for contributing a fix! 🙇‍♂️

Per the TOML spec, input documents must be valid UTF-8. IIUC the problem is that Path.read_text() ends up using the preferred locale encoding instead. Judging by the traceback in your issue description, this happens to be CP-1252 on your system.

As a fix, I would prefer to be explicit about the encoding:

-        text = path.read_text()
+        text = path.read_text(encoding="utf-8")

Could you try this and adapt the PR?
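For illustration, here is a minimal sketch of the difference, using only the standard library. The file contents and the `tool.example` table are hypothetical and not taken from the issue, and CP-1252 is passed explicitly only to simulate what `read_text()` would do on a system whose preferred encoding is CP-1252.

```python
import tempfile
from pathlib import Path

# Hypothetical pyproject.toml contents with a non-ASCII character.
CONTENT = '[tool.example]\nauthors = ["Zoë"]\n'

with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "pyproject.toml"
    # TOML files are UTF-8 on disk, so write the bytes explicitly.
    path.write_bytes(CONTENT.encode("utf-8"))

    # Explicit encoding: the result does not depend on the system locale.
    text = path.read_text(encoding="utf-8")

    # Simulates read_text() without an encoding on a CP-1252 system:
    # the two UTF-8 bytes of "ë" are decoded as two separate characters.
    wrong = path.read_text(encoding="cp1252")

print(text == CONTENT)   # the UTF-8 read round-trips
print(wrong == CONTENT)  # the CP-1252 read does not
```

In this sketch the CP-1252 read silently produces mojibake. Other characters encode to bytes that CP-1252 leaves undefined, in which case the read raises `UnicodeDecodeError` instead.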

staticdev · 2021-01-14T12:41:59Z

Done @cjolowicz.

cjolowicz · 2021-01-14T19:17:41Z

Released in 0.7.1

Fix TOML parse

ba5ae9c

Code review - update to read text with encoding

d581a37

cjolowicz changed the title ~~Fix TOML parse~~ Decode pyproject.toml as UTF-8 regardless of system locale Jan 14, 2021

cjolowicz added the bug label Jan 14, 2021

cjolowicz merged commit 1d13c74 into cjolowicz:master Jan 14, 2021

cjolowicz mentioned this pull request Jan 14, 2021

Add test for non-ASCII characters in pyproject.toml #234

Merged
