Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
Resolve Windows path problems #8860
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Resolve Windows path problems #8860
Changes from 5 commits
bc1e642
338159f
39bacae
94a8c4e
287053b
c3360fa
78d2510
5c66f9f
File filter
Filter by extension
Conversations
Jump to
There are no files selected for viewing
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
missed one odd encoding=
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What you are suggesting is I didn't need any of this
get_encoding()
logic - tangent I went on. In the absence of the-X utf8
, I just needed to selectively remove all the, encoding="utf-8"
options where I needed the system encoding to dominate.Since merged - do I push an update or need a new PR?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nothing is written to file, so no need for encoding= to possibly confuse future reader(s)
python won't re-encode stuff, this is just the thing for read / write through text io wrapper
in absence of -Xutf-8, text io selects something default aka depends on version
https://github.com/python/cpython/blob/cbd192b6ff137aec25913cc80b2bf7bf1ef3755b/Modules/_io/textio.c#L1102-L1141 (our bundled version)
https://github.com/python/cpython/blob/ddb65c47b1066fec8ce7a674ebb88b167f417d84/Modules/_io/textio.c#L1126-L1132 (3.11 on any recent installation)
utf-8 output is ok, reading cli / writing stuff for external tools should have system encoding
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just for clarification when creating
build.opt
, we are readingutf-8
from dot-h files and writing with system encoding tobuild.opt
otherwise I can get a file not found for a-include ...
if the file path contains diacritics.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
also nit:
any time we have characters from codepage different from ansii e.g.
C:\Users\Максім
in cp1251just my guess that GCC probably does not use win32
_wfopen
and widechar, so we have to use exact codepage bytes for filepath representation so system exec (that also does not encode / decode anything in args) for GCC gets proper pathspath and text have different code sections dealing with them, just to denote the difference
(I am gonna go read Python PEP 529, https://vstinner.github.io/python36-utf8-windows.html series and cpython unicodeobject.c)