[Meta] PSES should handle unusual path characters properly #714

Open
rjmholt opened this issue Jul 27, 2018 · 4 comments

@rjmholt
Contributor

rjmholt commented Jul 27, 2018

There are a few open issues that look like PSES is crashing when it comes across a strange character.

@glennsarti
Contributor

It may also be worth checking the more unusual UTF-8 characters. Runes is a common one we've had trouble with in Ruby (Puppet) land and Windows.

First line of Rune version of Rune poem at http://www.columbia.edu/~fdc/utf8/
characters chosen since they will not parse on Windows with codepage 437 or 1252
Section 3.2.1.3 of Ruby spec guarantees that \u strings are encoded as UTF-8

https://github.com/puppetlabs/puppet/blob/24ead48f617cd3912491fe419ac7b67cda53a320/spec/unit/pops/loaders/dependency_loader_spec.rb#L71-L76
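
A minimal Python sketch of the code page point above (Python is used only for illustration; PSES itself is C#). The three code points are an arbitrary sample from the Unicode Runic block, picked here as an assumption rather than copied from the linked spec, and `can_encode` is a hypothetical helper name.

```python
# Illustrative sample from the Unicode Runic block (U+16A0..U+16FF).
# Assumption: these three code points stand in for any rune string.
RUNES = "\u16A0\u16C7\u16BB"


def can_encode(text: str, encoding: str) -> bool:
    """Return True if every character in text exists in the given encoding."""
    try:
        text.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False


# UTF-8 and UTF-16 can represent the runes; the legacy code pages cannot.
for name in ("utf-8", "utf-16-le", "cp437", "cp1252"):
    print(name, can_encode(RUNES, name))
```

Any code path that silently falls back to one of the legacy code pages would fail or corrupt a string like this, which is what makes it a useful test input.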

@rjmholt
Contributor Author

rjmholt commented Nov 26, 2018

> Runes is a common one we've had trouble with in ruby (puppet) land and Windows.

Not denying that that's a good test case, but were people using runic characters in Ruby/puppet??

PowerShell (Core) and VSCode theoretically handle UTF-8 properly. Our problems with encodings come from (1) Windows PowerShell defaulting to ISO-8859-1/latin-1/CP1252, (2) C# using UTF-16 and (3) the LSP encoding in UTF-8 but specifying offsets in UTF-16.
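
To illustrate point (3), here is a rough Python sketch of how the same line has different lengths depending on the unit being counted, and how a UTF-16 offset might be mapped back to a character index. The function names are hypothetical and this is not how PSES does it. Note that runes sit in the Basic Multilingual Plane, so the sample adds an emoji to exercise a surrogate pair.

```python
def utf16_len(text: str) -> int:
    """Number of UTF-16 code units needed to encode text."""
    return len(text.encode("utf-16-le")) // 2


def utf16_offset_to_index(line: str, utf16_offset: int) -> int:
    """Map a UTF-16 code-unit offset to a Python str index (code points).

    An offset that lands inside a surrogate pair is rounded up to the
    index after that character; this is a choice made for the sketch.
    """
    units = 0
    for index, ch in enumerate(line):
        if units >= utf16_offset:
            return index
        # Characters above U+FFFF take two UTF-16 code units.
        units += 2 if ord(ch) > 0xFFFF else 1
    return len(line)


# One rune (BMP), one emoji (outside the BMP), one ASCII letter.
line = "\u16A0\U0001F600x"

print(len(line))                  # code points
print(utf16_len(line))            # UTF-16 code units
print(len(line.encode("utf-8")))  # UTF-8 bytes
print(utf16_offset_to_index(line, 3))
```

Here the line is 3 code points, 4 UTF-16 code units and 8 UTF-8 bytes, so a UTF-16 offset of 3 refers to the final `x` at string index 2.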

However, we've been bitten less by encoding issues and much more by:

  • URI significant chars like # and ?
  • PowerShell wildcard chars like *, [, ] and ?
  • Some other characters in paths that we didn't handle properly when we forced them into a string template.
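
The first two bullets can be sketched in Python as follows. This is an illustration under assumptions, not PSES code: the sample path is made up, `escape_wildcards` is a hypothetical helper covering only the four characters named above, and Python's `fnmatch` stands in for PowerShell's wildcard matching only as an analogy.

```python
import fnmatch
import re
from urllib.parse import quote, unquote, urlsplit

# Made-up path containing both URI-significant characters.
path = "/home/user/C#/what?.ps1"

# Naive concatenation: '#' starts a fragment, so the path is truncated.
naive = urlsplit("file://" + path)
print(naive.path, naive.fragment)

# Percent-encoding the path first keeps '#' and '?' inside the path part.
encoded = "file://" + quote(path)
print(encoded)
print(unquote(urlsplit(encoded).path) == path)


def escape_wildcards(literal_path: str) -> str:
    """Backtick-escape *, ?, [ and ] so they are treated literally."""
    return re.sub(r"([*?\[\]])", r"`\1", literal_path)


print(escape_wildcards("logs[1]/*.ps1"))

# Analogy only: a bracket pattern does not match its own literal name.
print(fnmatch.fnmatchcase("script[1].ps1", "script[1].ps1"))
print(fnmatch.fnmatchcase("script1.ps1", "script[1].ps1"))
```

In PowerShell itself the usual remedies are probably the `-LiteralPath` parameter or `[WildcardPattern]::Escape`, rather than hand-rolled escaping like the helper above.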

I think @rkeithhill has sorted most of these last issues out in a recent PR, but it's an ongoing bug hunt.

But yeah, a rune string we could easily add to the tests.

@rkeithhill
Contributor

But there's more work to be done to clean up ClientFilePaths that get sent back to the client from PSES.

@glennsarti
Contributor

> Not denying that that's a good test case, but were people using runic characters in Ruby/puppet??

Oh hell no! But it's on the extreme end of UTF8 usage. If it works with runes, then it'll probably work everywhere else.
