
French characters are not displayed correctly when debugging #1680


Closed
neothoms opened this issue Jan 4, 2019 · 16 comments

@neothoms

neothoms commented Jan 4, 2019

System Details Output

- Operating system name and version : Windows 10 Version 10.0.17763 Build 17763
- VS Code version  : 1.30.1
- PowerShell extension version : 1.10.2
- Output from `$PSVersionTable`
### VSCode version: 1.30.1 dea8705087adb1b5e5ae1d9123278e178656186a x64

### VSCode extensions:
[email protected]
[email protected]
[email protected]
[email protected]
[email protected]
[email protected]
[email protected]


### PSES version: 1.10.2.0

### PowerShell version:

Name                           Value
----                           -----
PSVersion                      5.1.17763.134
PSEdition                      Desktop
PSCompatibleVersions           {1.0, 2.0, 3.0, 4.0...}
BuildVersion                   10.0.17763.134
CLRVersion                     4.0.30319.42000
WSManStackVersion              3.0
PSRemotingProtocolVersion      2.3
SerializationVersion           1.1.0.1





Issue Description

I am experiencing a problem with French characters, which are not displayed correctly in the VS Code Integrated Console. It may also appear with special characters from any other language.

Expected Behaviour

When I type the following command directly in the console everything is displayed correctly.
write-host "---- These are french characters : âà ç éèë ïï and they are not correctly displayed ----"

image

Actual Behaviour

But if I use the PowerShell extension and run the same command from a .ps1 file encoded in UTF-8, I get this incorrect result:

image


@SydneyhSmith
Collaborator

@neothoms thanks for reporting this! The command you are running is provided by VSCode and it just pastes the path of the open file into the active terminal followed by an "enter" key, so it is likely that the issue is on the VSCode side: looks related to this issue microsoft/vscode#31366

SydneyhSmith added the "Bug: VS Code" label (Bugs in VS Code itself) on Jan 4, 2019
rjmholt self-assigned this on Jan 9, 2019
@ubidev

ubidev commented Jan 10, 2019

It seems to affect variable definitions too...
1- In VSCode I get this error:
01

2- When I open the same file in ISE, I get this:
02

3- I then correct/save/run the same file in ISE with this result:
03

4- Finally, back in VSCode, it displays like this, but runs OK!
04

Hope this helps
win 10 ent 1809 latest patches
vscode 1.30.1
[email protected]

Name                           Value
----                           -----
PSVersion                      5.1.17763.134
PSEdition                      Desktop
PSCompatibleVersions           {1.0, 2.0, 3.0, 4.0...}
BuildVersion                   10.0.17763.134
CLRVersion                     4.0.30319.42000
WSManStackVersion              3.0
PSRemotingProtocolVersion      2.3
SerializationVersion           1.1.0.1

@rjmholt
Contributor

rjmholt commented Jan 10, 2019

This is an encoding problem that occurs with file handling between VSCode and the ISE. Please see #1351 and #1306 (comment).

There's not much we can do from the extension's perspective -- the solution lies in ensuring you have VSCode, the ISE and PowerShell all configured to use UTF-8. The encoding problems you're seeing are (for example) due to the é being encoded in UTF-8 as 0xC3 0xA9 and then being interpreted by PowerShell as CP-1252/ISO-8859-1/latin-1, which yields the two characters Ã© (handy reference).

You need to set PowerShell's encoding to UTF-8, or VSCode's to ISO-8859-1, and make sure the ISE is also configured to do the same. The encoding settings of all your tools need to be set to the same encoding.

Because PowerShell defaults to CP-1252 for things like this, but VSCode defaults to UTF-8, you will need to make a conscious choice as to how you encode your scripts in source control and how you configure the encoding in your editors and PowerShell. If they get out of sync, they may all try to re-encode files and cause problems again.
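
The mismatch described in this comment can be reproduced outside of PowerShell. The following is a minimal sketch in Python (chosen only because it makes the byte-level steps easy to see; it is not part of the extension or of PowerShell), assuming the file was written as UTF-8 and then read as CP-1252:

```python
# Text as the author typed it in the editor.
original = "é"

# What a UTF-8 editor (such as VSCode by default) writes to disk.
utf8_bytes = original.encode("utf-8")      # the two bytes 0xC3 0xA9

# What a reader assuming CP-1252 makes of those same two bytes.
misread = utf8_bytes.decode("cp1252")      # two characters: "Ã" and "©"

# The damage is reversible as long as nothing re-saves the misread text:
# re-encoding with the wrong codec and decoding with the right one restores it.
recovered = misread.encode("cp1252").decode("utf-8")
```

The bytes on disk never change in this sketch; only the interpretation does, which is why aligning the encoding settings of every tool resolves the problem.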

rjmholt closed this as completed on Jan 10, 2019
@ubidev

ubidev commented Jan 10, 2019

I understand your points.
Since ISE v3, I have always been impressed that it can send French and Spanish strings to console output without a hitch. Even localized variable names are not a problem.
Do you know how to configure VSCode to default to the same encoding as ISE (UTF-8 with BOM) when creating new PowerShell files?

Thanks in advance

@rjmholt
Contributor

rjmholt commented Jan 10, 2019

> ISE v3 being able to handle french and spanish strings out to console output without a hitch. even localized variable names is not a problem

That's probably because all the required characters lie within the CP-1252 codepage. The problem now is that the upper half of CP-1252 is incompatible with UTF-8: those characters are single bytes in CP-1252 but multi-byte sequences in UTF-8.

It looks like you are creating the PowerShell files as UTF-8, but PowerShell is trying to read them as CP-1252. So you need to change the encoding in the integrated terminal (which is just a matter of setting the .NET settings as in normal PowerShell). This is probably best done in your profile.

For setting editor encodings, see #1308 (comment).
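
To illustrate why only the "upper half" of CP-1252 causes trouble, here is a small Python sketch (the helper name `differs_between_encodings` is hypothetical, introduced only for this example). ASCII-only script text produces identical bytes in both encodings, while accented text does not:

```python
def differs_between_encodings(text: str) -> bool:
    """Return True if the CP-1252 bytes of `text` differ from its UTF-8 bytes."""
    return text.encode("cp1252") != text.encode("utf-8")


# ASCII-only text is byte-for-byte identical in CP-1252 and UTF-8,
# so an encoding mismatch goes unnoticed.
ascii_only = 'Write-Host "hello"'

# Accented characters sit in the upper half of CP-1252 (one byte each)
# but need two bytes each in UTF-8.
accented = 'Write-Host "éèë"'

ascii_differs = differs_between_encodings(ascii_only)      # False
accented_differs = differs_between_encodings(accented)     # True
```

This is consistent with scripts written purely in English never showing the problem, whichever encoding each tool assumes.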

@rjmholt
Contributor

rjmholt commented Jan 10, 2019

It sounds like it might be helpful for us to summarise this in a document.

But I should emphasise that the encoding problems stem from having editor, PowerShell and source control settings out of sync. There's nothing the extension can really magically do without being opinionated (and therefore breaking people).

@ubidev

ubidev commented Jan 10, 2019

Usually, if I follow this workflow, the French characters are displayed correctly at every step, even in source control:

  • Creation: ISE or Notepad++ (set to UTF-8 with BOM)
  • Edits: VSCode
  • Publish: GitLab, then VSTS

thanks for the very appreciated enlightenment and have a great day

@ubidev

ubidev commented Jan 10, 2019

I totally agree :)

@rjmholt
Contributor

rjmholt commented Jan 10, 2019

> Creation: ISE or Notepad++ (set to UTF-8 with BOM)

I think the only caveat is that you need to ensure that ISE is configured to use UTF-8. The annoying part in VSCode is that because it's client/server, you need to configure both VSCode and the Integrated Terminal separately.

> thanks for the very appreciated enlightenment and have a great day

Thank you! Let us know if you continue to experience issues. Encoding in PowerShell is a problem with no simple solutions, but we are definitely interested in feedback to make it all work as best we can.

@ubidev

ubidev commented Jan 10, 2019

Unfortunately, I can't do that. If I use UTF-8 in ISE/VSCode/NPP, for example, all hell breaks loose at execution: variables containing accented characters break, and so do display strings.

The only way it works flawlessly at every step on my computers/VMs (set to Windows English plus the French language pack) is if I create the PowerShell scripts in "UTF-8 with BOM" (default behavior in ISE, manual config in NPP).
Then execution, console output, GUI popups, and Git/VSTS repo contents all display perfectly!
That's why I need to configure VSCode to create files in UTF-8 with BOM by default, so I can use VSCode only. :)
Obviously, I could forgo such characters and do everything in English, but unfortunately that is not an option where I live: many private projects require them for naming conventions, output and documentation.
Thanks again for your time

@rjmholt
Contributor

rjmholt commented Jan 10, 2019

> Unfortunately, I can't do that. If I use UTF8 in ISE/VSCODE/NPP for example, all hell breaks loose at execution: variables containing accented characters break, display strings too.

That's because without the BOM PowerShell defaults to CP-1252. If you set PowerShell to use UTF-8, that shouldn't happen:

$OutputEncoding = [console]::InputEncoding = [console]::OutputEncoding = New-Object System.Text.UTF8Encoding

In PowerShell 6+, it should default to UTF-8 anyway and you won't hit any of these problems out of the box.
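
The role of the BOM can be sketched as follows. This is a deliberately simplified Python model of the behaviour described in this thread (UTF-8 when a BOM is present, CP-1252 otherwise); the function `decode_script` is hypothetical and is not how PowerShell is actually implemented:

```python
import codecs


def decode_script(raw: bytes, fallback: str = "cp1252") -> str:
    """Decode script bytes: UTF-8 if a UTF-8 BOM is present, else `fallback`.

    Simplified model only; the CP-1252 fallback is an assumption taken
    from the discussion above.
    """
    if raw.startswith(codecs.BOM_UTF8):
        # Strip the three BOM bytes (EF BB BF) and decode the rest as UTF-8.
        return raw[len(codecs.BOM_UTF8):].decode("utf-8")
    return raw.decode(fallback)


source = '$été = "ça marche"'

with_bom = codecs.BOM_UTF8 + source.encode("utf-8")
without_bom = source.encode("utf-8")

read_with_bom = decode_script(with_bom)         # identical to `source`
read_without_bom = decode_script(without_bom)   # accented characters garbled
```

In this model, the same UTF-8 bytes decode correctly or incorrectly depending only on whether the three BOM bytes come first, which matches the observation that "UTF-8 with BOM" files work while plain UTF-8 files break.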

@ubidev

ubidev commented Jan 10, 2019

I'll give this a try. If I end up not having to ask the customers to change their default powershell behavior, that would be great...

where should I put this line? in my script, in the powershell profile?
$OutputEncoding = [console]::InputEncoding = [console]::OutputEncoding = New-Object System.Text.UTF8Encoding

Thanks a zillllion ;-)

@SydneyhSmith
Collaborator

Add this to your VSCode settings: "files.encoding": "utf8bom"
With this set, VSCode will encode your files as UTF-8 with BOM and Windows PowerShell should "just work".
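
For script files that already exist on disk without a BOM, one option is to rewrite them once. Below is a hedged Python sketch; the helper `ensure_utf8_bom` is hypothetical, and it assumes the input is already valid UTF-8 (it raises an error otherwise rather than guessing):

```python
import codecs


def ensure_utf8_bom(raw: bytes) -> bytes:
    """Return `raw` with a UTF-8 BOM prepended, unless one is already present."""
    if raw.startswith(codecs.BOM_UTF8):
        return raw
    # Validate first: raises UnicodeDecodeError if the bytes are not UTF-8,
    # so a CP-1252 file is never mislabelled as UTF-8.
    raw.decode("utf-8")
    return codecs.BOM_UTF8 + raw


plain = 'Write-Host "éèë"'.encode("utf-8")
converted = ensure_utf8_bom(plain)

# Applying it to a file would look like this (the path is a placeholder):
#   from pathlib import Path
#   path = Path("script.ps1")
#   path.write_bytes(ensure_utf8_bom(path.read_bytes()))
```

The function is idempotent, so running it twice on the same content does not stack BOMs.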

@ubidev

ubidev commented Jan 11, 2019

Thanks a lot Sydney, will try this asap. :)

Now, I'd also like to try the solution provided by @rjmholt but can't seem to make it work: I pasted the line into a test script created in VSCode in UTF-8, but I get the same errors and the display is still gibberish :(

@rjmholt
Contributor

rjmholt commented Jan 11, 2019

> where should I put this line? in my script, in the powershell profile?

The best place is probably in your profile. Otherwise, just execute it in your integrated terminal session. Let us know if that doesn't work though.

@ubidev

ubidev commented Jan 12, 2019

Hi !
Here is what I get after pasting that line verbatim into each application-specific PowerShell profile. It won't even run in ISE.

vscodeconsole
psconsole
iseconsole

All tests are done in a slim VM running Windows 10 Enterprise 1809 English + French language pack + Chocolatey packages:

PS C:\src\tst\french> choco list --localonly
Chocolatey v0.10.11
7zip 18.6
7zip.install 18.6
Boxstarter 2.12.0
Boxstarter.Bootstrapper 2.12.0
Boxstarter.Chocolatey 2.12.0
Boxstarter.Common 2.12.0
Boxstarter.HyperV 2.12.0
Boxstarter.WinConfig 2.12.0
chocolatey 0.10.11
chocolatey-core.extension 1.3.3
chromium 71.0.3578.98
Cmder 1.3.11
DotNet4.5.2 4.5.2.20140902
Firefox 64.0
gimp 2.10.8
git 2.20.1
git-lfs 2.6.1
git-lfs.install 2.6.1
git.install 2.20.1
InkScape 0.92.3.20180702
microsoft-message-analyzer 1.4.1.20161119
notepadplusplus 7.6.2
notepadplusplus.install 7.6.2
sumatrapdf 3.1.2
sumatrapdf.commandline 3.1.2
vcredist2010 10.0.40219.2
visualstudiocode 1.23.1.20180730
vscode 1.30.1
vscode-gitattributes 0.4.1
vscode-gitignore 0.5.0
vscode-gitlens 1.0.0.20181011
vscode-icons 1.0.0.20180620
vscode-markdownlint 1.0.0.20181011
vscode-mssql 1.0.0.20181011
vscode-powershell 1.0.0.20181011
windirstat 1.1.2.20161210
winmerge 2.16.0
wireshark 2.6.5
XnView 2.46

PS C:\src\tst\french> $PSVersionTable

Name                           Value
----                           -----
PSVersion                      5.1.17763.134
PSEdition                      Desktop
PSCompatibleVersions           {1.0, 2.0, 3.0, 4.0...}
BuildVersion                   10.0.17763.134
CLRVersion                     4.0.30319.42000
WSManStackVersion              3.0
PSRemotingProtocolVersion      2.3
SerializationVersion           1.1.0.1


4 participants