Add support for all token types with the new semantic highlighting #2852

MartinGC94 · 2020-08-02T09:22:34Z

Summary of the new feature

PowerShell has 20 token types (PSTokenType), and it would be nice if the semantic highlighting supported all of them. See: https://docs.microsoft.com/en-us/dotnet/api/system.management.automation.pstokentype

The current semantic highlighting implementation is missing the following types:

Attribute
CommandArgument
GroupEnd/GroupStart
LineContinuation (backtick)
LoopLabel
StatementSeparator (Semicolon)

If and when these get added, I would suggest changing the following mappings:

Value in Using namespace <Value> should use the CommandArgument scope. Currently it uses Function.
Attributes like [cmdletbinding()] should use the Attribute scope. Currently they use Type.
Names in function and configuration definitions should use the CommandArgument scope. Currently they use the Function scope, except for function names without dashes, which get no semantic token type.
Unquoted parameter values like "C:\SomePath" in Get-ChildItem C:\SomePath should use the CommandArgument scope. Currently they use the Function scope.

The following screenshot hopefully shows what I mean:


TylerLeonhardt · 2020-08-02T19:34:44Z

PSTokenType is the old type... It's a Windows PowerShell v2 relic...

Token is the new type:
https://docs.microsoft.com/dotnet/api/system.management.automation.language.token

Which has a TokenKind:
https://docs.microsoft.com/dotnet/api/system.management.automation.language.tokenkind

MartinGC94 · 2020-08-02T20:28:44Z

Interesting, I didn't know that. I still think it's worth taking inspiration from PSTokenType when mapping TokenKind to the VS Code token types: 20 types are far more manageable than 100+, and the old PSTokenType values are basically logical groupings of the newer, more granular TokenKind values.
Personally I'm mostly interested in getting separate tokens for Attributes and unquoted parameters so I can make them stick out a bit more from everything else.

TylerLeonhardt · 2020-08-02T23:07:07Z

The best way to see if we can do anything about this is to run it through the PowerShell parser/tokenizer:

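# Example input; replace with the script text you want to inspect.
$myScriptAsAStr = 'Get-ChildItem C:\SomePath'
$tokens = $null
$errs = $null
$ast = [System.Management.Automation.Language.Parser]::ParseInput($myScriptAsAStr, [ref] $tokens, [ref] $errs)
$tokens | Select-Object Text, Kind, TokenFlags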

Then look at $tokens. If there's something distinguishing about Attributes or un-quoted parameters, we can do something!... But if there isn't, then we can't, I'm afraid.

(I didn't test that code, but it should be something along those lines... The docs: https://docs.microsoft.com/en-us/dotnet/api/system.management.automation.language.parser.parseinput?view=powershellsdk-7.0.0#System_Management_Automation_Language_Parser_ParseInput_System_String_System_Management_Automation_Language_Token____System_Management_Automation_Language_ParseError____)

MartinGC94 · 2020-08-03T21:10:41Z

Then you can do something :) Attributes have the "AttributeName" token flag, which as far as I can tell is unique to attributes.
Unquoted parameters have the kind+TokenFlag combination of "Generic" + "None" which would affect the same things I suggested "CommandArgument" would affect in the original post above.

For convenience here's a table showing the relevant types and their Kind+TokenFlag combinations:

Token	Kind	TokenFlags
Command	Generic	CommandName
Unquoted Parameter	Generic	None
Type	Identifier	TypeName
Attribute	Identifier	TypeName,AttributeName
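
The table above can be read as a small decision procedure. The TypeScript sketch below is only an illustration of that reading, not the PowerShellEditorServices handler (which is written in C#). `TokenInfo`, `classify`, and the category names are hypothetical, and the flags are modelled as plain strings rather than the real TokenFlags enum.

```typescript
// Hypothetical model of a PowerShell token: only the two columns from the table.
interface TokenInfo {
  kind: string;    // e.g. "Generic", "Identifier"
  flags: string[]; // e.g. ["TypeName", "AttributeName"]; an empty array stands for None
}

type Category = "command" | "commandArgument" | "type" | "attribute" | "other";

// Classify a token using the Kind + TokenFlags combinations listed in the table.
function classify(token: TokenInfo): Category {
  const has = (flag: string): boolean => token.flags.indexOf(flag) !== -1;

  if (token.kind === "Generic") {
    if (has("CommandName")) return "command";
    if (token.flags.length === 0) return "commandArgument";
  }
  if (token.kind === "Identifier") {
    // Check AttributeName first: attributes also carry TypeName.
    if (has("AttributeName")) return "attribute";
    if (has("TypeName")) return "type";
  }
  return "other";
}
```

The order of the two Identifier checks matters, since an attribute token carries both flags and would otherwise be reported as a plain type.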

TylerLeonhardt · 2020-08-03T23:31:15Z

Nice find! Interested in contributing? This is really nicely scoped to just this method:
https://github.com/PowerShell/PowerShellEditorServices/blob/master/src/PowerShellEditorServices/Services/TextDocument/Handlers/PsesSemanticTokensHandler.cs#L102-L160

MartinGC94 · 2020-08-04T19:57:41Z

I don't know how I would do it. Updating the if statements and/or the switch would be simple enough, but none of the standard semantic token types really fit the PowerShell tokens, so I would have to define a new type or subtype as described here: https://code.visualstudio.com/api/language-extensions/semantic-highlight-guide#semantic-token-classification

TylerLeonhardt · 2020-08-04T20:24:30Z

Yeah I'd rather not add a new token type... Maybe use one of the token types we don't actually use.

TylerLeonhardt · 2020-08-04T20:57:09Z

Here are the possibilities: https://github.com/OmniSharp/csharp-language-server-protocol/blob/f93f973ec96d895c0c6056f5f5942c4646071ab9/src/Protocol/Models/Proposals/SemanticTokenTypes.cs#L35-L57

I like macro or label personally.

SeeminglyScience · 2020-08-04T22:07:02Z

Type is the most accurate (it is a type, but it is also an attribute). If there's a way to do inheritance or something where you have a new token type called Attribute that falls back to Type if the theme doesn't support it that might be cool. Otherwise I'm not sure it's a great idea to use a classification that isn't accurate (when an accurate one exists).

rjmholt · 2020-08-05T15:27:31Z

> If there's a way to do inheritance or something where you have a new token type called Attribute that falls back to Type if the theme doesn't support it that might be cool

The problem is that no popular theme will support it, and we've just increased complexity without fixing the issue. Most people will continue using their Dark+/Monokai/Solarized theme and see attributes and types the same. So it'll end up being like in #2856 (comment), where we correctly categorise the token differently and the colour is the same.

> Otherwise I'm not sure it's a great idea to use a classification that isn't accurate (when an accurate one exists).

This is the real problem with themes. The themes themselves tend to be hacks designed to achieve nice coloration in the face of bad token categorisation, which in turn forces the token classifiers to lie. I seem to recall there was a contentious change in the EditorSyntax repo where tokens were categorised more correctly and people complained because highlighting got worse.

I think the ideal here would be for us to define a correct token type, and a hack fallback, like: attribute -> label. That way the ISE theme could colour things with high specificity but users of popular themes will still get the color differentiation they're looking for.
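
As a rough illustration of the fallback idea, the sketch below resolves a token type against a theme and walks a fallback chain when the theme has no rule for it. Everything in it is hypothetical: `fallbacks`, `resolveColor`, the two theme objects, and the colour values are invented for the example and are not VS Code or LSP APIs.

```typescript
// Hypothetical fallback table: a specific token type maps to a more widely themed one.
const fallbacks: Record<string, string> = {
  attribute: "label",
};

// A theme is modelled as a plain map from token type to colour.
type Theme = Record<string, string>;

// Return the colour for tokenType, following fallbacks until the theme has a rule.
function resolveColor(theme: Theme, tokenType: string): string | undefined {
  const seen: string[] = [];
  let current: string | undefined = tokenType;
  while (current !== undefined && seen.indexOf(current) === -1) {
    if (Object.prototype.hasOwnProperty.call(theme, current)) return theme[current];
    seen.push(current); // guard against cycles in the fallback table
    current = fallbacks[current];
  }
  return undefined;
}

// Invented themes: one styles "attribute" directly, the other only knows "label".
const specificTheme: Theme = { attribute: "#00BFFF", label: "#8B0000", type: "#008080" };
const popularTheme: Theme = { label: "#C586C0", type: "#4EC9B0" };
```

With these inputs, `resolveColor(specificTheme, "attribute")` uses the theme's own attribute rule, while `resolveColor(popularTheme, "attribute")` falls back to the label colour, so attributes still differ from types.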

TylerLeonhardt · 2020-08-05T15:48:30Z

@rjmholt if that's possible, I like that plan. I'll probably open an issue on vscode to start this discussion.

Also, I was skimming through the LSP docs and I noticed there's also a section for modifiers:

https://microsoft.github.io/language-server-protocol/specifications/specification-3-16/#textDocument_semanticTokens

export enum SemanticTokenModifiers {
	declaration = 'declaration',
	definition = 'definition',
	readonly = 'readonly',
	static = 'static',
	deprecated = 'deprecated',
	abstract = 'abstract',
	async = 'async',
	modification = 'modification',
	documentation = 'documentation',
	defaultLibrary = 'defaultLibrary'
}

I believe these can be added on to the TokenType to provide additional context. At first glance nothing jumps out at me.
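
For context on how modifiers attach to a token type: in the LSP semantic tokens encoding, a token's modifiers are sent as a bit set whose bit positions index into the legend's `tokenModifiers` array. The sketch below shows that encoding. The `legend` array and the `encodeModifiers` helper are illustrative names, not part of any library.

```typescript
// Illustrative legend: the order is chosen by the server and sent to the client.
const legend: string[] = ["declaration", "definition", "readonly", "static", "deprecated"];

// Encode modifier names as a bit set: bit i is set when legend[i] applies to the token.
function encodeModifiers(modifiers: string[]): number {
  let bits = 0;
  for (const name of modifiers) {
    const index = legend.indexOf(name);
    if (index === -1) throw new Error(`Unknown modifier: ${name}`);
    bits |= 1 << index;
  }
  return bits;
}
```

With this legend, `encodeModifiers(["declaration", "readonly"])` sets bits 0 and 2 and returns 5.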

ghost added the "Needs: Triage" label Aug 2, 2020

rjmholt added the "Area-UI" and "Issue-Enhancement" labels Aug 3, 2020

TylerLeonhardt mentioned this issue Aug 3, 2020: Unfortunate color choices with semantic highlighting #2856 (Closed)

SydneyhSmith removed the "Needs: Triage" label Aug 4, 2020

ghost added the "Needs: Maintainer Attention" label Aug 4, 2020

rjmholt mentioned this issue Aug 10, 2020: Semantic Syntax is incorrectly highlighting arguments as commands #2860 (Open)

SydneyhSmith removed the "Needs: Maintainer Attention" label Aug 11, 2020

rjmholt mentioned this issue Oct 13, 2020: Syntax highlighting worse with some color themes after #1337? PowerShell/PowerShellEditorServices#1368 (Closed)

andyleejordan added the "Area-Semantic Highlighting" label Mar 3, 2021

SydneyhSmith mentioned this issue May 13, 2022: Semantic Highlighting Meta Issue #3982 (Open)
