
Commit ed11985

(PowerShellGH-1417) Tokenize a document per line to avoid line endings
Previously the folding provider tokenized the entire document in a single call. While this mostly works, it causes issues for grammar rules whose regular expressions use the `$` anchor, which is interpreted differently for CRLF versus LF line endings. While it may be possible to munge the document text prior to tokenization, the prior art within the VS Code codebase shows that this is not the intended usage: lines should be tokenized individually, not the document as a whole. This commit changes the tokenization process to work per line while preserving the original folding behaviour.
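To make the `$` problem concrete: a grammar rule anchored with `$` expects the line's content to end right where the text ends (or just before a bare `\n`). When a whole CRLF document is handed to the tokenizer, every logical line still carries a trailing `\r`, so end-of-line anchors stop matching. The sketch below is illustrative only and is not code from the extension; the sample text and the splitting are stand-ins for what the tokenizer effectively sees.

```typescript
// Illustrative sketch: the text a grammar rule "sees" at the end of each line,
// depending on how the document is fed to the tokenizer.
const crlfDocument = "#region Helpers\r\nGet-Item\r\n#endregion";

// Whole-document tokenization: each logical line still ends in "\r", which sits
// between the line's content and the "\n" that an end-of-line anchor expects.
const linesWithResidue = crlfDocument.split("\n");
console.log(JSON.stringify(linesWithResidue[0]));   // "#region Helpers\r"

// Per-line tokenization (what document.lineAt(i).text provides): the line break
// characters are stripped entirely, so `$` lines up with the end of the content
// whether the file uses CRLF, LF, or CR.
const linesPerLine = crlfDocument.split(/\r\n|\r|\n/);
console.log(JSON.stringify(linesPerLine[0]));       // "#region Helpers"
```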
1 parent 77da510 commit ed11985

File tree

1 file changed: +19 -2 lines changed


src/features/Folding.ts

Lines changed: 19 additions & 2 deletions
@@ -8,6 +8,7 @@ import * as vscode from "vscode";
 import {
     DocumentSelector,
     LanguageClient,
+    Position,
 } from "vscode-languageclient";
 import { IFeature } from "../feature";
 import { ILogger } from "../logging";
@@ -193,8 +194,24 @@ export class FoldingProvider implements vscode.FoldingRangeProvider {
         // If the grammar hasn't been setup correctly, return empty result
         if (this.powershellGrammar == null) { return []; }
 
-        // Convert the document text into a series of grammar tokens
-        const tokens: ITokenList = this.powershellGrammar.tokenizeLine(document.getText(), null).tokens;
+        // Tokenize each line and build up an array of document-wide tokens.
+        // Note that line endings (CRLF/LF/CR) have interpretation issues, so don't
+        // tokenize an entire document if the line endings are variable.
+        const tokens: ITokenList = new Array<IToken>();
+        let tokenizationState = null;
+        for (let i = 0; i < document.lineCount; i++) {
+            const result = this.powershellGrammar.tokenizeLine(document.lineAt(i).text, tokenizationState);
+            const offset = document.offsetAt(new vscode.Position(i, 0));
+
+            for (const item of result.tokens) {
+                // Add the offset of the line to translate a character offset into
+                // a document-based index
+                item.startIndex += offset;
+                item.endIndex += offset;
+                tokens.push(item);
+            }
+            tokenizationState = result.ruleStack;
+        }
 
         // Parse the token list looking for matching tokens and return
         // a list of LineNumberRange objects. Then filter the list and only return matches
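Outside of VS Code, the shape of the new loop can be shown with plain strings. The sketch below is a simplified, hypothetical model rather than the extension's code: `fakeTokenizeLine` stands in for the grammar's `tokenizeLine`, and a running `offset` replaces `document.offsetAt(new vscode.Position(i, 0))`. It demonstrates the two essentials of the change: line-local `startIndex`/`endIndex` values are shifted into document-wide indices, and the per-line `ruleStack` state is threaded from one line to the next.

```typescript
// Hypothetical stand-ins for the token interfaces used by the folding provider.
interface IToken { startIndex: number; endIndex: number; scopes: string[]; }
interface ILineResult { tokens: IToken[]; ruleStack: unknown; }

// Fake single-line tokenizer: returns one token spanning the whole line. A real
// TextMate grammar would return many tokens plus a meaningful rule stack.
function fakeTokenizeLine(line: string, state: unknown): ILineResult {
    return {
        tokens: [{ startIndex: 0, endIndex: line.length, scopes: ["source.powershell"] }],
        ruleStack: state,
    };
}

function tokenizeDocument(text: string): IToken[] {
    const tokens: IToken[] = [];
    const lines = text.split(/\r\n|\r|\n/);   // the terminator is never given to the grammar
    let state: unknown = null;
    let offset = 0;                           // document offset of the current line's first character

    for (const line of lines) {
        const result = fakeTokenizeLine(line, state);
        for (const token of result.tokens) {
            // Shift line-local indices into document-wide indices.
            token.startIndex += offset;
            token.endIndex += offset;
            tokens.push(token);
        }
        state = result.ruleStack;
        // Assumes a one-character line break for simplicity; the real code asks
        // document.offsetAt for the next line's start, which handles CRLF too.
        offset += line.length + 1;
    }
    return tokens;
}

console.log(tokenizeDocument("#region Helpers\nGet-Item\n#endregion"));
```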
