Skip to content
This repository was archived by the owner on Jan 19, 2019. It is now read-only.

Discussion: Attach Comments Implementation/Outstanding Issues #34

Closed
wants to merge 1 commit into from
Closed

Discussion: Attach Comments Implementation/Outstanding Issues #34

wants to merge 1 commit into from

Conversation

JamesHenry
Copy link
Member

@JamesHenry JamesHenry commented Mar 23, 2016

Description Updated: Saturday, 16 July

Original Idea

  • Parse and convert TypeScript comment tokens into Esprima comment tokens, in a similar way to how Acorn tokens are converted in Espree
  • Utilise Espree's logic for attaching comments (found in its own attach-comment.js)

Initial roadblock:

  • TypeScript does not store comment "trivia" on its AST, and does not provide a hook like Acorn.onComment to allow us to collect comment tokens as we parse the rest of the AST

When first digging through the TypeScript documentation, the solution would appear to be to use the TypeScript helpers ts.getLeadingCommentRanges() and ts.getTrailingCommentRanges() on each node as it is parsed and converted.

However, it turns out that will not work for some of our more "interesting" test case sources, such as:

surrounding-while-loop-comments.src.js

function f() { /* infinite */ while (true) { } /* bar */ var each; }

Main Issue:

TypeScript has a fundamentally different approach to trivia ownership.

From the TypeScript Wiki:

We follow Roslyn's notion of trivia ownership for comment ownership. In general, a token owns any trivia after it on the same line up to the next token. Any comment after that line is associated with the following token.

So, in the surrounding-while-loop-comments.src.js example, the fact that there is no newline before /* infinite */ means it is not classed as a leadingComment for the WhileStatement.

surrounding-while-loop-comments.src.js - NO NEWLINE

function f() { /* infinite */ while (true) { } /* bar */ var each; }

Nothing returned from ts.getLeadingCommentRanges() for WhileStatement node

surrounding-while-loop-comments.src.js - NEWLINE

function f() {
/* infinite */ while (true) { } /* bar */ var each; }

/* infinite */ returned from ts.getLeadingCommentRanges() for WhileStatement node

Alternative Approach

  • Separately scan the source using TypeScript's createScanner() method, with skipTrivia set to false, so that we can identify comment tokens.
  • When identified, a TypeScript comment token is converted to an ESTreeToken, and that gets used to populate the extra.leadingComments and extra.trailingComments arrays in a very similar way to Espree.

Using this method to parse and convert, combined with Espree's original attach-comment logic, is where we are currently at with this PR.

We have 5 failing tests:

  • mix-line-and-block-comments
  • surrounding-while-loop-comments
  • switch-fallthrough-comment-in-function
  • switch-fallthrough-comment
  • switch-no-default-comment-in-nested-functions

The theme that runs through them is having too many comments attributed to a node in particular cases.

I believe this is down to the fundamental difference in how we are collecting and parsing comment tokens.

To reiterate, we are using the same attachment logic, but for token parsing:

  • Espree: Collects, converts and attaches the comment tokens over time, as the AST is being parsed by Acorn
  • This parser: First scans the source for all the comments and converts them, then parses the AST and attaches the comments as nodes are being converted

Proposed Solution

It seems like the only way to resolve these 5 remaining test cases is to have a dedicated implementation for attaching comments for this parser.

// TODO: Discuss - this approach doesn't seem right...
// In the test this Identifier should not have trailingComments so naively delete
// - Just because it doesn't in this test, doesn't mean it couldn't have them?
var functionId = convertChild(node.name);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm general, nodes don't have trailingComments or leadingComments if they are empty. If that's a pain, we could try leaving them for now.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just to clarify, what do you mean by an "empty" node here?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If trailingComments is an array of zero length, then we just removed trailingComments from the node. Same for leadingComments.

@nzakas
Copy link
Member

nzakas commented Mar 24, 2016

Nothing looks to hacky to me, nice job so far. The switch fall though tests are definitely important because they affect the ESLint no-fallthrough rule.

@JamesHenry
Copy link
Member Author

Ok, cool. As I mentioned, I am moving flat this weekend so I will not be able to pick up on this again for a while. If you have any further thoughts please let me know, otherwise I will just continue as and when I can.

@nzakas
Copy link
Member

nzakas commented Mar 24, 2016

👍 Good luck moving. I may or may not try to pick this up in the meantime, depending on how I'm feeling.

@nzakas
Copy link
Member

nzakas commented Apr 21, 2016

FYI: There were some bugs in the Espree comment attachment code that just got fixed. You may want to re-port that logic and the associated tests to double-check.

@JamesHenry
Copy link
Member Author

Looping back to this at last!

Merging in the espree changes you mentioned has indeed changed things - it has upped the failing test count from 3 to 5 :/

I feel like I am back up to speed on this now, and will find time to properly attack it later in the week.

@nzakas
Copy link
Member

nzakas commented Jul 8, 2016

Great, thanks!

@eslintbot
Copy link

LGTM

@JamesHenry
Copy link
Member Author

Updated the PR to include the latest changes from master, working on remaining issues now

@JamesHenry
Copy link
Member Author

I have done quite a lot of digging today, and am going to update my original description on this PR to note what I have found.

Unfortunately, it strongly seems like we will need to reimplement attach-comment.js specifically for this parser, as we cannot replicate how it is done in Espree (down to fundamental differences between TypeScript Compiler and Acorn).

@eslintbot
Copy link

LGTM

@DanielRosenwasser
Copy link

I am not 100% familiar with the code, but I read the writeup I think that's probably a good approach. @mhegazy might have some thoughts too.

If you didn't want to hold each of the comments all at once before the ESTree tree is constructed, you would also be able to scan and re-populate comments starting from each node's full start, though I don't know what sorts of savings you'd see if any.

@nzakas
Copy link
Member

nzakas commented Jul 26, 2016

In the interest of moving this forward, why don't we simplify and just make sure that that Program.comments array is properly filled in. I opened an issue (eslint/eslint#6724) on ESLint to automatically calculate leading/trailing comments rather than requiring the parser to do it. We can focus some attention on that to make this side of the equation a bit easier.

@JamesHenry
Copy link
Member Author

JamesHenry commented Jul 26, 2016

That's a great suggestion on how to unblock this most quickly, thanks @nzakas.

I believe we are just about there with fulfilling the mandate of a correct Program.comments array, I will just need to adapt the attach-comment tests to not evaluate the full AST and submit for your review.

@JamesHenry
Copy link
Member Author

Closing in favour of #49

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants