Skip to content

Give position to symbols when unpickling #3525

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 8 commits into from
Dec 9, 2017

Conversation

smarter
Copy link
Member

@smarter smarter commented Nov 21, 2017

I started working on this to fix a crash when compiling the collection-strawman contrib tests. I minimized it to anonBridge.scala (in the collection-strawman, unpickling happened because of inline def, here the test is in from-tasty so it will be compiled from an unpickled tree). Eventually, I would also like to use the unpickled symbol positions in the IDE, but this will require more work. (Right now, going from an arbitrary Position to a SourcePosition is annoying, it requires creating a SourceFile which requires converting the source file content to an array of bytes).

Copy link
Member

@dottybot dottybot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hello, and thank you for opening this PR! 🎉

All contributors have signed the CLA, thank you! ❤️

Commit Messages

We want to keep history, but for that to actually be useful we have
some rules on how to format our commit messages (relevant xkcd).

Please stick to these guidelines for commit messages:

  1. Separate subject from body with a blank line
  2. When fixing an issue, start your commit message with Fix #<ISSUE-NBR>:
  3. Limit the subject line to 72 characters
  4. Capitalize the subject line
  5. Do not end the subject line with a period
  6. Use the imperative mood in the subject line ("Added" instead of "Add")
  7. Wrap the body at 80 characters
  8. Use the body to explain what and why vs. how

adapted from https://chris.beams.io/posts/git-commit

Have an awesome day! ☀️

@smarter
Copy link
Member Author

smarter commented Nov 21, 2017

test performance please

@dottybot
Copy link
Member

performance test scheduled: 1 job(s) in queue, 0 running.

@smarter smarter requested a review from odersky November 21, 2017 22:24
@dottybot
Copy link
Member

Performance test finished successfully:

Visit http://dotty-bench.epfl.ch/3525/ to see the changes.

Benchmarks is based on merging with master (1d24b19)

Copy link
Contributor

@odersky odersky left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Otherwise LGTM

@@ -93,6 +93,7 @@ class ScalaSettings extends Settings.SettingGroup {
val YdetailedStats = BooleanSetting("-Ydetailed-stats", "show detailed internal compiler stats (needs Stats.enabled to be set to true).")
val Yheartbeat = BooleanSetting("-Ydetailed-stats", "show heartbeat stack trace of compiler operations (needs Stats.enabled to be set to true).")
val Yprintpos = BooleanSetting("-Yprintpos", "show tree positions.")
val YprintposSyms = BooleanSetting("-Yprintpos-syms", "show symbol definitions positions.")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since -Yprintpos is about to change to Yprint-pos, I guess we also need an additional hypen here: -Yprint-pos-syms.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, in which PR is Yprintpos changed to Yprint-pos ?

@@ -437,7 +437,7 @@ object Symbols {

/** Does this symbol come from a currently compiled source file? */
final def isDefinedInCurrentRun(implicit ctx: Context): Boolean = {
pos.exists && defRunId == ctx.runId
pos.exists && defRunId == ctx.runId && sourceFile == ctx.source.file
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's not the right condition anymore. See for instance how it is used in checkNoConflict. We want to flag top-level definitions in other source files if they clash with a definition in the current source file.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could probably use as third criterion:

 { val file = assocatedFile
   file != null && !file.path.endsWith(".class")
 }

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm that's tricky because associatedFile forces things, I'll give it a try.

@@ -1155,7 +1155,8 @@ object Parsers {
t
}

def ascription(t: Tree, location: Location.Value) = atPos(startOffset(t), in.skipToken()) {
def ascription(t: Tree, location: Location.Value) = atPos(startOffset(t)) {
in.skipToken()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why the change? The previous implementation would place the ^ under the : of an ascription, which seems to be what we want, no?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In all other cases, the point is under the first letter of the name of the definition (e.g. for def foo: Int = 1 the point is under f). This change means that I can always use the point of the definition tree to set the symbol position.

isSimilar(d.symbol.sourceFile.path, ctx.source.file.path),
s"private ${d.symbol.showLocated} in ${d.symbol.sourceFile} accessed from ${ctx.owner.showLocated} in ${ctx.source.file}")
isSimilar(d.symbol.sourceFile.path, ctx.owner.sourceFile.path),
s"private ${d.symbol.showLocated} in ${d.symbol.sourceFile} accessed from ${ctx.owner.showLocated} in ${ctx.owner.sourceFile}")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When would this make a difference?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When unpickling from tasty, ctx.source.file will be "Foo.class" whereas ctx.owner.sourceFile will be the source file name found in the SourceFile annotation of the tasty tree.

@smarter
Copy link
Member Author

smarter commented Nov 28, 2017

@odersky There's another difficulty with isDefinedInCurrentRun: If we're compiling from TASTY then associatedFile may very well end with .class but isDefinedInCurrentRun should return true. The only solution I see here is doing a check like ctx.runInfo.units.contains(associatedFile), but that requires adding units to RunInfo (currently it's only in Run which isn't accessible from Context), what do you think?

@smarter
Copy link
Member Author

smarter commented Nov 30, 2017

Alternative proposal to make isDefinedInCurrentRun easy to define: track whether a symbol comes from a compilation unit when the symbol is created. This could be done by giving a special meaning to one bit of the Coord class

@smarter smarter force-pushed the tasty-symbols-pos branch 3 times, most recently from 55506fc to 908c1c9 Compare December 1, 2017 14:23
@smarter
Copy link
Member Author

smarter commented Dec 1, 2017

test performance please

@smarter
Copy link
Member Author

smarter commented Dec 1, 2017

@odersky Could you review the last commit please?

@dottybot
Copy link
Member

dottybot commented Dec 1, 2017

performance test scheduled: 1 job(s) in queue, 0 running.

@dottybot
Copy link
Member

dottybot commented Dec 1, 2017

Performance test finished successfully:

Visit http://dotty-bench.epfl.ch/3525/ to see the changes.

Benchmarks is based on merging with master (dc52e6f)

@smarter
Copy link
Member Author

smarter commented Dec 6, 2017

Ping for review of the last commit

@odersky
Copy link
Contributor

odersky commented Dec 6, 2017

I am worried that this seems to cause a slowdown of 6% for Dotty itself. What could be the cause of this?

@odersky
Copy link
Contributor

odersky commented Dec 6, 2017

Last commit LGTM, but it would be good to drill down on the performance issue.

@smarter
Copy link
Member Author

smarter commented Dec 6, 2017

test performance please

@dottybot
Copy link
Member

dottybot commented Dec 6, 2017

performance test scheduled: 1 job(s) in queue, 0 running.

@odersky
Copy link
Contributor

odersky commented Dec 6, 2017

I am still not convinced this is the right thing to do. We add a lot more positions to Tasty files which makes them larger. Symbols get larger too. It seems this has a significant performance implications. So what exactly are we trying to solve here? Can't this be obtained more cheaply?

If it's just the inlining, it would be easy to fix with an additional pass when we create the Inline Body annotation.

@smarter
Copy link
Member Author

smarter commented Dec 6, 2017

We add a lot more positions to Tasty files which makes them larger.

Not at all! The positions of symbols are derived from the position of trees, so we're not storing anything more.

Symbols get larger too.

I don't see how, symbols already had a coord field. It's just that for many of them it was set to NoCoord previously.

It seems this has a significant performance implications.

I think the benchmark result was a fluke, we'll see. If there are performance issues I'll fix them.

So what exactly are we trying to solve here?

Unpickled trees should be as close to their pre-pickling state as possible. This reduces the number of surprises when compiling from tasty since we're using the same code paths than when compiling from sources. Positions for symbols may also allow for better error messages, and could be used in the IDE to reduce the number of tree traversals done for operations such as jump-to-definition (the current PR doesn't do this because it's not trivial).

@dottybot
Copy link
Member

dottybot commented Dec 6, 2017

Performance test finished successfully:

Visit http://dotty-bench.epfl.ch/3525/ to see the changes.

Benchmarks is based on merging with master (f92f278)

@smarter
Copy link
Member Author

smarter commented Dec 6, 2017

Hmm it looks like we have no new benchmarks recorded for master since 20/11 (see http://dotty-bench.epfl.ch), so it's difficult to measure the effect of this PR. Ping @liufengyun

@dottybot
Copy link
Member

dottybot commented Dec 6, 2017

performance test scheduled: 1 job(s) in queue, 0 running.

@dottybot
Copy link
Member

dottybot commented Dec 6, 2017

Performance test finished successfully:

Visit http://dotty-bench.epfl.ch/3525/ to see the changes.

Benchmarks is based on merging with master (7848ef6)

And add it to the set of options passed to the pickling tests so that we
can start checking that symbol positions are properly pickled and unpickled.
Previously, unpickled symbols always had position NoCoord, now their
position is set based on the corresponding tree position. This does not
always match the position used when they were pickled so `testPickling`
currently fails, this is fixed in the subsequent commits of this PR.

Symbol#isDefinedInCurrentRun is broken by this commit and fixed in a
latter commit in this PR.
A symbol is created before the corresponding tree is unpickled,
so we cannot rely on initialPos to set the symbol position.
Instead, we always pickle the position of definitions.

In practice, this means that we now always pickle the position of all
`TypeDef` and `Bind` (the other `DefTree`s are all `WithLazyField` and
thus already had their positions pickled).
The point of a definition should be the position of the first letter in
the name of a definition. For a `Bind` node this is just the
start (previously we were using the position of the `:`).
Various changes so that `testPickling` with `-Yprintpos-syms` now passes.

It's not always obvious what position to use when new trees are created,
here are the principles we used here:
- When a definition tree is replaced by another definition tree (for
  example, in `ParamForwarding`), the new tree position is set to the
  old tree position.
- When a reference tree is replaced by a definition tree representing an
  accessor and a new reference tree that calls the accessor (for example
  in `SuperAccessors`), the accessor position is set to the old
  tree *point*, and the new reference tree position is set to the old
  tree *position*. This makes sense because we already use zero-extent
  positions for compiler artifacts that the IDE should ignore and an
  accessor is a compilation artifact.
When compiling from TASTY, `ctx.source.file` will be the TASTY file
corresponding to the current compilation unit, this will never match
`d.symbol.sourceFile` whose value is set from the `SourceFile`
annotation in the TASTY tree. Fixed by using `ctx.owner.sourceFile`
instead.

The added test, `anonBridge.scala` needs this fix to work. It also needs
the previous work done in this PR to give positions to unpickled
symbols, otherwise it crashes in `Typer#assertPositioned` because
`Bridges#addBridge` creates a tree whose position comes from a symbol.
And add a test to make sure it's working. The test required a change to
the testing infrastructure: files in compilation groups should be sorted
alphabetically, otherwise we cannot know in advance if the error will
happen in A1.scala or A2.scala
@smarter
Copy link
Member Author

smarter commented Dec 6, 2017

Oops, used the wrong syntax to test multiple commits, trying again...

@smarter
Copy link
Member Author

smarter commented Dec 6, 2017

test performance please 7848ef6 6b19932

@dottybot
Copy link
Member

dottybot commented Dec 6, 2017

performance test scheduled: 1 job(s) in queue, 1 running.

@dottybot
Copy link
Member

dottybot commented Dec 6, 2017

Performance test finished successfully:

Visit http://dotty-bench.epfl.ch/3525/ to see the changes.

Benchmarks is based on merging with master (7848ef6)

@smarter
Copy link
Member Author

smarter commented Dec 6, 2017

@liufengyun Looks like testing multiple commits doesn't work anymore, or did I get the syntax wrong?

@liufengyun
Copy link
Contributor

liufengyun commented Dec 6, 2017 via email

@odersky
Copy link
Contributor

odersky commented Dec 7, 2017

OK, I see the rationale now, and it seems performance implications are smaller than first feared. So I am OK to get this in. But maybe finish the performance evaluation first.

@smarter
Copy link
Member Author

smarter commented Dec 9, 2017

test performance please; 18767e0 6b19932

@dottybot
Copy link
Member

dottybot commented Dec 9, 2017

performance test scheduled: 1 job(s) in queue, 0 running.

@liufengyun
Copy link
Contributor

@smarter Sorry, it's colon instead semicolon, I've a little problem with the punctuation names.

@smarter
Copy link
Member Author

smarter commented Dec 9, 2017

I tried using colon in #3525 (comment) but it didn't seem to work either ?

@liufengyun
Copy link
Contributor

Let's try a new curve and see:

retest performance please: 18767e0 6b19932

@liufengyun
Copy link
Contributor

retest performance please: 18767e0 6b19932

@dottybot
Copy link
Member

dottybot commented Dec 9, 2017

performance test scheduled: 3 job(s) in queue, 1 running.

@dottybot
Copy link
Member

dottybot commented Dec 9, 2017

Performance test finished successfully:

Visit http://dotty-bench.epfl.ch/3525/ to see the changes.

Benchmarks is based on merging with master (18767e0)

@smarter smarter merged commit fd861b0 into scala:master Dec 9, 2017
@allanrenucci allanrenucci deleted the tasty-symbols-pos branch December 9, 2017 14:55
@dottybot
Copy link
Member

dottybot commented Dec 9, 2017

Performance test finished successfully:

Visit http://dotty-bench.epfl.ch/3525-1/ to see the changes.

Benchmarks is based on merging with master (18767e0)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants