Skip to content

Performance regression with Error Message classes #1691

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
odersky opened this issue Nov 9, 2016 · 10 comments
Closed

Performance regression with Error Message classes #1691

odersky opened this issue Nov 9, 2016 · 10 comments
Labels
area:reporting Error reporting including formatting, implicit suggestions, etc itype:question

Comments

@odersky
Copy link
Contributor

odersky commented Nov 9, 2016

I have noted that the introduction of error message classes introduces potentially serious performance regressions.

Essentially, the model is that errors can appear very frequently because they are part of retracted backtracking. So even if the program compiles OK it could have generated 10's of thousands of error messages. Previously, that was not a problem because everything in diagnostics was by-name and got forced only when the error got printed. But the new error messages do not behave this way.

E.g. MissingIdentifier takes a by-value string, which is passed a name.show. show is an expensive operation. This should not be called everytime we generate an error.

I think as a first approximation every string in error messages should be by-name, and we should make sure the string is not forced in the initialization code of the message.

@felixmulder felixmulder added area:reporting Error reporting including formatting, implicit suggestions, etc itype:question labels Nov 9, 2016
@felixmulder
Copy link
Contributor

When redesigning the messages, we wanted to keep the laziness - but since we're no longer dealing with only strings, this becomes more complicated.

Whether it is forced or not depends a bit on the behavior of the Reporting trait. As you can see, all messages are taken by value and perhaps it is the following line that actually does the forcing, it could perhaps be made to take () => MessageContainer if it is being forced there...I'll take a closer look at the behavior tomorrow. :)

@Blaisorblade
Copy link
Contributor

Blaisorblade commented Nov 9, 2016

@felixmulder There's more code forcing by-name Messages... In fact, I see several code paths that wrap by-value Message into by-name Message. That includes methods in Message that construct MessageContainers, and error utilities in Parsers (and maybe elsewhere). Also, () => MessageContainer seems redundant as MessageContainer holds Message by-name already.

So either Message should always be passed by-name, or all string arguments of all Messages should be by-name as @odersky suggests. Given the amount of messages and the fact they're supposed to be user-contributed (with good reason), the latter seems harder. Also, the first invariant can be enforced!

To enforce Message is always passed by-name, just forbid constructing Message outside of error reporting: add assert(inErrorReporting) to Message's constructor, where inErrorReporting is only true when errors are actually reported (say, in reporter.report if the error isn't hidden, or sth. like that).

Methods in Message that construct MessageContainers could maybe be turned into extension methods if implicit class A(a: => String) { ... defs ... } preserves non-strictness reliably.

@Blaisorblade
Copy link
Contributor

On a related topic: what's the chance of memory leaks due to laziness? To make this question more interesting, assume Dotty is used in an IDE that tries to preserve MessageContainer (which has no docs discouraging that), and that some message accidentally captures an enclosing class. That's probably the problem spores are created for.

Also remember that MessageContainer stores msgFn and doesn't clear it's forced by contained—a lazy val would do better (ignoring synchronization), since scalac does the right thing on

class Foo(byNameMsg: => String) {
  lazy val byLazyValMsg = byNameMsg
}

in particolar, the initializer for byLazyValMsg sets byNameMsg to null.

@felixmulder
Copy link
Contributor

@Blaisorblade - it's way too late in the evening for me to start with this, hence my short response.

I have an idea of how to deal with this, it is sort of along the lines of what you suggest but without asserts (hopefully). The closurizing is not really redundant as this is where the by-name breaks for new messages since the suggested placement is where Message gets turned into MessageContainer - the latter which is a java interface and consumed by the Reporter. This interface also poses a barrier for a completely fresh implementation since it is depended on by outside tooling.

We'll get there, don't worry 🌞

@Blaisorblade
Copy link
Contributor

@Blaisorblade - it's way too late in the evening for me to start with this, hence my short response.

Of course — I didn't mean to criticize for the incompleteness.

The closurizing is not really redundant as this is where the by-name breaks for new messages since the suggested placement is where Message gets turned into MessageContainer - the latter which is a java interface and consumed by the Reporter

FWIW MessageContainer is defined in Scala, takes messages by-name and inherits from the Java Diagnostic interface. I think the by-name is forced at the location you linked because it's a method receiver. Or maybe I miss some other reason. But no need to debate this now.

@smarter
Copy link
Member

smarter commented Nov 9, 2016

in particolar, the initializer for byLazyValMsg sets byNameMsg to null.

That's interesting, I wasn't aware of that. Is this documented anywhere? Dotty does not seem to do that (ping @DarkDimius) which may lead to surprising new memory leaks when porting an existing codebase.

@Blaisorblade
Copy link
Contributor

@smarter Answered in a new issue (with some more info).

@Blaisorblade
Copy link
Contributor

PR for this in #1696.

@nicolasstucki
Copy link
Contributor

Can we close this issue?

@smarter
Copy link
Member

smarter commented Dec 29, 2017

Yes, Felix fixed it :).

@smarter smarter closed this as completed Dec 29, 2017
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area:reporting Error reporting including formatting, implicit suggestions, etc itype:question
Projects
None yet
Development

No branches or pull requests

5 participants