Skip to content

Speed up line/column in OffsetPosition #57

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 2 commits into from
Closed

Speed up line/column in OffsetPosition #57

wants to merge 2 commits into from

Conversation

liskin
Copy link
Contributor

@liskin liskin commented Jul 3, 2015

Building the index again and again for every OffsetPosition instance makes
very little sense. So this makes it build it only once for a given source.

I'm not sure if this is the best way to implement it and I'm afraid that a
global synchronized map may be a performance bottleneck once the number of
processors goes into the hundreds, but I know very little Scala/Java to do
this properly. :-(

Fixes
http://stackoverflow.com/questions/14707127/accessing-position-information-in-a-scala-combinatorparser-kills-performance

Building the index again and again for every OffsetPosition instance makes
very little sense.  So this makes it build it only once for a given source.

I'm not sure if this is the best way to implement it and I'm afraid that a
global synchronized map may be a performance bottleneck once the number of
processors goes into the hundreds, but I know very little Scala/Java to do
this properly. :-(

Fixes http://stackoverflow.com/questions/14707127/accessing-position-information-in-a-scala-combinatorparser-kills-performance
@gourlaysama gourlaysama self-assigned this Jul 27, 2015
@gourlaysama
Copy link
Contributor

Hmm, I like the WeakHashMap version very much, that seems like a good comprise.

In retrospect I would have preferred having the index on the CharSequenceReader side and having OffsetPosition retain a reference to that instead of the source, but this can't be changed without breaking bincompat (if it is even a good thing to do...).

This LGTM too, I just want to test the difference in performance before merging...

@TomasMikula
Copy link

To me, it also makes more sense to have the index on the CharSequenceReader.

@liskin
Copy link
Contributor Author

liskin commented Sep 8, 2015

Yeah, I think that made more sense to me as well, but since I hoped to have this in 1.0.x, I left that idea immediately. :-)

(And it took me a while now to figure out how to implement it correctly. Just using a lazy val inside {CharSequnce,PagedSeq}Reader and override lazy val to pass it onto the next instance would leak memory, so we'd need a separate class to hold the lazy val and pass that around. That's probably just as well since we wouldn't want to duplicate code between the Readers. I can maybe prepare a PR for 1.1 if there's interest.)

@liskin
Copy link
Contributor Author

liskin commented Sep 8, 2015

Which reminds me that I created this pull request against master instead of 1.0.x. Not that I really care where it ends as long as there's a release that I can put in my build.sbt. :-)

@gourlaysama
Copy link
Contributor

@liskin good point, could you reopen a PR with this targeting 1.0.x?

I'll release version 1.0.5 next week or so, in time for the scala 2.11.8 release. It would be great to include this.

@liskin
Copy link
Contributor Author

liskin commented Sep 11, 2015

Sure, here: #68.

(Although I must say I find it mildly annoying that I need to have one additional interaction with github for what would have been a simple local git merge for you.)

Anyway, I'm glad there's going to be a release. Are there plans for releasing 1.1.x as well? We'd like to use the ~>! combinator. :-)

@liskin liskin closed this Sep 11, 2015
@gourlaysama
Copy link
Contributor

Well, I guess I can release 1.1.0 soon too. I was originally hoping SI-8472 would somehow go away before then (it also applies to modules like parser-combinators, which makes running 1.1.0 in the 2.11.x REPL a problem...), but that's unlikely.

@TomasMikula
Copy link

So is 1.1.0 going to have the index in the CharSequenceReader?

@liskin
Copy link
Contributor Author

liskin commented Sep 12, 2015

Well, I think I'll try to do that, as I just found the sad truth of ThreadLocals: whatever is stored in the ThreadLocal holds a reference to the class loader, which in turns holds references to all the static objects, and our static object holds a reference to the ThreadLocal, so there's a cycle with no soft/weak reference in it and thus the static object and the class loader and all the other classes and objects and whatnot never go away, and we have a leak. Not as serious as the one in SI-9010/SI-4929, this only happens when rotating classloaders (tests in sbt, updating a war in container like tomcat/jetty), but it's still nothing to be proud of. :-(

@liskin
Copy link
Contributor Author

liskin commented Sep 12, 2015

(And yes, this does cause sbt to crash on out of Metaspace a few times a day in its default settings.)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants