-
Notifications
You must be signed in to change notification settings - Fork 130
Speed up line/column in OffsetPosition #57
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Building the index again and again for every OffsetPosition instance makes very little sense. So this makes it build it only once for a given source. I'm not sure if this is the best way to implement it and I'm afraid that a global synchronized map may be a performance bottleneck once the number of processors goes into the hundreds, but I know very little Scala/Java to do this properly. :-( Fixes http://stackoverflow.com/questions/14707127/accessing-position-information-in-a-scala-combinatorparser-kills-performance
This should scale much better.
Hmm, I like the In retrospect I would have preferred having the index on the This LGTM too, I just want to test the difference in performance before merging... |
To me, it also makes more sense to have the index on the |
Yeah, I think that made more sense to me as well, but since I hoped to have this in 1.0.x, I left that idea immediately. :-) (And it took me a while now to figure out how to implement it correctly. Just using a lazy val inside {CharSequnce,PagedSeq}Reader and override lazy val to pass it onto the next instance would leak memory, so we'd need a separate class to hold the lazy val and pass that around. That's probably just as well since we wouldn't want to duplicate code between the Readers. I can maybe prepare a PR for 1.1 if there's interest.) |
Which reminds me that I created this pull request against master instead of 1.0.x. Not that I really care where it ends as long as there's a release that I can put in my build.sbt. :-) |
@liskin good point, could you reopen a PR with this targeting 1.0.x? I'll release version 1.0.5 next week or so, in time for the scala 2.11.8 release. It would be great to include this. |
Sure, here: #68. (Although I must say I find it mildly annoying that I need to have one additional interaction with github for what would have been a simple local git merge for you.) Anyway, I'm glad there's going to be a release. Are there plans for releasing 1.1.x as well? We'd like to use the |
Well, I guess I can release 1.1.0 soon too. I was originally hoping SI-8472 would somehow go away before then (it also applies to modules like parser-combinators, which makes running 1.1.0 in the 2.11.x REPL a problem...), but that's unlikely. |
So is 1.1.0 going to have the index in the CharSequenceReader? |
Well, I think I'll try to do that, as I just found the sad truth of ThreadLocals: whatever is stored in the ThreadLocal holds a reference to the class loader, which in turns holds references to all the static objects, and our static object holds a reference to the ThreadLocal, so there's a cycle with no soft/weak reference in it and thus the static object and the class loader and all the other classes and objects and whatnot never go away, and we have a leak. Not as serious as the one in SI-9010/SI-4929, this only happens when rotating classloaders (tests in sbt, updating a war in container like tomcat/jetty), but it's still nothing to be proud of. :-( |
(And yes, this does cause sbt to crash on out of Metaspace a few times a day in its default settings.) |
Building the index again and again for every OffsetPosition instance makes
very little sense. So this makes it build it only once for a given source.
I'm not sure if this is the best way to implement it and I'm afraid that a
global synchronized map may be a performance bottleneck once the number of
processors goes into the hundreds, but I know very little Scala/Java to do
this properly. :-(
Fixes
http://stackoverflow.com/questions/14707127/accessing-position-information-in-a-scala-combinatorparser-kills-performance