QuoteUnpicker.unpickleExpr and QuoteUnpicker.unpickleType are slow #12228
Is it possible to get a wider flame graph or an interactive one? It is impossible to see the method names. Can you also create a small self-contained example that shows the same issue?
It would also be quite useful to see the timings of the code within those calls, and to know what the proportions are between them.
I'll try to get some more detailed information about what is happening inside of unpickleExpr/unpickleType.
Also, do you have JProfiler? If you do, I can send you the saved JProfiler session so that you can explore it interactively.
Here is a bit more info. I have colorized unpickle (pink), unpickleTerm (red), and spliceTerms (purple). I didn't find spliceTypes anywhere in the chart. It seems QuoteUnpickler.unpickle is the one that takes the most time. I can send you the profiling session separately if you'd like.
Hopefully, this provides some additional useful information. Please let me know if you need anything else.
It seems to read a very large number of names. We should analyze which names are serialized when we pickle quotes and whether we can cut the number somehow.
One thing we should be able to do is cache the result of …
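As a rough illustration of that caching idea, here is a minimal sketch. Everything in it is an assumption: `Tree` stands in for whatever the compiler actually produces, and `UnpickleCache` is a hypothetical name, not a real compiler API.

```scala
import scala.collection.concurrent.TrieMap

// Hypothetical sketch: memoize the result of an expensive unpickling step,
// keyed by the pickled payload string (which already exists in the bytecode).
final class UnpickleCache[Tree]:
  private val cache = TrieMap.empty[String, Tree]

  // On a miss, run the real unpickling (`compute`); on a hit, reuse the tree.
  def cached(pickled: String)(compute: => Tree): Tree =
    cache.getOrElseUpdate(pickled, compute)
```

Whether a cached tree can be reused directly across expansions, or needs fresh symbols each time, is exactly the granularity question raised later in this thread.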
@deusaquilus we will need a small self-contained example that is representative of the kind of code you have and that exhibits the performance latency. This will be used for our tests and benchmarks, to ensure we properly fix the issue and do not regress in the future.
Yes! I think that would be really good. In order to make a self-contained example, I need to implement at least a Quill AST and an Unlifter, and maybe also a Quoter and a Parser. It would basically be a mini-Quill implementation, so it will have to be more of a “medium-sized” self-contained example.
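For a sense of scale, such a mini-Quill could be built around a small AST plus an unlifter that reconstructs AST values from quoted code at macro-expansion time (which is where the unpickling cost shows up). This is a purely hypothetical sketch; none of these names are Quill's actual ones.

```scala
import scala.quoted.*

// Hypothetical mini-Quill AST: just enough shape to exercise quoting.
enum Ast:
  case Entity(name: String)
  case Property(base: Ast, name: String)
  case Filter(query: Ast, alias: String, body: Ast)

// An "unlifter" turns quoted Ast-constructing expressions back into
// plain Ast values inside the macro.
object Unlifter:
  def unlift(expr: Expr[Ast])(using Quotes): Ast = expr match
    case '{ Ast.Entity(${ Expr(n) }) } => Ast.Entity(n)
    case '{ Ast.Property($base, ${ Expr(n) }) } =>
      Ast.Property(unlift(base), n)
    case other =>
      quotes.reflect.report.errorAndAbort(s"Unsupported Ast shape: ${other.show}")
```

A real benchmark would drive this through nested quotes and splices so that every expansion pays the unpickling cost being measured.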
Otherwise, do you know which quote takes the longest time to unpickle in your profile?
After doing some tests, I found the following for unpickling:
- Warm compiler
- Warm compiler (cached)
- Cold compiler
- Cold compiler (cached)
There is still the question of the granularity of the cache. Should we have a maximum size for it? Keys should be cheap, as they contain strings that already exist in the bytecode. On the other hand, cached trees might cause some memory pressure.
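If a maximum size were wanted, one standard shape is an access-ordered LRU map. This is a sketch under the assumption that keys are the cheap pickled strings and values are the unpickled trees; `BoundedCache` and its API are hypothetical, not anything in the compiler.

```scala
import java.util.{LinkedHashMap, Map as JMap}

// Hypothetical sketch of a size-bounded (LRU) cache for unpickled trees.
// An access-ordered LinkedHashMap evicts the least recently used entry
// once the configured maximum size is exceeded.
final class BoundedCache[K, V](maxSize: Int) {
  private val underlying =
    new LinkedHashMap[K, V](16, 0.75f, /*accessOrder=*/ true) {
      override def removeEldestEntry(eldest: JMap.Entry[K, V]): Boolean =
        size() > maxSize
    }

  // Synchronized because LinkedHashMap is not thread-safe.
  def getOrElseUpdate(key: K)(compute: => V): V = underlying.synchronized {
    val existing = underlying.get(key)
    if existing != null then existing
    else { val v = compute; underlying.put(key, v); v }
  }
}
```

The `maxSize` here is the kind of knob that could plausibly be wired to a compiler argument for constrained CI environments.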
The quotation that this was benchmarked from was this: `val result = testContext.run(liftQuery(insertValues).foreach(e => insert(e)))`. This was from … I think the granularity of the cache should be at least per macro expansion, if not even higher; in Quill, the results of the LiftMacro and the QuoteMacro are frequently reused. Any lower granularity would, I think, be totally useless for Quill's use-case. In terms of memory size, I think there should be no limit by default; memory is fairly cheap these days. Maybe there should be a compiler arg that sets this value if needed. That might be useful for builds in CI environments, where resources are more constrained.
@deusaquilus I think I can now create a benchmark from scratch. It will be simpler to tune and diagnose.
@nicolasstucki Please have a look at this. I have created a miniature Quill implementation of ~270 lines of code that represents the majority of Quill's use case. I have also created a file that represents some large entities and join clauses that are characteristic of the kinds of things I see in the corporate world (or at least as much as they can be with such a small implementation). This macro takes ~1 second to compile on a warm JVM and 5+ seconds on a cold JVM, and it spends the vast majority of that time inside … I also have a smaller query that is not a good benchmarking example, but it is a good reference for what the most basic kind of query looks like. Please, please have a look.
Also, please keep in mind that the queries you see in …
Also, feel free to play around with the number of parameters in the LargeModel object if that helps at all. If this is a benchmark you want to run hundreds of times, it might be useful to decrease them. That will, however, most likely make the unpickler look like a smaller part of the latency.
Speed improvements after #12242 can be seen here: https://dotty-bench2.epfl.ch/12242/
Compiler version
3.0.0-RC3
I am getting bottlenecks in the ~700 ms range from unpickleExpr and unpickleType (blue and light blue in the chart below) for the Proto-Quill parser and expr-accumulation, for a simple query with 20 parameters. I can't accommodate large nested queries with this kind of latency.
Are there any tricks I can use to optimize unpickling?
Are there any compiler args I can use to make unpickling faster or to disable it for certain parts of the code?
Also, if you want more detailed information in the flame graph, let me know which packages you'd like me to instrument and filter for (since I can't instrument the entire Dotty).