-
Notifications
You must be signed in to change notification settings - Fork 45
Pre-compile all python code and remove original .py files. #476
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
ErrorsWe can see what is lost because we've removed the Chances are errors are not going to be caused by our code and thus would just be a distraction to helping the user find the real origin of their error. Also, if they really want to see the line of code, it's all open sourced. RawPretty |
Dockerfile
Outdated
RUN rm ./python/lib/$runtime/site-packages/ddtrace/appsec/_iast/_taint_tracking/*.so | ||
RUN rm ./python/lib/$runtime/site-packages/ddtrace/appsec/_iast/_stacktrace*.so | ||
RUN rm ./python/lib/$runtime/site-packages/ddtrace/internal/datadog/profiling/libdd_wrapper.so | ||
RUN rm ./python/lib/$runtime/site-packages/ddtrace/internal/datadog/profiling/ddup/_ddup.*.so | ||
RUN rm ./python/lib/$runtime/site-packages/ddtrace/internal/datadog/profiling/stack_v2/_stack_v2.*.so | ||
|
||
RUN PYTHONNODEBUGRANGES=1 python -m compileall -o 2 -b ./python/lib/$runtime/site-packages |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could we add comments just to remind what's happening here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM – would like to see this in self monitoring for a while before we release, it might be amazing to see this changes go there!
8a73394
to
df7bad0
Compare
MAX_LAYER_COMPRESSED_SIZE_KB=$(expr 4 \* 1024) | ||
MAX_LAYER_UNCOMPRESSED_SIZE_KB=$(expr 13 \* 1024) | ||
MAX_LAYER_COMPRESSED_SIZE_KB=$(expr 5 \* 1024) | ||
MAX_LAYER_UNCOMPRESSED_SIZE_KB=$(expr 12 \* 1024) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Bytecode is harder to compress (hence increase from 4->5) but uncompressed is much smaller (hence decrease from 13->12)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sounds good! Could we update this changes in serverless-tools
once this is merged? Thanks!
583f37e
to
582e612
Compare
60a64f5
to
43947e9
Compare
What does this PR do?
Ships python layer with only pre-built bytecode binaries and none of the original
*.py
files.Motivation
For python 3.12, reduces cold start by >24%. Also reduces memory overhead and runtime duration by small but statistically significant amounts.
Similar overhead improvements are seen with python 3.9.
Additional Notes
Impact of prebuilding bytecode binaries
When the python interpreter starts up, it always first looks for any
*.pyc
files and if found, loads those instead of the*.py
files. If it is loading any*.py
files, it will first compile those into bytecode and save to disk so that the next call is faster. See flow chart:Note that these
*.pyc
files are not actually "compiled" code, because python is still an interpreted language. They are also not platform dependent, meaning we can build these bytecode files anywhere and expect them to work everywhere.These bytecode files do not change the running/execution of the python code. The code will still run exactly the same and at the same speed. What changes instead is the time it takes to load the instructions, prior to them being executed.
Impact of removing
*.py
filesIn addition to pre-compiling all
*.pyc
files, this PR also removes all*.py
files. As can be seen from flow chart and discussion above, this will make no change to the execution of the code.What is potentially lost is the ability for the interpreter to add proper line numbers and code samples to tracebacks. TODO: confirm this and add example here.
Impact of compiling at "optimization level" 2
In addition to pre-compiling bytecode files, this PR also builds those files using "optimization level" 2 by including
-o 2
in the commandline args.From the python docs
Assert statements. On April 18, 2024 I attended the python guild meeting and discussed this change with the team. They said that ddtrace should not require execution of
assert
statements in order to run properly. If there is a failure due to the remove of theassert
statements, then this is considered a bug and should be filed with them directly. I ran several of the ddtrace unittest suites usingpython -OO -m pytest
and found that the same tests failed using this as did when I ran the tests usingpython -m pytest
.__debug__
. The-o 2
flag also removes all code conditional on__debug__
. I did a grep of ddtrace and found no instances of this variable in the code base. I also did a grep for it in all repos I currently have checked out on my laptop and did not find any instances which would affect us.Docstrings. It is safe to remove the docstrings. The only time the customer would care to use them is in a python shell where they can run
help(my_func)
which will then print the docstring to the screen. This is clearly not necessary for the running of any lambda function. Removing them also reduces the size of the layer binary by a noticeable amount.Impact of compiling with env var
PYTHONNODEBUGRANGES=1
This env var is only available in python 3.11 and newer. See python docs:
Adding this env var leads to a noticeable reduction in cold start time (approx 0.5-1%) for python 3.12. I believe what is lost by including this is the column information displayed in any traceback. As these are not required, the trade off here (debug info for cold start improvement) I believe is worth it.
Testing Guidelines
Types of Changes
Check all that apply
https://datadoghq.atlassian.net/browse/SVLS-4713