Speed up Python apply_mask 20x by using int.from_bytes/to_bytes (#1034)
This speeds up the Python version of utils.apply_mask about 20 times,
using int.from_bytes so that the XOR is done in a single Python
operation -- in other words, the loop over the bytes is in C rather
than in Python.
Note that it is a trade-off as it uses more memory: this version
allocates roughly len(data) bytes for each of the intermediate values
(e.g., data_int, mask_repeated, mask_int, the XOR result); whereas I
believe the original version only allocates for the return value.
Still, most websocket packets aren't huge, and I believe the massive
speed gain here makes it worth it. (And people who use the speedups.c
version won't be affected.)
Obviously the speedups.c version is still significantly faster again,
but this change makes the library more usable in environments where it's
not feasible to use the C extension.
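
As a rough illustration of the approach described above, here is a minimal
sketch of the two pure-Python variants. It is not copied from the
library: the function names apply_mask_loop and apply_mask_int are
hypothetical, the intermediate names (data_int, mask_repeated, mask_int)
follow this message, and the use of sys.byteorder is an assumption (any
byte order works as long as it is used consistently).

```python
import itertools
import sys


def apply_mask_loop(data: bytes, mask: bytes) -> bytes:
    """Byte-by-byte XOR: the loop over the bytes runs in Python."""
    if len(mask) != 4:
        raise ValueError("mask must contain 4 bytes")
    return bytes(b ^ m for b, m in zip(data, itertools.cycle(mask)))


def apply_mask_int(data: bytes, mask: bytes) -> bytes:
    """Single big-integer XOR: the loop over the bytes runs in C."""
    if len(mask) != 4:
        raise ValueError("mask must contain 4 bytes")
    # Each intermediate below is roughly len(data) bytes in size.
    data_int = int.from_bytes(data, sys.byteorder)
    # Repeat the 4-byte mask so that it is exactly as long as data.
    mask_repeated = mask * (len(data) // 4) + mask[: len(data) % 4]
    mask_int = int.from_bytes(mask_repeated, sys.byteorder)
    return (data_int ^ mask_int).to_bytes(len(data), sys.byteorder)
```

Both functions should return the same bytes for any input, and applying
the same mask twice gives back the original data.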
Data Size   ForLoop   IntXor   Speedups
---------------------------------------
1KB         78.6us    3.79us   151ns
1MB         79.7ms    4.38ms   55.4us
I got these timings by using commands like the following (with the
function call adjusted, and 1024 replaced with 1024*1024 as needed).
    python3 -m timeit \
        -s 'from websockets.utils import apply_mask' \
        -s 'data=b"x"*1024; mask=b"abcd"' \
        'apply_mask(data, mask)'
This idea came from Will McGugan's blog post "Speeding up Websockets
60X": https://www.willmcgugan.com/blog/tech/post/speeding-up-websockets-60x/
That post contains an even faster (about 50% faster) way to solve it
using a pre-calculated XOR lookup table, but that pre-allocates a
64K-entry table at import time, which didn't seem ideal. Still, that is
how aiohttp does it, so maybe it's worth considering:
https://github.com/aio-libs/aiohttp/blob/6ec33c5d841c8e845c27ebdd9384bbf72651cbb8/aiohttp/http_websocket.py#L115-L140
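
For comparison, a lookup-table variant along those lines might look
roughly like the sketch below. It is an illustration only, not the code
from the blog post or from aiohttp: the names XOR_TABLE and
apply_mask_table are hypothetical, and building the table as 256
translation tables of 256 bytes each (64K entries in total) is an
assumption about how such a table could be laid out.

```python
# 256 translation tables, one per possible mask byte; each maps a data
# byte a to a ^ b. Built once, at import time.
XOR_TABLE = [bytes(a ^ b for a in range(256)) for b in range(256)]


def apply_mask_table(data: bytes, mask: bytes) -> bytes:
    """XOR data with a repeating 4-byte mask using bytes.translate."""
    if len(mask) != 4:
        raise ValueError("mask must contain 4 bytes")
    out = bytearray(data)
    # Every 4th byte is XORed with the same mask byte, so each strided
    # slice can be translated in one C-level call.
    for offset in range(4):
        out[offset::4] = out[offset::4].translate(XOR_TABLE[mask[offset]])
    return bytes(out)
```

The per-call work is four translate calls regardless of the data size,
at the cost of the table being kept in memory for the life of the process.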
The int.from_bytes approach is also the approach used by the
websocket-client library:
https://github.com/websocket-client/websocket-client/blob/5f32b3c0cfb836c016ad2a5f6caeff2978a6a16f/websocket/_abnf.py#L46-L50
Co-authored-by: Aymeric Augustin <[email protected]>