Commit c7fc0d3

benhoyt and aaugustin authored
Speed up Python apply_mask 20x by using int.from_bytes/to_bytes (#1034)
This speeds up the Python version of utils.apply_mask about 20 times, using int.from_bytes so that the XOR is done in a single Python operation. In other words, the loop over the bytes runs in C rather than in Python.

Note that it is a trade-off, as it uses more memory: this version allocates roughly len(data) bytes for each of the intermediate values (e.g., data_int, mask_repeated, mask_int, the XOR result), whereas I believe the original version only allocates for the return value. Still, most websocket packets aren't huge, and I believe the massive speed gain here makes it worth it. (And people who use the speedups.c version won't be affected.)

Obviously the speedups.c version is still significantly faster again, but this change makes the library more usable in environments where it's not feasible to use the C extension.

    Data Size  ForLoop  IntXor  Speedups
    ------------------------------------
    1KB        78.6us   3.79us  151ns
    1MB        79.7ms   4.38ms  55.4us

I got these timings by using commands like the following (with the function call adjusted, and 1024 replaced with 1024*1024 as needed).

    python3 -m timeit \
        -s 'from websockets.utils import apply_mask' \
        -s 'data=b"x"*1024; mask=b"abcd"' \
        'apply_mask(data, mask)'

This idea came from Will McGugan's blog post "Speeding up Websockets 60X":
https://www.willmcgugan.com/blog/tech/post/speeding-up-websockets-60x/

That post contains an even faster (about 50% faster) way to solve it using a pre-calculated XOR lookup table, but that pre-allocates a 64K-entry table at import time, which didn't seem ideal. Still, that is how aiohttp does it, so maybe it's worth considering:
https://github.com/aio-libs/aiohttp/blob/6ec33c5d841c8e845c27ebdd9384bbf72651cbb8/aiohttp/http_websocket.py#L115-L140

The int.from_bytes approach is also the approach used by the websocket-client library:
https://github.com/websocket-client/websocket-client/blob/5f32b3c0cfb836c016ad2a5f6caeff2978a6a16f/websocket/_abnf.py#L46-L50

Co-authored-by: Aymeric Augustin <[email protected]>
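For readers unfamiliar with the lookup-table alternative mentioned in the message, the following is a minimal sketch of the idea, not the code from the blog post or from aiohttp. The names `_XOR_TABLE` and `apply_mask_table` are hypothetical and are not part of websockets. The sketch assumes one 256-byte translation table per possible mask byte (256 x 256 = 64K entries) and applies each table to every fourth byte with `bytes.translate`.

```python
# Hypothetical illustration; not part of the websockets library.
# One 256-byte translation table per mask byte value: 64K entries in total,
# built once at import time (this is the memory cost the message refers to).
_XOR_TABLE = [bytes(a ^ b for a in range(256)) for b in range(256)]


def apply_mask_table(data: bytes, mask: bytes) -> bytes:
    """Apply a 4-byte websocket mask using precomputed XOR tables."""
    if len(mask) != 4:
        raise ValueError("mask must contain 4 bytes")
    masked = bytearray(data)
    for i in range(4):
        # Bytes at offsets i, i + 4, i + 8, ... are all XORed with mask[i],
        # so a single translate() call handles the whole stride in C.
        masked[i::4] = masked[i::4].translate(_XOR_TABLE[mask[i]])
    return bytes(masked)
```

The design trade-off is the one the message describes: the table costs memory up front regardless of whether masking is ever used, while the int.from_bytes approach allocates only in proportion to each payload.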
1 parent f1d6345 commit c7fc0d3

1 file changed, +5 -2 lines changed

src/websockets/utils.py

+5-2
@@ -2,8 +2,8 @@
 
 import base64
 import hashlib
-import itertools
 import secrets
+import sys
 
 
 __all__ = ["accept_key", "apply_mask"]
@@ -43,4 +43,7 @@ def apply_mask(data: bytes, mask: bytes) -> bytes:
     if len(mask) != 4:
         raise ValueError("mask must contain 4 bytes")
 
-    return bytes(b ^ m for b, m in zip(data, itertools.cycle(mask)))
+    data_int = int.from_bytes(data, sys.byteorder)
+    mask_repeated = mask * (len(data) // 4) + mask[: len(data) % 4]
+    mask_int = int.from_bytes(mask_repeated, sys.byteorder)
+    return (data_int ^ mask_int).to_bytes(len(data), sys.byteorder)
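As a sanity check on the change above, the removed and added implementations can be compared side by side. This is a sketch only: the function names `apply_mask_loop` and `apply_mask_int` are hypothetical labels for the old and new bodies of `apply_mask`, copied from the diff. The lengths tested cover empty input and every remainder modulo 4, which is where the `mask[: len(data) % 4]` tail matters.

```python
import itertools
import sys


def apply_mask_loop(data: bytes, mask: bytes) -> bytes:
    """Removed version: XOR byte by byte in a Python-level loop."""
    if len(mask) != 4:
        raise ValueError("mask must contain 4 bytes")
    return bytes(b ^ m for b, m in zip(data, itertools.cycle(mask)))


def apply_mask_int(data: bytes, mask: bytes) -> bytes:
    """Added version: XOR two arbitrary-precision integers in one operation."""
    if len(mask) != 4:
        raise ValueError("mask must contain 4 bytes")
    data_int = int.from_bytes(data, sys.byteorder)
    # Repeat the mask to exactly len(data) bytes, including a partial tail.
    mask_repeated = mask * (len(data) // 4) + mask[: len(data) % 4]
    mask_int = int.from_bytes(mask_repeated, sys.byteorder)
    # Both operands fit in len(data) bytes, so the XOR result does too.
    return (data_int ^ mask_int).to_bytes(len(data), sys.byteorder)


for n in range(10):
    data = bytes(range(n))
    # Old and new implementations agree for every length, including 0.
    assert apply_mask_int(data, b"abcd") == apply_mask_loop(data, b"abcd")
    # Masking twice with the same mask restores the original data.
    assert apply_mask_int(apply_mask_int(data, b"abcd"), b"abcd") == data
```

Because the same byte order is used for every conversion, the choice of `sys.byteorder` does not affect the result; either "little" or "big" would produce the same output bytes.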
