-
-
Notifications
You must be signed in to change notification settings - Fork 18.4k
PERF: some more perf/clean in saslib.pyx #12961
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
@kshedden if you can run this on your example would be great (it may only be a small speedup) |
I'm getting ~2.99 seconds per 100K lines on the same file I have been using in #12656 (I think I was getting 3.1-3.2 seconds with the earlier version). |
My test file is not compressed and many of your changes are in the decompression, so it would not have helped. |
good point do we have a compress file to test on ? are those sas7cdat? |
@@ -16,13 +19,13 @@ cdef np.ndarray[uint8_t, ndim=1] rle_decompress(int result_length, np.ndarray[ui | |||
|
|||
while ipos < length: | |||
control_byte = inbuff[ipos] & 0xF0 | |||
end_of_first_byte = int(inbuff[ipos] & 0x0F) | |||
end_of_first_byte = <uint8_t>(inbuff[ipos] & 0x0F) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we have to cast to regular int here (2 byte or wider), below there are places where end_of_first_byte is multiplied and will overflow.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sure / do u have some test files for this ? if not ok too
|
Many of the files in our test suite are compressed, but they are small and don't necessarily use all the compression codes. Compressed SAS files have the same name extension as decompressed ones (sas7bdat). You might be thinking of sas7bcat which has categorical data labels (sas7bdat only supports floats and strings). |
@kshedden ok I fixed up the byte construction. ideally we would have a test for this :> will merge on your ok. |
@kshedden ok with this? |
@kshedden I merged in. lmk if any more issues (or if you have better test cases) |
more perf
cc @kshedden