-
-
Notifications
You must be signed in to change notification settings - Fork 18.4k
TST: Refactor slow tests #53891
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Merged
Merged
TST: Refactor slow tests #53891
Changes from 4 commits
Commits
Show all changes
5 commits
Select commit
Hold shift + click to select a range
dc154e2
Address more slow tests
mroeschke e1edffc
Parameterize slow test
mroeschke cf9ea4e
Reduce data size in multi_thread
mroeschke c2d7132
Use constant data for test_int64_overflow_groupby_large_df_shuffled
mroeschke 2318e3b
Merge remote-tracking branch 'upstream/main' into tst/slow
mroeschke File filter
Filter by extension
Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
There are no files selected for viewing
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -44,32 +44,6 @@ def test_buffer_overflow(c_parser_only, malformed): | |
parser.read_csv(StringIO(malformed)) | ||
|
||
|
||
def test_buffer_rd_bytes(c_parser_only): | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Based on the original issue, this sounded like a PY2 bug specifically so I don't think this needs testing anymore |
||
# see gh-12098: src->buffer in the C parser can be freed twice leading | ||
# to a segfault if a corrupt gzip file is read with 'read_csv', and the | ||
# buffer is filled more than once before gzip raises an Exception. | ||
|
||
data = ( | ||
"\x1F\x8B\x08\x00\x00\x00\x00\x00\x00\x03\xED\xC3\x41\x09" | ||
"\x00\x00\x08\x00\xB1\xB7\xB6\xBA\xFE\xA5\xCC\x21\x6C\xB0" | ||
"\xA6\x4D" + "\x55" * 267 + "\x7D\xF7\x00\x91\xE0\x47\x97\x14\x38\x04\x00" | ||
"\x1f\x8b\x08\x00VT\x97V\x00\x03\xed]\xefO" | ||
) | ||
parser = c_parser_only | ||
|
||
for _ in range(100): | ||
try: | ||
parser.read_csv_check_warnings( | ||
RuntimeWarning, | ||
"compression has no effect when passing a non-binary object as input", | ||
StringIO(data), | ||
compression="gzip", | ||
delim_whitespace=True, | ||
) | ||
except Exception: | ||
pass | ||
|
||
|
||
def test_delim_whitespace_custom_terminator(c_parser_only): | ||
# See gh-12912 | ||
data = "a b c~1 2 3~4 5 6~7 8 9" | ||
|
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can this be set as a property of the TextReader? It is a bit ambiguous in a pyx file but in the global namespace all caps I would expect this to be a compile time constant; attaching it as a property makes things clearer
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Or if this is just for testing maybe you can patch
buffer_lines
directly? The naming here is a bit unclear when scoped outside of the initializerThere was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you have the same concern about
DEFAULT_CHUNKSIZE
above too?Ideally I think these magic numbers should at least be made obvious so they can be configured or removed #53781 and it being in the TextReader would make that less obvious?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Unless I am mistaken with how Cython generates the code comparing this to
DEFAULT_CHUNKSIZE
is exactly the problem; that is a compile time constant whereas this value can be modified at runtime, but they both look the sameThere was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you suggest how I could make
DEFAULT_BUFFER_HEURISTIC
a property on the cdef TextReader class? I'm having trouble defining it in a way that I could monkeypatchThere was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm does declaring it
cpdef
help at all? Not worth going down a rabbit hole over if its a hold upThere was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here are my unsuccessful attempts so far just to double check
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Cool thanks for checking. Let's not get hung up on it for now then - I think just a wart between the C/Python and how they get expressed in the Cython global namespace. Can always come back and refactor if we establish a better pattern
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for confirming!