DO NOT MERGE: Timing test files #26968

datapythonista · 2019-06-20T14:34:50Z

Checking how long every test file takes to run in the CI. In #26949 I'd like to run the tests with the constraint that a whole file must run in a single worker (so we can run single tests in parallel).

This means that files that take a long time won't be parallelized well, and should be split for this to work. So, I want to know how long each one takes.
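A minimal sketch of the kind of per-file timing meant here, assuming pytest is invoked once per file through `subprocess`. The helper names `time_test_files` and `format_duration` are hypothetical and are not taken from the CI script in this PR:

```python
import subprocess
import sys
import time


def time_test_files(files, command=(sys.executable, "-m", "pytest", "-q")):
    """Run `command` once per file and return (path, seconds) pairs, slowest first."""
    timings = []
    for path in files:
        start = time.perf_counter()
        # The exit code is ignored on purpose: only wall-clock time matters here.
        subprocess.run(
            [*command, str(path)],
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            check=False,
        )
        timings.append((str(path), time.perf_counter() - start))
    return sorted(timings, key=lambda item: item[1], reverse=True)


def format_duration(seconds):
    """Format seconds in the style of the shell `time` output, e.g. 2m5.500s."""
    minutes, rest = divmod(seconds, 60)
    return f"{int(minutes)}m{rest:.3f}s"
```

Calling `time_test_files(sorted(pathlib.Path("pandas/tests").rglob("test_*.py")))` and printing each pair with `format_duration` would give one line per file. Starting a new interpreter for every file adds start-up overhead to each measurement, so the numbers are only a rough guide.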

datapythonista · 2019-06-20T18:34:49Z

Checking how long every test file takes, to see if some should be split for better parallelization, I see this, which doesn't look right to me:

pandas/tests/io/test_html.py : 21m59.041s
pandas/tests/io/json/test_pandas.py : 8m50.339s
pandas/tests/io/parser/test_common.py : 6m45.860s
pandas/tests/io/excel/test_readers.py : 13m14.864s

I didn't check the files individually. I guess in some cases we want to test loading big files, but 22 minutes to test the HTML I/O sounds excessive.

@pandas-dev/pandas-core is this expected?

jorisvandenbossche · 2019-06-20T18:41:17Z

Did you try running that single file locally?

datapythonista · 2019-06-20T18:43:25Z

No, I wanted to check first whether this is expected or I'm missing something. If it doesn't seem right, I'll then try to figure out what's going on.

jorisvandenbossche · 2019-06-20T19:46:13Z

Locally, the second one takes 4.6 seconds (and not almost 9 min). 2 out of 97 tests are skipped (encoding related), one of which is an S3 test.

jbrockmendel · 2019-06-27T19:23:08Z

pandas/tests/io/test_html.py

I'm seeing the two test_invalid_url tests (not marked as slow) take about 150s apiece.
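One way to get per-test numbers like these is pytest's `--durations` option, which prints the slowest test phases at the end of a run. A small sketch, with the hypothetical helpers `durations_command` and `report_slowest` (this may not be how the figures above were obtained):

```python
import subprocess
import sys


def durations_command(test_file, count=10):
    """Build a pytest command that reports the `count` slowest test phases."""
    return [sys.executable, "-m", "pytest", str(test_file), f"--durations={count}"]


def report_slowest(test_file, count=10):
    """Run pytest on one file and return its text output, durations table included."""
    result = subprocess.run(
        durations_command(test_file, count),
        capture_output=True,
        text=True,
        check=False,
    )
    return result.stdout
```

For example, `print(report_slowest("pandas/tests/io/test_html.py"))` would list the ten slowest setup, call, and teardown phases in that file.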

jbrockmendel · 2019-06-29T00:29:09Z

@datapythonista closable?

datapythonista · 2019-06-29T09:14:08Z

Yep, let's close this for now. We may need to reopen it later to investigate the timing problem, but I don't have time now.

datapythonista and others added 6 commits June 19, 2019 17:56

WIP/CI/TST: Clean up of the tests script

cf2f9de

Fixing test results file names and some more clean up

2a7e351

Restoring python hash seed

8bae129

Merge remote-tracking branch 'upstream/master' into serial_tests_2

8afdb2a

Merge remote-tracking branch 'upstream/master' into serial_tests_2

318819a

Timing test files

c1f8e91

datapythonista added the Testing (pandas testing functions or related to the test suite) and CI (Continuous Integration) labels Jun 20, 2019

Fixing quotes

ae62af2

datapythonista closed this Jun 29, 2019
