TST/REF: io/parser/(test_dtypes.py, test_usecols.py) #38578

arw2019 · 2020-12-19T08:08:08Z

#38370 adds a pyarrow engine to the csv reader. Only a fraction of the io/parser tests pass when pyarrow is used and the rest have to be xfailed/skipped, resulting in a large diff on the PR.

xref #38370 (comment) suggests reorganizing the tests into classes so groups of tests can be xfailed with a single mark.
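
A minimal sketch of what "xfailed with a single mark" looks like for a class, assuming plain pytest. The class name, test names, and reason string are hypothetical and are not taken from the pandas test suite:

```python
import pytest


# One decorator on the class marks every test method inside it as xfail.
@pytest.mark.xfail(reason="hypothetical: not supported by the pyarrow engine yet")
class TestDtypeGroup:
    def test_first(self):
        # Stand-in for a parser check that currently fails.
        assert 1 + 1 == 3

    def test_second(self):
        # Another stand-in failure; it inherits the class-level mark.
        assert "a".upper() == "b"
```

pytest would report both methods as xfailed, so the diff needs one mark per group rather than one per test.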

I'm grouping the tests logically (not based on whether or not they pass with pyarrow), but merging this will reduce the diff in #38370 substantially.

Likely I will submit a follow-on with some further reorg. I'm happy to push that to this PR if that's preferred, though.

Verifying that the total number of tests is unchanged:

(pandas-dev) andrewwieteska@Andrews-MacBook-Pro pandas % pytest  pandas/tests/io/parser/test_dtypes.py  pandas/tests/io/parser/test_usecols.py       
========================================================================= test session starts =========================================================================
platform darwin -- Python 3.8.6, pytest-6.1.2, py-1.9.0, pluggy-0.13.1
rootdir: /Users/andrewwieteska/repos/pandas, configfile: setup.cfg
plugins: forked-1.2.0, xdist-2.1.0, cov-2.10.1, asyncio-0.14.0, hypothesis-5.41.2, instafail-0.4.1
collected 372 items                                                                                                                                                   

pandas/tests/io/parser/test_dtypes.py ......................................................................................................................... [ 32%]
............................................................................................                                                                    [ 57%]
pandas/tests/io/parser/test_usecols.py ........................................................................................................................ [ 89%]
.......................................                                                                                                                         [100%]

======================================================================== 372 passed in 32.49s =========================================================================

versus on master:

(pandas-dev) andrewwieteska@Andrews-MacBook-Pro pandas % pytest  pandas/tests/io/parser/test_dtypes.py  pandas/tests/io/parser/test_usecols.py
========================================================================= test session starts =========================================================================
platform darwin -- Python 3.8.6, pytest-6.1.2, py-1.9.0, pluggy-0.13.1
rootdir: /Users/andrewwieteska/repos/pandas, configfile: setup.cfg
plugins: forked-1.2.0, xdist-2.1.0, cov-2.10.1, asyncio-0.14.0, hypothesis-5.41.2, instafail-0.4.1
collected 372 items                                                                                                                                                   

pandas/tests/io/parser/test_dtypes.py ......................................................................................................................... [ 32%]
............................................................................................                                                                    [ 57%]
pandas/tests/io/parser/test_usecols.py ........................................................................................................................ [ 89%]
.......................................                                                                                                                         [100%]

======================================================================== 372 passed in 32.90s =========================================================================
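
The comparison above can also be done mechanically. This is a small sketch, not part of the PR: `collected_count` is a hypothetical helper that pulls the number out of pytest's `collected N items` line, and the two sample strings are trimmed from the runs shown above.

```python
import re


def collected_count(pytest_output: str) -> int:
    """Return N from the 'collected N items' line of pytest's output."""
    match = re.search(r"^collected (\d+) items?", pytest_output, flags=re.MULTILINE)
    if match is None:
        raise ValueError("no 'collected N items' line found")
    return int(match.group(1))


# Trimmed excerpts of the two runs quoted in this comment.
branch_output = "rootdir: /Users/andrewwieteska/repos/pandas\ncollected 372 items\n"
master_output = "rootdir: /Users/andrewwieteska/repos/pandas\ncollected 372 items\n"

# The reorganization should neither drop nor duplicate tests.
assert collected_count(branch_output) == collected_count(master_output) == 372
```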

jreback · 2020-12-21T22:07:40Z

instead of using classes, can you just create / rename more module files? it's the same effect.

cc @gfyoung

gfyoung · 2020-12-22T00:23:19Z

100% agree with @jreback's feedback. Functions are more idiomatic with pytest.

A module will achieve what you are trying to do with these classes. Otherwise, a custom fixture will do.
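
As a sketch of the module-based alternative, assuming plain pytest: a module-level `pytestmark` applies a mark to every test function in the file. The test names and the reason string below are hypothetical.

```python
import pytest

# pytest reads the module-level name `pytestmark` and applies it to
# every test function collected from this file.
pytestmark = pytest.mark.xfail(
    reason="hypothetical: pyarrow engine does not support this group yet"
)


def test_usecols_subset():
    # Stand-in for a check the engine cannot pass yet.
    raise NotImplementedError


def test_usecols_order():
    # Plain function, no class needed; the module mark still applies.
    raise NotImplementedError
```

Splitting a large test file into several modules therefore gives the same one-mark-per-group effect as the classes, while keeping tests as plain functions.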

arw2019 · 2020-12-22T05:35:46Z

Sgtm - will put these into modules

jreback · 2020-12-31T19:09:40Z

looks like something is failing, merge master. pls confirm same number of tests on master as on here.

arw2019 · 2020-12-31T19:27:01Z

looks like something is failing

I used the same name for a base file between two folders - fixed now
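
The thread does not say exactly what failed. One known source of trouble is that, under pytest's default import mode, test files sharing a basename in directories without `__init__.py` can collide at import time. The sketch below is a hypothetical helper (not part of pandas) that lists basenames used in more than one directory:

```python
from collections import defaultdict
from pathlib import Path


def duplicate_basenames(root):
    """Map each test file basename found more than once under root to its paths."""
    seen = defaultdict(list)
    for path in sorted(Path(root).rglob("test_*.py")):
        seen[path.name].append(path)
    # Keep only the names that appear in more than one place.
    return {name: paths for name, paths in seen.items() if len(paths) > 1}
```

Calling `duplicate_basenames("pandas/tests/io/parser")` would return an empty dict when every test file name is unique.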

arw2019 · 2020-12-31T19:55:08Z

This patch:

(pandas-dev) andrewwieteska@Andrews-MacBook-Pro pandas % pytest pandas/tests/io/parser/dtypes pandas/tests/io/parser/usecols  
===================================================== test session starts =====================================================
platform darwin -- Python 3.8.6, pytest-6.1.2, py-1.9.0, pluggy-0.13.1
rootdir: /Users/andrewwieteska/repos/pandas, configfile: setup.cfg
plugins: forked-1.2.0, xdist-2.1.0, cov-2.10.1, asyncio-0.14.0, hypothesis-5.41.2, instafail-0.4.1
collected 372 items                                                                                                           

pandas/tests/io/parser/dtypes/test_categorical.py ..................................................................... [ 18%]
.....................                                                                                                   [ 24%]
pandas/tests/io/parser/dtypes/test_dtypes_basic.py .................................................................... [ 42%]
.......                                                                                                                 [ 44%]
pandas/tests/io/parser/dtypes/test_empty.py ................................................                            [ 57%]
pandas/tests/io/parser/usecols/test_parse_dates.py ...........................                                          [ 64%]
pandas/tests/io/parser/usecols/test_strings.py ..................                                                       [ 69%]
pandas/tests/io/parser/usecols/test_usecols_basic.py .................................................................. [ 87%]
................................................                                                                        [100%]

==================================================== 372 passed in 30.93s =====================================================

On master:

(pandas-dev) andrewwieteska@Andrews-MacBook-Pro pandas % pytest pandas/tests/io/parser/test_dtypes.py pandas/tests/io/parser/test_usecols.py 
===================================================== test session starts =====================================================
platform darwin -- Python 3.8.6, pytest-6.1.2, py-1.9.0, pluggy-0.13.1
rootdir: /Users/andrewwieteska/repos/pandas, configfile: setup.cfg
plugins: forked-1.2.0, xdist-2.1.0, cov-2.10.1, asyncio-0.14.0, hypothesis-5.41.2, instafail-0.4.1
collected 372 items                                                                                                           

pandas/tests/io/parser/test_dtypes.py ................................................................................. [ 21%]
....................................................................................................................... [ 53%]
.............                                                                                                           [ 57%]
pandas/tests/io/parser/test_usecols.py ................................................................................ [ 78%]
...............................................................................                                         [100%]

==================================================== 372 passed in 28.89s =====================================================

arw2019 · 2020-12-31T21:48:22Z

Green

jreback · 2020-12-31T22:13:44Z

thanks

* test reorg
* test reorg
* split test_dtypes.py into multiple files
* split test_usecols.py into multiple files
* dedeuplicate base filenames
* complete file renaming

arw2019 added 2 commits December 19, 2020 02:53

test reorg

a2c0163

test reorg

347751f

arw2019 added Testing, Clean, Refactor, IO CSV and removed Clean labels Dec 21, 2020

jreback added this to the 1.3 milestone Dec 21, 2020

arw2019 added 3 commits December 30, 2020 23:31

split test_dtypes.py into multiple files

6f92c23

Merge branch 'master' of https://github.com/pandas-dev/pandas into cs…

b8dfb6d

…v-test-reorg-1

split test_usecols.py into multiple files

e17ddfd

arw2019 added 2 commits December 31, 2020 14:18

dedeuplicate base filenames

a88314d

Merge branch 'master' of https://github.com/pandas-dev/pandas into cs…

015e041

…v-test-reorg-1

complete file renaming

bc43d16

jreback approved these changes Dec 31, 2020

jreback merged commit 9b47091 into pandas-dev:master Dec 31, 2020
