Skip to content

TST/REF: io/parser/(test_dtypes.py, test_usecols.py) #38578

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 8 commits into from
Dec 31, 2020

Conversation

arw2019
Copy link
Member

@arw2019 arw2019 commented Dec 19, 2020

  • closes #xxxx
  • tests added / passed
  • passes black pandas
  • passes git diff upstream/master -u -- "*.py" | flake8 --diff
  • whatsnew entry

#38370 adds a pyarrow engine to the csv reader. Only a fraction of the io/parser tests pass when pyarrow is used and the rest has to be xfailed/skipped, resulting in a large diff on the PR.

xref #38370 (comment) suggests reorganizing the tests into classes so groups of tests can be xfailed with a single mark.

I'm grouping the tests logically (not based on whether or not they pass with pyarrow) but merging this this will reduce the diff in #38370 substantively.

Likely I will submit a follow-on with some further reorg. I'm happy to push that to this PR if that's preferred, though.

Verifying that total number of tests is unchanged:

(pandas-dev) andrewwieteska@Andrews-MacBook-Pro pandas % pytest  pandas/tests/io/parser/test_dtypes.py  pandas/tests/io/parser/test_usecols.py       
========================================================================= test session starts =========================================================================
platform darwin -- Python 3.8.6, pytest-6.1.2, py-1.9.0, pluggy-0.13.1
rootdir: /Users/andrewwieteska/repos/pandas, configfile: setup.cfg
plugins: forked-1.2.0, xdist-2.1.0, cov-2.10.1, asyncio-0.14.0, hypothesis-5.41.2, instafail-0.4.1
collected 372 items                                                                                                                                                   

pandas/tests/io/parser/test_dtypes.py ......................................................................................................................... [ 32%]
............................................................................................                                                                    [ 57%]
pandas/tests/io/parser/test_usecols.py ........................................................................................................................ [ 89%]
.......................................                                                                                                                         [100%]

======================================================================== 372 passed in 32.49s =========================================================================

versus on master:

(pandas-dev) andrewwieteska@Andrews-MacBook-Pro pandas % pytest  pandas/tests/io/parser/test_dtypes.py  pandas/tests/io/parser/test_usecols.py
========================================================================= test session starts =========================================================================
platform darwin -- Python 3.8.6, pytest-6.1.2, py-1.9.0, pluggy-0.13.1
rootdir: /Users/andrewwieteska/repos/pandas, configfile: setup.cfg
plugins: forked-1.2.0, xdist-2.1.0, cov-2.10.1, asyncio-0.14.0, hypothesis-5.41.2, instafail-0.4.1
collected 372 items                                                                                                                                                   

pandas/tests/io/parser/test_dtypes.py ......................................................................................................................... [ 32%]
............................................................................................                                                                    [ 57%]
pandas/tests/io/parser/test_usecols.py ........................................................................................................................ [ 89%]
.......................................                                                                                                                         [100%]

======================================================================== 372 passed in 32.90s =========================================================================

@arw2019 arw2019 added Testing pandas testing functions or related to the test suite Clean Refactor Internal refactoring of code IO CSV read_csv, to_csv and removed Clean labels Dec 21, 2020
@jreback
Copy link
Contributor

jreback commented Dec 21, 2020

instead of using classes, can you just create / rename more module files? its the same effect.

cc @gfyoung

@jreback jreback added this to the 1.3 milestone Dec 21, 2020
@gfyoung
Copy link
Member

gfyoung commented Dec 22, 2020

100% agree with @jreback feedback. Functions are more idiomatic with pytest.

A module will achieve what you are trying to do with these classes. Otherwise, a custom fixture will do.

@arw2019
Copy link
Member Author

arw2019 commented Dec 22, 2020

Sgtm - will put these into modules

@jreback
Copy link
Contributor

jreback commented Dec 31, 2020

looks like something is failing, merge master. pls confrm same number of tests on master as on here.

@arw2019
Copy link
Member Author

arw2019 commented Dec 31, 2020

looks like something is failing

I used the same name for a base file between two folders - fixed now

@arw2019
Copy link
Member Author

arw2019 commented Dec 31, 2020

This patch:

(pandas-dev) andrewwieteska@Andrews-MacBook-Pro pandas % pytest pandas/tests/io/parser/dtypes pandas/tests/io/parser/usecols  
===================================================== test session starts =====================================================
platform darwin -- Python 3.8.6, pytest-6.1.2, py-1.9.0, pluggy-0.13.1
rootdir: /Users/andrewwieteska/repos/pandas, configfile: setup.cfg
plugins: forked-1.2.0, xdist-2.1.0, cov-2.10.1, asyncio-0.14.0, hypothesis-5.41.2, instafail-0.4.1
collected 372 items                                                                                                           

pandas/tests/io/parser/dtypes/test_categorical.py ..................................................................... [ 18%]
.....................                                                                                                   [ 24%]
pandas/tests/io/parser/dtypes/test_dtypes_basic.py .................................................................... [ 42%]
.......                                                                                                                 [ 44%]
pandas/tests/io/parser/dtypes/test_empty.py ................................................                            [ 57%]
pandas/tests/io/parser/usecols/test_parse_dates.py ...........................                                          [ 64%]
pandas/tests/io/parser/usecols/test_strings.py ..................                                                       [ 69%]
pandas/tests/io/parser/usecols/test_usecols_basic.py .................................................................. [ 87%]
................................................                                                                        [100%]

==================================================== 372 passed in 30.93s =====================================================

On master:

(pandas-dev) andrewwieteska@Andrews-MacBook-Pro pandas % pytest pandas/tests/io/parser/test_dtypes.py pandas/tests/io/parser/test_usecols.py 
===================================================== test session starts =====================================================
platform darwin -- Python 3.8.6, pytest-6.1.2, py-1.9.0, pluggy-0.13.1
rootdir: /Users/andrewwieteska/repos/pandas, configfile: setup.cfg
plugins: forked-1.2.0, xdist-2.1.0, cov-2.10.1, asyncio-0.14.0, hypothesis-5.41.2, instafail-0.4.1
collected 372 items                                                                                                           

pandas/tests/io/parser/test_dtypes.py ................................................................................. [ 21%]
....................................................................................................................... [ 53%]
.............                                                                                                           [ 57%]
pandas/tests/io/parser/test_usecols.py ................................................................................ [ 78%]
...............................................................................                                         [100%]

==================================================== 372 passed in 28.89s =====================================================

@arw2019
Copy link
Member Author

arw2019 commented Dec 31, 2020

Green

@jreback jreback merged commit 9b47091 into pandas-dev:master Dec 31, 2020
@jreback
Copy link
Contributor

jreback commented Dec 31, 2020

thanks

luckyvs1 pushed a commit to luckyvs1/pandas that referenced this pull request Jan 20, 2021
* test reorg

* test reorg

* split test_dtypes.py into multiple files

* split test_usecols.py into multiple files

* dedeuplicate base filenames

* complete file renaming
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
IO CSV read_csv, to_csv Refactor Internal refactoring of code Testing pandas testing functions or related to the test suite
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants