PERF: Slow import-time expression slowing down test collection #43888

Closed
3 tasks done
bluetech opened this issue Oct 5, 2021 · 3 comments · Fixed by #43898

Labels
Performance (memory or execution speed performance), Testing (pandas testing functions or related to the test suite)
Milestone

Comments

@bluetech
bluetech commented Oct 5, 2021

  • I have checked that this issue has not already been reported.

  • I have confirmed this issue exists on the latest version of pandas.

  • I have confirmed this issue exists on the master branch of pandas.

Reproducible Example

I'm a pytest dev who's been looking to speed up collection time, using pandas as a benchmark. I thought you'd be interested to know that one case is slow (~1s) just because it does something heavy at module scope:

@pytest.mark.parametrize(
    "values",
    [np.arange(2 ** 24 + 1), np.arange(2 ** 25 + 2).reshape(2 ** 24 + 1, 2)],
    ids=["1d", "2d"],
)
def test_pct_max_many_rows(self, values):
    ...

These arrays are expensive to create, and they get built as soon as the module is imported. It would be better to make them lazy by wrapping them in a lambda, or to just split this into two separate tests.
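The lambda suggestion could look roughly like the sketch below. This is only an illustration: the parameter name `make_values`, the function name, and the placeholder assertion are assumptions, since the real test body is not shown here, and the original is a method on a test class rather than a plain function.

```python
import numpy as np
import pytest


# Each parameter is a zero-argument callable, so no large array is
# allocated when the module is imported during collection.
@pytest.mark.parametrize(
    "make_values",  # hypothetical name, not from the pandas test suite
    [
        lambda: np.arange(2**24 + 1),
        lambda: np.arange(2**25 + 2).reshape(2**24 + 1, 2),
    ],
    ids=["1d", "2d"],
)
def test_pct_max_many_rows_lazy(make_values):
    # The array is only built here, when the test actually runs.
    values = make_values()
    # Placeholder check; the real assertions are not part of this sketch.
    assert values.shape[0] == 2**24 + 1
```

Collection then only has to store two small function objects, and the explicit `ids` keep the test IDs readable instead of falling back to auto-generated names for the lambdas.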

There's also something slow in the pandas/tests/reshape/merge/test_merge.py module but I'm not sure what.

Installed Versions

INSTALLED VERSIONS
------------------
commit           : 65998341034861527517756e6dd134dafea2b766
python           : 3.9.7.final.0
python-bits      : 64
OS               : Linux
OS-release       : 5.14.6-arch1-1
Version          : #1 SMP PREEMPT Sat, 18 Sep 2021 16:19:35 +0000
machine          : x86_64
processor        : 
byteorder        : little
LC_ALL           : None
LANG             : en_US.UTF-8
LOCALE           : en_US.UTF-8

pandas           : 1.4.0.dev0+833.g6599834103
numpy            : 1.21.2
pytz             : 2021.3
dateutil         : 2.8.2
pip              : 21.2.4
setuptools       : 58.0.4
Cython           : 0.29.24
pytest           : 6.3.0.dev745+g483f239d0
hypothesis       : 6.23.1
sphinx           : None
blosc            : None
feather          : None
xlsxwriter       : 3.0.1
lxml.etree       : 4.6.3
html5lib         : 1.1
pymysql          : None
psycopg2         : None
jinja2           : 3.0.1
IPython          : None
pandas_datareader: None
bs4              : 4.10.0
bottleneck       : None
fsspec           : None
fastparquet      : None
gcsfs            : None
matplotlib       : 3.4.3
numexpr          : 2.7.3
odfpy            : None
openpyxl         : 3.0.9
pandas_gbq       : None
pyarrow          : None
pyxlsb           : None
s3fs             : None
scipy            : 1.7.1
sqlalchemy       : None
tables           : 3.6.1
tabulate         : 0.8.9
xarray           : None
xlrd             : 2.0.1
xlwt             : 1.3.0
numba            : None

Prior Performance

No response

@bluetech added the Needs Triage and Performance labels Oct 5, 2021
@mzeitlin11
Member

Thanks for reporting this @bluetech! A PR to fix this would be very welcome. More generally, we've got a few issues about slow testing, so any tips from pytest devs on how to speed up the test suite are invaluable :)

@mzeitlin11 added the Testing label and removed the Needs Triage label Oct 5, 2021
@mzeitlin11 added this to the Contributions Welcome milestone Oct 5, 2021
@lithomas1 modified the milestones: Contributions Welcome, 1.4 Oct 6, 2021
@jbrockmendel
Member

There's also something slow in the pandas/tests/reshape/merge/test_merge.py module but I'm not sure what.

How do you identify slow-to-collect tests?

@bluetech
Author

bluetech commented Oct 6, 2021

In this case I saw it in the pyinstrument output. Look for TestRank in the profile below.

pyinstrument pytest pandas collect only
  _     ._   __/__   _ _  _  _ _/_   Recorded: 07:11:06 PM Samples:  38816
 /_//_/// /_\ / //_// / //_'/ //     Duration: 44.824    CPU time: 43.522
/   _/                      v4.0.4

Program: pytest --co pandas -k xxxxxxxx

44.823 <module>  <string>:1
   [10 frames hidden]  <string>, runpy, <built-in>
      44.629 _run_code  runpy.py:64
      └─ 44.629 <module>  pytest/__main__.py:1
         └─ 44.629 console_main  _pytest/config/__init__.py:181
            └─ 44.629 main  _pytest/config/__init__.py:133
               ├─ 42.436 __call__  pluggy/_hooks.py:244
               │     [3 frames hidden]  pluggy
               │        42.436 _multicall  pluggy/_callers.py:9
               │        └─ 42.436 pytest_cmdline_main  _pytest/main.py:316
               │           └─ 42.436 wrap_session  _pytest/main.py:257
               │              └─ 42.225 _main  _pytest/main.py:320
               │                 └─ 42.225 __call__  pluggy/_hooks.py:244
               │                       [3 frames hidden]  pluggy
               │                          42.225 _multicall  pluggy/_callers.py:9
               │                          └─ 42.225 pytest_collection  _pytest/main.py:333
               │                             └─ 42.225 perform_collect  _pytest/main.py:615
               │                                ├─ 30.825 genitems  _pytest/main.py:832
               │                                │  ├─ 29.808 genitems  _pytest/main.py:832
               │                                │  │  ├─ 18.599 genitems  _pytest/main.py:832
               │                                │  │  │  ├─ 15.039 genitems  _pytest/main.py:832
               │                                │  │  │  │  ├─ 11.149 collect_one_node  _pytest/runner.py:541
               │                                │  │  │  │  │  └─ 11.095 __call__  pluggy/_hooks.py:244
               │                                │  │  │  │  │        [10 frames hidden]  pluggy, <built-in>
               │                                │  │  │  │  │           11.091 _multicall  pluggy/_callers.py:9
               │                                │  │  │  │  │           └─ 11.080 pytest_make_collect_report  _pytest/runner.py:370
               │                                │  │  │  │  │              └─ 11.074 from_call  _pytest/runner.py:317
               │                                │  │  │  │  │                 └─ 11.070 <lambda>  _pytest/runner.py:371
               │                                │  │  │  │  │                    └─ 11.069 collect  _pytest/python.py:876
               │                                │  │  │  │  │                       └─ 10.953 collect  _pytest/python.py:409
               │                                │  │  │  │  │                          └─ 10.824 __call__  pluggy/_hooks.py:244
               │                                │  │  │  │  │                                [54 frames hidden]  pluggy, pytest_asyncio, asyncio, insp...
               │                                │  │  │  │  │                                   10.758 _multicall  pluggy/_callers.py:9
               │                                │  │  │  │  │                                   └─ 10.512 pytest_pycollect_makeitem  _pytest/python.py:221
               │                                │  │  │  │  │                                      └─ 10.365 _genfunctions  _pytest/python.py:451
               │                                │  │  │  │  │                                         ├─ 6.295 from_parent  _pytest/python.py:1653
               │                                │  │  │  │  │                                         │  └─ 6.137 from_parent  _pytest/nodes.py:254
               │                                │  │  │  │  │                                         │     └─ 5.993 _create  _pytest/nodes.py:143
               │                                │  │  │  │  │                                         │        └─ 5.850 __init__  _pytest/python.py:1591
               │                                │  │  │  │  │                                         │           ├─ 2.081 getfixtureinfo  _pytest/fixtures.py:1459
               │                                │  │  │  │  │                                         │           │  ├─ 0.989 getfuncargnames  _pytest/compat.py:122
               │                                │  │  │  │  │                                         │           │  │  └─ 0.795 signature  inspect.py:3109
               │                                │  │  │  │  │                                         │           │  │        [42 frames hidden]  inspect, enum, <built-in>
               │                                │  │  │  │  │                                         │           │  └─ 0.901 getfixtureclosure  _pytest/fixtures.py:1506
               │                                │  │  │  │  │                                         │           ├─ 1.079 [self]  
               │                                │  │  │  │  │                                         │           ├─ 0.862 __init__  _pytest/nodes.py:678
               │                                │  │  │  │  │                                         │           │  └─ 0.725 __init__  _pytest/nodes.py:180
               │                                │  │  │  │  │                                         │           │     └─ 0.583 [self]  
               │                                │  │  │  │  │                                         │           └─ 0.748 <dictcomp>  _pytest/python.py:1638
               │                                │  │  │  │  │                                         └─ 2.669 call_extra  pluggy/_hooks.py:283
               │                                │  │  │  │  │                                               [20 frames hidden]  pluggy, <built-in>
               │                                │  │  │  │  │                                                  2.601 _multicall  pluggy/_callers.py:9
               │                                │  │  │  │  │                                                  ├─ 1.588 pytest_generate_tests  _pytest/fixtures.py:1559
               │                                │  │  │  │  │                                                  │  └─ 1.163 parametrize  _pytest/python.py:1015
               │                                │  │  │  │  │                                                  │     └─ 0.512 _resolve_arg_ids  _pytest/python.py:1149
               │                                │  │  │  │  │                                                  │        └─ 0.506 idmaker  _pytest/python.py:1402
               │                                │  │  │  │  │                                                  │           └─ 0.496 <listcomp>  _pytest/python.py:1410
               │                                │  │  │  │  │                                                  └─ 0.937 pytest_generate_tests  _pytest/python.py:143
               │                                │  │  │  │  │                                                     └─ 0.884 parametrize  _pytest/python.py:1015
               │                                │  │  │  │  └─ 3.746 genitems  _pytest/main.py:832
               │                                │  │  │  │     ├─ 1.644 ihook  _pytest/nodes.py:272
               │                                │  │  │  │     │  └─ 1.594 gethookproxy  _pytest/main.py:545
               │                                │  │  │  │     │     └─ 1.057 _getconftestmodules  _pytest/config/__init__.py:526
               │                                │  │  │  │     │        └─ 0.477 is_file  pathlib.py:1450
               │                                │  │  │  │     │              [12 frames hidden]  pathlib, <built-in>
               │                                │  │  │  │     └─ 1.458 __getattr__  _pytest/config/compat.py:35
               │                                │  │  │  │        └─ 1.385 __getattr__  _pytest/main.py:429
               │                                │  │  │  │           └─ 1.321 subset_hook_caller  pluggy/_manager.py:351
               │                                │  │  │  │                 [8 frames hidden]  pluggy, <built-in>
               │                                │  │  │  ├─ 1.478 ihook  _pytest/nodes.py:272
               │                                │  │  │  │  └─ 1.443 gethookproxy  _pytest/main.py:545
               │                                │  │  │  │     └─ 0.894 _getconftestmodules  _pytest/config/__init__.py:526
               │                                │  │  │  └─ 1.308 __getattr__  _pytest/config/compat.py:35
               │                                │  │  │     └─ 1.218 __getattr__  _pytest/main.py:429
               │                                │  │  │        └─ 1.156 subset_hook_caller  pluggy/_manager.py:351
               │                                │  │  │              [8 frames hidden]  pluggy, <built-in>
               │                                │  │  └─ 11.048 collect_one_node  _pytest/runner.py:541
               │                                │  │     └─ 11.014 __call__  pluggy/_hooks.py:244
               │                                │  │           [6 frames hidden]  pluggy
               │                                │  │              11.010 _multicall  pluggy/_callers.py:9
               │                                │  │              └─ 10.989 pytest_make_collect_report  _pytest/runner.py:370
               │                                │  │                 └─ 10.981 from_call  _pytest/runner.py:317
               │                                │  │                    └─ 10.976 <lambda>  _pytest/runner.py:371
               │                                │  │                       └─ 10.976 collect  _pytest/python.py:509
               │                                │  │                          ├─ 6.287 collect  _pytest/python.py:409
               │                                │  │                          │  └─ 6.194 __call__  pluggy/_hooks.py:244
               │                                │  │                          │        [41 frames hidden]  pluggy, pytest_asyncio, asyncio, insp...
               │                                │  │                          │           6.154 _multicall  pluggy/_callers.py:9
               │                                │  │                          │           └─ 5.996 pytest_pycollect_makeitem  _pytest/python.py:221
               │                                │  │                          │              └─ 5.885 _genfunctions  _pytest/python.py:451
               │                                │  │                          │                 ├─ 3.691 from_parent  _pytest/python.py:1653
               │                                │  │                          │                 │  └─ 3.570 from_parent  _pytest/nodes.py:254
               │                                │  │                          │                 │     └─ 3.462 _create  _pytest/nodes.py:143
               │                                │  │                          │                 │        ├─ 2.728 __init__  _pytest/python.py:1591
               │                                │  │                          │                 │        │  ├─ 0.558 <dictcomp>  _pytest/python.py:1638
               │                                │  │                          │                 │        │  ├─ 0.525 __init__  _pytest/nodes.py:678
               │                                │  │                          │                 │        │  └─ 0.507 getfixtureinfo  _pytest/fixtures.py:1459
               │                                │  │                          │                 │        └─ 0.734 [self]  
               │                                │  │                          │                 └─ 1.383 call_extra  pluggy/_hooks.py:283
               │                                │  │                          │                       [16 frames hidden]  pluggy
               │                                │  │                          │                          1.357 _multicall  pluggy/_callers.py:9
               │                                │  │                          │                          ├─ 0.757 pytest_generate_tests  _pytest/python.py:143
               │                                │  │                          │                          │  └─ 0.736 parametrize  _pytest/python.py:1015
               │                                │  │                          │                          └─ 0.568 pytest_generate_tests  _pytest/fixtures.py:1559
               │                                │  │                          └─ 4.627 _inject_setup_module_fixture  _pytest/python.py:515
               │                                │  │                             └─ 4.621 obj  _pytest/python.py:282
               │                                │  │                                └─ 4.613 _getobj  _pytest/python.py:506
               │                                │  │                                   └─ 4.612 _importtestmodule  _pytest/python.py:581
               │                                │  │                                      └─ 4.603 import_path  _pytest/pathlib.py:454
               │                                │  │                                         └─ 4.403 import_module  importlib/__init__.py:109
               │                                │  │                                               [28 frames hidden]  importlib, <built-in>
               │                                │  │                                                  4.124 exec_module  _pytest/assertion/rewrite.py:131
               │                                │  │                                                  ├─ 1.052 _read_pyc  _pytest/assertion/rewrite.py:361
               │                                │  │                                                  │  └─ 0.604 load  <built-in>:0
               │                                │  │                                                  │        [2 frames hidden]  <built-in>
               │                                │  │                                                  └─ 0.719 <module>  pandas/tests/test_algos.py:1
               │                                │  │                                                     └─ 0.718 TestRank  pandas/tests/test_algos.py:1757
               │                                │  │                                                        └─ 0.718 arange  <built-in>:0
               │                                │  │                                                              [2 frames hidden]  <built-in>
               │                                │  └─ 0.969 collect_one_node  _pytest/runner.py:541
               │                                │     └─ 0.956 __call__  pluggy/_hooks.py:244
               │                                │           [6 frames hidden]  pluggy, <built-in>
               │                                │              0.956 _multicall  pluggy/_callers.py:9
               │                                │              └─ 0.954 pytest_make_collect_report  _pytest/runner.py:370
               │                                │                 └─ 0.954 from_call  _pytest/runner.py:317
               │                                │                    └─ 0.954 <lambda>  _pytest/runner.py:371
               │                                │                       └─ 0.895 collect  _pytest/python.py:714
               │                                │                          └─ 0.543 _collectfile  _pytest/python.py:690
               │                                └─ 10.979 __call__  pluggy/_hooks.py:244
               │                                      [17 frames hidden]  pluggy, hypothesis, <built-in>
               │                                         10.979 _multicall  pluggy/_callers.py:9
               │                                         ├─ 4.616 pytest_collection_modifyitems  _pytest/fixtures.py:1604
               │                                         │  └─ 4.601 reorder_items  _pytest/fixtures.py:281
               │                                         │     ├─ 2.588 reorder_items_atscope  _pytest/fixtures.py:311
               │                                         │     │  └─ 1.974 reorder_items_atscope  _pytest/fixtures.py:311
               │                                         │     │     └─ 1.374 reorder_items_atscope  _pytest/fixtures.py:311
               │                                         │     │        └─ 0.549 reorder_items_atscope  _pytest/fixtures.py:311
               │                                         │     └─ 1.445 get_parametrized_fixture_keys  _pytest/fixtures.py:245
               │                                         │        └─ 1.153 [self]  
               │                                         ├─ 3.710 pytest_collection_modifyitems  _pytest/mark/__init__.py:263
               │                                         │  └─ 3.707 deselect_by_keyword  _pytest/mark/__init__.py:187
               │                                         │     ├─ 2.679 from_item  _pytest/mark/__init__.py:153
               │                                         │     │  ├─ 0.668 <genexpr>  _pytest/mark/__init__.py:173
               │                                         │     │  │  └─ 0.535 <genexpr>  _pytest/nodes.py:371
               │                                         │     │  ├─ 0.660 [self]  
               │                                         │     │  └─ 0.509 listextrakeywords  _pytest/nodes.py:405
               │                                         │     └─ 0.899 evaluate  _pytest/mark/expression.py:215
               │                                         │        └─ 0.713 <module>  <pytest match expression>:1
               │                                         │              [2 frames hidden]  <pytest match expression>
               │                                         │                 0.663 __getitem__  _pytest/mark/expression.py:180
               │                                         │                 └─ 0.565 __call__  _pytest/mark/__init__.py:177
               │                                         └─ 2.409 pytest_collection_modifyitems  pandas/conftest.py:95
               │                                            ├─ 0.946 __contains__  _pytest/mark/structures.py:563
               │                                            │  └─ 0.723 __contains__  _pytest/mark/structures.py:563
               │                                            │     └─ 0.525 __contains__  _pytest/mark/structures.py:563
               │                                            ├─ 0.723 __getattr__  _pytest/mark/structures.py:495
               │                                            │  └─ 0.700 __init__  _pytest/mark/structures.py:213
               │                                            └─ 0.452 add_marker  _pytest/nodes.py:344
               └─ 2.193 _prepareconfig  _pytest/config/__init__.py:303
                  └─ 2.168 __call__  pluggy/_hooks.py:244
                        [3 frames hidden]  pluggy
                           2.168 _multicall  pluggy/_callers.py:9
                           └─ 2.168 pytest_cmdline_parse  _pytest/config/__init__.py:1028
                              └─ 2.168 parse  _pytest/config/__init__.py:1312
                                 └─ 2.165 _preparse  _pytest/config/__init__.py:1180
                                    └─ 1.769 __call__  pluggy/_hooks.py:244
                                          [3 frames hidden]  pluggy
                                             1.769 _multicall  pluggy/_callers.py:9
                                             └─ 1.768 pytest_load_initial_conftests  _pytest/config/__init__.py:1097
                                                └─ 1.768 _set_initial_conftests  _pytest/config/__init__.py:483
                                                   └─ 1.768 _try_load_conftest  _pytest/config/__init__.py:516
                                                      └─ 1.768 _getconftestmodules  _pytest/config/__init__.py:526
                                                         └─ 1.768 _importconftest  _pytest/config/__init__.py:573
                                                            └─ 1.767 import_path  _pytest/pathlib.py:454
                                                               └─ 1.767 import_module  importlib/__init__.py:109
                                                                     [3 frames hidden]  importlib, <built-in>
                                                                        0.929 <module>  pandas/__init__.py:3
                                                                        0.837 exec_module  _pytest/assertion/rewrite.py:131
                                                                        └─ 0.832 <module>  pandas/conftest.py:1
                                                                           └─ 0.753 <module>  pandas/util/_test_decorators.py:1
                                                                              └─ 0.588 _skip_if_no_scipy  pandas/util/_test_decorators.py:116
                                                                                 └─ 0.588 safe_import  pandas/util/_test_decorators.py:51
                                                                                    └─ 0.533 <module>  scipy/stats/__init__.py:1
                                                                                          [581 frames hidden]  scipy, <built-in>, collections, numpy...
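
A cruder way to spot a slow-to-import test module, without reading a full profile, is to time the imports directly. The sketch below is only an illustration using the standard library: the helper name `time_imports` and the example module names are assumptions, and CPython's `python -X importtime` flag gives a more detailed breakdown of the same thing.

```python
import importlib
import time


def time_imports(module_names):
    """Return (seconds, name) pairs, slowest first, for importing each module."""
    timings = []
    for name in module_names:
        start = time.perf_counter()
        importlib.import_module(name)
        timings.append((time.perf_counter() - start, name))
    return sorted(timings, reverse=True)


if __name__ == "__main__":
    # Illustrative module names; substitute the test modules of interest.
    for seconds, name in time_imports(["json", "decimal"]):
        print(f"{seconds:8.4f}s  {name}")
```

Note that a module already present in `sys.modules` returns almost instantly, and shared dependencies are charged to whichever module imports them first, so the numbers are only a rough guide.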


4 participants