Commit 8e67015

TST, fix for issue pandas-dev#17978.
Incorporate review comments.
1 parent b8d2b82 commit 8e67015

2 files changed: +92 -59 lines changed
doc/source/contributing.rst

+3 -37
@@ -777,45 +777,12 @@ Tests that we have ``parametrized`` are now accessible via the test name, for ex
 
 Using ``hypothesis``
 ~~~~~~~~~~~~~~~~~~~~
-With the transition to pytest, things have become easier for testing by having reduced boilerplate for test cases and also by utilizing pytest's features like parametizing, skipping and marking test cases.
+With the usage of pytest, things have become easier for testing by having reduced boilerplate for test cases and also by utilizing pytest's features like parametizing, skipping and marking test cases.
 
 However, one has to still come up with input data examples which can be tested against the functionality. There is always a possibility to skip testing an example which could have failed the test case.
 
 Hypothesis is a python package which helps in overcoming this issue by generating the input data based on some set of specifications provided by the user.
-e.g suppose we have to test python's sum function for a list of int.
-
-Here is a sample test case using pytest:
-
-.. code-block:: python
-
-    import pytest
-
-    @pytest.mark.parametrize('seq', [
-        [0, 0, 0],
-        [1, 2, 3, 4],
-        [-3, 5, -8, 23],
-        [12345678, 9876543, 567894321]
-    ])
-    def test_sum_using_pytest(seq):
-        total = 0
-        for item in seq:
-            total += item
-        assert sum(seq) == total
-
-output of test cases:
-
-.. code-block:: shell
-
-    collecting ... collected 4 items
-    pytest_example.py::test_sum_using_pytest[seq0] PASSED [ 25%]
-    pytest_example.py::test_sum_using_pytest[seq1] PASSED [ 50%]
-    pytest_example.py::test_sum_using_pytest[seq2] PASSED [ 75%]
-    pytest_example.py::test_sum_using_pytest[seq3] PASSED [100%]
-
-    ========================== 4 passed in 0.06 seconds ===========================
-
-
-Compare it with below example for the same test case using hypothesis.
+e.g consider the test case for testing python's sum function for a list of int using hypothesis.
 
 .. code-block:: python
 
@@ -840,8 +807,7 @@ output of test cases:
 
     ========================== 1 passed in 0.33 seconds ===========================
 
-The main difference in above example is use of a decorator "@given(st.lists(st.integers()))" which if applied to test case function, generates some random list of int, which is then assigned to parameter of test case.
-Above example clearly helps in adding more coverage for our test functions.
+In above example by applying a decorator "@given(st.lists(st.integers()))" to the unit test function, we have directed hypothesis to generate some random list of int as input for the test function, which eventually helps in adding more coverage for our test functions by generating random input data.
 
 For more information about hypothesis or in general about property based testing, check below links:
 
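
Note: the revised documentation text describes applying ``@given(st.lists(st.integers()))`` to a test of Python's ``sum``. The code block it refers to falls outside the hunks shown here, so the following is only a minimal sketch of what such a test might look like, reusing the loop from the removed pytest example. The test name is hypothetical, and the sketch assumes the ``hypothesis`` package is installed.

```python
from hypothesis import given, strategies as st


# Hypothetical test name; hypothesis generates many lists of ints and
# calls the test body once per generated list.
@given(st.lists(st.integers()))
def test_sum_matches_manual_total(seq):
    total = 0
    for item in seq:
        total += item
    # Property: the built-in sum agrees with a hand-rolled accumulation.
    assert sum(seq) == total


if __name__ == "__main__":
    # Calling the decorated function with no arguments runs the whole
    # property check; it raises if a counterexample is found.
    test_sum_matches_manual_total()
```

Under pytest the function is collected like any other test; no ``parametrize`` list is needed because the inputs come from the strategy.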

pandas/util/_hypothesis.py

+89 -22
@@ -1,3 +1,11 @@
+"""
+This module houses utility functions to generate hypothesis strategies which
+can be used to generate random input test data for various test cases.
+It is for internal use by different test case files like pandas/test/test*.py
+files only and should not be used beyond this purpose.
+For more information on hypothesis, check
+(http://hypothesis.readthedocs.io/en/latest/).
+"""
 import string
 from hypothesis import (given,
                         settings,
@@ -7,6 +15,62 @@
 
 
 def get_elements(elem_type):
+    """
+    Helper function to return hypothesis strategy whose elements depends on
+    the input data-type.
+    Currently only four types are supported namely, bool, int, float and str.
+
+    Parameters
+    ----------
+    elem_type: type
+        type of the elements for the strategy.
+
+    Returns
+    -------
+    hypothesis strategy.
+
+    Examples
+    --------
+    >>> strat = get_elements(str)
+    >>> strat.example()
+    'KWAo'
+
+    >>> strat.example()
+    'OfAlBH'
+
+    >>> strat = get_elements(int)
+    >>> strat.example()
+    31911
+
+    >>> strat.example()
+    25288
+
+    >>> strat = get_elements(float)
+    >>> strat.example()
+    nan
+
+    >>> strat.example()
+    inf
+
+    >>> strat.example()
+    -2.2250738585072014e-308
+
+    >>> strat.example()
+    0.5
+
+    >>> strat.example()
+    1.7976931348623157e+308
+
+    >>> strat = get_elements(bool)
+    >>> strat.example()
+    True
+
+    >>> strat.example()
+    True
+
+    >>> strat.example()
+    False
+    """
     strategy = st.nothing()
     if elem_type == bool:
         strategy = st.booleans()
@@ -49,28 +113,32 @@ def get_seq(draw, types, mixed=False, min_size=None, max_size=None,
 
     Examples
     --------
-    seq_strategy = get_seq((int, str, bool),
-                           mixed=True, min_size=1, max_size=5)
-    seq_strategy.example()
-    Out[12]: ['lkYMSn', -2501, 35, 'J']
-    seq_strategy.example()
-    Out[13]: [True]
-    seq_strategy.example()
-    Out[14]: ['dRWgQYrBrW', True, False, 'gmsujJVDBM', 'Z']
-
-    seq_strategy = get_seq((int, bool),
-                           mixed=False,
-                           min_size=1,
-                           max_size=5,
-                           transform_func=lambda seq: [str(x) for x in seq])
-    seq_strategy.example()
-    Out[19]: ['-1892']
-    seq_strategy.example()
-    Out[20]: ['22', '66', '14785', '-26312', '32']
-    seq_strategy.example()
-    Out[21]: ['22890', '-15537', '96']
+    >>> seq_strategy = get_seq((int, str, bool), mixed=True, min_size=1, max_size=5)
+
+    >>> seq_strategy.example()
+    ['lkYMSn', -2501, 35, 'J']
+
+    >>> seq_strategy.example()
+    [True]
+
+    >>> seq_strategy.example()
+    ['dRWgQYrBrW', True, False, 'gmsujJVDBM', 'Z']
+
+    >>> seq_strategy = get_seq((int, bool),
+    ...                        mixed=False,
+    ...                        min_size=1,
+    ...                        max_size=5,
+    ...                        transform_func=lambda seq: [str(x) for x in seq])
+
+    >>> seq_strategy.example()
+    ['9552', '124', '-24024']
+
+    >>> seq_strategy.example()
+    ['-1892']
+
+    >>> seq_strategy.example()
+    ['22', '66', '14785', '-26312', '32']
     """
-    strategy = st.nothing()
     if min_size is None:
         min_size = draw(st.integers(min_value=0, max_value=100))
 
@@ -85,7 +153,6 @@ def get_seq(draw, types, mixed=False, min_size=None, max_size=None,
         elem_strategies.append(get_elements(elem_type))
         if not mixed:
             break
-
     if transform_func:
         strategy = draw(st.lists(st.one_of(elem_strategies),
                                  min_size=min_size,
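
Note: the hunks above show ``get_seq`` taking ``draw`` as its first argument and drawing ``min_size`` itself when the caller passes ``None``, which is the usual shape of a hypothesis composite strategy. The sketch below illustrates that pattern in isolation. ``seq_sketch``, its size bounds and the test name are hypothetical stand-ins rather than the pandas helper, and the sketch assumes the ``hypothesis`` package is installed.

```python
from hypothesis import given, strategies as st


@st.composite
def seq_sketch(draw, elem_strategy, min_size=None, max_size=None):
    """Hypothetical, simplified stand-in for a get_seq-style helper."""
    # When a bound is not supplied, draw it so hypothesis can vary it too.
    if min_size is None:
        min_size = draw(st.integers(min_value=0, max_value=10))
    if max_size is None:
        max_size = draw(st.integers(min_value=min_size,
                                    max_value=min_size + 10))
    return draw(st.lists(elem_strategy,
                         min_size=min_size,
                         max_size=max_size))


# The composite is called without ``draw``; hypothesis supplies it.
@given(seq_sketch(st.integers(), min_size=1, max_size=5))
def test_seq_sketch_respects_bounds(seq):
    assert 1 <= len(seq) <= 5
    assert all(isinstance(item, int) for item in seq)


if __name__ == "__main__":
    test_seq_sketch_respects_bounds()
```

Drawing the bounds inside the composite keeps the call site short, at the cost of less predictable list sizes when the caller leaves them unset.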
