TST: Ensure dtypes are set correctly for empty integer columns #24386 #34886

avinashpancham · 2020-06-20T10:51:35Z

[x] closes DataFrame constructor ignores integer dtype when dict-data and non-overlapping columns #24386
[x] tests added / passed
[x] passes black pandas
[x] passes git diff upstream/master -u -- "*.py" | flake8 --diff
[x] added tests to verify whether empty integer columns are loaded in as integer columns

charlesdong1991

thanks! one nitpick

pandas/tests/dtypes/test_dtypes.py

jorisvandenbossche

Thanks for your contribution!

I would maybe move the test to tests/frame/test_constructors.py

jorisvandenbossche · 2020-06-20T12:12:45Z

pandas/tests/dtypes/test_dtypes.py

+
+
+def test_check_dtype_empty_column():
+    # GH24386: Ensure dtypes are set correctly for empty integer columns


Maybe add to the comment "dictionary data with non-overlapping columns, resulting in an empty DataFrame", as it is specifically this case (not just an empty dataframe)
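
To make the case concrete, here is a minimal sketch of what "dictionary data with non-overlapping columns" means. It assumes a pandas version in which GH24386 is already fixed, and it passes "int64" explicitly (the test in this PR passes `int`) so the result does not depend on the platform's default integer width.

```python
import pandas as pd

# The dict only provides column "a", but only column "b" is requested,
# so no data survives and the resulting frame has zero rows.
df = pd.DataFrame({"a": [1, 2]}, columns=["b"], dtype="int64")

# The single remaining (empty) column should still carry the requested dtype.
assert len(df) == 0
assert list(df.columns) == ["b"]
assert df["b"].dtype == "int64"
```

An entirely empty `pd.DataFrame()` would not exercise this code path, which is why the comment should name the non-overlapping case explicitly.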

avinashpancham · 2020-06-20T12:14:07Z

Thanks. Can I add the test at the bottom of the class TestDataFrameConstructors in tests/frame/test_constructors.py or should I add it in a specific section of the class?

jreback · 2020-06-20T12:18:53Z

pandas/tests/dtypes/test_dtypes.py

+
+def test_check_dtype_empty_column():
+    # GH24386: Ensure dtypes are set correctly for empty integer columns
+    data = pd.DataFrame({"a": [1, 2]}, columns=["b"], dtype=int)


Also, please parametrize over the dtypes mentioned in the OP.

In test_constructors.py there are already similar tests; please move this one near them.

jreback · 2020-06-20T13:05:09Z

pandas/tests/frame/test_constructors.py

+            ("timedelta64[ns]", np.dtype("<m8[ns]")),
+            ("bool", np.dtype("bool")),
+            ("object", np.dtype("O")),
+            ("category", CategoricalDtype(categories=[], ordered=False)),


Great. As long as you are adding a list, can you add the rest of the types? You can easily do this by using the types defined in pandas._testing (we also have fixtures for this, but maybe not one that covers all types; I would be ok with adding one for that as well). Note that I don't think we have Boolean or String defined in pandas._testing (you could add them there).

Added more dtype checks, but now the list is very long. Any ideas on how to make it prettier? I did not want to add them to the lists in pandas._testing, since I was not sure it would be compatible with other tests that rely on them.

you don't need to do it like this; rather, use the already defined ones that are there, e.g. ALL_EA_INT_DTYPES

Do you have any tips on how to refactor the parametrization if I use ALL_EA_INT_DTYPES? The parametrization now has the form [(input_value, expected_value)], but if I loop over ALL_EA_INT_DTYPES I think I still need to manually provide the expected value every time.

how's that? the expected IS the same as the input

of course, you can easily transform these:

In [1]: pd.Int64Dtype().name
Out[1]: 'Int64'

so my point is that you can simply pass in the ALL_* lists (just concatenate them all), and check the dtypes of the result columns
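
A minimal sketch of this suggestion, assuming small hypothetical stand-in lists (`NUMPY_DTYPES`, `EA_INT_DTYPE_CLASSES`) rather than the actual `ALL_*` containers in pandas._testing, whose exact contents are not shown here, and assuming a pandas version where GH24386 is fixed:

```python
import pandas as pd

# Hypothetical stand-ins for the ALL_* lists in pandas._testing.
NUMPY_DTYPES = ["int64", "float64", "bool"]
EA_INT_DTYPE_CLASSES = [pd.Int8Dtype, pd.Int16Dtype, pd.Int32Dtype, pd.Int64Dtype]

# Extension dtype classes can be turned into their string names.
ea_int_names = [cls().name for cls in EA_INT_DTYPE_CLASSES]

# Concatenate everything into one list of dtypes to check.
all_dtypes = NUMPY_DTYPES + ea_int_names

for dtype in all_dtypes:
    df = pd.DataFrame({"a": [1, 2]}, columns=["b"], dtype=dtype)
    # For these dtypes the expected value is the same as the input.
    assert df["b"].dtype == dtype
```

Because each entry serves as both input and expected value, no second column of expected results has to be maintained by hand.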

Great, that already helps me a lot. But for int this would not work, since the expected is 'Int64'. Similarly for float, str.

Sorry for the large number of questions, this is my first PR :)

sure so what you can do is break this into 2 tests, one for the easy cases where this is true and one for the special cases (where you can input 2 args for input and expected)
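
The special-case half of that split could look roughly like this. The `(input, expected)` pairs below are illustrative assumptions, not the list used in the PR, and the plain loop stands in for what would be a `pytest.mark.parametrize` decorator in the pandas test suite. It also assumes a pandas version where GH24386 is fixed.

```python
import pandas as pd

# Hypothetical (input dtype, expected dtype name) pairs: the input is a
# Python type, so the expected dtype name is spelled out separately.
SPECIAL_CASES = [
    (float, "float64"),
    (bool, "bool"),
    (object, "object"),
]


def empty_column_dtype_name(dtype):
    """Return the dtype name of the empty column left after column selection."""
    df = pd.DataFrame({"a": [1, 2]}, columns=["b"], dtype=dtype)
    return df["b"].dtype.name


for dtype, expected in SPECIAL_CASES:
    assert empty_column_dtype_name(dtype) == expected
```

The easy cases, where the expected dtype is the same as the input, stay in their own test with a single parametrized argument.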

Thanks for the feedback, I think I have arrived at a much better test now.

jreback · 2020-06-20T22:26:14Z

thanks @avinashpancham

TST: Ensure dtypes are set correctly for empty integer columns pandas-dev#24386

66433d5

charlesdong1991 added the Testing pandas testing functions or related to the test suite label Jun 20, 2020

charlesdong1991 reviewed Jun 20, 2020

avinashpancham added 2 commits June 20, 2020 14:00

Add comment to refer to GH issue tracker

feb7680

Refactor check, use == instead of is

3c2f72b

jorisvandenbossche reviewed Jun 20, 2020

jreback added Constructors Series/DataFrame/Index/pd.array Constructors Dtype Conversions Unexpected or buggy dtype conversions labels Jun 20, 2020

jreback requested changes Jun 20, 2020

Moved file to test_constructors.py and added test for other dtypes

8c68294

jreback added this to the 1.1 milestone Jun 20, 2020

jreback requested changes Jun 20, 2020

avinashpancham added 2 commits June 20, 2020 15:57

Add support for more dtypes

9a41ab8

Refactor testing for data types using containers in _testing.py

d2dfa8b

jreback approved these changes Jun 20, 2020

jreback merged commit 8020543 into pandas-dev:master Jun 20, 2020
