Skip to content

TST: split up test_concat.py #37243 #37259

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed

Conversation

kamilt5
Copy link
Contributor

@kamilt5 kamilt5 commented Oct 19, 2020

Copy link
Contributor

@jreback jreback left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

so i have request some reorg. the key here is to not touch inside tests at all, so that you can asssert number of tests before == number of tests after. I am happy for followups to touch up things is totally fine as well. but i think splitting out by dtype is useful here (and before merge).

return request.param


class TestConcatenate:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i think this was a mistake originally, can you fold these into concat/test_concat.py

result = concat([df.iloc[[0]], df.iloc[[1]]])
tm.assert_series_equal(result.dtypes, df.dtypes)

def test_concat_series(self):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you move these to test_series.py


tm.assert_frame_equal(pd.concat(CustomIterator2(), ignore_index=True), expected)

def test_concat_invalid(self):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you create a test_invalid.py (and move things that raise there)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it might be more clear to do this as another pass (after merging this)

y = [pd.Period("2011-03", freq="M"), pd.Period("2011-04", freq="M")]
result = concat([pd.Series(x), pd.Series(y)], ignore_index=True)
tm.assert_series_equal(result, pd.Series(x + y, dtype="object"))

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

would not be averse to creating a test_datetimes.py all datetime / tz related, can also split out a test_period and test_timedelta (if we have any of those)

# it works
concat([df1, df2], sort=sort)

def test_handle_empty_objects(self, sort):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

might be worthile to create a test_empty.py

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should I put also methods, which check empty, from test_series in test_empty? There are methods in test_series like:

    def test_concat_empty_series_dtypes(self, left, right, expected):
        result = pd.concat([Series(dtype=left), Series(dtype=right)])
   def test_concat_empty_series_dtypes_roundtrips(self):
        # round-tripping with self & like self

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes ideally group all of teh empty tests together (whether dataframe / series).

dfa = df1.append(df2)
tm.assert_index_equal(df["grade"].cat.categories, dfa["grade"].cat.categories)

def test_categorical_concat_preserve(self):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

create a test_categorical

@@ -0,0 +1,274 @@
from datetime import datetime
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i would actually like to integrate these with the other tests (e.g. if they are tz related then into test_datetimes.py), but of some won't and you can leave if its more clear here or put in test_concat either way)

@jreback jreback added Reshaping Concat, Merge/Join, Stack/Unstack, Explode Testing pandas testing functions or related to the test suite labels Oct 20, 2020
@pep8speaks
Copy link

pep8speaks commented Oct 20, 2020

Hello @laffite56! Thanks for updating this PR. We checked the lines you've touched for PEP 8 issues, and found:

Line 84:89: E501 line too long (91 > 88 characters)

Line 230:89: E501 line too long (89 > 88 characters)

Comment last updated at 2020-10-20 17:03:34 UTC

@kamilt5
Copy link
Contributor Author

kamilt5 commented Oct 20, 2020

I have put a few functions into classes, but i don't know if this is fine.

@jreback
Copy link
Contributor

jreback commented Oct 20, 2020

I have put a few functions into classes, but i don't know if this is fine.

I generally would go the other way actually.

But most important for this PR. please let's do just straight copy-paste, its very hard to see changes otherwise. my requests were all about just moving existing code around in blocks, not actually change that much.

so we want to merge this soon. I am also happy to do this in a separate PR by first literally moving the entire file, then moving pieces of it.

@kamilt5
Copy link
Contributor Author

kamilt5 commented Oct 21, 2020

I'm a little confused. You want me to create a new PR, with one commit containing only copy-paste original "test_concat.py" to new folder, like i have right now:

from pandas/tests/reshape/test_concat.py => pandas/tests/reshape/concat/test_concat.py

and then in next commit reshape "test_concat.py" to submodules?

@kamilt5
Copy link
Contributor Author

kamilt5 commented Oct 22, 2020

Sir, @jreback, i don't really know what to do with this issue. I can easly make new PR with every commit separated, like one commit for moving one file, then next ones for every new submodule and for every significant change, so You'll be able to see every change.

@jreback
Copy link
Contributor

jreback commented Oct 22, 2020

Sir, @jreback, i don't really know what to do with this issue. I can easly make new PR with every commit separated, like one commit for moving one file, then next ones for every new submodule and for every significant change, so You'll be able to see every change.

so 1) pls merge master. I would actually just do a straight move & split and NOTHING else in a the first pr. split by classes is fine.
let's do that and will merge quickly. Then its very easy to do followups.

@kamilt5
Copy link
Contributor Author

kamilt5 commented Oct 23, 2020

#37360 here is new PR.

@kamilt5 kamilt5 closed this Oct 28, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Reshaping Concat, Merge/Join, Stack/Unstack, Explode Testing pandas testing functions or related to the test suite
Projects
None yet
Development

Successfully merging this pull request may close these issues.

TST: split up test_concat.py
3 participants