BUG: AssertionError on Series.append(DataFrame) fix #30975 #31036

hvardhan20 · 2020-01-15T09:51:29Z

closes AssertionError from Series.append(DataFrame) with 1.0.0rc0 #30975
tests added / passed
passes black pandas
passes git diff upstream/master -u -- "*.py" | flake8 --diff
whatsnew entry

Series.append() does not throw AssertionError anymore. This fix performs a precheck before proceeding with concatenation. Tests have passed. Whatsnew entry added.
Thanks!

pep8speaks · 2020-01-15T09:51:33Z

Hello @hvardhan20! Thanks for updating this PR. We checked the lines you've touched for PEP 8 issues, and found:

There are currently no PEP 8 issues detected in this Pull Request. Cheers! 🍻

Comment last updated at 2020-01-21 06:02:52 UTC

simonjayhawkins

Thanks @hvardhan20 for the PR.

pandas/tests/series/methods/test_append.py

simonjayhawkins

Thanks @hvardhan20. You also need to run black pandas before committing; see https://dev.pandas.io/docs/development/contributing.html#python-pep8-black

pandas/core/series.py

simonjayhawkins · 2020-01-15T10:35:27Z

pandas/core/series.py

@@ -2538,6 +2538,10 @@ def append(self, to_append, ignore_index=False, verify_integrity=False) -> "Seri
            to_concat = [self]
            to_concat.extend(to_append)
        else:
+            if not isinstance(to_append, type(self)):


I think we should allow a Series to be appended to a subclassed Series.

Suggested change

-            if not isinstance(to_append, type(self)):
+            if not isinstance(to_append, Series):

We also need to move the check into the condition above so that the items of a sequence are checked,

or, instead of checking inside the else, loop over to_concat[1:] outside the else.
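
To illustrate the subclass point above: a check against type(self) rejects a plain base-class instance when self is an instance of a subclass, while a check against the base class accepts both. A minimal sketch with stand-in classes (the names `Series`, `SubclassedSeries`, `accepts_strict` and `accepts_base` are hypothetical placeholders, not the pandas implementation):

```python
class Series:
    """Stand-in for a base container class (not pandas.Series)."""


class SubclassedSeries(Series):
    """Stand-in for a user-defined subclass."""


def accepts_strict(self_obj, to_append):
    # Check against the exact runtime type of the caller.
    return isinstance(to_append, type(self_obj))


def accepts_base(self_obj, to_append):
    # Check against the base class, so base and subclass instances both pass.
    return isinstance(to_append, Series)


sub = SubclassedSeries()
plain = Series()

strict_result = accepts_strict(sub, plain)  # plain is not a SubclassedSeries
base_result = accepts_base(sub, plain)      # plain is a Series
```

Here `strict_result` is False and `base_result` is True, which is the case the review comment is concerned about.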

Thanks for the hint @simonjayhawkins.
Could you please suggest which of the following two ways is better?

1)

Suggested change

-            if not isinstance(to_append, type(self)):
+        for x in to_concat[1:]:
+            if not isinstance(x, Series):
+                msg = (
+                    f"to_append should be a Series or list/tuple of Series, "
+                    f"got {type(to_append).__name__}"
+                    f'{(" of " + type(x).__name__) if isinstance(to_append, (list, tuple)) else ""}'
+                )
+                raise TypeError(msg)

2)

Suggested change

-            if not isinstance(to_append, type(self)):
+        if any(not isinstance(x, Series) for x in to_concat[1:]):
+            msg = (
+                f"to_append should be a Series or list/tuple of Series, "
+                f"got {type(to_append).__name__}"
+            )
+            raise TypeError(msg)

Both go after the else. The problem with the second way is that we cannot report the type of the sequence element that raises the error. If there's a better way to do this, please do share.
Thanks!
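
The trade-off between the two variants can be sketched without pandas. The helper names `check_with_loop`, `check_with_any` and `error_message`, and the stand-in classes, are hypothetical; only the message text follows the suggestions above.

```python
class Series:
    """Stand-in for pandas.Series (assumption, for illustration only)."""


class DataFrame:
    """Stand-in for pandas.DataFrame (assumption, for illustration only)."""


def check_with_loop(to_append, to_concat):
    # Variant 1: an explicit loop stops at the first bad element,
    # so that element's type can be included in the message.
    for x in to_concat[1:]:
        if not isinstance(x, Series):
            suffix = (
                " of " + type(x).__name__
                if isinstance(to_append, (list, tuple))
                else ""
            )
            raise TypeError(
                "to_append should be a Series or list/tuple of Series, "
                f"got {type(to_append).__name__}{suffix}"
            )


def check_with_any(to_append, to_concat):
    # Variant 2: any() only reports that some element failed, not which one,
    # so the message can name the container alone.
    if any(not isinstance(x, Series) for x in to_concat[1:]):
        raise TypeError(
            "to_append should be a Series or list/tuple of Series, "
            f"got {type(to_append).__name__}"
        )


def error_message(check, to_append):
    # Build to_concat as [self, *items], then capture the error text.
    items = list(to_append) if isinstance(to_append, (list, tuple)) else [to_append]
    to_concat = [Series()] + items
    try:
        check(to_append, to_concat)
    except TypeError as err:
        return str(err)
    return None
```

With a list containing a stand-in DataFrame, the loop variant's message ends in "got list of DataFrame", while the any() variant's message ends in "got list".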

The test case test_concatlike_same_dtypes is failing in the pipeline. It appends various data types and expects the TypeError to be raised in the Concatenator constructor.
With if not isinstance(x, Series): the new check in append raises the TypeError first, for all of those data types.
Hence test_concatlike_same_dtypes is failing.

I think we should consider the following change:

Suggested change

-            if not isinstance(to_append, type(self)):
+        for x in to_concat[1:]:
+            if isinstance(x, pd.DataFrame):
+                raise TypeError

What are your thoughts on this?

IIUC from the issue, the behaviour of adding a DataFrame needs to be revisited, so I agree that checking for a DataFrame may be better than checking for not a Series.

I prefer the simple loop of step 1 (fails faster) and the simplicity of message 2.
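
A rough sketch of the narrowed check discussed here, where only a DataFrame is rejected up front and every other unsupported type is left to the downstream validation. Everything below is a hypothetical stand-in: `precheck`, `concat_stub`, `append` and `message_for` are illustrative names, and the downstream message text is invented for the example.

```python
class Series:
    """Stand-in for pandas.Series (hypothetical)."""


class DataFrame:
    """Stand-in for pandas.DataFrame (hypothetical)."""


def precheck(to_append, to_concat):
    # Simple loop (fails on the first DataFrame) with the simpler message.
    for x in to_concat[1:]:
        if isinstance(x, DataFrame):
            raise TypeError(
                "to_append should be a Series or list/tuple of Series, "
                f"got {type(to_append).__name__}"
            )


def concat_stub(to_concat):
    # Hypothetical downstream step that does its own type validation.
    for x in to_concat:
        if not isinstance(x, Series):
            raise TypeError(
                f"cannot concatenate object of type '{type(x).__name__}'"
            )
    return to_concat


def append(self_obj, to_append):
    if isinstance(to_append, (list, tuple)):
        to_concat = [self_obj, *to_append]
    else:
        to_concat = [self_obj, to_append]
    precheck(to_append, to_concat)
    return concat_stub(to_concat)


def message_for(to_append):
    # Return the error text produced for a given argument, or None.
    try:
        append(Series(), to_append)
    except TypeError as err:
        return str(err)
    return None
```

In this sketch an int passes `precheck` and is rejected by `concat_stub`, so a test that expects the downstream error for non-DataFrame inputs keeps passing.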

pandas/core/series.py

simonjayhawkins · 2020-01-15T10:44:20Z

pandas/tests/series/methods/test_append.py

+        msg = "to_append should be a Series or list/tuple of Series, got " \
+              "<class 'pandas.core.frame.DataFrame'>"
+        with pytest.raises(TypeError, match=msg):
+            df.A.append(df)


also need to add test to check items in a list/tuple

Is this an acceptable test case for testing items in sequence?

Suggested change

-            df.A.append(df)
+        df = pd.DataFrame({"A": [1, 2], "B": [3, 4]})
+        li = [df.B, df]
+        msg = "to_append should be a Series or list/tuple of Series, got list of DataFrame"
+        with pytest.raises(TypeError, match=msg):
+            df.A.append(li)
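
For context on the match= argument used in these tests: pytest.raises(..., match=msg) treats msg as a regular expression and applies re.search to the string form of the raised exception, so a distinctive fragment of the message is enough. The sketch below reproduces that check with the standard library only; `raise_for_list` and `message_matches` are hypothetical stand-ins, not part of pandas or pytest.

```python
import re


def raise_for_list():
    # Hypothetical stand-in for the code under test.
    raise TypeError(
        "to_append should be a Series or list/tuple of Series, "
        "got list of DataFrame"
    )


def message_matches(pattern, func):
    # Mirrors the idea behind match=: search the pattern in str(exception).
    try:
        func()
    except TypeError as err:
        return re.search(pattern, str(err)) is not None
    return False


full = message_matches(
    "to_append should be a Series or list/tuple of Series, got list of DataFrame",
    raise_for_list,
)
fragment = message_matches("got list", raise_for_list)
wrong = message_matches("got tuple", raise_for_list)
```

Because the pattern is a regular expression, messages containing characters such as parentheses or brackets would need re.escape before being used this way.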

pandas/tests/series/methods/test_append.py


simonjayhawkins

Thanks @hvardhan20. A few more suggestions; otherwise lgtm.

simonjayhawkins · 2020-01-18T09:45:28Z

pandas/core/series.py

+            if isinstance(x, (pd.DataFrame,)):
+                msg = (
+                    f"to_append should be a Series or list/tuple of Series, "
+                    f"got {type(to_append).__name__}"


This may make the error message clearer in the case where the sequence contains a DataFrame.

Suggested change

-                    f"got {type(to_append).__name__}"
+                    f"got {type(x).__name__}"

simonjayhawkins · 2020-01-18T09:46:52Z

pandas/core/series.py

@@ -2539,6 +2539,13 @@ def append(self, to_append, ignore_index=False, verify_integrity=False) -> "Seri
            to_concat.extend(to_append)
        else:
            to_concat = [self, to_append]
+        for x in to_concat[1:]:
+            if isinstance(x, (pd.DataFrame,)):


Suggested change

-            if isinstance(x, (pd.DataFrame,)):
+            if isinstance(x, ABCDataFrame):

Instead of checking if it's a DataFrame, it would be better to check that it is not an acceptable type, e.g. a Series or list-like of Series.

IIUC other incorrect types are caught by concat. This PR will be backported so just minimising the changes here to fix the regression (raises AssertionError) for now.

could catch the AssertionError and raise improved message instead?

no i would rather not backport this, let's actually fix this properly.


and backport the proper fix?

I think we need something backported. To get back to the (dubious) 0.25.3 behaviour, we could just remove the self._ensure_type call, which was added for typing purposes and is causing the AssertionError in 1.0.0rc0.


ok let's do that then
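
The regression discussed in this thread can be sketched with stand-in classes. This assumes, based on the comments above, that _ensure_type asserts that the concatenation result has the same type as self; the classes, the two append variants and the `concat_stub` helper are hypothetical and do not reproduce pandas internals.

```python
class DataFrame:
    """Stand-in for pandas.DataFrame (hypothetical)."""


class Series:
    """Stand-in for pandas.Series (hypothetical)."""

    def _ensure_type(self, obj):
        # Assumed behaviour: assert the result is an instance of the caller's type.
        assert isinstance(obj, type(self)), type(obj)
        return obj

    def append_with_check(self, other):
        # Sketch of the 1.0.0rc0 situation: result passes through _ensure_type.
        return self._ensure_type(concat_stub([self, other]))

    def append_without_check(self, other):
        # Sketch of the agreed fix: return whatever concat produces.
        return concat_stub([self, other])


def concat_stub(objs):
    # Hypothetical concat: mixing in a DataFrame yields a DataFrame.
    if any(isinstance(o, DataFrame) for o in objs):
        return DataFrame()
    return Series()


s = Series()
try:
    s.append_with_check(DataFrame())
    raised_assertion = False
except AssertionError:
    raised_assertion = True

result = s.append_without_check(DataFrame())
```

In this sketch `raised_assertion` is True, and without the check the call returns the stand-in DataFrame, which corresponds to restoring the earlier behaviour rather than raising a TypeError.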

simonjayhawkins · 2020-01-18T09:49:50Z

pandas/tests/series/methods/test_append.py

+            df.A.append(df)
+        msg = "to_append should be a Series or list/tuple of Series, got list"
+        with pytest.raises(TypeError, match=msg):
+            df.A.append(li)


could use just the minimum to invoke TypeError

Suggested change

-            df.A.append(li)
+            df.A.append([df])

jreback · 2020-01-18T15:30:32Z

pandas/core/series.py

@@ -2539,6 +2539,13 @@ def append(self, to_append, ignore_index=False, verify_integrity=False) -> "Seri
            to_concat.extend(to_append)
        else:
            to_concat = [self, to_append]
+        for x in to_concat[1:]:
+            if isinstance(x, (pd.DataFrame,)):


Instead of checking if it's a DataFrame, it would be better to check that it is not an acceptable type, e.g. a Series or list-like of Series.


simonjayhawkins

@hvardhan20 the direction of this PR should now be #31036 (comment) instead of what was proposed in #30975 (comment)

simonjayhawkins · 2020-01-19T10:08:34Z

pandas/core/series.py

@@ -2462,7 +2462,7 @@ def searchsorted(self, value, side="left", sorter=None):
    # -------------------------------------------------------------------
    # Combination

-    def append(self, to_append, ignore_index=False, verify_integrity=False) -> "Series":


Rather than deleting the annotation, can you use FrameOrSeriesUnion from pandas._typing instead?

@simonjayhawkins
FrameOrSeriesUnion is throwing this error during static analysis:
Performing static analysis using mypy
pandas/core/frame.py:2590: error: Incompatible types in assignment (expression has type "Union[DataFrame, Series]", variable has type "Series")

And FrameOrSeries is throwing this:
Performing static analysis using mypy
pandas/core/series.py:2505: error: Incompatible return value type (got "Union[DataFrame, Series]", expected "FrameOrSeries")

I guess this is why there was no type hint initially for the 0.25.3 version of append.
What do you think about this?

Thanks!

OK perhaps we should keep this PR to just fixing the regression. so removing the return annotation is fine here.
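
The two mypy errors quoted above reflect the difference between a union and a type variable. A sketch, assuming (as the error messages suggest) that FrameOrSeriesUnion is an alias for a Union of the two classes and FrameOrSeries is a TypeVar; the stand-in classes and the functions `append_union` and `same_type` are hypothetical.

```python
from typing import TypeVar, Union


class NDFrame:
    """Stand-in base class (hypothetical)."""


class Series(NDFrame):
    """Stand-in for pandas.Series (hypothetical)."""


class DataFrame(NDFrame):
    """Stand-in for pandas.DataFrame (hypothetical)."""


# Assumed shapes of the two aliases discussed in the thread.
FrameOrSeriesUnion = Union[DataFrame, Series]
FrameOrSeries = TypeVar("FrameOrSeries", bound=NDFrame)


def append_union(left: Series, right: NDFrame) -> FrameOrSeriesUnion:
    # Union return: the caller may get either class back, so assigning the
    # result to a variable annotated as Series needs narrowing first.
    if isinstance(right, DataFrame):
        return DataFrame()
    return Series()


def same_type(obj: FrameOrSeries) -> FrameOrSeries:
    # TypeVar return: promises the same type as the argument, which a
    # function that may return either class cannot guarantee.
    return obj


result = append_union(Series(), Series())
if isinstance(result, Series):
    # After the isinstance check a type checker can treat result as Series.
    narrowed: Series = result
```

Under these assumptions, annotating append with the union pushes narrowing onto every caller that expects a Series, and annotating it with the type variable is rejected because the body can return a different class than the one it was called on.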

pandas/core/series.py

pandas/tests/series/methods/test_append.py

doc/source/whatsnew/v1.0.0.rst


simonjayhawkins

@hvardhan20 lgtm @TomAugspurger @jreback

simonjayhawkins · 2020-01-22T12:07:21Z

pandas/core/series.py

@@ -2462,7 +2462,7 @@ def searchsorted(self, value, side="left", sorter=None):
    # -------------------------------------------------------------------
    # Combination

-    def append(self, to_append, ignore_index=False, verify_integrity=False) -> "Series":


OK perhaps we should keep this PR to just fixing the regression. so removing the return annotation is fine here.

TomAugspurger · 2020-01-22T15:45:50Z

This seems fine for backport.

What's the plan for 1.0.1 / 1.1? For this to raise a TypeError? Do we have an issue for that?


jreback · 2020-01-24T03:38:05Z

thanks @hvardhan20

can you open an issue as @TomAugspurger suggests in #31036 (comment)?


hvardhan20 added 2 commits January 15, 2020 03:13

GH 30975-Fix for Series.append(DataFrame)

7a3b6fe

GH 30975-Fix for Series.append(DataFrame)

8ea217f

hvardhan20 changed the title ~~Series append fix~~ Series.append(DataFrame) AssertionError Fix #30975 Jan 15, 2020

GH 30975-Fix for Series.append(DataFrame) PEP-8 compliant

d522438

simonjayhawkins requested changes Jan 15, 2020

pandas/tests/series/methods/test_append.py (outdated)

simonjayhawkins added Error Reporting Incorrect or improved errors from pandas Reshaping Concat, Merge/Join, Stack/Unstack, Explode labels Jan 15, 2020

GH 30975-Fix for Series.append(DataFrame) Adding TypeError Testcase

b717007

hvardhan20 requested a review from simonjayhawkins January 15, 2020 10:28

simonjayhawkins requested changes Jan 15, 2020


hvardhan20 added 2 commits January 15, 2020 18:40

GH 30975-Fix for Series.append(DataFrame) Modified

b584b35

GH 30975-Fix for Series.append(DataFrame) Doc Build Fix

1e5e8e7

hvardhan20 requested a review from simonjayhawkins January 16, 2020 02:40

hvardhan20 added 3 commits January 16, 2020 20:53

GH 31087 Updates ggpy reference to plotnine

53f1c11

Revert "GH 31087 Updates ggpy reference to plotnine"

4e1f6b1

This reverts commit 53f1c11.

Revert "GH 31087 Updates ggpy reference to plotnine"

041ed12

This reverts commit 53f1c11.

hvardhan20 force-pushed the series_append_fix branch from 53f1c11 to 1e5e8e7 Compare January 17, 2020 03:44

hvardhan20 changed the title ~~Series.append(DataFrame) AssertionError Fix #30975~~ BUG: AssertionError on Series.append(DataFrame) fix #30975 Jan 17, 2020

hvardhan20 added 2 commits January 17, 2020 14:04

GH 30975-Fix for Series.append(DataFrame) Simple message change

721a560

GH 30975-Fix for Series.append(DataFrame) Simple message test change

f312bae

simonjayhawkins reviewed Jan 18, 2020


simonjayhawkins added this to the 1.0.0 milestone Jan 18, 2020

jreback requested changes Jan 18, 2020


jreback modified the milestone: 1.0.0 Jan 18, 2020

hvardhan20 added 2 commits January 18, 2020 20:26

GH 30975-Fix for Series.append(DataFrame) Removed self._ensure_type

a1ebee2

GH 30975-Fix for Series.append(DataFrame) Removed self._ensure_type r…

32d2989

…eturn type change

hvardhan20 requested a review from jreback January 19, 2020 07:35

simonjayhawkins requested changes Jan 19, 2020


simonjayhawkins added Regression Functionality that used to work in a prior pandas version and removed Error Reporting Incorrect or improved errors from pandas labels Jan 19, 2020

hvardhan20 added 6 commits January 20, 2020 04:13

GH 30975-Fix for Series.append(DataFrame) 0.25.3 behaviour

87ea2aa

GH 30975-Fix for Series.append(DataFrame) 0.25.3 behaviour return typ…

19dcea3

…ing fix

GH 30975-Fix for Series.append(DataFrame) 0.25.3 behaviour return typ…

cdf92ae

…ing fix

GH 30975-Fix for Series.append(DataFrame) 0.25.3 behaviour return typ…

a747921

…ing fix

GH 30975-Fix for Series.append(DataFrame) 0.25.3 behaviour return typ…

a256fe8

…ing fix

GH 30975-Fix for Series.append(DataFrame) 0.25.3 behaviour return typ…

14e0fb4

…ing fix

hvardhan20 requested a review from simonjayhawkins January 21, 2020 06:45

simonjayhawkins approved these changes Jan 22, 2020


jreback approved these changes Jan 24, 2020


jreback merged commit 8d1e084 into pandas-dev:master Jan 24, 2020

meeseeksmachine pushed a commit to meeseeksmachine/pandas that referenced this pull request Jan 24, 2020

Backport PR pandas-dev#31036: BUG: AssertionError on Series.append(Da…

66f2aaa

…taFrame) fix pandas-dev#30975

meeseeksmachine mentioned this pull request Jan 24, 2020

Backport PR #31036 on branch 1.0.x (BUG: AssertionError on Series.append(DataFrame) fix #30975 ) #31267

Merged

jreback pushed a commit that referenced this pull request Jan 24, 2020

Backport PR #31036: BUG: AssertionError on Series.append(DataFrame) fix #30975 (#31267)

cf2e9b9

Co-authored-by: Harshavardhan Bachina

hvardhan20 deleted the series_append_fix branch January 24, 2020 20:46

hvardhan20 mentioned this pull request Jan 29, 2020

Series.append(DataFrame) should throw TypeError #31413

Closed

hvardhan20 mentioned this pull request Feb 19, 2020

Series append raises TypeError #32090

Merged

5 tasks
