BUG: AssertionError on Series.append(DataFrame) fix #30975 #31036

Merged
merged 19 commits into from
Jan 24, 2020
Changes from 11 commits
7a3b6fe
GH 30975-Fix for Series.append(DataFrame)
hvardhan20 Jan 15, 2020
8ea217f
GH 30975-Fix for Series.append(DataFrame)
hvardhan20 Jan 15, 2020
d522438
GH 30975-Fix for Series.append(DataFrame) PEP-8 compliant
hvardhan20 Jan 15, 2020
b717007
GH 30975-Fix for Series.append(DataFrame) Adding TypeError Testcase
hvardhan20 Jan 15, 2020
b584b35
GH 30975-Fix for Series.append(DataFrame) Modified
hvardhan20 Jan 16, 2020
1e5e8e7
GH 30975-Fix for Series.append(DataFrame) Doc Build Fix
hvardhan20 Jan 16, 2020
53f1c11
GH 31087 Updates ggpy reference to plotnine
hvardhan20 Jan 17, 2020
4e1f6b1
Revert "GH 31087 Updates ggpy reference to plotnine"
hvardhan20 Jan 17, 2020
041ed12
Revert "GH 31087 Updates ggpy reference to plotnine"
hvardhan20 Jan 17, 2020
721a560
GH 30975-Fix for Series.append(DataFrame) Simple message change
hvardhan20 Jan 17, 2020
f312bae
GH 30975-Fix for Series.append(DataFrame) Simple message test change
hvardhan20 Jan 17, 2020
a1ebee2
GH 30975-Fix for Series.append(DataFrame) Removed self._ensure_type
hvardhan20 Jan 19, 2020
32d2989
GH 30975-Fix for Series.append(DataFrame) Removed self._ensure_type r…
hvardhan20 Jan 19, 2020
87ea2aa
GH 30975-Fix for Series.append(DataFrame) 0.25.3 behaviour
hvardhan20 Jan 20, 2020
19dcea3
GH 30975-Fix for Series.append(DataFrame) 0.25.3 behaviour return typ…
hvardhan20 Jan 20, 2020
cdf92ae
GH 30975-Fix for Series.append(DataFrame) 0.25.3 behaviour return typ…
hvardhan20 Jan 20, 2020
a747921
GH 30975-Fix for Series.append(DataFrame) 0.25.3 behaviour return typ…
hvardhan20 Jan 20, 2020
a256fe8
GH 30975-Fix for Series.append(DataFrame) 0.25.3 behaviour return typ…
hvardhan20 Jan 21, 2020
14e0fb4
GH 30975-Fix for Series.append(DataFrame) 0.25.3 behaviour return typ…
hvardhan20 Jan 21, 2020
1 change: 1 addition & 0 deletions doc/source/whatsnew/v1.0.0.rst
@@ -1155,6 +1155,7 @@ Other
- :meth:`DataFrame.to_csv` and :meth:`Series.to_csv` now support dicts as ``compression`` argument with key ``'method'`` being the compression method and others as additional compression options when the compression method is ``'zip'``. (:issue:`26023`)
- Bug in :meth:`Series.diff` where a boolean series would incorrectly raise a ``TypeError`` (:issue:`17294`)
- :meth:`Series.append` will no longer raise a ``TypeError`` when passed a tuple of ``Series`` (:issue:`28410`)
- :meth:`Series.append` will now raise a ``TypeError`` when passed a DataFrame (:issue:`30975`)
- Fix corrupted error message when calling ``pandas.libs._json.encode()`` on a 0d array (:issue:`18878`)
- Backtick quoting in :meth:`DataFrame.query` and :meth:`DataFrame.eval` can now also be used to use invalid identifiers like names that start with a digit, are python keywords, or are using single character operators. (:issue:`27017`)
- Bug in ``pd.core.util.hashing.hash_pandas_object`` where arrays containing tuples were incorrectly treated as non-hashable (:issue:`28969`)
7 changes: 7 additions & 0 deletions pandas/core/series.py
@@ -2539,6 +2539,13 @@ def append(self, to_append, ignore_index=False, verify_integrity=False) -> "Seri
            to_concat.extend(to_append)
        else:
            to_concat = [self, to_append]
        for x in to_concat[1:]:
            if isinstance(x, (pd.DataFrame,)):
Member

Suggested change
if isinstance(x, (pd.DataFrame,)):
if isinstance(x, ABCDataFrame):

Contributor

Instead of checking whether it's a DataFrame, it would be better to check that it is not an acceptable type, e.g. a Series or a list-like of Series.
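
The allowlist idea in this comment can be sketched without pandas. `StubSeries`, `StubDataFrame` and `check_to_append` below are hypothetical stand-ins invented for illustration, not pandas code; the sketch only shows rejecting anything that is not a Series or a list/tuple of Series, rather than rejecting DataFrame specifically.

```python
# Hypothetical stand-ins; these are NOT pandas classes.
class StubSeries:
    pass


class StubDataFrame:
    pass


def check_to_append(to_append):
    """Return the items as a list, or raise TypeError if any is not a StubSeries."""
    items = to_append if isinstance(to_append, (list, tuple)) else [to_append]
    for item in items:
        # Allowlist: anything that is not a StubSeries is rejected,
        # so DataFrame needs no special case.
        if not isinstance(item, StubSeries):
            raise TypeError(
                "to_append should be a Series or list/tuple of Series, "
                f"got {type(item).__name__}"
            )
    return list(items)
```

With this shape, an integer or a string is rejected by the same branch that rejects a frame.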

Member

IIUC other incorrect types are caught by concat. This PR will be backported so just minimising the changes here to fix the regression (raises AssertionError) for now.

Member

could catch the AssertionError and raise improved message instead?
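
A minimal sketch of that alternative, assuming the type check is a plain `assert` as the thread describes. `StubSeries`, its `append` signature and the error text are hypothetical and chosen only for illustration; this is not the pandas implementation.

```python
class StubSeries:
    """Hypothetical stand-in for a Series; not pandas code."""

    def _ensure_type(self, obj):
        # Stand-in for a typing helper that asserts the result type.
        assert isinstance(obj, type(self)), type(obj)
        return obj

    def append(self, result):
        # `result` stands in for whatever the concatenation step returned.
        try:
            return self._ensure_type(result)
        except AssertionError as err:
            # Re-raise as TypeError with a readable message, keeping the cause.
            raise TypeError(
                f"append produced {type(result).__name__}, expected Series"
            ) from err
```

One caveat with this approach: `assert` statements are skipped when Python runs with `-O`, so the `except` branch would never fire in that mode.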

Contributor

no i would rather not backport this, let's actually fix this properly.

Member

> no i would rather not backport this, let's actually fix this properly.

and backport the proper fix?

I think we need something backported. to get back to the (dubious) 0.25.3 behaviour could just remove the self._ensure_type call added for typing purposes that is causing the AssertionError in 1.0.0rc0.

Contributor

> I think we need something backported. to get back to the (dubious) 0.25.3 behaviour could just remove the self._ensure_type call added for typing purposes that is causing the AssertionError in 1.0.0rc0.

ok let's do that then
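
The effect of dropping the type assertion can be sketched as follows. Everything here is a hypothetical stand-in (`StubSeries`, `StubFrame`, `stub_concat` and both `append_*` helpers), and the sketch assumes for illustration that concatenating a series with a frame yields a frame.

```python
class StubSeries:
    """Hypothetical stand-in for a Series; not pandas code."""


class StubFrame:
    """Hypothetical stand-in for a non-Series result."""


def stub_concat(objs):
    # Assumption for illustration: mixing in a frame yields a frame.
    if any(isinstance(o, StubFrame) for o in objs):
        return StubFrame()
    return StubSeries()


def append_with_check(self_obj, other):
    result = stub_concat([self_obj, other])
    # Asserting the result type is what turns a frame into an AssertionError.
    assert isinstance(result, type(self_obj)), type(result)
    return result


def append_without_check(self_obj, other):
    # With the assertion removed, the result is returned whatever its type.
    return stub_concat([self_obj, other])
```

Under these assumptions, `append_without_check(StubSeries(), StubFrame())` returns a `StubFrame` instead of raising, which is the kind of behaviour the thread calls dubious but accepts for the backport.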

                msg = (
                    f"to_append should be a Series or list/tuple of Series, "
                    f"got {type(to_append).__name__}"
Member

may make the error message clearer in the case where the sequence contains a DataFrame

Suggested change
f"got {type(to_append).__name__}"
f"got {type(x).__name__}"
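
To see why the suggestion matters, compare what each expression names when the argument is a list. The classes below are hypothetical stand-ins, not pandas objects.

```python
class StubSeries:
    """Hypothetical stand-in for a Series; not pandas code."""


class StubDataFrame:
    """Hypothetical stand-in for a DataFrame; not pandas code."""


to_append = [StubSeries(), StubDataFrame()]

# First item in the list that is a frame.
offender = next(x for x in to_append if isinstance(x, StubDataFrame))

print(type(to_append).__name__)  # list
print(type(offender).__name__)   # StubDataFrame
```

Naming the container reports only `list`, which is itself an accepted argument type, while naming the item points at the object that is actually wrong.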

                )
                raise TypeError(msg)
        return self._ensure_type(
            concat(
                to_concat, ignore_index=ignore_index, verify_integrity=verify_integrity
12 changes: 12 additions & 0 deletions pandas/tests/series/methods/test_append.py
@@ -61,6 +61,18 @@ def test_append_tuples(self):

        tm.assert_series_equal(expected, result)

    def test_append_dataframe_raises(self):
        # GH 30975
        df = pd.DataFrame({"A": [1, 2], "B": [3, 4]})
        li = [df.B, df]

        msg = "to_append should be a Series or list/tuple of Series, got DataFrame"
        with pytest.raises(TypeError, match=msg):
            df.A.append(df)
Member

also need to add test to check items in a list/tuple

Contributor Author

@hvardhan20 Jan 15, 2020

Is this an acceptable test case for testing items in sequence?

Suggested change
df.A.append(df)
df = pd.DataFrame({"A": [1, 2], "B": [3, 4]})
li = [df.B, df]
msg = "to_append should be a Series or list/tuple of Series, got list of DataFrame"
with pytest.raises(TypeError, match=msg):
df.A.append(li)
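
Since `pytest.raises(..., match=...)` applies the pattern with `re.search`, the expected string has to occur inside the message that is actually raised. The sketch below uses only the standard library to check the pattern proposed here against two candidate messages; which of them the code raises depends on whether the container or the item is named, so both are shown as assumptions.

```python
import re

# Candidate messages, depending on which object the check names.
msg_item = "to_append should be a Series or list/tuple of Series, got DataFrame"
msg_container = "to_append should be a Series or list/tuple of Series, got list"

# The pattern proposed in the comment above.
pattern = (
    "to_append should be a Series or list/tuple of Series, got list of DataFrame"
)

print(bool(re.search(pattern, msg_item)))       # False
print(bool(re.search(pattern, msg_container)))  # False

# A pattern that is a substring of the message does match.
print(bool(re.search("got list", msg_container)))  # True
```

Neither candidate contains "got list of DataFrame", so a test written with that pattern would fail against both.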

        msg = "to_append should be a Series or list/tuple of Series, got list"
        with pytest.raises(TypeError, match=msg):
            df.A.append(li)
Member

could use just the minimum to invoke TypeError

Suggested change
df.A.append(li)
df.A.append([df])



class TestSeriesAppendWithDatetimeIndex:
    def test_append(self):