BUG: AssertionError on Series.append(DataFrame) fix #30975 #31036

Merged
merged 19 commits into from
Jan 24, 2020
Changes from 4 commits
7a3b6fe
GH 30975-Fix for Series.append(DataFrame)
hvardhan20 Jan 15, 2020
8ea217f
GH 30975-Fix for Series.append(DataFrame)
hvardhan20 Jan 15, 2020
d522438
GH 30975-Fix for Series.append(DataFrame) PEP-8 compliant
hvardhan20 Jan 15, 2020
b717007
GH 30975-Fix for Series.append(DataFrame) Adding TypeError Testcase
hvardhan20 Jan 15, 2020
b584b35
GH 30975-Fix for Series.append(DataFrame) Modified
hvardhan20 Jan 16, 2020
1e5e8e7
GH 30975-Fix for Series.append(DataFrame) Doc Build Fix
hvardhan20 Jan 16, 2020
53f1c11
GH 31087 Updates ggpy reference to plotnine
hvardhan20 Jan 17, 2020
4e1f6b1
Revert "GH 31087 Updates ggpy reference to plotnine"
hvardhan20 Jan 17, 2020
041ed12
Revert "GH 31087 Updates ggpy reference to plotnine"
hvardhan20 Jan 17, 2020
721a560
GH 30975-Fix for Series.append(DataFrame) Simple message change
hvardhan20 Jan 17, 2020
f312bae
GH 30975-Fix for Series.append(DataFrame) Simple message test change
hvardhan20 Jan 17, 2020
a1ebee2
GH 30975-Fix for Series.append(DataFrame) Removed self._ensure_type
hvardhan20 Jan 19, 2020
32d2989
GH 30975-Fix for Series.append(DataFrame) Removed self._ensure_type r…
hvardhan20 Jan 19, 2020
87ea2aa
GH 30975-Fix for Series.append(DataFrame) 0.25.3 behaviour
hvardhan20 Jan 20, 2020
19dcea3
GH 30975-Fix for Series.append(DataFrame) 0.25.3 behaviour return typ…
hvardhan20 Jan 20, 2020
cdf92ae
GH 30975-Fix for Series.append(DataFrame) 0.25.3 behaviour return typ…
hvardhan20 Jan 20, 2020
a747921
GH 30975-Fix for Series.append(DataFrame) 0.25.3 behaviour return typ…
hvardhan20 Jan 20, 2020
a256fe8
GH 30975-Fix for Series.append(DataFrame) 0.25.3 behaviour return typ…
hvardhan20 Jan 21, 2020
14e0fb4
GH 30975-Fix for Series.append(DataFrame) 0.25.3 behaviour return typ…
hvardhan20 Jan 21, 2020
2 changes: 1 addition & 1 deletion doc/source/whatsnew/v1.0.0.rst
@@ -1170,7 +1170,7 @@ Other
 - Bug where :meth:`DataFrame.itertuples` would incorrectly determine whether or not namedtuples could be used for dataframes of 255 columns (:issue:`28282`)
 - Handle nested NumPy ``object`` arrays in :func:`testing.assert_series_equal` for ExtensionArray implementations (:issue:`30841`)
 - Bug in :class:`Index` constructor incorrectly allowing 2-dimensional input arrays (:issue:`13601`, :issue:`27125`)
-
+- :meth:`Series.append` will now raise a ``TypeError`` when passed a DataFrame (:issue:`30975`)
 .. ---------------------------------------------------------------------------
 
 .. _whatsnew_100.contributors:
4 changes: 4 additions & 0 deletions pandas/core/series.py
@@ -2538,6 +2538,10 @@ def append(self, to_append, ignore_index=False, verify_integrity=False) -> "Seri
             to_concat = [self]
             to_concat.extend(to_append)
         else:
+            if not isinstance(to_append, type(self)):
Member

I think we should allow a Series to be appended to a subclassed Series.

Suggested change
- if not isinstance(to_append, type(self)):
+ if not isinstance(to_append, Series):

We also need to move the check into the condition above, so that the items of a sequence are checked too.
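
The difference between the two isinstance checks can be sketched as follows. SubclassedSeries is a hypothetical subclass introduced only for this illustration; it is not part of pandas or of this PR.

```python
import pandas as pd


class SubclassedSeries(pd.Series):
    """Hypothetical Series subclass, used only for this illustration."""


sub = SubclassedSeries([1, 2])
plain = pd.Series([3, 4])

# Checking against type(self) rejects a plain Series when self is a subclass,
# because a plain Series is not an instance of SubclassedSeries.
strict_ok = isinstance(plain, type(sub))

# Checking against the base class accepts both plain and subclassed Series.
relaxed_ok = isinstance(plain, pd.Series) and isinstance(sub, pd.Series)
```

With the type(self) check, appending plain to sub would be refused even though both are Series objects; the base-class check avoids that.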

Member

Or, instead of checking inside the else, loop over to_concat[1:] after the else.

Contributor Author
hvardhan20 Jan 15, 2020

Thanks for the hint, @simonjayhawkins.
Could you please suggest which of the following two approaches is better?
1)

Suggested change
- if not isinstance(to_append, type(self)):
+ for x in to_concat[1:]:
+     if not isinstance(x, Series):
+         msg = (
+             f"to_append should be a Series or list/tuple of Series, "
+             f"got {type(to_append).__name__}"
+             f'{(" of " + type(x).__name__) if isinstance(to_append, (list, tuple)) else ""}'
+         )
+         raise TypeError(msg)

2)

Suggested change
- if not isinstance(to_append, type(self)):
+ if any(not isinstance(x, Series) for x in to_concat[1:]):
+     msg = (
+         f"to_append should be a Series or list/tuple of Series, "
+         f"got {type(to_append).__name__}"
+     )
+     raise TypeError(msg)

Both go after the else. The problem with the second approach is that we cannot report the type of the sequence element that triggers the error. If there is a better way to do this, please share it.
Thanks!

Contributor Author

The test case test_concatlike_same_dtypes is failing in the pipeline. It appends various data types and expects the TypeError to be raised by the Concatenator constructor.
With if not isinstance(x, Series):, append raises the TypeError itself for every non-Series type, before the Concatenator is reached.
That is why test_concatlike_same_dtypes is failing.

I think we should consider the following change:

Suggested change
- if not isinstance(to_append, type(self)):
+ for x in to_concat[1:]:
+     if isinstance(x, pd.DataFrame):
+         raise TypeError

What are your thoughts on this?
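
A minimal sketch of how the narrower check interacts with concat, assuming a recent pandas version. reject_dataframes is a hypothetical helper written for this illustration, and its error message is an assumption; neither is the code that was merged.

```python
import pandas as pd


def reject_dataframes(to_concat):
    """Hypothetical helper: raise only when an appended item is a DataFrame."""
    for x in to_concat[1:]:
        if isinstance(x, pd.DataFrame):
            # Message text is an assumption made for this sketch.
            raise TypeError(
                "to_append should be a Series or list/tuple of Series, "
                "got DataFrame"
            )


s = pd.Series([1, 2])

# A non-pandas object passes the DataFrame-only check untouched...
reject_dataframes([s, 1])

# ...and is then rejected by pd.concat itself, which is where the
# existing test expects the TypeError to come from.
concat_error = None
try:
    pd.concat([s, 1])
except TypeError as err:
    concat_error = str(err)

# A DataFrame is stopped by the new check before concat is called.
frame_error = None
try:
    reject_dataframes([s, pd.DataFrame({"A": [1]})])
except TypeError as err:
    frame_error = str(err)
```

Under this sketch only the DataFrame case changes behaviour; every other invalid type still fails inside concat as before.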

Member

IIUC from the issue, the behaviour of appending a DataFrame needs to be revisited, so I agree that checking for a DataFrame may be better than checking for anything that is not a Series.

Member

I prefer the simple loop of step 1 (fails faster) and the simplicity of message 2.
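
One way to read this preference, as a rough sketch: an explicit loop that raises on the first offending item, combined with the shorter message. check_to_append is a hypothetical standalone helper, not the code that ended up in Series.append, and it checks for non-Series items as in the first proposal.

```python
import pandas as pd


def check_to_append(to_append):
    """Hypothetical helper: fail on the first item that is not a Series."""
    if isinstance(to_append, (list, tuple)):
        items = to_append
    else:
        items = [to_append]
    for item in items:
        if not isinstance(item, pd.Series):
            # Short message: names only the container (or object) passed in.
            msg = (
                "to_append should be a Series or list/tuple of Series, "
                f"got {type(to_append).__name__}"
            )
            raise TypeError(msg)


df = pd.DataFrame({"A": [1, 2], "B": [3, 4]})

check_to_append(df["A"])             # a single Series passes
check_to_append([df["A"], df["B"]])  # a list of Series passes

message = None
try:
    check_to_append([df["B"], df])   # stops at the DataFrame in the list
except TypeError as err:
    message = str(err)
```

The loop stops at the first bad item instead of scanning the whole sequence, at the cost of a message that does not name the offending element's type.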

+                msg = f"to_append should be a Series or list/tuple of Series, " \
+                      f"got {type(to_append)}"
+                raise TypeError(msg)
             to_concat = [self, to_append]
         return self._ensure_type(
             concat(
14 changes: 14 additions & 0 deletions pandas/tests/series/methods/test_append.py
@@ -61,6 +61,20 @@ def test_append_tuples(self):

         tm.assert_series_equal(expected, result)
 
+    def test_append_dataframe(self):
+        # GH 30975
+        df = pd.DataFrame({"A": [1, 2], "B": [3, 4]})
+        df2 = pd.DataFrame({"C": [5, 6], "D": [7, 8]})
+
+        expected = pd.Series(pd.concat([df.A, df2.D]))
+        result = df.A.append(df2.D)
+
+        tm.assert_series_equal(expected, result)
+
+        msg = "to_append should be a Series or list/tuple of Series, got " \
+              "<class 'pandas.core.frame.DataFrame'>"
+        with pytest.raises(TypeError, match=msg):
+            df.A.append(df)
Member

We also need to add a test that checks the items in a list/tuple.

Contributor Author
hvardhan20 Jan 15, 2020

Is this an acceptable test case for testing items in sequence?

Suggested change
- df.A.append(df)
+ df = pd.DataFrame({"A": [1, 2], "B": [3, 4]})
+ li = [df.B, df]
+ msg = "to_append should be a Series or list/tuple of Series, got list of DataFrame"
+ with pytest.raises(TypeError, match=msg):
+     df.A.append(li)


 class TestSeriesAppendWithDatetimeIndex:
     def test_append(self):