Skip to content

API: Improper x/y arg given to df.plot #18695

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 4 commits into from
Dec 10, 2017
Merged

API: Improper x/y arg given to df.plot #18695

merged 4 commits into from
Dec 10, 2017

Conversation

masongallo
Copy link
Contributor

@masongallo masongallo commented Dec 8, 2017

I'm not incredibly familiar with your codebase so I did my best to follow conventions. Not sure what to do with whatsnew for this case?

Description:

  • Validation of x or y arg to df.plot to match specifications in documentation

@TomAugspurger
Copy link
Contributor

Thanks, this will be for 0.22 since it's an API change. Some people may have been ignoring the warning and relying on this.

Did you look into how difficult it'd be to properly support multiple columns for y? At the very least, we would need to make sure that label was a a list-like of the same length as y. Not sure what else.

@masongallo
Copy link
Contributor Author

Did you look into how difficult it'd be to properly support multiple columns for y?

I did think about this - the usecase that isn't already covered wasn't clear to me, especially given the complexity it would add to the method/API. The plot method already has subplots and secondary_y for example. What do you think?

@TomAugspurger
Copy link
Contributor

I don't have a strong opinion. I think the equivalent output would be df.set_index("x")[y].plot(), which isn't so bad. It's probably best to just assert that y is a a single column, so your changes here look good.

The release not can go in doc/source/whatsnew/v0.22.0.txt under plotting.

@masongallo
Copy link
Contributor Author

The release not can go in doc/source/whatsnew/v0.22.0.txt under plotting.

Thanks, I see api changes section will update that

@codecov
Copy link

codecov bot commented Dec 8, 2017

Codecov Report

Merging #18695 into master will decrease coverage by 0.01%.
The diff coverage is 100%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master   #18695      +/-   ##
==========================================
- Coverage    91.6%   91.59%   -0.02%     
==========================================
  Files         153      153              
  Lines       51272    51276       +4     
==========================================
- Hits        46967    46964       -3     
- Misses       4305     4312       +7
Flag Coverage Δ
#multiple 89.45% <100%> (ø) ⬆️
#single 40.67% <0%> (-0.12%) ⬇️
Impacted Files Coverage Δ
pandas/plotting/_core.py 82.42% <100%> (+0.05%) ⬆️
pandas/io/gbq.py 25% <0%> (-58.34%) ⬇️
pandas/core/frame.py 97.81% <0%> (-0.1%) ⬇️
pandas/util/testing.py 82.01% <0%> (+0.19%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 288bf6e...e190aa3. Read the comment docs.

@codecov
Copy link

codecov bot commented Dec 8, 2017

Codecov Report

Merging #18695 into master will decrease coverage by 0.01%.
The diff coverage is 100%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master   #18695      +/-   ##
==========================================
- Coverage    91.6%   91.59%   -0.02%     
==========================================
  Files         153      153              
  Lines       51272    51276       +4     
==========================================
- Hits        46967    46964       -3     
- Misses       4305     4312       +7
Flag Coverage Δ
#multiple 89.45% <100%> (ø) ⬆️
#single 40.68% <12.5%> (-0.11%) ⬇️
Impacted Files Coverage Δ
pandas/plotting/_core.py 82.41% <100%> (+0.03%) ⬆️
pandas/io/gbq.py 25% <0%> (-58.34%) ⬇️
pandas/core/frame.py 97.81% <0%> (-0.1%) ⬇️
pandas/io/excel.py 90.13% <0%> (ø) ⬆️
pandas/core/internals.py 94.44% <0%> (ø) ⬆️
pandas/io/parsers.py 95.55% <0%> (ø) ⬆️
pandas/util/testing.py 82.01% <0%> (+0.19%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 288bf6e...aea8383. Read the comment docs.

@@ -2170,6 +2170,15 @@ def test_invalid_kind(self):
with pytest.raises(ValueError):
df.plot(kind='aasdf')

def test_invalid_xy_args(self):
df = DataFrame({"A": [1, 2], 'B': [3, 4], 'C': [5, 6]})
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can yu add the issue number here

@@ -2170,6 +2170,15 @@ def test_invalid_kind(self):
with pytest.raises(ValueError):
df.plot(kind='aasdf')

def test_invalid_xy_args(self):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you parameterize this on x, y

df = DataFrame({"A": [1, 2], 'B': [3, 4], 'C': [5, 6]})
bad_arg = ['B', 'C']
valid_arg = 'A'
with pytest.raises(ValueError):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you also add a dataframe with duplicate columns as a test, e.g.

In [6]: df = DataFrame([[1, 3, 5], [2, 4, 6]], columns=list('AAB'))

In [7]: df
Out[7]: 
   A  A  B
0  1  3  5
1  2  4  6

In [8]: df['A']
Out[8]: 
   A  A
0  1  3
1  2  4

@@ -188,6 +188,7 @@ Other API Changes
- :func:`pandas.DataFrame.merge` no longer casts a ``float`` column to ``object`` when merging on ``int`` and ``float`` columns (:issue:`16572`)
- The default NA value for :class:`UInt64Index` has changed from 0 to ``NaN``, which impacts methods that mask with NA, such as ``UInt64Index.where()`` (:issue:`18398`)
- Refactored ``setup.py`` to use ``find_packages`` instead of explicitly listing out all subpackages (:issue:`18535`)
- :func: `DataFrame.plot` now raises a ``ValueError`` when the ``x`` or ``y`` argument is improperly formed (:issue:`18671`)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you can move this to plotting bugs.

@@ -1706,11 +1706,15 @@ def _plot(data, x=None, y=None, subplots=False,
if x is not None:
if is_integer(x) and not data.columns.holds_integer():
x = data.columns[x]
elif not isinstance(data[x], Series):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

from pandas.core.dtypes.generic import ABCSeries, ABCDataFrame

not isinstance(data[x], ABCSeries)

and change the DataFrame tests to use ABCDataFrame (and remove the inline import)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

and change the DataFrame tests to use ABCDataFrame (and remove the inline import)

Not sure what you mean here? I changed the test to use ABCDataFrame but that doesn't seem right?

@jreback jreback added this to the 0.22.0 milestone Dec 10, 2017
@jreback jreback merged commit d7d8f2d into pandas-dev:master Dec 10, 2017
@jreback
Copy link
Contributor

jreback commented Dec 10, 2017

thanks @masongallo nice patch! keep em coming!

data = data.set_index(x)

if y is not None:
if is_integer(y) and not data.columns.holds_integer():
y = data.columns[y]
elif not isinstance(data[y], Series):

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So if I do df.plot.line(x='x', y=['y1', 'y2', 'y3']) this is now a ValueError?

@TomAugspurger
Copy link
Contributor

TomAugspurger commented Mar 20, 2018 via email

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

UserWarning about columns and attribute while plotting in Jupyter
4 participants