KeyError when generating scatter plot of DataFrame columns #4493
Comments
@fonnesbeck sorry to hear about all your plotting troubles! i'll check this one out |
if i do
I get the following plot: Here are my versions
|
Try going back to my example, and doing the following:
|
whoa i get a totally different error than the one you showed above... you can't plot
converters = {'active': lambda x: bool(int(x))}
df = read_csv('processed_data.csv', index_col=0, parse_dates=[-2, -1], converters=converters)
df = df.convert_objects(convert_numeric=True)
scatter(df.pdgt10, df.loa)
You should get: |
I wonder why |
in to see it do
|
Also some that can't just evaluate to 0
|
Sorry, that was a bad choice of column. The one in my real model is the result of a merge on 2 tables like this and use modified pdgt10 columns, without these unconvertable values. I will get a better example up when I get a chance. |
Try the
and see if you do not get the KeyError. |
i don't get the error
|
The issue is probably related to this then:
|
i'm guessing that's dev let me try it out |
Yes, current master. |
hm nope works with that too...
|
Okay, back to the drawing board for me then. |
can u try with |
Will try. I wonder if your recent fix for my other issue was related? Getting on a plane; will update later. Thanks. |
don't think so since these series have unique monotonic indices |
Can you try importing the data with |
able to plot on 1.3.1 trying 1.4.x |
NICE, GOT IT with 1.4.x |
the issue is that matplotlib calls a little unsatisfying, but i'm not sure what else you can do... |
i'll look at a diff of @fonnesbeck thanks for the report! |
BTW, I get similar behavior with
|
recent change to that that |
@fonnesbeck can u try with latest matplotlib master? it's working on my end |
here's the diff for the offending chunk from latest matplotlib
diff --git a/lib/matplotlib/units.py b/lib/matplotlib/units.py
index 8f08a64..73d363a 100644
--- a/lib/matplotlib/units.py
+++ b/lib/matplotlib/units.py
@@ -134,8 +134,19 @@ class Registry(dict):
             converter = self.get(classx)

         if isinstance(x, np.ndarray) and x.size:
-            converter = self.get_converter(x.ravel()[0])
-            return converter
+            xravel = x.ravel()
+            try:
+                # pass the first value of x that is not masked back to
+                # get_converter
+                if not np.all(xravel.mask):
+                    # some elements are not masked
+                    converter = self.get_converter(
+                        xravel[np.argmin(xravel.mask)])
+                    return converter
+            except AttributeError:
+                # not a masked_array
+                converter = self.get_converter(xravel[0])
+                return converter

         if converter is None and iterable(x):
             for thisx in x:
|
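For readers following along, the selection logic in the added lines can be sketched outside matplotlib. This is a simplified illustration, not matplotlib's code: `first_representative` is a hypothetical helper name, and it returns `None` where the real method would fall through to its later checks.

```python
import numpy as np
import numpy.ma as ma


def first_representative(x):
    """Pick the element a converter lookup would inspect (hypothetical helper).

    Mirrors the idea in the diff: use the first unmasked element of a
    masked array, or element 0 of a plain ndarray.
    """
    xravel = x.ravel()
    if xravel.size == 0:
        return None
    try:
        mask = xravel.mask
    except AttributeError:
        # Plain ndarray: no mask attribute, so take the first element.
        return xravel[0]
    if np.all(mask):
        # Every element is masked; nothing usable to inspect.
        return None
    # argmin over a boolean mask gives the position of the first False,
    # i.e. the first element that is not masked.
    return xravel[np.argmin(mask)]


plain = first_representative(np.array([4, 5, 6]))
partly_masked = first_representative(
    ma.array([1.0, 2.0, 3.0], mask=[True, True, False]))
fully_masked = first_representative(ma.array([1.0, 2.0], mask=[True, True]))
```

Note that both branches index the raveled ndarray by position, which is why the lookup only works once the input really is an ndarray rather than a label-indexed object.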
success! |
excellent! |
hm i'm not sure this is doing the right thing tho.... in pandas pre series->ndframe refactor in post refactor series world the |
OTOH maybe good enough is good enough |
pushing to 0.14 |
closing as stale. @fonnesbeck ping us if this comes up again |
Never mind, figured it out =). Was similar to issue #6127. Just passing
@cpcloud, I'm getting what I think is the same error but when trying to create a histogram with matplotlib.
Edit: After looking at the above errors a bit more, it's not quite the same. Let me know if I should create a new issue for this.
I have pandas 0.14, but am still running matplotlib 1.3.1. This function was working previously on the same data (not sure what versions, it was a few months ago) but isn't now. I was on pandas 0.13.x earlier and it wasn't working, tried upgrading and it still isn't working =(. Any ideas?
The histogram works with some subsets of the data but not others (example below). I haven't quite been able to find the difference between what works and what doesn't.
Update: This issue has something to do with series whose indices don't start at 0. I can get it working if I create a new index from 0 -> Length for the same series that wasn't working before.
One of the subsets it is not working on:
And here are the errors:
|
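The observation in the update above (failure when the index doesn't start at 0, success after renumbering) can be reproduced in a minimal sketch. The data here is made up for illustration; only the index shape matters.

```python
import pandas as pd

# Hypothetical data: a Series whose integer index has no label 0,
# as happens after filtering or taking a subset of rows.
s = pd.Series([1.5, 2.5, 3.5], index=[10, 11, 12])

try:
    s[0]  # label-based lookup on an integer index: there is no label 0
    raised = False
except KeyError:
    raised = True

# Positional access avoids the label lookup entirely.
first_by_position = s.iloc[0]

# Renumbering the index 0..n-1 makes label 0 exist again, which matches
# the workaround described in the update.
renumbered = s.reset_index(drop=True)
```

Any library code that does `x[0]` expecting "the first element" will hit the same `KeyError: 0` on `s` but not on `renumbered`.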
This commit prevents the KeyError raised when DataFrame.plot() is called with xerr or yerr being a Series or DataFrame whose index doesn't include 0. The error comes from matplotlib code which tries to access xerr[0] or yerr[0], so to solve the problem, we convert xerr and yerr from pandas objects to NumPy ndarrays before sending them through to matplotlib. This is a different instance of the same type of problem as in GitHub issues pandas-dev#4493 and pandas-dev#6127 (and perhaps others).
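A rough sketch of the conversion the commit message describes, under the assumption that stripping the pandas index is all that is needed. `as_plain_array` is a hypothetical name, and this is not the code from the commit itself.

```python
import numpy as np
import pandas as pd


def as_plain_array(err):
    """Strip pandas labels so positional [0] access works (hypothetical helper).

    Non-pandas inputs (scalars, lists, None) are passed through unchanged.
    """
    if isinstance(err, (pd.Series, pd.DataFrame)):
        return np.asarray(err)
    return err


# Hypothetical error bars whose index does not include 0.
yerr = pd.Series([0.1, 0.2, 0.3], index=[5, 6, 7])
yerr_array = as_plain_array(yerr)

# yerr[0] would raise KeyError: 0, but the ndarray is indexed by position.
first_error = yerr_array[0]
```

The trade-off is that the index alignment pandas would normally provide is dropped, so the values must already be in the same order as the data being plotted.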
I have two columns of a DataFrame that I am trying to plot using scatter in Matplotlib. Here are the two series:
(Actually the latter is the result of a comparison operation on the column to see if values are greater or less than some threshold value)
Unfortunately, calling scatter on these two axes raises a KeyError: 0 exception for some reason.
This worked like a charm prior to updating pandas and matplotlib a week or so ago.
Running Python 2.7.2 on OS X 10.8.4.