
KeyError when generating scatter plot of DataFrame columns #4493


Closed
fonnesbeck opened this issue Aug 7, 2013 · 34 comments

@fonnesbeck

I have two columns of a DataFrame that I am trying to plot using scatter in Matplotlib. Here are the two series:

mmsi
210234000    52.600000
211109000    62.250000
211200350    48.333333
211201390    52.887500
211203370    41.962500
211204500    32.233333
211205780    87.050000
211205790    42.500000
211207740    38.233333
211220290    53.233333
211262460    40.100000
211262630    31.716667
211327410    54.300000
211335760    43.175000
211378120    58.361290
...
636090986    62.812500
636091049    70.175000
636091052    45.600000
636091078    82.725000
636091137    59.075000
636091138    61.083333
636091195    75.030000
636091278    68.241667
636091394    57.700000
636091452    44.275000
636091501    63.480000
636091506    67.578571
636091545    44.500000
636091595    46.475000
636091800    61.850000
Name: pdgt10_early, Length: 235, dtype: float64

mmsi
210234000    True
211109000    True
211200350    True
211201390    True
211203370    True
211204500    True
211205780    True
211205790    True
211207740    True
211220290    True
211262460    True
211262630    True
211327410    True
211335760    True
211378120    True
...
636090986    True
636091049    True
636091052    True
636091078    True
636091137    True
636091138    True
636091195    True
636091278    True
636091394    True
636091452    True
636091501    True
636091506    True
636091545    True
636091595    True
636091800    True
Length: 235, dtype: bool

(Actually the latter is the result of a comparison operation on the column to see if values are greater or less than some threshold value)

Unfortunately, calling scatter on these two series raises a KeyError: 0 exception for some reason.

---------------------------------------------------------------------------
KeyError                                  Traceback (most recent call last)
<ipython-input-52-4b02ddd54f50> in <module>()
      1 for sma in sma_list:
----> 2     plot_logistic('Cargo', sma, nova_traces[np.where(sma_list==sma)[0][0]].traces[1], season_a=-3)

<ipython-input-51-94c541930a55> in plot_logistic(ship, sma, trace, dataset, season_a, season_b, nlines)
     14     print x
     15     print y
---> 16     scatter(x, y)
     17     title('{0} in {1}'.format(ship, sma))
     18 

/Library/Python/2.7/site-packages/matplotlib-1.4.x-py2.7-macosx-10.8-intel.egg/matplotlib/pyplot.pyc in scatter(x, y, s, c, marker, cmap, norm, vmin, vmax, alpha, linewidths, verts, hold, **kwargs)
   3088         ret = ax.scatter(x, y, s=s, c=c, marker=marker, cmap=cmap, norm=norm,
   3089                          vmin=vmin, vmax=vmax, alpha=alpha,
-> 3090                          linewidths=linewidths, verts=verts, **kwargs)
   3091         draw_if_interactive()
   3092     finally:

/Library/Python/2.7/site-packages/matplotlib-1.4.x-py2.7-macosx-10.8-intel.egg/matplotlib/axes/_axes.pyc in scatter(self, x, y, s, c, marker, cmap, norm, vmin, vmax, alpha, linewidths, verts, **kwargs)
   3210             self.cla()
   3211 
-> 3212         self._process_unit_info(xdata=x, ydata=y, kwargs=kwargs)
   3213         x = self.convert_xunits(x)
   3214         y = self.convert_yunits(y)

/Library/Python/2.7/site-packages/matplotlib-1.4.x-py2.7-macosx-10.8-intel.egg/matplotlib/axes/_base.pyc in _process_unit_info(self, xdata, ydata, kwargs)
   1652             # we only need to update if there is nothing set yet.
   1653             if not self.xaxis.have_units():
-> 1654                 self.xaxis.update_units(xdata)
   1655             #print '\tset from xdata', self.xaxis.units
   1656 

/Library/Python/2.7/site-packages/matplotlib-1.4.x-py2.7-macosx-10.8-intel.egg/matplotlib/axis.pyc in update_units(self, data)
   1330         """
   1331 
-> 1332         converter = munits.registry.get_converter(data)
   1333         if converter is None:
   1334             return False

/Library/Python/2.7/site-packages/matplotlib-1.4.x-py2.7-macosx-10.8-intel.egg/matplotlib/units.pyc in get_converter(self, x)
    135 
    136         if isinstance(x, np.ndarray) and x.size:
--> 137             converter = self.get_converter(x.ravel()[0])
    138             return converter
    139 

/Library/Python/2.7/site-packages/pandas-0.12.0_80_gfcaf9a6_20130729-py2.7-macosx-10.8-intel.egg/pandas/core/series.pyc in __getitem__(self, key)
    617     def __getitem__(self, key):
    618         try:
--> 619             return self.index.get_value(self, key)
    620         except InvalidIndexError:
    621             pass

/Library/Python/2.7/site-packages/pandas-0.12.0_80_gfcaf9a6_20130729-py2.7-macosx-10.8-intel.egg/pandas/core/index.pyc in get_value(self, series, key)
    722         """
    723         try:
--> 724             return self._engine.get_value(series, key)
    725         except KeyError as e1:
    726             if len(self) > 0 and self.inferred_type == 'integer':

/Library/Python/2.7/site-packages/pandas-0.12.0_80_gfcaf9a6_20130729-py2.7-macosx-10.8-intel.egg/pandas/index.so in pandas.index.IndexEngine.get_value (pandas/index.c:2646)()

/Library/Python/2.7/site-packages/pandas-0.12.0_80_gfcaf9a6_20130729-py2.7-macosx-10.8-intel.egg/pandas/index.so in pandas.index.IndexEngine.get_value (pandas/index.c:2461)()

/Library/Python/2.7/site-packages/pandas-0.12.0_80_gfcaf9a6_20130729-py2.7-macosx-10.8-intel.egg/pandas/index.so in pandas.index.IndexEngine.get_loc (pandas/index.c:3198)()

/Library/Python/2.7/site-packages/pandas-0.12.0_80_gfcaf9a6_20130729-py2.7-macosx-10.8-intel.egg/pandas/hashtable.so in pandas.hashtable.Int64HashTable.get_item (pandas/hashtable.c:6422)()

/Library/Python/2.7/site-packages/pandas-0.12.0_80_gfcaf9a6_20130729-py2.7-macosx-10.8-intel.egg/pandas/hashtable.so in pandas.hashtable.Int64HashTable.get_item (pandas/hashtable.c:6366)()

KeyError: 0

This worked like a charm prior to updating pandas and matplotlib a week or so ago.

Running Python 2.7.2 on OS X 10.8.4.

@cpcloud
Member

cpcloud commented Aug 7, 2013

@fonnesbeck sorry to hear about all your plotting troubles! i'll check this one out

@ghost ghost assigned cpcloud Aug 7, 2013
@cpcloud
Member

cpcloud commented Aug 7, 2013

if i do

In [49]: mmsi = Series(normal(50, 10, size=1000), name='mmsi')

In [50]: mmsi_thr = mmsi < 40

In [51]: scatter(mmsi, mmsi_thr)

I get the following plot:

[image: thr-plot]

Here are my versions

INSTALLED VERSIONS
------------------
Python: 2.7.5.final.0
OS: Linux 3.10.5-1-ARCH #1 SMP PREEMPT Mon Aug 5 08:04:22 CEST 2013 x86_64
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8

Cython: 0.19.1
Numpy: 1.7.1
Scipy: 0.12.0
statsmodels: 0.4.3
    patsy: 0.1.0
scikits.timeseries: 0.91.3
dateutil: 2.1
pytz: 2013b
bottleneck: 0.6.0
PyTables: 3.0.0
    numexpr: 2.1
matplotlib: 1.3.0
openpyxl: 1.6.2
xlrd: 0.9.2
xlwt: 0.7.5
sqlalchemy: 0.8.1
lxml: 3.2.3
bs4: 4.2.1
html5lib: 0.95-dev

@fonnesbeck
Author

Try going back to my example, and doing the following:

dataset = pd.read_csv("processed_data.csv", index_col=0, parse_dates=[-2,-1])
plt.scatter(dataset.pdgt10,dataset.loa)

@cpcloud
Member

cpcloud commented Aug 7, 2013

whoa i get a totally different error than the one you showed above... you can't plot dataset.loa bc it has object dtype...try this:

converters = {'active': lambda x: bool(int(x))}
df = read_csv('processed_data.csv', index_col=0, parse_dates=[-2, -1], converters=converters)
df = df.convert_objects(convert_numeric=True)
scatter(df.pdgt10, df.loa)

You should get:

[image: pdgt10-scatter]

@cpcloud
Member

cpcloud commented Aug 7, 2013

I wonder why loa is not being converted to float64 in read_csv

@cpcloud
Member

cpcloud commented Aug 7, 2013

@fonnesbeck

in loa there are strings like "0.0/33.0":

to see it do

In [29]: df = read_csv('processed_data.csv', index_col=0, parse_dates=[-2, -1])

In [30]: counts = df.loa.str.count(r'\.')

In [31]: df.loa[counts > 1]
Out[31]:
2014    0.0/33.0
2015    0.0/33.0
2016    0.0/33.0
2017    0.0/33.0
2018    0.0/33.0
2019    0.0/33.0
2020    0.0/33.0
2021    0.0/33.0
2022    0.0/33.0
2023    0.0/33.0
2024    0.0/33.0
2025    0.0/33.0
2026    0.0/33.0
2027    0.0/33.0
2028    0.0/33.0
...
262353    0.0/33.0
262354    0.0/33.0
262355    0.0/33.0
262356    0.0/33.0
262357    0.0/33.0
262358    0.0/33.0
262359    0.0/33.0
262360    0.0/33.0
262361    0.0/33.0
262362    0.0/33.0
262363    0.0/33.0
262364    0.0/33.0
262365    0.0/33.0
262366    0.0/33.0
262367    0.0/33.0
Name: loa, Length: 38303, dtype: object
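
For anyone hitting the same object-dtype problem on a current pandas, here is a minimal sketch of another way to locate the unparseable entries. It assumes `pd.to_numeric` with `errors="coerce"`, which did not exist in the 0.12-era pandas used in this thread, and the values below are made up to stand in for the real `loa` column.

```python
import pandas as pd

# Made-up values standing in for the `loa` column; dtype=object mimics a
# column that mixes numeric text with strings that are not a single number.
loa = pd.Series(["33.0", "0.0/33.0", "276.0/277.0", "41.5"], dtype=object)

# Entries that cannot be parsed as one number become NaN instead of raising.
numeric = pd.to_numeric(loa, errors="coerce")

# Select the original strings that failed to parse.
bad = loa[numeric.isna()]
print(bad)
```

Whether the "a/b" strings should become NaN, be split, or be dropped depends on what they mean in the data, which this thread does not say.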

@cpcloud
Member

cpcloud commented Aug 7, 2013

Also some that can't just evaluate to 0

In [37]: df.loa[counts > 1][~df.loa[counts > 1].str.startswith('0.0')]
Out[37]:
2619    276.0/277.0
2620    276.0/277.0
2621    276.0/277.0
2622    276.0/277.0
2623    276.0/277.0
2624    276.0/277.0
2625    276.0/277.0
2626    276.0/277.0
2627    276.0/277.0
2628    276.0/277.0
2629    276.0/277.0
2630    276.0/277.0
2631    276.0/277.0
2632    276.0/277.0
2633    276.0/277.0
...
262050    213.0/217.0
262128      44.0/78.0
262129      44.0/78.0
262130      44.0/78.0
262131      44.0/78.0
262132      44.0/78.0
262133      44.0/78.0
262134      44.0/78.0
262135      44.0/78.0
262136      44.0/78.0
262137      44.0/78.0
262138      44.0/78.0
262139      44.0/78.0
262140    142.0/148.0
262141    142.0/148.0
Name: loa, Length: 30000, dtype: object

@fonnesbeck
Author

Sorry, that was a bad choice of column. The one in my real model is the result of a merge on 2 tables like this and uses modified pdgt10 columns, without these unconvertible values. I will get a better example up when I get a chance.

@fonnesbeck
Author

Try the test_data.csv dataset, with the following:

sma_data = pd.read_csv('test_data.csv')
scatter(sma_data.pdgt10_early, sma_data.pdgt10_late<sma_data.pdgt10_early)

and see if you do not get the KeyError.

@cpcloud
Member

cpcloud commented Aug 7, 2013

i don't get the error

In [1]: df = read_csv('test_data.csv')

In [2]: df
Out[2]:
<class 'pandas.core.frame.DataFrame'>
Int64Index: 143 entries, 0 to 142
Data columns (total 4 columns):
mmsi            143  non-null values
pdgt10_early    143  non-null values
pdgt10_late     143  non-null values
type            143  non-null values
dtypes: float64(2), int64(1), object(1)

In [3]: df.pdgt10_early
Out[3]:
0     77.500
1     84.467
2     27.037
3     54.200
4     25.248
5     28.750
6     44.362
7     92.600
8      9.300
9     26.525
10    63.400
11    36.750
12    12.050
13    20.867
14    85.150
...
128     2.581
129    19.320
130     6.444
131    21.477
132    36.575
133    46.950
134    14.450
135     0.100
136    91.700
137    10.418
138     0.014
139    11.393
140     7.045
141     1.257
142    51.050
Name: pdgt10_early, Length: 143, dtype: float64

In [4]: scatter(df.pdgt10_early, df.pdgt10_late < df.pdgt10_early)
Out[4]: <matplotlib.collections.PathCollection at 0x5ce0450>

[image: pdgt10-scatter-working]

@fonnesbeck
Author

The issue is probably related to this then:

In [2]: matplotlib.__version__
Out[2]: '1.4.x'

@cpcloud
Member

cpcloud commented Aug 7, 2013

i'm guessing that's dev let me try it out

@fonnesbeck
Author

Yes, current master.

@cpcloud
Member

cpcloud commented Aug 7, 2013

hm nope works with that too...

In [7]: mpl.__version__
Out[7]: '1.4.x'

@fonnesbeck
Author

Okay, back to the drawing board for me then.

@cpcloud
Member

cpcloud commented Aug 7, 2013

can u try with .values on those Series? not sure why it should be different tho....

@fonnesbeck
Author

Will try. I wonder if your recent fix for my other issue was related? Getting on a plane; will update later. Thanks.

@cpcloud
Member

cpcloud commented Aug 7, 2013

don't think so since these series have unique monotonic indices

@fonnesbeck
Author

Can you try importing the data with index_col=0 and see if you can replicate?
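
A small sketch of why `index_col=0` plausibly matters here: with the default index the row labels run from 0 upward, so label 0 exists, while using the first column as the index replaces those labels with the `mmsi` values. The two-row CSV below is invented for illustration and is not the real `test_data.csv`.

```python
import io
import pandas as pd

# Hypothetical CSV content; column names follow the thread, values are made up.
csv_text = (
    "mmsi,pdgt10_early,pdgt10_late\n"
    "210234000,52.6,40.1\n"
    "211109000,62.25,70.0\n"
)

default_index = pd.read_csv(io.StringIO(csv_text))
mmsi_index = pd.read_csv(io.StringIO(csv_text), index_col=0)

has_zero_default = 0 in default_index.index  # default labels 0..n-1 include 0
has_zero_mmsi = 0 in mmsi_index.index        # mmsi labels do not include 0
print(has_zero_default, has_zero_mmsi)
```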

@cpcloud
Member

cpcloud commented Aug 8, 2013

able to plot on 1.3.1 trying 1.4.x

@cpcloud
Member

cpcloud commented Aug 8, 2013

NICE, GOT IT with 1.4.x

@cpcloud
Member

cpcloud commented Aug 8, 2013

the issue is that matplotlib calls x.ravel()[0] on your Series objects, but they don't have a 0 index, thus the KeyError...the solution is to call .values

a little unsatisfying, but i'm not sure what else you can do...
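
The failure mode can be shown without matplotlib at all. This is a minimal sketch with made-up values, assuming a Series whose integer labels do not include 0: `[0]` on the Series is a label lookup and fails, while `[0]` on `.values` is positional and works.

```python
import pandas as pd

# Illustrative data: integer labels that do not include 0 (values are made up).
s = pd.Series([52.6, 62.25, 48.33], index=[210234000, 211109000, 211200350])

try:
    s[0]  # label-based lookup on an integer index: there is no label 0
    lookup_failed = False
except KeyError:
    lookup_failed = True

# .values is a plain NumPy array, so indexing is purely positional.
first_by_position = s.values[0]
print(lookup_failed, first_by_position)
```

Passing `x.values` and `y.values` to `scatter` sidesteps the label lookup for the same reason.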

@cpcloud
Member

cpcloud commented Aug 8, 2013

i'll look at a diff of matplotlib for that function to see if there's anything we or they can do about it to make it compatible

@fonnesbeck thanks for the report!

@fonnesbeck
Author

BTW, I get similar behavior with RPlot (again, this worked a few weeks ago):

>>> cdystonia = pd.read_csv("../data/cdystonia.csv", index_col=None)
>>> cdystonia.head()

   patient  obs  week  site  id  treat  age sex  twstrs
0        1    1     0     1   1  5000U   65   F      32
1        1    2     2     1   1  5000U   65   F      30
2        1    3     4     1   1  5000U   65   F      24
3        1    4     8     1   1  5000U   65   F      37
4        1    5    12     1   1  5000U   65   F      39

>>> plt.figure(figsize=(12,12))
>>> bbp = RPlot(cdystonia, x='age', y='twstrs')
>>> bbp.add(TrellisGrid(['week', 'treat']))
>>> bbp.add(GeomScatter())
>>> bbp.add(GeomPolyFit(degree=2))
>>> _ = bbp.render(gcf())

---------------------------------------------------------------------------
KeyError                                  Traceback (most recent call last)
<ipython-input-30-4cbe319d1ffa> in <module>()
      4 bbp.add(GeomScatter())
      5 bbp.add(GeomPolyFit(degree=2))
----> 6 _ = bbp.render(gcf())

    /Library/Python/2.7/site-packages/pandas-0.12.0_146_g9cba0de_20130808-py2.7-macosx-10.8-intel.egg/pandas/tools/rplot.pyc in render(self, fig)
        878             # Prepare the subplots and draw on them
        879             new_layers = sequence_grids(new_layers)
    --> 880             axes_grids = [work_grid(grid, fig) for grid in new_layers]
        881             axes_grid = axes_grids[-1]
        882             adjust_subplots(fig, axes_grid, last_trellis, new_layers[-1])

    /Library/Python/2.7/site-packages/pandas-0.12.0_146_g9cba0de_20130808-py2.7-macosx-10.8-intel.egg/pandas/tools/rplot.pyc in work_grid(grid, fig)
        718         for col in range(ncols):
        719             axes[row][col] = fig.add_subplot(nrows, ncols, ncols * row + col + 1)
    --> 720             grid[row][col].work(ax=axes[row][col])
        721     return axes
        722 

    /Library/Python/2.7/site-packages/pandas-0.12.0_146_g9cba0de_20130808-py2.7-macosx-10.8-intel.egg/pandas/tools/rplot.pyc in work(self, fig, ax)
        462         x = self.data[self.aes['x']]
        463         y = self.data[self.aes['y']]
    --> 464         ax.scatter(x, y, marker=self.marker, c=self.colour, alpha=self.alpha)
        465         return fig, ax
        466 

    /Library/Python/2.7/site-packages/matplotlib-1.4.x-py2.7-macosx-10.8-intel.egg/matplotlib/axes/_axes.pyc in scatter(self, x, y, s, c, marker, cmap, norm, vmin, vmax, alpha, linewidths, verts, **kwargs)
       3210             self.cla()
       3211 
    -> 3212         self._process_unit_info(xdata=x, ydata=y, kwargs=kwargs)
       3213         x = self.convert_xunits(x)
       3214         y = self.convert_yunits(y)

    /Library/Python/2.7/site-packages/matplotlib-1.4.x-py2.7-macosx-10.8-intel.egg/matplotlib/axes/_base.pyc in _process_unit_info(self, xdata, ydata, kwargs)
       1652             # we only need to update if there is nothing set yet.
       1653             if not self.xaxis.have_units():
    -> 1654                 self.xaxis.update_units(xdata)
       1655             #print '\tset from xdata', self.xaxis.units
       1656 

    /Library/Python/2.7/site-packages/matplotlib-1.4.x-py2.7-macosx-10.8-intel.egg/matplotlib/axis.pyc in update_units(self, data)
       1330         """
       1331 
    -> 1332         converter = munits.registry.get_converter(data)
       1333         if converter is None:
       1334             return False

    /Library/Python/2.7/site-packages/matplotlib-1.4.x-py2.7-macosx-10.8-intel.egg/matplotlib/units.pyc in get_converter(self, x)
        135 
        136         if isinstance(x, np.ndarray) and x.size:
    --> 137             converter = self.get_converter(x.ravel()[0])
        138             return converter
        139 

    /Library/Python/2.7/site-packages/pandas-0.12.0_146_g9cba0de_20130808-py2.7-macosx-10.8-intel.egg/pandas/core/series.pyc in __getitem__(self, key)
        616     def __getitem__(self, key):
        617         try:
    --> 618             return self.index.get_value(self, key)
        619         except InvalidIndexError:
        620             pass

    /Library/Python/2.7/site-packages/pandas-0.12.0_146_g9cba0de_20130808-py2.7-macosx-10.8-intel.egg/pandas/core/index.pyc in get_value(self, series, key)
        722         """
        723         try:
    --> 724             return self._engine.get_value(series, key)
        725         except KeyError as e1:
        726             if len(self) > 0 and self.inferred_type == 'integer':

    /Library/Python/2.7/site-packages/pandas-0.12.0_146_g9cba0de_20130808-py2.7-macosx-10.8-intel.egg/pandas/index.so in pandas.index.IndexEngine.get_value (pandas/index.c:2646)()

    /Library/Python/2.7/site-packages/pandas-0.12.0_146_g9cba0de_20130808-py2.7-macosx-10.8-intel.egg/pandas/index.so in pandas.index.IndexEngine.get_value (pandas/index.c:2461)()

/Library/Python/2.7/site-packages/pandas-0.12.0_146_g9cba0de_20130808-py2.7-macosx-10.8-intel.egg/pandas/index.so in pandas.index.IndexEngine.get_loc (pandas/index.c:3198)()

/Library/Python/2.7/site-packages/pandas-0.12.0_146_g9cba0de_20130808-py2.7-macosx-10.8-intel.egg/pandas/hashtable.so in pandas.hashtable.Int64HashTable.get_item (pandas/hashtable.c:6422)()

/Library/Python/2.7/site-packages/pandas-0.12.0_146_g9cba0de_20130808-py2.7-macosx-10.8-intel.egg/pandas/hashtable.so in pandas.hashtable.Int64HashTable.get_item (pandas/hashtable.c:6366)()

KeyError: 0

@cpcloud
Member

cpcloud commented Aug 17, 2013

there's a recent change to that ravel()[0] business in mpl (which is the issue here), checking it out now

@cpcloud
Member

cpcloud commented Aug 17, 2013

@fonnesbeck can u try with latest matplotlib master? it's working on my end

@cpcloud
Member

cpcloud commented Aug 17, 2013

here's the diff for the offending chunk from latest matplotlib

diff --git a/lib/matplotlib/units.py b/lib/matplotlib/units.py
index 8f08a64..73d363a 100644
--- a/lib/matplotlib/units.py
+++ b/lib/matplotlib/units.py
@@ -134,8 +134,19 @@ class Registry(dict):
             converter = self.get(classx)

         if isinstance(x, np.ndarray) and x.size:
-            converter = self.get_converter(x.ravel()[0])
-            return converter
+            xravel = x.ravel()
+            try:
+                # pass the first value of x that is not masked back to
+                # get_converter
+                if not np.all(xravel.mask):
+                    # some elements are not masked
+                    converter = self.get_converter(
+                        xravel[np.argmin(xravel.mask)])
+                    return converter
+            except AttributeError:
+                # not a masked_array
+                converter = self.get_converter(xravel[0])
+                return converter

         if converter is None and iterable(x):
             for thisx in x:

@fonnesbeck
Author

success!

@cpcloud
Member

cpcloud commented Aug 17, 2013

excellent!

@cpcloud
Member

cpcloud commented Aug 17, 2013

hm i'm not sure this is doing the right thing tho....

in pandas pre series->ndframe refactor, xravel will be a Series object, and if not np.all(xravel.mask) will test False because mask is a method on Series objects, so this new mpl chunk of code will do nothing (it exits the try suite without returning a converter) when i'm not sure it's really supposed to

in the post-refactor series world, the AttributeError will be caught since ravel now returns a raw numpy array (which means that it doesn't use the __array__ interface), and the conversion will succeed because xravel[0] won't raise. previously it could raise on a series any time the index holds integers but 0 isn't among them (for example)
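
To make the control flow of that chunk easier to follow, here is a standalone sketch of the same try/except logic applied to NumPy inputs. The function name `first_unmasked` is hypothetical, and returning `None` for a fully masked array is an assumption made for this sketch; the real matplotlib code instead falls through to its later checks.

```python
import numpy as np

def first_unmasked(x):
    """Return the first element of x that is not masked (None if all are masked)."""
    xravel = x.ravel()
    try:
        if not np.all(xravel.mask):
            # argmin over a boolean mask gives the position of the first False,
            # i.e. the first element that is not masked.
            return xravel[np.argmin(xravel.mask)]
        return None
    except AttributeError:
        # Plain ndarray: there is no .mask attribute, so use the first element.
        return xravel[0]

masked = np.ma.masked_array([1.0, 2.0, 3.0], mask=[True, False, False])
plain = np.array([5.0, 6.0])
all_masked = np.ma.masked_array([1.0, 2.0], mask=[True, True])

print(first_unmasked(masked), first_unmasked(plain), first_unmasked(all_masked))
```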

@cpcloud
Member

cpcloud commented Aug 17, 2013

OTOH maybe good enough is good enough

@cpcloud
Member

cpcloud commented Sep 28, 2013

pushing to 0.14

@cpcloud
Member

cpcloud commented Feb 27, 2014

closing as stale. @fonnesbeck ping us if this comes up again

@cpcloud cpcloud closed this as completed Feb 27, 2014
@joehand

joehand commented Jun 12, 2014

Never mind, figured it out =). Was similar to issue #6127. Just passing .values instead of the Series itself.

@cpcloud, I'm getting what I think is the same error but when trying to create a histogram with matplotlib. Edit: After looking at the above errors a bit more, it's not quite the same. Let me know if I should create a new issue for this.

I have Pandas 0.14, but still running matplotlib 1.3.1. This function was working previously on the same data (not sure what versions, it was a few months ago) but isn't now. I was on Pandas 0.13.x earlier and it wasn't working, tried upgrading and it still isn't working =(. Any ideas?

The histogram works with some subsets of the data but not others (example below). I haven't quite been able to find the differences between what is working and what is not.

Update: This issue has something to do with series which have indices that don't start at 0. I can get it working if I create a new index from 0 -> Length with the same series that wasn't working before.
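
A minimal sketch of that workaround, with invented data: a subset taken by filtering keeps the labels of the rows it came from, so it may not contain label 0, and `reset_index(drop=True)` renumbers it from 0. The `city` column is hypothetical; only `PopDen` appears in the report above.

```python
import pandas as pd

# Made-up frame; the filter below keeps the rows labelled 2 and 3.
df = pd.DataFrame({
    "city": ["a", "a", "b", "b"],
    "PopDen": [868.0, 502.9, 506.4, 976.0],
})
subset = df.loc[df["city"] == "b", "PopDen"]

labels_before = list(subset.index)          # original labels survive filtering
renumbered = subset.reset_index(drop=True)  # fresh labels 0..n-1
labels_after = list(renumbered.index)
print(labels_before, labels_after)
```

Passing `subset.values` to `plt.hist` should have the same effect without changing the Series.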

One of the sub-sets it is not working on:

49663     868.002991
49664     502.933014
49665     506.434998
49666     976.049011
49667    1735.439941
49668    1230.319946
49669     993.213013
49670    1551.069946
49671     732.591980
49672    1018.109985
49673     899.023010
49674     667.893005
49675    1405.270019
49676    1018.869995
49677    1117.680054
...
53686    4426.439941
53687    2557.149902
53688    1884.449951
53689     961.927002
53690    1248.160034
53691    2091.310059
53692    1292.329956
53693     860.046021
53694    1279.329956
53695    4214.609863
53696    3223.159912
53697    3815.270020
53698    2677.679932
53699    1410.280029
53700     807.500000
Name: PopDen, Length: 1570, dtype: float64

And here are the errors:

---------------------------------------------------------------------------
KeyError                                  Traceback (most recent call last)
<ipython-input-11-0d069c72fca2> in <module>()
----> 5 plots.plot_single(atl, 'PopDen')

/Users/joe/Dropbox/SDI_Papers_CODE/statistics_neighborhoods/plot_functions.py in plot_single(self, df, plot_col, bins, skew, fig_title, xlabel, ylabel)
     73             x, mean, sigma = self._calc_params(log_col)
     74 
---> 75             plt.hist(log_col, bins, normed=True,  color=self.colors[0])
     76             plt.plot(x,mlab.normpdf(x,mean,sigma), label='Log-Normal', lw=3, color=self.colors[1])
     77 

/usr/local/lib/python2.7/site-packages/matplotlib/pyplot.pyc in hist(x, bins, range, normed, weights, cumulative, bottom, histtype, align, orientation, rwidth, log, color, label, stacked, hold, **kwargs)
   2825                       histtype=histtype, align=align, orientation=orientation,
   2826                       rwidth=rwidth, log=log, color=color, label=label,
-> 2827                       stacked=stacked, **kwargs)
   2828         draw_if_interactive()
   2829     finally:

/usr/local/lib/python2.7/site-packages/matplotlib/axes.pyc in hist(self, x, bins, range, normed, weights, cumulative, bottom, histtype, align, orientation, rwidth, log, color, label, stacked, **kwargs)
   8247         # Massage 'x' for processing.
   8248         # NOTE: Be sure any changes here is also done below to 'weights'
-> 8249         if isinstance(x, np.ndarray) or not iterable(x[0]):
   8250             # TODO: support masked arrays;
   8251             x = np.asarray(x)

/usr/local/lib/python2.7/site-packages/pandas/core/series.pyc in __getitem__(self, key)
    477     def __getitem__(self, key):
    478         try:
--> 479             result = self.index.get_value(self, key)
    480 
    481             if not np.isscalar(result):

/usr/local/lib/python2.7/site-packages/pandas/core/index.pyc in get_value(self, series, key)
   1169 
   1170         try:
-> 1171             return self._engine.get_value(s, k)
   1172         except KeyError as e1:
   1173             if len(self) > 0 and self.inferred_type == 'integer':

/usr/local/lib/python2.7/site-packages/pandas/index.so in pandas.index.IndexEngine.get_value (pandas/index.c:2987)()

/usr/local/lib/python2.7/site-packages/pandas/index.so in pandas.index.IndexEngine.get_value (pandas/index.c:2802)()

/usr/local/lib/python2.7/site-packages/pandas/index.so in pandas.index.IndexEngine.get_loc (pandas/index.c:3528)()

/usr/local/lib/python2.7/site-packages/pandas/hashtable.so in pandas.hashtable.Int64HashTable.get_item (pandas/hashtable.c:7032)()

/usr/local/lib/python2.7/site-packages/pandas/hashtable.so in pandas.hashtable.Int64HashTable.get_item (pandas/hashtable.c:6973)()

KeyError: 0

diazona added a commit to diazona/pandas that referenced this issue Dec 16, 2015
This commit prevents the KeyError raised when DataFrame.plot() is called
with xerr or yerr being a Series or DataFrame whose index doesn't include 0.
The error comes from matplotlib code which tries to access xerr[0] or yerr[0],
so to solve the problem, we convert xerr and yerr from Pandas objects to
NumPy ndarrays before sending them through to matplotlib. This is
a different instance of the same type of problem in Github issues pandas-dev#4493
and pandas-dev#6127 (and perhaps others).
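
The conversion described in that commit message can be sketched roughly as follows. The helper name `as_plain_array` is hypothetical and this is not the code from the commit; it only illustrates stripping pandas labels so that `err[0]` becomes positional.

```python
import numpy as np
import pandas as pd

def as_plain_array(err):
    """Hypothetical helper: convert pandas objects to ndarrays, pass others through."""
    if isinstance(err, (pd.Series, pd.DataFrame)):
        return np.asarray(err)
    return err

# Made-up error values with labels that do not include 0.
yerr = pd.Series([0.1, 0.2], index=[10, 11])
converted = as_plain_array(yerr)
print(type(converted).__name__, converted[0])
```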