style apply of rows fails for subset list, but works for single row? #13222

Closed
paulperry opened this issue May 18, 2016 · 11 comments
Labels
Docs IO HTML read_html, to_html, Styler.apply, Styler.applymap Visualization plotting

@paulperry

paulperry commented May 18, 2016

Code Sample, a copy-pastable example if possible

import pandas as pd

df = pd.DataFrame([["A", 1],["B", 2]], columns=["Letter", "Number"])
def highlight(s):
    return ['background-color: yellow']
df.style.apply(highlight, axis=0, subset=[0])

---------------------------------------------------------------------------
KeyError                                  Traceback (most recent call last)
/Users/paulperry/anaconda/lib/python3.5/site-packages/IPython/core/formatters.py in __call__(self, obj)
    341             method = _safe_get_formatter_method(obj, self.print_method)
    342             if method is not None:
--> 343                 return method()
    344             return None
    345         else:

/Users/paulperry/anaconda/lib/python3.5/site-packages/pandas/core/style.py in _repr_html_(self)
    163     def _repr_html_(self):
    164         """Hooks into Jupyter notebook rich display system."""
--> 165         return self.render()
    166 
    167     def _translate(self):

/Users/paulperry/anaconda/lib/python3.5/site-packages/pandas/core/style.py in render(self)
    356         the rendered HTML in the notebook.
    357         """
--> 358         self._compute()
    359         d = self._translate()
    360         # filter out empty styles, every cell will have a class

/Users/paulperry/anaconda/lib/python3.5/site-packages/pandas/core/style.py in _compute(self)
    422         r = self
    423         for func, args, kwargs in self._todo:
--> 424             r = func(self)(*args, **kwargs)
    425         return r
    426 

/Users/paulperry/anaconda/lib/python3.5/site-packages/pandas/core/style.py in _apply(self, func, axis, subset, **kwargs)
    429         subset = _non_reducing_slice(subset)
    430         if axis is not None:
--> 431             result = self.data.loc[subset].apply(func, axis=axis, **kwargs)
    432         else:
    433             # like tee

/Users/paulperry/anaconda/lib/python3.5/site-packages/pandas/core/indexing.py in __getitem__(self, key)
   1282     def __getitem__(self, key):
   1283         if type(key) is tuple:
-> 1284             return self._getitem_tuple(key)
   1285         else:
   1286             return self._getitem_axis(key, axis=0)

/Users/paulperry/anaconda/lib/python3.5/site-packages/pandas/core/indexing.py in _getitem_tuple(self, tup)
    783 
    784         # no multi-index, so validate all of the indexers
--> 785         self._has_valid_tuple(tup)
    786 
    787         # ugly hack for GH #836

/Users/paulperry/anaconda/lib/python3.5/site-packages/pandas/core/indexing.py in _has_valid_tuple(self, key)
    136             if i >= self.obj.ndim:
    137                 raise IndexingError('Too many indexers')
--> 138             if not self._has_valid_type(k, i):
    139                 raise ValueError("Location based indexing can only have [%s] "
    140                                  "types" % self._valid_types)

/Users/paulperry/anaconda/lib/python3.5/site-packages/pandas/core/indexing.py in _has_valid_type(self, key, axis)
   1365 
   1366                 raise KeyError("None of [%s] are in the [%s]" %
-> 1367                                (key, self.obj._get_axis_name(axis)))
   1368 
   1369             return True

KeyError: 'None of [[0]] are in the [columns]'

Out[16]:
<pandas.core.style.Styler at 0x11ae32cf8>

The subset appears to be applied to the columns rather than the rows, yet passing a single value as the subset renders the table with the correct row highlighted.

df.style.apply(highlight, axis=0, subset=0)

Letter  Number
0   A   1    <- in html this row is highlighted yellow
1   B   2

Passing a bad row index (e.g. 3) also correctly fails with 'None of [[3]] are in the [index]'

df.style.apply(highlight, axis=1, subset=['Letter']) works, but
df.style.apply(highlight, axis=1, subset=['Letter','Number']) fails.

Expected Output

df.style.apply(highlight, axis=0, subset=[0])

Letter  Number
0   A   1    <- in html this row should be highlighted yellow
1   B   2

output of pd.show_versions()

INSTALLED VERSIONS

commit: None
python: 3.5.1.final.0
python-bits: 64
OS: Darwin
OS-release: 15.4.0
machine: x86_64
processor: i386
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8

pandas: 0.18.0
nose: 1.3.7
pip: 8.1.2
setuptools: 20.3
Cython: 0.23.4
numpy: 1.10.4
scipy: 0.17.0
statsmodels: 0.6.1
xarray: None
IPython: 4.1.2
sphinx: 1.3.5
patsy: 0.4.0
dateutil: 2.5.1
pytz: 2016.2
blosc: None
bottleneck: 1.0.0
tables: 3.2.2
numexpr: 2.5
matplotlib: 1.5.1
openpyxl: 2.3.2
xlrd: 0.9.4
xlwt: 1.0.0
xlsxwriter: 0.8.4
lxml: 3.6.0
bs4: 4.4.1
html5lib: None
httplib2: None
apiclient: None
sqlalchemy: 1.0.12
pymysql: None
psycopg2: None
jinja2: 2.8
boto: 2.39.0

@jreback jreback added the Visualization plotting label May 18, 2016
@jreback
Contributor

jreback commented May 18, 2016

@TomAugspurger

@TomAugspurger
Contributor

I think this is functioning as designed.

The docs lay it out here. The second item "A list (or series or numpy array)" should say "A list (or series or numpy array) is treated as a slice on the columns".

To just style the first row you can use

  • df.style.apply(highlight, subset=(0,)) # note the comma to make it a tuple
  • df.style.apply(highlight, subset=pd.IndexSlice[0, :])

That section in the docs should be updated though.
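To make the list-versus-slice distinction concrete, here is a minimal sketch that uses plain .loc instead of Styler itself. It assumes, going by the traceback in the original report, that Styler hands subset to self.data.loc[subset] after treating a bare list as column labels; the variable names are illustrative only.

```python
import pandas as pd

df = pd.DataFrame([["A", 1], ["B", 2]], columns=["Letter", "Number"])

# A bare list is read as column labels, i.e. roughly df.loc[:, [0]].
# There is no column labelled 0, so the lookup raises KeyError.
try:
    df.loc[:, [0]]
    list_is_column_lookup = False
except KeyError:
    list_is_column_lookup = True

# pd.IndexSlice spells out both axes: row label 0, all columns.
first_row = df.loc[pd.IndexSlice[[0], :]]

print(list_is_column_lookup)
print(first_row)
```

The same pd.IndexSlice object can be passed as subset, which is why the second suggestion above selects a row rather than a column.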

@TomAugspurger TomAugspurger added Docs IO HTML read_html, to_html, Styler.apply, Styler.applymap labels May 18, 2016
@paulperry
Author

I'm trying to highlight a subset of rows (not just the first one).

df.style.apply(highlight, axis=0, subset=pd.IndexSlice[[0,1], :])
df.style.apply(highlight, subset=pd.IndexSlice[[0,1], :])

Fail with

/Users/paulperry/anaconda/lib/python3.5/site-packages/pandas/core/style.py in _update_ctx(self, attrs)
    376         matter.
    377         """
--> 378         for row_label, v in attrs.iterrows():
    379             for col_label, col in v.iteritems():
    380                 i = self.index.get_indexer([row_label])[0]

/Users/paulperry/anaconda/lib/python3.5/site-packages/pandas/core/generic.py in __getattr__(self, name)
   2667             if name in self._info_axis:
   2668                 return self[name]
-> 2669             return object.__getattribute__(self, name)
   2670 
   2671     def __setattr__(self, name, value):

AttributeError: 'Series' object has no attribute 'iterrows'

How can I do this?

@TomAugspurger
Contributor

OK, that one looks like it might be a bug. I'll take a closer look tonight.


@paulperry
Author

While you are at it, you might reconsider the interpretation that subset lists are only ever treated as slices of columns ("the second item"). It seems to me that the axis parameter already identifies whether the subset refers to the index or the columns. It looks odd that axis=0 can only act on one row, and that anything else requires a slice object. This renders the axis parameter useless.

@TomAugspurger
Contributor

TomAugspurger commented May 18, 2016

OK, there are a few things going on...

  1. To accomplish this specific task you'd be better off using df.style.set_properties(subset=pd.IndexSlice[[0, 1], :], **{'background-color': 'yellow'}). set_properties is a nice shortcut for functions that don't depend on the data that's passed in. If you notice in the definition for your highlight, you don't actually use s in the body, so you're safe to use set_properties.
  2. When you use Styler.apply(func), the output shape of func has to be the same shape as the input. That's what's causing the error in your last message. To modify your highlight you'd want something like

     def highlight(s):
         return ['background-color: yellow' for _ in s]

     The df.style.apply(highlight, subset=(0,)) happened to work since the output shape was the same (one row input, one list output).
     This need for the output shape to match the input shape is not at all clear in the documentation.
  3. We do need axis and subset. axis determines whether a row or column is passed to func. subset determines which subset of the original dataframe those rows or columns are drawn from.
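As a rough illustration of points 2 and 3, the sketch below reproduces the selection and apply step with plain pandas rather than Styler. It assumes the internal step is the `self.data.loc[subset].apply(func, axis=axis)` call visible in the traceback from the original report; `selected` and `styles` are names made up for this example.

```python
import pandas as pd

df = pd.DataFrame([["A", 1], ["B", 2]], columns=["Letter", "Number"])


def highlight(s):
    # One CSS string per element of the incoming Series,
    # so the output has the same length as the input.
    return ["background-color: yellow" for _ in s]


# subset picks which part of the frame is styled: rows 0 and 1, all columns.
subset = pd.IndexSlice[[0, 1], :]
selected = df.loc[subset]

# axis=0 passes each column of the selection to highlight, one at a time.
styles = selected.apply(highlight, axis=0)

# The styles have the same shape as the selected data.
print(styles.shape == selected.shape)
```

With the original one-element highlight, each column of two values would produce a list of length one, so the shapes would no longer line up. That mismatch is what point 2 describes.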

I'll submit a PR tonight with some better documentation. Sorry it wasn't clearer to begin with, and thanks for the feedback.

I'll think about how we can fail better here when the output shape doesn't match; the current error message isn't great.

@paulperry
Author

Tom, that works, thank you. The doc had me believe the apply func worked the same on rows as columns.

@paulperry
Author

A little more on what I ran into: when the column names are integers I don't have a way of selecting them for styling. This may be unrelated to styling, but I can't solve this without renaming the columns as strings. I seem to do it positionally in the first example, but not the last one. Do you have a better suggestion?

These work:

dff = pd.DataFrame([[1,2],[3,4]])
display(dff.style.set_properties(subset=[1], **{'background-color': 'pink'}))

good_df = pd.DataFrame([[1,2],[3,4]], columns=['5','6'], index=[0,1]) 
good_df.style.set_properties(subset=['5'], **{'background-color': 'pink'})

These fail:

bad_df = pd.DataFrame([[1,2],[3,4]], columns=[5,6], index=[0,1]) 
bad_df.style.set_properties(subset=[5], **{'background-color': 'pink'})
bad_df.style.set_properties(subset=pd.IndexSlice[:,[5]], **{'background-color': 'pink'})

positional_list = [bad_df.columns.get_loc(c) for c in [bad_df.columns[1]]]
bad_df.style.set_properties(subset=positional_list, **{'background-color': 'pink'})

Even though indexing by number works:

bad_df.loc[:,5]
bad_df.loc[pd.IndexSlice[:,[5]]]

@TomAugspurger
Contributor

@paulperry your first two bad_df examples work for me

[screenshot of the rendered output, 2016-05-19]

at least I assume that's your expected output. The last one (correctly) raises a KeyError. Only label-based indexing is allowed for subset (this one is documented, but maybe not emphasized enough).
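The label-versus-position distinction can be sketched with plain .loc on the bad_df from the previous comment. This is not Styler itself; it assumes subset is resolved through .loc, as the traceback at the top of the thread suggests, and the variable names are illustrative.

```python
import pandas as pd

bad_df = pd.DataFrame([[1, 2], [3, 4]], columns=[5, 6], index=[0, 1])

# Label-based: 5 is a column label, so this selects that column.
by_label = bad_df.loc[:, [5]]

# get_loc returns a position (1 for the label 6), not a label.
position = bad_df.columns.get_loc(6)

# Using that position as if it were a label fails: no column is labelled 1.
try:
    bad_df.loc[:, [position]]
    position_accepted = True
except KeyError:
    position_accepted = False

print(list(by_label.columns), position, position_accepted)
```

So when the column names are integers, subset should contain the integer labels themselves (5 or 6 here), not the results of get_loc.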

@paulperry
Author

@TomAugspurger yes! That's the right output. I get the same good output as you in the environment I posted at the top of the thread, but not in this one, which has the older pandas 0.17.1:

INSTALLED VERSIONS
------------------
commit: None
python: 3.5.1.final.0
python-bits: 64
OS: Linux
OS-release: 4.4.8-boot2docker
machine: x86_64
processor: 
byteorder: little
LC_ALL: en_US.UTF-8
LANG: en_US.UTF-8

pandas: 0.17.1
nose: None
pip: 8.1.1
setuptools: 20.3
Cython: None
numpy: 1.10.4
scipy: 0.17.0
statsmodels: None
IPython: 4.1.2
sphinx: None
patsy: None
dateutil: 2.5.0
pytz: 2016.1
blosc: None
bottleneck: None
tables: None
numexpr: None
matplotlib: 1.5.1
openpyxl: None
xlrd: None
xlwt: None
xlsxwriter: None
lxml: None
bs4: None
html5lib: None
httplib2: None
apiclient: None
sqlalchemy: None
pymysql: None
psycopg2: None
Jinja2: None

The error:

bad_df = pd.DataFrame([[1,2],[3,4]], columns=[5,6], index=[0,1]) 
bad_df.style.set_properties(subset=[5], **{'background-color': 'pink'})

---------------------------------------------------------------------------
KeyError                                  Traceback (most recent call last)
/opt/conda/lib/python3.5/site-packages/IPython/core/formatters.py in __call__(self, obj)
    341             method = _safe_get_formatter_method(obj, self.print_method)
    342             if method is not None:
--> 343                 return method()
    344             return None
    345         else:

/opt/conda/lib/python3.5/site-packages/pandas/core/style.py in _repr_html_(self)
    158         Hooks into Jupyter notebook rich display system.
    159         '''
--> 160         return self.render()
    161 
    162     def _translate(self):

/opt/conda/lib/python3.5/site-packages/pandas/core/style.py in render(self)
    259         """
    260         self._compute()
--> 261         d = self._translate()
    262         # filter out empty styles, every cell will have a class
    263         # but the list of props may just be [['', '']].

/opt/conda/lib/python3.5/site-packages/pandas/core/style.py in _translate(self)
    220                 cs = [DATA_CLASS, "row%s" % r, "col%s" % c]
    221                 cs.extend(cell_context.get("data", {}).get(r, {}).get(c, []))
--> 222                 row_es.append({"type": "td", "value": self.data.iloc[r][c],
    223                                "class": " ".join(cs), "id": "_".join(cs[1:])})
    224                 props = []

/opt/conda/lib/python3.5/site-packages/pandas/core/series.py in __getitem__(self, key)
    555     def __getitem__(self, key):
    556         try:
--> 557             result = self.index.get_value(self, key)
    558 
    559             if not np.isscalar(result):

/opt/conda/lib/python3.5/site-packages/pandas/core/index.py in get_value(self, series, key)
   1788 
   1789         try:
-> 1790             return self._engine.get_value(s, k)
   1791         except KeyError as e1:
   1792             if len(self) > 0 and self.inferred_type in ['integer','boolean']:

pandas/index.pyx in pandas.index.IndexEngine.get_value (pandas/index.c:3204)()

pandas/index.pyx in pandas.index.IndexEngine.get_value (pandas/index.c:2903)()

pandas/index.pyx in pandas.index.IndexEngine.get_loc (pandas/index.c:3843)()

pandas/hashtable.pyx in pandas.hashtable.Int64HashTable.get_item (pandas/hashtable.c:6525)()

pandas/hashtable.pyx in pandas.hashtable.Int64HashTable.get_item (pandas/hashtable.c:6463)()

KeyError: 0

<pandas.core.style.Styler at 0x7f63edb63780>

Thank you!

@paulperry
Author

I forgot to mention that the Linux version above is the docker image of the dev environment for the declarativewidgets project and easily obtainable here: https://github.com/jupyter-incubator/declarativewidgets#develop
