style apply of rows fails for subset list, but works for single row? #13222

paulperry · 2016-05-18T17:42:27Z

Code Sample, a copy-pastable example if possible

import pandas as pd
df = pd.DataFrame([["A", 1],["B", 2]], columns=["Letter", "Number"])
def highlight(s):
    return ['background-color: yellow']
df.style.apply(highlight, axis=0, subset=[0])

---------------------------------------------------------------------------
KeyError                                  Traceback (most recent call last)
/Users/paulperry/anaconda/lib/python3.5/site-packages/IPython/core/formatters.py in __call__(self, obj)
    341             method = _safe_get_formatter_method(obj, self.print_method)
    342             if method is not None:
--> 343                 return method()
    344             return None
    345         else:

/Users/paulperry/anaconda/lib/python3.5/site-packages/pandas/core/style.py in _repr_html_(self)
    163     def _repr_html_(self):
    164         """Hooks into Jupyter notebook rich display system."""
--> 165         return self.render()
    166 
    167     def _translate(self):

/Users/paulperry/anaconda/lib/python3.5/site-packages/pandas/core/style.py in render(self)
    356         the rendered HTML in the notebook.
    357         """
--> 358         self._compute()
    359         d = self._translate()
    360         # filter out empty styles, every cell will have a class

/Users/paulperry/anaconda/lib/python3.5/site-packages/pandas/core/style.py in _compute(self)
    422         r = self
    423         for func, args, kwargs in self._todo:
--> 424             r = func(self)(*args, **kwargs)
    425         return r
    426 

/Users/paulperry/anaconda/lib/python3.5/site-packages/pandas/core/style.py in _apply(self, func, axis, subset, **kwargs)
    429         subset = _non_reducing_slice(subset)
    430         if axis is not None:
--> 431             result = self.data.loc[subset].apply(func, axis=axis, **kwargs)
    432         else:
    433             # like tee

/Users/paulperry/anaconda/lib/python3.5/site-packages/pandas/core/indexing.py in __getitem__(self, key)
   1282     def __getitem__(self, key):
   1283         if type(key) is tuple:
-> 1284             return self._getitem_tuple(key)
   1285         else:
   1286             return self._getitem_axis(key, axis=0)

/Users/paulperry/anaconda/lib/python3.5/site-packages/pandas/core/indexing.py in _getitem_tuple(self, tup)
    783 
    784         # no multi-index, so validate all of the indexers
--> 785         self._has_valid_tuple(tup)
    786 
    787         # ugly hack for GH #836

/Users/paulperry/anaconda/lib/python3.5/site-packages/pandas/core/indexing.py in _has_valid_tuple(self, key)
    136             if i >= self.obj.ndim:
    137                 raise IndexingError('Too many indexers')
--> 138             if not self._has_valid_type(k, i):
    139                 raise ValueError("Location based indexing can only have [%s] "
    140                                  "types" % self._valid_types)

/Users/paulperry/anaconda/lib/python3.5/site-packages/pandas/core/indexing.py in _has_valid_type(self, key, axis)
   1365 
   1366                 raise KeyError("None of [%s] are in the [%s]" %
-> 1367                                (key, self.obj._get_axis_name(axis)))
   1368 
   1369             return True

KeyError: 'None of [[0]] are in the [columns]'

Out[16]:
<pandas.core.style.Styler at 0x11ae32cf8>

The code appears to be applying it to the columns, not the rows, but passing a single value as the subset renders the table with the correct row highlighted.

df.style.apply(highlight, axis=0, subset=0)

Letter  Number
0   A   1    <- in html this row is highlighted yellow
1   B   2

Passing a bad row index (e.g. 3) also correctly fails with 'None of [[3]] are in the [index]'

df.style.apply(highlight, axis=1, subset=['Letter']) works, but
df.style.apply(highlight, axis=1, subset=['Letter','Number']) fails.

Expected Output

df.style.apply(highlight, axis=0, subset=[0])

Letter  Number
0   A   1    <- in html this row should be highlighted yellow
1   B   2

output of `pd.show_versions()`

INSTALLED VERSIONS

commit: None
python: 3.5.1.final.0
python-bits: 64
OS: Darwin
OS-release: 15.4.0
machine: x86_64
processor: i386
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8

pandas: 0.18.0
nose: 1.3.7
pip: 8.1.2
setuptools: 20.3
Cython: 0.23.4
numpy: 1.10.4
scipy: 0.17.0
statsmodels: 0.6.1
xarray: None
IPython: 4.1.2
sphinx: 1.3.5
patsy: 0.4.0
dateutil: 2.5.1
pytz: 2016.2
blosc: None
bottleneck: 1.0.0
tables: 3.2.2
numexpr: 2.5
matplotlib: 1.5.1
openpyxl: 2.3.2
xlrd: 0.9.4
xlwt: 1.0.0
xlsxwriter: 0.8.4
lxml: 3.6.0
bs4: 4.4.1
html5lib: None
httplib2: None
apiclient: None
sqlalchemy: 1.0.12
pymysql: None
psycopg2: None
jinja2: 2.8
boto: 2.39.0


jreback · 2016-05-18T17:56:16Z

@TomAugspurger

TomAugspurger · 2016-05-18T18:12:30Z

I think this is functioning as designed.

The docs lay it out here. The second item "A list (or series or numpy array)" should say "A list (or series or numpy array) is treated as a slice on the columns".

To just style the first row you can use

df.style.apply(highlight, subset=(0,)) # note the comma to make it a tuple
df.style.apply(highlight, subset=pd.IndexSlice[0, :])

That section in the docs should be updated though.
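
As a rough illustration of the list-versus-slice rule, the sketch below skips the Styler and indexes with `.loc` directly. It assumes, based on the `self.data.loc[subset]` line in the traceback above, that `subset` ends up as a `.loc` indexer; the variable names are made up for the example.

```python
import pandas as pd

df = pd.DataFrame([["A", 1], ["B", 2]], columns=["Letter", "Number"])

# A bare list is read as column labels, so this selects the "Letter" column.
by_column = df.loc[:, ["Letter"]]

# 0 is a row label here, not a column label, so the same
# list-style lookup on the columns fails with a KeyError.
try:
    df.loc[:, [0]]
    list_lookup_failed = False
except KeyError:
    list_lookup_failed = True

# pd.IndexSlice spells out both axes: rows first, then columns.
first_row = df.loc[pd.IndexSlice[[0], :]]
both_rows = df.loc[pd.IndexSlice[[0, 1], :]]

print(by_column.shape, first_row.shape, both_rows.shape, list_lookup_failed)
```

The same `pd.IndexSlice[...]` objects are what would be passed as `subset` to select rows.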

paulperry · 2016-05-18T18:44:09Z

I'm trying to highlight a subset of rows (not just the first one).

df.style.apply(highlight, axis=0, subset=pd.IndexSlice[[0,1], :])
df.style.apply(highlight, subset=pd.IndexSlice[[0,1], :])

Fail with

/Users/paulperry/anaconda/lib/python3.5/site-packages/pandas/core/style.py in _update_ctx(self, attrs)
    376         matter.
    377         """
--> 378         for row_label, v in attrs.iterrows():
    379             for col_label, col in v.iteritems():
    380                 i = self.index.get_indexer([row_label])[0]

/Users/paulperry/anaconda/lib/python3.5/site-packages/pandas/core/generic.py in __getattr__(self, name)
   2667             if name in self._info_axis:
   2668                 return self[name]
-> 2669             return object.__getattribute__(self, name)
   2670 
   2671     def __setattr__(self, name, value):

AttributeError: 'Series' object has no attribute 'iterrows'

How can I do this?

TomAugspurger · 2016-05-18T18:55:40Z

OK, that one looks like it might be a bug. I'll take a closer look tonight.

paulperry · 2016-05-18T19:06:59Z

While you are at it, you might reconsider the interpretation that subset lists are only treated as slices of columns ("the second item"). It seems to me that the axis parameter already identifies whether the subset refers to the index or the columns. It looks odd that axis=0 can only act on one row, and that doing anything else requires a slice object. This renders the axis parameter useless.

TomAugspurger · 2016-05-18T21:19:44Z

Ok, there's a few things going on...

1. To accomplish this specific task you'd be better off using df.style.set_properties(subset=pd.IndexSlice[[0, 1], :], **{'background-color': 'yellow'}). set_properties is a nice shortcut for functions that don't depend on the data that's passed in. If you notice in the definition for your highlight, you don't actually use s in the body, so you're safe to use set_properties.
2. When you use Styler.apply(func), the output shape of func has to be the same shape as the input. That's what's causing the error in your last message. To modify your highlight you'd want something like

   def highlight(s):
       return ['background-color: yellow' for _ in s]

   The df.style.apply(highlight, subset=(0,)) happened to work since the output shape was the same (one row input, one list output).
   This need for the output-shape to match the input shape is not at all clear in the documentation.
3. We do need axis and subset. axis determines whether a row or column is passed to func. subset determines which subset of the original dataframe those rows or columns are drawn from.
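
To make the last point concrete, here is a small sketch of the split between the two parameters. It uses plain `DataFrame.loc` and `DataFrame.apply` rather than the Styler, on the assumption (taken from the `self.data.loc[subset].apply(func, axis=axis)` line in the traceback) that this is roughly what happens underneath. `names_passed` and `spy` are hypothetical helpers written for this example.

```python
import pandas as pd

df = pd.DataFrame([["A", 1], ["B", 2]], columns=["Letter", "Number"])

def names_passed(frame, axis):
    # Collect the .name of every Series that apply() hands to the function.
    names = set()

    def spy(s):
        names.add(s.name)
        return s

    frame.apply(spy, axis=axis)
    return names

# subset decides which part of the frame is used: here only the first row.
selected = df.loc[pd.IndexSlice[[0], :]]

# axis decides what the function receives from that selection.
columns_seen = names_passed(selected, axis=0)  # one Series per column
rows_seen = names_passed(selected, axis=1)     # one Series per row

print(columns_seen, rows_seen)
```

With `axis=0` the function sees the columns of the selection (each one row long here); with `axis=1` it sees the single selected row.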

I'll submit a PR tonight with some better documentation. Sorry it wasn't clearer to begin with, and thanks for the feedback.

I'll think about how we can fail better here when the output shape doesn't match; the current error message isn't great.
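
One possible shape check, purely as a sketch: `check_same_shape` is a hypothetical helper, not something pandas provides, and the `.loc[...].apply(...)` call only approximates what the Styler does internally.

```python
import pandas as pd

df = pd.DataFrame([["A", 1], ["B", 2]], columns=["Letter", "Number"])

def check_same_shape(data, result):
    # Hypothetical helper: reject anything that is not a frame shaped like the input.
    if not isinstance(result, pd.DataFrame) or result.shape != data.shape:
        raise ValueError(
            "styling function must return output shaped like its input; "
            f"expected {data.shape}, got {getattr(result, 'shape', None)}"
        )
    return result

def highlight_broken(s):
    # Always one CSS string, regardless of how long s is.
    return ["background-color: yellow"]

def highlight_fixed(s):
    # One CSS string per element of s.
    return ["background-color: yellow" for _ in s]

selected = df.loc[pd.IndexSlice[[0, 1], :]]

# The fixed function passes the check.
css = check_same_shape(selected, selected.apply(highlight_fixed, axis=0))

# The broken function is rejected with a readable message.
try:
    check_same_shape(selected, selected.apply(highlight_broken, axis=0))
    broken_rejected = False
except ValueError:
    broken_rejected = True
```

Failing early like this would point at the styling function itself rather than at a missing attribute deeper in the rendering code.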

paulperry · 2016-05-18T23:32:06Z

Tom, that works, thank you. The docs had me believing that the apply func worked the same on rows as on columns.

paulperry · 2016-05-19T14:30:02Z

A little more on what I ran into: when the column names are integers I don't have a way of selecting them for styling. This may be unrelated to styling, but I can't solve this without renaming the columns as strings. I seem to do it positionally in the first example, but not the last one. Do you have a better suggestion?

These work:

dff = pd.DataFrame([[1,2],[3,4]])
display(dff.style.set_properties(subset=[1], **{'background-color': 'pink'}))

good_df = pd.DataFrame([[1,2],[3,4]], columns=['5','6'], index=[0,1]) 
good_df.style.set_properties(subset=['5'], **{'background-color': 'pink'})

These fail:

bad_df = pd.DataFrame([[1,2],[3,4]], columns=[5,6], index=[0,1]) 
bad_df.style.set_properties(subset=[5], **{'background-color': 'pink'})
bad_df.style.set_properties(subset=pd.IndexSlice[:,[5]], **{'background-color': 'pink'})

positional_list = [bad_df.columns.get_loc(c) for c in [bad_df.columns[1]]]
bad_df.style.set_properties(subset=positional_list, **{'background-color': 'pink'})

Even though indexing by number works:

bad_df.loc[:,5]
bad_df.loc[pd.IndexSlice[:,[5]]]

TomAugspurger · 2016-05-19T14:37:36Z

@paulperry your first two bad_df examples work for me: the column labeled 5 renders with a pink background. At least I assume that's your expected output. The last one (correctly) raises a KeyError. Only label-based indexing is allowed for subset (this one is documented, but maybe not emphasized enough).
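
A short sketch of the label-only rule, using `.loc` directly on the assumption that `subset` is resolved the same way. It also shows one way to turn positions into labels before building a subset; the variable names are illustrative only.

```python
import pandas as pd

bad_df = pd.DataFrame([[1, 2], [3, 4]], columns=[5, 6], index=[0, 1])

# Label-based: 5 is a column label, so this selects the first column.
by_label = bad_df.loc[:, [5]]

# Position 1 is not a column label, so a label-based lookup rejects it.
try:
    bad_df.loc[:, [1]]
    position_rejected = False
except KeyError:
    position_rejected = True

# Turn positions into labels first, then select by those labels.
labels = list(bad_df.columns[[1]])
by_converted = bad_df.loc[:, labels]

print(by_label.columns.tolist(), by_converted.columns.tolist(), position_rejected)
```

The converted `labels` list could then be passed as `subset`, for example `bad_df.style.set_properties(subset=labels, **{'background-color': 'pink'})`.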

paulperry · 2016-05-19T14:54:14Z

@TomAugspurger yes! That's the right output. And I get the same good output you have in the environment I posted at the top of the thread, but not in this one with the older pandas 0.17.1:

INSTALLED VERSIONS
------------------
commit: None
python: 3.5.1.final.0
python-bits: 64
OS: Linux
OS-release: 4.4.8-boot2docker
machine: x86_64
processor: 
byteorder: little
LC_ALL: en_US.UTF-8
LANG: en_US.UTF-8

pandas: 0.17.1
nose: None
pip: 8.1.1
setuptools: 20.3
Cython: None
numpy: 1.10.4
scipy: 0.17.0
statsmodels: None
IPython: 4.1.2
sphinx: None
patsy: None
dateutil: 2.5.0
pytz: 2016.1
blosc: None
bottleneck: None
tables: None
numexpr: None
matplotlib: 1.5.1
openpyxl: None
xlrd: None
xlwt: None
xlsxwriter: None
lxml: None
bs4: None
html5lib: None
httplib2: None
apiclient: None
sqlalchemy: None
pymysql: None
psycopg2: None
Jinja2: None

The error:

bad_df = pd.DataFrame([[1,2],[3,4]], columns=[5,6], index=[0,1]) 
bad_df.style.set_properties(subset=[5], **{'background-color': 'pink'})

---------------------------------------------------------------------------
KeyError                                  Traceback (most recent call last)
/opt/conda/lib/python3.5/site-packages/IPython/core/formatters.py in __call__(self, obj)
    341             method = _safe_get_formatter_method(obj, self.print_method)
    342             if method is not None:
--> 343                 return method()
    344             return None
    345         else:

/opt/conda/lib/python3.5/site-packages/pandas/core/style.py in _repr_html_(self)
    158         Hooks into Jupyter notebook rich display system.
    159         '''
--> 160         return self.render()
    161 
    162     def _translate(self):

/opt/conda/lib/python3.5/site-packages/pandas/core/style.py in render(self)
    259         """
    260         self._compute()
--> 261         d = self._translate()
    262         # filter out empty styles, every cell will have a class
    263         # but the list of props may just be [['', '']].

/opt/conda/lib/python3.5/site-packages/pandas/core/style.py in _translate(self)
    220                 cs = [DATA_CLASS, "row%s" % r, "col%s" % c]
    221                 cs.extend(cell_context.get("data", {}).get(r, {}).get(c, []))
--> 222                 row_es.append({"type": "td", "value": self.data.iloc[r][c],
    223                                "class": " ".join(cs), "id": "_".join(cs[1:])})
    224                 props = []

/opt/conda/lib/python3.5/site-packages/pandas/core/series.py in __getitem__(self, key)
    555     def __getitem__(self, key):
    556         try:
--> 557             result = self.index.get_value(self, key)
    558 
    559             if not np.isscalar(result):

/opt/conda/lib/python3.5/site-packages/pandas/core/index.py in get_value(self, series, key)
   1788 
   1789         try:
-> 1790             return self._engine.get_value(s, k)
   1791         except KeyError as e1:
   1792             if len(self) > 0 and self.inferred_type in ['integer','boolean']:

pandas/index.pyx in pandas.index.IndexEngine.get_value (pandas/index.c:3204)()

pandas/index.pyx in pandas.index.IndexEngine.get_value (pandas/index.c:2903)()

pandas/index.pyx in pandas.index.IndexEngine.get_loc (pandas/index.c:3843)()

pandas/hashtable.pyx in pandas.hashtable.Int64HashTable.get_item (pandas/hashtable.c:6525)()

pandas/hashtable.pyx in pandas.hashtable.Int64HashTable.get_item (pandas/hashtable.c:6463)()

KeyError: 0

<pandas.core.style.Styler at 0x7f63edb63780>

Thank you!

paulperry · 2016-05-19T15:27:33Z

I forgot to mention that the Linux version above is the docker image of the dev environment for the declarativewidgets project and easily obtainable here: https://github.com/jupyter-incubator/declarativewidgets#develop

jreback added the Visualization plotting label May 18, 2016

TomAugspurger added Docs IO HTML read_html, to_html, Styler.apply, Styler.applymap labels May 18, 2016

TomAugspurger mentioned this issue May 19, 2016

DOC/API: Styler documentation changes #13225

Closed

jorisvandenbossche added this to the 0.18.2 milestone May 19, 2016

jorisvandenbossche closed this as completed in 20dd17a Jun 18, 2016
