---------------------------------------------------------------------------
UnicodeEncodeError Traceback (most recent call last)
<ipython-input-10-9de5a19c260e> in <module>()
----> 1 df.filter(regex=u'a')
C:\Users\...\AppData\Local\Continuum\32bit\Anaconda\envs\test\lib\site-packages\pandas\core\generic.pyc in filter(self, items, like, regex, axis)
2013 matcher = re.compile(regex)
2014 return self.select(lambda x: matcher.search(str(x)) is not None,
-> 2015 axis=axis_name)
2016 else:
2017 raise TypeError('Must pass either `items`, `like`, or `regex`')
C:\Users\...\AppData\Local\Continuum\32bit\Anaconda\envs\test\lib\site-packages\pandas\core\generic.pyc in select(self, crit, axis)
1545 if len(axis_values) > 0:
1546 new_axis = axis_values[
-> 1547 np.asarray([bool(crit(label)) for label in axis_values])]
1548 else:
1549 new_axis = axis_values
C:\Users\...\AppData\Local\Continuum\32bit\Anaconda\envs\test\lib\site-packages\pandas\core\generic.pyc in <lambda>(x)
2012 elif regex:
2013 matcher = re.compile(regex)
-> 2014 return self.select(lambda x: matcher.search(str(x)) is not None,
2015 axis=axis_name)
2016 else:
UnicodeEncodeError: 'ascii' codec can't encode character u'\xe4' in position 0: ordinal not in range(128)
griai changed the title from "Edit #10506 breaks if the DataFrame contains unicode column names with non-ASCII characters." to "BUG: Edit #10506 breaks if the DataFrame contains unicode column names with non-ASCII characters." on May 6, 2016
Yeah, str(x) will try to encode the label, so it is probably easiest to either catch this error (and pass the label through unchanged if it cannot be encoded), or to stringify only integers (but that leaves out things like float columns and such).
So I think the former is OK. Want to do a PR?
It would need some tests for other column label types as well (e.g. the tests should loop through all of the index types).
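For illustration, the "catch and pass through" idea could look roughly like the sketch below. This is not the actual pandas patch: `label_to_text` and `filter_labels` are hypothetical helper names, and a plain list stands in for the axis labels. On Python 3 `str()` does not raise for these labels, so the `except` branch only matters on Python 2, where the traceback above was produced.

```python
import re


def label_to_text(label):
    """Return a form of an axis label that can be passed to a regex search.

    Hypothetical helper: tries str() first so that integer and float
    labels are still matched as text.
    """
    try:
        return str(label)
    except UnicodeEncodeError:
        # Python 2 only: str() on a unicode label with non-ASCII characters
        # attempts an ASCII encode and fails. The label is already text,
        # so pass it through unchanged.
        return label


def filter_labels(labels, regex):
    """Keep the labels whose text form matches the regex (hypothetical)."""
    matcher = re.compile(regex)
    return [label for label in labels
            if matcher.search(label_to_text(label)) is not None]


# Mixed label types, including a non-ASCII unicode label.
labels = [u"a", u"\xe4", 1, 2.5, u"ba"]
matched = filter_labels(labels, u"a")
```

Mixing label types in the example is deliberate: any test for a real fix would, as suggested above, need to cover non-string labels as well as unicode ones.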
jreback changed the title from "BUG: Edit #10506 breaks if the DataFrame contains unicode column names with non-ASCII characters." to "BUG: .filter with unicode labels when can't encode" on May 6, 2016
I don't have a git environment installed at the moment, so unfortunately I cannot open the pull request.
I would support passing the label through when it cannot be encoded, since that is the easiest and a fairly general fix (although this fallback mechanism might seem a bit opaque).