idxmax() fails for string type #11516

Closed
jakevdp opened this issue Nov 4, 2015 · 2 comments
Labels
API Design, Duplicate Report, Strings

Comments

@jakevdp
Contributor

jakevdp commented Nov 4, 2015

While it's possible to find the max of a Series containing strings, it's not possible to find the idxmax:

>>> import pandas as pd
>>> s = pd.Series(list('ABCDEFGHIJKLMNOPQRSTUVWXYZ'))
>>> s.max()
'Z'

>>> s.idxmax()
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-292-28e0d79e56be> in <module>()
----> 1 s.idxmax()

/Users/jakevdp/anaconda/envs/python3.4/lib/python3.4/site-packages/pandas/core/series.py in idxmax(self, axis, out, skipna)
   1218         numpy.ndarray.argmax
   1219         """
-> 1220         i = nanops.nanargmax(_values_from_object(self), skipna=skipna)
   1221         if i == -1:
   1222             return np.nan

/Users/jakevdp/anaconda/envs/python3.4/lib/python3.4/site-packages/pandas/core/nanops.py in nanargmax(values, axis, skipna)
    492     """
    493     values, mask, dtype, _ = _get_values(values, skipna, fill_value_typ='-inf',
--> 494                                          isfinite=True)
    495     result = values.argmax(axis)
    496     result = _maybe_arg_null_out(result, axis, mask, skipna)

/Users/jakevdp/anaconda/envs/python3.4/lib/python3.4/site-packages/pandas/core/nanops.py in _get_values(values, skipna, fill_value, fill_value_typ, isfinite, copy)
    178     values = _values_from_object(values)
    179     if isfinite:
--> 180         mask = _isfinite(values)
    181     else:
    182         mask = isnull(values)

/Users/jakevdp/anaconda/envs/python3.4/lib/python3.4/site-packages/pandas/core/nanops.py in _isfinite(values)
    221             is_integer_dtype(values) or is_bool_dtype(values)):
    222         return ~np.isfinite(values)
--> 223     return ~np.isfinite(values.astype('float64'))
    224 
    225 

ValueError: could not convert string to float: 'Z'

This surprised me because it works without a problem in numpy:

>>> import numpy as np
>>> arr = np.array(list('ABCDEFGHIJKLMNOPQRSTUVWXYZ'))
>>> arr.argmax()
25

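It seems that the pandas idxmax implementation implicitly assumes numeric types: the traceback shows nanargmax checking for finite values by casting to float64, which fails on strings.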
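
In the meantime, a possible workaround is to take the positional argmax with NumPy and map it back to an index label. This is only a minimal sketch: `idxmax_via_numpy` is a hypothetical helper name, not part of pandas, and it assumes the values are mutually comparable and contain no missing data.

```python
import numpy as np
import pandas as pd


def idxmax_via_numpy(series):
    """Return the index label of the largest value (hypothetical helper).

    Assumes the values are mutually comparable and contain no missing data.
    """
    # Extract the values as a plain object array so NumPy compares them
    # with ordinary Python ordering instead of converting them to float.
    values = series.to_numpy(dtype=object)
    position = values.argmax()  # positional result, like arr.argmax()
    # Translate the position into the corresponding index label.
    return series.index[position]


s = pd.Series(list('ABCDEFGHIJKLMNOPQRSTUVWXYZ'))
label = idxmax_via_numpy(s)
```

With the default integer index the label equals the position, so this agrees with the NumPy result above. With a custom index the helper returns the label rather than the position.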

@jakevdp
Contributor Author

jakevdp commented Nov 4, 2015

Just poked around a bit and found #4279 and #6287 – it seems this is already on the Issues radar. Feel free to close as a duplicate if you feel it's appropriate.

@jreback
Contributor

jreback commented Nov 4, 2015

yeh prob should work

thanks for the report and dupe refs

jreback closed this as completed Nov 4, 2015
jreback added the API Design, Duplicate Report, and Strings labels Nov 4, 2015