Skip to content

BUG: get_level_values() on int level upcasts to Float64Index if needed #17930

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 3 commits into from
Dec 10, 2017
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
3 changes: 2 additions & 1 deletion doc/source/whatsnew/v0.22.0.txt
Original file line number Diff line number Diff line change
Expand Up @@ -189,7 +189,7 @@ Other API Changes
- :func:`pandas.DataFrame.merge` no longer casts a ``float`` column to ``object`` when merging on ``int`` and ``float`` columns (:issue:`16572`)
- The default NA value for :class:`UInt64Index` has changed from 0 to ``NaN``, which impacts methods that mask with NA, such as ``UInt64Index.where()`` (:issue:`18398`)
- Refactored ``setup.py`` to use ``find_packages`` instead of explicitly listing out all subpackages (:issue:`18535`)
- Rearranged the order of keyword arguments in :func:`read_excel()` to align with :func:`read_csv()` (:pr:`16672`)
- Rearranged the order of keyword arguments in :func:`read_excel()` to align with :func:`read_csv()` (:issue:`16672`)
- :func:`pandas.merge` now raises a ``ValueError`` when trying to merge on incompatible data types (:issue:`9780`)

.. _whatsnew_0220.deprecations:
Expand Down Expand Up @@ -265,6 +265,7 @@ Indexing

- Bug in :func:`Series.truncate` which raises ``TypeError`` with a monotonic ``PeriodIndex`` (:issue:`17717`)
- Bug in :func:`DataFrame.groupby` where tuples were interpreted as lists of keys rather than as keys (:issue:`17979`, :issue:`18249`)
- Bug in :func:`MultiIndex.get_level_values` which would return an invalid index on level of ints with missing values (:issue:`17924`)
- Bug in :func:`MultiIndex.remove_unused_levels` which would fill nan values (:issue:`18417`)
- Bug in :func:`MultiIndex.from_tuples`` which would fail to take zipped tuples in python3 (:issue:`18434`)
- Bug in :class:`Index` construction from list of mixed type tuples (:issue:`18505`)
Expand Down
16 changes: 13 additions & 3 deletions pandas/core/indexes/numeric.py
Original file line number Diff line number Diff line change
Expand Up @@ -62,6 +62,14 @@ def _maybe_cast_slice_bound(self, label, side, kind):
# we will try to coerce to integers
return self._maybe_cast_indexer(label)

@Appender(_index_shared_docs['_shallow_copy'])
def _shallow_copy(self, values=None, **kwargs):
if values is not None and not self._can_hold_na:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you add a comment here on why we are doing this (yes its obvious to me, but not generally to future readers).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(done)
ping

# Ensure we are not returning an Int64Index with float data:
return self._shallow_copy_with_infer(values=values, **kwargs)
return (super(NumericIndex, self)._shallow_copy(values=values,
**kwargs))

def _convert_for_op(self, value):
""" Convert value to be insertable to ndarray """

Expand Down Expand Up @@ -395,9 +403,11 @@ def __contains__(self, other):
try:
return len(other) <= 1 and ibase._try_get_item(other) in self
except TypeError:
return False
except:
return False
pass
except TypeError:
pass

return False

@Appender(_index_shared_docs['get_loc'])
def get_loc(self, key, method=None, tolerance=None):
Expand Down
23 changes: 18 additions & 5 deletions pandas/tests/indexes/test_multi.py
Original file line number Diff line number Diff line change
Expand Up @@ -997,8 +997,8 @@ def test_get_level_values(self):
exp = CategoricalIndex([1, 2, 3, 1, 2, 3])
tm.assert_index_equal(index.get_level_values(1), exp)

@pytest.mark.xfail(reason='GH 17924 (returns Int64Index with float data)')
def test_get_level_values_int_with_na(self):
# GH 17924
arrays = [['a', 'b', 'b'], [1, np.nan, 2]]
index = pd.MultiIndex.from_arrays(arrays)
result = index.get_level_values(1)
Expand All @@ -1024,14 +1024,27 @@ def test_get_level_values_na(self):

arrays = [['a', 'b', 'b'], pd.DatetimeIndex([0, 1, pd.NaT])]
index = pd.MultiIndex.from_arrays(arrays)
values = index.get_level_values(1)
result = index.get_level_values(1)
expected = pd.DatetimeIndex([0, 1, pd.NaT])
tm.assert_index_equal(values, expected)
tm.assert_index_equal(result, expected)

arrays = [[], []]
index = pd.MultiIndex.from_arrays(arrays)
values = index.get_level_values(0)
assert values.shape == (0, )
result = index.get_level_values(0)
expected = pd.Index([], dtype=object)
tm.assert_index_equal(result, expected)

def test_get_level_values_all_na(self):
# GH 17924 when level entirely consists of nan
arrays = [[np.nan, np.nan, np.nan], ['a', np.nan, 1]]
index = pd.MultiIndex.from_arrays(arrays)
result = index.get_level_values(0)
expected = pd.Index([np.nan, np.nan, np.nan], dtype=np.float64)
tm.assert_index_equal(result, expected)

result = index.get_level_values(1)
expected = pd.Index(['a', np.nan, 1], dtype=object)
tm.assert_index_equal(result, expected)

def test_reorder_levels(self):
# this blows up
Expand Down