-
-
Notifications
You must be signed in to change notification settings - Fork 18.4k
BUG: get_level_values() on int level upcasts to Float64Index if needed #17930
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Codecov Report
@@ Coverage Diff @@
## master #17930 +/- ##
==========================================
- Coverage 91.58% 91.57% -0.02%
==========================================
Files 153 153
Lines 51307 51313 +6
==========================================
- Hits 46991 46988 -3
- Misses 4316 4325 +9
Continue to review full report at Codecov.
|
doc/source/whatsnew/v0.21.0.txt
Outdated
@@ -970,6 +970,7 @@ Indexing | |||
- Fixes ``DataFrame.loc`` for setting with alignment and tz-aware ``DatetimeIndex`` (:issue:`16889`) | |||
- Avoids ``IndexError`` when passing an Index or Series to ``.iloc`` with older numpy (:issue:`17193`) | |||
- Allow unicode empty strings as placeholders in multilevel columns in Python 2 (:issue:`17099`) | |||
- Bug in ``MultiIndex.get_level_values()`` which would return an invalid index on level of ints with missing values (:issue:`17924`) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could provide a link via :meth:`MultiIndex.get_level_values`
pandas/core/indexes/multi.py
Outdated
values = unique._shallow_copy(filled) | ||
|
||
if not unique._can_hold_na and filled.dtype == 'float': | ||
values = Index(filled, copy=False, **unique._get_attributes_dict()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
no need for all this just use
_shallow_copy_with_infer
pandas/tests/indexes/test_multi.py
Outdated
expected = np.array(['a', np.nan, 1], dtype=object) | ||
tm.assert_numpy_array_equal(values.values, expected) | ||
result = index.get_level_values(0) | ||
# workaround for #17929 (should be 'float') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
rip out this particular test case and xfail with the correct result
then when fixed it will xpass
pandas/tests/indexes/test_multi.py
Outdated
values = index.get_level_values(0) | ||
assert values.shape == (0, ) | ||
result = index.get_level_values(0) | ||
assert result.shape == (0, ) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
u could assert index equal here as well
10d7efa
to
83a4b0f
Compare
pandas/core/indexes/multi.py
Outdated
@@ -906,7 +906,12 @@ def _get_level_values(self, level): | |||
labels = self.labels[level] | |||
filled = algos.take_1d(unique._values, labels, | |||
fill_value=unique._na_value) | |||
values = unique._shallow_copy(filled) | |||
|
|||
if not unique._can_hold_na and filled.dtype == 'float': |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i would always use the infer method no need to have an if
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This breaks several tests with dates (e.g. in pandas/tests/test_multilevel.py
and pandas/tests/indexes/test_multi.py
), in one case because "ValueError: Inferred frequency None from passed dates does not conform to passed frequency D", in other cases because DateTimeIndex
es with time zone are _re_added the time zone (if I understand correctly).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
you need to see if you can find a general soln. we don't want to have if/then cases all over the place.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The only "general" solution I can think is a property e.g. _na_upcast
of Index
subclasses (inherited from Index
) which is the subclass itself if _can_hold_na = True
(that is, all but Int64Index
), and is overridden to Float64Index
for Int64Index
. Then I would just call unique._na_upcast._shallow_copy(filled)
.
Do you have in mind anything better?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i have no idea ; you need to see why other tests are failing and work around
can you rebase and we can revisit |
83a4b0f
to
328e175
Compare
Hello @toobaz! Thanks for updating the PR. Cheers ! There are no PEP8 issues in this Pull Request. 🍻 Comment last updated on December 10, 2017 at 18:36 Hours UTC |
6fe6775
to
2ffeaa5
Compare
Sidenote: @pep8speaks highlighted an error ("E129 visually indented line with same indent as next logical line") which neither my local flake8 nor the |
doc/source/whatsnew/v0.21.0.txt
Outdated
@@ -1061,6 +1061,7 @@ Indexing | |||
- Fixes ``DataFrame.loc`` for setting with alignment and tz-aware ``DatetimeIndex`` (:issue:`16889`) | |||
- Avoids ``IndexError`` when passing an Index or Series to ``.iloc`` with older numpy (:issue:`17193`) | |||
- Allow unicode empty strings as placeholders in multilevel columns in Python 2 (:issue:`17099`) | |||
- Bug in :meth:`MultiIndex.get_level_values` which would return an invalid index on level of ints with missing values (:issue:`17924`) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
move to 0.22
pandas/core/indexes/numeric.py
Outdated
not is_integer_dtype(values) and | ||
isna(values).any()): | ||
attributes = self._get_attributes_dict() | ||
attributes.update(kwargs) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is too specific logic, rather call _shallow_copy_with_infer
2ffeaa5
to
7ff2af4
Compare
@jreback ping |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lgtm, do comments. Second would consider (new issue/PR) to split up test_multi.py into a sub-dir as getting pretty long.
@@ -62,6 +62,13 @@ def _maybe_cast_slice_bound(self, label, side, kind): | |||
# we will try to coerce to integers | |||
return self._maybe_cast_indexer(label) | |||
|
|||
@Appender(_index_shared_docs['_shallow_copy']) | |||
def _shallow_copy(self, values=None, **kwargs): | |||
if values is not None and not self._can_hold_na: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can you add a comment here on why we are doing this (yes its obvious to me, but not generally to future readers).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(done)
ping
pandas/tests/indexes/test_multi.py
Outdated
tm.assert_index_equal(result, expected) | ||
|
||
def test_get_level_values_all_na(self): | ||
arrays = [[np.nan, np.nan, np.nan], ['a', np.nan, 1]] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can you add GH 17924 here as well as a comment
7ff2af4
to
78d967a
Compare
resolved conflicts. ping on green. |
thanks! |
git diff master -u -- "*.py" | flake8 --diff
Maybe the check could be more general... but I can't think of any case in which this is needed other than
int
->float
.