BUG: loc casting to float for scalar with MultiIndex df #41374

phofl · 2021-05-07T21:53:12Z

closes BUG: Accessing a cell in a pandas DataFrame with MultiIndex loses type information #41369
tests added / passed
Ensure all linting tests pass, see here for how to run them
whatsnew entry

In theory this would solve the issue, but I am not sure whether it is desirable. Currently we cast the row to a Series, which forces the dtype conversion. If we loop in reverse, we retrieve a column as a Series instead, which avoids the conversion. Would you mind having a look, @jbrockmendel?
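
The dtype conversion described above can be reproduced with public pandas API. This is only a minimal sketch, not the code path changed in this PR: the frame mirrors the one used in the PR's test, `iloc[0]` stands in for the internal row lookup, and the variable names are illustrative.

```python
import numpy as np
import pandas as pd

# Same shape as the frame in the PR's test: one float column, one int column.
df = pd.DataFrame(
    {"a": 1.0, "b": 2},
    index=pd.MultiIndex.from_arrays([[3], [4]], names=["c", "d"]),
)

# Row first: a single Series has to hold 1.0 and 2, so it is upcast to float.
row = df.iloc[0]
row_value = row["b"]

# Column first: the column keeps its own integer dtype.
col = df["b"]
col_value = col.loc[(3, 4)]

print(row.dtype, type(row_value).__name__)
print(col.dtype, type(col_value).__name__)
```

Selecting the column before the row means the scalar is read from an integer Series, so it never passes through a float container.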

jbrockmendel · 2021-05-09T16:55:21Z

pandas/core/indexing.py

-            # has the dim of the obj changed?
-            # GH 7199
-            if obj.ndim < current_ndim:
-                axis -= 1


I take it this is unreachable?

No, in theory it is reachable, but it no longer makes sense.

We now count backwards from the highest axis, so by the time a lookup reduces the dimension, the axis counter has already moved down to the new highest axis.

DataFrame example:
axis is 1 -> indexing reduces the frame to a Series -> the loop decrements axis by one -> we are already at 0, so this case is no longer needed

jbrockmendel · 2021-05-09T16:58:04Z

pandas/core/indexing.py

-        axis = 0
-        for key in tup:
+        # GH#41369 Loop in reverse order to avoid dtype conversion when converting df
+        # row to a series


IIUC the reason this works is bc this ensures we index along columns before rows, so that in multi-block cases we select only necessary blocks, avoiding the fast_xs/interleave/whatever call that is doing the casting ATM?

Yes exactly

can you flesh out the comment to that effect?

pending that, LGTM
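
A rough way to see the multi-block effect from the outside, offered as a sketch under assumptions: `to_numpy` is used here only as a public stand-in for the internal interleaving step mentioned above, and the frame and variable names are illustrative.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"a": [1.0, 1.5], "b": [2, 3]})

# Both columns together: the int and float data must share one array,
# so the common dtype is float.
both = df.to_numpy()

# Only the integer column: no float data is involved, so no upcast.
only_b = df[["b"]].to_numpy()

print(both.dtype, only_b.dtype)
```

Narrowing to the needed column first keeps the float data out of the lookup, which is the behaviour the reversed loop relies on.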

jreback · 2021-05-10T23:36:27Z

can you rebase

phofl · 2021-05-11T22:53:48Z

Done

jreback

test request. lgtm otherwise, ping on green.

jreback · 2021-05-12T01:17:20Z

pandas/tests/indexing/multiindex/test_loc.py

@@ -788,3 +788,12 @@ def test_mi_columns_loc_list_label_order():
        columns=MultiIndex.from_tuples([("B", 1), ("B", 2), ("A", 1), ("A", 2)]),
    )
    tm.assert_frame_equal(result, expected)
+
+
+def test_loc_get_scalar_casting_to_float():


Can you add the .iloc example as well (to assert that it's also an int, as on master)?
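
For reference, a sketch of the positional lookup being requested, assuming the same frame as in the test under review; `scalar` is an illustrative name and this is not necessarily the assertion that ended up in the test file.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(
    {"a": 1.0, "b": 2},
    index=pd.MultiIndex.from_arrays([[3], [4]], names=["c", "d"]),
)

# Positional access to row 0, column 1 ("b") reads the value from the
# integer column directly.
scalar = df.iloc[0, 1]
print(type(scalar).__name__)
```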

phofl · 2021-05-12T10:38:39Z

@jreback greenish, failure unrelated

simonjayhawkins · 2021-05-24T18:26:17Z

@phofl can you merge master to resolve conflicts

jreback · 2021-05-24T18:28:56Z

yeah this looks good (just resolved conflicts)

phofl · 2021-05-24T18:55:35Z

merged master

simonjayhawkins · 2021-05-25T08:33:00Z

pandas/tests/indexing/multiindex/test_loc.py

+        {"a": 1.0, "b": 2}, index=MultiIndex.from_arrays([[3], [4]], names=["c", "d"])
+    )
+    result = df.loc[(3, 4), "b"]
+    assert result == 2


this test passes on master. 2.0 == 2 is True.

Added an isinstance check
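
A short illustration of why the equality assertion alone could not catch the regression. The exact check added to the test is not shown in this thread, so the type tuple below is an assumption.

```python
import numpy as np

float_result = np.float64(2.0)  # what an upcast lookup would return
int_result = np.int64(2)        # what the lookup should return

# Equality cannot tell the two apart.
print(float_result == 2, int_result == 2)

# A type check can.
print(isinstance(float_result, (int, np.integer)))
print(isinstance(int_result, (int, np.integer)))
```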

simonjayhawkins · 2021-05-25T12:29:07Z

Thanks @phofl

jbrockmendel · 2021-06-03T15:08:35Z

doc/source/whatsnew/v1.3.0.rst

@@ -872,6 +872,7 @@ Indexing
 - Bug in :meth:`DataFrame.__setitem__` and :meth:`DataFrame.iloc.__setitem__` raising ``ValueError`` when trying to index with a row-slice and setting a list as values (:issue:`40440`)
 - Bug in :meth:`DataFrame.loc` not raising ``KeyError`` when key was not found in :class:`MultiIndex` when levels contain more values than used (:issue:`41170`)
 - Bug in :meth:`DataFrame.loc.__setitem__` when setting-with-expansion incorrectly raising when the index in the expanding axis contains duplicates (:issue:`40096`)
+- Bug in :meth:`DataFrame.loc.__getitem__` with :class:`MultiIndex` casting to float when at least one column is from has float dtype and we retrieve a scalar (:issue:`41369`)


"is from has" typo?

Yes thanks, opened #41808

BUG: loc casting to float for scalar with MultiIndex df

5dc2757

phofl added the Indexing (related to indexing on series/frames, not to indexes themselves) and MultiIndex labels May 7, 2021

jbrockmendel reviewed May 9, 2021

Adjust comment

858299e

Merge branch 'master' of https://github.com/pandas-dev/pandas into 41369

6a4db6f

Conflicts: pandas/tests/indexing/multiindex/test_loc.py

jreback added this to the 1.3 milestone May 12, 2021

jreback requested changes May 12, 2021

Add iloc example

01dde93

jreback approved these changes May 24, 2021

Merge branch 'master' of https://github.com/pandas-dev/pandas into 41369

90f5e1c

Conflicts: pandas/tests/indexing/multiindex/test_loc.py

Merge branch 'master' of https://github.com/pandas-dev/pandas into 41369

4484d91

Conflicts: pandas/tests/indexing/multiindex/test_loc.py

simonjayhawkins reviewed May 25, 2021

Add isinstance check

832c236

simonjayhawkins merged commit 4f7da2c into pandas-dev:master May 25, 2021

phofl deleted the 41369 branch May 25, 2021 14:43

TLouf pushed a commit to TLouf/pandas that referenced this pull request Jun 1, 2021

BUG: loc casting to float for scalar with MultiIndex df (pandas-dev#4…

f0be3d2

…1374)

jbrockmendel reviewed Jun 3, 2021

phofl mentioned this pull request Jun 3, 2021

Fix typo in whatsnew #41808

Merged

1 task

JulianWgs pushed a commit to JulianWgs/pandas that referenced this pull request Jul 3, 2021

BUG: loc casting to float for scalar with MultiIndex df (pandas-dev#4…

197504a

…1374)
