Skip to content

CLN/BUG: Fix stacklevel on setting with copy warning. #5581

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged

Conversation

jtratner
Copy link
Contributor

May not be perfect, but gets us most of the way there. Resolves @TomAugspurger's comment on #5390.

Put the following in a file called test.py.

from pandas import DataFrame
def myfunction():
    df = DataFrame({"A": [1, 2, 3, 4, 5], "B": [3.125, 4.12, 3.1, 6.2, 7.]})
    row = df.loc[0]
    row["A"] = 0

def anotherfunction():
    myfunction()

def third_function():
    anotherfunction()

third_function()

Running it will generate this warning:

test.py:5: SettingWithCopyWarning: A value is trying to be set on a copy
of a slice from a DataFrame.
Try using .loc[row_index,col_indexer] = value instead
  row["A"] = 0

I don't think it's worth testing (especially because there's not necessarily a simple way to do it).

@jtratner
Copy link
Contributor Author

@TomAugspurger - please tell me whether this works for you and if it does I'll merge it.

@TomAugspurger
Copy link
Contributor

I'm not sure it got the entire stack. You can try it, I'm looking at the test_index.py file. Now it points to pandas/core/indexing.py:343, which I don't think is quite it.

@jtratner
Copy link
Contributor Author

Can you post the point you're looking at? It may be that different functions need to set different stack levels.

@jreback
Copy link
Contributor

jreback commented Nov 24, 2013

@jtratner you prob need to specify the stacklevel in core/indexing.py where the copy check routine is actually called

@TomAugspurger
Copy link
Contributor

I'm just running

(pandas-dev) ~/E/p/l/p/s/p/p/tests (jtratner-fix-stacklevel-for-basic-setting=) nosetests test_index.py
................................................................................................................................................................./Users/tom/Envs/pandas-dev/lib/python2.7/site-packages/pandas/pandas/core/indexing.py:343: SettingWithCopyWarning: A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_index,col_indexer] = value instead
  self.obj[item] = s
.................
----------------------------------------------------------------------
Ran 178 tests in 0.927s

OK

Looks like 8 was the correct stack level for this one. Hardcoding that pointed to test_index.py:1385, this function:

    def test_set_value_keeps_names(self):
        # motivating example from #3742
        lev1 = ['hans', 'hans', 'hans', 'grethe', 'grethe', 'grethe']
        lev2 = ['1', '2', '3'] * 2
        idx = pd.MultiIndex.from_arrays(
            [lev1, lev2],
            names=['Name', 'Number'])
        df = pd.DataFrame(
            np.random.randn(6, 4),
            columns=['one', 'two', 'three', 'four'],
            index=idx)
        df = df.sortlevel()
        self.assertEqual(df.index.names, ('Name', 'Number'))
        df = df.set_value(('grethe', '4'), 'one', 99.34)
        self.assertEqual(df.index.names, ('Name', 'Number'))

@jtratner
Copy link
Contributor Author

I'm -1 on making that work, you'd have to have a separate function to set
with stacklevel, and it's not worth it when you can just change it to an
error instead:

            def setter(item, v):
                s = self.obj[item]
                pi = plane_indexer[0] if lplane_indexer == 1 else
plane_indexer

                # set the item, possibly having a dtype change
                s = s.copy()
                s._data = s._data.setitem(pi, v)
                s._maybe_update_cacher(clear=True)
                # would need to swap this out with a function to preserve
stacklevel
                self.obj[item] = s

Also, the problem with this is that you'd end up trying to do heroic things
to get it up to the code level, and that's not really pandas'
responsibility. The warning should be produced at that last line, because
that's the behavior that's causing the error.

@jtratner
Copy link
Contributor Author

Isn't the actual issue that df.set_value is doing the setting incorrectly?

@jreback
Copy link
Contributor

jreback commented Nov 24, 2013

it's even more complicated than that
it depends on the input how exactly this is called (eg a list or scalar)

so I agree because the user can always make it raise and investigate

it IS an error after all; I don't think there are very many false positives

@TomAugspurger did you see where the error is coming from?

can u post to see the use case. is it a false positive? or actual error?

@jtratner
Copy link
Contributor Author

And that warning at least shows you what is doing the setting, and for the majority of people who are doing something interactively, should be fine.

@TomAugspurger
Copy link
Contributor

This warning came from test_index line 1385I haven't looked closely at the code, but I think jtratner is correct that set_value is doing something funky.

@TomAugspurger
Copy link
Contributor

Also, yes the place your PR points to should work well for most cases. This use was a bit odd since it was from a test.

@jtratner
Copy link
Contributor Author

@TomAugspurger good you noticed it though, because we should fix it so that df.set_value() it doesn't have that issue...

jtratner added a commit that referenced this pull request Nov 24, 2013
CLN/BUG: Fix stacklevel on setting with copy warning.
@jtratner jtratner merged commit 044ee06 into pandas-dev:master Nov 24, 2013
@jtratner jtratner deleted the fix-stacklevel-for-basic-setting branch November 24, 2013 22:43
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants