Skip to content

Keep name after DataFrame drop (issue 2939) #2961

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 3 commits into from

Conversation

hayd
Copy link
Contributor

@hayd hayd commented Mar 2, 2013

This should fix #2939.

@hayd
Copy link
Contributor Author

hayd commented Mar 3, 2013

Ag, although this is working locally, there is a typo in my test case... and yet it still passes.

Presumably this is because assert_frame_equal doesn't check the names (?), I'll update with an additional test for the names.

@hayd
Copy link
Contributor Author

hayd commented Mar 3, 2013

I removed the assert_frame_equal and only test the (index and column) names i.e. just what's been fixed.

(Aside: should assert_frame_equal test column/index names?)

@hayd
Copy link
Contributor Author

hayd commented Mar 3, 2013

Sorry, this last commit is a bad one, I shouldn't have done it. We shouldn't be checking names (only) on the check_type flag, as such this is only passing the tests by fluke. If I check all assert_frame_equalss with a check_names flag (i.e. always check names) I get a ton of errors. If I default it to False it would probably ok (but then it's not checking anything as nowhether uses check_name :s).

I should have added this to assert_frame_equal in a separate pull request, or at least waiting til your answer.

Is there a way to drop this last commit? (Should I just add another reverting it?)

@jreback
Copy link
Contributor

jreback commented Mar 3, 2013

http://stackoverflow.com/questions/927358/how-to-undo-last-git-commit
basically just

git reset HEAD-1

then u can recommit

rebase

then force push to update the remote

@jreback
Copy link
Contributor

jreback commented Mar 3, 2013

just add as an option to asset_frame_equal
set to False by default

@jreback
Copy link
Contributor

jreback commented Mar 3, 2013

or try getattr(left,'names',None) == getattr(right,'names',None)

so will check on if one side is bad

neither or both is ok though (as names prob
not propagated correctly in most situations)

or maybe it's just a few - not sure

@hayd
Copy link
Contributor Author

hayd commented Mar 3, 2013

Thanks @jreback, I will close this pull-request and add as separate pull.

@hayd hayd closed this Mar 3, 2013
@jreback
Copy link
Contributor

jreback commented Mar 3, 2013

ok

u can do it with this one ..

make a new commit that fixes what u want
git rebase -i commit_sha_that_starts_this_pr
git push myfork thisbranch -f

will replace the commits here with the revised ones

@hayd
Copy link
Contributor Author

hayd commented Mar 3, 2013

@jreback I'm not sure what you mean about getattr...

I squashed my commits to one, and added them, will make the pull-request now (I default check_names as False).

I'll have a play in a separate branch for assert_frame_equal to see where names isn't propogated, but I think it's a fair few places (whether it should be I'm not sure)...

I'd prefer to be able to set default check_names to True (but might be too much work to make that work), but doing so could be this throws up some current bugs similar to this one.

@hayd
Copy link
Contributor Author

hayd commented Mar 3, 2013

@jreback I'll give that a try for good measure :)

@jreback
Copy link
Contributor

jreback commented Mar 3, 2013

getattr will allow u to check names
but If the attribute does exist it will be none
so will compare as try on both sides (and test won't fail)

not sure how much these names are correctly propagating (or not)
so need to figure out what's appropriate

@hayd
Copy link
Contributor Author

hayd commented Mar 3, 2013

@jreback But it should always have the attribute names? I thought the reason where it's failing is because it isn't propogating. ?

I will investigate :)

@jreback
Copy link
Contributor

jreback commented Mar 3, 2013

I don't know - not sure when they r assigned or if a default is assigned
so maybe the attribute doesn't exist unless a name is created - worth having a look into the general case

@hayd
Copy link
Contributor Author

hayd commented Mar 3, 2013

I think it's always there:

In [1]: df = pd.DataFrame([])

In [2]: df.index.names
Out[2]: [None]

In [3]: df.columns.names
Out[3]: [None]

@jreback
Copy link
Contributor

jreback commented Mar 3, 2013

so if you set check_names to True
look and see what is failing and why - prob similar to the bug u fixed
something is not propagating

@hayd
Copy link
Contributor Author

hayd commented Mar 3, 2013

Precisely, so far it seems to be mostly expected results not including the name field. (I'm working through it.)

@hayd
Copy link
Contributor Author

hayd commented Mar 4, 2013

@jreback FYI I added the check_names=True as a new pull-request #2964.

I itemized the possible bugs/questionable behaviour there to do with index/column names, there's quite a couple I wasn't sure about (before I go "fixing" things).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

DataFrame.drop loses index name
2 participants