Skip to content

Change ._data to ._parent for accessors #21906

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 9 commits into from
Aug 8, 2018

Conversation

jbrockmendel
Copy link
Member

The idea is to reduce the number of distinct meanings ._data has. With this it is down to just Index._data and NDFrame._data, I think.

@codecov
Copy link

codecov bot commented Jul 14, 2018

Codecov Report

Merging #21906 into master will not change coverage.
The diff coverage is 97.67%.

Impacted file tree graph

@@           Coverage Diff           @@
##           master   #21906   +/-   ##
=======================================
  Coverage   92.06%   92.06%           
=======================================
  Files         169      169           
  Lines       50698    50698           
=======================================
  Hits        46675    46675           
  Misses       4023     4023
Flag Coverage Δ
#multiple 90.47% <97.67%> (ø) ⬆️
#single 42.32% <20.93%> (ø) ⬆️
Impacted Files Coverage Δ
pandas/plotting/_core.py 83.48% <100%> (ø) ⬆️
pandas/core/indexes/accessors.py 90.09% <100%> (ø) ⬆️
pandas/core/arrays/categorical.py 95.75% <100%> (ø) ⬆️
pandas/core/strings.py 98.63% <96.96%> (ø) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 0041681...4ece5b3. Read the comment docs.

@jreback
Copy link
Contributor

jreback commented Jul 14, 2018

I can see changing this as _parent is more clear; is it intention to change other uses of _data?

@jreback jreback added the Clean label Jul 14, 2018
@jbrockmendel
Copy link
Member Author

is it intention to change other uses of _data?

Not really. At the sprint I brought up the idea of changing BlockManager references from _data to something more distinctive like _mgr, but the reception was luke-warm at best.

There is also #19658 but AFAIK there has been no action on that.

@@ -27,14 +27,14 @@ def __init__(self, data, orig):
raise TypeError("cannot convert an object of type {0} to a "
"datetimelike index".format(type(data)))

self.values = data
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

am puzzled why this doesn't actually break anything; i think that the dateitmelike accessors do something different?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i think that the dateitmelike accessors do something different?

CategoricalAccessor uses self.categories, is the outlier this PR leaves unchanged.

@jbrockmendel
Copy link
Member Author

The CI is having unrelated errors here much more consistently than usual.

@jbrockmendel
Copy link
Member Author

@mroeschke if you're comfortable with the "Take easy stuff off jreback's plate" strategy, this may qualify.

@jreback
Copy link
Contributor

jreback commented Aug 6, 2018

i don’t think we should do this though

let me look a bit later

@jbrockmendel
Copy link
Member Author

i don’t think we should do this though

Huh, here you seemed on board. My mistake.

let me look a bit later

The goal was to take "easy" stuff off your plate. This is definitely not a priority.

@jreback
Copy link
Contributor

jreback commented Aug 7, 2018

@jbrockmendel hah, right I do remember that now. let me look again. my worry is that we are changing some things to ._parent, which makes sense, but now we have a dicotomy and now unclear what means what. maybe survey where we use ._data and propose a logical split?

@TomAugspurger
Copy link
Contributor

I think using ._parent for accessors (and only for accessors) is a decent start.

Whether we we want to go down the road of changing Index._data or DataFrame._data can be pursued separately.

@TomAugspurger
Copy link
Contributor

and now unclear what means what.

I think that's @jbrockmendel's reason for changing this in the first place, since ._data can mean so many things. But with this PR when you need to deal with accessors, or you see ._parent, you know at a glance what's going on.

@jreback
Copy link
Contributor

jreback commented Aug 7, 2018

yeah I think this change is ok. Just want to document this somewhere and quickly survey other uses for comonaility.

@TomAugspurger
Copy link
Contributor

TomAugspurger commented Aug 7, 2018 via email

@jbrockmendel
Copy link
Member Author

Just want to document this somewhere and quickly survey other uses for comonaility.

OK, if we're going down this road, should I update CategoricalAccessor to use _parent instead of categories? (ATM it is the one case that doesn't use _data).

@jreback
Copy link
Contributor

jreback commented Aug 7, 2018

yes i am ok with all non internals things changing to use _parent

@jreback jreback added this to the 0.24.0 milestone Aug 8, 2018
@jreback
Copy link
Contributor

jreback commented Aug 8, 2018

ok, can you add some docs in internals.rst about this

@jreback jreback merged commit d52748c into pandas-dev:master Aug 8, 2018
@jbrockmendel jbrockmendel deleted the cln_data branch August 8, 2018 15:50
Sup3rGeo pushed a commit to Sup3rGeo/pandas that referenced this pull request Oct 1, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants