-
-
Notifications
You must be signed in to change notification settings - Fork 18.4k
CLN: Refactor string special methods #4092
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
CLN: Refactor string special methods #4092
Conversation
Can squash this into fewer commits if you want ... just like to lay out the changes clearly for people to review. |
are there any major api changes here? |
- New ``StringMixin`` that, given a ``__unicode__`` method, gets python 2 and | ||
python 3 compatible string methods (``__str__``, ``__bytes__``, and | ||
``__repr__``). Plus string safety throughout. Now employed in many places | ||
throughout the pandas library. (:issue:`4090`) | ||
|
||
**Experimental Feautres** |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can u correct this? :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Correct what?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
you mean move it out of API changes?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
no there's a typo, Feautres -> Features
ahem user facing api changes is what i meant :) |
py2/py3. | ||
""" | ||
# Should be overwritten by base classes | ||
return object.__repr__(self) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should this be super(PandasObject, self).__repr__()
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No. That would be an infinite loop. (because StringMixin calls __str__
, etc.) That's the only awkward part of this setup...but for most PandasObjects, it works out okay.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
oh whoops blah sorry
This doesn't change any user-facing API at all. It does change the default NDFrame |
this would close #3231 the only remaining questions is do we need a PandasScalar ? |
I didn't put this into Period or TimeStamp - should it go there too? On Sun, Jun 30, 2013 at 11:15 PM, jreback [email protected] wrote:
|
would it make string methods in Period-Timestamp ? |
what purpose would |
Side note: added PandasObject to |
my question is are there ops on scalars that are useful outside of their role in arrays and repring themselves, personally i almost never use the scalar versions for anything except inspection |
also is there a problem with moving |
No. Would that just be On Sun, Jun 30, 2013 at 11:46 PM, Phillip Cloud [email protected]:
|
yep |
only reason to have |
if just need the instance check then can just fake with def is_pd_scalar(obj):
return isinstance(obj, (Timestamp, Period)) adding to the instance check if (when?) more scalar types are added. |
i think ok for 0.12 |
👍 for 0.12 here as well |
@jtratner maybe squash it down a bit? |
If it's list like, use PandasContainer? Think that's already the case.
|
retract need for PandasScalar.....noneed really (and if there is legit need later, can always add) On Jul 1, 2013, at 7:41 AM, Jeff Tratner [email protected] wrote:
|
Yeah, I'm going to squash it down...just wanted to show steps while As an aside, it might make more sense to just create an abstract base class
|
FWIW i like that you don't squash for a while, while you're building, i'm starting to do that. i think it helps show the thought process. plenty of examples too, i think the series-as-ndframe pr has a ton of commits |
Plus, if you squash, it keeps the text of all the commits which is nice. For me, I like to make sure every commit passes its tests and distinct changes should end up in different commits to make it easier to localize issues if they occur. (e.g. via git bisect) |
@cpcloud - I added the constructor argument to the baseclass and removed a On Mon, Jul 1, 2013 at 12:08 PM, jreback [email protected] wrote:
|
Previous PandasObject becomes PandasContainer. New PandasObject becomes baseclass for more elements (like Index, Categorical, etc.), moves string methods to baseclass and subclassing objects need only define `__unicode__` methods to get all string methods for free (and Py2/3 compatible). CLN: Cleanup extraneous str methods from Panel CLN: Remove unnecessary string methods from frame CLN: Change name of TestPandasObjects --> TestPandasContainer
CLN: Make Categorical inherit from PandasObject CLN: Make GroupBy inherit from PandasObject CLN/ENH: Make Sparse* into PandasObjects Plus get all the string methods working... CLN: Index now a PandasObject + str method cleanup CLN: Make tseries/index fit with PandasObject. CLN: Use PandasObject in internals + cleanup CLN: Make Period into a PandasObject + cleanup CLN: Remove extraneous __repr__ from io/excel
CLN: Make PyTables unicode safe + add StringMixin CLN: Make StataMissingValue use StringMixin ENH: Use StringMixin for addl string methods in stats
DOC/CLN: Remove extra whitespace from v0.12.0.txt DOC: Add PR issue number too DOC: Fix spelling error
just confirming there's no user-facing API changes and have you run the perf regression test? |
@wesm there's definitely no user-facing API changes. I ran |
nope that would be it. bombs away |
test suite above @wesm - is that what reasonably close? |
looks good @jtratner merging UNODIR (unless otherwise directed) |
@jreback go ahead - all good on my end |
CLN: Refactor string special methods
closes #4090, #3231
pandas
class hierarchy has changed (slightly). Theprevious
PandasObject
now is calledPandasContainer
and a newPandasObject
has become the baseclass forPandasContainer
as wellas
Index
,Categorical
,GroupBy
,SparseList
, andSparseArray
(+ their base classes). Currently,PandasObject
provides string methods (from
StringMixin
).StringMixin
that, given a__unicode__
method, gets python 2 andpython 3 compatible string methods (
__str__
,__bytes__
, and__repr__
). Now employed in many places throughout the pandas library.used in the string or repr methods of a library).
I'm hoping to use this for the new objects in the MultiIndex naming PR, so
hopefully it's okay to merge for v0.12.