-
-
Notifications
You must be signed in to change notification settings - Fork 18.4k
Handle unsortable Periods correctly in set_index, MultiIndex #18208
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Changes from 1 commit
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -11,6 +11,51 @@ | |
from ..datetimelike import DatetimeLike | ||
|
||
|
||
class TestPeriodLevelMultiIndex(object): | ||
# TODO: Is there a more appropriate place for these? | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Let's figure that out in this PR (at first look, this location might actually be fine). BTW, your class name isn't entirely accurate IIUC. Your first test doesn't involve a multi-index. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Fair enough. The class originally had several other tests before this PR was broken up into 3 pieces. Have a suggested name? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. the 2nd should go in test_multi.py |
||
def test_set_index(self): | ||
# GH#17112 | ||
index = Index(['PCE'] * 4, name='Variable') | ||
data = [Period('2018Q2'), | ||
Period('2021', freq='5A-Dec'), | ||
Period('2026', freq='10A-Dec'), | ||
Period('2017Q2')] | ||
ser = Series(data, index=index, name='Period') | ||
df = ser.to_frame() | ||
|
||
res = df.set_index('Period', append=True) | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. this could go in tests/frame where we do .set_index tests don’t add commentary |
||
# If the doesn't raise then that's a good start | ||
assert res.index.names == ['Variable', 'Period'] | ||
|
||
def test_from_arrays_period_level(self): | ||
# GH#17112 | ||
index = Index(['PCE'] * 4, name='Variable') | ||
data = [Period('2018Q2'), | ||
Period('2021', freq='5A-Dec'), | ||
Period('2026', freq='10A-Dec'), | ||
Period('2017Q2')] | ||
ser = Series(data, index=index, name='Period') | ||
|
||
mi = pd.MultiIndex.from_arrays([ser.index, ser]) | ||
assert mi.names == ['Variable', 'Period'] | ||
assert mi.get_level_values('Variable').equals(index) | ||
|
||
def test_from_arrays_dataframe_level_invalid(self): | ||
# GH#17112 | ||
index = pd.Index(['CPROF', 'HOUSING', 'INDPROD', 'NGDP', 'PGDP'], | ||
name='Variable') | ||
data = [pd.Period('1968Q4')] * 5 | ||
df = pd.DataFrame(data, index=index, columns=['Period']) | ||
with pytest.raises(TypeError): | ||
# user should not pass a DataFrame as an index level. | ||
# In this single-column case the user needs to specifically pass | ||
# df['Period']. | ||
# Check that this raises at construction time instead of later | ||
# when accessing `mi.shape`, which used to raise | ||
# "ValueError: all arrays must be the same length", | ||
pd.MultiIndex.from_arrays([df.index, df]) | ||
|
||
|
||
class TestPeriodIndex(DatetimeLike): | ||
_holder = PeriodIndex | ||
_multiprocess_can_split_ = True | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why are you going thru all this trouble? not to mention this is going to be less performant.
let's just check that things are strings or ints and be done (anything else can just return as is)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So don't bother sorting non-(str or int)? OK.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There are 4 tests that fail once we start ignoring non-(str or int); they are specifically checking for TypeErrors being raised here.