-
-
Notifications
You must be signed in to change notification settings - Fork 18.4k
Maintain Dict Ordering with Concat #21512
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Changes from 5 commits
87bb1ac
f9d105e
b044e04
eaa7cc0
6869e2c
e9561fe
53aca81
726755f
a7aa052
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -83,4 +83,5 @@ Bug Fixes | |
**Other** | ||
|
||
- Bug in :meth:`Series.nlargest` for signed and unsigned integer dtypes when the minimum value is present (:issue:`21426`) | ||
- | ||
- Bug in :class:`_Concatenator` should maintain dict ordering when :meth:`concat` is called (:issue:`2151`) | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. this should be a user facing comment, meaning it should only use the public API. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Got it |
||
- | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. this needs to go into 0.24.0 as even though its a bug fix, its a jaring one, should be in reshaping. |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,7 +1,7 @@ | ||
""" | ||
concat routines | ||
""" | ||
|
||
from collections import OrderedDict | ||
import numpy as np | ||
from pandas import compat, DataFrame, Series, Index, MultiIndex | ||
from pandas.core.index import (_get_objs_combined_axis, | ||
|
@@ -250,7 +250,10 @@ def __init__(self, objs, axis=0, join='outer', join_axes=None, | |
|
||
if isinstance(objs, dict): | ||
if keys is None: | ||
keys = sorted(objs) | ||
if not isinstance(objs, OrderedDict): | ||
keys = sorted(objs) | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Normal dicts are ordered too in python3.6+, that must be checked too. There is a There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Not sure if we already took a stance on this as a project, but the orderedness of dicts in 3.6 is considered an implementation detail of cpython, while it will be a language feature starting from 3.7 . So in principle in 3.6 users should not rely on it, and we should not assume they do. @topper-123 's comment still holds I think, but for 3.7 There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Yeah, given that Python 3.6 is at 3.6.5 now, and its's being formalised in 3.7, I think this is quite safe. They're not going to change implementation now... There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. This is very simliar to in structure to There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. @jreback you mean use: There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. yes that's the point i am raising, I want a refactor here. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. So, from |
||
else: | ||
keys = objs | ||
objs = [objs[k] for k in keys] | ||
else: | ||
objs = list(objs) | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,3 +1,4 @@ | ||
from collections import OrderedDict | ||
from warnings import catch_warnings | ||
from itertools import combinations, product | ||
|
||
|
@@ -1294,6 +1295,17 @@ def test_concat_rename_index(self): | |
tm.assert_frame_equal(result, exp) | ||
assert result.index.names == exp.index.names | ||
|
||
def test_concat_with_ordered_dict(self): | ||
# GH 21510 | ||
result = pd.concat(OrderedDict([('First', pd.Series(range(3))), | ||
('Another', pd.Series(range(4)))])) | ||
a = pd.Series(range(3), range(3)) | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. if you use the 2 form constructor, then specify There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Why should I specify that? |
||
b = pd.Series(range(4), range(4)) | ||
a.index = pd.MultiIndex.from_tuples([('First', v) for v in a.index]) | ||
b.index = pd.MultiIndex.from_tuples([('Another', v) for v in b.index]) | ||
expected = a.append(b) | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. It would be easier if you just constructed the Series directly. You could doing something like There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. AttributeError: type object 'MultiIndex' has no attribute 'from_records' |
||
tm.assert_series_equal(result, expected) | ||
|
||
def test_crossed_dtypes_weird_corner(self): | ||
columns = ['A', 'B', 'C', 'D'] | ||
df1 = DataFrame({'A': np.array([1, 2, 3, 4], dtype='f8'), | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
you picked up some extra text here. also move to 0.24