Ordering should be kept when OrderedDict records are passed #10056

Closed
Zaharid opened this issue May 4, 2015 · 11 comments
Labels
API Design Enhancement Reshaping Concat, Merge/Join, Stack/Unstack, Explode

Comments

@Zaharid

Zaharid commented May 4, 2015

When passing a list of OrderedDict objects to pd.DataFrame, the ordering of the columns is not kept:

from collections import OrderedDict
import pandas as pd

od = OrderedDict([('Z', 5), ('xx', 8), ('uno', 2), ('etc', 1), ('j', 76), ('e', 7)])
l = [od] * 4
pd.DataFrame(l).columns
# Out[23]: Index(['Z', 'e', 'etc', 'j', 'uno', 'xx'], dtype='object')
@Zaharid changed the title from "P" to "Ordering should be kept when OrderedDict records are passed" May 4, 2015
@jorisvandenbossche
Member

Can you be a bit more specific? Passed to what? (DataFrame?)
Maybe give a small example.

@Zaharid
Author

Zaharid commented May 4, 2015

Sorry, I pressed submit by mistake. Hope it's clear now.

@jreback
Contributor

jreback commented May 4, 2015

It does work, but you are passing a list of dicts, which is quite different.

In [20]: od
Out[20]: OrderedDict([('Z', [5]), ('xx', [8]), ('uno', [2]), ('etc', [1]), ('j', [76]), ('e', [7])])

In [21]: DataFrame(od)
Out[21]: 
   Z  xx  uno  etc   j  e
0  5   8    2    1  76  7

I am +0 about supporting a nested version of this. I don't see the utility.
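
The distinction drawn above, between a single column-oriented OrderedDict and a list of record-oriented OrderedDicts, can be sketched as follows. This is only an illustration; the variable names are made up for the example, and the record-oriented column order is deliberately not asserted because it depends on the pandas version.

```python
from collections import OrderedDict
import pandas as pd

# Column-oriented: one OrderedDict mapping column name -> list of values.
columns_od = OrderedDict([('Z', [5]), ('xx', [8]), ('uno', [2])])
df_cols = pd.DataFrame(columns_od)

# Record-oriented: a list holding one OrderedDict per row.
record = OrderedDict([('Z', 5), ('xx', 8), ('uno', 2)])
df_rows = pd.DataFrame([record] * 4)

# The column-oriented frame follows the OrderedDict's key order.
print(list(df_cols.columns))

# The record-oriented frame has the same columns; whether their order
# matches depends on the installed pandas version.
print(list(df_rows.columns))
```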

@Zaharid
Author

Zaharid commented May 4, 2015

My use case is:

def results_table(results):
    records = [OrderedDict([
                ('Observable'       , result.obs),
                ('PDF'              , result.pdf),
                ('Collaboration'    , result.pdf.collaboration),
                ('alpha_sMref'      , result.pdf.AlphaS_MZ),
                ('PDF_OrderQCD'     , result.pdf.oqcd_str),
                ('NumFlavors'       , result.pdf.NumFlavors),
                ('CV'               , result.central_value),
                ('Up68'             , result.error68['max']),
                ('Down68'           , result.error68['min']),
                ('Remarks'          , []),
               ]) for result in results]
    return pd.DataFrame(records)

That is, I have a list of objects and I would like to turn them into a DataFrame to be able to display and aggregate them. Doing a list comprehension for each column doesn't look very nice.
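
One way to keep this use case readable without relying on dict ordering is to declare the column order once and pass it to the constructor. The sketch below assumes a simplified, hypothetical `Result` type with three attributes standing in for the real result objects; `COLUMNS` and the sample values are likewise invented for the example.

```python
from collections import namedtuple
import pandas as pd

# Hypothetical stand-in for the result objects described above.
Result = namedtuple('Result', ['obs', 'pdf', 'central_value'])

# Column order is stated once, next to the attribute each column reads.
COLUMNS = [
    ('Observable', 'obs'),
    ('PDF', 'pdf'),
    ('CV', 'central_value'),
]

def results_table(results):
    """Build a DataFrame with one row per result, columns in COLUMNS order."""
    names = [name for name, _ in COLUMNS]
    rows = [tuple(getattr(r, attr) for _, attr in COLUMNS) for r in results]
    return pd.DataFrame(rows, columns=names)

table = results_table([
    Result('obs_a', 'pdf_a', 1.5),
    Result('obs_b', 'pdf_b', 1.7),
])
print(table)
```

Because the rows are plain tuples and the names are passed through `columns=`, the column order does not depend on how any particular pandas version treats dicts.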

@hayd
Contributor

hayd commented May 5, 2015

This seems similar to passing namedtuples, where I think it's clearer to be explicit:

In [11]: pd.DataFrame(l, columns=od.keys()).columns
Out[11]: Index(['Z', 'xx', 'uno', 'etc', 'j', 'e'], dtype='object')

In [12]: pd.DataFrame(l, columns=l[0].keys()).columns
Out[12]: Index(['Z', 'xx', 'uno', 'etc', 'j', 'e'], dtype='object')
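
The explicit approach above could be wrapped in a small helper. This is a minimal sketch; `frame_from_records` is a hypothetical name, not a pandas function, and it assumes every record shares the keys of the first one.

```python
from collections import OrderedDict
import pandas as pd

def frame_from_records(records):
    """Build a DataFrame whose columns follow the key order of the first record.

    Hypothetical helper: assumes all records have the same keys.
    """
    records = list(records)
    if not records:
        return pd.DataFrame()
    return pd.DataFrame(records, columns=list(records[0].keys()))

od = OrderedDict([('Z', 5), ('xx', 8), ('uno', 2), ('etc', 1), ('j', 76), ('e', 7)])
df = frame_from_records([od] * 4)
print(list(df.columns))
```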

@bashtage
Contributor

bashtage commented May 5, 2015

One workaround:

data = ((1, 2), (3, 4))
records = [pd.Series(OrderedDict([('a', d[0]), ('b', d[1])])) for d in data]
records = pd.DataFrame(records)

@jason-curtis

Since dict ordering is guaranteed in Python 3.7+, it seems reasonable to expect this behavior with regular dicts as well as OrderedDicts. It should also work in CPython 3.6, although there the ordering is an implementation detail rather than a language guarantee.

Expected:

In[2]: DataFrame([{'First': 1, 'Second': 2, 'Third': 3, 'Fourth': 4}])                                      
Out[2]: 
   First  Second  Third  Fourth
0      1       2       3      4

Actual (pandas 0.24.1 on Python 3.6.7); the output ordering is inconsistent, so this only happens sometimes:

In[2]: DataFrame([{'First': 1, 'Second': 2, 'Third': 3, 'Fourth': 4}])                                      
Out[2]: 
   First  Fourth  Second  Third
0      1       4       2      3
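
A quick way to check a given installation against the expected behavior is sketched below. The names `expected`, `actual`, and `order_preserved` are just for the example, and no result is claimed here since it depends on the pandas and Python versions in use.

```python
import pandas as pd

record = {'First': 1, 'Second': 2, 'Third': 3, 'Fourth': 4}
df = pd.DataFrame([record])

expected = list(record)       # key order as iterated by the dict itself
actual = list(df.columns)     # column order chosen by pandas
order_preserved = actual == expected

# Report the version alongside the result so runs can be compared.
print(pd.__version__, actual, order_preserved)
```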

@bashtage
Contributor

bashtage commented Jul 8, 2019

I was never sure if ordering was guaranteed with literals. It is if you use dict(a=a, b=b) or insert line by line.

@jason-curtis

The thread I just linked to is entitled "Guarantee ordered dict literals in v3.7?" so I would say yes. A quick test in 3.6 vs 2.7 seemed to corroborate this too.

@jason-curtis

jason-curtis commented Jul 8, 2019

To be clear, the quick test I did was based on

list({1:2, 3:4, 5:6, 7:8, 'nine': 10, 'eleven': 12}.items())                                                 
# in Python 3.6 but not 2.7, always produces
[(1, 2), (3, 4), (5, 6), (7, 8), ('nine', 10), ('eleven', 12)]

Hard to pin down these issues though, as often it'll look ordered for a bunch of examples and then you'll change something and the order will change.
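
For reference, on Python 3.7 or later, where insertion order is part of the language, the three ways of building a dict discussed above (a literal, `dict(...)` keyword arguments, and key-by-key insertion) can be compared directly. This sketch assumes Python 3.7+; the variable names are illustrative.

```python
from collections import OrderedDict

# Dict literal: keys are stored in the order they are written.
literal = {1: 2, 3: 4, 5: 6, 7: 8, 'nine': 10, 'eleven': 12}

# Keyword arguments: order follows the call site.
via_kwargs = dict(b=1, a=2)

# Key-by-key insertion: order follows the assignments.
step_by_step = {}
step_by_step['z'] = 0
step_by_step['a'] = 1

print(list(literal))
print(list(via_kwargs))
print(list(step_by_step))

# Unlike OrderedDict, plain dicts ignore order when compared for equality.
print({'a': 1, 'b': 2} == {'b': 2, 'a': 1})
print(OrderedDict(a=1, b=2) == OrderedDict(b=2, a=1))
```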

Is this related to #22708?

@ghost

ghost commented Jul 11, 2019

This issue is a dupe of the later #13304, and it should have been closed when #13309 was merged. Three years ago. Wish I'd known before getting into a fistfight in #27309.

cc @jorisvandenbossche

@jreback jreback added the Reshaping Concat, Merge/Join, Stack/Unstack, Explode label Jul 11, 2019
@jreback jreback closed this as completed Jul 12, 2019
6 participants