ENH: retain attrs when concat dataframes #41828

Closed
xiki-tempula opened this issue Jun 5, 2021 · 6 comments · Fixed by #42252
Labels
Enhancement; metadata (_metadata, .attrs); Reshaping (Concat, Merge/Join, Stack/Unstack, Explode)

@xiki-tempula
Contributor

Is your feature request related to a problem?

I would like attrs to be retained when concatenating DataFrames. Currently they are dropped:

import pandas as pd

d = {'col1': [1, 2], 'col2': [3, 4]}
df1 = pd.DataFrame(data=d)
df1.attrs = {1: 1}
df2 = pd.DataFrame(data=d)
df2.attrs = {1: 1}
pd.concat([df1, df2]).attrs  # currently returns {}

Describe the solution you'd like

import pandas as pd

d = {'col1': [1, 2], 'col2': [3, 4]}
df1 = pd.DataFrame(data=d)
df1.attrs = {1: 1}
df2 = pd.DataFrame(data=d)
df2.attrs = {1: 1}
pd.concat([df1, df2]).attrs  # desired result: {1: 1}

API breaking implications

N/A

Describe alternatives you've considered

N/A

xiki-tempula added the Enhancement and Needs Triage labels on Jun 5, 2021
lithomas1 added the metadata and Reshaping labels and removed the Needs Triage label on Jun 6, 2021
lithomas1 added this to the Contributions Welcome milestone on Jun 6, 2021
@lithomas1
Member

Hi @xiki-tempula, thanks for the report.
This is indeed currently not implemented (xref #28283), and I guess the correct way to propagate metadata in this case would be to keep the attrs when they match, as in your example, and drop them when they don't. PRs to fix this are welcome.
cc @TomAugspurger

@xiki-tempula
Contributor Author

@lithomas1 Thanks for the comment.

I'm thinking that the logic of concat should be:

import pandas as pd

def concat(objs, *args, **kwargs):
    '''Concatenate pandas objects along a particular axis with optional set
    logic along the other axes. If all pandas objects have the same attrs,
    the new pandas object gets those attrs. A ValueError is raised if any
    pandas object has different attrs.

    Returns
    -------
    DataFrame
        Concatenated pandas object.
    '''
    # Sanity check: every input must carry the same attrs as the first one
    attrs = objs[0].attrs
    for obj in objs:
        if attrs != obj.attrs:
            raise ValueError('All pandas objects should have the same attrs.')
    # Delegate the actual concatenation to pandas, then restore the attrs
    new = pd.concat(objs, *args, **kwargs)
    new.attrs = attrs
    return new

What are your thoughts?

@TomAugspurger
Contributor

No, I don't think we should raise if the attrs don't match. They aren't supposed to affect the result of the computation.

For now, let's just support the case where attrs match, dropping them in other cases. We can add a keyword to concat later to control that behavior.
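
To make the suggested behavior concrete, here is a minimal sketch of "keep attrs when they all match, otherwise drop them". The wrapper name `concat_keep_matching_attrs` is hypothetical and is not part of pandas; this only illustrates the rule and is not how the eventual fix is implemented.

```python
import pandas as pd


def concat_keep_matching_attrs(objs, **kwargs):
    """Hypothetical helper: concat, keeping attrs only if all inputs agree."""
    objs = list(objs)
    # pd.concat raises ValueError on an empty list, so objs[0] below is safe.
    result = pd.concat(objs, **kwargs)
    first = objs[0].attrs
    if all(obj.attrs == first for obj in objs[1:]):
        # Every input carries the same metadata: propagate a copy of it.
        result.attrs = dict(first)
    else:
        # Metadata disagrees: drop it rather than raising.
        result.attrs = {}
    return result


d = {"col1": [1, 2], "col2": [3, 4]}
df1 = pd.DataFrame(d)
df1.attrs = {"source": "a"}
df2 = pd.DataFrame(d)
df2.attrs = {"source": "a"}
df3 = pd.DataFrame(d)
df3.attrs = {"source": "b"}

matching = concat_keep_matching_attrs([df1, df2])
mismatched = concat_keep_matching_attrs([df1, df3])
```

Under this rule the result of the computation never depends on attrs: mismatched metadata is dropped silently instead of raising an error.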

@amreesh

amreesh commented Apr 14, 2022

Using pandas 1.4.2

Instead of append or concat, I used loc to add a row to the DataFrame:

import pandas as pd
df = pd.DataFrame(columns={'field1','field2','field3'})
df.loc[len(df.index)] = ['a','b','c']
print(df)
field3 field2 field1
0 a b c

How do we keep the column order? Any clue?

@dss010101

Is this merged in the latest version of pandas?

@xiki-tempula
Contributor Author

@MSingh00 Yes. This should be the case.
