ENH: Add a level option to pd.DataFrame.append() #43821


Closed
MoAly98 opened this issue Sep 30, 2021 · 4 comments


MoAly98 commented Sep 30, 2021

I have two dataframes:

The first one looks like this:

	               variable
	entry subentry
	0     1               X
	      2               Y
	      3               Z

and the second one looks like:

	               variable
	entry subentry
	0     1               A
	      2               B

I would like to combine the two dataframes such that I get:

	               variable
	entry subentry
	0     1               X
	      2               Y
	      3               Z
	1     1               A
	      2               B

Simply using df1.append(df2, ignore_index=True) gives

	  variable
	0        X
	1        Y
	2        Z
	3        A
	4        B

In other words, it appends on the highest level (level 1) in the dataframes. I think it is important to allow users dealing with multi-index dataframes to specify which level of the dataframe to append on. In the simple example above, this would allow me to specify "entry" as the index level to append on.

Here is a code snippet that will reproduce the problem:

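	import numpy as np
	import pandas as pd

	# Two-level index: level 0 plays the role of "entry", level 1 of "subentry"
	arrays = [
	    np.array([0, 0, 0]),
	    np.array([0, 1, 2]),
	]
	arrays_2 = [
	    np.array([0, 0]),
	    np.array([0, 1]),
	]
	df1 = pd.DataFrame(np.random.randn(3, 1), index=arrays)
	df2 = pd.DataFrame(np.random.randn(2, 1), index=arrays_2)
	# Requires pandas < 2.0, where DataFrame.append still exists.
	# ignore_index=True replaces the MultiIndex with a flat 0..4 index.
	df = df1.append(df2, ignore_index=True)
	print(df)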

In practice, I am looking to combine N dataframes, each with a different number of "entry" rows, so I am looking for an approach that does not rely on me knowing the exact size of the dataframes I am combining.

MoAly98 added the Enhancement and Needs Triage labels Sep 30, 2021

phofl (Member) commented Sep 30, 2021

It does append on all levels and calls reset_index afterwards.

-1 on this
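
The effect described in this comment can be observed directly. The sketch below is an illustration, not part of the original thread: it uses `pd.concat`, which accepts the same `ignore_index` flag and still exists in pandas 2.0 and later, where `DataFrame.append` has been removed. The frames and the `variable` column are rebuilt from the tables in the opening post.

```python
import pandas as pd

# Frames rebuilt from the tables in the opening post.
names = ["entry", "subentry"]
index1 = pd.MultiIndex.from_arrays([[0, 0, 0], [1, 2, 3]], names=names)
index2 = pd.MultiIndex.from_arrays([[0, 0], [1, 2]], names=names)
df1 = pd.DataFrame({"variable": ["X", "Y", "Z"]}, index=index1)
df2 = pd.DataFrame({"variable": ["A", "B"]}, index=index2)

# Without ignore_index, every level of the MultiIndex is kept as-is.
kept = pd.concat([df1, df2])

# With ignore_index=True, the whole index is discarded and replaced
# by a fresh 0..n-1 range, so both levels are lost at once.
flat = pd.concat([df1, df2], ignore_index=True)

print(type(kept.index).__name__)  # MultiIndex
print(type(flat.index).__name__)  # RangeIndex
```

Note that `kept` contains entry 0 twice, which is why neither variant produces the layout requested in the opening post.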

MoAly98 (Author) commented Oct 1, 2021

This is exactly what I'm asking to make more flexible. Resetting the index causes the loss of my multi-index structure. I think there should be flexibility in append to append on just one level and reset only that level's index.
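
One way to read this request is as a helper that stacks the frames while renumbering a single index level and leaving the others untouched. The sketch below is a rough illustration under that reading: `append_on_level` is a hypothetical name, not a pandas function, and it assumes every frame has a MultiIndex with named levels.

```python
import pandas as pd


def append_on_level(frames, level):
    """Hypothetical helper: stack frames, renumbering only `level`."""
    pieces = []
    offset = 0
    for frame in frames:
        names = list(frame.index.names)
        flat = frame.reset_index()
        # Map this frame's values on `level` to 0..k-1, then shift them
        # past the values already used by the earlier frames.
        codes, uniques = pd.factorize(flat[level])
        flat[level] = codes + offset
        offset += len(uniques)
        pieces.append(flat.set_index(names))
    return pd.concat(pieces)


# Frames rebuilt from the tables in the opening post.
names = ["entry", "subentry"]
index1 = pd.MultiIndex.from_arrays([[0, 0, 0], [1, 2, 3]], names=names)
index2 = pd.MultiIndex.from_arrays([[0, 0], [1, 2]], names=names)
df1 = pd.DataFrame({"variable": ["X", "Y", "Z"]}, index=index1)
df2 = pd.DataFrame({"variable": ["A", "B"]}, index=index2)

combined = append_on_level([df1, df2], level="entry")
print(combined)
```

Because the offset is tracked per frame, the helper does not need to know in advance how many "entry" values each frame holds.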

mroeschke (Member) commented

I would also be -1 on this suggestion, for similar reasons and because append is likely to be deprecated in a future release: #35407
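
Since `pd.concat` is the usual alternative to `append`, the layout from the opening post can probably be obtained with its `keys` argument. This is a sketch under one assumption that the thread does not confirm: each input frame should become exactly one "entry" in the result, so the existing "entry" level can be dropped and regenerated from the frame's position in the list.

```python
import pandas as pd

# Frames rebuilt from the tables in the opening post.
names = ["entry", "subentry"]
index1 = pd.MultiIndex.from_arrays([[0, 0, 0], [1, 2, 3]], names=names)
index2 = pd.MultiIndex.from_arrays([[0, 0], [1, 2]], names=names)
df1 = pd.DataFrame({"variable": ["X", "Y", "Z"]}, index=index1)
df2 = pd.DataFrame({"variable": ["A", "B"]}, index=index2)

frames = [df1, df2]

# Drop the old "entry" level, then let concat build a new outer level
# whose value is the position of each frame in the list.
combined = pd.concat(
    [frame.droplevel("entry") for frame in frames],
    keys=list(range(len(frames))),
    names=["entry", "subentry"],
)
print(combined)
```

This works for any number of frames of any length, but it would not suit frames that each hold several distinct "entry" values.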

mroeschke removed the Needs Triage label Oct 2, 2021

mroeschke (Member) commented

Thanks for the suggestion, but since there hasn't been additional support from the community or other core devs, going to close.
