ENH: Disable NumPy memory allocation during concat #59956
Comments
Can you provide a reproducible example?
Something like this? If my script is correct, I did not observe extra memory allocation during concatenation. Reproducible script:
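The script itself was collapsed in the thread; below is a minimal sketch of what such a reproducer might look like, not the original. It assumes pandas with pyarrow installed, and uses tracemalloc (which NumPy reports its allocations to) as a rough proxy for peak memory; Arrow-side buffers are not traced.

```python
# Hypothetical reproducer sketch (the original script was collapsed
# in the thread). Builds mostly-null Arrow-backed frames and measures
# Python/NumPy allocations around pd.concat via tracemalloc.
import tracemalloc

import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
frames = []
for i in range(10):
    # ~1% non-null values per column, so the data is very sparse.
    col = np.where(rng.random(1_000_000) < 0.01, 1.0, np.nan)
    frames.append(
        pd.DataFrame({f"c{i}": col}).convert_dtypes(dtype_backend="pyarrow")
    )

tracemalloc.start()
result = pd.concat(frames, axis=1)
_, peak = tracemalloc.get_traced_memory()
tracemalloc.stop()

print(f"peak traced allocations during concat: {peak / 1e6:.1f} MB")
print(result.dtypes.value_counts())
```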
Result:
INSTALLED VERSIONS
commit: 7c0ee27
pandas: 3.0.0.dev0+1237.g7c0ee27e6c
Hi, sorry for the delayed response. Below are the points:
Just to add: the pandas DataFrame is consuming a lot of space.
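For reference, one way to see where that space goes is memory_usage(deep=True), which accounts for the actual buffer sizes per column. This is a generic sketch, not the reporter's data; the single Arrow-backed column is made up for illustration.

```python
import pandas as pd

# Illustrative frame; the "float64[pyarrow]" dtype requires pyarrow.
df = pd.DataFrame(
    {"a": pd.array([1.0, None] * 500_000, dtype="float64[pyarrow]")}
)

print(df.memory_usage(deep=True))  # bytes held per column
print(df.dtypes)                   # confirms the Arrow backing dtype
```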
@sandeyshc - can you share your output of pd.show_versions()?
INSTALLED VERSIONS
commit: d9cdd2e
pandas: 2.2.2
@rhshadrach I think this problem has already been fixed.
Feature Type
Adding new functionality to pandas
Changing existing functionality in pandas
Removing existing functionality in pandas
Problem Description
We have sparse data with many null values. When we read it using pandas with PyArrow, it doesn't consume much memory because of pandas' internal compression logic. However, during concatenation NumPy allocates memory that isn't actually used, and our Python script fails with memory allocation errors. Can you provide an option to disable NumPy memory allocation when concatenating DataFrames along axis=1?
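As a possible workaround under current pandas (a sketch, not an official recommendation): if every input frame keeps ArrowDtype columns, e.g. by reading with dtype_backend="pyarrow", an axis=1 concat of frames with distinct columns generally preserves those dtypes rather than materializing dense NumPy arrays. The file names below are illustrative.

```python
import pandas as pd

# Hypothetical input files; read them straight into ArrowDtype columns.
paths = ["part-0.parquet", "part-1.parquet"]
df_list = [pd.read_parquet(p, dtype_backend="pyarrow") for p in paths]

# With all-Arrow inputs and distinct column names, axis=1 concat keeps
# the Arrow-backed dtypes instead of converting to dense NumPy arrays.
combined = pd.concat(df_list, axis=1)
print(combined.dtypes.head())  # expect ...[pyarrow] dtypes
```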
Feature Description
pd.concat(df_list, axis=1, numpy_allocation=False)
Alternative Solutions
At least, can you explain how we can change the C++ code internally and use it for our purpose?
Additional Context
Please let me know if I am wrong.