ENH: ignore_index for Series corr #49617
Comments
Hi, thanks for your report. The keyword should ignore the index during the operation, not in the result? In this case we need a different name, since
Here's an example of the issue. Another approach might be to emit a warning, though that can get noisy. The benefit of a parameter that turns off the automatic alignment is that its presence in the signature serves as a warning. A final approach would be just a note in the documentation.
All operations in pandas align. -1 on changing anything; this would be a very special case.
I think reset_index() is actually the right answer here. to_numpy() would require scipy. Your point is taken, but I guess what got me was that all operations do appear to sort of align, but in rather arbitrary ways, which leaves things quite confusing. e.g.:
If addition were consistent with corr, then n1+n2 should be a new DF with only the agreed-upon indexes, but instead the missing value is NaN. If corr followed the same logic as addition, it would give a NaN correlation. When I saw a seemingly valid correlation I assumed everything was correct, which of course was a mistake. A short note in the docs describing the assumptions made would at least help a bit.
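To make the inconsistency described above concrete, here is a minimal sketch. The values and index labels of `n1` and `n2` are made up for illustration (the original example was not preserved), chosen so that the two Series share only labels 1, 2 and 3:

```python
import pandas as pd

# Hypothetical data: the two Series share labels 1, 2 and 3 only.
n1 = pd.Series([1.0, 2.0, 3.0, 4.0], index=[0, 1, 2, 3])
n2 = pd.Series([10.0, 20.0, 30.0, 40.0], index=[3, 2, 1, 4])

# Addition aligns on the union of labels; labels present in only one
# Series (0 and 4 here) come out as NaN.
total = n1 + n2
print(total)

# corr aligns on the intersection of labels and drops the rest without
# any signal, so a perfectly valid-looking number comes back.
aligned_r = n1.corr(n2)
print(aligned_r)
```

With this data the sum has NaN at labels 0 and 4, while `corr` pairs up the values by label and reports a strong negative correlation, even though the values read in positional order rise together.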
@blazespinnaker you might be interested in PDEP5, which may (mind you, I said may: it's not yet been accepted, and it's still being ironed out) allow you to not need to think about alignment if you don't want to. Closing then, as I don't think there's anything actionable here. Regarding clarifying the docs, PRs to improve them are welcome, feel free to submit one: https://pandas.pydata.org/docs/dev/development/contributing.html
Feature Type
Adding new functionality to pandas
Changing existing functionality in pandas
Removing existing functionality in pandas
Problem Description
Sometimes you can get pretty strange results with simple operations on Series.corr(Series) because of index mismatches. An ignore_index would be very useful.
Feature Description
e.g., s1.corr(s2, ignore_index=True)
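Until something like this exists, the requested behaviour can be approximated with reset_index(), as suggested in the comments. The sketch below is only an illustration: `corr_ignore_index` is a hypothetical helper name, not part of pandas, and the length check is an assumption about what a positional correlation should require.

```python
import pandas as pd


def corr_ignore_index(s1, s2, method="pearson"):
    """Hypothetical helper: correlate two Series by position, not by label."""
    if len(s1) != len(s2):
        # Assumption: a positional correlation only makes sense for equal lengths.
        raise ValueError("Series must have the same length to correlate by position")
    # Dropping both indexes gives each Series a fresh 0..n-1 index,
    # so the label alignment inside corr becomes positional.
    return s1.reset_index(drop=True).corr(s2.reset_index(drop=True), method=method)


# Made-up data: same labels, but stored in reverse order in s2.
s1 = pd.Series([1.0, 2.0, 3.0, 4.0], index=[0, 1, 2, 3])
s2 = pd.Series([10.0, 20.0, 30.0, 40.0], index=[3, 2, 1, 0])

by_label = s1.corr(s2)                   # pairs values by index label
by_position = corr_ignore_index(s1, s2)  # pairs values by position
print(by_label, by_position)
```

Here the two calls disagree in sign, which is exactly the kind of surprise the issue describes: the label-aligned result is negative while the positional one is positive.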
Alternative Solutions
None
Additional Context
No response