BUG: item_cache not cleared on DataFrame.values #34999

Merged: 3 commits merged into pandas-dev:master from the ref-is_mixed_dtype-2 branch on Jun 26, 2020

Conversation

jbrockmendel
Member

  • closes #xxxx
  • tests added / passed
  • passes black pandas
  • passes git diff upstream/master -u -- "*.py" | flake8 --diff
  • whatsnew entry

jreback added the Performance and Clean labels Jun 26, 2020
jreback added this to the 1.1 milestone Jun 26, 2020
@jreback
Contributor

jreback commented Jun 26, 2020

any perf impact? (i would actually expect some small regressions)

@jbrockmendel
Member Author

In [1]: import pandas as pd

In [2]: df = pd.DataFrame([1, 2])

In [3]: %timeit df.values
2.25 µs ± 42.9 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)  # <-- master
3.82 µs ± 106 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)  # <-- PR

@jreback
Contributor

jreback commented Jun 26, 2020

hmm, do we have any to_numpy() benchmarks? just to see; not blocking this PR

@jbrockmendel
Member Author

> do we have any to_numpy() benchmarks

Looks like we do not

@jreback
Contributor

jreback commented Jun 26, 2020

> > do we have any to_numpy() benchmarks
>
> Looks like we do not

kk, happy to merge and can create an issue for benchmarks (as a good first issue; people like to do those)
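
For context, a to_numpy() benchmark of the kind discussed here might look roughly like the sketch below. It assumes the asv convention used by the pandas benchmark suite: a plain class whose setup() runs before timing and whose time_* methods are the timed bodies. The class name ToNumpy, the frame shapes, and the method names are hypothetical and are not taken from the pandas repository.

```python
import numpy as np
import pandas as pd


class ToNumpy:
    # Hypothetical benchmark class; names and sizes are illustrative only.

    def setup(self):
        # asv runs setup() before timing, so frame construction is not measured.
        self.df_wide = pd.DataFrame(np.random.randn(100, 1000))
        # Mixed int/float columns force a dtype upcast when converting to one array.
        self.df_mixed = pd.DataFrame(
            {"a": np.arange(10_000), "b": np.random.randn(10_000)}
        )

    def time_to_numpy_wide(self):
        self.df_wide.to_numpy()

    def time_values_wide(self):
        # Same conversion through the .values attribute touched by this PR.
        self.df_wide.values

    def time_to_numpy_mixed(self):
        self.df_mixed.to_numpy()
```

Timing both .values and to_numpy() on the same frame would make it easier to see whether a change affects one access path or both.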

@jbrockmendel
Member Author

issue opened

jreback merged commit dbfbef7 into pandas-dev:master Jun 26, 2020
@jreback
Contributor

jreback commented Jun 26, 2020

thanks

jbrockmendel deleted the ref-is_mixed_dtype-2 branch June 27, 2020 00:21
fangchenli pushed a commit to fangchenli/pandas that referenced this pull request Jun 27, 2020
@TomAugspurger
Contributor

@jbrockmendel
Member Author

> caused a 100x slowdown in replace.Convert.time_replace [...] Is that expected?

I'd expect something closer to the 70% slowdown we see in df.values. 100x is surprising.
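
The 70% figure follows from the %timeit numbers posted earlier in this thread (2.25 µs on master, 3.82 µs on the PR). A quick check of the arithmetic, with variable names chosen here purely for illustration:

```python
# Timings quoted earlier in this thread for `%timeit df.values`.
master_us = 2.25  # master
pr_us = 3.82      # this PR

ratio = pr_us / master_us          # how many times slower the PR is
slowdown_pct = (ratio - 1) * 100   # relative slowdown in percent

print(f"{ratio:.2f}x, about {slowdown_pct:.0f}% slower")
# prints "1.70x, about 70% slower"
```

A 100x ratio in replace.Convert.time_replace is therefore far outside what the df.values overhead alone would suggest.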

@TomAugspurger
Contributor

Opened #35053 to track this.

@phs-sakshi

Hey I am new to open source. Can I take up this issue?

@jorisvandenbossche
Member

@B417037 this is a merged PR, not an issue to take up. Have a look at the issues labeled "good first issue".

Labels: Clean, Performance