-
-
Notifications
You must be signed in to change notification settings - Fork 18.4k
Fix performance issues when creating multiple instances of Period (#12903, #11831) #12909
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
@@ -0,0 +1,7 @@ | |||
US_RESO = 0 | |||
MS_RESO = 1 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
don't add another module instead put these in frequencies
pls run the asv suite for relevant benchmarks |
needs a whatsnew entry |
also pls show the asv runs for the period benchmarks |
…ps in asv_benchmarks/period.py
@jreback , @tswicegood Is there a doc describing the hoops I need to jump through to get Travis and Jeff happy to merge? 😄 I.e. the names of the scripts I need to run to add asv benchmarks, pass coverage and pep8 requirements? |
@rs2 see @jreback's previous comments:
|
can you show some benchmark results (prob need some asv's for this as well). |
Will submit another pull request |
just force push to this PR |
The failing test is not mine. |
@@ -243,7 +243,7 @@ Performance Improvements | |||
|
|||
|
|||
- Improved performance of ``DataFrame.to_sql`` when checking case sensitivity for tables. Now only checks if table has been created correctly when table name is not lower case. (:issue:`12876`) | |||
|
|||
- Improved performance of creating multiple instances of Period. This has regressed from ``O(1)`` to ``O(n)`` in terms on number of calls to ``_find_and_load`` between ``0.15.2`` and ``0.18.0``. (:issue:`12903`, :issue:`11831`) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
just say improved perf in period construction and plotting of periods
@rs2 that test test runner has been known failing for a while. |
Current coverage is 83.84%@@ master #12909 diff @@
========================================
Files 134 136 +2
Lines 49565 49716 +151
Methods 0 0
Branches 0 0
========================================
+ Hits 41507 41684 +177
+ Misses 8058 8032 -26
Partials 0 0
|
Adding |
|
|
||
def time_period_index(self): | ||
# Simulate irregular PeriodIndex | ||
PeriodIndex(date_range('1985', periods=10000).to_pydatetime(), freq='D') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
just make this 1000 (it will be the same ratio, but the previous versions will take less time)
Nice speedup! Did you run the benchmarks on py2 or py3? |
|
@jorisvandenbossche : plot Before the change
After the change
import numpy as np
import pandas as pd
class test_foo_bar(object):
goal_time = 0.2
N = 2000
def setup(self):
self.date_ranged_df = pd.DataFrame(np.random.randn(self.N, 5), index=pd.date_range('1/1/1975', periods=self.N))
self.plain_df = pd.DataFrame(np.random.randn(self.N, 5))
def time_plot_date_ranged_df(self):
self.date_ranged_df.plot()
def time_plot_plain_df(self):
self.plain_df.plot() |
Anything else needs to be done to be able to merge? |
looks good. ping when green. |
thanks @rs2 |
This was a big TODO I had when moving Period to Cython. Thank you for working it out @rs2. |
closes #12903
closes #11831
git diff upstream/master | flake8 --diff