
Import statements in period.pyx significantly impact performance #12903

Closed
rs2 opened this issue Apr 15, 2016 · 4 comments
Labels: Performance, Period

@rs2
Contributor

rs2 commented Apr 15, 2016

The following function-level imports in pandas/src/period.pyx run on every call to __init__, which significantly hurts performance when creating many Period objects. A quick win, guys.

    def __init__(self, value=None, freq=None, ordinal=None,
                 year=None, month=1, quarter=None, day=1,
                 hour=0, minute=0, second=0):
        from pandas.tseries import frequencies
        from pandas.tseries.frequencies import get_freq_code as _gfc

        # freq points to a tuple (base, mult);  base is one of the defined
        # periods such as A, Q, etc. Every five minutes would be, e.g.,
        # ('T', 5) but may be passed in as a string like '5T'

Just profile the code below and observe the number of times _find_and_load gets called:

    import pandas as pd

    for _ in range(1000):
        pd.Period('2015-04-26')

Commit bfa8066 introduced the problem.

I will submit a pull request that rectifies the incorrect fix to the circular dependency.
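To illustrate why an import statement inside a hot function is not free, here is a plain-Python sketch. It is an assumption-laden stand-in, not the code from period.pyx and not necessarily what the pull request does: `math.sqrt` plays the role of the frequencies helpers, and `count_import_calls`, `per_call_import` and `cached_import` are hypothetical names. It counts executed import statements by temporarily wrapping `builtins.__import__`, and contrasts the per-call import with one common way to keep a deferred import (for example to dodge a circular dependency) while executing it only once.

```python
import builtins
from contextlib import contextmanager


@contextmanager
def count_import_calls(module_name):
    """Count how often an import statement for module_name is executed."""
    counter = {"calls": 0}
    original_import = builtins.__import__

    def counting_import(name, *args, **kwargs):
        if name == module_name:
            counter["calls"] += 1
        return original_import(name, *args, **kwargs)

    builtins.__import__ = counting_import
    try:
        yield counter
    finally:
        # Always restore the real import hook.
        builtins.__import__ = original_import


def per_call_import(x):
    # Same shape as the issue: the import statement runs on every call.
    from math import sqrt
    return sqrt(x)


_sqrt = None  # module-level cache, filled on first use


def cached_import(x):
    # Hypothetical alternative: still deferred, but executed only once.
    global _sqrt
    if _sqrt is None:
        from math import sqrt
        _sqrt = sqrt
    return _sqrt(x)


with count_import_calls("math") as per_call:
    for _ in range(1000):
        per_call_import(4.0)

with count_import_calls("math") as cached:
    for _ in range(1000):
        cached_import(4.0)

print(per_call["calls"], cached["calls"])
```

The first counter reflects one import statement per call, the second only the initial one. Even when the module is already in sys.modules, each executed import statement still goes through the import machinery, which is the overhead being reported here. Note that compiled Cython code may not route imports through `builtins.__import__`, so this counter is only meant for the plain-Python illustration.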

@sinhrks
Member

sinhrks commented Apr 15, 2016

Thanks for the catch. Moving some of the frequency-related code to offset.pyx is one idea (#11214).

sinhrks added the Performance and Period labels Apr 15, 2016
@jreback
Contributor

jreback commented Apr 16, 2016

So this is the same issue behind #11831.

jreback added this to the 0.18.1 milestone Apr 16, 2016
@rs2
Contributor Author

rs2 commented Apr 16, 2016

@jreback Correct. I'll submit a pull request after lunch. I tested the fix locally and ran the unit tests. It's all good.

@rs2
Contributor Author

rs2 commented Apr 16, 2016

See #12909

