Optimize with dask #1981
Need to check if the memory is big enough to be able to pass the processed dataframe in memory otherwise need to look into chunking.
-------
'''
filename = Path(filepath).name
logger.info(f"Processing {filename}")
nitpick: filepath might be more helpful, as it includes input dir.
- logger.info(f"Processing {filename}")
+ logger.info(f"Processing {filepath}")
def update_sensor(
- filepath, startdate, enddate, dropdate, geo, parallel,
- weekday, se, logger
+ data:pd.DataFrame, startdate:datetime, enddate:datetime, dropdate:datetime, geo:str, parallel: bool,
👍 we should start doing type specification.
yeah, but I think that should be a separate ticket maybe? Don't want to add more PRs/confusion within an already scoped-out feature.
e919fda to 935b7dd
935b7dd to d1ee4ce
doctor_visit_EDI_AGG_OUTPATIENT_26062024_1455CDT.log
doctor_visit_refactored_EDI_AGG_OUTPATIENT_26062024_1455CDT.log
doctor_visit_refactored_using_dask_7b6e0764_EDI_AGG_OUTPATIENT_26062024_1455CDT.log
!WORK IN PROGRESS! Still need to add tests and do more optimization.
Description
Continuing to optimize doctors_visit.
Main ran: 569
doctor_visit_refactor: 3135
optimize_with_dask (this branch): 174
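The docstring note in the diff mentions falling back to chunking if the processed dataframe does not fit in memory. A minimal sketch of that fallback, using plain pandas rather than the dask approach in this branch, might look like the following. The function name `sum_by_key` and the column names `geo` and `visits` are hypothetical and only serve the illustration; they are not taken from the PR.

```python
import io

import pandas as pd


def sum_by_key(csv_source, key, value, chunksize=100_000):
    """Sum `value` per `key` while reading the CSV in fixed-size chunks.

    Hypothetical helper: only one chunk is held in memory at a time,
    plus the running per-key totals.
    """
    totals = None
    # With chunksize set, read_csv yields DataFrames of at most that many rows.
    for chunk in pd.read_csv(csv_source, chunksize=chunksize):
        partial = chunk.groupby(key)[value].sum()
        if totals is None:
            totals = partial
        else:
            # fill_value=0 keeps keys that appear in only one of the two Series.
            totals = totals.add(partial, fill_value=0)
    return totals


# Tiny in-memory stand-in for an input file (made-up data).
sample = io.StringIO("geo,visits\npa,1\nny,2\npa,3\n")
result = sum_by_key(sample, key="geo", value="visits", chunksize=2)
```

This only works directly for aggregations that can be combined across chunks (sums, counts); anything that needs the full dataframe at once would still require either enough memory or a framework such as dask.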
Changelog
Itemize code/test/documentation changes and files added/removed.
Fixes