Memory usage for pandas.read_xml #362
Comments
I wouldn't invest too much time into this. 4G for parsing a ~1 Mio-line microscope-acquisition file is not unreasonable. See:
=> let's increase this default to 4G. It could likely be optimized further, but the potential gain is not really worth the time investment at the moment.
Referenced commit (…age-for-pandasread_xml): Increase memory requirements for create-ome-zarr tasks (close #362)
Now updated in fractal-tasks-core 0.9.2.
We have examples where create-ome-zarr goes out of memory when the limit is set to 1G or 2G, e.g. for an XML file of 160k lines (see fractal-analytics-platform/fractal-server#599 (comment)).
Maybe it's worth checking that we are using `pandas.read_xml` correctly. We can quickly debug the memory usage of this function, and possibly look around for known issues (pandas-dev/pandas#45442 may be related). If all looks reasonable on the XML-parsing side, should we set a more generous default memory in the manifest? It's a non-parallel task, and it should be simple for SLURM to schedule it even if it requires 4G.
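A quick way to debug the memory usage mentioned above is to wrap the `pandas.read_xml` call in `tracemalloc`. The sketch below is illustrative only: the row count, column names, and synthetic XML layout are assumptions, not taken from the actual acquisition files, and `tracemalloc` only tracks Python-heap allocations (C-level allocations by the XML parser are not counted, so the real resident memory can be higher).

```python
# Sketch: measure Python-heap peak memory of pandas.read_xml on a small,
# synthetic XML file. N_ROWS and the <row>/<id>/<value> layout are
# illustrative assumptions, not the real create-ome-zarr input format.
import tempfile
import tracemalloc

import pandas as pd

N_ROWS = 10_000  # scale up to approximate the ~160k-line file from the issue

# Build a simple XML document with repeated <row> elements.
rows = "".join(
    f"<row><id>{i}</id><value>{i * 2}</value></row>" for i in range(N_ROWS)
)
xml_doc = f"<data>{rows}</data>"

with tempfile.NamedTemporaryFile(
    "w", suffix=".xml", delete=False
) as f:
    f.write(xml_doc)
    path = f.name

tracemalloc.start()
# parser="etree" uses the stdlib parser, avoiding the lxml dependency.
df = pd.read_xml(path, parser="etree")
current, peak = tracemalloc.get_traced_memory()
tracemalloc.stop()

print(f"rows parsed: {len(df)}, peak Python-heap usage: {peak / 1e6:.1f} MB")
```

If the parse itself turns out to dominate, pandas >= 1.5 also offers an `iterparse` argument to `read_xml` (e.g. `pd.read_xml(path, iterparse={"row": ["id", "value"]})`) that streams elements instead of building the whole document tree, which can reduce peak memory for large files.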