Updating HDFStore in place #6857
Comments
The modify methods are not implemented; see the PyTables table-writing methods (http://pytables.github.io/usersguide/libref/structured_storage.html#table-methods-writing). It's not conceptually hard to do this, though there might be some details, as the data needs to be coerced in the same way as when writing. Note that the data has to be exactly the same dtype (otherwise PyTables will raise), as it is simply an overwrite. If you have a small amount of data relative to a big set, then it does make sense. I don't do this myself, as it makes for 'cleaner' stores to simply write the data again (from a code perspective), and it makes the store effectively read-only. Want to take a stab? |
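To make the "simply an overwrite" point concrete, here is a minimal sketch that uses PyTables directly rather than HDFStore. The table name `series`, the two-column layout, and the file path are made up for illustration; pandas lays out its own tables differently, so this only shows the underlying mechanism, not how an HDFStore method would be written.

```python
import os
import tempfile

import numpy as np
import tables

# Hypothetical row layout: an integer timestamp and a float value.
row_dtype = np.dtype([("ts", "i8"), ("value", "f8")])

path = os.path.join(tempfile.mkdtemp(), "modify_demo.h5")
with tables.open_file(path, mode="w") as h5:
    table = h5.create_table("/", "series", description=row_dtype)
    rows = np.zeros(10, dtype=row_dtype)
    rows["ts"] = np.arange(10)
    table.append(rows)
    table.flush()

    # Row numbers (coordinates) of the rows to overwrite.
    coords = table.get_where_list("(ts >= 3) & (ts < 6)")

    # Replacement rows must use exactly the table's dtype.
    new_rows = np.zeros(len(coords), dtype=row_dtype)
    new_rows["ts"] = rows["ts"][coords]
    new_rows["value"] = 99.0
    table.modify_coordinates(coords, new_rows)
    table.flush()

    result = table.read()
```

Only the three selected rows are rewritten on disk; the other seven are left untouched.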
Sure, I think this would be a great thing to have. Right now loading/rewriting 90k points to update 24 points doesn't make much sense. |
gr8! I think maybe a signature something like:
value is a frame, the same length as the indexer. You get the indexer by effectively doing a [...] I would start simple and have the user pass the coordinates back in. |
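The proposed signature is not preserved in this thread, so the following is only a sketch of the inputs such a method would need. It assumes the coordinates come from the existing `HDFStore.select_as_coordinates` method (whether that is the call meant above is a guess), and it stops short of the write itself, since no modify method exists on HDFStore. The key `series` and the sample data are made up.

```python
import os
import tempfile

import numpy as np
import pandas as pd

path = os.path.join(tempfile.mkdtemp(), "coords_demo.h5")
df = pd.DataFrame(
    {"value": np.arange(10.0)},
    index=pd.date_range("2014-01-01", periods=10, freq="D"),
)

with pd.HDFStore(path, mode="w") as store:
    store.append("series", df)  # append() writes in table format

    # Row coordinates of the points that would be overwritten.
    coords = store.select_as_coordinates(
        "series", "index >= '2014-01-04' & index < '2014-01-07'"
    )
    # The same coordinates can be passed back in to read just those rows.
    current = store.select("series", where=coords)

# The replacement frame: same length as the indexer, same dtypes as stored.
value = current.copy()
value["value"] = [40.0, 50.0, 60.0]
```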
And I think it would be nice to have some logic here if the data doesn't exist to append so only one update method needs to be called and it will append or modify appropriately. |
Make a modify method first; then we can think about adding the append/modify logic. |
Hello, I'm wondering if there's been any progress on adding .modify() to HDFStore. It'd be such a boon for me. Thanks! |
Unfortunately there hasn't. I agree that it would be great to add. I have something started, but not even worth pushing somewhere at this point. |
I would be interested in this too. But this seems dead. |
see my comment above, you can implement the |
Yes I saw that. It seems that HDF5 would not work well for large data flows where the data is modified frequently even if someone were to write store.modify. P.S. I can take this to the mailing list if you prefer. |
No, this is specifically for HDF5. You generally don't want to modify data; you simply append. That's how it's designed, and it's what makes it performant. You can already do this with SQL if you really want. |
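One way to live with an append-only store is to append corrections as new rows and resolve duplicates when reading. This pattern is not spelled out in the thread; it is a sketch that assumes later-appended rows should win, and the key `series` and the sample data are made up.

```python
import os
import tempfile

import numpy as np
import pandas as pd

path = os.path.join(tempfile.mkdtemp(), "append_demo.h5")
existing = pd.DataFrame(
    {"value": np.arange(10.0)},
    index=pd.date_range("2014-01-01", periods=10, freq="D"),
)
# One corrected point (2014-01-05) and one new point (2014-01-11).
update = pd.DataFrame(
    {"value": [40.0, 110.0]},
    index=pd.to_datetime(["2014-01-05", "2014-01-11"]),
)

with pd.HDFStore(path, mode="w") as store:
    store.append("series", existing)
    store.append("series", update)  # nothing already on disk is rewritten
    all_rows = store.select("series")

# Keep the most recently appended row for each timestamp.
latest = all_rows[~all_rows.index.duplicated(keep="last")].sort_index()
```

The store grows with every correction, so this trades disk space and read-time work for cheap writes.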
Can you point me to this? |
http://pandas.pydata.org/pandas-docs/stable/io.html#io-sql you would simply do some kind of an update query. |
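As a rough illustration of "some kind of an update query", here is a sketch using only the standard-library `sqlite3` module. The table name, column names, and data are made up, and the insert-if-missing step is an addition that mirrors the append-or-modify behaviour asked for earlier in the thread.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE series (ts TEXT PRIMARY KEY, value REAL)")
conn.executemany(
    "INSERT INTO series (ts, value) VALUES (?, ?)",
    [(f"2014-01-{day:02d}", float(day)) for day in range(1, 11)],
)

# Two corrections to existing points and one new point.
updates = [("2014-01-04", 40.0), ("2014-01-05", 50.0), ("2014-01-11", 110.0)]

with conn:  # commits on success, rolls back on error
    for ts, value in updates:
        cur = conn.execute(
            "UPDATE series SET value = ? WHERE ts = ?", (value, ts)
        )
        if cur.rowcount == 0:
            # No existing row for this timestamp, so append it instead.
            conn.execute(
                "INSERT INTO series (ts, value) VALUES (?, ?)", (ts, value)
            )

rows = conn.execute("SELECT ts, value FROM series ORDER BY ts").fetchall()
```

The updated table can then be read back into a DataFrame with the pandas SQL functions described at the link above.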
Currently I use the HDF5 interface to store timeseries and it works great for selection. However, the only way I see to update them is to select the existing stored timeseries and then merge the update with the existing data. This can obviously be expensive if the existing series is large. Is it possible to update a store by just passing in the new data? If so, is there an example somewhere?
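For reference, the select-and-merge workaround described above looks roughly like this. It is a minimal sketch with made-up data and a made-up key `series`; `combine_first` is one of several ways to do the merge, chosen here because values from the update take precedence.

```python
import os
import tempfile

import numpy as np
import pandas as pd

path = os.path.join(tempfile.mkdtemp(), "rewrite_demo.h5")
existing = pd.DataFrame(
    {"value": np.arange(10.0)},
    index=pd.date_range("2014-01-01", periods=10, freq="D"),
)
# One corrected point (2014-01-05) and one new point (2014-01-11).
update = pd.DataFrame(
    {"value": [40.0, 110.0]},
    index=pd.to_datetime(["2014-01-05", "2014-01-11"]),
)

with pd.HDFStore(path, mode="w") as store:
    store.put("series", existing, format="table")

    # Load everything, merge in memory, then write the whole series back.
    stored = store.select("series")
    merged = update.combine_first(stored)
    store.put("series", merged, format="table")

    result = store.select("series")
```

The cost is proportional to the size of the stored series, not to the size of the update, which is the expense the question is about.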