-
-
Notifications
You must be signed in to change notification settings - Fork 18.4k
BUG: Fix HDFStore empty keys on native HDF5 file by adding keyword include #32723
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
BUG: Fix HDFStore empty keys on native HDF5 file by adding keyword include #32723
Conversation
Hello @roberthdevries! Thanks for updating this PR. We checked the lines you've touched for PEP 8 issues, and found: There are currently no PEP 8 issues detected in this Pull Request. Cheers! 🍻 Comment last updated at 2020-06-14 21:02:12 UTC |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
looks fine. does this solve the OP? meaning folks simply want to use .keys() to get all the tables, then use .get() to read them?
pandas/io/pytables.py
Outdated
if include == "pandas": | ||
return [n._v_pathname for n in self.groups()] | ||
|
||
if include == "native": |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
use elif
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not really needed in this case as the previous if returns in all cases. Why would that help us?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can you change this (its more idiomatic)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fixed
pandas/io/pytables.py
Outdated
@@ -580,16 +580,37 @@ def __enter__(self): | |||
def __exit__(self, exc_type, exc_value, traceback): | |||
self.close() | |||
|
|||
def keys(self) -> List[str]: | |||
def keys(self, include: Optional[str] = "pandas") -> List[str]: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is not Optional, just str (alternatively, we could actually do Optiona[str] = None
and then allow passing include='native')
but find either way
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Removed the Optional
annotation
025a348
to
001be86
Compare
pandas/io/pytables.py
Outdated
Parameters | ||
---------- | ||
|
||
.. versionadded:: 1.1.0 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same comment on other PR - put below include and indent one level (I think will fix CI)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fixed
3e114be
to
020fc5b
Compare
We could also merge the two sets of tables. I think pandas does not use native HDF5 tables to store DataFrames. In that case the two sets are disjoint and we should be fine returning the joined set. |
I tried the variant with returning the combination of keys of pandas tables and HDF5 native tables, and the unit test |
4ee5d7e
to
ba1a8ca
Compare
83bce97
to
151aa4f
Compare
151aa4f
to
8ce3475
Compare
Hi there, this is interesting, how did this test pass before the
Hopefully, thanks for working on it. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
looks fine, pls merge master and ping on green
pandas/io/pytables.py
Outdated
if include == "pandas": | ||
return [n._v_pathname for n in self.groups()] | ||
|
||
if include == "native": |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can you change this (its more idiomatic)
An additional kind parameter has been added that defaults to pandas original behavior, but with 'tables' value gives you the list of non-pandas tables in the file
- improve type annotation of the HDFStore.keys method - minor improvement in ValueError string - minor improvement in doc-string
8ce3475
to
51d632d
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
don't you need this also in .groups()
?
pandas/io/pytables.py
Outdated
include : str, default 'pandas' | ||
When kind equals 'pandas' return pandas objects | ||
When kind equals 'native' return native HDF5 Table objects | ||
Otherwise fail with a ValueError |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this line is not needed (the otherwise fail)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Removed
Could be added there as well, although that use case is less typical. Most users are interested in the tables in the file. |
@jreback tests should have run fine, except that one of the tests took too long I think. So this is my ping on sort of green |
thanks @roberthdevries |
thanks @roberthdevries and @jreback for resolving this issue for keys, it will help my project a lot. @roberthdevries the source files coming into my system use groups (not created by pandas) so I would need the same fix applying to .groups() as well as .keys() if at all possible. |
black pandas
git diff upstream/master -u -- "*.py" | flake8 --diff