
BUG: HDFStore.select_as_multiple doesn't respect start/stop kwargs #16209


Closed
JosephWagner opened this issue May 3, 2017 · 4 comments · Fixed by #16317
Labels
Bug IO HDF5 read_hdf, HDFStore

Comments

@JosephWagner
Contributor

import pandas as pd

df = pd.DataFrame({"foo": [1, 2], "bar": [1, 2]})

with pd.HDFStore("foo.h5", 'w') as store:
    store.append_to_multiple({'selector': ['foo'], 'data': None}, df, selector='selector')
    single_row = store.select_as_multiple(['selector', 'data'], selector='selector', start=0, stop=1)
    assert len(single_row) == 1, "requested 1 row, got back {}".format(len(single_row))

Currently, select_as_multiple returns the entire table. With start=0 and stop=1, I would expect just one row to be returned.
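For illustration, the expected result can be written as a plain row slice. This is a minimal sketch that assumes no where clause is involved, so that start/stop act purely as row-number bounds; `expected_rows` is a hypothetical helper, not part of pandas, and it operates on the in-memory frame rather than on the store.

```python
import pandas as pd

df = pd.DataFrame({"foo": [1, 2], "bar": [1, 2]})


def expected_rows(frame, start=None, stop=None):
    """Hypothetical helper: rows a start/stop-limited read should return
    when no `where` clause is given (half-open range [start, stop))."""
    return frame.iloc[start:stop]


# start=0, stop=1 should yield exactly the first row
single_row = expected_rows(df, start=0, stop=1)
assert len(single_row) == 1

# omitting both bounds should yield the whole table
assert len(expected_rows(df)) == len(df)
```

On an affected version, applying the same slice to the frame returned by select_as_multiple may serve as a stopgap, though it still reads the whole table first.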

INSTALLED VERSIONS
------------------
commit: None
python: 2.7.12.final.0
python-bits: 64
OS: Linux
OS-release: 2.6.32-642.15.1.el6.centos.plus.x86_64
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
LOCALE: None.None

pandas: 0.19.2
nose: None
pip: 9.0.1
setuptools: 32.3.1.post20170108
Cython: 0.25.2
numpy: 1.11.3
scipy: 0.18.1
statsmodels: 0.8.0
xarray: 0.8.2
IPython: 5.1.0
sphinx: 1.5.3
patsy: 0.4.1
dateutil: 2.6.0
pytz: 2016.10
blosc: None
bottleneck: 1.2.0
tables: 3.4.0
numexpr: 2.6.1
matplotlib: None
openpyxl: None
xlrd: 1.0.0
xlwt: None
xlsxwriter: 0.9.6
lxml: None
bs4: 4.5.3
html5lib: 0.999
httplib2: None
apiclient: None
sqlalchemy: 1.1.5
pymysql: 0.7.9.None
psycopg2: None
jinja2: 2.8
boto: None
pandas_datareader: None

@jreback
Contributor

jreback commented May 3, 2017

yep, looks like https://github.com/pandas-dev/pandas/blob/master/pandas/io/pytables.py#L829 needs to pass start/stop as well.

pull-requests welcome!

@jreback jreback added this to the Next Major Release milestone May 3, 2017
@JosephWagner
Contributor Author

Hmm, so I'm attempting to submit a PR. I've added a new test that mirrors the example above, but when I insert start/stop into line 829, I get the following test failure. I've spent some time digging around, but I can't quite figure out why this isn't working. It looks like this exception is raised here, which means the 'where' variable (an Int64Index) is passed to self.generate. I don't know enough about the code to know whether that's expected.

self = [0, 1]

    def evaluate(self):
        """ create and return the numexpr condition and filter """
    
        try:
            self.condition = self.terms.prune(ConditionBinOp)
        except AttributeError:
            raise ValueError("cannot process expression [{0}], [{1}] is not a "
>                            "valid condition".format(self.expr, self))
E           ValueError: cannot process expression [[0 1]], [[0, 1]] is not a valid condition

@jreback
Contributor

jreback commented May 6, 2017

You have to step through it. This is an evaluation tree that can go pretty deep. However, I think all you need to do is make the change I suggested above.

@JosephWagner
Contributor Author

Ahh, I think I figured it out. I also needed to pass self.start/stop to this function call here: https://github.com/pandas-dev/pandas/blob/master/pandas/io/pytables.py#L1423

I'll submit a PR soon!
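The shape of the problem discussed in this thread can be sketched with a toy model: row coordinates are computed first, then every table is read at those coordinates. This is only an illustration under stated assumptions. Plain lists stand in for HDF5 tables, and all function names below are hypothetical; none of this is the actual pandas/io/pytables.py code.

```python
# Toy model only: lists of dicts stand in for HDF5 tables, and every
# function name here is hypothetical rather than taken from pandas.

def read_coordinates(n_rows, start=None, stop=None):
    """Row numbers of a table with n_rows rows, limited to [start, stop)."""
    return list(range(n_rows))[start:stop]


def read_rows(tables, coords):
    """Read each table at the given row numbers and merge columns per row."""
    rows = []
    for i in coords:
        row = {}
        for table in tables.values():
            row.update(table[i])
        rows.append(row)
    return rows


def select_as_multiple_buggy(tables, selector, start=None, stop=None):
    # start/stop are accepted but never forwarded to the coordinate read,
    # so the coordinates cover the entire selector table
    coords = read_coordinates(len(tables[selector]))
    return read_rows(tables, coords)


def select_as_multiple_fixed(tables, selector, start=None, stop=None):
    # forwarding start/stop limits the coordinates before any table is read
    coords = read_coordinates(len(tables[selector]), start=start, stop=stop)
    return read_rows(tables, coords)


tables = {
    "selector": [{"foo": 1}, {"foo": 2}],
    "data": [{"bar": 1}, {"bar": 2}],
}

buggy = select_as_multiple_buggy(tables, "selector", start=0, stop=1)
fixed = select_as_multiple_fixed(tables, "selector", start=0, stop=1)
```

In this model the buggy variant returns both rows while the fixed variant returns only the first, which mirrors the behavior reported in the original example.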

@jreback jreback modified the milestones: 0.20.2, Next Major Release May 10, 2017
JosephWagner pushed a commit to JosephWagner/pandas that referenced this issue May 31, 2017