Feature: store.select should warn about unimplemented operators #2973
can u post a couple of sample rows from df |
Here's the store:
<class 'pandas.io.pytables.HDFStore'>
File path: /Users/maye/data/marszoo/planet_four_classifications_2013-02-23.h5
/df frame_table (typ->appendable,nrows->9275468,ncols->17,indexers->[index],dc->[image_id,image_name,user_name,marking])
Here's df.info():
<class 'pandas.core.frame.DataFrame'>
Int64Index: 9275468 entries, 0 to 9275467
Data columns:
classification_id 9275468 non-null values
created_at 9275468 non-null values
image_id 9275468 non-null values
image_name 9275468 non-null values
user_name 9275468 non-null values
marking 9275468 non-null values
x_tile 9275468 non-null values
y_tile 9275468 non-null values
x 9275468 non-null values
y 9275468 non-null values
image_x 9275468 non-null values
image_y 9275468 non-null values
radius_1 9275468 non-null values
radius_2 9275468 non-null values
distance 9275468 non-null values
angle 9275468 non-null values
spread 9275468 non-null values
dtypes: object(17)
and here are a couple of lines with that 'none' entry (note that everything from column 'x' is empty for those lines):
df_index classification_id created_at image_id image_name user_name marking x_tile y_tile x y image_x image_y radius_1 radius_2 distance angle spread
23 50eace09e39956220600081f 2013-01-07 13:30:49 APF00008jy ESP_012265_0950 lukesmith none 5 2
41 50eaf01ae3995621fc00093e 2013-01-07 15:56:10 APF00007su ESP_012604_0965 not-logged-in none 1 2
42 50eaf01ee3995621d300010a 2013-01-07 15:56:14 APF00003bv ESP_011460_0980 not-logged-in none 2 1
44 50eaf03a45d7e142f50000cb 2013-01-07 15:56:42 APF000030j ESP_011900_0985 not-logged-in none 5 1
45 50eaf03ce3995621d3000117 2013-01-07 15:56:44 APF0000p3r ESP_020150_0950 not-logged-in none 7 1 |
sorry, I think df.info() doesn't show me what I need. can u show df.get_dtype_counts()? x is an object column (string)? |
fixed! ty, had an incorrect implementation... thanks for the debugging! fyi, you can put np.nan in your string columns and they will come back that way (rather than ''). only issue is I can't think of a nice way to select on them, e.g.
|
BUG: HDFStore didn't implement != correctly for string columns query, GH #2973
Thanks, amazing fast response! ;) About np.nan, I don't understand where I could 'put np.nan in my string'? Do you mean, that the store.select will treat empty strings as np.nans and return np.nan if I tried to filter for it? But that might not always be what the user wants? Maybe I get you wrong. I can see 4 scenarios:
Does that help? |
you can also specify terms like Term('x','=',value). value could be a string/list/object (useful when, say, u are comparing against a date)
|
in any event, let me know if u have any more issues. Jeff
|
it does not seem to be working yet: |
afba0 is the current commit, your version references the prior one |
Hm, am I doing something wrong?
(master)[maye@lunatic ~/Dropbox/src/pandas]$ git log -1
commit afba0d0e401a294f83cde3df4609179f64f165b2
Merge: 3790f16 197b3c7
Author: jreback <[email protected]>
Date: Tue Mar 5 19:04:09 2013 -0800
Merge pull request #2976 from jreback/pytables_select
BUG: HDFStore didn't implement != correctly for string columns query, GH #2973
(master)[maye@lunatic ~/Dropbox/src/pandas]$ git pull upstream master
From https://github.com/pydata/pandas
* branch master -> FETCH_HEAD
Already up-to-date. |
that looks right. try in your git dir with ipython at the command line
|
did you get this to work? |
yes, finally. I think it was not found because I tried to install via |
great never used egg install |
but: what about the initial idea of this feature request? ;) |
do you have a case where it fails? (aside from the != and bool cases we fixed) the issue is that I think we always do a valid conversion, so how do I know when it's invalid? |
Doh, sorry, it didn't click for me that all is covered now. (BTW, the website needs to include the |
you can't specify an OR explicitly (yet!), are you doing this somehow? you basically just concat the results of 2 queries together (in docs under advanced) |
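The concat workaround mentioned above could look roughly like the sketch below. It is only an illustration: the small in-memory frame stands in for the stored table, the boolean masks stand in for two separate store.select(...) calls, and the sample values ("fan", "blotch", "someone") are made up. Only the column names come from this thread.

```python
import pandas as pd

# Hypothetical stand-in for the stored frame; the values are invented.
df = pd.DataFrame(
    {
        "user_name": ["lukesmith", "not-logged-in", "not-logged-in", "someone"],
        "marking": ["none", "none", "fan", "blotch"],
    }
)

# With a real store these would be two separate store.select(...) calls,
# one per condition. Boolean masks stand in for them here.
by_user = df[df["user_name"] == "lukesmith"]
by_marking = df[df["marking"] == "none"]

# Concatenate both results, then drop rows that matched both conditions.
# They show up twice under the same index label.
either = pd.concat([by_user, by_marking])
either = either[~either.index.duplicated(keep="first")].sort_index()
print(either)
```

Dropping duplicates on the index only works when the index labels are unique in the stored table, which is the case for the 0 to 9275467 index shown earlier.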
No, I'm not hacking it somehow together. ;) |
thanks. but... added in another PR. feel free to review the docs for anything missing/confusing! |
@michaelaye shall we close this ? do you have a case where the selection should warn? (I am guessing if you did then we would have fixed it!) |
yep. thanks! |
Currently, when doing this (with object dtypes)
one receives exactly the same object as when doing
The suggestion is that the query parser warns about the unknown or unimplemented operator.
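To make the suggestion concrete, here is a minimal sketch of a term parser that warns on an operator it does not know. It is not how HDFStore parses queries: parse_term, KNOWN_OPS, and the regular expression are hypothetical names for illustration, and the set of supported operators is an assumption.

```python
import re
import warnings

# Operators this toy parser understands. This set is an assumption,
# not the list that HDFStore actually supports.
KNOWN_OPS = {"=", "==", "!=", ">", ">=", "<", "<="}

# A field name, then a run of operator characters, then the value.
_TERM_RE = re.compile(r"^\s*(\w+)\s*([=!<>~]+)\s*(.+?)\s*$")


def parse_term(expr):
    """Split 'field op value'; warn and return None for an unknown operator."""
    match = _TERM_RE.match(expr)
    if match is None:
        raise ValueError(f"cannot parse term: {expr!r}")
    field, op, value = match.groups()
    if op not in KNOWN_OPS:
        warnings.warn(
            f"operator {op!r} in {expr!r} is not implemented; term ignored",
            UserWarning,
        )
        return None
    return field, op, value


print(parse_term("marking != none"))  # known operator: parsed into a tuple
print(parse_term("marking =! none"))  # unknown operator: warns, yields None
```

The point of the sketch is the explicit membership check: an unrecognized operator produces a visible warning instead of silently returning the same result as an unfiltered select.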