Skip to content

BUG: Ensure min_itemsize is always a list (#11412) #14728

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 3 commits into from

Conversation

toobaz
Copy link
Member

@toobaz toobaz commented Nov 24, 2016

(I plan to fix #10381 and #12154 next, but first we need coherence on type(data_columns) - and Index seems to me the best option.)

@codecov-io
Copy link

codecov-io commented Nov 24, 2016

Current coverage is 85.21% (diff: 50.00%)

Merging #14728 into master will decrease coverage by 0.05%

@@             master     #14728   diff @@
==========================================
  Files           144        143     -1   
  Lines         50947      50804   -143   
  Methods           0          0          
  Messages          0          0          
  Branches          0          0          
==========================================
- Hits          43445      43293   -152   
- Misses         7502       7511     +9   
  Partials          0          0          

Powered by Codecov. Last update 5d0e157...1f70a6e

@jreback
Copy link
Contributor

jreback commented Nov 24, 2016

this is technically an API change because we currently store the data_columns as a pickled list (there is a separate issues on actually storing this as a table). I am surpised this didn't break.

So i would prefer to keep this as a list for now, though see if you can create a tests which fails on this.

@jreback jreback added the IO HDF5 read_hdf, HDFStore label Nov 24, 2016
@jreback
Copy link
Contributor

jreback commented Nov 24, 2016

the issue you reference is closed. so what issue is this fixing?

@toobaz
Copy link
Member Author

toobaz commented Nov 24, 2016

Regarding the issue: yes, it is closed, but not solved, it is different from the #11364 you referred to. I can open a new issue if you prefer. But the thing is: I am not solving #11364 here, I am solving #11412, which is different.

Regarding the data_columns storage: I had missed that indeed, can you give me a pointer to understand more (I don't see that use of pickle in pytables.py)?

My motivation was that in some cases data_columns already is an index (e.g. at line 4257 of pytables.py), but then I can certainly fix that (as I was doing in the previous #12252).

@jreback
Copy link
Contributor

jreback commented Nov 24, 2016

then are you solving this? #10381

@jreback
Copy link
Contributor

jreback commented Nov 24, 2016

https://github.com/pandas-dev/pandas/blob/master/pandas/io/pytables.py#L3129

you would need to be pretty careful about this (though not 100% sure it actually matters). It would be nice to be consistent, though everything is a list. yes Index has the same semantics. but I don't want to change this unless it stays consistent (which is already a list for almost everything else).

so bottom line is, keep it a list. If you have a repro where its NOT a list. then pls show that.

@toobaz
Copy link
Member Author

toobaz commented Nov 24, 2016

OK, new version works with lists rather than indices.

No, I'm not fixing #10381, otherwise I wouldn't have written that I plan to fix it when I opened the PR ...

@toobaz
Copy link
Member Author

toobaz commented Nov 25, 2016

The AppVeyor failure is a bug in AppVeyor, right?

@jreback
Copy link
Contributor

jreback commented Nov 25, 2016

I restarted. it sometimes does get stuck.

@jorisvandenbossche jorisvandenbossche changed the title Minitemsizefix BUG: Ensure min_itemsize is always a list (#11412) Nov 25, 2016
@toobaz
Copy link
Member Author

toobaz commented Nov 27, 2016

again...

@toobaz
Copy link
Member Author

toobaz commented Dec 3, 2016

ping




- Bug in ``HDFStore.append()`` with Series and 'index' appearing in ``min_itemsize`` (:issue:`11412`)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

a little bit more verbose as I don't understand what you are fixing.

@@ -4254,7 +4254,7 @@ def write(self, obj, data_columns=None, **kwargs):
if data_columns is None:
data_columns = []
elif data_columns is True:
data_columns = obj.columns[:]
data_columns = list(obj.columns)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

was just fixed in #14791

@@ -4153,7 +4153,7 @@ def write(self, obj, data_columns=None, **kwargs):
obj = DataFrame({name: obj}, index=obj.index)
obj.columns = [name]
return super(AppendableSeriesTable, self).write(
obj=obj, data_columns=obj.columns, **kwargs)
obj=obj, data_columns=list(obj.columns), **kwargs)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

use obj.columns.tolist() to be consistent





Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

put underneath the whatsnew for #14791

@toobaz
Copy link
Member Author

toobaz commented Dec 5, 2016

Stuck again

@jreback jreback added this to the 0.19.2 milestone Dec 5, 2016
@jreback jreback added the Bug label Dec 5, 2016
@jreback
Copy link
Contributor

jreback commented Dec 5, 2016

thanks!

@jreback jreback closed this in 53bf1b2 Dec 5, 2016
@toobaz toobaz deleted the minitemsizefix branch December 6, 2016 07:14
jorisvandenbossche pushed a commit that referenced this pull request Dec 15, 2016
closes #11412

Author: Pietro Battiston <[email protected]>

Closes #14728 from toobaz/minitemsizefix and squashes the following commits:

e25cd1f [Pietro Battiston] Whatsnew
b9bb88f [Pietro Battiston] Tests for previous commit
6406ee8 [Pietro Battiston] BUG: Ensure min_itemsize is always a list

(cherry picked from commit 53bf1b2)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Bug IO HDF5 read_hdf, HDFStore
Projects
None yet
3 participants