1 file changed: +21 -4 lines changed
@@ -2574,10 +2574,27 @@ The default is 50,000 rows returned in a chunk.
for df in read_hdf('store.h5', 'df', chunksize=3):
    print(df)

- Note, that the chunksize keyword applies to the **returned** rows. So if you
- are doing a query, then that set will be subdivided and returned in the
- iterator. Keep in mind that if you do not pass a ``where`` selection criteria
- then the ``nrows`` of the table are considered.
+ Note that the ``chunksize`` keyword applies to the **source** rows. So if you
+ are doing a query, ``chunksize`` subdivides the total rows in the table and the
+ query is applied to each piece, returning an iterator over potentially unequal-sized chunks.
+
+ Here is a recipe for generating a query and using it to create equal-sized return
+ chunks.
+
+ .. ipython:: python
+
+    dfeq = DataFrame({'number': np.arange(1, 11)})
+    dfeq
+
+    store.append('dfeq', dfeq, data_columns=['number'])
+
+    def chunks(l, n):
+        return [l[i:i+n] for i in range(0, len(l), n)]
+
+    evens = [2, 4, 6, 8, 10]
+    coordinates = store.select_as_coordinates('dfeq', 'number=evens')
+    for c in chunks(coordinates, 2):
+        print(store.select('dfeq', where=c))

Advanced Queries
~~~~~~~~~~~~~~~~
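
The difference between chunking the source rows and chunking the query result can be sketched without pandas or PyTables. This is only an illustration under stated assumptions: a plain Python list stands in for the stored table, and the names `rows`, `source_chunked`, and `coord_chunked` are hypothetical, not part of the pandas API.

```python
# Pure-Python stand-in for the stored table (assumption: no pandas/PyTables).
rows = list(range(1, 11))   # the 'number' column: 1..10
evens = {2, 4, 6, 8, 10}    # the query: number is in evens


def chunks(seq, n):
    """Split seq into consecutive pieces of length n (the last may be shorter)."""
    return [seq[i:i + n] for i in range(0, len(seq), n)]


# Source-row chunking: split the table first, then apply the query to each piece.
source_chunked = [[r for r in piece if r in evens] for piece in chunks(rows, 3)]

# Coordinate chunking: find the matching row positions first, then split those.
coordinates = [i for i, r in enumerate(rows) if r in evens]
coord_chunked = [[rows[i] for i in c] for c in chunks(coordinates, 2)]

print(source_chunked)  # [[2], [4, 6], [8], [10]]
print(coord_chunked)   # [[2, 4], [6, 8], [10]]
```

Splitting before filtering gives pieces whose sizes depend on where the matches fall, while splitting the matching coordinates gives equal-sized pieces except possibly the last one.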