@@ -5264,99 +5264,6 @@ You could inadvertently turn an actual ``nan`` value into a missing value.
   store.append("dfss2", dfss, nan_rep="_nan_")
   store.select("dfss2")

- .. _io.external_compatibility:
-
- External compatibility
- ''''''''''''''''''''''
-
- ``HDFStore`` writes ``table`` format objects in specific formats suitable for
- producing loss-less round trips to pandas objects. For external
- compatibility, ``HDFStore`` can read native ``PyTables`` format
- tables.
-
- It is possible to write an ``HDFStore`` object that can easily be imported into ``R`` using the
- ``rhdf5`` library (`Package website`_). Create a table format store like this:
-
- .. _package website: https://www.bioconductor.org/packages/release/bioc/html/rhdf5.html
-
- .. ipython:: python
-
-    df_for_r = pd.DataFrame(
-        {
-            "first": np.random.rand(100),
-            "second": np.random.rand(100),
-            "class": np.random.randint(0, 2, (100,)),
-        },
-        index=range(100),
-    )
-    df_for_r.head()
-
-    store_export = pd.HDFStore("export.h5")
-    store_export.append("df_for_r", df_for_r, data_columns=df_dc.columns)
-    store_export
-
- .. ipython:: python
-    :suppress:
-
-    store_export.close()
-    os.remove("export.h5")
-
- In R this file can be read into a ``data.frame`` object using the ``rhdf5``
- library. The following example function reads the corresponding column names
- and data values from the values and assembles them into a ``data.frame``:
-
- .. code-block:: R
-
-    # Load values and column names for all datasets from corresponding nodes and
-    # insert them into one data.frame object.
-
-    library(rhdf5)
-
-    loadhdf5data <- function(h5File) {
-
-    listing <- h5ls(h5File)
-    # Find all data nodes, values are stored in *_values and corresponding column
-    # titles in *_items
-    data_nodes <- grep("_values", listing$name)
-    name_nodes <- grep("_items", listing$name)
-    data_paths = paste(listing$group[data_nodes], listing$name[data_nodes], sep = "/")
-    name_paths = paste(listing$group[name_nodes], listing$name[name_nodes], sep = "/")
-    columns = list()
-    for (idx in seq(data_paths)) {
-      # NOTE: matrices returned by h5read have to be transposed to obtain
-      # required Fortran order!
-      data <- data.frame(t(h5read(h5File, data_paths[idx])))
-      names <- t(h5read(h5File, name_paths[idx]))
-      entry <- data.frame(data)
-      colnames(entry) <- names
-      columns <- append(columns, entry)
-    }
-
-    data <- data.frame(columns)
-
-    return(data)
-    }
-
- Now you can import the ``DataFrame`` into R:
-
- .. code-block:: R
-
-    > data = loadhdf5data("transfer.hdf5")
-    > head(data)
-             first    second class
-    1 0.4170220047 0.3266449     0
-    2 0.7203244934 0.5270581     0
-    3 0.0001143748 0.8859421     1
-    4 0.3023325726 0.3572698     1
-    5 0.1467558908 0.9085352     1
-    6 0.0923385948 0.6233601     1
-
- .. note::
-    The R function lists the entire HDF5 file's contents and assembles the
-    ``data.frame`` object from all matching nodes, so use this only as a
-    starting point if you have stored multiple ``DataFrame`` objects to a
-    single HDF5 file.
-

Performance
'''''''''''