Commit d923385

Moisan authored and TomAugspurger committed
DOC: Fix DataFrame.to_xarray doctests and allow the CI to run it. (#22673)
1 parent 51aeba4 commit d923385

2 files changed: +53 -63 lines changed

ci/doctests.sh

+1 -1
@@ -35,7 +35,7 @@ if [ "$DOCTEST" ]; then
     fi

     pytest --doctest-modules -v pandas/core/generic.py \
-        -k"-_set_axis_name -_xs -describe -droplevel -groupby -interpolate -pct_change -pipe -reindex -reindex_axis -resample -sample -to_json -to_xarray -transpose -values -xs"
+        -k"-_set_axis_name -_xs -describe -droplevel -groupby -interpolate -pct_change -pipe -reindex -reindex_axis -resample -sample -to_json -transpose -values -xs"

     if [ $? -ne "0" ]; then
         RET=1

pandas/core/generic.py

+52 -62
@@ -2500,80 +2500,70 @@ def to_xarray(self):

         Returns
         -------
-        a DataArray for a Series
-        a Dataset for a DataFrame
-        a DataArray for higher dims
+        xarray.DataArray or xarray.Dataset
+            Data in the pandas structure converted to Dataset if the object is
+            a DataFrame, or a DataArray if the object is a Series.
+
+        See Also
+        --------
+        DataFrame.to_hdf : Write DataFrame to an HDF5 file.
+        DataFrame.to_parquet : Write a DataFrame to the binary parquet format.

         Examples
         --------
-        >>> df = pd.DataFrame({'A' : [1, 1, 2],
-                               'B' : ['foo', 'bar', 'foo'],
-                               'C' : np.arange(4.,7)})
+        >>> df = pd.DataFrame([('falcon', 'bird', 389.0, 2),
+        ...                    ('parrot', 'bird', 24.0, 2),
+        ...                    ('lion', 'mammal', 80.5, 4),
+        ...                    ('monkey', 'mammal', np.nan, 4)],
+        ...                   columns=['name', 'class', 'max_speed',
+        ...                            'num_legs'])
         >>> df
-           A    B    C
-        0  1  foo  4.0
-        1  1  bar  5.0
-        2  2  foo  6.0
+             name   class  max_speed  num_legs
+        0  falcon    bird      389.0         2
+        1  parrot    bird       24.0         2
+        2    lion  mammal       80.5         4
+        3  monkey  mammal        NaN         4

         >>> df.to_xarray()
         <xarray.Dataset>
-        Dimensions:  (index: 3)
+        Dimensions:    (index: 4)
         Coordinates:
-          * index    (index) int64 0 1 2
+          * index      (index) int64 0 1 2 3
         Data variables:
-            A        (index) int64 1 1 2
-            B        (index) object 'foo' 'bar' 'foo'
-            C        (index) float64 4.0 5.0 6.0
-
-        >>> df = pd.DataFrame({'A' : [1, 1, 2],
-                               'B' : ['foo', 'bar', 'foo'],
-                               'C' : np.arange(4.,7)}
-                              ).set_index(['B','A'])
-        >>> df
-                 C
-        B   A
-        foo 1  4.0
-        bar 1  5.0
-        foo 2  6.0
-
-        >>> df.to_xarray()
+            name       (index) object 'falcon' 'parrot' 'lion' 'monkey'
+            class      (index) object 'bird' 'bird' 'mammal' 'mammal'
+            max_speed  (index) float64 389.0 24.0 80.5 nan
+            num_legs   (index) int64 2 2 4 4
+
+        >>> df['max_speed'].to_xarray()
+        <xarray.DataArray 'max_speed' (index: 4)>
+        array([389. ,  24. ,  80.5,   nan])
+        Coordinates:
+          * index    (index) int64 0 1 2 3
+
+        >>> dates = pd.to_datetime(['2018-01-01', '2018-01-01',
+        ...                         '2018-01-02', '2018-01-02'])
+        >>> df_multiindex = pd.DataFrame({'date': dates,
+        ...                    'animal': ['falcon', 'parrot', 'falcon',
+        ...                               'parrot'],
+        ...                    'speed': [350, 18, 361, 15]}).set_index(['date',
+        ...                                                    'animal'])
+        >>> df_multiindex
+                           speed
+        date       animal
+        2018-01-01 falcon    350
+                   parrot     18
+        2018-01-02 falcon    361
+                   parrot     15
+
+        >>> df_multiindex.to_xarray()
         <xarray.Dataset>
-        Dimensions:  (A: 2, B: 2)
+        Dimensions:  (animal: 2, date: 2)
         Coordinates:
-          * B        (B) object 'bar' 'foo'
-          * A        (A) int64 1 2
+          * date     (date) datetime64[ns] 2018-01-01 2018-01-02
+          * animal   (animal) object 'falcon' 'parrot'
         Data variables:
-            C        (B, A) float64 5.0 nan 4.0 6.0
-
-        >>> p = pd.Panel(np.arange(24).reshape(4,3,2),
-                         items=list('ABCD'),
-                         major_axis=pd.date_range('20130101', periods=3),
-                         minor_axis=['first', 'second'])
-        >>> p
-        <class 'pandas.core.panel.Panel'>
-        Dimensions: 4 (items) x 3 (major_axis) x 2 (minor_axis)
-        Items axis: A to D
-        Major_axis axis: 2013-01-01 00:00:00 to 2013-01-03 00:00:00
-        Minor_axis axis: first to second
-
-        >>> p.to_xarray()
-        <xarray.DataArray (items: 4, major_axis: 3, minor_axis: 2)>
-        array([[[ 0,  1],
-                [ 2,  3],
-                [ 4,  5]],
-               [[ 6,  7],
-                [ 8,  9],
-                [10, 11]],
-               [[12, 13],
-                [14, 15],
-                [16, 17]],
-               [[18, 19],
-                [20, 21],
-                [22, 23]]])
-        Coordinates:
-          * items       (items) object 'A' 'B' 'C' 'D'
-          * major_axis  (major_axis) datetime64[ns] 2013-01-01 2013-01-02 2013-01-03  # noqa
-          * minor_axis  (minor_axis) object 'first' 'second'
+            speed    (date, animal) int64 350 18 361 15

         Notes
         -----
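One visible difference between the removed and added examples is that the old multi-line statements have no `...` continuation prompts. This is likely one reason the old doctest could not run, though the commit does not say so. The sketch below is not part of the commit: it uses only the standard-library `doctest` module, and the functions `broken`, `fixed`, and `run` are hypothetical names made up for illustration. It shows how a missing continuation prompt turns the rest of a statement into "expected output" and makes the example fail.

```python
import doctest


def broken():
    """
    >>> total = sum([1, 2,
                     3])
    >>> total
    6
    """


def fixed():
    """
    >>> total = sum([1, 2,
    ...              3])
    >>> total
    6
    """


def run(func):
    """Run the doctest examples in func's docstring.

    Returns a (failed, attempted) tuple.
    """
    parser = doctest.DocTestParser()
    test = parser.get_doctest(func.__doc__, {}, func.__name__, None, 0)
    runner = doctest.DocTestRunner(verbose=False)
    # Discard the failure report; only the counts matter here.
    results = runner.run(test, out=lambda text: None)
    return results.failed, results.attempted


if __name__ == "__main__":
    print("broken:", run(broken))
    print("fixed: ", run(fixed))
```

In `broken`, the first example is an incomplete statement, so it raises a `SyntaxError`. The second example then fails because `total` was never defined. In `fixed`, the `...` prompt keeps both lines in one statement and both examples pass.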
