@@ -71,8 +71,10 @@ the **array** property
   s.array
   s.index.array

- Depending on the data type (see :ref:`basics.dtypes`), :attr:`~Series.array`
- be either a NumPy array or an :ref:`ExtensionArray <extending.extension-type>`.
+ :attr:`~Series.array` will always be an :class:`~pandas.api.extensions.ExtensionArray`.
+ The exact details of what an ``ExtensionArray`` is and why pandas uses them are a bit
+ beyond the scope of this introduction. See :ref:`basics.dtypes` for more.
+
If you know you need a NumPy array, use :meth:`~Series.to_numpy`
or :meth:`numpy.asarray`.

@@ -81,10 +83,30 @@ or :meth:`numpy.asarray`.
   s.to_numpy()
   np.asarray(s)

- For Series and Indexes backed by NumPy arrays (like we have here), this will
- be the same as :attr:`~Series.array`. When the Series or Index is backed by
- a :class:`~pandas.api.extension.ExtensionArray`, :meth:`~Series.to_numpy`
- may involve copying data and coercing values.
+ When the Series or Index is backed by
+ an :class:`~pandas.api.extensions.ExtensionArray`, :meth:`~Series.to_numpy`
+ may involve copying data and coercing values. See :ref:`basics.dtypes` for more.
+
+ :meth:`~Series.to_numpy` gives some control over the ``dtype`` of the
+ resulting :class:`ndarray`. For example, consider datetimes with timezones.
+ NumPy doesn't have a dtype to represent timezone-aware datetimes, so there
+ are two possibly useful representations:
+
+ 1. An object-dtype :class:`ndarray` with :class:`Timestamp` objects, each
+    with the correct ``tz``
+ 2. A ``datetime64[ns]`` -dtype :class:`ndarray`, where the values have
+    been converted to UTC and the timezone discarded
+
+ Timezones may be preserved with ``dtype=object``
+
+ .. ipython:: python
+
+    ser = pd.Series(pd.date_range('2000', periods=2, tz="CET"))
+    ser.to_numpy(dtype=object)
+
+ Or thrown away with ``dtype='datetime64[ns]'``
+
+ ser.to_numpy(dtype="datetime64[ns]")

:meth:`~Series.to_numpy` gives some control over the ``dtype`` of the
resulting :class:`ndarray`. For example, consider datetimes with timezones.
@@ -109,7 +131,7 @@ Or thrown away with ``dtype='datetime64[ns]'``

Getting the "raw data" inside a :class:`DataFrame` is possibly a bit more
complex. When your ``DataFrame`` only has a single data type for all the
- columns, :attr:`DataFrame.to_numpy` will return the underlying data:
+ columns, :meth:`DataFrame.to_numpy` will return the underlying data:

.. ipython:: python

@@ -136,8 +158,9 @@ drawbacks:

1. When your Series contains an :ref:`extension type <extending.extension-type>`, it's
   unclear whether :attr:`Series.values` returns a NumPy array or the extension array.
-    :attr:`Series.array` will always return the actual array backing the Series,
-    while :meth:`Series.to_numpy` will always return a NumPy array.
+    :attr:`Series.array` will always return an ``ExtensionArray``, and will never
+    copy data. :meth:`Series.to_numpy` will always return a NumPy array,
+    potentially at the cost of copying / coercing values.
2. When your DataFrame contains a mixture of data types, :attr:`DataFrame.values` may
   involve copying data and coercing values to a common dtype, a relatively expensive
   operation. :meth:`DataFrame.to_numpy`, being a method, makes it clearer that the
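The behaviour described by the revised text can be checked with a short script. This is a minimal sketch, assuming a pandas version that provides ``Series.array`` and ``Series.to_numpy`` (0.24 or later) and a system with timezone data for ``CET``. The variable names ``backing``, ``as_objects`` and ``as_utc`` are illustrative and do not appear in the patch.

```python
import numpy as np
import pandas as pd

# Timezone-aware Series, mirroring the example added in the diff.
ser = pd.Series(pd.date_range("2000", periods=2, tz="CET"))

# .array exposes the backing ExtensionArray without converting it.
backing = ser.array

# dtype=object keeps each element as a timezone-aware Timestamp.
as_objects = ser.to_numpy(dtype=object)

# dtype="datetime64[ns]" converts the values to UTC and drops the timezone.
as_utc = ser.to_numpy(dtype="datetime64[ns]")

print(type(backing).__name__)
print(as_objects.dtype, as_objects[0].tz)
print(as_utc.dtype, as_utc[0])
```

Because CET is one hour ahead of UTC in January, the first element of ``as_utc`` corresponds to 23:00 on 1999-12-31, while the first element of ``as_objects`` still reports midnight on 2000-01-01 with its ``tz`` attached.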