Skip to content

DOC: update DataFrame.to_records #20191

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
Changes from 4 commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
57 changes: 53 additions & 4 deletions pandas/core/frame.py
Original file line number Diff line number Diff line change
Expand Up @@ -1209,20 +1209,69 @@ def from_records(cls, data, index=None, exclude=None, columns=None,

def to_records(self, index=True, convert_datetime64=True):
"""
Convert DataFrame to record array. Index will be put in the
'index' field of the record array if requested
Convert DataFrame to record array.

Index will be put in the 'index' field of the record array if
requested.

Parameters
----------
index : boolean, default True
Include index in resulting record array, stored in 'index' field
Include index in resulting record array, stored in 'index' field.
convert_datetime64 : boolean, default True
Whether to convert the index to datetime.datetime if it is a
DatetimeIndex
DatetimeIndex.

Returns
-------
y : recarray
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah sorry one last change. Could you make this y : numpy.recarray. And maybe say that in the opening line. "Convert the DataFrame to a NumPy record array"

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No probs. Fixed in b1d0b09.


See Also
--------
DataFrame.from_records: convert structured or record ndarray
to DataFrame.
numpy.recarray: ndarray that allows field access using
attributes, analogous to typed (typed) columns in a
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why typed in the parenthesis?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, typo! Fixed in 823862b.

spreadsheet.

Examples
--------

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No blank line here

Copy link
Contributor Author

@samuelsinayoko samuelsinayoko Mar 10, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed in 2d0a3bb.

>>> df = pd.DataFrame({'A': [1, 2], 'B': [0.5, 0.75]},
... index=['a', 'b'])
>>> df
A B
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Need some extra spaces here

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Try pytest --doctest-modules pandas/core/frame.py -k to_records to validate that it's correct.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍.
Renamed the columns at the last minute to save space and looks like he'd messed up the tests. Not sure how it sill passed the validate_docstring.py script though.
Fixed in 5f6e1c5.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

column names A and B should line up with the data

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed in 5f6e1c5.

a 1 0.50
b 2 0.75
>>> df.to_records()
rec.array([('a', 1, 0.5 ), ('b', 2, 0.75)],
dtype=[('index', 'O'), ('A', '<i8'), ('B', '<f8')])

The index can be excluded from the record array:

>>> df.to_records(index=False)
rec.array([(1, 0.5 ), (2, 0.75)],
dtype=[('A', '<i8'), ('B', '<f8')])

By default, timestamps are converted to `datetime.datetime`:

>>> df.index = pd.date_range('2018-01-01 09:00', periods=2, freq='min')
>>> df
A B
2018-01-01 09:00:00 1 0.50
2018-01-01 09:01:00 2 0.75
>>> df.to_records()
rec.array([(datetime.datetime(2018, 1, 1, 9, 0), 1, 0.5 ),
(datetime.datetime(2018, 1, 1, 9, 1), 2, 0.75)],
dtype=[('index', 'O'), ('A', '<i8'), ('B', '<f8')])

The timestamp conversion can be disabled so NumPy's datetime64
data type is used instead:

>>> df.to_records(convert_datetime64=False)
rec.array([('2018-01-01T09:00:00.000000000', 1, 0.5 ),
('2018-01-01T09:01:00.000000000', 2, 0.75)],
dtype=[('index', '<M8[ns]'), ('A', '<i8'), ('B', '<f8')])
"""
if index:
if is_datetime64_any_dtype(self.index) and convert_datetime64:
Expand Down