Added paragraph on creating DataFrame from list of namedtuples #35507

Merged on Aug 5, 2020 (4 commits)

Conversation

Contributor

sjvrijn commented Aug 1, 2020

Is a whatsnew entry necessary for this PR?

Member

arw2019 left a comment

thanks @sjvrijn

Some comments below.

You could add an example to illustrate that the column names are taken from the first element of the list, whatever the field names of the later tuples. For example, when the first element is a plain tuple, default integer labels are used:

In [8]: pd.DataFrame([(0, 0), Point(1, 2)])
Out[8]: 
   0  1
0  0  0
1  1  2

but up to you - I don't feel strongly
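
A minimal sketch of the point above, assuming pandas is installed. It is not part of the PR, and ``Other`` is a hypothetical namedtuple type introduced only for illustration. It compares a list that starts with a namedtuple against one that starts with a plain tuple:

```python
from collections import namedtuple

import pandas as pd

Point = namedtuple("Point", "x y")
# Hypothetical second type with different field names, for illustration only.
Other = namedtuple("Other", "a b")

# Namedtuple first: its field names become the column labels,
# even though the second element has different field names.
named_first = pd.DataFrame([Point(0, 0), Other(1, 2)])

# Plain tuple first: there are no field names, so default integer labels are used.
plain_first = pd.DataFrame([(0, 0), Point(1, 2)])

print(list(named_first.columns))
print(list(plain_first.columns))
```

Only the first element is consulted for labels; later elements contribute values only.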

.. ipython:: python

    from collections import namedtuple

Member

you have a trailing whitespace here


    from collections import namedtuple

    Point = namedtuple('Point', 'x y')

Member

also indent problem


.. ipython:: python

    from collections import namedtuple

Member

CI Checks report a problem with the indent here


    Point = namedtuple('Point', 'x y')

    pd.DataFrame([Point(0, 0), Point(0, 3), Point(2, 3)])

Member

also indent problem


    pd.DataFrame([Point(0, 0), Point(0, 3), Point(2, 3)])

    Point3D = namedtuple('Point3D', 'x y z')

Member

also indent problem


    Point3D = namedtuple('Point3D', 'x y z')

    pd.DataFrame([Point3D(0, 0, 0), Point3D(0, 3, 5), Point(2, 3)])

Member

also indent problem

From a list of namedtuples
~~~~~~~~~~~~~~~~~~~~~~~~~~

The first namedtuple will be used to determine the columns of the DataFrame.

Member

I might write:

The field names of the first ``namedtuple`` in the list determine the columns of the ``DataFrame``.
The remaining namedtuples (or tuples) are simply unpacked and their values are fed
into the rows of the ``DataFrame``. If any of those tuples is shorter than the first ``namedtuple``,
the later columns in the corresponding row are marked as missing values. If any
is longer than the first ``namedtuple``, a ``ValueError`` is raised.
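
A rough sketch of the two edge cases described in this suggestion, assuming pandas is installed. It reuses the ``Point`` and ``Point3D`` types from the PR's examples; the variable names ``padded`` and ``raised`` are made up here for illustration:

```python
from collections import namedtuple

import pandas as pd

Point = namedtuple("Point", "x y")
Point3D = namedtuple("Point3D", "x y z")

# Shorter later tuple: the columns come from Point3D, and the row built
# from Point(2, 3) has no value for "z", so it is marked as missing.
padded = pd.DataFrame([Point3D(0, 0, 0), Point3D(0, 3, 5), Point(2, 3)])
print(padded)

# Longer later tuple: the columns come from Point (two fields), but the
# second element carries three values, which is expected to raise ValueError.
try:
    pd.DataFrame([Point(0, 0), Point3D(1, 2, 3)])
    raised = False
except ValueError as err:
    raised = True
    print(f"ValueError: {err}")
```

The asymmetry follows from the column labels being fixed by the first element: a short row can be padded, but a long row has values with no column to hold them.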

@simonjayhawkins
Member

> Is a whatsnew entry necessary for this PR?

no

Contributor Author

sjvrijn commented Aug 2, 2020

@arw2019 Thanks for the suggestions, I've updated the text and one of the examples.

I'm confused about the indentation though, I've checked that it consists of 4 spaces, and throughout the file it's a mix of 3- and 4-space indentation actually...

Member

arw2019 left a comment

> I'm confused about the indentation though, I've checked that it consists of 4 spaces, and throughout the file it's a mix of 3- and 4-space indentation actually...

I got that from the CI checks. Looking at them now I think you're good

There's one failing test - definitely nothing to do with this PR since you only touched the docs. Try merging with master and once the checks are green you're good to go

jreback added this to the 1.2 milestone on Aug 5, 2020
jreback merged commit 3701a9b into pandas-dev:master on Aug 5, 2020

Contributor

jreback commented Aug 5, 2020

thanks @sjvrijn

sjvrijn deleted the GH35438 branch on August 5, 2020, 09:38
Successfully merging this pull request may close these issues.

DOC: Can't find mention of DataFrame accepting namedtuples as input
4 participants