From d7d5f6078aff8d24c771a7268f7a58bb0502d8ce Mon Sep 17 00:00:00 2001
From: Sander van Rijn
Date: Sat, 1 Aug 2020 20:50:07 +0200
Subject: [PATCH 1/2] Added paragraph on creating DataFrame from list of namedtuples

---
 doc/source/user_guide/dsintro.rst | 24 ++++++++++++++++++++++++
 1 file changed, 24 insertions(+)

diff --git a/doc/source/user_guide/dsintro.rst b/doc/source/user_guide/dsintro.rst
index 360a14998b227..175254526ebce 100644
--- a/doc/source/user_guide/dsintro.rst
+++ b/doc/source/user_guide/dsintro.rst
@@ -397,6 +397,30 @@ The result will be a DataFrame with the same index as the input Series, and
 with one column whose name is the original name of the Series (only if no other
 column name provided).
 
+
+.. _basics.dataframe.from_list_namedtuples:
+
+From a list of namedtuples
+~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+The first namedtuple will be used to determine the columns of the DataFrame.
+This means that further (named)tuples are simply unpacked. If additional
+tuples are shorter, the later columns are marked as missing data, while
+longer tuples cause a ``ValueError``.
+
+.. ipython:: python
+
+    from collections import namedtuple
+    
+    Point = namedtuple('Point', 'x y')
+
+    pd.DataFrame([Point(0, 0), Point(0, 3), Point(2, 3)])
+
+    Point3D = namedtuple('Point3D', 'x y z')
+
+    pd.DataFrame([Point3D(0, 0, 0), Point3D(0, 3, 5), Point(2, 3)])
+
+
 .. _basics.dataframe.from_list_dataclasses:
 
 From a list of dataclasses

From 6c2e4fabebadf1ca985f4bf5a07cc231c35a978d Mon Sep 17 00:00:00 2001
From: Sander van Rijn
Date: Sun, 2 Aug 2020 17:26:02 +0200
Subject: [PATCH 2/2] rewrite description & example; removed trailing whitespace

---
 doc/source/user_guide/dsintro.rst | 14 ++++++++------
 1 file changed, 8 insertions(+), 6 deletions(-)

diff --git a/doc/source/user_guide/dsintro.rst b/doc/source/user_guide/dsintro.rst
index 175254526ebce..23bd44c1969a5 100644
--- a/doc/source/user_guide/dsintro.rst
+++ b/doc/source/user_guide/dsintro.rst
@@ -403,18 +403,20 @@ column name provided).
 From a list of namedtuples
 ~~~~~~~~~~~~~~~~~~~~~~~~~~
 
-The first namedtuple will be used to determine the columns of the DataFrame.
-This means that further (named)tuples are simply unpacked. If additional
-tuples are shorter, the later columns are marked as missing data, while
-longer tuples cause a ``ValueError``.
+The field names of the first ``namedtuple`` in the list determine the columns
+of the ``DataFrame``. The remaining namedtuples (or tuples) are simply unpacked
+and their values are fed into the rows of the ``DataFrame``. If any of those
+tuples is shorter than the first ``namedtuple`` then the later columns in the
+corresponding row are marked as missing values. If any are longer than the
+first ``namedtuple``, a ``ValueError`` is raised.
 
 .. ipython:: python
 
     from collections import namedtuple
-    
+
     Point = namedtuple('Point', 'x y')
 
-    pd.DataFrame([Point(0, 0), Point(0, 3), Point(2, 3)])
+    pd.DataFrame([Point(0, 0), Point(0, 3), (2, 3)])
 
     Point3D = namedtuple('Point3D', 'x y z')
 
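
For anyone reviewing the series, the three behaviours described in the rewritten paragraph (columns taken from the first namedtuple, shorter tuples padded with missing values, longer tuples rejected) can be checked with a small standalone script. This is only a sketch and not part of the patch: it assumes a pandas version that behaves as the paragraph describes, and the variable names (`df`, `df3`, `longer_error`) are illustrative.

```python
from collections import namedtuple

import pandas as pd

Point = namedtuple("Point", "x y")
Point3D = namedtuple("Point3D", "x y z")

# Column names come from the first element's field names; the trailing
# plain tuple is simply unpacked into a row.
df = pd.DataFrame([Point(0, 0), Point(0, 3), (2, 3)])
print(df)

# A later tuple that is shorter than the first namedtuple leaves its
# trailing columns as missing values (here, column "z" of the last row).
df3 = pd.DataFrame([Point3D(0, 0, 0), Point3D(0, 3, 5), Point(2, 3)])
print(df3)

# A later tuple that is longer than the first namedtuple is expected to be
# rejected with a ValueError, as the documentation paragraph states.
longer_error = None
try:
    pd.DataFrame([Point(0, 0), Point3D(1, 2, 3)])
except ValueError as exc:
    longer_error = exc
print(type(longer_error).__name__, longer_error)
```

Running the checks outside the docs build keeps the ``ValueError`` case out of the ``ipython`` directive, where an uncaught exception would need extra handling.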