DOC: Add missing API item to reference docs #48455

Merged
merged 5 commits into from
Sep 13, 2022
68 changes: 52 additions & 16 deletions doc/source/reference/arrays.rst
@@ -19,20 +19,21 @@ objects contained with a :class:`Index`, :class:`Series`, or
For some data types, pandas extends NumPy's type system. String aliases for these types
can be found at :ref:`basics.dtypes`.

=================== ========================= ============================= =============================
Kind of Data        pandas Data Type          Scalar                        Array
=================== ========================= ============================= =============================
TZ-aware datetime   :class:`DatetimeTZDtype`  :class:`Timestamp`            :ref:`api.arrays.datetime`
Timedeltas          (none)                    :class:`Timedelta`            :ref:`api.arrays.timedelta`
Period (time spans) :class:`PeriodDtype`      :class:`Period`               :ref:`api.arrays.period`
Intervals           :class:`IntervalDtype`    :class:`Interval`             :ref:`api.arrays.interval`
Nullable Integer    :class:`Int64Dtype`, ...  (none)                        :ref:`api.arrays.integer_na`
Categorical         :class:`CategoricalDtype` (none)                        :ref:`api.arrays.categorical`
Sparse              :class:`SparseDtype`      (none)                        :ref:`api.arrays.sparse`
Strings             :class:`StringDtype`      :class:`str`                  :ref:`api.arrays.string`
Boolean (with NA)   :class:`BooleanDtype`     :class:`bool`                 :ref:`api.arrays.bool`
PyArrow             :class:`ArrowDtype`       Python Scalars or :class:`NA` :ref:`api.arrays.arrow`
=================== ========================= ============================= =============================
=================== ========================== ============================= =============================
Kind of Data        pandas Data Type           Scalar                        Array
=================== ========================== ============================= =============================
TZ-aware datetime   :class:`DatetimeTZDtype`   :class:`Timestamp`            :ref:`api.arrays.datetime`
Timedeltas          (none)                     :class:`Timedelta`            :ref:`api.arrays.timedelta`
Period (time spans) :class:`PeriodDtype`       :class:`Period`               :ref:`api.arrays.period`
Intervals           :class:`IntervalDtype`     :class:`Interval`             :ref:`api.arrays.interval`
Nullable Integer    :class:`Int64Dtype`, ...   (none)                        :ref:`api.arrays.integer_na`
Nullable Float      :class:`Float64Dtype`, ... (none)                        :ref:`api.arrays.float_na`
Member

This is the only line you added, right? With the wider column, that is not immediately clear in the diff.

Member Author

Correct. Since :class:`Float64Dtype`, ... was wider than the section boundaries (===...), everything needed to shift.

Also renamed Boolean (with NA) to Nullable Boolean to match the int and float titles.

Categorical         :class:`CategoricalDtype`  (none)                        :ref:`api.arrays.categorical`
Sparse              :class:`SparseDtype`       (none)                        :ref:`api.arrays.sparse`
Strings             :class:`StringDtype`       :class:`str`                  :ref:`api.arrays.string`
Nullable Boolean    :class:`BooleanDtype`      :class:`bool`                 :ref:`api.arrays.bool`
PyArrow             :class:`ArrowDtype`        Python Scalars or :class:`NA` :ref:`api.arrays.arrow`
=================== ========================== ============================= =============================

pandas and third-party libraries can extend NumPy's type system (see :ref:`extending.extension-types`).
The top-level :meth:`array` method can be used to create a new array, which may be
@@ -91,13 +92,20 @@ with the :class:`arrays.DatetimeArray` extension array, which can hold timezone-
or timezone-aware values.

:class:`Timestamp`, a subclass of :class:`datetime.datetime`, is pandas'
scalar type for timezone-naive or timezone-aware datetime data.
scalar type for timezone-naive or timezone-aware datetime data. :class:`NaT`
is the missing value for datetime data.

.. autosummary::
   :toctree: api/

   Timestamp

.. autosummary::
   :toctree: api/
   :template: autosummary/class_without_autosummary.rst

   NaT
Member

I think you need a blank line between the directive options and the values; that is probably why the CI is failing.


Properties
~~~~~~~~~~
.. autosummary::
@@ -208,13 +216,20 @@ Timedeltas
----------

NumPy can natively represent timedeltas. pandas provides :class:`Timedelta`
for symmetry with :class:`Timestamp`.
for symmetry with :class:`Timestamp`. :class:`NaT`
is the missing value for timedelta data.

.. autosummary::
   :toctree: api/

   Timedelta

.. autosummary::
   :toctree: api/
   :template: autosummary/class_without_autosummary.rst

   NaT
Member

Blank line
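
The two hunks above document :class:`NaT` as the missing value for both datetime and timedelta data. A minimal sketch of what that means in practice follows; the sample values are made up for illustration and are not part of the PR.

```python
import pandas as pd

# Illustrative values only; None is converted to the missing-value marker.
dates = pd.to_datetime(["2022-09-13", None])
deltas = pd.to_timedelta(["1 day", None])

# Datetime-like and timedelta-like data share the same NaT singleton.
datetime_missing = dates[1]
timedelta_missing = deltas[1]
print(datetime_missing is pd.NaT, timedelta_missing is pd.NaT)

# NaT never compares equal to itself, so detect it with isna() instead.
print(pd.isna(datetime_missing), pd.NaT == pd.NaT)
```

Because a single object serves both kinds of data, it makes sense that the PR lists ``NaT`` under both the datetime and the timedelta sections.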


Properties
~~~~~~~~~~
.. autosummary::
@@ -419,6 +434,26 @@ pandas provides this through :class:`arrays.IntegerArray`.
   UInt16Dtype
   UInt32Dtype
   UInt64Dtype
   NA

.. _api.arrays.float_na:

Nullable float
--------------

.. autosummary::
   :toctree: api/
   :template: autosummary/class_without_autosummary.rst

   arrays.FloatingArray

.. autosummary::
   :toctree: api/
   :template: autosummary/class_without_autosummary.rst

   Float32Dtype
   Float64Dtype
   NA
Comment on lines +444 to +456
Member

Do you know what's the advantage of having this in two separate autosummaries? I see you used the same format of the rest, and it's fine, but I wonder if this is just to allow different sizes of the columns of the two tables, or if there is anything else.

Member Author

I am not immediately sure. I suppose different column sizes were the reason in the past, but I was just using the same formatting as the other sections.
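
For context on the three names the new "Nullable float" section lists (``arrays.FloatingArray``, ``Float64Dtype``, ``NA``), here is a small sketch of how they relate. The input values are assumptions chosen for illustration.

```python
import pandas as pd

# Assumed sample data; None becomes pd.NA in a nullable float array.
arr = pd.array([1.5, None, 3.0], dtype="Float64")

print(type(arr).__name__)  # class of the extension array
print(arr.dtype)           # dtype instance backing it
print(arr[1] is pd.NA)     # the missing entry is the NA singleton

# Reductions skip missing values by default.
total = arr.sum()
print(total)
```

The array class, the dtype, and the scalar missing value are separate objects, which is consistent with the section listing them in separate autosummary entries.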


.. _api.arrays.categorical:

@@ -555,6 +590,7 @@ with a bool :class:`numpy.ndarray`.
   :template: autosummary/class_without_autosummary.rst

   BooleanDtype
   NA


.. Dtype attributes which are manually listed in their docstrings: including
1 change: 1 addition & 0 deletions doc/source/reference/general_functions.rst
@@ -26,6 +26,7 @@ Data manipulations
   from_dummies
   factorize
   unique
   lreshape
   wide_to_long

Top-level missing data
7 changes: 7 additions & 0 deletions doc/source/reference/groupby.rst
@@ -27,6 +27,13 @@ Indexing, iteration

   Grouper

Function application helper
---------------------------
.. autosummary::
   :toctree: api/

   NamedAgg

.. currentmodule:: pandas.core.groupby

Function application
7 changes: 7 additions & 0 deletions doc/source/reference/options.rst
@@ -19,3 +19,10 @@ Working with options
   get_option
   set_option
   option_context

Numeric formatting
------------------
.. autosummary::
   :toctree: api/

   set_eng_float_format
25 changes: 25 additions & 0 deletions pandas/core/groupby/generic.py
@@ -108,6 +108,31 @@


class NamedAgg(NamedTuple):
    """
    Helper for column specific aggregation with control over output column names.

    Subclass of typing.NamedTuple.

    Parameters
    ----------
    column : Hashable
        Column label in the DataFrame to apply aggfunc.
    aggfunc : function or str
        Function to apply to the provided column. If string, the name of a built-in
        pandas function.

    Examples
    --------
    >>> df = pd.DataFrame({"key": [1, 1, 2], "a": [-1, 0, 1], 1: [10, 11, 12]})
    >>> agg_a = pd.NamedAgg(column="a", aggfunc="min")
    >>> agg_1 = pd.NamedAgg(column=1, aggfunc=np.mean)
    >>> df.groupby("key").agg(result_a=agg_a, result_1=agg_1)
         result_a  result_1
    key
    1          -1      10.5
    2           1      12.0
    """

    column: Hashable
    aggfunc: AggScalar
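
The docstring notes that ``NamedAgg`` is a subclass of ``typing.NamedTuple``. One consequence, sketched below with a made-up frame, is that a ``NamedAgg`` compares equal to a plain ``(column, aggfunc)`` tuple and the two can be used interchangeably in ``agg``. This is an illustration, not code from the PR.

```python
import pandas as pd

# Hypothetical data for illustration.
df = pd.DataFrame({"key": [1, 1, 2], "a": [-1, 0, 1]})

agg_a = pd.NamedAgg(column="a", aggfunc="min")

# A NamedAgg is a tuple with named fields.
print(agg_a.column, agg_a.aggfunc)
print(agg_a == ("a", "min"))

# The named form and the plain-tuple form give the same aggregation.
named = df.groupby("key").agg(result_a=agg_a)
plain = df.groupby("key").agg(result_a=("a", "min"))
print(named.equals(plain))
```

The named form mainly helps readability, since the field names make the role of each element explicit.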

53 changes: 49 additions & 4 deletions pandas/io/formats/format.py
@@ -2116,11 +2116,56 @@ def __call__(self, num: float) -> str:

def set_eng_float_format(accuracy: int = 3, use_eng_prefix: bool = False) -> None:
    """
    Alter default behavior on how float is formatted in DataFrame.
    Format float in engineering format. By accuracy, we mean the number of
    decimal digits after the floating point.
    Format float representation in DataFrame with SI notation.

    See also EngFormatter.
    Parameters
    ----------
    accuracy : int, default 3
        Number of decimal digits after the floating point.
    use_eng_prefix : bool, default False
        Whether to represent a value with SI prefixes.

    Returns
    -------
    None

    Examples
    --------
    >>> df = pd.DataFrame([1e-9, 1e-3, 1, 1e3, 1e6])
    >>> df
                  0
    0  1.000000e-09
    1  1.000000e-03
    2  1.000000e+00
    3  1.000000e+03
    4  1.000000e+06

    >>> pd.set_eng_float_format(accuracy=1)
    >>> df
             0
    0  1.0E-09
    1  1.0E-03
    2  1.0E+00
    3  1.0E+03
    4  1.0E+06

    >>> pd.set_eng_float_format(use_eng_prefix=True)
    >>> df
            0
    0  1.000n
    1  1.000m
    2   1.000
    3  1.000k
    4  1.000M

    >>> pd.set_eng_float_format(accuracy=1, use_eng_prefix=True)
    >>> df
          0
    0  1.0n
    1  1.0m
    2   1.0
    3  1.0k
    4  1.0M
    """
    set_option("display.float_format", EngFormatter(accuracy, use_eng_prefix))
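
As the last line of the function shows, ``set_eng_float_format`` only sets the global ``display.float_format`` option. The sketch below, which is an illustration rather than part of the PR, inspects that option and then restores the default so later output is unaffected.

```python
import pandas as pd

# Install an EngFormatter as the display.float_format option.
pd.set_eng_float_format(accuracy=1, use_eng_prefix=True)
formatter = pd.get_option("display.float_format")

# The stored option is a callable mapping a float to its display string.
formatted = formatter(1e3)
print(repr(formatted))

# The setting is global, so restore the default when done.
pd.reset_option("display.float_format")
restored = pd.get_option("display.float_format")
print(restored)
```

Resetting matters in doctests and notebooks, where a leftover formatter would change every float shown afterwards.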
