ENH: add Series.info #31796

MarcoGorelli · 2020-02-07T22:12:38Z

Initial draft to close #5167 (API: add Series.info method)
tests added / passed
passes black pandas
passes git diff upstream/master -u -- "*.py" | flake8 --diff
whatsnew entry

screenshot of the example
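The screenshot itself is not preserved here. As a rough stand-in, and assuming a pandas release that ships `Series.info` (the method landed later through #37320, not through this PR), the call can be exercised as below. The sample data and variable names are made up for illustration.

```python
import io

import pandas as pd

# Illustrative data only; not taken from the PR.
s = pd.Series([1.0, None, 3.0], name="a")

# Series.info writes to sys.stdout by default; pass a buffer to capture the text.
buf = io.StringIO()
s.info(buf=buf)
summary = buf.getvalue()

print(summary)
```

The exact layout of the printed summary depends on the pandas version, so no expected output is shown.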

Note to self: this change from ba72b59b44f15076877e784e16cbc66aa03e0adc breaks the tests

-    # groupby dtype.name to collect e.g. Categorical columns
-    counts = data.dtypes.value_counts().groupby(lambda x: x.name).sum()
+    counts = data._data.get_dtype_counts()
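To make the removed line concrete, here is a minimal sketch of what grouping by `dtype.name` does. The frame and its column names are made up for illustration; only the `value_counts().groupby(...).sum()` expression comes from the diff above.

```python
import pandas as pd

# Illustrative frame: two Categorical columns with different categories,
# plus one integer column.
df = pd.DataFrame(
    {
        "a": pd.Categorical(["x", "y"]),
        "b": pd.Categorical(["p", "q"]),
        "c": pd.Series([1, 2], dtype="int64"),
    }
)

# Each Categorical column carries its own dtype object, so counting the raw
# dtype objects does not necessarily merge them.
per_dtype = df.dtypes.value_counts()

# Grouping the index by dtype.name collects them under a single "category" key.
counts = per_dtype.groupby(lambda x: x.name).sum()

print(counts)
```

With this grouping, both Categorical columns are reported together, which is what the comment in the removed code describes.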

Updated

Currently, looks like this:

pandas/core/frame.py

pandas/core/series.py

pandas/tests/series/test_repr.py

MarcoGorelli · 2020-02-09T12:04:42Z

Some screenshots

pep8speaks · 2020-02-09T12:27:58Z

Hello @MarcoGorelli! Thanks for updating this PR. We checked the lines you've touched for PEP 8 issues, and found:

There are currently no PEP 8 issues detected in this Pull Request. Cheers! 🍻

Comment last updated at 2020-09-19 16:26:44 UTC

MarcoGorelli · 2020-02-09T14:18:22Z

Thanks @WillAyd for your review, have updated

jreback

So, before doing this, can you do a precursor PR which moves the .info code into pandas/io/formats/info.py? We already have way too many things inside the core/frame.py and core/generic.py modules, and I think this would promote easy sharing of this code. Also happy for the added tests; you can move them as well, to pandas/io/formats/test_info.py.

MarcoGorelli · 2020-02-19T13:32:48Z

Screenshots of docs:

pandas/core/series.py

pandas/io/formats/info.py

WillAyd · 2020-02-20T01:48:40Z

pandas/io/formats/info.py

+        cols = Index([data.name])
+        dtypes = Index([data.dtypes])
+
+    col_count = len(cols)


Hmm I think this is confusing to have col_count be populated by a Series in some cases - can you think of a way to refactor?

@WillAyd sure, have given this a go
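One way to address this kind of concern is to normalise the input up front, so that later code never sees a "column count" derived from a Series. The sketch below is only an illustration of that idea, not the refactor that was pushed; `_summarise` is a hypothetical helper name, and it uses the public `pd.DataFrame` class for the `isinstance` check.

```python
import pandas as pd


def _summarise(data):
    """Hypothetical helper: return (names, dtypes, non_null_counts) as lists.

    Works for either a DataFrame (one entry per column) or a Series
    (exactly one entry).
    """
    if isinstance(data, pd.DataFrame):
        names = list(data.columns)
        dtypes = list(data.dtypes)
        counts = [int(c) for c in data.count()]
    else:
        # A Series has a single name and a single dtype.
        names = [data.name]
        dtypes = [data.dtype]
        counts = [int(data.count())]
    return names, dtypes, counts


series_summary = _summarise(pd.Series([1.0, None, 3.0], name="a"))
frame_summary = _summarise(pd.DataFrame({"a": [1, 2], "b": [None, "x"]}))
```

Callers can then iterate over the three lists without caring which container they came from.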

pandas/io/formats/info.py

MarcoGorelli · 2020-07-04T10:04:44Z

I've pushed a new attempt at this. I've factored out some pieces of _verbose_repr to avoid code duplication, although their argument lists and return values have become rather long.

WillAyd · 2020-09-10T18:48:14Z

@MarcoGorelli can you fix up the PR here? Someone will take a look again thereafter

MarcoGorelli · 2020-09-19T17:13:44Z

> @MarcoGorelli can you fix up the PR here? Someone will take a look again thereafter

@WillAyd Sure, have given another go at it, now using a namedtuple to group together some similar variables

WillAyd

I think this is headed in the right direction

WillAyd · 2020-09-21T21:38:45Z

pandas/io/formats/info.py

@@ -15,6 +15,32 @@
    from pandas.core.series import Series  # noqa: F401


+class CountConfigs(NamedTuple):


Can you generate this using collections.namedtuple instead? We typically don't subclass things from typing

Sure, can do, although I think this is the newer syntax; it's taken directly from https://docs.python.org/3/library/typing.html#typing.NamedTuple

They also recommend it in the mypy docs: https://mypy.readthedocs.io/en/stable/kinds_of_types.html#named-tuples
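For reference, the two spellings under discussion look like this side by side. The field names below are hypothetical, since the actual fields of `CountConfigs` are not shown in the diff; the class names are also made up so both variants can coexist in one snippet.

```python
from collections import namedtuple
from typing import NamedTuple


# Class-based syntax from the typing module (field names are hypothetical).
class CountConfigsTyped(NamedTuple):
    counts: list
    count_header: str
    space_count: int


# Functional syntax from collections, with the same hypothetical fields.
CountConfigsPlain = namedtuple(
    "CountConfigsPlain", ["counts", "count_header", "space_count"]
)

typed = CountConfigsTyped([3, 2], "Non-Null Count", 14)
plain = CountConfigsPlain([3, 2], "Non-Null Count", 14)

# Both are ordinary tuple subclasses with the same field names.
print(typed._fields == plain._fields)
print(isinstance(typed, tuple), isinstance(plain, tuple))
```

The main practical difference is that the class-based form records type annotations for each field, which is what type checkers such as mypy pick up.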

MarcoGorelli · 2020-10-07T16:39:54Z

(closing as @ivanovmg will take it over)

WillAyd requested changes Feb 8, 2020

pandas/core/frame.py

pandas/core/series.py

pandas/tests/series/test_repr.py

WillAyd added the API Design and Series labels Feb 8, 2020

MarcoGorelli force-pushed the series-info branch from cfa80b3 to 362a224 Compare February 9, 2020 11:44

MarcoGorelli changed the title from "(wip) ENH: add Series.info" to "ENH: add Series.info" Feb 9, 2020

jreback requested changes Feb 9, 2020

MarcoGorelli mentioned this pull request Feb 11, 2020

CLN: Move info #31876

Merged

Marco Gorelli added 4 commits February 19, 2020 10:05

first draft

2b1e5fc

add whatsnew

a4ad077

docstring sharing

c7bfb94

wip

01fd802

MarcoGorelli force-pushed the series-info branch from 4d88b07 to abbae9a Compare February 19, 2020 11:27

Marco Gorelli added 7 commits February 19, 2020 11:30

add series tests

abbae9a

formatting

1a474fe

formatting

b30ce1b

remove old file

6d8c765

clean

99411e4

add test

4651bd7

add test

7de4703

MarcoGorelli changed the title from "ENH: add Series.info" to "wip ENH: add Series.info" Feb 19, 2020

Marco Gorelli added 2 commits February 19, 2020 12:27

isort

99472fd

remove test

2902fe7

MarcoGorelli changed the title from "wip ENH: add Series.info" to "ENH: add Series.info" Feb 19, 2020

WillAyd requested changes Feb 20, 2020

MarcoGorelli added 2 commits February 23, 2020 11:51

Merge remote-tracking branch 'upstream/master' into series-info

8b8adfa

use isinstance abcdataframe, disallow max_cols for series.info

c6d8a76

MarcoGorelli commented Jul 2, 2020

pandas/io/formats/info.py

MarcoGorelli added 6 commits July 2, 2020 17:58

fix typing

6eccf00

factor out _display_counts_and_dtypes

81d22eb

fix typing, factor out _get_header_and_spaces

f2ca520

document _get_count_configs

669ff38

document _display_counts_and_dtypes and _get_header_and_spaces

6f8f8b1

remove breakpoints

97dc73c

MarcoGorelli marked this pull request as ready for review July 4, 2020 09:45

fix docstring substitution

0707f32

MarcoGorelli requested review from WillAyd and jreback July 4, 2020 10:03

Merge remote-tracking branch 'upstream/master' into series-info

2a2324b

Merge remote-tracking branch 'upstream/master' into series-info

0c08335

MarcoGorelli marked this pull request as draft September 14, 2020 07:51

MarcoGorelli added 6 commits September 19, 2020 16:35

Merge remote-tracking branch 'upstream/master' into series-info

c93f1ad

fix failing doctest

ddf9efc

use CountConfigs namedtuple

21d94b2

🔥

a213d9c

remove trailing comma

089ce24

fix failing doctest

4581385

MarcoGorelli marked this pull request as ready for review September 19, 2020 16:26

WillAyd requested changes Sep 21, 2020

ivanovmg mentioned this pull request Sep 30, 2020

REF: simplify info.py #36752

Merged

5 tasks

MarcoGorelli closed this Oct 7, 2020

MarcoGorelli deleted the series-info branch October 10, 2020 14:14

ivanovmg mentioned this pull request Oct 21, 2020

ENH: enable Series.info() #37320

Merged

5 tasks
