ENH: enable Series.info() #37320
@@ -97,6 +97,7 @@
from pandas.core.tools.datetimes import to_datetime

import pandas.io.formats.format as fmt
from pandas.io.formats.info import BaseInfo, SeriesInfo
import pandas.plotting

if TYPE_CHECKING:
@@ -4559,6 +4560,98 @@ def replace(
            method=method,
        )

    @Substitution(
        klass="Series",
        type_sub="",
        max_cols_sub="",
        null_counts_sub=dedent(
            """\
            null_counts : bool, default True
                Whether to show the non-null counts."""
        ),
        examples_sub=dedent(
            """\
            >>> int_values = [1, 2, 3, 4, 5]
            >>> text_values = ['alpha', 'beta', 'gamma', 'delta', 'epsilon']
            >>> s = pd.Series(text_values, index=int_values)
            >>> s.info()
            <class 'pandas.core.series.Series'>
            Int64Index: 5 entries, 1 to 5
            Series name: None
            Non-Null Count  Dtype
            --------------  -----
            5 non-null      object
            dtypes: object(1)
            memory usage: 80.0+ bytes

            Prints a summary excluding information about its values:

            >>> s.info(verbose=False)
            <class 'pandas.core.series.Series'>
            Int64Index: 5 entries, 1 to 5
            dtypes: object(1)
            memory usage: 80.0+ bytes

            Pipe the output of Series.info to a buffer instead of sys.stdout,
            get the buffer content and write it to a text file:

            >>> import io
            >>> buffer = io.StringIO()
            >>> s.info(buf=buffer)
            >>> s = buffer.getvalue()
            >>> with open("df_info.txt", "w",
            ...           encoding="utf-8") as f:  # doctest: +SKIP
            ...     f.write(s)
            260

            The `memory_usage` parameter allows deep introspection mode, which
            is especially useful for big Series and for fine-tuning memory
            optimization:

            >>> random_strings_array = np.random.choice(['a', 'b', 'c'], 10 ** 6)
            >>> s = pd.Series(random_strings_array)
            >>> s.info()
            <class 'pandas.core.series.Series'>
            RangeIndex: 1000000 entries, 0 to 999999
            Series name: None
            Non-Null Count    Dtype
            --------------    -----
            1000000 non-null  object
            dtypes: object(1)
            memory usage: 7.6+ MB

            >>> s.info(memory_usage='deep')
            <class 'pandas.core.series.Series'>
            RangeIndex: 1000000 entries, 0 to 999999
            Series name: None
            Non-Null Count    Dtype
            --------------    -----
            1000000 non-null  object
            dtypes: object(1)
            memory usage: 55.3 MB"""
        ),
        see_also_sub=dedent(
            """\
            Series.describe: Generate descriptive statistics of Series.
            Series.memory_usage: Memory usage of Series."""
        ),
        version_added_sub="\n.. versionadded:: 1.2.0\n",
    )
    @doc(BaseInfo.render)
    def info(
        self,
        verbose: Optional[bool] = None,
        buf: Optional[IO[str]] = None,
        max_cols: Optional[int] = None,
        memory_usage: Optional[Union[bool, str]] = None,
the docstring for DataFrame.info is
I think this should be
might be able to use Literal here (see #37137) and maybe create an alias in typing. follow-on OK too.

I tried to use Literal, but it looks like that is available only starting from Python 3.8.
        null_counts: bool = True,
Right, but one thing at a time. Will wait for the public API update first.
    ) -> None:
        return SeriesInfo(self, memory_usage).render(
            buf=buf,
            max_cols=max_cols,
            verbose=verbose,
            show_counts=null_counts,
        )

    def _replace_single(self, to_replace, method, inplace, limit):
        """
        Replaces values in a Series using the fill method specified when no
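The inline thread on the `memory_usage` annotation suggests `Literal`, and the reply notes that it only ships with the standard library from Python 3.8. A minimal sketch of what such an alias could look like follows. The alias name `MemoryUsage` and the helper `check_memory_usage` are hypothetical, and the `typing_extensions` fallback is an assumption about how older Pythons might be supported, not something this PR does.

```python
import sys
from typing import Optional, Union

# Literal is in the standard library from Python 3.8. On older versions the
# typing_extensions backport is one option (an assumption, not part of the PR).
if sys.version_info >= (3, 8):
    from typing import Literal
else:
    from typing_extensions import Literal

# Hypothetical alias: a plain bool, or exactly the string "deep".
MemoryUsage = Union[bool, Literal["deep"]]


def check_memory_usage(
    memory_usage: Optional[MemoryUsage] = None,
) -> Optional[MemoryUsage]:
    """Hypothetical runtime check that mirrors the annotation above."""
    if (
        memory_usage is None
        or isinstance(memory_usage, bool)
        or memory_usage == "deep"
    ):
        return memory_usage
    raise ValueError(
        f"memory_usage must be a bool or 'deep', got {memory_usage!r}"
    )
```

With an alias like this a type checker can reject `memory_usage="shallow"` statically, which the broader `Union[bool, str]` annotation cannot do.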
I don't think the Substitution decorator should be necessary with the doc decorator (and I have not seen them used together).
The doc decorator was created to supersede the Appender and Substitution decorators.
I guess that Substitution is still necessary if we use one generic docstring for both DataFrame.info and Series.info. I could not figure out how to replace some keywords in the base docstring to make it suitable for both frame and series.
Probably I do not know how to use the doc decorator.
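To illustrate the question in this thread, here is a small self-contained sketch of a single decorator that fills keyword placeholders in one shared docstring template for two classes. It is not pandas' `doc` implementation: `fill_doc`, `INFO_TEMPLATE`, the placeholder names `{klass}` and `{extra_params}`, and the toy classes are all hypothetical.

```python
from typing import Callable


def fill_doc(template: str, **params: str) -> Callable:
    """Hypothetical helper: format a shared docstring template with keywords."""

    def decorator(func: Callable) -> Callable:
        # str.format substitutes each {placeholder} with the matching keyword.
        func.__doc__ = template.format(**params)
        return func

    return decorator


# One generic template shared by both classes. The placeholder names are
# illustrative, not the ones used in pandas.
INFO_TEMPLATE = """\
Print a concise summary of a {klass}.

Parameters
----------
verbose : bool, optional
    Whether to print the full summary.{extra_params}
"""


class Frame:
    @fill_doc(
        INFO_TEMPLATE,
        klass="DataFrame",
        extra_params=(
            "\nmax_cols : int, optional"
            "\n    When to switch to truncated output."
        ),
    )
    def info(self) -> None:
        pass


class Ser:
    # The Series variant passes an empty string to drop the extra parameter.
    @fill_doc(INFO_TEMPLATE, klass="Series", extra_params="")
    def info(self) -> None:
        pass
```

Passing an empty string for a placeholder is the same trick the diff uses with `type_sub=""` and `max_cols_sub=""`: the frame-only text simply disappears from the Series docstring.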