Skip to content

ENH: show numbers on .info() with verbose flag #28876

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 17 commits into from
Jan 3, 2020

Conversation

Roymprog
Copy link
Contributor

@Roymprog Roymprog commented Oct 9, 2019

@WillAyd
Copy link
Member

WillAyd commented Oct 10, 2019

This is a duplicate of #28696 which was a continuation of #17332 - did you address comments from those? @Alexandreae were you still working on that?

@Roymprog
Copy link
Contributor Author

Apologies, I completely missed this entire discussion. I was not aware someone else was working on it. I did see the comment left in #17304, but I started working on it as that was months ago.

Let me have a look at the comments, then I'll finish up. Still, you can close this if @Alexandreae responds.

@pep8speaks
Copy link

pep8speaks commented Oct 11, 2019

Hello @Roymprog! Thanks for updating this PR. We checked the lines you've touched for PEP 8 issues, and found:

There are currently no PEP 8 issues detected in this Pull Request. Cheers! 🍻

Comment last updated at 2020-01-02 20:42:37 UTC

@Roymprog
Copy link
Contributor Author

All the suggestions from #17332 have been implemented. Please let me know what you think

@WillAyd WillAyd added the DataFrame DataFrame data structure label Oct 11, 2019
Copy link
Member

@WillAyd WillAyd left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cool thanks for picking this up. cc @bashtage who had some thoughts on original design

@bashtage
Copy link
Contributor

Output looks pretty good and better than current. Does it work well with very long variable names? Or other long info properties? Wondering since the ----- bits are new.

@Roymprog
Copy link
Contributor Author

Sample for long column name variables.
Script:

col_count = 10
max_col_name = 80
df = pd.DataFrame(np.random.rand(4, col_count), columns=[
          '%s' % (np.random.randint(2, max_col_name)*'a') for x in range(col_count)])
df.info(verbose=True)

Output:

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 4 entries, 0 to 3
Data columns (total 10 columns):
 #   Column                                                                         Non-Null Count  Dtype  
---  ------                                                                         --------------  -----  
 0   aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa                                               4 non-null      float64
 1   aaaaaaaaaaaaaaaaaaaaaaaaaaaa                                                   4 non-null      float64
 2   aaaaaaaa                                                                       4 non-null      float64
 3   aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa  4 non-null      float64
 4   aaaaaaaaaaaaaaaaaaaaaaaaaaaaa                                                  4 non-null      float64
 5   aaaaaaaaaaaaaaaaaaaaaaaaaaa                                                    4 non-null      float64
 6   aaaaaaaaaaaaaaa                                                                4 non-null      float64
 7   aaaaaaaaa                                                                      4 non-null      float64
 8   aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa                     4 non-null      float64
 9   aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa   4 non-null      float64

Regarding dtypes, there's a test, and for non-null count there is a docstring test. For the numbering I added an assertion in the verbosity test, as I did not feel like posting a thousand line output here.

Copy link
Contributor

@jreback jreback left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this needs some eye balls in it from @pandas-dev/pandas-core

@bashtage
Copy link
Contributor

Look readable in this extreme case.

Copy link
Member

@WillAyd WillAyd left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is nice and improves readability

@WillAyd
Copy link
Member

WillAyd commented Oct 22, 2019

@pandas-dev/pandas-core anyone else care to take a look?

@WillAyd WillAyd added this to the 1.0 milestone Oct 22, 2019
@WillAyd
Copy link
Member

WillAyd commented Oct 25, 2019

ping @pandas-dev/pandas-core for any other comments

Copy link
Member

@mroeschke mroeschke left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall an improvement from the prior output. One small comment.

@WillAyd
Copy link
Member

WillAyd commented Oct 29, 2019

@jreback care to take a look or merge?

@jreback
Copy link
Contributor

jreback commented Oct 29, 2019

yep will again

@@ -50,6 +50,8 @@ including other versions of pandas.
Enhancements
~~~~~~~~~~~~

- :meth:`Dataframe.info` now shows line numbers for the columns summary (:issue:`17304`)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you make a small sub-section showing these changes. move to the api breaking section as this is a change in the output repr.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added the difference to the Backwards incompatible API changes section. Hope that is what you meant

@WillAyd
Copy link
Member

WillAyd commented Nov 19, 2019

@Roymprog looks like some merge conflicts - can you clean up and re-push?

@WillAyd
Copy link
Member

WillAyd commented Dec 17, 2019

@Roymprog can you fix up merge conflict?

@Roymprog
Copy link
Contributor Author

I will in the coming week

@WillAyd
Copy link
Member

WillAyd commented Jan 2, 2020

@jreback any outstanding thoughts on this?

Copy link
Contributor

@jreback jreback left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm. small doc-presentation change. ping on green.

@@ -292,6 +292,34 @@ New repr for :class:`~pandas.arrays.IntervalArray`

pd.arrays.IntervalArray.from_tuples([(0, 1), (2, 3)])

- :meth:`Dataframe.info` now shows line numbers for the columns summary (:issue:`17304`)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you add a title and underline here (like other sub-sections).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added an underlined title for the new feature, thanks for the feedback

@TomAugspurger
Copy link
Contributor

@Roymprog CI failure:

Check for use of not concatenated strings
[error]./pandas/tests/frame/test_repr_info.py:216:String unnecessarily split in two by black. Please merge them manually.
Check for use of not concatenated strings DONE

@jreback jreback merged commit 1e32421 into pandas-dev:master Jan 3, 2020
@jreback
Copy link
Contributor

jreback commented Jan 3, 2020

thanks @Roymprog very . nice!

@Roymprog
Copy link
Contributor Author

Roymprog commented Jan 3, 2020

Thanks for the fix @TomAugspurger.

Thank you all for all the feedback I received. This was my first open source pull request, thanks for making it happen 😃

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
DataFrame DataFrame data structure
Projects
None yet
Development

Successfully merging this pull request may close these issues.

feature wanted: pd.DataFrame.info() should show line numbers
7 participants