New DataFrame display information? #6547
started in 0.13, you can turn off if you want by: http://pandas.pydata.org/pandas-docs/stable/whatsnew.html#output-formatting-enhancements (I think set |
I figured you could, but that's a global option that we can't control after we pass back to a user. We're trying more and more to take advantage of DataFrames as containers to display information, to_latex, to_csv, etc. and this is just noise here. It is descriptive information, which is why I think it should be in |
Also, |
this is only for interactive display, e.g. it's actually quite helpful when you have the default to display the actual data, but the frame itself is truncated (e.g. you are not using info, which you always have the option to do of course)
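A minimal sketch of the "only for interactive display" point, assuming a current pandas where the footer is controlled by the `display.show_dimensions` option (a name the thread itself does not spell out). It compares the repr of a frame with its CSV export; the frame and variable names are made up for illustration.

```python
import pandas as pd

# Hypothetical frame, long enough to be truncated at max_rows=10.
df = pd.DataFrame({"x": range(100)})

# Force the footer on so the comparison does not depend on the default.
with pd.option_context("display.max_rows", 10, "display.show_dimensions", True):
    interactive = repr(df)   # what a console session would print
    exported = df.to_csv()   # what would be written to a file

print(interactive.splitlines()[-1])  # last line of the console repr
print("rows x" in exported)          # whether the footer leaked into the CSV
```

The export path never goes through the repr machinery, so the display option only affects what is printed at the prompt.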
Yes, I'm not saying it affects anything else. I'm saying that we like those features, but we also like a clean It used to be that by default when the frame was truncated, that you got the But, again, as you pointed out, you always have I hate to nit-pick here, but this is a pretty jarring visual change from what I'm used to, and there's not much gained. |
I think there was a bit of discussion on the linked issues. http://pandas.pydata.org/pandas-docs/dev/whatsnew.html#dataframe-repr-changes. I personally like it, but I can see your point. Hard to have everyone like everything all the time!
@jseabold Why not just return a subclass of DataFrame with the repr methods changed? Maybe the |
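A rough sketch of what the subclass suggestion could look like, assuming a current pandas with the `display.show_dimensions` option. `QuietFrame` is a hypothetical name, not something from pandas or statsmodels, and only the plain-text repr is handled here.

```python
import pandas as pd


class QuietFrame(pd.DataFrame):
    """Hypothetical DataFrame subclass whose text repr omits the dimensions footer."""

    @property
    def _constructor(self):
        # Keep the subclass through operations such as head() or slicing.
        return QuietFrame

    def __repr__(self):
        # Override the display option only while this object is rendered,
        # so the user's global setting is left untouched.
        with pd.option_context("display.show_dimensions", False):
            return super().__repr__()


qf = QuietFrame({"x": range(100)})
print(qf)
```

This keeps control with the library returning the frame rather than with a global option, at the cost of carrying a subclass around. A notebook would also need `_repr_html_` handled in the same way.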
IMO the case for including that extra information has not been made well enough for me to start subclassing DataFrames to avoid seeing a piece of information I don't want or need. Are there any other data structures out there that do this? Matlab? data.frames? Numpy arrays? I think scipy.sparse matrices do, but are we really saying that a DataFrame that's truncated on output is like a sparse matrix? Originally, DFs showed the |
@jseabold My opinion is that when no truncation of the dataframe occurs, printing the dimensions is in fact noise. When truncation does occur, I do believe that it makes sense to print the dimensions of the full frame. |
Yeah, I could live with that. I won't back up to argue whether there should even be this |
@jreback What do you think about that? |
I think these discussions are personal preferences and have long been discussed. No one is ever going to be happy, so I will wait for more consensus before potentially changing this again.
Ok, then here's another gripe. Discussions on GitHub are not public enough IMO. There is a lot of GitHub noise from pandas. I try my best to keep up with development, but I don't see this until I install it. It seems to me, you're saying "I participated in the discussion, and it went on for a long time, so that's that." This was discussed by 3 people on #4886 and #5550. @y-p made some strenuous initial objections (which are in line with my initial reaction) and then withdrew them. Particularly this bit
There are several other sensible comments to this effect. It would've been great to see a ping on the mailing list about this. Something. Anything. Maybe I missed it and it's my fault. But, hey, let's think of this release as the usability study. Coming back to the point, the issue of the footer was discussed only in passing on #5550. And here's what it says (my emphasis).
@jseabold all for more discussion esp on UX pls post an issue to the mailing list if u would then |
FWIW, I completely agree with @jseabold that GitHub discussions aren't really sufficient notification of any large or potentially breaking changes. A case in point: I knew nothing about these UX changes until I upgraded, and I am only here now because I saw Skipper's post to the mailing list. I have no strong opinion about these changes, though Skipper's suggestion of only printing the shape when truncated sounds sensible to me. One thing I've noticed is that the truncated view displays the first
@dhirschfeld funny thing is I just referenced your comment on the ML...hahaha... there is an issue to do exactly this type of 2-side truncation display, see here: #5603
@jseabold @dhirschfeld what is the decision on this? |
If I look at the discussion on the mailing list and here, I think there is a majority for the option "only show dimensions when truncated" (4 votes against 1 or 2 for "always show dimensions" on the mailing list, and also two extra votes here for "only show dimensions when truncated"). And I think I am also +1 on only showing the dimensions when the dataframe is truncated. BTW, R does something vaguely similar. It also truncates the output of a very large dataframe (only at a much higher limit) and then you get the message
@takluyver You were the author of the display changes, what do you think?
I quite like seeing the number of rows even when it's not truncated. When I'm not going to fight this if people want to change it, but I will note |
@takluyver would you mind repeating your comments to the mailing list issue? |
Can you point me to it? I'm not on the pandas mailing list. |
ok after reading the thread...seems consensus is to allow: to be:
good? |
What will be the new default?
I wouldn't change the default (of course that is the point of an option, users can change). @jseabold points out that 0.13 made an API change where this did change, but that was a conscious decision. |
Getting consensus on allowing the behaviour to be changed via an option is not that difficult, I think :-) I also don't know if it was that conscious. OK, it was discussed and clearly chosen to show a truncated view, but not really when the dimensions should be shown (see e.g. the comment of @jseabold above, #6547 (comment)). But of course, the options can already be implemented. Then changing the default if a clear consensus arises is trivial.
are there any other options under consideration that I didn't point out?
No, I think those three options (always/never/truncated) are all relevant options. |
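For readers trying the three behaviours side by side, here is a small sketch. It assumes a current pandas, where the `display.show_dimensions` option accepts `True`, `False`, or `"truncate"`; that option name is taken from pandas' options documentation, not from this thread, and `shows_dimensions` is a hypothetical helper.

```python
import pandas as pd


def shows_dimensions(frame, mode):
    """Return True if the repr of `frame` ends up containing the dimensions footer."""
    with pd.option_context("display.max_rows", 10, "display.show_dimensions", mode):
        return "rows x" in repr(frame)


small = pd.DataFrame({"a": range(3)})    # fits within max_rows, not truncated
large = pd.DataFrame({"a": range(100)})  # exceeds max_rows, truncated

# always / never / only when truncated
for mode in (True, False, "truncate"):
    print(mode, shows_dimensions(small, mode), shows_dimensions(large, mode))
```

Using `option_context` keeps each setting local to the comparison, so the session's global display options are unchanged afterwards.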
We are using DataFrames to hold information in a lot of places now. E.g., ANOVA tables
Where'd that bottom [] business come from? Is this new? Isn't this something that's better included in the info method?