ENH: Expanded display of dataframe, akin to postgres \x #38827

samzhang111 · 2020-12-30T18:31:49Z

Is your feature request related to a problem?

When viewing large dataframes in a terminal, it is often preferable to display the data in "expanded format" or "long format" (as opposed to "wide"), essentially the result of melting every column. This makes the entire dataframe less sensitive to the width of the terminal.

Describe the solution you'd like

Postgres's psql client handles this nicely with its expanded mode (the \x meta-command, or the -x / --expanded command-line flag). To take an arbitrary Postgres example, the following table

 id | time  |       humanize_time             | value 
----+-------+---------------------------------+-------
  1 | 09:30 |  Early Morning - (9.30 am)      |   570
  2 | 11:30 |  Late Morning - (11.30 am)      |   690
  3 | 13:30 |  Early Afternoon - (1.30pm)     |   810
  4 | 15:30 |  Late Afternoon - (3.30 pm)     |   930
(4 rows)

becomes printed as

-[ RECORD 1 ]-+---------------------------
id            | 1
time          | 09:30
humanize_time | Early Morning - (9.30 am)
value         | 570
-[ RECORD 2 ]-+---------------------------
id            | 2
time          | 11:30
humanize_time | Late Morning - (11.30 am)
value         | 690
-[ RECORD 3 ]-+---------------------------
id            | 3
time          | 13:30
humanize_time | Early Afternoon - (1.30pm)
value         | 810
-[ RECORD 4 ]-+---------------------------
id            | 4
time          | 15:30
humanize_time | Late Afternoon - (3.30 pm)
value         | 930

API breaking implications

I don't see there being any.

Additional context

My naive suggestion would be to add a global option that turns this on, such as

pd.set_option('display.expanded_mode', True)

Under the hood, this can be as simple as melting the dataframe and printing each record with a separator between them. However, I am not familiar with the intricacies of the display logic and leave this here for others' consideration.
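To make the idea above concrete, here is a minimal sketch of what such a display could look like when done outside pandas. The name format_expanded is hypothetical and not part of pandas; the layout only approximates psql's expanded output, and the separator width is an arbitrary choice.

```python
import pandas as pd


def format_expanded(df: pd.DataFrame) -> str:
    """Render each row of df as a psql-style record block (hypothetical helper)."""
    names = [str(col) for col in df.columns]
    # Convert every cell to text up front so widths can be measured.
    rows = [[str(v) for v in row] for row in df.itertuples(index=False, name=None)]
    name_width = max((len(n) for n in names), default=0)
    value_width = max((len(v) for row in rows for v in row), default=0)

    lines = []
    for number, row in enumerate(rows, start=1):
        # Separator: record header padded with dashes up to the "|" column.
        header = f"-[ RECORD {number} ]".ljust(name_width + 1, "-")
        lines.append(f"{header}+{'-' * (value_width + 1)}")
        for name, value in zip(names, row):
            lines.append(f"{name.ljust(name_width)} | {value}")
    return "\n".join(lines)


df = pd.DataFrame(
    {
        "id": [1, 2],
        "time": ["09:30", "11:30"],
        "humanize_time": ["Early Morning - (9.30 am)", "Late Morning - (11.30 am)"],
        "value": [570, 690],
    }
)
print(format_expanded(df))
```

Because every record is printed whole, the output never breaks in the middle of a record, and its width depends only on the longest column name and the longest value.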


jreback · 2020-12-30T22:47:46Z

isn't it just this

In [168]: df = pd.DataFrame({'A': pd.date_range('20200101', periods=3), 'B': list('abc'), 'C': [1,2,3]})                                               

In [169]: df                                                                                                                                           
Out[169]: 
           A  B  C
0 2020-01-01  a  1
1 2020-01-02  b  2
2 2020-01-03  c  3

In [170]: df.stack()                                                                                                                                   
Out[170]: 
0  A    2020-01-01 00:00:00
   B                      a
   C                      1
1  A    2020-01-02 00:00:00
   B                      b
   C                      2
2  A    2020-01-03 00:00:00
   B                      c
   C                      3
dtype: object

jreback · 2020-12-30T22:48:04Z

Output formatting is already quite complex, so we would need a really good reason.

samzhang111 · 2020-12-31T19:15:22Z

Thanks, stack is definitely a better description of what I want than melt.

The output of stack is just a Series, so it has the same limitations as printing any Series directly to the console: we may not see even one entire record before the output is truncated, the truncation can fall in the middle of a record, and by default individual entries are cut off with ellipses at a fairly short maximum length.

Thus this proposal is really to add an output setting where dataframes are displayed in stacked format, but, say, showing at least K records (which is different from K rows of the Series).
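As a rough approximation with existing tools, one can take the first K records before stacking and lift the display limits only while building the output. The sketch below assumes a hypothetical helper named stacked_head; display.max_rows and display.max_colwidth are existing pandas options, and setting them to None removes the corresponding limit. How stack treats missing values may depend on the pandas version.

```python
import pandas as pd


def stacked_head(df: pd.DataFrame, k: int = 5) -> str:
    """Return the repr of the first k records of df in stacked form (hypothetical helper)."""
    # Slice by record first, so the output always ends on a record boundary.
    stacked = df.head(k).stack()
    # Temporarily lift row and column-width limits for this repr only.
    with pd.option_context("display.max_rows", None, "display.max_colwidth", None):
        return repr(stacked)


df = pd.DataFrame({"id": range(100), "label": [f"row {i}" for i in range(100)]})
print(stacked_head(df, k=40))
```

Here K counts records rather than rows of the resulting Series: 40 records of a two-column dataframe produce 80 rows, which would be truncated under the default row limit.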

I can't claim this to be a "really" good reason, just that it's something that would be slick and convenient. It's one of my favorite features in the postgres console! I will play with my personal settings and just assume this isn't something that will likely be worked on, though. Thanks for the reply!

mroeschke · 2021-08-14T22:57:13Z

Thanks for the report, but I agree: since stack outputs a very similar result, I think this feature would be best implemented in an external library, as pandas aims to keep its direct display APIs limited.

Closing, but happy to reopen if there is renewed interest from the other core devs and community.

samzhang111 added the Enhancement and Needs Triage labels Dec 30, 2020

jreback added the Output-Formatting label and removed the Needs Triage label Dec 30, 2020

jreback mentioned this issue Aug 2, 2021

ENH: A new method that will more efficiently display 'tall' df #42837

Closed

mroeschke closed this as completed Aug 14, 2021
