Styling console/terminal output #18066

Open
s-celles opened this issue Nov 1, 2017 · 18 comments
Labels
Enhancement Output-Formatting __repr__ of pandas objects, to_string

Comments

@s-celles
Contributor

s-celles commented Nov 1, 2017

Hello,

Since v0.17.1, pandas has had an interesting API for styling the Jupyter HTML output of Series / DataFrames:
https://pandas.pydata.org/pandas-docs/stable/style.html

It would be a nice feature to be able to style console/terminal output as well.
I wonder if such a feature should be part of pandas core or live in another package (like tabulate from @astanin, which was mentioned in #11052).

There is also PrettyPandas from @HHammond

Kind regards

@gfyoung gfyoung added Enhancement Needs Discussion Requires discussion from core team before further action Output-Formatting __repr__ of pandas objects, to_string labels Nov 2, 2017
@gfyoung
Member

gfyoung commented Nov 3, 2017

@scls19fr : Thanks for the report! I think in the interest of modularity, we would probably want to move away from incorporating giant chunks of code into our codebase.

That being said, we could add functions that call into something like tabulate or PrettyPandas (assuming they're installed). However, we would probably need to choose one or the other (or make a uniform API to be able to call both).

@cbrnr
Contributor

cbrnr commented Mar 7, 2018

RStudio adds color to the output of a data frame in the console:

[Screenshot: colored data frame output in the RStudio console]

Since IPython now supports syntax highlighting in a normal console, I was wondering if something similar was possible in pandas. This would likely not require any additional packages; adding color codes in the __str__ or __repr__ methods could be all that needs to be done.

@jreback
Contributor

jreback commented Mar 7, 2018

Yes, this certainly could be done. It would likely take a separate formatting method (just to avoid having the string / HTML / color-string code all in one place).

@cbrnr
Contributor

cbrnr commented Mar 7, 2018

Great! What do you mean by a separate method? Ideally, color strings should be used when I type df in IPython, so this has to be in __repr__, right?

@cbrnr
Contributor

cbrnr commented Mar 27, 2018

ANSI escape color codes work, so it is pretty straightforward to get colorized output, e.g.:

[Screenshot: colorized DataFrame output in a terminal, 2018-03-27]

The bigger questions are:

  • Where do we want to add this functionality? Currently, I'm modifying the string representation in pandas.io.formats.format.to_string (before strcols get combined into text).
  • What kind of implementation would be suitable? Do we add a helper function (e.g. colorize)?
  • What are the defaults and how do you change them? E.g. there are white on black terminals, black on white terminals, and everything in between.
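
A minimal sketch of what such a helper could look like, using only the standard library. The name colorize is taken from the question above; the color table, the enabled parameter, and the terminal check are assumptions made for illustration and are not an existing pandas API.

```python
import sys

# Standard ANSI SGR foreground codes (a small illustrative subset).
ANSI_CODES = {"red": 31, "green": 32, "yellow": 33, "blue": 34}
RESET = "\x1b[0m"


def colorize(text, color, enabled=None):
    """Wrap text in ANSI escape codes for the given color name.

    If enabled is None, color is emitted only when stdout is a terminal,
    so piped or redirected output stays plain.
    """
    if enabled is None:
        enabled = sys.stdout.isatty()
    if not enabled:
        return text
    return f"\x1b[{ANSI_CODES[color]}m{text}{RESET}"


print(colorize("-1.25", "red", enabled=True))
```

Keeping the switch in a single parameter would let a display option control it later; which colors make sensible defaults on light and dark terminals remains an open question.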

@TomAugspurger
Contributor

Where do we want to add this functionality?

Is this going to be doable with how to_string is currently implemented? From what I recall, that's a bit of a minefield.

Styler used jinja, but I'm not sure what our appetite is for adopting that as a full dependency.

@cbrnr
Contributor

cbrnr commented Mar 27, 2018

I guess it's doable in to_string, but carrying around defaults and arguments might require some thinking. I wouldn't call it a minefield, but the way the final string is constructed forced me to apply colors in the previous step (in the list of str_cols).

If there is an external tool that can do this coloring for us, we should use it if it means less work implementing it. It could be an optional dependency, in the sense that if it's not installed there will only be plain non-colored output.
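
The optional-dependency idea could be sketched roughly as follows. colorama is used here only as an example of such an external tool, and the highlight function is a hypothetical name; nothing here reflects how pandas actually handles optional imports.

```python
try:
    # colorama is treated as optional here; this import may fail.
    from colorama import Fore, Style
    RED, RESET = Fore.RED, Style.RESET_ALL
except ImportError:
    # Fallback: empty strings, so the output is plain and uncolored.
    RED, RESET = "", ""


def highlight(text):
    """Return text in red if colorama is available, unchanged otherwise."""
    return f"{RED}{text}{RESET}"


print(highlight("negative value"))
```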

@rswgnu

rswgnu commented May 5, 2019

@cbrnr wrote:

ANSI escape color codes work, so it is pretty straightforward to get colorized output

Say I wanted to colorize just specific table cells upon output to a terminal to highlight them, based on a list of iloc-compatible cell locations. How would I do that? I just need some pointers on what Pandas functions to tweak to allow this. Thanks.
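
One way to approach this outside of pandas, as a rough sketch: assume the table has already been converted to rows of strings, pad every cell first, and only then wrap the selected cells in escape codes. The render function and its parameters are hypothetical names for illustration.

```python
def render(rows, highlight, color="\x1b[31m", reset="\x1b[0m"):
    """Render rows of strings as aligned text, coloring selected cells.

    highlight is a set of (row, col) integer positions, in the spirit
    of iloc indexing. Cells are padded first and colored afterwards,
    so escape codes never affect the column widths.
    """
    widths = [max(len(cell) for cell in col) for col in zip(*rows)]
    lines = []
    for i, row in enumerate(rows):
        cells = []
        for j, cell in enumerate(row):
            padded = cell.rjust(widths[j])
            if (i, j) in highlight:
                padded = f"{color}{padded}{reset}"
            cells.append(padded)
        lines.append("  ".join(cells))
    return "\n".join(lines)


table = [["a", "b"], ["1.5", "-20.0"], ["3.25", "4.0"]]
print(render(table, highlight={(1, 1)}))
```

This sidesteps the pandas formatting code entirely, so it does not answer which pandas functions would need to change.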

@texxronn

texxronn commented Jun 13, 2019

I could do something like

from colorama import Fore, Style

# df is an existing DataFrame and c is the label of the column to color
df[c] = Fore.RED + Style.BRIGHT + df[c].astype(str) + Style.RESET_ALL
print(df)

@cbrnr
Contributor

cbrnr commented Jun 13, 2019

Interesting. This was already suggested in #459, but never implemented.

@rswgnu

rswgnu commented Jun 13, 2019 via email

@ghost711

ghost711 commented Aug 17, 2019

I was looking for a way to use ANSI color codes in the terminal or qtconsole for a long time, and finally put this together. The main problem was that ANSI codes were being incorporated into the print width calculation, which messed up registration of the columns.

This is too hacky for a pull request, but it solves my problem, so I thought I'd post it in case it helps anyone else.

You can replace the "TextAdjustment" class with the version below in this file:
site-packages/pandas/io/formats/format.py

class TextAdjustment(object): 
    def __init__(self):
        import re
        self.ansi_regx = re.compile(r'\x1B[@-_][0-?]*[ -/]*[@-~]')
        self.encoding  = get_option("display.encoding")
    
    def len(self, text):  
        return compat.strlen(self.ansi_regx.sub('', text), 
                             encoding=self.encoding) 
            
    def justify(self, texts, max_len, mode='right'):       
        jfunc = str.ljust if (mode == 'left')  else \
                str.rjust if (mode == 'right') else str.center     
        out = []
        for s in texts:
            escapes = self.ansi_regx.findall(s)    
            if len(escapes) == 2:
                out.append(escapes[0].strip() + 
                           jfunc(self.ansi_regx.sub('', s), max_len) + 
                           escapes[1].strip()) 
            else:
                out.append(jfunc(s, max_len)) 
        return out
      
    def _join_unicode(self, lines, sep=''):
        try:
            return sep.join(lines)
        except UnicodeDecodeError:
            sep = compat.text_type(sep)
            return sep.join([x.decode('utf-8') if isinstance(x, str) else x
                                                            for x in lines])
    
    def adjoin(self, space, *lists, **kwargs): 
        # Add space for all but the last column: 
        pads = ([space] * (len(lists) - 1)) + [0] 
        max_col_len = max([len(col) for col in lists])
        new_cols = []
        for col, pad in zip(lists, pads): 
            width = max([self.len(s) for s in col]) + pad
            c     = self.justify(col, width, mode='left')
            # Add blank cells to end of col if needed for different col lens: 
            if len(col) < max_col_len:
                c.extend([' ' * width] * (max_col_len - len(col)))
            new_cols.append(c)
             
        rows = [self._join_unicode(row_tup) for row_tup in zip(*new_cols)] 
        return self._join_unicode(rows, sep='\n') 
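
The core idea in the class above, measuring width only after stripping escape sequences, can be illustrated in isolation. This is a standalone sketch with hypothetical helper names (visible_len, ljust_visible), not part of pandas.

```python
import re

# Same pattern as in the class above: matches ANSI escape sequences.
ANSI_RE = re.compile(r"\x1B[@-_][0-?]*[ -/]*[@-~]")


def visible_len(text):
    """Length of text as displayed, ignoring ANSI escape sequences."""
    return len(ANSI_RE.sub("", text))


def ljust_visible(text, width):
    """Left-justify text to a visible width, keeping escapes intact."""
    return text + " " * max(0, width - visible_len(text))


red = "\x1b[31m" + "abc" + "\x1b[0m"
print(len(red), visible_len(red))    # raw length vs. displayed length
print(repr(ljust_visible(red, 6)))
```

Here the padding is appended outside the escape sequences, so any number of escapes per cell works; the justify method above only special-cases cells containing exactly two.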

@Beanking77

I had a similar requirement to colorize a specific column, and here's my workaround:

from colorama import Fore, Back, Style

def color_red_green(val):
    if val < 0:
        color = Fore.GREEN
    else:
        color = Fore.RED
    return color + str('{0:.2%}'.format(val)) + Style.RESET_ALL

# apply to specific column
dfs["percent"] = dfs["percent"].apply(color_red_green)

thanks @texxronn

@cscanlin-kwh

I'm trying to use colorama on a pandas DataFrame, but I'm running into the same problem with misaligned printing mentioned above: #18066 (comment)

Does anyone know of a way to patch this behavior in without modifying pandas itself?

I see pandas.DataFrame.to_string has a formatters parameter, but it's not clear to me how to use it.

@TomAugspurger
Contributor

@cscanlin I think pandas will need to be updated to handle this. #30778 had a start, which allows things like

I'm not planning to return to that anytime soon, so feel free to take over if you want.

@mroeschke mroeschke removed the Needs Discussion Requires discussion from core team before further action label Jun 12, 2021
@Erotemic
Contributor

Erotemic commented Jul 4, 2021

I've updated the workaround from @ghost711 to work with pandas 1.2.4. Hopefully this feature will be supported in the main branch somehow, but in the meantime:

def monkeypatch_pandas():
    """
    References:
        https://github.com/pandas-dev/pandas/issues/18066
    """
    import pandas.io.formats.format as format_
    from six import text_type

    # Made wrt pd.__version__ == '1.2.4'

    class TextAdjustmentMonkey(object):
        def __init__(self):
            import re
            self.ansi_regx = re.compile(r'\x1B[@-_][0-?]*[ -/]*[@-~]')
            self.encoding  = format_.get_option("display.encoding")

        def len(self, text):
            return len(self.ansi_regx.sub('', text))

        def justify(self, texts, max_len, mode='right'):
            jfunc = str.ljust if (mode == 'left')  else \
                    str.rjust if (mode == 'right') else str.center
            out = []
            for s in texts:
                escapes = self.ansi_regx.findall(s)
                if len(escapes) == 2:
                    out.append(escapes[0].strip() +
                               jfunc(self.ansi_regx.sub('', s), max_len) +
                               escapes[1].strip())
                else:
                    out.append(jfunc(s, max_len))
            return out

        def _join_unicode(self, lines, sep=''):
            try:
                return sep.join(lines)
            except UnicodeDecodeError:
                sep = text_type(sep)
                return sep.join([x.decode('utf-8') if isinstance(x, str) else x
                                                                for x in lines])

        def adjoin(self, space, *lists, **kwargs):
            # Add space for all but the last column:
            pads = ([space] * (len(lists) - 1)) + [0]
            max_col_len = max([len(col) for col in lists])
            new_cols = []
            for col, pad in zip(lists, pads):
                width = max([self.len(s) for s in col]) + pad
                c     = self.justify(col, width, mode='left')
                # Add blank cells to end of col if needed for different col lens:
                if len(col) < max_col_len:
                    c.extend([' ' * width] * (max_col_len - len(col)))
                new_cols.append(c)

            rows = [self._join_unicode(row_tup) for row_tup in zip(*new_cols)]
            return self._join_unicode(rows, sep='\n')

    format_.TextAdjustment = TextAdjustmentMonkey

@brechtm

brechtm commented Nov 4, 2022

For what it's worth, I found it pretty frustrating that even something basic like representing NaNs differently on the terminal is impossible. For HTML and LaTeX output, there's the Styler and its na_rep argument, but there doesn't seem to be something equivalent for terminal output.

I'm currently working around this by replacing NaNs with math.inf (using fillna) and providing a custom float_format function:

from math import isinf

import pandas as pd

def float_format(value: float):
    # Render infinities (standing in for NaNs) as empty strings
    if isinf(value):
        return ''
    return str(value)

pd.options.display.float_format = float_format

A simple solution would be to also pass NaNs through the float_format function, but I guess that's not an option considering backward compatibility.

@cbrnr
Contributor

cbrnr commented Nov 4, 2022

I still think that rendering the output with Rich would be the best option. Basically, pd.DataFrames could have a __rich_repr__() method, which is responsible for the repr when Rich is available (https://rich.readthedocs.io/en/stable/pretty.html). Unfortunately, I don't have time to implement it myself at the moment.
