Series repr html only #29383

big-o · 2019-11-03T15:31:41Z

closes Series do not display HTML repr #5563
tests added / passed
passes black pandas
passes git diff upstream/master -u -- "*.py" | flake8 --diff
whatsnew entry

big-o · 2019-11-03T15:39:07Z

Wasn't sure where to add the whatsnew entry so left it out for now. Grateful for any advice.

big-o · 2019-11-03T15:41:00Z

This PR supersedes #29248, which was a messy attempt at creating full HTML functionality for the Series class by creating a new HTMLColumnFormatter class and changing SeriesFormatter. This PR is much smaller in scope and hopefully easier to debug/merge/maintain. The scope can be expanded later to give full to_html functionality.

big-o · 2019-11-03T15:45:31Z

The HTML rendering looks just like that proposed in #29248. However, rather than doing this via an independent formatter, it just uses DataFrame.to_html without any header or footer information, and then inserts a new footer into the HTML.
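A minimal sketch of the approach described above, not the code from this PR: it assumes the public `Series.to_frame()` and `DataFrame.to_html(header=False)` calls, and the helper name `series_to_html_sketch` is hypothetical. The footer here carries only the dtype.

```python
import pandas as pd


def series_to_html_sketch(series: pd.Series) -> str:
    """Hypothetical helper: render a Series via a one-column DataFrame."""
    # Render without the column header row, then work line by line.
    html = series.to_frame().to_html(header=False).split("\n")

    # Find the line that closes the table; the footer goes right after it.
    tbl_end = max(i for i, line in enumerate(html) if line.strip() == "</table>")

    # Assumption: a dtype-only footer, mirroring the plain-text Series repr.
    footer = "dtype: <tt>{dtype}</tt>".format(dtype=series.dtype)
    html.insert(tbl_end + 1, "<p>{footer}</p>".format(footer=footer))
    return "\n".join(html)


html = series_to_html_sketch(pd.Series([1.5, 2.5], name="x"))
print(html)
```

Searching the rendered lines for the closing tag is what makes this approach compact, and it is also the part reviewers later describe as fragile.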

WillAyd · 2019-11-04T22:16:15Z

pandas/core/series.py

+                table_id=None,
+                render_links=False,
+            )
+            html = formatter.to_html(notebook=True).split("\n")


Is this approach drastically different than what was proposed in #27228 ?

big-o · Nov 5, 2019

Not really; the only difference is that this uses the Formatter class directly rather than just invoking it via to_html. This is to allow support for respecting the min_rows, max_rows and show_dimensions options (which can't be specified in DataFrame.to_html).
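To illustrate the point about display options, here is a small sketch using the existing DataFrame repr rather than the Series code in this PR. It assumes the default `display.notebook_repr_html` setting is on. The repr path consults `display.max_rows` and `display.min_rows`, while a plain `to_html()` call with no arguments does not.

```python
import pandas as pd

df = pd.DataFrame({"x": range(100)})

# The notebook repr reads the display options and truncates accordingly.
with pd.option_context("display.max_rows", 10, "display.min_rows", 4):
    repr_html = df._repr_html_()

# A bare to_html() call ignores those options and renders every row.
full_html = df.to_html()

print(repr_html.count("<tr>"), full_html.count("<tr>"))
```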

jreback · Nov 20, 2019

I think we would prefer to define a SeriesFormatter that inherits a common base class with DataFrameFormatter (call it GenericFormatter). This shouldn't be much more complicated and would be way less fragile than this.

WillAyd · 2019-11-04T22:17:18Z

pandas/tests/io/formats/test_format.py

+            else:
+                assert "<td>{}</td>".format(val) in sm_html
+
+        assert "<p>Name: <b>{}</b>".format(small.name) in sm_html


Can you construct sm_html literally instead? I think it's small enough to do, and it would make for stronger assertions.

big-o · Nov 5, 2019

I've pushed a commit for this now. Personally, I still prefer the old approach, which just checks for the meaningful content and is relaxed about things like HTML indentation that don't affect the meaning of the output. I also think it makes the tests more readable. But let me know which you prefer and I can always revert if you want.
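The two assertion styles under discussion can be compared on a toy example. `render_cells` below is a hypothetical stand-in for the renderer, not pandas code; the sketch only shows what each style does and does not catch.

```python
def render_cells(values):
    """Hypothetical stand-in for an HTML renderer under test."""
    rows = ["  <tr><td>{}</td></tr>".format(v) for v in values]
    return "\n".join(["<table>"] + rows + ["</table>"])


sm_html = render_cells([1, 2])

# Relaxed style: check only that each meaningful cell is present.
for val in [1, 2]:
    assert "<td>{}</td>".format(val) in sm_html

# Strict style: compare against the full literal output.
expected = "<table>\n  <tr><td>1</td></tr>\n  <tr><td>2</td></tr>\n</table>"
assert sm_html == expected

# Output with stray trailing markup still passes the relaxed check,
# but the strict comparison would flag it.
broken = sm_html + "\n<td>1</td>"
relaxed_passes = all("<td>{}</td>".format(v) in broken for v in [1, 2])
strict_passes = broken == expected
```

The strict form catches structural regressions at the cost of breaking on harmless whitespace changes, which is the trade-off described in the reply above.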

pep8speaks · 2019-11-05T12:21:01Z

Hello @big-o! Thanks for updating this PR. We checked the lines you've touched for PEP 8 issues, and found:

There are currently no PEP 8 issues detected in this Pull Request. Cheers! 🍻

Comment last updated at 2019-12-29 22:58:31 UTC

big-o · 2019-11-11T19:34:21Z

@WillAyd are you happy with the changes? I think everything is in place now but please let me know if any further work is needed. I made an assumption about where the What's new entry needs to go, let me know if this is wrong.

WillAyd

@simonjayhawkins any thoughts on this?

WillAyd · 2019-11-12T06:10:41Z

pandas/core/series.py

+        # TODO: Full independent HTML generation in SeriesFormatter, rather
+        # than depending on a limited subset of functionality via to_frame().
+
+        if get_option("display.notebook_repr_html"):


As a follow-up, I think we should put this in the NDFrame superclass shared by DataFrame and Series rather than copy-pasting it.

big-o · 2019-11-17T13:10:42Z

Are you happy with everything in this PR now @WillAyd @simonjayhawkins? I noticed that some of the regression tests are failing again, but I'm confident this isn't due to any of this work: everything was passing, and then I had to merge in some other updates due to a clash when I updated the whatsnew RST with a new entry for my change. It looks like all the failures come from other tickets that are being worked on.

All the actual code changes are in 4d40bb6, which passes all regression tests.

WillAyd

Can you merge master again? Will probably solve CI issue

WillAyd · 2019-11-18T02:00:04Z

pandas/core/series.py

+                footer.append("Length: {rows}".format(rows=len(self)))
+            footer.append("dtype: <tt>{dtype}</tt>".format(dtype=self.dtype))
+
+            html.insert(tbl_end + 1, "<p>{footer}</p>".format(footer=", ".join(footer)))


Would this still insert paragraph tags when no footer is required at all?

big-o · Nov 18, 2019

Yes, but a footer will always be required, so I believe there's no need to check len(footer). Just like the string representation, there will always be at least a dtype displayed in the footer. The footer is the main thing that distinguishes between a Series and a single-column DataFrame (the only other difference is the lack of column headers in Series).

I've merged in the latest master and all CI checks pass now.
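The footer logic can be sketched as a standalone function to show why the paragraph is never empty. The format strings follow the diff and test excerpts quoted in this thread; the function name `build_footer` and the exact conditions for the name and length parts are assumptions.

```python
def build_footer(name, length, dtype, truncated):
    """Hypothetical helper assembling the Series HTML footer."""
    footer = []
    # Assumption: the name is shown only when the Series has one.
    if name is not None:
        footer.append("Name: <b>{}</b>".format(name))
    # Assumption: the length is shown only when the output is truncated.
    if truncated:
        footer.append("Length: {rows}".format(rows=length))
    # The dtype is always appended, so the footer always has content.
    footer.append("dtype: <tt>{dtype}</tt>".format(dtype=dtype))
    return "<p>{footer}</p>".format(footer=", ".join(footer))


minimal = build_footer(None, 3, "int64", truncated=False)
full = build_footer("x", 100, "float64", truncated=True)
```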

jreback · 2019-11-20T13:27:24Z

pandas/core/series.py

+                table_id=None,
+                render_links=False,
+            )
+            html = formatter.to_html(notebook=True).split("\n")


I think we would prefer to define a SeriesFormatter that inherits a common base class with DataFrameFormatter (call it GenericFormatter). This shouldn't be much more complicated and would be way less fragile than this.
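A rough sketch of the class hierarchy suggested here, using toy classes rather than the real ones in pandas' format.py. All class and method names below are hypothetical, and the actual formatters handle far more (index, truncation, styling). The shared base owns the table skeleton and subclasses supply the rows and an optional footer.

```python
from abc import ABC, abstractmethod


class GenericFormatter(ABC):
    """Hypothetical shared base that owns the table skeleton."""

    def to_html(self):
        lines = ["<table>"]
        lines.extend("  <tr>{}</tr>".format(row) for row in self._rows())
        lines.append("</table>")
        lines.extend(self._footer())  # empty unless a subclass adds one
        return "\n".join(lines)

    @abstractmethod
    def _rows(self):
        """Return one HTML fragment per table row."""

    def _footer(self):
        return []


class SketchDataFrameFormatter(GenericFormatter):
    def __init__(self, columns):
        self.columns = columns  # dict mapping column name to list of values

    def _rows(self):
        for values in zip(*self.columns.values()):
            yield "".join("<td>{}</td>".format(v) for v in values)


class SketchSeriesFormatter(GenericFormatter):
    def __init__(self, values, dtype):
        self.values = values
        self.dtype = dtype

    def _rows(self):
        return ("<td>{}</td>".format(v) for v in self.values)

    def _footer(self):
        # The footer is emitted at construction time, with no need to
        # search the rendered HTML for the end of the table.
        return ["<p>dtype: <tt>{}</tt></p>".format(self.dtype)]


series_html = SketchSeriesFormatter([1, 2], "int64").to_html()
frame_html = SketchDataFrameFormatter({"a": [1, 2], "b": [3, 4]}).to_html()
```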

big-o · 2019-11-20T17:51:00Z

This is exactly what I did originally, minus the base class (#29248), but @WillAyd asked for it to be replaced with this PR because they preferred a smaller change, which could be expanded on in future.

This PR is small and contained and would be easy to replace with a SeriesFormatter in the future. Making a SeriesFormatter with a common Base class to DataFrameFormatter is a lot of work, unless you're happy to have a lot of repeated code. This is why the original PR was rejected.

If you now want a series formatter, please take a look at #29248 and decide if you want to accept that instead. If you want a more complete implementation with a shared base class then this is a large task that I wouldn't have time for right now.

Please have a think and maybe discuss it with @WillAyd and then let me know how you want to proceed.

I'd ask you to reconsider this PR though - it leaves a small footprint on the project, is well tested and will be easy to remove when you have a more complete solution. A complete solution will probably take a long time to implement and you'll be without the feature for a long time.

big-o · 2019-11-29T22:53:56Z

@WillAyd @jreback have you had any thoughts on how you'd like to take this PR forward? Keen to get this feature into pandas but at the moment I've got some conflicting guidance on how to progress. If you need to think more about how you'd like to take this forward that's fine, but it would be good to know if it can be implemented in its current shape or if it needs to be reworked again first.

jreback · 2019-11-29T23:28:13Z

pandas/core/series.py

+            html = formatter.to_html(notebook=True).split("\n")
+
+            # Find out where the column ends - we will insert footer information here.
+            tbl_end = [


@big-o I think it would be better on this PR if you can do this inside of DataFrameFormatter in its construction itself, rather than 'finding' the end later. Is this possible?

big-o · Nov 30, 2019

Do you mean adding an extra footer parameter to either DataFrameFormatter.__init__ or DataFrameFormatter.to_html? Since the DataFrameFormatter isn't limited to single-column data, what should the footer look like for frames with multiple columns? Should there even be one? Should an Exception be raised? I would appreciate some clear guidance on what is expected before I start on another rewrite, as I only have limited time to spend on this.

jreback · 2019-12-01T23:09:18Z

cc @jorisvandenbossche if you'd look here

jreback · 2019-12-27T19:45:21Z

can you merge master and we'll look again

big-o · 2019-12-29T23:00:53Z

"can you merge master and we'll look again"

Done - I'm assuming you don't want any changes other than those necessary for the merge right now as I haven't heard anything regarding which approach to take. Please let me know if you want any further changes and what they should be if so.

WillAyd · 2020-02-02T01:17:12Z

@big-o any chance you took a look at the design suggested by @jreback to have a GenericFormatter that DataFrameFormatter and SeriesFormatter would both inherit from? The latter are already defined in format.py so wondering if you have ideas on how to better construct a class hierarchy for them

big-o · 2020-02-03T10:17:48Z

I looked into it originally, as that was my plan too. It's quite a complicated class for someone who is new to the codebase so I think it would take me quite a while to unpick. That's why I originally went for a slightly messy compromise with #29248, which was rejected by @WillAyd who suggested the approach in this PR.

I know this PR isn't perfect but it gets a new feature in with almost no changes to the original codebase, and only small additions that can be easily removed and replaced with a proper SeriesFormatter at a later date.

I'd be happy to write the SeriesFormatter, but it will take some time for me to implement - as well as the time needed to figure it all out and make the code changes necessary, I also need to fit this work around my job. In the meantime people could be making use of the feature via this PR without the rest of the codebase being affected, so would it be possible to go along with the original decision of taking this more limited approach, giving a workable feature for people to use until the more complete solution comes along?

WillAyd · 2020-07-29T20:47:26Z

@big-o looks like the desired design is to have the GenericFormatter as a base which SeriesFormatter / DataFrameFormatter inherit from - is that something you are still looking at?

WillAyd · 2020-09-10T18:56:28Z

Closing as I think stale but ping if you'd like to give the suggested design approach a shot

big-o · 2020-09-11T17:22:02Z

Sorry for the slow response; my time has been limited lately. I need to prioritize other things right now but if I get time to look into it again I'll leave a comment here.

big-o added 8 commits October 27, 2019 21:54

initial HTML rendering for Series

a771c97

PEP8 formatting fixes and missing to_html impl

982a9e1

<tt/> formatting for dtypes

58628e0

lint bug fixes and hierarchical index support

73722b8

bold index by default

4bb34fa

fixed existing unit test

a217806

Series _repr_html_ via frame.to_html

a6e4e89

unit test for series._repr_html_

c08e2ad

big-o mentioned this pull request Nov 3, 2019

initial HTML rendering for Series #29248

Closed

5 tasks

py3.5 compat

b245c30

jreback added the IO HTML read_html, to_html, Styler.apply, Styler.applymap label Nov 4, 2019

WillAyd requested changes Nov 4, 2019

big-o added 3 commits November 5, 2019 12:03

removed max_cols config from series._repr_html_ formatter

4672ae5

replaced HTML assertions with exact string match for small html

60ec007

exact string match assertion for large html

18877d9

big-o added 4 commits November 5, 2019 12:28

PEP8 line length

446ec80

updated test_publishes to include text/html format for series

4d40bb6

what's new entry for v1.0.0

a1aac92

Merge branch 'master' into series_repr_html_only

f61aed2

WillAyd reviewed Nov 12, 2019

WillAyd reviewed Nov 18, 2019

Merge branch 'master' of github.com:pandas-dev/pandas into series_rep…

49fa29b

…r_html_only

jreback requested changes Nov 20, 2019

jreback reviewed Nov 29, 2019

Merge branch 'master' of github.com:big-o/pandas into series_repr_htm…

7a270cb

…l_only

WillAyd closed this Sep 10, 2020
