-
-
Notifications
You must be signed in to change notification settings - Fork 18.4k
Add Series._repr_html_ #27228
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add Series._repr_html_ #27228
Conversation
@mrocklin we just black :-> so if you can reformat. |
Sure. Glad to hear that y'all are adopting black. I wasn't sure how to apply black to just a diff, but hopefully this passes. I'm also happy to track down the errors in CI, but I wanted to make sure that this is something that folks want. My assumption was that folks might have thoughts on how this gets rendered. |
IIRC @simonjayhawkins might have thoughts here |
+1, this has been a wart for a long time. Can we have support for (@jorisvandenbossche, |
Unfortunately I don't know what this means. Hopefully though it's already handled? All this does is call |
firstly to make my position clear. I am -1 on changing the default Series display in the notebook. That said, adding a I think some pd.option is required to activate html output of the Series and it needs to be indepenent of the DataFrame option. As a quick and dirty implementation the current PR could do the job. But unfortunately it could be much more complex. I'm against postprocessing the But I think that potentially this postprocessing (if it was deemed an acceptable approach) could be reduced considerably.. consider this series
the DataFrame looks like.
this includes
so it's not necessarily as straightforward as removing it. there are options on
SeriesFormatter already has the logic to print this could be added in a
10 rows × 1 columns so the series represation could look like..
Name: Series, dtype: int64 I think also that the general concensus on formatting is that the formatting should be done in a Some parameters will need to be accepted by Another issue is the Styler. With this current implementation, we would need to be able to apply a Styler to the Series and then transfer it to the DataFrame during the |
Can you clarify why you wouldn't change the default repr for the notebook? (we have a html repr for DataFrame for a long time) The other arguments more seem (relevant!) discussion points about the implementation, but not about why we would / would not have a html repr for Series. |
see #8829 (comment) : "A Series isn't a Table-like thing, it's a vector like thing" summed it up back in Nov 2014. |
But an html repr, although maybe using the html table construct, doesn't necessarily need to look like a table. You can make it look like a 1D vector ? There have been several attempts / issues (open issue for this is actually #5563, see linked issues and PRs there), and I have not seen that remark come up anywhere else. I think the discussion mainly has been: we need a good design proposal as it needs to look clearly distinct from a 1-column DataFrame. |
To be honest I just set this up in a spare hour of work. It seems like
there is still a decision to be made among the Pandas maintainers. I might
let you all handle that and then come back if you all settle things.
My personal opinion is that Series should have a `_repr_html_` method that
is similar (but not exactly the same as) the DataFrame version. That is my
only interest in this work. I don't personally care about to_html on its
own.
…On Sat, Jul 6, 2019 at 10:27 AM Simon Hawkins ***@***.***> wrote:
open issue for this is actually #5563
<#5563>
I would suggest then to reduce the scope of this PR to address #8829
<#8829> only. i.e "Pandas
Series should provide to_html method" and cover #5563
<#5563> in a follow-on.
Alternatively, update this PR to reference #5563
<#5563> and do a precursor PR
for #8829 <#8829>
@mrocklin <https://github.com/mrocklin> ?
—
You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
<#27228?email_source=notifications&email_token=AACKZTHKL4YLYHX2O4SAVCTP6BQQTA5CNFSM4H5Z2YOKYY3PNVWWK3TUL52HS4DFVREXG43VMVBW63LNMVXHJKTDN5WW2ZLOORPWSZGODZKV6OA#issuecomment-508911416>,
or mute the thread
<https://github.com/notifications/unsubscribe-auth/AACKZTD5YBUEFGADB6SHIETP6BQQTANCNFSM4H5Z2YOA>
.
|
Agreed. I think the issue is only whether it should be the default. If it is the default, it requires a complete, integrated, maintainable and well tested and documented solution (possibly with additional options to allow disgruntled users to revert to the old behavior) from the get-go. This makes the review process more rigorous and drawn-out. |
Yup. Fair enough. I probably don't have more than a few hours to devote
to this, so I'm probably not the right person to take this on. Happy to
close.
…On Sat, Jul 6, 2019 at 12:51 PM Simon Hawkins ***@***.***> wrote:
My personal opinion is that Series should have a _repr_html_ method that
is similar (but not exactly the same as) the DataFrame version.
Agreed. I think the issue is only whether it should be the default.
If it is the default, it requires a complete, integrated, maintainable and
well tested and documented solution (possibly with additional options to
allow disgruntled users to revert to the old behavior) from the get-go.
This makes the review process more rigorous and drawn-out.
—
You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
<#27228?email_source=notifications&email_token=AACKZTGQPRVBKTR65G5PI4DP6CBNFA5CNFSM4H5Z2YOKYY3PNVWWK3TUL52HS4DFVREXG43VMVBW63LNMVXHJKTDN5WW2ZLOORPWSZGODZKYEGI#issuecomment-508920345>,
or mute the thread
<https://github.com/notifications/unsubscribe-auth/AACKZTCJF4NQJPWVKT72MQDP6CBNFANCNFSM4H5Z2YOA>
.
|
Alternatively, I would personally also be fine to focus this on only adding
We already have |
Agreed. Simplifies things greatly. Not adding a new public method reduces and simplifies documentation additions (no need to setup shared docstrings). not adding an IO method means no need to add file-like buffer support. |
I've removed the |
Checking in. Is this approach generally palatable to the Pandas maintainers? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1 to implementing just _repr_html_
now.
On the actual repr itself, I'm concerned about losing the Series dtype in the repr. I played a bit with a <caption>
, but that has some poor interactions with the width of the table.
def _repr_html_(self): | ||
text = self.to_frame()._repr_html_() | ||
|
||
lines = text.split("\n") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Answering a question I had: Does this fail on data with embedded newlines? No, to_html
escapes the newlines, so we're OK.
@@ -1611,6 +1611,32 @@ def __repr__(self): | |||
|
|||
return result | |||
|
|||
def _repr_html_(self): | |||
text = self.to_frame()._repr_html_() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Rather than all the head_start
, head_stop
calculation, can we instead use self.to_frame().to_html(header=False, notebook=True)
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Played with this briefly, and it seems to work reasonably well.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there a reason why we wouldn't want to this be housed in SeriesFormatter (pandas.io.formats.format) instead? I think if we did that could consolidate logic in core (maybe move to generic) and leave dispatching to the actual formatter code
@@ -1769,6 +1769,24 @@ def test_repr_html(self, float_frame): | |||
|
|||
tm.reset_display_options() | |||
|
|||
def test_series(series): | |||
df = DataFrame({"abc": range(1000)}) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it possible to create a smaller series and make a complete assertion about the content?
@@ -1611,6 +1611,32 @@ def __repr__(self): | |||
|
|||
return result | |||
|
|||
def _repr_html_(self): | |||
text = self.to_frame()._repr_html_() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there a reason why we wouldn't want to this be housed in SeriesFormatter (pandas.io.formats.format) instead? I think if we did that could consolidate logic in core (maybe move to generic) and leave dispatching to the actual formatter code
@mrocklin is this PR still active? |
Closing as stale but certainly ping if you'd like to continue |
git diff upstream/master -u -- "*.py" | flake8 --diff