Skip to content

BUG: url regex in style_render does not pass colon and other valid #46457

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 3 commits into from
Mar 24, 2022

Conversation

kianelbo
Copy link
Contributor

@kianelbo kianelbo commented Mar 21, 2022

URLs containing some valid characters such as colon in port numbers get
cut off when html-formatting. As a workaround, expanded the regex to
match a wider variety of URLs.

Copy link
Contributor

@attack68 attack68 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm but for 2 things:

1: needs a "whats new" release rotes (from the docs folder) expplaining the changes.
2: the second test should include all the new charcaters so that future changes to the regex do not remove them

@@ -780,6 +780,8 @@ def test_hiding_index_columns_multiindex_trimming():
("ftp scheme: ftp://www.web", True, "ftp://www.web"),
("subdirectories: www.web.com/directory", True, "www.web.com/directory"),
("Multiple domains: www.1.2.3.4", True, "www.1.2.3.4"),
("with port: http://web.com:80", True, "http://web.com:80"),
("with anchor: http://web.com/anc#hor", True, "http://web.com/anc#hor"),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

add all the special characters to this line so that they are tested simultaneously.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, I made the tests more exhaustive and also tweaked the regex a little bit more.

URLs containing some valid characters such as colon in port numbers get
cut off when html-formatting. As a workaround, expanded the regex to
match a wider variety of URLs.
@jreback jreback added Styler conditional formatting using DataFrame.style IO HTML read_html, to_html, Styler.apply, Styler.applymap labels Mar 22, 2022
@jreback jreback added this to the 1.4.2 milestone Mar 22, 2022
Copy link
Member

@simonjayhawkins simonjayhawkins left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @kianelbo for the PR.

minor change request to give the release note some context.

@mroeschke mroeschke merged commit f99ec8b into pandas-dev:main Mar 24, 2022
@mroeschke
Copy link
Member

Thanks @kianelbo

@mroeschke
Copy link
Member

@meeseeksdev backport 1.4.x

@lumberbot-app
Copy link

lumberbot-app bot commented Mar 24, 2022

Something went wrong ... Please have a look at my logs.

@kianelbo kianelbo deleted the fix-href-render branch March 24, 2022 18:46
jreback pushed a commit that referenced this pull request Mar 25, 2022
yehoshuadimarsky pushed a commit to yehoshuadimarsky/pandas that referenced this pull request Jul 13, 2022
…andas-dev#46457)

* BUG: url regex in `style_render` does not pass colon and other valid

URLs containing some valid characters such as colon in port numbers get
cut off when html-formatting. As a workaround, expanded the regex to
match a wider variety of URLs.

* Add whatsnew entry for pandas-dev#46389 fix

* Update whatsnew entry for fix pandas-dev#46389

Co-authored-by: Simon Hawkins <[email protected]>

Co-authored-by: Simon Hawkins <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
IO HTML read_html, to_html, Styler.apply, Styler.applymap Styler conditional formatting using DataFrame.style
Projects
None yet
Development

Successfully merging this pull request may close these issues.

BUG: colon in URL gets cut off during call of to_html() when applying format(hyperlinks='html')
5 participants