Support for Merging Columns in HTML Parser #4683
Comments
what sort of API would you want for this? This is an example of a horribly designed and written table that uses style attributes in almost every tag. Something like: One issue is coming up with a way to specify the |
At a certain point you can just do that sort of thing yourself and keep |
It does. See #4679. I gave an example. |
So then I'm not clear on what the value is here. Edge cases where this |
Maybe. I think this should be left to the user. |
I am not worried about My biggest concern is how the reader would express this data to the user:
In this case the natural results would be a a @cpcloud While I agree that this is in fact a "horribly designed and written table that uses style attributes in almost every tag", most SEC filings that I have seen share similar layout characteristics (I would be happy to provide more examples). I would say that having the ability to load tables from SEC filings is worthwhile even if they make us cringe. I did find one example so far of a ragged table (ick): http://www.sec.gov/Archives/edgar/data/1108524/000119312511075314/d10k.htm#tx138159_26 So, I am all for keeping I agree that it does seem like a generalization of the |
Appears that this issue didn't gain a lot of traction. Thanks for the suggestion but closing due to lack of activity. Happy to reopen if this community finds this feature useful |
See the table here: http://www.sec.gov/Archives/edgar/data/47217/000104746913006802/a2215416z10-q.htm#CCSE
The trailing ")" and the leading "$" are in different columns (that is, separate td cells) from the number.
There should be an option to merge all columns under a given heading (#4679).
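To make the request concrete, here is a minimal sketch of the kind of merging described above, written against the Python standard library only. It is not pandas' API: `RowCollector`, `merge_under_headers`, and the `SAMPLE` table are hypothetical names and data made up for illustration. It assumes the first row is the header, that each heading declares how many physical columns it covers via `colspan`, and that there are no `rowspan` cells or nested tables.

```python
from html.parser import HTMLParser


class RowCollector(HTMLParser):
    """Collect each <tr> as a list of (text, colspan) pairs. Hypothetical helper."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._cell = None  # [text_parts, colspan] while inside a td/th

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self.rows.append([])
        elif tag in ("td", "th") and self.rows:
            colspan = int(dict(attrs).get("colspan") or 1)
            self._cell = [[], colspan]

    def handle_data(self, data):
        # Text outside a cell (whitespace between tags) is ignored.
        if self._cell is not None:
            self._cell[0].append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            parts, colspan = self._cell
            self.rows[-1].append(("".join(parts).strip(), colspan))
            self._cell = None


def merge_under_headers(html):
    """Concatenate the body cells that sit under each header's colspan."""
    parser = RowCollector()
    parser.feed(html)
    header, *body = parser.rows  # assumption: first row is the header
    names = [text for text, _ in header]
    spans = [span for _, span in header]
    merged = []
    for row in body:
        # Expand each cell to physical columns: text first, "" for the rest.
        flat = []
        for text, span in row:
            flat.extend([text] + [""] * (span - 1))
        out, start = [], 0
        for span in spans:
            out.append("".join(flat[start:start + span]))
            start += span
        merged.append(out)
    return names, merged


# Made-up table mimicking the layout in the filing: "$", number, ")" in separate cells.
SAMPLE = """
<table>
  <tr><th>Item</th><th colspan="3">2013</th></tr>
  <tr><td>Revenue</td><td>$</td><td>1,234</td><td></td></tr>
  <tr><td>Loss</td><td>$</td><td>(56</td><td>)</td></tr>
</table>
"""

names, rows = merge_under_headers(SAMPLE)
```

With this sample, the "$", the number, and the trailing ")" end up in a single string per heading, which could then be cleaned and converted to a number by the caller. Real filings are messier (ragged rows, spacer cells, headings without `colspan`), so this only illustrates the requested behavior rather than a robust implementation.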