read_html omits a column when reading a wikipedia table #51629
Comments
Hi, thanks for your report. Can you please provide a minimal and reproducible example? You can define your html table as
Also, please provide your pandas version and dependencies, e.g. by filling out the issue template.
Here are the first few lines of the table in question. I'm using pandas 1.3.5, running on Google Colab.
Can you try the newest pandas version, 2.0.0rc0?
Unfortunately, the issue is still there with 2.0.0rc0.
Good. Now please remove everything from your HTML table that is not necessary to reproduce the problem. The example should be minimal.
Hi, it seems the problem is related to
I can try to fix this problem.
take
* for removing elements when display:none in read_html
* test added
I have created a PR which should resolve the main issue. But in this Wikipedia example we can see another one,
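For readers following along, here is a rough sketch of what "removing elements when display:none" can involve. It is not taken from the PR and does not use pandas internals; it uses only the standard library's `xml.etree.ElementTree`, and the helper names `strip_hidden` and `_remove_keep_tail` as well as the sample row are made up for illustration. The one pitfall it shows is general to ElementTree-style trees: the text that follows an element is stored on that element as its `tail`, so removing the element naively also drops the visible text after it.

```python
import re
import xml.etree.ElementTree as ET

# Matches an inline style that hides the element (illustrative, inline styles only).
HIDDEN = re.compile(r"display\s*:\s*none", re.IGNORECASE)


def _remove_keep_tail(parent, child):
    """Remove `child` from `parent` but keep the text that follows it."""
    tail = child.tail or ""
    idx = list(parent).index(child)
    if idx == 0:
        # No previous sibling: the tail becomes part of the parent's own text.
        parent.text = (parent.text or "") + tail
    else:
        # Otherwise append it to the previous sibling's tail.
        prev = parent[idx - 1]
        prev.tail = (prev.tail or "") + tail
    parent.remove(child)


def strip_hidden(root):
    """Drop every descendant whose inline style contains display:none."""
    for parent in list(root.iter()):
        for child in list(parent):
            if HIDDEN.search(child.get("style", "")):
                _remove_keep_tail(parent, child)
    return root


# Hypothetical row: a hidden sort key followed by the visible value.
row = ET.fromstring(
    '<tr><td>A</td><td><span style="display:none">007</span>7.5</td></tr>'
)
strip_hidden(row)
cells = ["".join(td.itertext()) for td in row.findall("td")]
print(cells)  # ['A', '7.5']
```

Calling `parent.remove(child)` alone on the same row would leave the second cell empty, because `7.5` is the tail of the hidden `<span>`.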
https://en.wikipedia.org/wiki/List_of_countries_by_road_network_size

import pandas as pd

# Copy/paste the table HTML into a file (`table` below)
dfs = pd.read_html(table)  # read_html returns a list of DataFrames
df = dfs[0]
# The result is shown in the image. The Density column values are missing.