Closed
Description
Code Sample, a copy-pastable example
Python 3.9.1 | packaged by conda-forge | (default, Dec 9 2020, 01:07:47)
[Clang 11.0.0 ] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import pandas as pd
>>> pd.__version__
'1.2.0'
>>> df=pd.read_excel('full_data.xlsx')
Problem description
I am not quite sure how to describe the bug: the code just hangs when I run pd.read_excel('full_data.xlsx'). This line consumes a huge amount of memory (almost 14 GB), even though the .xlsx file is only 9 MB.
I suspect this happens because read_excel now uses openpyxl as the default engine on Python 3.9. Loading the same file on Python 3.8 works fine.
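To make the memory figure quoted above easier to reproduce and compare between the two Python versions, something like the following sketch could be used. It relies only on the standard-library tracemalloc module; the helper name peak_memory_mb is hypothetical and not part of pandas or openpyxl, and tracemalloc only sees allocations made through Python's allocators, so the number may differ from what the OS reports.

```python
import tracemalloc


def peak_memory_mb(func, *args, **kwargs):
    """Call func(*args, **kwargs) and return (result, peak traced memory in MiB).

    Hypothetical helper for illustration; it measures allocations that
    tracemalloc can observe, not the total resident size of the process.
    """
    tracemalloc.start()
    try:
        result = func(*args, **kwargs)
        # get_traced_memory() returns (current, peak) in bytes.
        _, peak = tracemalloc.get_traced_memory()
    finally:
        tracemalloc.stop()
    return result, peak / (1024 * 1024)


if __name__ == "__main__":
    # Stand-in workload so the sketch runs without any spreadsheet.
    # For the case in this report one would instead call:
    #     peak_memory_mb(pd.read_excel, "full_data.xlsx")
    _, peak = peak_memory_mb(lambda n: list(range(n)), 1_000_000)
    print(f"peak traced memory: {peak:.1f} MiB")
```

Running the same measurement under both interpreters would show whether the difference really comes from the reader rather than from the rest of the session.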
>>> from openpyxl import load_workbook
>>> wb=load_workbook('full_data.xlsx')
>>> df=pd.DataFrame(wb['Sheet1'].values)
The code above also leads to the same issue.
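Since an .xlsx file is a ZIP archive of XML parts, one way to narrow this down might be to check how large the worksheet XML is once decompressed. The sketch below uses only the standard-library zipfile module; the function name xlsx_part_sizes is hypothetical, and a large uncompressed size is only a hint, not a confirmed cause of the memory usage reported here.

```python
import zipfile


def xlsx_part_sizes(path_or_file):
    """Return (name, compressed_bytes, uncompressed_bytes) for each archive
    member, largest uncompressed part first.

    Hypothetical diagnostic helper; accepts a file path or a binary
    file-like object, as zipfile.ZipFile does.
    """
    with zipfile.ZipFile(path_or_file) as archive:
        parts = [
            (info.filename, info.compress_size, info.file_size)
            for info in archive.infolist()
        ]
    return sorted(parts, key=lambda part: part[2], reverse=True)


# Example usage for the file in this report (not executed here):
#     for name, packed, raw in xlsx_part_sizes("full_data.xlsx"):
#         print(f"{name}: {packed} -> {raw} bytes")
```

If the sheet XML turns out to be far larger than the 9 MB archive, that would at least be consistent with the parser having much more data to materialize than the file size suggests.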