BUG: pandas.read_csv reads in a comma separated file when delim #54918
Comments
Can you try
I have another read_csv problem in 2.1.0 (lines longer than 512?) where engine python works but c doesn't; will post.
Running with

    for engine in ["python", "c"]:
        print("\n-- engine", engine)
        df = pd.read_csv(s, engine=engine, delim_whitespace=True)
        print(df)
        print(df.columns)

I got
so it does seem like it's a problem with the C engine only.
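A self-contained version of the per-engine comparison above might look like the following sketch. The sample data and the helper name columns_per_engine are hypothetical and not taken from the thread. It uses sep=r"\s+", which the pandas documentation describes as equivalent to delim_whitespace=True, because delim_whitespace is deprecated in later pandas releases. Whether this spelling triggers the same behavior on 2.1.0 is not confirmed here.

```python
import io

import pandas as pd

# Hypothetical sample data: comma-separated, with no whitespace inside any line.
CSV_TEXT = "a,b,c\n1,2,3\n4,5,6\n"


def columns_per_engine(text, engines=("python", "c")):
    """Parse `text` as whitespace-delimited with each engine; return the column names."""
    result = {}
    for engine in engines:
        # sep=r"\s+" is used here in place of delim_whitespace=True (assumption, see above).
        df = pd.read_csv(io.StringIO(text), sep=r"\s+", engine=engine)
        result[engine] = list(df.columns)
    return result


if __name__ == "__main__":
    for engine, columns in columns_per_engine(CSV_TEXT).items():
        print(engine, columns)
```

If whitespace splitting is honored, each engine should report one column whose name is the entire header line. An engine that reports three columns has split on commas.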
Hi, Thanks,
at the risk of piling in:
throws:
It works as expected with |
Here
Logfile --
Pandas version checks
I have checked that this issue has not already been reported.
I have confirmed this bug exists on the latest version of pandas.
I have confirmed this bug exists on the main branch of pandas.
Reproducible Example
Issue Description
Trying to load a comma-separated .csv file with the delim_whitespace parameter set to True should cause the file to be loaded as a single column. Instead, pandas reads it as if it were delimited by commas and splits it "properly" (in our case, not how we expect: the CI test was expecting a failure to load the required columns when we passed "whitespace" as the separator parameter for the above file). I tested this in 2.1.0, 2.0.3, and 1.5.3. In the older versions everything worked as expected, but 2.1.0 loaded the file with comma separation.
Expected Behavior
read_csv should not break up the CSV on a comma delimiter when delim_whitespace is set to True.
Installed Versions
INSTALLED VERSIONS
commit : ba1cccd
python : 3.10.12.final.0
python-bits : 64
OS : Darwin
OS-release : 21.6.0
Version : Darwin Kernel Version 21.6.0: Sat Jun 18 17:07:22 PDT 2022; root:xnu-8020.140.41~1/RELEASE_ARM64_T6000
machine : arm64
processor : arm
byteorder : little
LC_ALL : None
LANG : en_US.UTF-8
LOCALE : en_US.UTF-8
pandas : 2.1.0
numpy : 1.24.3
pytz : 2022.7
dateutil : 2.8.2
setuptools : 68.0.0
pip : 23.2.1
Cython : 3.0.0
pytest : 7.4.0
hypothesis : None
sphinx : 6.1.3
blosc : None
feather : None
xlsxwriter : None
lxml.etree : None
html5lib : 1.1
pymysql : None
psycopg2 : None
jinja2 : 3.1.2
IPython : 8.14.0
pandas_datareader : None
bs4 : 4.12.2
bottleneck : 1.3.5
dataframe-api-compat: None
fastparquet : None
fsspec : None
gcsfs : None
matplotlib : 3.5.3
numba : 0.57.1
numexpr : 2.8.4
odfpy : None
openpyxl : None
pandas_gbq : None
pyarrow : None
pyreadstat : None
pyxlsb : None
s3fs : None
scipy : 1.11.1
sqlalchemy : None
tables : 3.8.0
tabulate : None
xarray : None
xlrd : None
zstandard : None
tzdata : 2023.3
qtpy : None
pyqt5 : None