I have checked that this issue has not already been reported.
I have confirmed this bug exists on the latest version of pandas.
(optional) I have confirmed this bug exists on the master branch of pandas.
Code Sample, a copy-pastable example
import pandas as pd
from io import StringIO

dataframe_file = (
    "# this is a comment\n"
    "1,2,3,4\n"
    "1,2,3,4#inline comment\n"
    "1,2#,3,4\n"
    "1,2,#N/A,4\n"
)
dataframe = pd.read_csv(StringIO(dataframe_file), comment="#", na_values="#N/A")
print(dataframe)
Output:
   1  2    3    4
0  1  2  3.0  4.0
1  1  2  NaN  NaN
2  1  2  NaN  NaN
Problem description
On the last row, the third value (#N/A) should be read as an NA value (NaN), not as the start of a comment.
But because #N/A starts with the comment character #, the rest of the line is treated as a comment and the last value (4) is never read.
Expected Output
   1  2    3    4
0  1  2  3.0  4.0
1  1  2  NaN  NaN
2  1  2  NaN  4.0
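To make the expected behaviour concrete, here is a minimal standard-library sketch of the rule I have in mind: a comment character only starts a comment if it is not the beginning of one of the NA tokens. The helpers `strip_comment` and `parse` are hypothetical names made up for this illustration. This is not how pandas implements its parser, and the sketch ignores quoting and does not check that an NA token fills a whole field.

```python
import csv

def strip_comment(line, comment="#", na_values=("#N/A",)):
    """Cut `line` at the first comment character that does not begin an NA token."""
    i = 0
    while i < len(line):
        if line[i] == comment:
            # Hypothetical rule: an NA token starting here wins over the comment.
            token = next((na for na in na_values if line.startswith(na, i)), None)
            if token is None:
                return line[:i]
            i += len(token)
        else:
            i += 1
    return line

def parse(text, comment="#", na_values=("#N/A",)):
    """Return rows as lists of strings, with NA tokens replaced by None."""
    cleaned = (strip_comment(l, comment, na_values) for l in text.splitlines())
    # Drop lines that are empty once their comment has been removed.
    rows = csv.reader(l for l in cleaned if l.strip())
    return [[None if cell in na_values else cell for cell in row] for row in rows]

data = (
    "# this is a comment\n"
    "1,2,3,4\n"
    "1,2,3,4#inline comment\n"
    "1,2#,3,4\n"
    "1,2,#N/A,4\n"
)
rows = parse(data)
for row in rows:
    print(row)
```

With this rule the last line keeps all four fields (`1`, `2`, NA, `4`), while `1,2#,3,4` is still cut after the second field, which matches the expected output above.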
Output of pd.show_versions()
INSTALLED VERSIONS
commit : None
python : 3.7.7.final.0
python-bits : 64
OS : Linux
OS-release : 5.6.8-arch1-1
machine : x86_64
processor :
byteorder : little
LC_ALL : None
LANG : en_US.UTF-8
LOCALE : en_US.UTF-8
Hey,
sorry, I am a bit confused. Why did you post "add pr template"?
There is a PR template from pandas available here.
Also, I don't understand what this has to do with my issue. Is there something missing that I should add?
Output of pd.show_versions()
INSTALLED VERSIONS
commit : None
python : 3.7.7.final.0
python-bits : 64
OS : Linux
OS-release : 5.6.8-arch1-1
machine : x86_64
processor :
byteorder : little
LC_ALL : None
LANG : en_US.UTF-8
LOCALE : en_US.UTF-8
pandas : 1.0.3
numpy : 1.18.1
pytz : 2020.1
dateutil : 2.8.1
pip : 20.0.2
setuptools : 46.1.3.post20200330
Cython : 0.29.17
pytest : 5.4.1
hypothesis : 5.8.3
sphinx : 3.0.3
blosc : None
feather : None
xlsxwriter : 1.2.8
lxml.etree : 4.5.0
html5lib : 1.0.1
pymysql : None
psycopg2 : None
jinja2 : 2.11.2
IPython : 7.13.0
pandas_datareader: None
bs4 : 4.9.0
bottleneck : 1.3.2
fastparquet : None
gcsfs : None
lxml.etree : 4.5.0
matplotlib : 3.1.3
numexpr : 2.7.1
odfpy : None
openpyxl : 3.0.3
pandas_gbq : None
pyarrow : None
pytables : None
pytest : 5.4.1
pyxlsb : None
s3fs : None
scipy : 1.4.1
sqlalchemy : 1.3.16
tables : 3.6.1
tabulate : None
xarray : None
xlrd : 1.2.0
xlwt : 1.3.0
xlsxwriter : 1.2.8
numba : 0.49.0