Commit 8c2e1ca

BUG: read_csv not converting to float for python engine with decimal sep, usecols and parse_dates
1 parent 22dbef1 commit 8c2e1ca

3 files changed: +23 -2 lines changed

doc/source/whatsnew/v1.2.0.rst

+1
@@ -724,6 +724,7 @@ I/O
 - Bug in :meth:`DataFrame.to_hdf` was not dropping missing rows with ``dropna=True`` (:issue:`35719`)
 - Bug in :func:`read_html` was raising a ``TypeError`` when supplying a ``pathlib.Path`` argument to the ``io`` parameter (:issue:`37705`)
 - :meth:`DataFrame.to_excel`, :meth:`Series.to_excel`, :meth:`DataFrame.to_markdown`, and :meth:`Series.to_markdown` now support writing to fsspec URLs such as S3 and Google Cloud Storage (:issue:`33987`)
+- Bug in :meth:`read_csv` returning object dtype when ``delimiter=","`` with ``usecols`` and ``parse_dates`` specified for ``engine="python"`` (:issue:`35873`)
 - Bug in :func:`read_fwf` with ``skip_blank_lines=True`` was not skipping blank lines (:issue:`37758`)
 - Parse missing values using :func:`read_json` with ``dtype=False`` to ``NaN`` instead of ``None`` (:issue:`28501`)
 - :meth:`read_fwf` was inferring compression with ``compression=None`` which was not consistent with the other :meth:``read_*`` functions (:issue:`37909`)

pandas/io/parsers.py

+5 -1
@@ -2354,12 +2354,16 @@ def _set_no_thousands_columns(self):
         # Create a set of column ids that are not to be stripped of thousands
         # operators.
         noconvert_columns = set()
+        if self._col_indices is not None:
+            col_indices = sorted(self._col_indices)
+        else:
+            col_indices = list(range(len(self.columns)))

         def _set(x):
             if is_integer(x):
                 noconvert_columns.add(x)
             else:
-                noconvert_columns.add(self.columns.index(x))
+                noconvert_columns.add(col_indices[self.columns.index(x)])

         if isinstance(self.parse_dates, list):
             for val in self.parse_dates:
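
The index remapping in this hunk can be illustrated outside pandas. The sketch below is a simplified stand-in, not the parser's actual code: `all_names`, `usecols`, and the two helper functions are hypothetical, and it assumes that `columns` holds only the names kept by `usecols` while `col_indices` holds their positions in the raw row.

```python
# Simplified stand-in for the parser state (hypothetical names, not pandas internals).
all_names = ["col", "col1", "col2", "col3"]   # names of every field in the raw row
usecols = ["col1", "col2", "col3"]            # subset requested by the caller

# After usecols filtering, `columns` contains only the kept names ...
columns = [name for name in all_names if name in usecols]
# ... while `col_indices` records where those names sit in the raw row.
col_indices = [all_names.index(name) for name in columns]


def noconvert_before(name):
    """Old behaviour: position within the filtered column list."""
    return columns.index(name)


def noconvert_after(name):
    """New behaviour: map the filtered position back to the raw row position."""
    return sorted(col_indices)[columns.index(name)]


old_index = noconvert_before("col3")
new_index = noconvert_after("col3")

# The old lookup points at a different raw field than the date column.
print(old_index, all_names[old_index])
print(new_index, all_names[new_index])
```

With these inputs the old lookup yields the raw position of `col2` rather than `col3`, which is consistent with the reported symptom that a numeric column was left unconverted when `usecols` dropped a leading column.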

pandas/tests/io/parser/test_python_parser_only.py

+17 -1
@@ -12,7 +12,7 @@

 from pandas.errors import ParserError

-from pandas import DataFrame, Index, MultiIndex
+from pandas import DataFrame, Index, MultiIndex, Timestamp
 import pandas._testing as tm


@@ -314,3 +314,19 @@ def test_malformed_skipfooter(python_parser_only):
     msg = "Expected 3 fields in line 4, saw 5"
     with pytest.raises(ParserError, match=msg):
         parser.read_csv(StringIO(data), header=1, comment="#", skipfooter=1)
+
+
+def test_delimiter_with_usecols_and_parse_dates(python_parser_only):
+    # GH#35873
+    result = python_parser_only.read_csv(
+        StringIO('"dump","-9,1","-9,1",20101010'),
+        engine="python",
+        names=["col", "col1", "col2", "col3"],
+        usecols=["col1", "col2", "col3"],
+        parse_dates=["col3"],
+        decimal=",",
+    )
+    expected = DataFrame(
+        {"col1": [-9.1], "col2": [-9.1], "col3": [Timestamp("2010-10-10")]}
+    )
+    tm.assert_frame_equal(result, expected)
