Commit 030f5ec

BUG: Parse two date columns broken in read_csv with multiple headers
Fix for GH15376. In `io/parsers/_try_convert_dates()`, when selecting columns by index from a set of columns with multi-level names, the column `name` was converted to a string. This appears to be a bug, since the `name` was a tuple before the conversion. It causes problems downstream when this name is used to look up a column: the lookup fails because the desired column is keyed by the tuple, not by its string representation.
1 parent 1bcc10d commit 030f5ec
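
The failure mode described in the commit message can be sketched with a plain dict. This is only an illustration: the `data_dict` contents below are made up and stand in for the parser's column data, they are not taken from pandas itself.

```python
# Hypothetical stand-in for the parser's column data: with a two-row
# header, each column is keyed by a tuple of its header levels.
data_dict = {
    ("D", "date"): ["2001-01-05", "2001-01-06"],
    ("T", "time"): ["09:00:00", "00:00:00"],
}
columns = list(data_dict)

name = columns[0]        # the tuple ('D', 'date')
stringified = str(name)  # the string "('D', 'date')"

# The tuple is a valid key; its string representation is not.
found_with_tuple = name in data_dict
found_with_string = stringified in data_dict

print(found_with_tuple)   # True
print(found_with_string)  # False
```

The string looks like the tuple when printed, but it is a different object with a different hash, so a lookup keyed on it misses the column.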

3 files changed: +20 -1 lines changed

doc/source/whatsnew/v0.20.0.txt

+1
@@ -574,6 +574,7 @@ Bug Fixes
+- Bug in ``.read_csv()`` which caused ``parse_dates={'datetime': [0, 1]}`` to fail with multiline headers (:issue:`15376`)
 - Bug in ``DataFrame.boxplot`` where ``fontsize`` was not applied to the tick labels on both axes (:issue:`15108`)

pandas/io/parsers.py

+1 -1
@@ -2856,7 +2856,7 @@ def _try_convert_dates(parser, colspec, data_dict, columns):
         if c in colset:
             colnames.append(c)
         elif isinstance(c, int) and c not in columns:
-            colnames.append(str(columns[c]))
+            colnames.append(columns[c])
         else:
             colnames.append(c)
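
As a rough, standalone sketch of what this one-line change does, the loop from the hunk can be wrapped in a small function. The name `resolve_colnames` and the `stringify` flag are hypothetical and exist only to compare the behaviour before and after the commit; the real `_try_convert_dates()` does more than this.

```python
def resolve_colnames(colspec, columns, stringify=False):
    """Simplified, hypothetical version of the name-resolution loop.

    stringify=True mimics the behaviour before this commit.
    """
    colset = set(columns)
    colnames = []
    for c in colspec:
        if c in colset:
            colnames.append(c)
        elif isinstance(c, int) and c not in columns:
            # Positional index: map it to the actual column name.
            name = columns[c]
            colnames.append(str(name) if stringify else name)
        else:
            colnames.append(c)
    return colnames


# Column names as tuples, as produced by a two-row header.
columns = [("D", "date"), ("T", "time"), ("A", "a"), ("B", "b")]

old = resolve_colnames([0, 1], columns, stringify=True)
new = resolve_colnames([0, 1], columns)

print(old)  # ["('D', 'date')", "('T', 'time')"]
print(new)  # [('D', 'date'), ('T', 'time')]
```

With single-level headers the column names are usually already strings, so `str()` left them unchanged, which may be why the problem only surfaced with multi-level headers.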

pandas/tests/io/test_date_converters.py

+18
@@ -148,3 +148,21 @@ def test_parse_date_column_with_empty_string(self):
                          [621, ' ']]
         expected = DataFrame(expected_data, columns=['case', 'opdate'])
         assert_frame_equal(result, expected)
+
+    def test_parse_date_time_multi_level_column_name(self):
+        data = """\
+D,T,A,B
+date, time,a,b
+2001-01-05, 09:00:00, 0.0, 10.
+2001-01-06, 00:00:00, 1.0, 11.
+"""
+        datecols = {'date_time': [0, 1]}
+        result = read_csv(StringIO(data), sep=',', header=[0, 1],
+                          parse_dates=datecols,
+                          date_parser=conv.parse_date_time)
+
+        expected_data = [[datetime(2001, 1, 5, 9, 0, 0), 0., 10.],
+                         [datetime(2001, 1, 6, 0, 0, 0), 1., 11.]]
+        expected = DataFrame(expected_data,
+                             columns=['date_time', ('A', 'a'), ('B', 'b')])
+        assert_frame_equal(result, expected)
