Commit 3ed8551

BUG: Parse two date columns broken in read_csv with multiple headers
Fix for GH15376. In `io/parsers/_try_convert_dates()`, when columns were selected by integer index from a set of columns with multi-level names, the column `name` was converted to a string. This appears to be a bug, since the `name` was a tuple before the conversion. It causes problems downstream when the name is used to look up a column: the lookup fails because the desired column is keyed by the tuple, not by its string representation.
1 parent 1bcc10d commit 3ed8551
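
The failure described in the commit message can be illustrated without pandas. The sketch below is a simplified stand-in for the name-resolution loop, not the pandas source: `resolve_colnames` and its `stringify` flag are hypothetical names introduced only for this illustration, and the plain dict stands in for the parsed data keyed by column name.

```python
# Simplified stand-in for the column-name resolution loop (not the pandas source).
# `resolve_colnames` and `stringify` are hypothetical names used for illustration.
def resolve_colnames(colspec, columns, stringify):
    """Map each entry of colspec to a column name.

    Integer entries that are not themselves column names are treated as
    positions into `columns`. stringify=True mimics the behaviour before
    the fix; stringify=False mimics the behaviour after it.
    """
    colset = set(columns)
    colnames = []
    for c in colspec:
        if c in colset:
            colnames.append(c)
        elif isinstance(c, int) and c not in columns:
            name = columns[c]
            colnames.append(str(name) if stringify else name)
        else:
            colnames.append(c)
    return colnames


# With a two-row header, each column name is a tuple (one entry per level).
columns = [("D", "date"), ("T", "time"), ("A", "a"), ("B", "b")]
data_dict = {name: [] for name in columns}  # data is keyed by the tuples

old_names = resolve_colnames([0, 1], columns, stringify=True)
new_names = resolve_colnames([0, 1], columns, stringify=False)

print(old_names)  # ["('D', 'date')", "('T', 'time')"]
print(new_names)  # [('D', 'date'), ('T', 'time')]

# The stringified names are not keys of data_dict, so the lookup fails.
print(all(name in data_dict for name in old_names))  # False
print(all(name in data_dict for name in new_names))  # True
```

For single-level headers the column names are already strings, so dropping the `str()` call changes nothing there. Only tuple names are affected.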

3 files changed, +19 -1 lines changed

doc/source/whatsnew/v0.20.0.txt (+1)

@@ -580,3 +580,4 @@ Bug Fixes
 - Bug in ``Series.replace`` and ``DataFrame.replace`` which failed on empty replacement dicts (:issue:`15289`)
 - Bug in ``pd.melt()`` where passing a tuple value for ``value_vars`` caused a ``TypeError`` (:issue:`15348`)
 - Bug in ``.eval()`` which caused multiline evals to fail with local variables not on the first line (:issue:`15342`)
+- Bug in ``.read_csv()`` which caused ``parse_dates={'datetime': [0, 1]}`` to fail with multiline headers (:issue:`15376`)

pandas/io/parsers.py (+1 -1)

@@ -2856,7 +2856,7 @@ def _try_convert_dates(parser, colspec, data_dict, columns):
         if c in colset:
             colnames.append(c)
         elif isinstance(c, int) and c not in columns:
-            colnames.append(str(columns[c]))
+            colnames.append(columns[c])
         else:
             colnames.append(c)

pandas/tests/io/test_date_converters.py (+17)

@@ -148,3 +148,20 @@ def test_parse_date_column_with_empty_string(self):
                          [621, ' ']]
         expected = DataFrame(expected_data, columns=['case', 'opdate'])
         assert_frame_equal(result, expected)
+
+    def test_parse_date_time_multi_level_column_name(self):
+        # GH 15376
+        result = conv.parse_date_time(self.dates, self.times)
+        self.assertTrue((result == self.expected).all())
+
+        data = """\
+D, T, A, B
+date, time, a, b
+2001-01-05, 10:00:00, 0.0, 10.
+2001-01-05, 00:00:00, 1., 11.
+"""
+        datecols = {'date_time': [0, 1]}
+        df = read_table(StringIO(data), sep=',', header=[0, 1],
+                        parse_dates=datecols, date_parser=conv.parse_date_time)
+        self.assertIn('date_time', df)
+        self.assertEqual(df.date_time.loc[0], datetime(2001, 1, 5, 10, 0, 0))
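
The final assertion in the test expects the date and time columns of the first data row to merge into one datetime. The sketch below shows that combination in plain Python. `combine_date_time` is a hypothetical stand-in written for this illustration, not the implementation of `conv.parse_date_time`, and the whitespace stripping is an assumption made because the test data has a space after each comma.

```python
from datetime import datetime


def combine_date_time(dates, times):
    """Hypothetical stand-in: join date and time strings pairwise and parse them."""
    combined = []
    for d, t in zip(dates, times):
        # Strip the whitespace left over from the ", " separators (assumption).
        text = f"{d.strip()} {t.strip()}"
        combined.append(datetime.strptime(text, "%Y-%m-%d %H:%M:%S"))
    return combined


# The two data rows from the test, as they would appear per column.
dates = ["2001-01-05", "2001-01-05"]
times = [" 10:00:00", " 00:00:00"]

result = combine_date_time(dates, times)
print(result[0])  # 2001-01-05 10:00:00
```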
