Commit fb7dc7d

stephenrauch authored and jreback committed
BUG: Parse two date columns broken in read_csv with multiple headers
In `io/parsers/_try_convert_dates()`, when columns are selected by integer index from a set of columns with multi-level names, the column `name` was converted to a string. This appears to be a bug, since the `name` was a tuple before the conversion. It causes problems downstream when the name is used to look up a column: the lookup fails because the desired column is keyed by the tuple, not by its string representation.

closes #15376

Author: Stephen Rauch <[email protected]>

Closes #15378 from stephenrauch/fix_read_csv_merge_datetime and squashes the following commits:

030f5ec [Stephen Rauch] BUG: Parse two date columns broken in read_csv with multiple headers
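The failure mode described above can be reproduced without pandas. The following is a minimal sketch, not the parser's actual code: `columns` and `data_dict` are hypothetical stand-ins for the parser's internal state, with column labels taken from the test case in this commit.

```python
# Hypothetical stand-ins for the parser state (illustrative, not pandas internals).
# With a two-row header, each column label is a tuple of the two header levels.
columns = [('D', 'date'), ('T', 'time'), ('A', 'a'), ('B', 'b')]
data_dict = {col: [] for col in columns}  # column data keyed by the tuple labels

# Old behaviour: the label found by position was passed through str().
old_name = str(columns[0])

# New behaviour: the label is kept as the original tuple.
new_name = columns[0]

old_found = old_name in data_dict  # the string is not a key of data_dict
new_found = new_name in data_dict  # the tuple is a key of data_dict

print(repr(old_name), old_found)
print(repr(new_name), new_found)
```

The string form of the tuple looks like the label but is a different object with a different hash, so it never matches the tuple key.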
1 parent b3ae4c7 commit fb7dc7d

3 files changed: +21 -1 lines changed

doc/source/whatsnew/v0.20.0.txt

+1
@@ -625,6 +625,7 @@ Bug Fixes
+- Bug in ``.read_csv()`` with ``parse_dates`` when multiline headers are specified (:issue:`15376`)
 - Bug in ``DataFrame.boxplot`` where ``fontsize`` was not applied to the tick labels on both axes (:issue:`15108`)

pandas/io/parsers.py

+1 -1
@@ -2858,7 +2858,7 @@ def _try_convert_dates(parser, colspec, data_dict, columns):
         if c in colset:
             colnames.append(c)
         elif isinstance(c, int) and c not in columns:
-            colnames.append(str(columns[c]))
+            colnames.append(columns[c])
         else:
             colnames.append(c)
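To see the three branches of the patched loop in isolation, here is a hedged sketch that lifts only the name-resolution logic from the hunk into a standalone function. The name `resolve_colnames` is hypothetical; the real `_try_convert_dates` takes more arguments and goes on to run the date parser, which is omitted here.

```python
def resolve_colnames(colspec, columns):
    """Map each colspec entry to a column label, mirroring the patched loop."""
    colset = set(columns)
    colnames = []
    for c in colspec:
        if c in colset:
            # Entry is already a column label.
            colnames.append(c)
        elif isinstance(c, int) and c not in columns:
            # Entry is a position: keep the label as-is (no str() conversion).
            colnames.append(columns[c])
        else:
            # Anything else is passed through unchanged.
            colnames.append(c)
    return colnames


columns = [('D', 'date'), ('T', 'time'), ('A', 'a'), ('B', 'b')]
by_position = resolve_colnames([0, 1], columns)
by_label = resolve_colnames([('A', 'a')], columns)
passthrough = resolve_colnames(['missing'], columns)
print(by_position, by_label, passthrough)
```

For single-level headers the labels are usually already strings, so dropping `str()` should leave those cases unchanged; the difference only shows up when labels are tuples.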

pandas/tests/io/parser/parse_dates.py

+19
@@ -18,6 +18,7 @@
 import pandas.tseries.tools as tools
 import pandas.util.testing as tm
 
+import pandas.io.date_converters as conv
 from pandas import DataFrame, Series, Index, DatetimeIndex
 from pandas import compat
 from pandas.compat import parse_date, StringIO, lrange
@@ -491,3 +492,21 @@ def test_parse_dates_noconvert_thousands(self):
         result = self.read_csv(StringIO(data), index_col=[0, 1],
                                parse_dates=True, thousands='.')
         tm.assert_frame_equal(result, expected)
+
+    def test_parse_date_time_multi_level_column_name(self):
+        data = """\
+D,T,A,B
+date, time,a,b
+2001-01-05, 09:00:00, 0.0, 10.
+2001-01-06, 00:00:00, 1.0, 11.
+"""
+        datecols = {'date_time': [0, 1]}
+        result = self.read_csv(StringIO(data), sep=',', header=[0, 1],
+                               parse_dates=datecols,
+                               date_parser=conv.parse_date_time)
+
+        expected_data = [[datetime(2001, 1, 5, 9, 0, 0), 0., 10.],
+                         [datetime(2001, 1, 6, 0, 0, 0), 1., 11.]]
+        expected = DataFrame(expected_data,
+                             columns=['date_time', ('A', 'a'), ('B', 'b')])
+        tm.assert_frame_equal(result, expected)
