Commit 7e43c78

grahamjeffries authored and jreback committed
Remove NotImplementedError for parse_dates keyword in read_excel
Rebase and update of PR #12051

Author: Joris Van den Bossche <[email protected]>
Author: Graham R. Jeffries <[email protected]>

This patch had conflicts when merged, resolved by
Committer: Jeff Reback <[email protected]>

Closes #14326 from jorisvandenbossche/pr/12051 and squashes the following commits:

0b65a7a [Joris Van den Bossche] update wording
656ec44 [Joris Van den Bossche] Fix detection to raise warning
b1c7f87 [Joris Van den Bossche] add whatsnew
925ce1b [Joris Van den Bossche] Update tests
0e10a9d [Graham R. Jeffries] remove read_excel kwd NotImplemented error, update documentation #11544
1 parent 71f621f commit 7e43c78

4 files changed: +43 -20 lines changed

doc/source/io.rst (+14)

@@ -2767,6 +2767,20 @@ indices to be parsed.

    read_excel('path_to_file.xls', 'Sheet1', parse_cols=[0, 2, 3])

+
+Parsing Dates
++++++++++++++
+
+Datetime-like values are normally automatically converted to the appropriate
+dtype when reading the excel file. But if you have a column of strings that
+*look* like dates (but are not actually formatted as dates in excel), you can
+use the `parse_dates` keyword to parse those strings to datetimes:
+
+.. code-block:: python
+
+   read_excel('path_to_file.xls', 'Sheet1', parse_dates=['date_strings'])
+
+
 Cell Converters
 +++++++++++++++


doc/source/whatsnew/v0.19.0.txt (+4)

@@ -517,13 +517,17 @@ Other enhancements
 - The ``pd.read_json`` and ``DataFrame.to_json`` has gained support for reading and writing json lines with ``lines`` option see :ref:`Line delimited json <io.jsonl>` (:issue:`9180`)
 - :func:`read_excel` now supports the true_values and false_values keyword arguments (:issue:`13347`)
 - ``groupby()`` will now accept a scalar and a single-element list for specifying ``level`` on a non-``MultiIndex`` grouper. (:issue:`13907`)
+<<<<<<< HEAD
 - Non-convertible dates in an excel date column will be returned without conversion and the column will be ``object`` dtype, rather than raising an exception (:issue:`10001`).
 - ``pd.Timedelta(None)`` is now accepted and will return ``NaT``, mirroring ``pd.Timestamp`` (:issue:`13687`)
 - ``pd.read_stata()`` can now handle some format 111 files, which are produced by SAS when generating Stata dta files (:issue:`11526`)
 - ``Series`` and ``Index`` now support ``divmod`` which will return a tuple of
   series or indices. This behaves like a standard binary operator with regards
   to broadcasting rules (:issue:`14208`).

+=======
+- Re-enable the ``parse_dates`` keyword of ``read_excel`` to parse string columns as dates (:issue:`14326`)
+>>>>>>> PR_TOOL_MERGE_PR_14326

 .. _whatsnew_0190.api:

pandas/io/excel.py (+3 -6)

@@ -343,13 +343,10 @@ def _parse_excel(self, sheetname=0, header=0, skiprows=None, names=None,
         if 'chunksize' in kwds:
             raise NotImplementedError("chunksize keyword of read_excel "
                                       "is not implemented")
-        if parse_dates:
-            raise NotImplementedError("parse_dates keyword of read_excel "
-                                      "is not implemented")

-        if date_parser is not None:
-            raise NotImplementedError("date_parser keyword of read_excel "
-                                      "is not implemented")
+        if parse_dates is True and not index_col:
+            warn("The 'parse_dates=True' keyword of read_excel was provided"
+                 " without an 'index_col' keyword value.")

         import xlrd
         from xlrd import (xldate, XL_CELL_DATE,
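
The condition added to `_parse_excel` can be read in isolation. The sketch below is a standalone approximation, not the pandas implementation: `check_parse_dates` and `warning_count` are hypothetical helper names, and the standard-library `warnings.warn` stands in for the `warn` that `excel.py` imports. It shows that only the literal `parse_dates=True` triggers the warning, while a list of column names passes silently.

```python
import warnings


def check_parse_dates(parse_dates, index_col=None):
    """Hypothetical helper mirroring the condition added in the diff."""
    # Only the literal True is checked; a list of column names is not.
    if parse_dates is True and not index_col:
        warnings.warn("The 'parse_dates=True' keyword of read_excel was provided"
                      " without an 'index_col' keyword value.")


def warning_count(parse_dates, index_col=None):
    """Return how many warnings the check emits for the given arguments."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")  # make sure nothing is suppressed
        check_parse_dates(parse_dates, index_col)
    return len(caught)
```

Because the test is `not index_col`, an `index_col` of `0` is treated the same as no `index_col` at all, so `warning_count(True, index_col=0)` also reports a warning in this sketch. Whether that is intended is not stated in the commit.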

pandas/tests/io/test_excel.py (+22 -14)

@@ -924,17 +924,27 @@ def test_read_excel_chunksize(self):
                           chunksize=100)

     def test_read_excel_parse_dates(self):
-        # GH 11544
-        with tm.assertRaises(NotImplementedError):
-            pd.read_excel(os.path.join(self.dirpath, 'test1' + self.ext),
-                          parse_dates=True)
+        # GH 11544, 12051

-    def test_read_excel_date_parser(self):
-        # GH 11544
-        with tm.assertRaises(NotImplementedError):
-            dateparse = lambda x: pd.datetime.strptime(x, '%Y-%m-%d %H:%M:%S')
-            pd.read_excel(os.path.join(self.dirpath, 'test1' + self.ext),
-                          date_parser=dateparse)
+        df = DataFrame(
+            {'col': [1, 2, 3],
+             'date_strings': pd.date_range('2012-01-01', periods=3)})
+        df2 = df.copy()
+        df2['date_strings'] = df2['date_strings'].dt.strftime('%m/%d/%Y')
+
+        with ensure_clean(self.ext) as pth:
+            df2.to_excel(pth)
+
+            res = read_excel(pth)
+            tm.assert_frame_equal(df2, res)
+
+            res = read_excel(pth, parse_dates=['date_strings'])
+            tm.assert_frame_equal(df, res)
+
+            dateparser = lambda x: pd.datetime.strptime(x, '%m/%d/%Y')
+            res = read_excel(pth, parse_dates=['date_strings'],
+                             date_parser=dateparser)
+            tm.assert_frame_equal(df, res)

     def test_read_excel_skiprows_list(self):
         # GH 4903
@@ -1382,8 +1392,7 @@ def test_to_excel_multiindex(self):
             # round trip
             frame.to_excel(path, 'test1', merge_cells=self.merge_cells)
             reader = ExcelFile(path)
-            df = read_excel(reader, 'test1', index_col=[0, 1],
-                            parse_dates=False)
+            df = read_excel(reader, 'test1', index_col=[0, 1])
             tm.assert_frame_equal(frame, df)

     # GH13511
@@ -1424,8 +1433,7 @@ def test_to_excel_multiindex_cols(self):
             frame.to_excel(path, 'test1', merge_cells=self.merge_cells)
             reader = ExcelFile(path)
             df = read_excel(reader, 'test1', header=header,
-                            index_col=[0, 1],
-                            parse_dates=False)
+                            index_col=[0, 1])
             if not self.merge_cells:
                 fm = frame.columns.format(sparsify=False,
                                           adjoin=False, names=False)
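
The round trip that `test_read_excel_parse_dates` exercises can be illustrated without an Excel file. The following is a minimal sketch using only the standard library, not the pandas code path: it formats three dates as `%m/%d/%Y` strings (as the test does with `.dt.strftime`) and parses them back with the same `strptime` format the test passes as `date_parser`. The names `format_dates` and `parse_date_strings` are illustrative, and `datetime.strptime` stands in for `pd.datetime.strptime`, which was an alias for the standard-library class.

```python
from datetime import datetime, timedelta

DATE_FORMAT = '%m/%d/%Y'  # same format string as in the test


def format_dates(dates):
    """Render datetimes as strings that merely look like dates."""
    return [d.strftime(DATE_FORMAT) for d in dates]


def parse_date_strings(strings):
    """Parse the strings back, as the test's date_parser lambda would."""
    return [datetime.strptime(s, DATE_FORMAT) for s in strings]


# Three consecutive days from 2012-01-01, analogous to
# pd.date_range('2012-01-01', periods=3) in the test.
dates = [datetime(2012, 1, 1) + timedelta(days=i) for i in range(3)]
date_strings = format_dates(dates)
round_tripped = parse_date_strings(date_strings)
```

In the test, the plain `read_excel(pth)` call corresponds to keeping `date_strings` as text, and the `parse_dates=['date_strings']` calls correspond to recovering `dates`.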
