Commit 73fbd21

committed
CLN: modified timmie PR 4631
1 parent 253a45d commit 73fbd21

16 files changed, +81 -361 lines changed

doc/source/basics.rst

-3
Original file line number | Diff line number | Diff line change
@@ -465,9 +465,6 @@ take an optional ``axis`` argument:
465465
df.apply(lambda x: x.max() - x.min())
466466
df.apply(np.cumsum)
467467
df.apply(np.exp)
468-
469-
Please note that the default is the application along the DataFrame's index
470-
(axis=0) whereas for applying along the columns (axis=1) axis must be specified explicitly (see also :ref:`api.dataframe` and :ref:`api.series`).
471468
472469
Depending on the return type of the function passed to ``apply``, the result
473470
will either be of lower dimension or the same dimension.

doc/source/dsintro.rst

-17
@@ -431,9 +431,6 @@ available to insert at a particular location in the columns:
431431
432432
Indexing / Selection
433433
~~~~~~~~~~~~~~~~~~~~
434-
435-
.. _dsintro.basics-of-indexing:
436-
437434
The basics of indexing are as follows:
438435

439436
.. csv-table::
@@ -453,20 +450,6 @@ DataFrame:
453450
454451
df.loc['b']
455452
df.iloc[2]
456-
457-
There is also support for purely integer-based indexing provided by the following methods:
458-
459-
.. _dsintro.integer-indexing:
460-
461-
.. csv-table::
462-
:header: "Method","Description"
463-
:widths: 40,60
464-
465-
``Series.iget_value(i)``, Retrieve value stored at location ``i``
466-
``Series.iget(i)``, Alias for ``iget_value``
467-
``DataFrame.irow(i)``, Retrieve the ``i``-th row
468-
``DataFrame.icol(j)``, Retrieve the ``j``-th column
469-
"``DataFrame.iget_value(i, j)``", Retrieve the value at row ``i`` and column ``j``
470453
471454
For a more exhaustive treatment of more sophisticated label-based indexing and
472455
slicing, see the :ref:`section on indexing <indexing>`. We will address the

doc/source/index.rst

-1
@@ -133,5 +133,4 @@ See the package overview for more detail about what's in the library.
133133
related
134134
comparison_with_r
135135
api
136-
shortcuts
137136
release

doc/source/release.rst

+3
@@ -59,6 +59,9 @@ pandas 0.13
5959
- A Series of dtype ``timedelta64[ns]`` can now be divided by another
6060
``timedelta64[ns]`` object to yield a ``float64`` dtyped Series. This
6161
is frequency conversion.
62+
- read_excel (:issue:`4332`) supports a date_parser. This enables reading in hours
63+
in a form of 01:00-24:00 in both `Excel datemodes <http://support.microsoft.com/kb/214330/en-us>`_
64+
courtesy of @timmie
6265

6366
**API Changes**
6467

doc/source/shortcuts.rst

-30
This file was deleted.

doc/source/timeseries.rst

-52
@@ -387,8 +387,6 @@ regularity will result in a ``DatetimeIndex`` (but frequency is lost):
387387
DateOffset objects
388388
------------------
389389

390-
.. _timeseries.dateoffset:
391-
392390
In the preceding examples, we created DatetimeIndex objects at various
393391
frequencies by passing in frequency strings like 'M', 'W', and 'BM to the
394392
``freq`` keyword. Under the hood, these frequency strings are being translated
@@ -549,8 +547,6 @@ calendars which account for local holidays and local weekend conventions.
549547
Offset Aliases
550548
~~~~~~~~~~~~~~
551549

552-
.. _timeseries.offset-aliases:
553-
554550
A number of string aliases are given to useful common time series
555551
frequencies. We will refer to these aliases as *offset aliases*
556552
(referred to as *time rules* prior to v0.8.0).
@@ -1250,51 +1246,3 @@ The following are equivalent statements in the two versions of numpy.
12501246
y / np.timedelta64(1,'D')
12511247
y / np.timedelta64(1,'s')
12521248
1253-
Working with timedate-based indices
1254-
-------------------------------------------------------
1255-
1256-
The :func:`pd.datetime` allows for vectorised operations using datetime information stored in a :ref:`timeseries.datetimeindex`.
1257-
1258-
Use cases are:
1259-
1260-
* calculation of sunsunrise, sunset, daylength
1261-
* boolean test of working hours
1262-
1263-
An example contributed by a savvy user at `Stackoverflow <http://stackoverflow.com/a/15839530>`_:
1264-
1265-
.. ipython:: python
1266-
1267-
import pandas as pd
1268-
1269-
###1) create a date column from indiviadual year, month, day columns
1270-
df = pd.DataFrame({"year": [1992, 2003, 2014], "month": [2,3,4], "day": [10,20,30]})
1271-
df
1272-
1273-
df["Date"] = df.apply(lambda x: pd.datetime(x['year'], x['month'], x['day']), axis=1)
1274-
df
1275-
1276-
###2) alternatively, use the equivalent to datetime.datetime.combine
1277-
import numpy as np
1278-
1279-
#create a hourly timeseries
1280-
data_randints = np.random.randint(1, 10, 4000)
1281-
data_randints = data_randints.reshape(1000, 4)
1282-
ts = pd.Series(randn(1000), index=pd.date_range('1/1/2000', periods=1000, freq='H'))
1283-
df = pd.DataFrame(data_randints, index=ts.index, columns=['A', 'B', 'C', 'D'])
1284-
df.head()
1285-
1286-
#only for examplary purposes: get the date & time from the df.index
1287-
# in real world, these would be read in or generated from different columns
1288-
df['date'] = df.index.date
1289-
df.head()
1290-
1291-
df['time'] = df.index.time
1292-
df.head()
1293-
1294-
#combine both:
1295-
df['datetime'] = df.apply((lambda x: pd.datetime.combine(x['date'], x['time'])), axis=1)
1296-
df.head()
1297-
1298-
#the index could be set to the created column
1299-
df = df.set_index(['datetime'])
1300-
df.head()

doc/source/v0.13.0.txt

-4
@@ -270,10 +270,6 @@ Bug Fixes
270270

271271
- Suppressed DeprecationWarning associated with internal calls issued by repr() (:issue:`4391`)
272272

273-
- read_excel (:issue:`4332`) supports a date_parser. This enabales to reading in hours like 01:00-24:00 in both `Excel datemodes <http://support.microsoft.com/kb/214330/en-us>`_
274-
275-
- (Excel) parser (:issue:`4340`) allows skipping an arbitrary number of lines between header and first row.
276-
277273
See the :ref:`full release notes
278274
<release>` or issue tracker
279275
on GitHub for a complete list.

doc/source/v0.7.0.txt

+12-2
@@ -150,8 +150,18 @@ This change also has the same impact on DataFrame:
150150
In [5]: df.ix[3]
151151
KeyError: 3
152152

153-
In order to support purely `integer-based indexing <dsintro.integer-indexing>`, `corresponding methods <dsintro.integer-indexing>` have been added.
154-
153+
In order to support purely integer-based indexing, the following methods have
154+
been added:
155+
156+
.. csv-table::
157+
:header: "Method","Description"
158+
:widths: 40,60
159+
160+
``Series.iget_value(i)``, Retrieve value stored at location ``i``
161+
``Series.iget(i)``, Alias for ``iget_value``
162+
``DataFrame.irow(i)``, Retrieve the ``i``-th row
163+
``DataFrame.icol(j)``, Retrieve the ``j``-th column
164+
"``DataFrame.iget_value(i, j)``", Retrieve the value at row ``i`` and column ``j``
155165

156166
API tweaks regarding label-based slicing
157167
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

pandas/io/date_converters.py

+10-42
@@ -60,9 +60,9 @@ def _check_columns(cols):
6060
return N
6161

6262

63-
## Datetime Conversion for date_parsers
64-
## see also: create a community supported set of typical converters
65-
## https://github.com/pydata/pandas/issues/1180
63+
# Datetime Conversion for date_parsers
64+
# see also: create a community supported set of typical converters
65+
# https://github.com/pydata/pandas/issues/1180
6666

6767
def offset_datetime(dt_in, days=0, hours=0, minutes=0,
6868
seconds=0, microseconds=0):
@@ -82,52 +82,20 @@ def offset_datetime(dt_in, days=0, hours=0, minutes=0,
8282
8383
output
8484
------
85-
ti_corr : datetime.time or datetime.datetime object
85+
ti_corr : datetime.time object (or pass thru if no conversion)
8686
8787
8888
'''
8989
# if a excel time like '23.07.2013 24:00' they actually mean
9090
# in Python '23.07.2013 23:59', must be converted
91-
# offset = -10 # minutes
9291
delta = timedelta(days=days, hours=hours, minutes=minutes,
9392
seconds=seconds, microseconds=microseconds)
9493

95-
#check if offset it to me applied on datetime or time
96-
if type(dt_in) is time:
97-
#create psydo datetime
98-
dt_now = datetime.now()
99-
dt_base = datetime.combine(dt_now, dt_in)
100-
else:
101-
dt_base = dt_in
102-
103-
dt_corr = (dt_base) + delta
104-
105-
#if input is time, we return it.
106-
if type(dt_in) is time:
107-
dt_corr = dt_corr.time()
108-
109-
return dt_corr
110-
111-
112-
def dt2ti(dt_in):
113-
'''converts wrong datetime.datetime to datetime.time
114-
115-
input
116-
-----
117-
dt_in : dt_in : datetime.time or datetime.datetime object
118-
119-
output
120-
-------
121-
ti_corr : datetime.time object
122-
'''
123-
# so we correct those which are not of type :mod:datetime.time
124-
# impdt2tiortant hint:
125-
# http://stackoverflow.com/a/12906456
126-
if type(dt_in) is not time:
127-
dt_in = dt_in.time()
128-
elif type(dt_in) is datetime:
129-
dt_in = dt_in.time()
130-
else:
131-
pass
94+
offsetter = lambda base: (base) + delta
13295

96+
# check if offset it to me applied on datetime or time
97+
if isinstance(dt_in, time):
98+
return offsetter(datetime.combine(datetime.now(), dt_in)).time()
99+
elif isinstance(dt_in, datetime):
100+
return offsetter(datetime.combine(datetime.now(), dt_in.time())).time()
133101
return dt_in
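
Note on the hunk above: read together, the added lines suggest that `offset_datetime` now returns a `datetime.time` for both `time` and `datetime` inputs and passes anything else through unchanged. The following is a minimal standalone sketch of that behaviour, reconstructed from the diff only. The import line is an assumption (the module's actual imports are not shown in this diff), and the docstring is paraphrased rather than copied.

```python
# Assumed imports; the diff does not show the module's import block.
from datetime import datetime, time, timedelta


def offset_datetime(dt_in, days=0, hours=0, minutes=0,
                    seconds=0, microseconds=0):
    """Sketch of the post-commit logic, reconstructed from the diff."""
    delta = timedelta(days=days, hours=hours, minutes=minutes,
                      seconds=seconds, microseconds=microseconds)

    if isinstance(dt_in, time):
        # Attach today's date so the timedelta can be added, then drop it.
        return (datetime.combine(datetime.now(), dt_in) + delta).time()
    elif isinstance(dt_in, datetime):
        # Only the time-of-day of dt_in is used; its date is discarded.
        return (datetime.combine(datetime.now(), dt_in.time()) + delta).time()
    # Any other type is passed through untouched.
    return dt_in


shifted_time = offset_datetime(time(1, 0), minutes=-10)
shifted_dt = offset_datetime(datetime(2013, 7, 23, 12, 30), hours=1)
passthrough = offset_datetime("24:00")
```

If this reading is right, a `datetime` input loses its date component after the change, whereas the removed branch (`dt_base = dt_in`) would have returned a full `datetime`. That may be worth confirming against the `date_parser` callers in `pandas/io/excel.py`.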

pandas/io/excel.py

+10-16
@@ -127,18 +127,15 @@ def parse(self, sheetname, header=0, skiprows=None, skip_footer=0,
127127
skipfooter = kwds.pop('skipfooter', None)
128128
if skipfooter is not None:
129129
skip_footer = skipfooter
130-
131-
# this now gives back a df
132-
res = self._parse_excel(sheetname, header=header, skiprows=skiprows,
130+
131+
return self._parse_excel(sheetname, header=header, skiprows=skiprows,
133132
index_col=index_col,
134133
has_index_names=has_index_names,
135134
parse_cols=parse_cols,
136135
parse_dates=parse_dates,
137136
date_parser=date_parser, na_values=na_values,
138137
thousands=thousands, chunksize=chunksize,
139138
skip_footer=skip_footer, **kwds)
140-
141-
return res
142139

143140
def _should_parse(self, i, parse_cols):
144141

@@ -198,24 +195,21 @@ def _parse_excel(self, sheetname, header=0, skiprows=None, skip_footer=0,
198195
if parse_cols is None or should_parse[j]:
199196
if typ == XL_CELL_DATE:
200197
dt = xldate_as_tuple(value, datemode)
201-
198+
202199
# how to produce this first case?
203200
# if the year is ZERO then values are time/hours
204201
if dt[0] < datetime.MINYEAR: # pragma: no cover
205202
datemode = 1
206203
dt = xldate_as_tuple(value, datemode)
207-
208-
value = datetime.time(*dt[3:])
209-
204+
value = datetime.time(*dt[3:])
210205

211206
#or insert a full date
212207
else:
213208
value = datetime.datetime(*dt)
214-
215-
#apply eventual date_parser correction
209+
216210
if date_parser:
217-
value = date_parser(value)
218-
211+
value = date_parser(value)
212+
219213
elif typ == XL_CELL_ERROR:
220214
value = np.nan
221215
elif typ == XL_CELL_BOOLEAN:
@@ -237,13 +231,13 @@ def _parse_excel(self, sheetname, header=0, skiprows=None, skip_footer=0,
237231
skip_footer=skip_footer,
238232
chunksize=chunksize,
239233
**kwds)
240-
res = parser.read()
241-
234+
235+
res = parser.read()
236+
242237
if header is not None:
243238

244239
if len(data[header]) == len(res.columns.tolist()):
245240
res.columns = data[header]
246-
247241

248242
return res
249243

pandas/io/parsers.py

+2-30
@@ -1150,11 +1150,7 @@ def TextParser(*args, **kwds):
11501150
returns Series if only one column
11511151
"""
11521152
kwds['engine'] = 'python'
1153-
1154-
res = TextFileReader(*args, **kwds)
1155-
1156-
1157-
return res
1153+
return TextFileReader(*args, **kwds)
11581154

11591155
# delimiter=None, dialect=None, names=None, header=0,
11601156
# index_col=None,
@@ -1389,7 +1385,6 @@ def _convert_data(self, data):
13891385
clean_conv)
13901386

13911387
def _infer_columns(self):
1392-
#TODO: this full part is too complex and somewhat strage!!!
13931388
names = self.names
13941389

13951390
if self.header is not None:
@@ -1401,20 +1396,13 @@ def _infer_columns(self):
14011396
header = list(header) + [header[-1]+1]
14021397
else:
14031398
have_mi_columns = False
1404-
#TODO: explain why header (in this case 1 number) needs to be a list???
14051399
header = [ header ]
14061400

14071401
columns = []
14081402
for level, hr in enumerate(header):
1409-
#TODO: explain why self.buf is needed.
1410-
# the header is correctly retrieved in excel.py by
1411-
# data[header] = _trim_excel_header(data[header])
1403+
14121404
if len(self.buf) > 0:
14131405
line = self.buf[0]
1414-
1415-
elif (header[0] == hr) and (level == 0) and (header[0] > 0):
1416-
line = self._get_header()
1417-
14181406
else:
14191407
line = self._next_line()
14201408

@@ -1468,24 +1456,8 @@ def _infer_columns(self):
14681456
columns = [ names ]
14691457

14701458
return columns
1471-
1472-
def _get_header(self):
1473-
''' reads header if e.g. header
1474-
FIXME: this tshoul be turned into something much less complicates
1475-
FIXME: all due to the header assuming that there is never a row between
1476-
data and header
1477-
'''
1478-
if isinstance(self.data, list):
1479-
line = self.data[self.header]
1480-
self.pos = self.header +1
1481-
else:
1482-
line = self._next_line()
1483-
1484-
return line
14851459

14861460
def _next_line(self):
1487-
#FIXME: why is self.data at times a list and sometimes a _scv.reader??
1488-
# reduce complexity here!!!
14891461
if isinstance(self.data, list):
14901462
while self.pos in self.skiprows:
14911463
self.pos += 1
