
Cross off a few tslibs-TODOs #18443

Merged: 7 commits, merged Nov 25, 2017
Changes from 5 commits
13 changes: 0 additions & 13 deletions pandas/_libs/lib.pyx
@@ -929,19 +929,6 @@ def write_csv_rows(list data, ndarray data_index,
# ------------------------------------------------------------------------------
# Groupby-related functions

@cython.boundscheck(False)
def arrmap(ndarray[object] index, object func):
cdef int length = index.shape[0]
cdef int i = 0

cdef ndarray[object] result = np.empty(length, dtype=np.object_)

for i from 0 <= i < length:
result[i] = func(index[i])

return result


@cython.wraparound(False)
@cython.boundscheck(False)
def is_lexsorted(list list_of_arrays):
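For reference, the removed arrmap was just a thin object-dtype mapping loop; the same behaviour is available from helpers that remain in lib.pyx. A minimal sketch, assuming lib.map_infer keeps its current signature (the wrapper name below is hypothetical):

import numpy as np
from pandas._libs import lib

def arrmap_equivalent(index, func):
    # lib.map_infer applies func elementwise over an object ndarray, which is
    # what arrmap did with an explicit Cython loop; convert=False keeps an
    # object-dtype result, matching what arrmap returned.
    return lib.map_infer(np.asarray(index, dtype=object), func, convert=False)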
1 change: 0 additions & 1 deletion pandas/_libs/period.pyx
@@ -559,7 +559,6 @@ cdef class _Period(object):
int64_t ordinal
object freq

_comparables = ['name', 'freqstr']
_typ = 'period'

def __cinit__(self, ordinal, freq):
1 change: 0 additions & 1 deletion pandas/_libs/src/datetime.pxd
@@ -54,7 +54,6 @@ cdef extern from "datetime/np_datetime.h":

int dayofweek(int y, int m, int d) nogil
int is_leapyear(int64_t year) nogil
PANDAS_DATETIMEUNIT get_datetime64_unit(object o)

cdef extern from "datetime/np_datetime_strings.h":

64 changes: 30 additions & 34 deletions pandas/_libs/src/datetime/np_datetime.c
@@ -564,18 +564,15 @@ void pandas_datetime_to_datetimestruct(npy_datetime val, PANDAS_DATETIMEUNIT fr,

void pandas_timedelta_to_timedeltastruct(npy_timedelta val,
PANDAS_DATETIMEUNIT fr,
pandas_timedeltastruct *result) {
pandas_timedeltastruct *result) {
pandas_datetime_metadata meta;

meta.base = fr;
meta.num - 1;
meta.num = 1;
Member

Thanks for fixing this typo. Do we even need the assignment at all though? Worth considering removal?

Member Author

I'm advocating moving the timedelta_struct stuff out of src/ entirely. The src/ directory is a PITA and we can put it in cython directly without taking a performance hit.
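For context, tslibs/timedeltas.pyx already cimports pandas_timedeltastruct and td64_to_tdstruct from np_datetime (see the hunk further down in this diff), so a pure-Cython version of the nanosecond branch could look roughly like the sketch below. Names are hypothetical, only the non-negative case is shown, and this is not part of this PR:

from libc.stdint cimport int64_t

cdef struct tdstruct_sketch:
    int64_t days
    int64_t hrs
    int64_t min
    int64_t sec
    int64_t ms
    int64_t us
    int64_t ns

cdef void td64_ns_to_tdstruct_sketch(int64_t td, tdstruct_sketch* out):
    # Split a non-negative nanosecond timedelta into components, mirroring
    # the nanosecond branch of convert_timedelta_to_timedeltastruct above.
    cdef int64_t frac = td // 1000000000          # whole seconds
    cdef int64_t ns_rem = td - frac * 1000000000  # leftover nanoseconds

    out.days = frac // 86400
    frac -= out.days * 86400
    out.hrs = frac // 3600
    frac -= out.hrs * 3600
    out.min = frac // 60
    frac -= out.min * 60
    out.sec = frac

    out.ms = ns_rem // 1000000
    ns_rem -= out.ms * 1000000
    out.us = ns_rem // 1000
    out.ns = ns_rem - out.us * 1000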


convert_timedelta_to_timedeltastruct(&meta, val, result);
}

PANDAS_DATETIMEUNIT get_datetime64_unit(PyObject *obj) {
return (PANDAS_DATETIMEUNIT)((PyDatetimeScalarObject *)obj)->obmeta.base;
}

/*
* Converts a datetime from a datetimestruct to a datetime based
@@ -1001,7 +998,6 @@ int convert_datetime_to_datetimestruct(pandas_datetime_metadata *meta,
int convert_timedelta_to_timedeltastruct(pandas_timedelta_metadata *meta,
npy_timedelta td,
pandas_timedeltastruct *out) {
npy_int64 perday;
npy_int64 frac;
npy_int64 sfrac;
npy_int64 ifrac;
@@ -1016,11 +1012,11 @@ int convert_timedelta_to_timedeltastruct(pandas_timedelta_metadata *meta,

// put frac in seconds
if (td < 0 && td % (1000LL * 1000LL * 1000LL) != 0)
frac = td / (1000LL * 1000LL * 1000LL) - 1;
Contributor

I think we already lint this file (cpplint). Is this not picked up? I'd rather have the proper config settings than manually changing things.

Member Author

No argument here, but cpplint is well outside my domain of competence. I only noticed this indentation when trying to move this func into cython.

Contributor

@WillAyd any experience with cpplint? Want to give this a shot, e.g. making sure we are linting the *.c files (more/better)?

Member (@WillAyd, Nov 23, 2017)

Can't claim experience with cpplint but looking at the bash script it appears to already hit all of the files in the datetime folder. I ran both versions of this file through cpplint locally and neither caused any issue with the highest level of verbosity. Surprisingly I couldn't get an error when changing the line to a 1 or 3 space indent either, so I think there might be some bug with cpplint itself.

Looking at Google's style guide, they suggest a 2-space indent. To match that and keep consistent with the rest of the file, I would suggest removing the two spaces added here and removing the two from the indented block after the else statement instead.

https://google.github.io/styleguide/cppguide.html#Conditionals

Contributor

See https://github.com/apache/arrow?files=1 for some hints on how to lint / enforce this as well.

Member

Are you saying that within arrow cpplint is calling out those whitespace issues? I tried replicating the filters I saw in that project via:

cpplint --verbose=2 --filter=-whitespace/comments,-readability/todo,-build/header_guard,-build/c++11,-runtime/references,-build/include_order pandas/_libs/src/datetime/np_datetime.c

And then tried an even more verbose version without any filter:

cpplint --verbose=1 pandas/_libs/src/datetime/np_datetime.c

And neither made any mention of those lines, whether I set them at 2, 4 or even 3 space indents.

Contributor

No, just a guess that they have more options set for linting.

Member

Looking at the cpplint source, I don't see anything that checks for proper 2-space indentation of conditional statements, nor was there any documentation hinting that it could. The whitespace/indent check it offers only seems to make sure that you don't start with 1 or 3 spaces, but doesn't do much beyond that. Unfortunately I don't see any programmatic options to continually validate this with that tool as is.

frac = td / (1000LL * 1000LL * 1000LL) - 1;
else
frac = td / (1000LL * 1000LL * 1000LL);

if (frac < 0) {
if (frac < 0) {
sign = -1;

// even fraction
Expand All @@ -1030,66 +1026,66 @@ int convert_timedelta_to_timedeltastruct(pandas_timedelta_metadata *meta,
} else {
frac = -frac;
}
} else {
} else {
sign = 1;
out->days = 0;
}
}

if (frac >= 86400) {
if (frac >= 86400) {
out->days += frac / 86400LL;
frac -= out->days * 86400LL;
}
}

if (frac >= 3600) {
if (frac >= 3600) {
out->hrs = frac / 3600LL;
frac -= out->hrs * 3600LL;
} else {
} else {
out->hrs = 0;
}
}

if (frac >= 60) {
if (frac >= 60) {
out->min = frac / 60LL;
frac -= out->min * 60LL;
} else {
} else {
out->min = 0;
}
}

if (frac >= 0) {
if (frac >= 0) {
out->sec = frac;
frac -= out->sec;
} else {
} else {
out->sec = 0;
}
}

sfrac = (out->hrs * 3600LL + out->min * 60LL
+ out->sec) * (1000LL * 1000LL * 1000LL);
sfrac = (out->hrs * 3600LL + out->min * 60LL
+ out->sec) * (1000LL * 1000LL * 1000LL);

if (sign < 0)
if (sign < 0)
out->days = -out->days;

ifrac = td - (out->days * DAY_NS + sfrac);
ifrac = td - (out->days * DAY_NS + sfrac);

if (ifrac != 0) {
if (ifrac != 0) {
out->ms = ifrac / (1000LL * 1000LL);
ifrac -= out->ms * 1000LL * 1000LL;
out->us = ifrac / 1000LL;
ifrac -= out->us * 1000LL;
out->ns = ifrac;
} else {
} else {
out->ms = 0;
out->us = 0;
out->ns = 0;
}
}

out->seconds = out->hrs * 3600 + out->min * 60 + out->sec;
out->microseconds = out->ms * 1000 + out->us;
out->nanoseconds = out->ns;
break;
out->seconds = out->hrs * 3600 + out->min * 60 + out->sec;
out->microseconds = out->ms * 1000 + out->us;
out->nanoseconds = out->ns;
break;

default:
PyErr_SetString(PyExc_RuntimeError,
"NumPy datetime metadata is corrupted with invalid "
"base unit");
"NumPy timedelta metadata is corrupted with "
"invalid base unit");
return -1;
}

3 changes: 0 additions & 3 deletions pandas/_libs/src/datetime/np_datetime.h
@@ -148,7 +148,4 @@ convert_timedelta_to_timedeltastruct(pandas_timedelta_metadata *meta,
pandas_timedeltastruct *out);


PANDAS_DATETIMEUNIT get_datetime64_unit(PyObject *obj);


#endif // PANDAS__LIBS_SRC_DATETIME_NP_DATETIME_H_
12 changes: 6 additions & 6 deletions pandas/_libs/tslib.pyx
@@ -60,7 +60,7 @@ from tslibs.conversion cimport (tz_convert_single, _TSObject,
from tslibs.conversion import tz_convert_single

from tslibs.nattype import NaT, nat_strings, iNaT
from tslibs.nattype cimport _checknull_with_nat, NPY_NAT
from tslibs.nattype cimport checknull_with_nat, NPY_NAT

from tslibs.timestamps cimport (create_timestamp_from_ts,
_NS_UPPER_BOUND, _NS_LOWER_BOUND)
@@ -409,7 +409,7 @@ cpdef array_with_unit_to_datetime(ndarray values, unit, errors='coerce'):
for i in range(n):
val = values[i]
Contributor

These array_* functions should be moved to conversion I think (obviously a new PR), then tslib.pyx is moot?

Member Author

Let's hold off on that. I'd like to do auditing before moving, and the array_* funcs may need some work/thought (#17697)


if _checknull_with_nat(val):
if checknull_with_nat(val):
iresult[i] = NPY_NAT

elif is_integer_object(val) or is_float_object(val):
@@ -475,7 +475,7 @@ cpdef array_with_unit_to_datetime(ndarray values, unit, errors='coerce'):
for i in range(n):
val = values[i]

if _checknull_with_nat(val):
if checknull_with_nat(val):
oresult[i] = NaT
elif is_integer_object(val) or is_float_object(val):

@@ -526,7 +526,7 @@ cpdef array_to_datetime(ndarray[object] values, errors='raise',
for i in range(n):
val = values[i]

if _checknull_with_nat(val):
if checknull_with_nat(val):
iresult[i] = NPY_NAT

elif PyDateTime_Check(val):
@@ -686,7 +686,7 @@ cpdef array_to_datetime(ndarray[object] values, errors='raise',
val = values[i]

# set as nan except if its a NaT
if _checknull_with_nat(val):
if checknull_with_nat(val):
if PyFloat_Check(val):
oresult[i] = np.nan
else:
@@ -704,7 +704,7 @@

for i in range(n):
val = values[i]
if _checknull_with_nat(val):
if checknull_with_nat(val):
oresult[i] = val
elif is_string_object(val):

4 changes: 2 additions & 2 deletions pandas/_libs/tslibs/conversion.pyx
@@ -40,7 +40,7 @@ from timezones cimport (
from parsing import parse_datetime_string

from nattype import nat_strings, NaT
from nattype cimport NPY_NAT, _checknull_with_nat
from nattype cimport NPY_NAT, checknull_with_nat

# ----------------------------------------------------------------------
# Constants
@@ -140,7 +140,7 @@ def datetime_to_datetime64(ndarray[object] values):
iresult = result.view('i8')
for i in range(n):
val = values[i]
if _checknull_with_nat(val):
if checknull_with_nat(val):
iresult[i] = NPY_NAT
elif PyDateTime_Check(val):
if val.tzinfo is not None:
2 changes: 1 addition & 1 deletion pandas/_libs/tslibs/nattype.pxd
@@ -6,4 +6,4 @@ cdef int64_t NPY_NAT

cdef bint _nat_scalar_rules[6]

cdef bint _checknull_with_nat(object val)
cdef bint checknull_with_nat(object val)
2 changes: 1 addition & 1 deletion pandas/_libs/tslibs/nattype.pyx
@@ -572,7 +572,7 @@ NaT = NaTType()

# ----------------------------------------------------------------------

cdef inline bint _checknull_with_nat(object val):
cdef inline bint checknull_with_nat(object val):
""" utility to check if a value is a nat or not """
return val is None or (
PyFloat_Check(val) and val != val) or val is NaT
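In plain terms, the renamed helper treats None, a float NaN, and NaT as missing. A minimal Python-level illustration of the same predicate (the _py suffix is hypothetical, for demonstration only):

import numpy as np
import pandas as pd

def checknull_with_nat_py(val):
    # Mirrors the cdef predicate above: None, a float NaN, or NaT counts as null.
    return val is None or (isinstance(val, float) and val != val) or val is pd.NaT

assert checknull_with_nat_py(None)
assert checknull_with_nat_py(float("nan"))
assert checknull_with_nat_py(pd.NaT)
assert not checknull_with_nat_py(np.datetime64("2017-11-25"))
assert not checknull_with_nat_py("2017-11-25")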
4 changes: 2 additions & 2 deletions pandas/_libs/tslibs/strptime.pyx
@@ -38,7 +38,7 @@ from np_datetime cimport (check_dts_bounds,

from util cimport is_string_object

from nattype cimport _checknull_with_nat, NPY_NAT
from nattype cimport checknull_with_nat, NPY_NAT
from nattype import nat_strings


@@ -142,7 +142,7 @@ def array_strptime(ndarray[object] values, object fmt,
iresult[i] = NPY_NAT
continue
else:
if _checknull_with_nat(val):
if checknull_with_nat(val):
iresult[i] = NPY_NAT
continue
else:
8 changes: 4 additions & 4 deletions pandas/_libs/tslibs/timedeltas.pyx
@@ -30,7 +30,7 @@ from np_datetime cimport (cmp_scalar, reverse_ops, td64_to_tdstruct,
pandas_timedeltastruct)

from nattype import nat_strings, NaT
from nattype cimport _checknull_with_nat, NPY_NAT
from nattype cimport checknull_with_nat, NPY_NAT

# ----------------------------------------------------------------------
# Constants
@@ -111,7 +111,7 @@ cpdef convert_to_timedelta64(object ts, object unit):
# kludgy here until we have a timedelta scalar
# handle the numpy < 1.7 case
"""
if _checknull_with_nat(ts):
if checknull_with_nat(ts):
return np.timedelta64(NPY_NAT)
elif isinstance(ts, Timedelta):
# already in the proper format
@@ -443,7 +443,7 @@ cdef inline timedelta_from_spec(object number, object frac, object unit):

cdef bint _validate_ops_compat(other):
# return True if we are compat with operating
if _checknull_with_nat(other):
if checknull_with_nat(other):
return True
elif PyDelta_Check(other) or is_timedelta64_object(other):
return True
@@ -837,7 +837,7 @@ class Timedelta(_Timedelta):
elif is_integer_object(value) or is_float_object(value):
# unit=None is de-facto 'ns'
value = convert_to_timedelta64(value, unit)
elif _checknull_with_nat(value):
elif checknull_with_nat(value):
return NaT
else:
raise ValueError(
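As a quick usage note for the branch above: null-like inputs to the Timedelta constructor still end up returning the NaT singleton after the rename. Expected behaviour, shown as an illustrative sketch rather than a test from this PR:

import pandas as pd

# None hits the checknull_with_nat branch directly; NaN is routed through
# convert_to_timedelta64, which performs the same check.
assert pd.Timedelta(None) is pd.NaT
assert pd.Timedelta(float("nan")) is pd.NaT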