Skip to content

Fix get_datetimestruct_days overflow #56001

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 12 commits into from
Aug 19, 2024
48 changes: 29 additions & 19 deletions pandas/_libs/src/vendored/numpy/datetime/np_datetime.c
Original file line number Diff line number Diff line change
Expand Up @@ -20,14 +20,12 @@ This file is derived from NumPy 1.7. See NUMPY_LICENSE.txt
#define NPY_NO_DEPRECATED_API NPY_1_7_API_VERSION
#endif // NPY_NO_DEPRECATED_API

#include <Python.h>

#include "pandas/vendored/numpy/datetime/np_datetime.h"

#define NO_IMPORT_ARRAY
#define PY_ARRAY_UNIQUE_SYMBOL PANDAS_DATETIME_NUMPY
#include <numpy/ndarrayobject.h>
#include <numpy/npy_common.h>
#include <stdbool.h>

#if defined(_WIN32)
#ifndef ENABLE_INTSAFE_SIGNED_FUNCTIONS
Expand Down Expand Up @@ -58,12 +56,15 @@ _Static_assert(0, "__has_builtin not detected; please try a newer compiler");
#endif
#endif

#define XSTR(a) STR(a)
#define STR(a) #a

#define PD_CHECK_OVERFLOW(FUNC) \
do { \
if ((FUNC) != 0) { \
PyGILState_STATE gstate = PyGILState_Ensure(); \
PyErr_SetString(PyExc_OverflowError, \
"Overflow occurred in npy_datetimestruct_to_datetime"); \
"Overflow occurred at " __FILE__ ":" XSTR(__LINE__)); \
PyGILState_Release(gstate); \
return -1; \
} \
Expand Down Expand Up @@ -139,53 +140,53 @@ npy_int64 get_datetimestruct_days(const npy_datetimestruct *dts) {
npy_int64 year, days = 0;
const int *month_lengths;

year = dts->year - 1970;
days = year * 365;
PD_CHECK_OVERFLOW(checked_int64_sub(dts->year, 1970, &year));
PD_CHECK_OVERFLOW(checked_int64_mul(year, 365, &days));

/* Adjust for leap years */
if (days >= 0) {
/*
* 1968 is the closest leap year before 1970.
* Exclude the current year, so add 1.
*/
year += 1;
PD_CHECK_OVERFLOW(checked_int64_add(year, 1, &year));
/* Add one day for each 4 years */
days += year / 4;
PD_CHECK_OVERFLOW(checked_int64_add(days, year / 4, &days));
/* 1900 is the closest previous year divisible by 100 */
year += 68;
PD_CHECK_OVERFLOW(checked_int64_add(year, 68, &year));
/* Subtract one day for each 100 years */
days -= year / 100;
PD_CHECK_OVERFLOW(checked_int64_sub(days, year / 100, &days));
/* 1600 is the closest previous year divisible by 400 */
year += 300;
PD_CHECK_OVERFLOW(checked_int64_add(year, 300, &year));
/* Add one day for each 400 years */
days += year / 400;
PD_CHECK_OVERFLOW(checked_int64_add(days, year / 400, &days));
} else {
/*
* 1972 is the closest later year after 1970.
* Include the current year, so subtract 2.
*/
year -= 2;
PD_CHECK_OVERFLOW(checked_int64_sub(year, 2, &year));
/* Subtract one day for each 4 years */
days += year / 4;
PD_CHECK_OVERFLOW(checked_int64_add(days, year / 4, &days));
/* 2000 is the closest later year divisible by 100 */
year -= 28;
PD_CHECK_OVERFLOW(checked_int64_sub(year, 28, &year));
/* Add one day for each 100 years */
days -= year / 100;
PD_CHECK_OVERFLOW(checked_int64_sub(days, year / 100, &days));
/* 2000 is also the closest later year divisible by 400 */
/* Subtract one day for each 400 years */
days += year / 400;
PD_CHECK_OVERFLOW(checked_int64_add(days, year / 400, &days));
}

month_lengths = days_per_month_table[is_leapyear(dts->year)];
month = dts->month - 1;

/* Add the months */
for (i = 0; i < month; ++i) {
days += month_lengths[i];
PD_CHECK_OVERFLOW(checked_int64_add(days, month_lengths[i], &days));
}

/* Add the days */
days += dts->day - 1;
PD_CHECK_OVERFLOW(checked_int64_add(days, dts->day - 1, &days));

return days;
}
Expand Down Expand Up @@ -430,6 +431,15 @@ npy_datetime npy_datetimestruct_to_datetime(NPY_DATETIMEUNIT base,
}

const int64_t days = get_datetimestruct_days(dts);
if (days == -1) {
PyGILState_STATE gstate = PyGILState_Ensure();
bool did_error = PyErr_Occurred() == NULL ? false : true;
PyGILState_Release(gstate);
if (did_error) {
return -1;
}
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is acquiring the gil, checking if an overflow happened, then releasing it? is there a nogil alternative?

id be more comfortable with this if we had a test where it made a difference

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The GIL is only acquired when days == -1 because you need to check if that is a legitimate value or if a Python exception was thrown. This is essentially what except? -1 does in Cython


if (base == NPY_FR_D) {
return days;
}
Expand Down
Loading