REF: do all casting _before_ call to DatetimeEngine.get_loc #30948

Merged
merged 17 commits into from
Jan 15, 2020

jbrockmendel
Member

Fixes: incorrectly letting integers or mismatched types through.

Removes: maybe_datetimelike_to_i8. Along with the recently removed _to_M8 and the hopefully soon-to-be-removed pydt_to_i8 (xref #30854), I think we'll be rid of our kludgy datetime casting functions.

Performance neutral:

       before           after         ratio
     [28e909c6]       [890204ca]
     <master>         <cln-maybe_datetimelike_to_i8>
+         459±3μs        694±300μs     1.51  groupby.GroupByMethods.time_dtype_as_field('int', 'tail', 'transformation')
+      1.84±0.2μs         2.38±2μs     1.29  index_cached_properties.IndexCache.time_shape('Float64Index')
+     1.09±0.07μs      1.27±0.09μs     1.16  index_cached_properties.IndexCache.time_inferred_type('Float64Index')
+     1.43±0.01μs       1.63±0.2μs     1.14  tslibs.timestamp.TimestampConstruction.time_fromordinal
+         510±7μs         576±60μs     1.13  ctors.SeriesConstructors.time_series_constructor(<function list_of_str at 0x7fe19dbc09d8>, True, 'int')
+     6.75±0.06μs      7.51±0.05μs     1.11  index_object.Indexing.time_get_loc_non_unique_sorted('Int')
+         517±8μs         573±70μs     1.11  ctors.SeriesConstructors.time_series_constructor(<function list_of_str at 0x7fe19dbc09d8>, True, 'float')
+     4.83±0.06ms      5.33±0.07ms     1.10  index_object.SetOperations.time_operation('int', 'symmetric_difference')
+     1.08±0.09μs      1.19±0.07μs     1.10  index_cached_properties.IndexCache.time_is_all_dates('Float64Index')
+     1.74±0.02ms      1.92±0.04ms     1.10  inference.ToNumericDowncast.time_downcast('datetime64', 'float')
-      25.4±0.8μs       23.1±0.2μs     0.91  indexing.NumericSeriesIndexing.time_getitem_scalar(<class 'pandas.core.indexes.numeric.Float64Index'>, 'unique_monotonic_inc')
-     8.31±0.06ms       7.53±0.2ms     0.91  inference.DateInferOps.time_timedelta_plus_datetime
-      16.1±0.7ms       14.6±0.3ms     0.91  timedelta.ToTimedelta.time_convert_string_days
-      14.0±0.8μs      12.7±0.08μs     0.90  index_object.Indexing.time_get_loc('Float')
-     1.94±0.01ms      1.75±0.02ms     0.90  index_object.Indexing.time_get_loc_non_unique('Int')
-     12.8±0.04ms       11.5±0.1ms     0.90  multiindex_object.GetLoc.time_med_get_loc_warm
-      14.1±0.7μs       12.6±0.1μs     0.89  index_object.Indexing.time_get_loc_sorted('Float')
-     12.5±0.04μs       11.1±0.1μs     0.89  multiindex_object.GetLoc.time_string_get_loc
-      10.2±0.6μs      9.05±0.05μs     0.89  index_object.Float64IndexMethod.time_get_loc
-      12.3±0.1ms       10.8±0.1ms     0.88  multiindex_object.GetLoc.time_small_get_loc_warm
-      34.4±0.6μs       30.2±0.6μs     0.88  index_object.Indexing.time_get_loc_non_unique_sorted('Float')
-         104±3μs         91.0±2μs     0.88  multiindex_object.GetLoc.time_large_get_loc
-      4.90±0.1ms      4.21±0.02ms     0.86  timeseries.ResampleSeries.time_resample('period', '1D', 'ohlc')
-      3.78±0.2ms      3.23±0.01ms     0.85  timeseries.ResampleSeries.time_resample('datetime', '5min', 'mean')
-      10.4±0.6μs      8.74±0.03μs     0.84  tslibs.timedelta.TimedeltaConstructor.time_from_iso_format
-      4.38±0.1ms      3.69±0.02ms     0.84  timeseries.ResampleSeries.time_resample('period', '5min', 'mean')
-      3.23±0.5μs       2.24±0.7μs     0.69  index_cached_properties.IndexCache.time_values('TimedeltaIndex')
-       641±200μs          443±1μs     0.69  groupby.GroupByMethods.time_dtype_as_field('int', 'head', 'direct')
-        2.73±1μs       1.45±0.2μs     0.53  index_cached_properties.IndexCache.time_values('UInt64Index')
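
A note on reading the table: the ratio column appears to be the "after" timing divided by the "before" timing, so rows marked `+` (ratio above 1) are slowdowns and rows marked `-` (ratio below 1) are speedups. The quick check below recomputes the ratio for a few rows copied from the table. Because the table shows rounded timings, recomputing other rows from the displayed values may differ in the last digit.

```python
# (before, after) timings copied from four rows of the table above.
# Both values in a row share the same unit, so the unit cancels.
rows = {
    "time_dtype_as_field('int', 'tail', 'transformation')": (459, 694),
    "time_shape('Float64Index')": (1.84, 2.38),
    "time_getitem_scalar(Float64Index, 'unique_monotonic_inc')": (25.4, 23.1),
    "time_values('UInt64Index')": (2.73, 1.45),
}

# ratio = after / before, rounded to two decimals as in the table
ratios = {name: round(after / before, 2) for name, (before, after) in rows.items()}

for name, ratio in ratios.items():
    direction = "slower" if ratio > 1 else "faster"
    print(f"{ratio:.2f}  {direction}  {name}")
```

Several of the largest swings in either direction also carry large error bars (for example 694±300μs and 2.73±1μs), which fits the "performance neutral" reading.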

Member

@simonjayhawkins simonjayhawkins left a comment

Thanks @jbrockmendel. Can you add match arguments to pytest.raises? See #23922.
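
For context, a minimal sketch of what a `match` argument adds. The function `lookup` and its error message are hypothetical, not taken from this PR's tests. `pytest.raises(..., match=...)` checks the pattern with `re.search` against the string form of the exception, so the test fails if the expected exception type is raised for an unrelated reason.

```python
import re

import pytest


def lookup(mapping, key):
    """Hypothetical function under test: reject integer keys explicitly."""
    if isinstance(key, int):
        raise KeyError(f"integer key not allowed: {key}")
    return mapping[key]


def test_lookup_rejects_integers():
    # Without `match`, any KeyError would satisfy the test, including one
    # raised by a plain missing key. With `match`, the message must agree.
    msg = re.escape("integer key not allowed: 3")
    with pytest.raises(KeyError, match=msg):
        lookup({"a": 1}, 3)
```

`re.escape` is used here so that any regex metacharacters in the expected message are treated literally.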

This was referenced Jan 14, 2020
@jreback jreback added the Indexing, Performance, and Datetime labels Jan 15, 2020
@jreback jreback added this to the 1.1 milestone Jan 15, 2020
@jreback jreback merged commit 698920f into pandas-dev:master Jan 15, 2020
@jreback
Contributor

jreback commented Jan 15, 2020

thanks @jbrockmendel

keep em coming!

@jbrockmendel
Member Author

> keep em coming!

Count on it.

@jbrockmendel jbrockmendel deleted the cln-maybe_datetimelike_to_i8 branch January 15, 2020 16:51