REF: strictness and checks for Timedelta _simple_new #23433

Merged: 2 commits, Nov 3, 2018

Conversation

jbrockmendel
Member

Broken off from #23426.

@pep8speaks

Hello @jbrockmendel! Thanks for submitting the PR.

@gfyoung added the Dtype Conversions, Internals, and Timedelta labels Oct 31, 2018

assert isinstance(values, np.ndarray), type(values)
if values.dtype == 'i8':
    values = values.view('m8[ns]')
assert values.dtype == 'm8[ns]', values.dtype
Member

Same questions for all your assert statements:

  • Are these internal? If not, it would be nice to have user-friendly error messages.
  • Can these assert statements be tested in any way?

Member

@gfyoung Oct 31, 2018

The question of testing is actually more general to all of these changes. Even though it's been labeled as internal, not sure if any of these edits will surface in any way.

Member

@gfyoung Oct 31, 2018

The edits in _simple_new are thoroughly internal; users shouldn’t get near it.

Not sure what testing these assertions would look like. Can you elaborate what you have in mind?

(moving your responses to the conversation bubble in the UI, organizational thing)

@jbrockmendel: What I was wondering was whether we could trigger these assert statements (e.g. by passing an invalid input to a public-facing function or method).

Might be tricky if these edits are purely internal, and if it is too difficult, not a big deal. Just out of curiosity since tests are good if we can have them.

Member Author

I see. Yah, we could write tests where we directly pass invalid inputs to _simple_new. (Typing with thumbs, feel free to reformat if necessary)
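
One way to write such a test is to call the strict function directly with inputs that violate its contract. The sketch below is only an illustration: `strict_values` is a hypothetical stand-in that copies the checks from the snippet under review, and it does not call pandas' private `_simple_new`. In a pytest suite the `try`/`except` helper would normally be replaced by `pytest.raises(AssertionError)`.

```python
import numpy as np


def strict_values(values):
    """Hypothetical stand-in mirroring the assertions in the reviewed snippet."""
    assert isinstance(values, np.ndarray), type(values)
    if values.dtype == 'i8':
        # int64 nanoseconds are reinterpreted as timedelta64[ns].
        values = values.view('m8[ns]')
    assert values.dtype == 'm8[ns]', values.dtype
    return values


def raises_assertion(bad_input):
    """Return True if strict_values rejects the input with AssertionError."""
    try:
        strict_values(bad_input)
    except AssertionError:
        return True
    return False


# Inputs that break the contract: a list, a float array, a narrower int array.
rejected = [
    raises_assertion([1, 2, 3]),
    raises_assertion(np.array([1.5, 2.5])),
    raises_assertion(np.array([1, 2], dtype='i4')),
]

# A valid input passes through and comes back as timedelta64[ns].
accepted = strict_values(np.array([1, 2], dtype='i8'))
```

Because the checks are plain `assert` statements, they are skipped when Python runs with `-O`, so tests like these are only meaningful in a normal (non-optimized) run.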

@pandas-dev deleted a comment from jbrockmendel Oct 31, 2018
@codecov

codecov bot commented Oct 31, 2018

Codecov Report

Merging #23433 into master will decrease coverage by 0.01%.
The diff coverage is 93.33%.

@@            Coverage Diff             @@
##           master   #23433      +/-   ##
==========================================
- Coverage   92.22%   92.21%   -0.02%     
==========================================
  Files         161      161              
  Lines       51187    51188       +1     
==========================================
- Hits        47209    47202       -7     
- Misses       3978     3986       +8
| Flag | Coverage Δ |
| --- | --- |
| #multiple | 90.64% <93.33%> (+0.03%) ⬆️ |
| #single | 42.24% <73.33%> (-0.03%) ⬇️ |

| Impacted Files | Coverage Δ |
| --- | --- |
| pandas/core/indexes/base.py | 96.51% <100%> (+0.05%) ⬆️ |
| pandas/core/arrays/timedeltas.py | 94.27% <90%> (-0.03%) ⬇️ |
| pandas/core/indexes/timedeltas.py | 90.74% <94.11%> (+0.12%) ⬆️ |
| pandas/io/feather_format.py | 77.14% <0%> (-12.61%) ⬇️ |
| pandas/core/arrays/datetimes.py | 97.9% <0%> (-0.95%) ⬇️ |
| pandas/core/indexes/datetimes.py | 95.82% <0%> (-0.6%) ⬇️ |
| pandas/core/indexes/multi.py | 95.46% <0%> (-0.01%) ⬇️ |
| pandas/core/resample.py | 96.98% <0%> (-0.01%) ⬇️ |
| pandas/core/reshape/pivot.py | 96.55% <0%> (ø) ⬆️ |
| pandas/core/computation/align.py | 97.89% <0%> (ø) ⬆️ |

... and 22 more

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 4f71755...e7dd05e.

@@ -166,6 +167,19 @@ def __new__(cls, data=None, unit=None, freq=None, start=None, end=None,
elif copy:
    data = np.array(data, copy=True)

data = np.array(data, copy=False)
if data.dtype == np.object_:
    data = array_to_timedelta64(data)
Contributor

Why are these checks NOT done in _simple_new? This is inconsistent with other code.

We should be really, really clear on what is acceptable in _simple_new vs. what is not. IIRC, in another of your PRs you did checks on object type in _simple_new, for example.

Member Author

AFAICT the current verbose-checking is largely driven by the weird cases (that these PRs get rid of) where None or [] is passed to _shallow_copy.

This and the associated DatetimeIndex PR impose a simple/strict API for _simple_new: it expects an np.ndarray that may be either i8 or M8[ns]/m8[ns].
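
To make the i8 / m8[ns] part of that contract concrete, here is a small NumPy-only sketch (the variable names are illustrative and not taken from the PR). It shows that viewing an int64 array as `m8[ns]` reinterprets the same buffer as nanosecond timedeltas without copying, and that the minimum int64 value is read as NaT.

```python
import numpy as np

# int64 nanosecond counts; the minimum int64 value is NumPy's NaT sentinel.
nanos = np.array([0, 1_500_000_000, np.iinfo(np.int64).min], dtype='i8')

# Reinterpret the same bytes as timedelta64[ns]; no data is copied.
deltas = nanos.view('m8[ns]')

same_buffer = np.shares_memory(nanos, deltas)
nat_mask = np.isnat(deltas)

# Viewing back as int64 recovers the original integers exactly.
round_trip = deltas.view('i8')
```

Since both dtypes are 8 bytes wide, the view is cheap in either direction, which is presumably why a fast-path constructor can afford to accept both.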

@@ -131,6 +131,10 @@ def __new__(cls, values, freq=None):

freq, freq_infer = dtl.maybe_infer_freq(freq)

values = np.array(values, copy=False)
Contributor

So why do you need to accept object type here? (You are also checking for this in TDI.__new__.)

Member Author

ATM we are checking for it in TimedeltaArray._simple_new, so de-facto accepting it in TimedeltaArray.__new__. This is leaving the effective __new__ policy unchanged while clearing up the _simple_new policy.
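
The split described here, a lenient public constructor in front of a strict fast path, can be sketched with a toy class. Everything below is hypothetical and is not the pandas implementation: `TinyTimedeltaArray` and `from_data` are made-up names, and `astype('m8[ns]')` stands in for pandas' `array_to_timedelta64`.

```python
import datetime as dt

import numpy as np


class TinyTimedeltaArray:
    """Hypothetical toy class illustrating the two-tier constructor pattern."""

    @classmethod
    def _simple_new(cls, values):
        # Strict fast path: only an ndarray of i8 or m8[ns] is allowed.
        assert isinstance(values, np.ndarray), type(values)
        if values.dtype == 'i8':
            values = values.view('m8[ns]')
        assert values.dtype == 'm8[ns]', values.dtype
        obj = object.__new__(cls)
        obj._data = values
        return obj

    @classmethod
    def from_data(cls, data):
        # Lenient entry point: coerce first, then hand off to the strict path.
        data = np.asarray(data)
        if data.dtype == np.object_:
            # Stand-in for array_to_timedelta64: NumPy converts
            # datetime.timedelta objects element by element.
            data = data.astype('m8[ns]')
        return cls._simple_new(data)


objs = np.array([dt.timedelta(days=1), dt.timedelta(hours=2)], dtype=object)
arr = TinyTimedeltaArray.from_data(objs)
```

With this layout the object-dtype handling lives in one place, and `_simple_new` can stay a thin wrapper whose assertions only catch internal misuse.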

Contributor

ok, assume this is on the list to de-duplicate

@jbrockmendel
Member Author

Travis failure is for (new?) linting in scripts/docs

@jbrockmendel
Member Author

Gentle ping, I’m hoping to make a big push on datetime/timedelta array this weekend, getting the constructors locked down is the first step.

@jreback added this to the 0.24.0 milestone Nov 3, 2018
@jreback merged commit ee7d856 into pandas-dev:master Nov 3, 2018
@jreback
Contributor

jreback commented Nov 3, 2018

ok merging.

@jbrockmendel deleted the simple4 branch November 3, 2018 14:39
thoo added a commit to thoo/pandas that referenced this pull request Nov 3, 2018
…xamples

* repo_org/master: (66 commits)
  CLN: doc string (pandas-dev#23469)
  DOC: Add cookbook entry for triangular correlation matrix (GH22840) (pandas-dev#23032)
  add number of Errors, Warnings to scripts/validate_docstrings.py (pandas-dev#23150)
  BUG: Allow freq conversion from dt64 to period (pandas-dev#23460)
  ENH: Add FrozenList.union and .difference (pandas-dev#23394)
  REF: cython cleanup, typing, optimizations (pandas-dev#23464)
  strictness and checks for Timedelta _simple_new (pandas-dev#23433)
  Fixing flake8 problems new to flake8 3.6.0 (pandas-dev#23472)
  DOC: Updating the docstring of Series.dot  (pandas-dev#22890)
  TST: Fixturize series/test_analytics.py (pandas-dev#22755)
  BUG/ENH: Handle NonexistentTimeError in date rounding (pandas-dev#23406)
  PERF: speed up concat on Series by making _get_axis_number() a classmethod (pandas-dev#23404)
  REF: Remove DatetimelikeArrayMixin._shallow_copy (pandas-dev#23430)
  REF: strictness/simplification in DatetimeArray/Index _simple_new (pandas-dev#23431)
  REF: cython cleanup, typing, optimizations (pandas-dev#23456)
  TST: tweak Hypothesis configuration and idioms (pandas-dev#23441)
  BUG: fix HDFStore.append with all empty strings error (GH12242) (pandas-dev#23435)
  TST: Skip 32bit failing IntervalTree tests (pandas-dev#23442)
  BUG: Deprecate nthreads argument (pandas-dev#23112)
  style: fix import format at pandas/core/reshape (pandas-dev#23387)
  ...
JustinZhengBC pushed a commit to JustinZhengBC/pandas that referenced this pull request Nov 14, 2018
tm9k1 pushed a commit to tm9k1/pandas that referenced this pull request Nov 19, 2018
Pingviinituutti pushed a commit to Pingviinituutti/pandas that referenced this pull request Feb 28, 2019
Labels
Dtype Conversions, Internals, Timedelta
Development

Successfully merging this pull request may close these issues.

Numpy dtype weirdness in TimedeltaIndex._simple_new
4 participants