Skip to content

Datetimelike __setitem__ #24477

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 3 commits into from
Dec 29, 2018
Merged

Conversation

TomAugspurger
Copy link
Contributor

Split from #24024.

This is mostly just a move from PeriodArray to DatetimelikeArray, with period-specific things replaced with _check_compatible_with, _unbox_scalar, etc.

This is under-tested at the moment, just the basics are actually hit. But I think this is OK for two reasons

  1. I'll immediately rebase REF: DatetimeLikeArray #24024 with these changes, so we'll get the setitem tests shortly after merging (and can revert / whatever as needed)
  2. We do run the full suite of tests for PeriodArray, and any type-specific stuff should be hit currently with the tests here (_check_compatible_with, freq invalidation, etc.).

@TomAugspurger TomAugspurger added Datetime Datetime data dtype ExtensionArray Extending pandas with custom dtypes or arrays. labels Dec 29, 2018
@TomAugspurger TomAugspurger added this to the 0.24.0 milestone Dec 29, 2018
if is_list_like(value):
is_slice = isinstance(key, slice)
if (not is_slice
and len(key) != len(value)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

key could be a scalar here right? (in which case u will get an odd exception about len of unsized object)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hrm... yeah

In [3]: pd.period_range('2000', periods=2)._data
Out[3]:
<PeriodArray>
['2000-01-01', '2000-01-02']
Length: 2, dtype: period[D]

In [4]: arr = pd.period_range('2000', periods=2)._data

In [5]: arr[0] = arr[[0, 1]]
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-5-9a9ffe90069d> in <module>
----> 1 arr[0] = arr[[0, 1]]

~/sandbox/pandas/pandas/core/arrays/datetimelike.py in __setitem__(self, key, value)
    494             is_slice = isinstance(key, slice)
    495             if (not is_slice
--> 496                     and len(key) != len(value)
    497                     and not com.is_bool_indexer(key)):
    498                 msg = ("shape mismatch: value array of length '{}' does not "

TypeError: object of type 'int' has no len()

This is a gap in the base extension tests as well. I'll ad one. that should be a... what? ValueError for trying to set list-like into a single element?

Copy link
Contributor Author

@TomAugspurger TomAugspurger Dec 29, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

NumPy raises with ValueError: setting an array element with a sequence. which seems like a fine error message to me.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yep seems good
maybe just let it fall thru this if will work?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seemed a tad easier to explicitly check for a scalar key here.

Actually, grr, this is kinda annoying but NumPy allows setting a sequence of length 1.

In [2]: x = np.array([1, 2])

In [3]: x[0] = x[[0]]

but now I would raise on that.

In [10]: arr = pd.period_range('2000', periods=2)._data

In [11]: arr[0] = arr[[0]]
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-11-666653b9c59a> in <module>
----> 1 arr[0] = arr[[0]]

~/sandbox/pandas/pandas/core/arrays/datetimelike.py in __setitem__(self, key, value)
    495
    496             if lib.is_scalar(key):
--> 497                 raise ValueError("setting an array element with a sequence.")
    498
    499             if (not is_slice

ValueError: setting an array element with a sequence.

Thoughts? I think this should raise, since a length-1 sequence is more like a sequence than a scalar.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah i think we raise on this

this is why setitem in Block is so complicated :)

* Failing test for setting a sequence into a scalar
* Fixed implementation
@codecov
Copy link

codecov bot commented Dec 29, 2018

Codecov Report

Merging #24477 into master will decrease coverage by 49.26%.
The diff coverage is 15.38%.

Impacted file tree graph

@@             Coverage Diff             @@
##           master   #24477       +/-   ##
===========================================
- Coverage   92.32%   43.06%   -49.27%     
===========================================
  Files         166      166               
  Lines       52298    52303        +5     
===========================================
- Hits        48285    22524    -25761     
- Misses       4013    29779    +25766
Flag Coverage Δ
#multiple ?
#single 43.06% <15.38%> (ø) ⬆️
Impacted Files Coverage Δ
pandas/core/arrays/period.py 40.96% <ø> (-57.52%) ⬇️
pandas/core/arrays/timedeltas.py 40.29% <50%> (-47.34%) ⬇️
pandas/core/arrays/datetimes.py 65.1% <50%> (-33.16%) ⬇️
pandas/core/arrays/datetimelike.py 46.21% <9.09%> (-49.45%) ⬇️
pandas/io/formats/latex.py 0% <0%> (-100%) ⬇️
pandas/core/categorical.py 0% <0%> (-100%) ⬇️
pandas/io/sas/sas_constants.py 0% <0%> (-100%) ⬇️
pandas/tseries/plotting.py 0% <0%> (-100%) ⬇️
pandas/tseries/converter.py 0% <0%> (-100%) ⬇️
pandas/io/formats/html.py 0% <0%> (-98.65%) ⬇️
... and 128 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 80295f9...2dc85b7. Read the comment docs.

@codecov
Copy link

codecov bot commented Dec 29, 2018

Codecov Report

Merging #24477 into master will increase coverage by <.01%.
The diff coverage is 100%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master   #24477      +/-   ##
==========================================
+ Coverage   92.32%   92.32%   +<.01%     
==========================================
  Files         166      166              
  Lines       52298    52306       +8     
==========================================
+ Hits        48285    48294       +9     
+ Misses       4013     4012       -1
Flag Coverage Δ
#multiple 90.75% <100%> (ø) ⬆️
#single 43.06% <17.24%> (ø) ⬆️
Impacted Files Coverage Δ
pandas/core/arrays/period.py 98.39% <ø> (-0.08%) ⬇️
pandas/core/arrays/datetimelike.py 95.84% <100%> (+0.19%) ⬆️
pandas/core/indexes/datetimelike.py 97.53% <100%> (ø) ⬆️
pandas/core/arrays/timedeltas.py 87.68% <100%> (+0.05%) ⬆️
pandas/core/arrays/datetimes.py 98.43% <100%> (+0.17%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 80295f9...e1b7d1d. Read the comment docs.

@TomAugspurger
Copy link
Contributor Author

TomAugspurger commented Dec 29, 2018 via email

@TomAugspurger
Copy link
Contributor Author

All green.

@@ -108,6 +108,8 @@ def astype(self, dtype, copy=True):

def __setitem__(self, key, value):
if pd.api.types.is_list_like(value):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should prob import these like we normally do (is_like_like) at the top, but no big deal

)
raise TypeError(msg.format(scalar=self._scalar_type.__name__,
typ=type(value).__name__))
self._data[key] = value
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

would be nice to move smoe of this logic a big higher up to EA base if possible and/or make some helper methods to avoid code duplication of EA's implementating setitem, but for another time.

@jreback jreback merged commit 2c8c5df into pandas-dev:master Dec 29, 2018
@jreback
Copy link
Contributor

jreback commented Dec 29, 2018

thanks!

@TomAugspurger TomAugspurger deleted the split-setitem branch January 2, 2019 20:18
Pingviinituutti pushed a commit to Pingviinituutti/pandas that referenced this pull request Feb 28, 2019
Pingviinituutti pushed a commit to Pingviinituutti/pandas that referenced this pull request Feb 28, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Datetime Datetime data dtype ExtensionArray Extending pandas with custom dtypes or arrays.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants