ERR: _shallow_copy should assert dtype #13294

Closed
jreback opened this issue May 26, 2016 · 7 comments
Labels
Compat (pandas objects compatibility with NumPy or Python functions); Constructors (Series/DataFrame/Index/pd.array constructors); Dtype Conversions (unexpected or buggy dtype conversions); Index (related to the Index class or subclasses)

Comments

@jreback
Contributor

jreback commented May 26, 2016

xref #13288

In [2]: i = Index([0,'A'])

In [3]: i._shallow_copy([0])
Out[3]: Index([0], dtype='int64')

We should assert that the constructed object has a valid dtype. This check belongs in _simple_new, so that these guarantee violations cannot happen.
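For comparison, the public constructor shows which dtype each input would normally produce. This is a small sketch using only public pandas API (it does not call the private _shallow_copy), and the variable names are illustrative:

```python
import pandas as pd

mixed = pd.Index([0, "A"])  # mixed integers and strings
ints = pd.Index([0])        # integers only

print(mixed.dtype)
print(ints.dtype)

# The two dtypes differ, so a copy of `mixed` that ends up with the dtype
# of `ints` no longer matches the index it was copied from.
print(mixed.dtype == ints.dtype)
```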

jreback added the Indexing, Dtype Conversions, Compat, and Difficulty Intermediate labels on May 26, 2016
jreback added this to the 0.18.2 milestone on May 26, 2016
@pijucha
Contributor

pijucha commented Jul 3, 2016

An example of the same issue: np.array inside _simple_new converts nan to a string:

i = pd.Index([0, 'A'])

i._shallow_copy(['A', np.nan])
Out[7]: Index(['A', 'nan'], dtype='<U3')

The output should instead match the public constructor:

pd.Index(['A', np.nan])
Out[8]: Index(['A', nan], dtype='object')
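The string conversion described above can be reproduced with NumPy alone. The following is a minimal sketch, assuming the conversion comes from calling np.array without an explicit dtype; the variable names are illustrative and this is not the pandas internals:

```python
import numpy as np

values = ["A", np.nan]

# Without an explicit dtype, NumPy picks a unicode string dtype for a
# list that mixes str and float, so nan is stored as the text 'nan'.
coerced = np.array(values)

# Requesting object dtype keeps the original Python objects, so the
# missing value stays a float nan.
preserved = np.array(values, dtype=object)

print(coerced.dtype.kind, repr(coerced[1]))
print(preserved.dtype, repr(preserved[1]))
```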

And another issue in _simple_new: _shallow_copy returning an array

dti = pd.DatetimeIndex([0])

dti._shallow_copy([pd.Timestamp(0)])
Out[12]: array(['1970-01-01T00:00:00.000000000'], dtype='datetime64[ns]')

The expected output is as in the following:

dti._shallow_copy([0])
Out[13]: DatetimeIndex(['1970-01-01'], dtype='datetime64[ns]', freq=None)

INSTALLED VERSIONS
------------------
commit: 91617085c0696db3b457b524fe7273ecc928fedc
python: 3.5.1.final.0
python-bits: 64
OS: Linux
OS-release: 4.1.20-1
machine: x86_64
processor: Intel(R)_Core(TM)_i5-2520M_CPU_@_2.50GHz
byteorder: little

pandas: 0.18.1
numpy: 1.11.0

@jreback
Contributor Author

jreback commented Jul 3, 2016

these violate guarantees of _shallow_copy

the data must already be an index

@pijucha
Contributor

pijucha commented Jul 3, 2016

This is probably something I misunderstood (misunderstand) about _shallow_copy and this issue.

Does it mean that all those examples (including [0] in the first one) are a misuse of _shallow_copy that failed to be detected/asserted, rather than a real bug?

And the only valid values in the expression

idx._shallow_copy(values)

are an index or an np.array with a dtype compatible with idx.dtype (and not just a plain list)?

@jreback
Contributor Author

jreback commented Jul 3, 2016

yep - this issue is about

a) making sure that we are an Index (or values is an ndarray)
b) validating that the dtype is compat for the class

or raise
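The two checks above could look roughly like the sketch below. The helper check_shallow_copy_values is hypothetical and not part of pandas, and "compatible dtype" is simplified here to exact dtype equality, which is an assumption rather than the rule pandas applies:

```python
import numpy as np
import pandas as pd


def check_shallow_copy_values(index, values):
    """Hypothetical guard, not a pandas function."""
    # (a) values must already be an Index or an ndarray, not a plain list
    if not isinstance(values, (pd.Index, np.ndarray)):
        raise TypeError(
            f"values must be an Index or ndarray, got {type(values).__name__}"
        )
    # (b) simplified compatibility rule: the dtypes must be equal
    if values.dtype != index.dtype:
        raise TypeError(
            f"dtype {values.dtype} is not compatible with {index.dtype}"
        )
    return values


idx = pd.Index([0, "A"])

# An object-dtype ndarray passes both checks for an object-dtype index.
checked = check_shallow_copy_values(idx, np.array([0, "A"], dtype=object))

# A plain list fails check (a); an integer ndarray fails check (b).
for bad in ([0], np.array([0])):
    try:
        check_shallow_copy_values(idx, bad)
    except TypeError as exc:
        print(exc)
```

Raising instead of silently converting keeps the responsibility with the caller, which fits a private method.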

@pijucha
Contributor

pijucha commented Jul 3, 2016

Thanks!

jreback modified the milestones: Next Major Release, 0.18.2 on Jul 6, 2016
@jbrockmendel
Member

A lot of thought has gone into related topics for PeriodArray etc. I think the conclusion is pretty much that the caller is responsible for checking the inputs, since _shallow_copy is private. Assertions in _simple_new might be cleaner.

jbrockmendel added the Constructors and Index labels and removed the Indexing label on Feb 11, 2020
@mroeschke
Member

I think this asserts correctly as Brock mentions. Closing
