ERR: Series must have a singular dtype otherwise should raise #13296
Comments
You are wanting to construct a
|
@jreback Whether or not my purposes would be better served with a dataframe in this case, I think this is still a valid bug, considering that you can construct and use

As for why you might want to do something like this - occasionally there are uses where the semantics are much easier when you can treat a single value as a scalar rather than as multiple columns. One toy example would be operations on coordinate systems:

import pandas as pd
import numpy as np

# Compound dtype: each element is a single (x, y, z) vector
three_vec = np.dtype([('x', 'f8'), ('y', 'f8'), ('z', 'f8')])

def rotate_coordinates(x, u, theta):
    # Rotation by theta about the unit vector u (Rodrigues' rotation formula)
    I = np.identity(3)
    ux = np.array([
        [      0, -u['z'],  u['y']],
        [ u['z'],       0, -u['x']],
        [-u['y'],  u['x'],       0]
    ])
    uu = np.array([
        [   u['x'] ** 2, u['x'] * u['y'], u['x'] * u['z']],
        [u['x'] * u['y'],    u['y'] ** 2, u['y'] * u['z']],
        [u['x'] * u['z'], u['y'] * u['z'],    u['z'] ** 2]
    ])
    R = np.cos(theta) * I + np.sin(theta) * ux + (1 - np.cos(theta)) * uu
    # View the structured array as a plain (3, N) float array
    xx = x.view(np.float64).reshape(x.shape + (-1,)).T
    out_array = (R @ xx).round(15)
    # np.rec is the public alias of np.core.records
    return np.rec.fromarrays(out_array, dtype=three_vec)

# Rotate these arrays about z
z = np.array([(0, 0, 1)], dtype=three_vec)[0]
v1 = np.array([(0, 1, 0), (1, 0, 0)], dtype=three_vec)
vp = rotate_coordinates(v1, z, np.pi / 2)
print(v1)
print(vp)

Now imagine that I wanted a |
@pganssle you are violating the guarantees of a Series. It is by definition a singular dtype. The bug is that it accepts a non-singular one in the first place. I'll reopen for that purpose. There is NO support for a Series with the use case you describe. Either use a DataFrame or xarray. |
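For readers following along, the DataFrame route suggested above might look roughly like this. It is only a sketch: the `three_vec` dtype and `v1` array are borrowed from the toy example earlier in the thread, and `norms` is an illustrative name, not something from the discussion.

```python
import numpy as np
import pandas as pd

# Same compound dtype as in the toy example earlier in the thread
three_vec = np.dtype([("x", "f8"), ("y", "f8"), ("z", "f8")])
v1 = np.array([(0, 1, 0), (1, 0, 0)], dtype=three_vec)

# Passing a structured array to the DataFrame constructor
# produces one column per field, each with a single dtype.
df = pd.DataFrame(v1)

# Per-row vector length, computed column-wise (illustrative only)
norms = np.sqrt((df ** 2).sum(axis=1))

print(df.dtypes)
print(norms)
```

Each field becomes its own float64 column, which keeps the one-dtype-per-column storage model at the cost of no longer treating a vector as a single value.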
@jreback My suggestion is that compound types are a single type in the same way that a |
@pganssle a compound dtype is simply not supported, nor do I think it should be. Sure, an extension type that is innately a compound type is fine because it is singular. But a structured dtype is NOT. It has sub-dtypes. This is just making an already complicated structure WAY more complex. |
As I said, for now this should simply raise |
@jreback Does pandas support custom dtypes? I'm not sure that I've ever seen someone create one, other than |
But these required a lot of support to integrate properly. These are fundamental types. I suppose a Coordinate could also be in that category. But as I said, it's a MAJOR effort to handle things properly. |
Principally the issue is efficient storage. What you are suggesting is NOT stored efficiently and that's the problem. |
I have NEVER seen a good use of |
It's just a toy example of why the semantics would be useful. You could achieve the same thing with

I think it's fine to consider my suggestion a "low reward / high effort" enhancement - it may be fundamentally difficult to deal with this sort of thing and not something that comes up a lot. I just think it's worth considering as a "nice to have", since, if possible, it would be better to have first-class support for complex datatypes than not.

When I have a bit of time I will be happy to look into the underlying details and see if I can get a better understanding of the difficulty and/or propose an alternate approach. Probably it will be a while, though, since I have quite a backlog of other stuff to get to.

In the meantime, I would think this could be profitably handled by just converting compound datatypes to tuple on import, possibly with a warning about the inefficiency of this approach. At least this would allow people who are less performance sensitive to write some wrapper functions to allow the use of normal semantics. |
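A user-side version of the tuple-conversion idea could be sketched as below. This is not pandas behavior; `structured_to_tuple_series` is a hypothetical helper name, and the warning text is made up for illustration.

```python
import warnings

import numpy as np
import pandas as pd


def structured_to_tuple_series(arr, name=None):
    """Hypothetical helper: store each structured element as a Python tuple."""
    if arr.dtype.names is None:
        raise TypeError("expected a NumPy structured array")
    warnings.warn(
        "storing structured elements as tuples in an object Series is inefficient",
        UserWarning,
        stacklevel=2,
    )
    # tolist() on a 1-D structured array returns a list of tuples
    return pd.Series(arr.tolist(), name=name, dtype=object)


three_vec = np.dtype([("x", "f8"), ("y", "f8"), ("z", "f8")])
v1 = np.array([(0, 1, 0), (1, 0, 0)], dtype=three_vec)

s = structured_to_tuple_series(v1, name="v")
print(s)

# Wrapper functions can convert back when structured semantics are needed
back = np.array(s.tolist(), dtype=three_vec)
```

The resulting Series has object dtype, so it loses the compact storage of the structured array, which is the inefficiency the warning is meant to flag.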
@pganssle if you have time for this, great. But I don't have time for every enhancement (actually most of them). So if you'd like to propose something, great. However, the very simplest thing is to raise an error. If you or someone else wants to implement a better solution, great. |
When constructing a Series object using a numpy structured data array, if you try to cast it to a str (or print it), it throws:
You can print a single value from the series, but not the whole series.
Code Sample, a copy-pastable example if possible
Output (actual):
output of pd.show_versions():
Stack Trace
I swallowed the stack traces to show where this was failing, so here's the traceback for that last error: