Unpin pandas
#342
Comments
The pin was added in: To fix the issue described in: ...but that just avoids the problem whilst causing another problem; this library can't be used with the latest pandas. |
I'm opening this issue to track any progress towards compatibility with the latest pandas. |
Bump! I would like to upgrade to the latest version but am stuck on 3.0.1 because of this pin 😔 |
Does 3.0.1 work with latest pandas? That would be an interesting data point. |
I've been using 3.0.1 in combination with
...but that's apparently because I don't query all
with engine.connect() as conn:
    res = conn.execute(sa.text("select 1")).scalar_one()
gives:
|
It seems like it doesn't like assigning a None into an integer array:
> /opt/python/envs/dev310/lib/python3.10/site-packages/pandas/core/internals/managers.py(1703)as_array()
   1701         pass
   1702     else:
-> 1703         arr[isna(arr)] = na_value
   1704
   1705     return arr.transpose()
ipdb> arr
array([[1]], dtype=int32)
ipdb> isna(arr)
array([[False]])
ipdb> na_value
ipdb> na_value is None
True
If we go up the stack we can see that we get errors if we try to assign anything other than an integer:
> /opt/python/envs/dev310/lib/python3.10/site-packages/databricks/sql/client.py(1149)_convert_arrow_table()
   1147         )
   1148
-> 1149         res = df.to_numpy(na_value=None)
   1150         return [ResultRow(*v) for v in res]
   1151
ipdb> df.to_numpy(na_value=None)
*** TypeError: int() argument must be a string, a bytes-like object or a real number, not 'NoneType'
ipdb> df.to_numpy(na_value=float('NaN'))
*** ValueError: cannot convert float NaN to integer
ipdb> df.to_numpy(na_value=-99)
array([[1]], dtype=int32)
Casting to object first works:
ipdb> df.astype(object).to_numpy(na_value=None)
array([[1]], dtype=object) |
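The behaviour in the debugger session can be reproduced without a Databricks connection. The following is a minimal sketch, assuming pandas and NumPy are installed; `frame_to_rows` is a hypothetical helper written for illustration and is not the library's `_convert_arrow_table`. It shows the masked assignment that trips on an integer array, and the object cast that lets `na_value=None` go through.

```python
import numpy as np
import pandas as pd


def frame_to_rows(df: pd.DataFrame) -> list:
    """Hypothetical helper: turn a DataFrame into row tuples, with missing values as None."""
    # Casting to object first gives an array that can hold None,
    # whatever the original column dtypes were.
    arr = df.astype(object).to_numpy(na_value=None)
    return [tuple(row) for row in arr]


# The assignment from managers.py, isolated: per the traceback in this thread,
# NumPy converts the value to the array's dtype even though the mask is all False.
ints = np.array([[1]], dtype=np.int32)
try:
    ints[pd.isna(ints)] = None
    numpy_error = None
except (TypeError, ValueError) as exc:
    numpy_error = exc

print("masked assignment error:", numpy_error)
print(frame_to_rows(pd.DataFrame({"x": np.array([1], dtype=np.int32)})))
print(frame_to_rows(pd.DataFrame({"x": [1.0, float("nan")]})))
```

The object cast costs an extra copy and loses the compact integer dtype, which may matter for large result sets; it is shown here only because it is the variant that worked in the session above.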
The problematic function: databricks-sql-python/src/databricks/sql/client.py Lines 1130 to 1166 in a6e9b11
|
I can work around the issue by disabling pandas:
with engine.connect() as conn:
    cursor = conn.connection.cursor()
    cursor.connection.disable_pandas = True
    res = cursor.execute("select 1").fetchall()
>>> res
[Row(1=1)]
...but obviously the casting to numpy needs to be fixed. |
Probably casting to object before assigning a |
I second this. I cannot use Also, it would be good if you delete the distutils dependency |
@dhirschfeld any idea when this is going to make it to a release? Looks like it didn't go into 3.2.0 as I am unable to |
I'm not a maintainer here so I couldn't say. I was hoping to do some more testing at some point, but haven't found the time. |
I would like to be able to use this library with the latest pandas version. Currently pandas is pinned to <2.2.0:
databricks-sql-python/pyproject.toml
Lines 14 to 16 in 0552990
It would be good to remove this restriction.