Cursor methods catalogs(), schemas(), tables() & columns() return ColumnTable when using fetchall_arrow() #550

Open
alexmalins opened this issue Apr 23, 2025 · 3 comments

alexmalins commented Apr 23, 2025

Since the v3.5.0 release and PR #440, the Cursor methods catalogs(), schemas(), tables() & columns() return the new ColumnTable objects when the results are fetched with fetchall_arrow(), even though pyarrow is installed:

Since 3.5.0:

from databricks import sql
import os
conn = sql.connect(
    server_hostname=os.getenv("DATABRICKS_HOST"),
    http_path=os.getenv("DATABRICKS_HTTP_PATH"),
    access_token=os.getenv("DATABRICKS_TOKEN"),
)
cursor = conn.cursor()
cursor.catalogs()
type(cursor.fetchall_arrow())  # databricks.sql.utils.ColumnTable

Prior to 3.5.0, databricks-sql-python would respect the user's wish and return a pyarrow table, i.e. the output of the code above was pyarrow.lib.Table.

As far as I can understand, the goal of #440 was to return ColumnTable objects only if pyarrow is not installed. But since 3.5.0 the behaviour of fetchall_arrow() when pyarrow is installed is inconsistent. For queries run with cursor.execute() followed by cursor.fetchall_arrow(), it returns pyarrow tables, but for the catalogs(), schemas(), tables() & columns() methods it now returns ColumnTables when it should really return pyarrow tables.
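
Until this is resolved, a possible stopgap is to normalise the result on the caller's side. The sketch below is only an illustration: it assumes ColumnTable exposes its data as `column_names` (a list of names) and `column_table` (a list of columns), which should be checked against the installed version. `FakeColumnTable`, `columns_to_mapping` and `ensure_arrow` are hypothetical names made up for this example, and the sample values are invented.

```python
from typing import Any, Dict, List


class FakeColumnTable:
    """Stand-in for databricks.sql.utils.ColumnTable, for illustration only."""

    def __init__(self, column_table: List[List[Any]], column_names: List[str]):
        self.column_table = column_table  # assumed attribute: list of columns
        self.column_names = column_names  # assumed attribute: list of names


def columns_to_mapping(table: Any) -> Dict[str, List[Any]]:
    """Pair each column name with its values, relying on the assumed attributes."""
    return {
        name: list(values)
        for name, values in zip(table.column_names, table.column_table)
    }


def ensure_arrow(result: Any) -> Any:
    """Return a pyarrow.Table whichever type fetchall_arrow() handed back."""
    import pyarrow  # imported lazily so the helper above works without pyarrow

    if isinstance(result, pyarrow.Table):
        return result
    # pyarrow.table() accepts a mapping of column name -> list of values
    return pyarrow.table(columns_to_mapping(result))


# Invented sample data standing in for a catalogs() result
demo = FakeColumnTable([["main", "samples"]], ["TABLE_CAT"])
mapping = columns_to_mapping(demo)
```

With a real cursor this would be used as `ensure_arrow(cursor.fetchall_arrow())` after `cursor.catalogs()`. It works around the symptom only and does not restore the return type the method advertises.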

@shivam2680
Contributor

Hi @alexmalins
Thrift metadata responses return COLUMN_BASED_SET. Before #440, we converted everything to arrow tables in the execution result. Now, we convert to the correct table type based on the result set format.

@alexmalins
Author

hey @shivam2680 🙌

"Now, we convert to the correct table type based on the result set format."

Ok, but doesn't this break the return type hint of fetchall_arrow(), which implies a pyarrow table will be returned? Now it is effectively returning pyarrow.Table | ColumnTable.

I don't think this is a good design. If users want a ColumnTable returned, it would make more sense for them to use the new fetch_columnar() method (albeit this new method is undocumented and has no return type hint in the code).

@shivam2680
Contributor

Yeah, I get your point and already have a draft PR out to address this. We are finalizing the design at the moment.
