Commit ee7f1e3
Bumped the version to 4.0.0.b3 and restructured the package to make pyarrow an optional dependency.
Parent: 3d1ef79


78 files changed: +2403 -1111 lines
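Making pyarrow optional means imports of it can no longer be unconditional. A minimal sketch of the guarded-import pattern this kind of restructuring implies (`rows_from_columns` is a hypothetical fallback helper for illustration, not the connector's actual code):

```python
try:
    import pyarrow  # optional dependency after this commit; may be absent
    HAS_PYARROW = True
except ImportError:
    HAS_PYARROW = False


def rows_from_columns(columns, names):
    """Pure-Python fallback: turn column-oriented results into row dicts."""
    return [dict(zip(names, row)) for row in zip(*columns)]


# The fallback path works whether or not pyarrow is installed:
print(rows_from_columns([[1, 2], [10, 20]], ["x", "x_squared"]))
# [{'x': 1, 'x_squared': 10}, {'x': 2, 'x_squared': 20}]
```

Callers that need Arrow tables can branch on `HAS_PYARROW`, which is the usual shape for extras-gated dependencies.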

CHANGELOG.md (+15)

@@ -1,5 +1,20 @@
 # Release History
 
+# 3.6.0 (2024-10-25)
+
+- Support encryption headers in the cloud fetch request (https://github.com/databricks/databricks-sql-python/pull/460 by @jackyhu-db)
+
+# 3.5.0 (2024-10-18)
+
+- Create a non pyarrow flow to handle small results for the column set (databricks/databricks-sql-python#440 by @jprakash-db)
+- Fix: On non-retryable error, ensure PySQL includes useful information in error (databricks/databricks-sql-python#447 by @shivam2680)
+
+# 3.4.0 (2024-08-27)
+
+- Unpin pandas to support v2.2.2 (databricks/databricks-sql-python#416 by @kfollesdal)
+- Make OAuth as the default authenticator if no authentication setting is provided (databricks/databricks-sql-python#419 by @jackyhu-db)
+- Fix (regression): use SSL options with HTTPS connection pool (databricks/databricks-sql-python#425 by @kravets-levko)
+
 # 3.3.0 (2024-07-18)
 
 - Don't retry requests that fail with HTTP code 401 (databricks/databricks-sql-python#408 by @Hodnebo)

CONTRIBUTING.md (+4 -4)

@@ -85,18 +85,18 @@ We use [Pytest](https://docs.pytest.org/en/7.1.x/) as our test runner. Invoke it
 Unit tests do not require a Databricks account.
 
 ```bash
-poetry run python -m pytest databricks_sql_connector_core/tests/unit
+poetry run python -m pytest tests/unit
 ```
 #### Only a specific test file
 
 ```bash
-poetry run python -m pytest databricks_sql_connector_core/tests/unit/tests.py
+poetry run python -m pytest tests/unit/tests.py
 ```
 
 #### Only a specific method
 
 ```bash
-poetry run python -m pytest databricks_sql_connector_core/tests/unit/tests.py::ClientTestSuite::test_closing_connection_closes_commands
+poetry run python -m pytest tests/unit/tests.py::ClientTestSuite::test_closing_connection_closes_commands
 ```
 
 #### e2e Tests
@@ -133,7 +133,7 @@ There are several e2e test suites available:
 To execute the core test suite:
 
 ```bash
-poetry run python -m pytest databricks_sql_connector_core/tests/e2e/driver_tests.py::PySQLCoreTestSuite
+poetry run python -m pytest tests/e2e/driver_tests.py::PySQLCoreTestSuite
 ```
 
 The `PySQLCoreTestSuite` namespace contains tests for all of the connector's basic features and behaviours. This is the default namespace where tests should be written unless they require specially configured clusters or take an especially long-time to execute by design.

README.md (+1 -1)

@@ -3,7 +3,7 @@
 [![PyPI](https://img.shields.io/pypi/v/databricks-sql-connector?style=flat-square)](https://pypi.org/project/databricks-sql-connector/)
 [![Downloads](https://pepy.tech/badge/databricks-sql-connector)](https://pepy.tech/project/databricks-sql-connector)
 
-The Databricks SQL Connector for Python allows you to develop Python applications that connect to Databricks clusters and SQL warehouses. It is a Thrift-based client with no dependencies on ODBC or JDBC. It conforms to the [Python DB API 2.0 specification](https://www.python.org/dev/peps/pep-0249/) and exposes a [SQLAlchemy](https://www.sqlalchemy.org/) dialect for use with tools like `pandas` and `alembic` which use SQLAlchemy to execute DDL. Use `pip install databricks-sql-connector[databricks-sqlalchemy]` to install with SQLAlchemy's dependencies. `pip install databricks-sql-connector[alembic]` will install alembic's dependencies.
+The Databricks SQL Connector for Python allows you to develop Python applications that connect to Databricks clusters and SQL warehouses. It is a Thrift-based client with no dependencies on ODBC or JDBC. It conforms to the [Python DB API 2.0 specification](https://www.python.org/dev/peps/pep-0249/) and exposes a [SQLAlchemy](https://www.sqlalchemy.org/) dialect for use with tools like `pandas` and `alembic` which use SQLAlchemy to execute DDL. Use `pip install databricks-sql-connector[sqlalchemy]` to install with SQLAlchemy's dependencies. `pip install databricks-sql-connector[alembic]` will install alembic's dependencies.
 
 This connector uses Arrow as the data-exchange format, and supports APIs to directly fetch Arrow tables. Arrow tables are wrapped in the `ArrowQueue` class to provide a natural API to get several rows at a time.
 
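The SQLAlchemy dialect mentioned in the README is driven by a connection URL. A hedged sketch of assembling one (the `databricks_url` helper and the default catalog/schema values are illustrative; consult the dialect's documentation for the authoritative URL format):

```python
from urllib.parse import quote


def databricks_url(host, http_path, token, catalog="main", schema="default"):
    """Assemble a SQLAlchemy-style URL for the Databricks dialect.

    The general shape (databricks://token:<token>@<host>?http_path=...) follows
    the project's docs; verify against them before relying on it.
    """
    return (
        f"databricks://token:{quote(token, safe='')}@{host}"
        f"?http_path={quote(http_path, safe='')}"
        f"&catalog={catalog}&schema={schema}"
    )


url = databricks_url(
    "example.cloud.databricks.com", "/sql/1.0/warehouses/abc123", "dapi-XXXX"
)
print(url)
```

Such a URL would typically be passed to `sqlalchemy.create_engine`; percent-encoding the token and `http_path` keeps reserved characters from corrupting the query string.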

File renamed without changes.

databricks_sql_connector/pyproject.toml (-24): this file was deleted.

databricks_sql_connector_core/tests/unit/__init__.py: whitespace-only changes.

examples/custom_cred_provider.py (+13 -9)

@@ -4,23 +4,27 @@
 from databricks.sdk.oauth import OAuthClient
 import os
 
-oauth_client = OAuthClient(host=os.getenv("DATABRICKS_SERVER_HOSTNAME"),
-                           client_id=os.getenv("DATABRICKS_CLIENT_ID"),
-                           client_secret=os.getenv("DATABRICKS_CLIENT_SECRET"),
-                           redirect_url=os.getenv("APP_REDIRECT_URL"),
-                           scopes=['all-apis', 'offline_access'])
+oauth_client = OAuthClient(
+    host=os.getenv("DATABRICKS_SERVER_HOSTNAME"),
+    client_id=os.getenv("DATABRICKS_CLIENT_ID"),
+    client_secret=os.getenv("DATABRICKS_CLIENT_SECRET"),
+    redirect_url=os.getenv("APP_REDIRECT_URL"),
+    scopes=["all-apis", "offline_access"],
+)
 
 consent = oauth_client.initiate_consent()
 
 creds = consent.launch_external_browser()
 
-with sql.connect(server_hostname = os.getenv("DATABRICKS_SERVER_HOSTNAME"),
-                 http_path = os.getenv("DATABRICKS_HTTP_PATH"),
-                 credentials_provider=creds) as connection:
+with sql.connect(
+    server_hostname=os.getenv("DATABRICKS_SERVER_HOSTNAME"),
+    http_path=os.getenv("DATABRICKS_HTTP_PATH"),
+    credentials_provider=creds,
+) as connection:
 
     for x in range(1, 5):
         cursor = connection.cursor()
-        cursor.execute('SELECT 1+1')
+        cursor.execute("SELECT 1+1")
         result = cursor.fetchall()
         for row in result:
             print(row)

examples/insert_data.py (+14 -12)

@@ -1,21 +1,23 @@
 from databricks import sql
 import os
 
-with sql.connect(server_hostname = os.getenv("DATABRICKS_SERVER_HOSTNAME"),
-                 http_path = os.getenv("DATABRICKS_HTTP_PATH"),
-                 access_token = os.getenv("DATABRICKS_TOKEN")) as connection:
+with sql.connect(
+    server_hostname=os.getenv("DATABRICKS_SERVER_HOSTNAME"),
+    http_path=os.getenv("DATABRICKS_HTTP_PATH"),
+    access_token=os.getenv("DATABRICKS_TOKEN"),
+) as connection:
 
-  with connection.cursor() as cursor:
-    cursor.execute("CREATE TABLE IF NOT EXISTS squares (x int, x_squared int)")
+    with connection.cursor() as cursor:
+        cursor.execute("CREATE TABLE IF NOT EXISTS squares (x int, x_squared int)")
 
-    squares = [(i, i * i) for i in range(100)]
-    values = ",".join([f"({x}, {y})" for (x, y) in squares])
+        squares = [(i, i * i) for i in range(100)]
+        values = ",".join([f"({x}, {y})" for (x, y) in squares])
 
-    cursor.execute(f"INSERT INTO squares VALUES {values}")
+        cursor.execute(f"INSERT INTO squares VALUES {values}")
 
-    cursor.execute("SELECT * FROM squares LIMIT 10")
+        cursor.execute("SELECT * FROM squares LIMIT 10")
 
-    result = cursor.fetchall()
+        result = cursor.fetchall()
 
-    for row in result:
-      print(row)
+        for row in result:
+            print(row)

examples/interactive_oauth.py (+5 -3)

@@ -13,12 +13,14 @@
 token across script executions.
 """
 
-with sql.connect(server_hostname = os.getenv("DATABRICKS_SERVER_HOSTNAME"),
-                 http_path = os.getenv("DATABRICKS_HTTP_PATH")) as connection:
+with sql.connect(
+    server_hostname=os.getenv("DATABRICKS_SERVER_HOSTNAME"),
+    http_path=os.getenv("DATABRICKS_HTTP_PATH"),
+) as connection:
 
     for x in range(1, 100):
         cursor = connection.cursor()
-        cursor.execute('SELECT 1+1')
+        cursor.execute("SELECT 1+1")
         result = cursor.fetchall()
         for row in result:
             print(row)

examples/m2m_oauth.py (+7 -5)

@@ -22,17 +22,19 @@ def credential_provider():
         # Service Principal UUID
         client_id=os.getenv("DATABRICKS_CLIENT_ID"),
         # Service Principal Secret
-        client_secret=os.getenv("DATABRICKS_CLIENT_SECRET"))
+        client_secret=os.getenv("DATABRICKS_CLIENT_SECRET"),
+    )
     return oauth_service_principal(config)
 
 
 with sql.connect(
-        server_hostname=server_hostname,
-        http_path=os.getenv("DATABRICKS_HTTP_PATH"),
-        credentials_provider=credential_provider) as connection:
+    server_hostname=server_hostname,
+    http_path=os.getenv("DATABRICKS_HTTP_PATH"),
+    credentials_provider=credential_provider,
+) as connection:
     for x in range(1, 100):
         cursor = connection.cursor()
-        cursor.execute('SELECT 1+1')
+        cursor.execute("SELECT 1+1")
         result = cursor.fetchall()
         for row in result:
             print(row)

examples/persistent_oauth.py (+27 -20)

@@ -17,37 +17,44 @@
 from typing import Optional
 
 from databricks import sql
-from databricks.sql.experimental.oauth_persistence import OAuthPersistence, OAuthToken, DevOnlyFilePersistence
+from databricks.sql.experimental.oauth_persistence import (
+    OAuthPersistence,
+    OAuthToken,
+    DevOnlyFilePersistence,
+)
 
 
 class SampleOAuthPersistence(OAuthPersistence):
-  def persist(self, hostname: str, oauth_token: OAuthToken):
-    """To be implemented by the end user to persist in the preferred storage medium.
+    def persist(self, hostname: str, oauth_token: OAuthToken):
+        """To be implemented by the end user to persist in the preferred storage medium.
 
-    OAuthToken has two properties:
-    1. OAuthToken.access_token
-    2. OAuthToken.refresh_token
+        OAuthToken has two properties:
+        1. OAuthToken.access_token
+        2. OAuthToken.refresh_token
 
-    Both should be persisted.
-    """
-    pass
+        Both should be persisted.
+        """
+        pass
 
-  def read(self, hostname: str) -> Optional[OAuthToken]:
-    """To be implemented by the end user to fetch token from the preferred storage
+    def read(self, hostname: str) -> Optional[OAuthToken]:
+        """To be implemented by the end user to fetch token from the preferred storage
 
-    Fetch the access_token and refresh_token for the given hostname.
-    Return OAuthToken(access_token, refresh_token)
-    """
-    pass
+        Fetch the access_token and refresh_token for the given hostname.
+        Return OAuthToken(access_token, refresh_token)
+        """
+        pass
 
-with sql.connect(server_hostname = os.getenv("DATABRICKS_SERVER_HOSTNAME"),
-                 http_path = os.getenv("DATABRICKS_HTTP_PATH"),
-                 auth_type="databricks-oauth",
-                 experimental_oauth_persistence=DevOnlyFilePersistence("./sample.json")) as connection:
+
+with sql.connect(
+    server_hostname=os.getenv("DATABRICKS_SERVER_HOSTNAME"),
+    http_path=os.getenv("DATABRICKS_HTTP_PATH"),
+    auth_type="databricks-oauth",
+    experimental_oauth_persistence=DevOnlyFilePersistence("./sample.json"),
+) as connection:
 
     for x in range(1, 100):
         cursor = connection.cursor()
-        cursor.execute('SELECT 1+1')
+        cursor.execute("SELECT 1+1")
         result = cursor.fetchall()
         for row in result:
             print(row)
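The `SampleOAuthPersistence` stub in this example leaves `persist` and `read` to the user. A self-contained, in-memory sketch of the same contract, kept independent of the real `OAuthPersistence` base class so it runs standalone (a production implementation would subclass it and use durable, access-controlled storage):

```python
from typing import Dict, Optional, Tuple


class InMemoryOAuthPersistence:
    """Toy persistence backend keyed by hostname (illustration only)."""

    def __init__(self) -> None:
        self._tokens: Dict[str, Tuple[str, str]] = {}

    def persist(self, hostname: str, access_token: str, refresh_token: str) -> None:
        # As the docstring above stresses, BOTH tokens must be stored.
        self._tokens[hostname] = (access_token, refresh_token)

    def read(self, hostname: str) -> Optional[Tuple[str, str]]:
        return self._tokens.get(hostname)


store = InMemoryOAuthPersistence()
store.persist("example.cloud.databricks.com", "access-abc", "refresh-xyz")
print(store.read("example.cloud.databricks.com"))  # ('access-abc', 'refresh-xyz')
```

An in-memory store defeats the purpose of persistence across script runs, which is exactly why the real interface exists; this only demonstrates the persist/read round trip.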

examples/query_cancel.py (+37 -32)

@@ -5,47 +5,52 @@
 The current operation of a cursor may be cancelled by calling its `.cancel()` method as shown in the example below.
 """
 
-with sql.connect(server_hostname = os.getenv("DATABRICKS_SERVER_HOSTNAME"),
-                 http_path = os.getenv("DATABRICKS_HTTP_PATH"),
-                 access_token = os.getenv("DATABRICKS_TOKEN")) as connection:
+with sql.connect(
+    server_hostname=os.getenv("DATABRICKS_SERVER_HOSTNAME"),
+    http_path=os.getenv("DATABRICKS_HTTP_PATH"),
+    access_token=os.getenv("DATABRICKS_TOKEN"),
+) as connection:
 
-  with connection.cursor() as cursor:
-    def execute_really_long_query():
-      try:
-        cursor.execute("SELECT SUM(A.id - B.id) " +
-                       "FROM range(1000000000) A CROSS JOIN range(100000000) B " +
-                       "GROUP BY (A.id - B.id)")
-      except sql.exc.RequestError:
-        print("It looks like this query was cancelled.")
+    with connection.cursor() as cursor:
 
-    exec_thread = threading.Thread(target=execute_really_long_query)
+        def execute_really_long_query():
+            try:
+                cursor.execute(
+                    "SELECT SUM(A.id - B.id) "
+                    + "FROM range(1000000000) A CROSS JOIN range(100000000) B "
+                    + "GROUP BY (A.id - B.id)"
+                )
+            except sql.exc.RequestError:
+                print("It looks like this query was cancelled.")
 
-    print("\n Beginning to execute long query")
-    exec_thread.start()
+        exec_thread = threading.Thread(target=execute_really_long_query)
 
-    # Make sure the query has started before cancelling
-    print("\n Waiting 15 seconds before canceling", end="", flush=True)
+        print("\n Beginning to execute long query")
+        exec_thread.start()
 
-    seconds_waited = 0
-    while seconds_waited < 15:
-      seconds_waited += 1
-      print(".", end="", flush=True)
-      time.sleep(1)
+        # Make sure the query has started before cancelling
+        print("\n Waiting 15 seconds before canceling", end="", flush=True)
 
-    print("\n Cancelling the cursor's operation. This can take a few seconds.")
-    cursor.cancel()
+        seconds_waited = 0
+        while seconds_waited < 15:
+            seconds_waited += 1
+            print(".", end="", flush=True)
+            time.sleep(1)
 
-    print("\n Now checking the cursor status:")
-    exec_thread.join(5)
+        print("\n Cancelling the cursor's operation. This can take a few seconds.")
+        cursor.cancel()
 
-    assert not exec_thread.is_alive()
-    print("\n The previous command was successfully canceled")
+        print("\n Now checking the cursor status:")
+        exec_thread.join(5)
 
-    print("\n Now reusing the cursor to run a separate query.")
+        assert not exec_thread.is_alive()
+        print("\n The previous command was successfully canceled")
 
-    # We can still execute a new command on the cursor
-    cursor.execute("SELECT * FROM range(3)")
+        print("\n Now reusing the cursor to run a separate query.")
 
-    print("\n Execution was successful. Results appear below:")
+        # We can still execute a new command on the cursor
+        cursor.execute("SELECT * FROM range(3)")
 
-    print(cursor.fetchall())
+        print("\n Execution was successful. Results appear below:")
+
+        print(cursor.fetchall())
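The cancel flow in this example, a blocking call on a worker thread, `cancel()` from the main thread, then `join` with a timeout, can be exercised without a cluster. A minimal stand-in where a `threading.Event` plays the role of the server-side cancellation:

```python
import threading

cancelled = threading.Event()


def long_running_task() -> None:
    # Stand-in for cursor.execute() on a long query: blocks until "cancelled".
    cancelled.wait(timeout=30)


worker = threading.Thread(target=long_running_task)
worker.start()

cancelled.set()   # analogous to calling cursor.cancel() from the main thread
worker.join(5)    # same join-with-timeout pattern as the example

assert not worker.is_alive()
print("worker stopped after cancellation")
```

The `join(5)` followed by an `is_alive()` check mirrors the example's assertion that cancellation actually unblocked the worker rather than leaving it running.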

examples/query_execute.py (+10 -8)

@@ -1,13 +1,15 @@
 from databricks import sql
 import os
 
-with sql.connect(server_hostname = os.getenv("DATABRICKS_SERVER_HOSTNAME"),
-                 http_path = os.getenv("DATABRICKS_HTTP_PATH"),
-                 access_token = os.getenv("DATABRICKS_TOKEN")) as connection:
+with sql.connect(
+    server_hostname=os.getenv("DATABRICKS_SERVER_HOSTNAME"),
+    http_path=os.getenv("DATABRICKS_HTTP_PATH"),
+    access_token=os.getenv("DATABRICKS_TOKEN"),
+) as connection:
 
-  with connection.cursor() as cursor:
-    cursor.execute("SELECT * FROM default.diamonds LIMIT 2")
-    result = cursor.fetchall()
+    with connection.cursor() as cursor:
+        cursor.execute("SELECT * FROM default.diamonds LIMIT 2")
+        result = cursor.fetchall()
 
-  for row in result:
-    print(row)
+        for row in result:
+            print(row)
