Please add async/await support #176

Closed
thetadweller opened this issue Jul 11, 2023 · 23 comments
Labels: enhancement (New feature or request)


@thetadweller

I am writing a web app that needs to run multiple concurrent queries against a Databricks SQL Warehouse. Because the existing library is synchronous, each process is blocked for the duration of a SQL query, so subsequent calls from other clients end up queued. As a result, I am forced to run multiple Python processes to handle concurrent calls, even though all of them are I/O bound and could be handled by a handful of processes if I were able to write queries using async / await.

I tried to find a workaround using SQLAlchemy and its async I/O wrappers, but it returned an error saying that the driver is not asynchronous:
InvalidRequestError: The asyncio extension requires an async driver to be used. The loaded 'databricks-sql-python' is not async.

@susodapop
Contributor

susodapop commented Jul 11, 2023

Thanks for raising this issue. I agree that async support would be a huge asset in the growing world of asyncio Python.

This is a nontrivial project though, namely because we use urllib3 for our HTTP handling and urllib3 doesn't support async. In the short term you can work around this with stdlib/threading rather than multiple processes but obviously the ergonomics there aren't good. We have an example of it in examples/query_cancel.py
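
As a rough illustration of the threading workaround mentioned above, here is a minimal sketch that offloads blocking calls to threads from asyncio code. `run_query_blocking` is a hypothetical stand-in for a blocking connector call (it only sleeps), so the names and timings are assumptions and not part of this library. `asyncio.to_thread` requires Python 3.9 or later.

```python
import asyncio
import time


def run_query_blocking(query: str) -> str:
    # Hypothetical stand-in for a blocking call such as cursor.execute()
    # followed by a fetch; here it only sleeps to simulate network I/O.
    time.sleep(0.05)
    return f"result of {query}"


async def run_queries(queries):
    # Each blocking call runs in the default thread pool, so the event loop
    # stays free to serve other clients while the queries are in flight.
    tasks = [asyncio.to_thread(run_query_blocking, q) for q in queries]
    # gather() returns results in the same order as the input queries.
    return await asyncio.gather(*tasks)


results = asyncio.run(run_queries(["SELECT 1", "SELECT 2", "SELECT 3"]))
print(results)
```

In real use it is probably safest to open a separate connection inside each worker call rather than share one across threads; that is an assumption to verify against the connector's documentation.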

I'm self-assigning this issue for now.

I'd like to see more signal from other users so we can gauge the demand for this feature. Single-threaded usage comprises the majority of customer feature requests with regard to this connector (the connector is 18 months old and this request has only come up once before here).

The more people ask for it the easier it will be to prioritise

@susodapop susodapop self-assigned this Jul 11, 2023
@krishnamenon22

Similar request here. Our key concern is handling timeouts while the cluster boots up.

@vhrichfernandez

I need to run 30 queries in parallel to check whether they parse in Unity Catalog (UC), treating a query as parseable if it completes within a 1-minute timeout. I'm using concurrent.futures.ProcessPoolExecutor but getting the following error:

>               _, error_list = execute_sql_queries_async(list(tables.values()))

tests/infrastructure/database/test_dbx_connect.py:25:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
sql_utils.py:240: in execute_sql_queries_async
    _, error = future.result()
/root/.pyenv/versions/3.10.12/lib/python3.10/concurrent/futures/_base.py:451: in result
    return self.__get_result()
/root/.pyenv/versions/3.10.12/lib/python3.10/concurrent/futures/_base.py:403: in __get_result
    raise self._exception
/root/.pyenv/versions/3.10.12/lib/python3.10/multiprocessing/queues.py:244: in _feed
    obj = _ForkingPickler.dumps(obj)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

cls = <class 'multiprocessing.reduction.ForkingPickler'>
obj = <concurrent.futures.process._CallItem object at 0x7fffe7e99420>, protocol = None

    @classmethod
    def dumps(cls, obj, protocol=None):
        buf = io.BytesIO()
>       cls(buf, protocol).dump(obj)
E       TypeError: cannot pickle 'SSLContext' object

@susodapop
Contributor

@vhrichfernandez that error is pretty straightforward. Here's a related StackOverflow answer.

As for the topic of this issue: we're actively planning out the introduction of both async/await and a blocking but async execution method for this connector.

@vhrichfernandez

@susodapop

Adding the following code to my module:

import pickle, copyreg, ssl

def save_sslcontext(obj):
    return obj.__class__, (obj.protocol,)

copyreg.pickle(ssl.SSLContext, save_sslcontext)
context = ssl.create_default_context()

results in the following error:

sql_utils.py:248: in execute_sql_queries_async
    _, error = future.result()
/root/.pyenv/versions/3.10.12/lib/python3.10/concurrent/futures/_base.py:451: in result
    return self.__get_result()
/root/.pyenv/versions/3.10.12/lib/python3.10/concurrent/futures/_base.py:403: in __get_result
    raise self._exception
/root/.pyenv/versions/3.10.12/lib/python3.10/multiprocessing/queues.py:244: in _feed
    obj = _ForkingPickler.dumps(obj)
/root/.pyenv/versions/3.10.12/lib/python3.10/multiprocessing/reduction.py:51: in dumps
    cls(buf, protocol).dump(obj)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

self = <ssl.SSLSocket fd=16, family=AddressFamily.AF_INET, type=SocketKind.SOCK_STREAM, proto=6, laddr=('172.17.0.2', 56568), raddr=('3.237.73.239', 443)>

    def __getstate__(self):
>       raise TypeError(f"cannot pickle {self.__class__.__name__!r} object")
E       TypeError: cannot pickle 'SSLSocket' object
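
Both tracebacks above point at pickling: ProcessPoolExecutor has to serialize whatever is submitted to a worker, and SSL contexts and sockets cannot be serialized. A minimal sketch of one way around this, assuming the root cause is that a live connection or cursor is being passed across the process boundary: send only plain data (the query string) and create the network resources inside the worker. `check_query` and `run_all` are hypothetical names, and the SSL context is only a stand-in for opening a connection.

```python
import pickle
import ssl
from concurrent.futures import ProcessPoolExecutor


def is_picklable(obj) -> bool:
    # Report whether an object can cross a process boundary via pickle.
    try:
        pickle.dumps(obj)
        return True
    except (TypeError, pickle.PicklingError):
        return False


def check_query(query: str):
    # Hypothetical worker: it receives only a string and builds its own
    # resources, so nothing unpicklable is ever sent between processes.
    context = ssl.create_default_context()  # stand-in for opening a connection
    assert context is not None
    # Real code would execute the query here and capture any error.
    error = None if query.strip() else "empty query"
    return query, error


def run_all(queries, max_workers=4):
    # Only the query strings and the (query, error) tuples are pickled.
    with ProcessPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(check_query, queries))


print(is_picklable(ssl.create_default_context()))  # an SSLContext cannot be pickled
print(is_picklable("SELECT 1"))                    # a plain string can
```

`run_all([...])` should be called from under an `if __name__ == "__main__":` guard so that worker processes can import the module cleanly. Since the work is I/O bound, a ThreadPoolExecutor would avoid pickling altogether.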

@admo1

admo1 commented Aug 11, 2023

I also would like to have async support

@Danielhiversen

Same here

@srushtishah

Also interested in async/await support.

@noctuid

noctuid commented Sep 21, 2023

The more people ask for it the easier it will be to prioritise

Our team is also interested

@susodapop
Contributor

Thanks for the signal, everyone. This feature is now being developed. I'll post updates on this issue as we move closer to release. Pull requests implementing this behaviour should begin to pop up in the next couple weeks.

@arunachalamsivananthandb

Hello @susodapop, could you please share the latest updates on this?

@arunachalamsivananthandb

Hello @susodapop, InMobi customer and MSFT have been following up on this. Can you please let me know whether this issue is fixed?

@michaelmirandi

Echoing the interest here. This would be a huge add!

@hayescode

I need this for an LLM web app I'm building, which has 400+ concurrent users.

@dhirschfeld
Contributor

The more people ask for it the easier it will be to prioritise

Just chiming in here that I'd also really like async support.

As a user of the trio async-framework it would be great if you could also support that (it is supported by httpx).

To support users of both asyncio and trio, you need to avoid using any asyncio primitives directly (async/await are fine) and instead use the equivalent functionality provided by the AnyIO library (this is how httpx provides support for both frameworks).

Using AnyIO automatically gives you support for both async-frameworks and has the additional benefit of providing a clean structured-concurrency (SC) api which should be easier to develop against than asyncio.

Many of the ideas developed by trio are now considered best practice for writing async code. Notably, the asyncio.TaskGroup is modelled off Trio's Nursery construct. asyncio.TaskGroup is only available on 3.11 but by using AnyIO you can use modern async constructs in earlier Python versions.

@hayescode

Thanks for the signal, everyone. This feature is now being developed. I'll post updates on this issue as we move closer to release. Pull requests implementing this behaviour should begin to pop up in the next couple weeks.

@susodapop checking in to see if you can share the latest status/ETA?

@kravets-levko kravets-levko added the enhancement New feature or request label Apr 17, 2024
@joarobles

Hi there! Any updates on this?

@yunbodeng-db yunbodeng-db assigned gopalldb and unassigned susodapop May 8, 2024
@yunbodeng-db
Collaborator

Adding async support is not trivial. We need to prioritize and do the design. There is no ETA to provide at this time.

@belligiu

belligiu commented May 8, 2024

Also interested in async/await support 🙏🏻

@joarobles

Hi @yunbodeng-db, I understand async support is non trivial but is this even being considered at the time?

@yunbodeng-db
Collaborator

Hi @yunbodeng-db, I understand async support is non trivial but is this even being considered at the time?

Not the async APIs at this moment, but it's possible to expose an async handler for the client to poll the status of a long running query. I cannot provide an ETA yet.
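
For readers unfamiliar with the polling approach described above, here is a generic sketch of the pattern: submit a query, then check its status periodically without blocking the event loop, giving up after a timeout. `FakeQueryHandle`, `is_pending`, and `fetch` are hypothetical names invented for illustration and are not this connector's API.

```python
import asyncio
import time


class FakeQueryHandle:
    """Hypothetical stand-in for a handle to a query running on the server."""

    def __init__(self, duration: float, rows):
        self._ready_at = time.monotonic() + duration
        self._rows = rows

    def is_pending(self) -> bool:
        # The query counts as running until its simulated duration elapses.
        return time.monotonic() < self._ready_at

    def fetch(self):
        if self.is_pending():
            raise RuntimeError("query still running")
        return self._rows


async def wait_for_result(handle, poll_interval=0.01, timeout=1.0):
    # Poll the handle, yielding to the event loop between checks so other
    # requests keep being served while the query runs.
    deadline = time.monotonic() + timeout
    while handle.is_pending():
        if time.monotonic() >= deadline:
            raise TimeoutError("query did not finish in time")
        await asyncio.sleep(poll_interval)
    return handle.fetch()


rows = asyncio.run(wait_for_result(FakeQueryHandle(0.05, [(1,)])))
print(rows)
```

The explicit deadline also covers the case raised earlier in the thread, where a query waits on a cluster that is still booting up.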

@jprakash-db
Contributor

@thetadweller We have added support for the async handler in v3.7.0 cc @deeksha-db

@BenMcH

BenMcH commented Feb 19, 2025

I don't think that this should have been closed as #485 is easily reproducible. The async methods don't appear to work.
