Decouple stats polling from results fetching #63

matthewwardrop · 2018-07-12T23:18:48Z

Greetings all!

I'm looking to transition a project I curate (omniduct, a library to simplify data acquisition, especially for data scientists) from pyhive to prestodb, but currently I would lose the ability to poll for query progress before actually attempting to retrieve results.

i.e. cursor.fetchone() is used to both collect results and update stats, which means that I cannot show progress of the actual execution of the query, only progress through collection of the results.

Would you welcome a patch to add support for this polling? Or are you planning to add it yourselves? Or are you opposed to adding this feature?


ggreg · 2019-02-15T00:01:14Z

@matthewwardrop yes, we would welcome a patch to add support for polling stats independently of fetching results. We're not currently working on it, so your contribution would be greatly appreciated :).

1. What stats are you most interested in?
2. What options are you considering to gather stats?

Regarding (2.), the client could send a GET HTTP request to a /1/query/{query_id} endpoint.

At some point, we'll need to consider using asyncio (or a concurrent.futures executor in Python 2.7) to perform some HTTP requests asynchronously, as interleaving the process of getting results and stats could lead to unexpected behaviors, such as queries failing with an ABANDONED error if the client takes too long to poll the status of a query.
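As a rough illustration of the executor idea, here is a minimal sketch that runs status polling on a background thread. Everything in it is an assumption made for illustration: `poll_until_done`, `fake_get_stats`, and the state names in `TERMINAL_STATES` are hypothetical and are not part of this client, and the fake function stands in for the HTTP request so the sketch runs without a server.

```python
import time
from concurrent.futures import ThreadPoolExecutor

# Assumed terminal states, for illustration only.
TERMINAL_STATES = ("FINISHED", "FAILED")


def poll_until_done(get_stats, interval=0.5, on_update=print):
    """Call get_stats() repeatedly until it reports a terminal state."""
    while True:
        stats = get_stats()
        on_update(stats)
        if stats["state"] in TERMINAL_STATES:
            return stats
        time.sleep(interval)


# Hypothetical stand-in for one HTTP status request.
_fake_states = iter(["QUEUED", "RUNNING", "FINISHED"])


def fake_get_stats():
    return {"state": next(_fake_states)}


with ThreadPoolExecutor(max_workers=1) as executor:
    future = executor.submit(poll_until_done, fake_get_stats, interval=0.01)
    # The calling thread is free to do other work while polling runs.
    final_stats = future.result()
```

Because the polling loop lives in its own thread, a slow consumer of results would not by itself delay the status requests, which is the concern raised above.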

matthewwardrop · 2019-02-15T17:56:06Z

Hi @ggreg,

Thanks for responding to this.

I'm interested in all of the stats that are returned by the standard endpoints, but most especially the 'progress' field. In terms of methodology, I am imagining polling the same endpoints currently used by PrestoQuery.fetch and returning stats. At some point, a call will return data and/or the query will enter a finished state; any returned data will be cached on some internal instance attribute, and further status polling will simply return the state as of that time. The user can then use the fetch methods as before, which will collect the data set aside in the local cache and then append to it any data returned by subsequent endpoint calls until the data is fully collected locally, as is the current behaviour.

This should not suffer any abandonment issues unless the user does not move on to using the fetch method within some sensible window of time after the polling indicates that the query has successfully run its course.
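To make the proposal concrete, here is a minimal sketch of the cache-while-polling idea. It is not the eventual patch: `PollableQuery`, `fetch_page`, and the shape of the page dictionaries (including the `progress` and `finished` keys) are hypothetical, and `fetch_page` stands in for one round trip to the endpoint.

```python
class PollableQuery:
    """Hypothetical wrapper: poll stats first, fetch rows later."""

    def __init__(self, fetch_page):
        # fetch_page() returns a dict with "stats", optional "data"
        # (a list of rows), and an optional "finished" flag.
        self._fetch_page = fetch_page
        self._rows = []        # rows received while polling
        self._stats = {}
        self._finished = False

    def _advance(self):
        page = self._fetch_page()
        self._stats = page["stats"]
        self._rows.extend(page.get("data", []))
        self._finished = page.get("finished", False)

    def poll(self):
        """Return the latest stats without discarding any rows."""
        # Once rows have arrived or the query has finished, stop
        # advancing and keep returning the state as of that time.
        if not self._finished and not self._rows:
            self._advance()
        return self._stats

    def fetchall(self):
        """Return the cached rows plus everything still to be retrieved."""
        while not self._finished:
            self._advance()
        rows, self._rows = self._rows, []
        return rows


# Fake pages so the sketch runs without a server.
pages = iter([
    {"stats": {"state": "RUNNING", "progress": 0.5}},
    {"stats": {"state": "RUNNING", "progress": 1.0}, "data": [[1], [2]]},
    {"stats": {"state": "FINISHED", "progress": 1.0}, "data": [[3]],
     "finished": True},
])
query = PollableQuery(lambda: next(pages))
```

With this setup, calling `query.poll()` advances the query until the first rows arrive, and a later `query.fetchall()` returns the cached rows followed by the remaining ones.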

Perhaps the asyncio/futures approach might belong instead in a wrapping library, such as omniduct, unless you are planning to support multiplexing of queries within this library itself.

I'll put out a PR soon.

akhandev · 2019-06-26T11:30:12Z

Hi @matthewwardrop,
Did you get a chance to add this functionality?

matthewwardrop · 2019-06-26T19:56:26Z

Not yet, @akhandev. I'll try to put out a PR this week. :)

matthewwardrop changed the title ~~Make stats polling asynchronous~~ Decouple stats polling from results fetching Jul 12, 2018

ggreg assigned matthewwardrop Feb 15, 2019

ggreg added the enhancement label Feb 15, 2019

matthewwardrop mentioned this issue Oct 17, 2019

Add support for polling #90

Open

matthewwardrop mentioned this issue Apr 4, 2020

Add support for polling. trinodb/trino-python-client#18

Open
