Proposal:
Create a "no-block" batch flush mode

Current behavior:
Currently, when writing in batch mode, there can be a period of delay when the batch is flushed and points are written to a distant InfluxDB instance: the interpreter has to context-switch between threads to perform the write, which has knock-on effects for the calling application.
For latency-sensitive uses this is problematic, as it delays the calling application.
For example, a service offering an API might choose to write analytics data out to InfluxDB whenever an endpoint is called, so its responses to its own API consumers may be delayed if network conditions between the calling app and InfluxDB are sub-optimal.
With a batch size of 500, by calculating the time (in ns) between iterations, we can see the impact of the write (this is with a deliberately distant InfluxDB instance to really highlight the difference):
import time
from influxdb_client import InfluxDBClient, WriteOptions

# client connects to a (deliberately distant) InfluxDB instance
client = InfluxDBClient(url="...", token="...", org="...")

last = time.time_ns()
with client.write_api(write_options=WriteOptions(batch_size=500)) as wa:
    for x in range(1, 1000):
        point = f"foo,bar=sed fieldval={x} {time.time_ns()}"
        wa.write(bucket="btasker+cloud2's Bucket", record=point)
        now = time.time_ns()
        delta = now - last
        print(f"{x}: {delta}")
        last = now
....
495: 6312
496: 6053
497: 6186
498: 6059
499: 6057
500: 2906668
501: 124634
502: 12492
503: 9888
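Since the deltas above come from time.time_ns(), they are nanoseconds; a quick back-of-the-envelope comparison of the steady-state iterations against the flush iteration makes the spike concrete:

```python
# Average the steady-state deltas (iterations 495-499) and compare them
# against the flush iteration (500), using the numbers from the output above.
steady_ns = (6312 + 6053 + 6186 + 6059 + 6057) / 5  # ~6.1 us per iteration
flush_ns = 2906668                                   # ~2.9 ms at the flush

print(round(steady_ns / 1000, 1), "us steady state")     # → 6.1 us
print(round(flush_ns / 1_000_000, 2), "ms at flush")     # → 2.91 ms
print(round(flush_ns / steady_ns), "x slower")           # → 474 x slower
```

So the iteration that triggers the flush is roughly 470x slower for the caller than its neighbours.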
Desired behavior:
It's possible to work around this by using multiprocessing and/or similar approaches.
What'd be good, though, is if the client library could implement this itself so that it's abstracted away from developers; that way they won't need to write boilerplate to address this issue.
In effect, in the example above there should be no more overhead/delay to the calling application on iteration 500 than there is on iterations 1, 2, 3, etc.
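A minimal sketch of that multiprocessing workaround, for illustration only: the names (`writer`), the batch size, and the 3 ms sleep standing in for the real batched network write are assumptions, not part of the client library. The caller only pays the cost of a queue put; the flush happens in a separate process.

```python
import multiprocessing as mp
import time

def writer(queue) -> None:
    """Drain the queue in a separate process; the slow batched write
    (simulated here with a sleep) never runs in the caller's process."""
    batch = []
    while True:
        item = queue.get()
        if item is None:           # sentinel: flush and exit
            break
        batch.append(item)
        if len(batch) >= 500:      # a full batch is ready
            time.sleep(0.003)      # stand-in for the real network write
            batch.clear()

# "fork" keeps the sketch simple; Windows would need "spawn" plus a __main__ guard
ctx = mp.get_context("fork")
q = ctx.Queue()
proc = ctx.Process(target=writer, args=(q,))
proc.start()

deltas = []
last = time.time_ns()
for x in range(1, 1000):
    q.put(f"foo,bar=sed fieldval={x} {time.time_ns()}")  # returns immediately
    now = time.time_ns()
    deltas.append(now - last)
    last = now

q.put(None)    # ask the writer process to finish
proc.join()
print(f"{len(deltas)} writes queued; max caller-side delta {max(deltas)} ns")
```

This is roughly the boilerplate the proposal asks the client library to absorb, so that every iteration costs the caller about the same.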
Use case:
The time impact of context switching will particularly affect latency-sensitive applications.