Timeout cannot be specify as a float #383

Closed
wiseboar-9 opened this issue Dec 13, 2021 · 5 comments · Fixed by #384
Labels
bug Something isn't working

Milestone
1.25.0

Comments


wiseboar-9 commented Dec 13, 2021

When the connection to InfluxDB is lost during a write operation, the Python client raises no exception even though the write_api was configured with no retries, UNLESS retries is also set to 0 on the InfluxDBClient instance.

Steps to reproduce:

  1. run the script below
  2. disconnect during the write operation (there are lots of writes, so the disconnect happens mid-write)
  3. no exception occurs; the script gets stuck in a loop with no further debugging output
  4. when the connection is restored, the write resumes, which means a retry happens despite specifying 0 retries for the write_api
import copy
import datetime
from typing import List
import numpy as np
import pandas as pd
from influxdb_client import WriteOptions, InfluxDBClient
from influxdb_client.client.write_api import WriteType

if __name__ == '__main__':
    t_col = "time"
    token = "REDACTED"
    org = "REDACTED"
    url = "http://10.43.0.52:8086"
    bucket = "prc_test"

    def create_data(time_col: str):
        x = np.arange(1, 10)
        now = datetime.datetime.utcnow().timestamp()
        time_data = [now - 10 - xxx for xxx in x]

        y = x ** 2
        frame = pd.DataFrame({time_col: time_data, 'x': x, 'y': y})
        tag_name = 'some_tag'
        tags: List[str] = [tag_name]
        frame[tag_name] = "very"

        frame[time_col] = frame[time_col].apply(lambda x: datetime.datetime.fromtimestamp(x))
        frame = copy.deepcopy(frame.set_index(time_col))
        return frame, tags

    influx_client = InfluxDBClient(
        url=url, token=token, org=org, debug=True,
        timeout=1000,
    )

    print("write started")
    writer = influx_client.write_api(
        write_options=WriteOptions(
            flush_interval=0,
            max_retries=0,
            write_type=WriteType.synchronous,
            max_retry_time=0,
            max_retry_delay=0,
            retry_interval=0,
            exponential_base=1,
        ),
    )
    for i in range(0, 10000):
        df, tag_list = create_data(time_col=t_col)
        writer.write(
            bucket=bucket,
            org=org,
            record=df,
            data_frame_measurement_name="test-01",
            data_frame_tag_columns=tag_list,
        )
    print("done!")

When retries of the InfluxDBClient instance is set to 0, a urllib3.exceptions.ReadTimeoutError occurs, which is what I'd expect to happen with the code above, e.g.:

influx_client = InfluxDBClient(
    url=url, token=token, org=org, debug=True,
    timeout=1000, retries=0,
)

Expected behavior:
An exception to be raised after the specified number of retries and delays of the writer.

Actual behavior:
No exception raised despite connection loss.

Specifications:

  • Client Version: 1.24.0
  • InfluxDB Version: 2.0.8
  • Platform: Ubuntu 20.04.

wiseboar-9 commented Dec 13, 2021

I did some more tests and ran into another small issue that crippled my progress for a while: when the given timeout (e.g. InfluxDBClient(timeout=1.0*1000)) is a float rather than an integer, the same behavior as described above occurs. I'm guessing the timeout value doesn't get recognized properly.
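
In the meantime, casting the timeout to an int before passing it in works around the problem for me. A minimal sketch (url/token/org are placeholders):

from influxdb_client import InfluxDBClient

# Workaround sketch: the client takes the timeout in milliseconds;
# casting to int avoids handing a float through to the HTTP layer.
timeout_seconds = 1.0
influx_client = InfluxDBClient(
    url="http://localhost:8086",
    token="my-token",
    org="my-org",
    timeout=int(timeout_seconds * 1000),  # 1000, not 1000.0
)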


bednar commented Dec 14, 2021

Hi @wiseboar-9,

thanks for using our client.

The following properties apply only to WriteType.batching: batch_size, flush_interval, jitter_interval, retry_interval, max_retry_time, max_retries, max_retry_delay, exponential_base. You are using the WriteType.synchronous mode, so they have no effect.

So your code can be simplified to:

import copy
import datetime
from typing import List

import numpy as np
import pandas as pd

from influxdb_client import InfluxDBClient
from influxdb_client.client.write_api import SYNCHRONOUS

if __name__ == '__main__':
    t_col = "time"
    token = "my-token"
    org = "my-org"
    url = "http://localhost:8086"
    bucket = "my-bucket"


    def create_data(time_col: str):
        x = np.arange(1, 10)
        now = datetime.datetime.utcnow().timestamp()
        time_data = [now - 10 - xxx for xxx in x]

        y = x ** 2
        frame = pd.DataFrame({time_col: time_data, 'x': x, 'y': y})
        tag_name = 'some_tag'
        tags: List[str] = [tag_name]
        frame[tag_name] = "very"

        frame[time_col] = frame[time_col].apply(lambda x: datetime.datetime.fromtimestamp(x))
        frame = copy.deepcopy(frame.set_index(time_col))
        return frame, tags


    with InfluxDBClient(url=url, token=token, org=org, debug=True, timeout=1000) as influx_client:

        print("write started")
        writer = influx_client.write_api(write_options=SYNCHRONOUS)
        for i in range(0, 10000):
            df, tag_list = create_data(time_col=t_col)
            writer.write(
                bucket=bucket,
                org=org,
                record=df,
                data_frame_measurement_name="test-01",
                data_frame_tag_columns=tag_list,
            )
        print("done!")

I've tried your code and everything works as expected. When I paused the InfluxDB container, a urllib3.exceptions.ReadTimeoutError occurred immediately.

Which version of urllib3 do you use?

Regards

bednar added the question (Further information is requested) label Dec 14, 2021

wiseboar-9 commented Dec 14, 2021

Thanks for the quick reply!

I figured it out: it's what I wrote in my second comment. I had specified the timeout as a float before; the retries apparently had nothing to do with it. I don't know where I messed up my testing yesterday, but I spent the whole day on this, sigh. Mea culpa.

I humbly suggest adding a check so the timeout cannot be specified as a float. With timeout=1.0, the behavior I described above occurs:

with InfluxDBClient(
        url=url, token=token, org=org, debug=True, timeout=1.0,
) as influx_client:

I'm disconnecting on the OS side, by the way: turning off the WiFi adapter, or connecting to a different network / disconnecting from the network the database is on.

FYI: urllib3==1.26.6


bednar commented Dec 14, 2021

> I humbly suggest adding a check so the timeout cannot be specified as a float.

Thanks for your detailed investigation. We will add support for specifying the timeout as a float.
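
Roughly, the idea is to normalize the configured timeout before it reaches urllib3. An illustrative sketch only (the helper name is made up, not the actual client internals):

import urllib3

# Illustrative sketch, not the real implementation: accept the timeout
# as int or float milliseconds and convert it to the seconds-based
# urllib3.Timeout that the HTTP layer understands.
def _to_urllib3_timeout(timeout_ms):
    seconds = float(timeout_ms) / 1000
    return urllib3.Timeout(connect=seconds, read=seconds)

print(_to_urllib3_timeout(1.0 * 1000))  # Timeout(connect=1.0, read=1.0, total=None)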

bednar changed the title from "no exception when connection is lost during write" to "Timeout cannot be specify as a float" Dec 14, 2021
bednar added the bug (Something isn't working) label and removed the question label Dec 14, 2021
wiseboar-9 commented Dec 14, 2021

> I humbly suggest adding a check so the timeout cannot be specified as a float.

> Thanks for your detailed investigation. We will add support for specifying the timeout as a float.

No problem at all, thank you for testing this so quickly!

bednar added this to the 1.25.0 milestone Dec 14, 2021