Implement exponential backoff/retry for reconnections #10

Closed
brentru opened this issue Oct 9, 2019 · 1 comment
Labels
enhancement New feature or request

Comments

@brentru
Member

brentru commented Oct 9, 2019

To avoid placing unnecessary load on MQTT servers such as Adafruit IO, Google Cloud IoT Core, or AWS IoT, reconnect() should accept an optional flag that enables exponential backoff after failed reconnection attempts. This would also help avoid throttling errors from the server.

reconnect() method: https://github.com/adafruit/Adafruit_CircuitPython_MiniMQTT/blob/master/adafruit_minimqtt.py#L585

See:
AWS Python Client implementation: https://github.com/aws/aws-iot-device-sdk-python#id4
AWS IoT Retries: https://docs.aws.amazon.com/general/latest/gr/api-retries.html
Google Cloud IoT Backoff: https://cloud.google.com/iot/docs/how-tos/exponential-backoff
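
For illustration, here is a minimal sketch of the requested behavior, loosely following the "2^n seconds plus random jitter, capped at a maximum" scheme from the Google document linked above. The names `backoff_delay` and `reconnect_with_backoff`, their parameters, and the choice to retry only on `OSError` are assumptions made for this example, not part of the MiniMQTT API:

```python
import random
import time


def backoff_delay(attempt, max_backoff=32.0, jitter=random.random):
    """Seconds to wait after failed attempt number `attempt` (0-based)."""
    # 2**attempt seconds plus up to one second of jitter, capped at max_backoff.
    return min(2 ** attempt + jitter(), max_backoff)


def reconnect_with_backoff(reconnect, sleep=time.sleep, max_attempts=5,
                           max_backoff=32.0, retry_on=(OSError,)):
    """Call `reconnect` until it succeeds or `max_attempts` is reached.

    `reconnect` is any zero-argument callable (hypothetical stand-in for
    the client's reconnect method). The exception types in `retry_on`
    are an assumption; a real client may raise something else.
    """
    for attempt in range(max_attempts):
        try:
            return reconnect()
        except retry_on:
            if attempt == max_attempts - 1:
                raise  # out of attempts: let the caller see the last error
            sleep(backoff_delay(attempt, max_backoff))
```

With the defaults, the waits between attempts fall in the ranges 1-2 s, 2-3 s, 4-5 s and 8-9 s, and the jitter keeps many devices from retrying in lockstep after a broker outage.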

@brentru brentru added the enhancement New feature or request label Oct 9, 2019
@vladak
Contributor

vladak commented Jan 30, 2023

I have a proof of concept in the works (using the Google algorithm, even though they are going to shut down their MQTT services this summer) and would like to hash out some implementation details. There are two parts to this. The first is the initial connect attempts, currently done in a loop in _get_connect_socket(). This obviously needs the exponential backoff treatment. There is a sub-problem to that: what if an iteration of the inner loop fails because of some internal cause such as MemoryError? Should the backoff algorithm continue in that case? My stance is that it should not, albeit just for that particular iteration.
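
To make the sub-problem concrete, here is one possible reading of that stance as a sketch: a network failure sleeps and grows the delay, while a MemoryError uses up an attempt but neither sleeps nor grows the delay for that iteration. The function name, the `open_socket` callable, and the final `RuntimeError` are hypothetical and only stand in for whatever _get_connect_socket() does internally:

```python
import random


def connect_with_backoff(open_socket, sleep, max_attempts=5, max_backoff=32.0):
    """Try `open_socket` up to `max_attempts` times (illustrative only)."""
    network_failures = 0  # only network failures make the delay grow
    for _ in range(max_attempts):
        try:
            return open_socket()
        except MemoryError:
            # Internal cause: the broker was never contacted, so skip the
            # backoff for this iteration and leave the delay unchanged.
            continue
        except OSError:
            # Network-level failure: wait 2**n seconds plus jitter, capped.
            sleep(min(2 ** network_failures + random.random(), max_backoff))
            network_failures += 1
    # Assumed error type; a real client would raise its own exception.
    raise RuntimeError("repeated connect failures")
```

Whether an internal failure should still count against the attempt limit, as it does here, is one of the details that would need to be decided.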

However, there is one more aspect. Consider this code:

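    mqtt_client.connect()

    while True:
        mqtt_client.loop(timeout=0.1)
        try:
            # Called on every iteration, whether or not the connection dropped.
            mqtt_client.reconnect()
        except Exception as e:  # placeholder handler; the exception type is an assumption
            print("reconnect failed:", e)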

The way reconnect() currently works is that it closes the existing socket regardless of its state and attempts to connect again. If the connect always succeeds on its first attempt, this piece of code will hammer the broker. I feel that even this sort of contrived example should be taken into account, and that the backoff should also apply in this case of what looks like bad programming.
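
One way to cover this case, sketched under assumptions: remember when the last reconnect happened and apply a growing delay whenever reconnect is called again too soon, even if every call succeeds. The class `ReconnectThrottle`, the `stable_after` threshold, and the injected `sleep` and `clock` callables are hypothetical names for this example only:

```python
import time


class ReconnectThrottle:
    """Wrap a reconnect callable so rapid repeated calls are slowed down."""

    def __init__(self, reconnect, sleep=time.sleep, clock=time.monotonic,
                 max_backoff=32.0, stable_after=60.0):
        self._reconnect = reconnect
        self._sleep = sleep
        self._clock = clock
        self._max_backoff = max_backoff
        self._stable_after = stable_after  # seconds without reconnects that reset the delay
        self._streak = 0    # consecutive "too soon" reconnects
        self._last = None   # time the previous reconnect finished

    def reconnect(self):
        now = self._clock()
        if self._last is not None and now - self._last < self._stable_after:
            # The previous reconnect was recent: wait, doubling the delay each time.
            self._sleep(min(2 ** self._streak, self._max_backoff))
            self._streak += 1
        else:
            # First call, or the connection was left alone long enough: start over.
            self._streak = 0
        result = self._reconnect()
        self._last = self._clock()
        return result
```

In the loop above, `ReconnectThrottle(mqtt_client.reconnect).reconnect()` would then wait 1 s, 2 s, 4 s and so on between successive calls instead of hitting the broker on every iteration.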

Lastly, the backoff should work not only at the TCP level but also at the MQTT level, so some refactoring is in order.
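
As a rough sketch of what covering both levels could look like, a single attempt below consists of opening the socket and then performing the MQTT handshake, and a failure at either stage triggers the same backoff. The callables `open_socket` and `mqtt_connect`, and the use of `RuntimeError` for a refused MQTT connection, are assumptions for illustration and do not reflect the library's current structure:

```python
import random


def connect_session(open_socket, mqtt_connect, sleep,
                    max_attempts=5, max_backoff=32.0):
    """Open a socket and complete the MQTT handshake, with shared backoff."""
    for attempt in range(max_attempts):
        sock = None
        try:
            sock = open_socket()   # TCP (and TLS) level
            mqtt_connect(sock)     # MQTT level: CONNECT / CONNACK exchange
            return sock
        except (OSError, RuntimeError):
            # Either stage failed: release the socket, then back off.
            if sock is not None:
                sock.close()
            sleep(min(2 ** attempt + random.random(), max_backoff))
    # Assumed error type; a real client would raise its own exception.
    raise RuntimeError("repeated connect failures")
```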

vladak added a commit to vladak/Adafruit_CircuitPython_MiniMQTT that referenced this issue Feb 5, 2023