Skip to content

Intermittent hangs in NTP.get_time() #56

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
chabala opened this issue Apr 3, 2022 · 3 comments · Fixed by #82
Closed

Intermittent hangs in NTP.get_time() #56

chabala opened this issue Apr 3, 2022 · 3 comments · Fixed by #82

Comments

@chabala
Copy link
Contributor

chabala commented Apr 3, 2022

I'm working with the WIZnet W5100S-EVB-Pico. As part of troubleshooting another issue, I wanted timestamps, so I configured NTP. I saw there was no NTP example code, so here is my short example code:

import board
import busio
import digitalio
import time
import os
import rtc

from adafruit_wiznet5k.adafruit_wiznet5k import WIZNET5K
from adafruit_wiznet5k.adafruit_wiznet5k_ntp import NTP

# SPI0
SPI0_SCK = board.GP18
SPI0_TX = board.GP19
SPI0_RX = board.GP16
SPI0_CSn = board.GP17

# reset
W5x00_RSTn = board.GP20

print('starting...')

# define your own unique MAC address for the network interface
MAC_ADDR = (0xDE, 0xAD, 0xBE, 0xEF, 0xFE, 0xED)

eth_rst = digitalio.DigitalInOut(W5x00_RSTn)
eth_rst.direction = digitalio.Direction.OUTPUT

spi_bus = busio.SPI(SPI0_SCK, MOSI=SPI0_TX, MISO=SPI0_RX)
eth_cs = digitalio.DigitalInOut(SPI0_CSn)
eth = WIZNET5K(spi_bus, eth_cs, reset=eth_rst, is_dhcp=True, mac=MAC_ADDR, hostname="W5100S", dhcp_timeout=30, debug=False)


def syncTimeWithNTP():
    try:
        print("\nconstructing NTP object")
        ntp = NTP(eth, "pool.ntp.org", -5)
        print("setting RTC via NTP request")
        rtc.RTC().datetime = ntp.get_time()
        print("RTC set successfully")
    except OSError as e:
        print("OSError", e)


def currentTime():
    t = time.localtime(time.time())
    return "%04d-%02d-%02dT%02d:%02d:%02d" % (t.tm_year, t.tm_mon, t.tm_mday, t.tm_hour, t.tm_min, t.tm_sec)

print("machine:", os.uname().machine)
print("sysname:", os.uname().sysname)
print("CircuitPython version:", os.uname().version)
print("Chip Version:", eth.chip)
print("MAC Address:", ":".join("%02X" % _ for _ in eth.mac_address))
print("IP Address:", eth.pretty_ip(eth.ip_address))

syncTimeWithNTP()
print(currentTime())


while True:
    time.sleep(5)
    # Maintain DHCP lease
    eth.maintain_dhcp_lease()

This works, often, like so:

code.py output:
starting...
machine: Raspberry Pi Pico with rp2040
sysname: rp2040
CircuitPython version: 7.2.3 on 2022-03-16
Chip Version: w5100s
MAC Address: DE:AD:BE:EF:FE:ED
IP Address: 192.168.1.195

constructing NTP object
setting RTC via NTP request
RTC set successfully
2022-04-03T15:30:12

However, just as often, it hangs in NTP.get_time():

code.py output:
starting...
machine: Raspberry Pi Pico with rp2040
sysname: rp2040
CircuitPython version: 7.2.3 on 2022-03-16
Chip Version: w5100s
MAC Address: DE:AD:BE:EF:FE:ED
IP Address: 192.168.1.195

constructing NTP object
setting RTC via NTP request
Traceback (most recent call last):
  File "code.py", line 55, in <module>
  File "code.py", line 38, in syncTimeWithNTP
  File "/lib/adafruit_wiznet5k/adafruit_wiznet5k_ntp.py", line 58, in get_time
  File "/lib/adafruit_wiznet5k/adafruit_wiznet5k_socket.py", line 294, in recv
KeyboardInterrupt:

line 294 is gc.collect()

I've also seen this:

code.py output:
starting...
machine: Raspberry Pi Pico with rp2040
sysname: rp2040
CircuitPython version: 7.2.3 on 2022-03-16
Chip Version: w5100s
MAC Address: DE:AD:BE:EF:FE:ED
IP Address: 192.168.1.195

constructing NTP object
setting RTC via NTP request
Traceback (most recent call last):
  File "code.py", line 55, in <module>
  File "code.py", line 38, in syncTimeWithNTP
  File "/lib/adafruit_wiznet5k/adafruit_wiznet5k_ntp.py", line 58, in get_time
  File "/lib/adafruit_wiznet5k/adafruit_wiznet5k_socket.py", line 281, in recv
  File "/lib/adafruit_wiznet5k/adafruit_wiznet5k_socket.py", line 406, in available
  File "/lib/adafruit_wiznet5k/adafruit_wiznet5k.py", line 518, in socket_available
  File "/lib/adafruit_wiznet5k/adafruit_wiznet5k.py", line 866, in _get_rx_rcv_size
  File "/lib/adafruit_wiznet5k/adafruit_wiznet5k.py", line 907, in _read_snrx_rsr
  File "/lib/adafruit_wiznet5k/adafruit_wiznet5k.py", line 966, in _read_socket
  File "/lib/adafruit_wiznet5k/adafruit_wiznet5k.py", line 461, in read
KeyboardInterrupt:

This is a bus_device write. Hard to tell if it was stuck there or just stuck in a busy loop in that vicinity.

I'm using the latest stable CP (as shown in the output above), and latest Wiznet5k library.

@chabala
Copy link
Contributor Author

chabala commented Apr 4, 2022

I see there was a CP 7.2.4 released last week. I have updated, but it is still flaky:

code.py output:
starting...
machine: Raspberry Pi Pico with rp2040
sysname: rp2040
CircuitPython version: 7.2.4 on 2022-03-31
Chip Version: w5100s
MAC Address: DE:AD:BE:EF:FE:ED
IP Address: 192.168.1.195

constructing NTP object
setting RTC via NTP request
Traceback (most recent call last):
  File "code.py", line 55, in <module>
  File "code.py", line 38, in syncTimeWithNTP
  File "/lib/adafruit_wiznet5k/adafruit_wiznet5k_ntp.py", line 58, in get_time
  File "/lib/adafruit_wiznet5k/adafruit_wiznet5k_socket.py", line 291, in recv
KeyboardInterrupt: 

Updated again:

machine: Raspberry Pi Pico with rp2040
sysname: rp2040
CircuitPython version: 7.2.5 on 2022-04-06
Chip Version: w5100s
MAC Address: DE:AD:BE:EF:FE:ED
IP Address: 192.168.1.195

constructing NTP object
setting RTC via NTP request
Traceback (most recent call last):
  File "code.py", line 55, in <module>
  File "code.py", line 38, in syncTimeWithNTP
  File "/lib/adafruit_wiznet5k/adafruit_wiznet5k_ntp.py", line 58, in get_time
  File "/lib/adafruit_wiznet5k/adafruit_wiznet5k_socket.py", line 291, in recv
KeyboardInterrupt: 

@chabala
Copy link
Contributor Author

chabala commented Apr 8, 2022

My latest theory on this is that since NTP uses UDP, on occasion the reply may be lost in transit, and the socket is stuck in a loop waiting to read a reply that isn't coming.

Coincidentally, I've found a PR that sought to improve Wisnet5k UDP handling, but unfortunately it was opened against a Wiznet demo repository instead of this one. I've applied it locally, and while it doesn't solve the hanging issue, it doesn't seem to cause any ill effects either. I don't fully understand all the changes, but it looks like it's solving a mismatch between the number of sockets available and how the existing library handles tracking UDP globally.

@chabala
Copy link
Contributor Author

chabala commented Apr 11, 2022

I've setup a ntpd daemon on my local network, and after running for a few days, this has almost completely eliminated any hanging behavior, vs using pool.ntp.org. Not entirely though. But that would suggest that something about how the code handles waiting for the response could be more robust, likely with a bit of timeout and retry logic.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

1 participant