Skip to content

800-900ms between loop() latency during wifi connection handling -3 #6136

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
4 of 6 tasks
dsptech opened this issue May 23, 2019 · 18 comments
Closed
4 of 6 tasks

800-900ms between loop() latency during wifi connection handling -3 #6136

dsptech opened this issue May 23, 2019 · 18 comments
Labels
waiting for feedback Waiting on additional info. If it's not received, the issue may be closed.

Comments

@dsptech
Copy link

dsptech commented May 23, 2019

Basic Infos

  • This issue complies with the issue POLICY doc.
  • I have read the documentation at readthedocs and the issue is not addressed there.
  • I have tested that the issue is present in current master branch (aka latest git).
  • I have searched the issue tracker for a similar issue.
  • If there is a stack dump, I have decoded it.
  • I have filled out all fields below.

Platform

  • Hardware: [NodeMcu (ESP12E - ESP12F)]
  • Core Version: [b551992]
  • Development Env: [Eclipse/Sloeber]
  • Operating System: [Windows]

Settings in IDE

  • Module: [Nodemcu]
  • Flash Mode: [dio]
  • Flash Size: [4MB]
  • lwip Variant: [v2 Lower Memory]
  • Reset Method: [ck|nodemcu]
  • Flash Frequency: [40Mhz]
  • CPU Frequency: [80Mhz|160MHz]
  • Upload Using: [SERIAL]
  • Upload Speed: [115200]

Problem Description

Long latency of the loop function (887ms) due WiFi connection activity. There are no change of the issue changing the lock setting from 80Mhz to 160Mhz.
The issue occur after a while that WiFi.begin is called and just before the "WIFI_EVENT_STAMODE_CONNECTED" event.

Use the sketch below to recreate the issue. The sketch also use some pin to allow the tracing via an hardware logic analizer. The two attached pictures show the behavior with 80Mhz and 160Mhz clock settings.
In the two pictures, the 2nd pulse in the track "SETUP" is the WiFi.begin call.
The track "LOOP" is generated by loop function that toggle a pin every 2ms (see the sketch).

The issue is active in the commits: 968d6fc and the last b551992
Could be related to: #4891

MCVE Sketch

#include <Arduino.h>

#include "ESP8266WiFi.h"

//-- Pin definitions. Trace purpose only
#define tracePin0_loop  14
#define tracePin1_setup 12
char mySsid[]="xxx";
char myKey[]="yyy";

void setup() {
    Serial.begin(115200);
    //initialize trace -----------------------
    pinMode(tracePin0_loop, OUTPUT);
    pinMode(tracePin1_setup, OUTPUT);
    digitalWrite(tracePin0_loop, 0);
    digitalWrite(tracePin1_setup, 0);
    delay (10000);
    //end of trace initialization------------



    digitalWrite(tracePin1_setup, 1);
    WiFi.persistent(false); //avoid delays by flash operations
    WiFi.mode(WIFI_OFF);
    WiFi.setAutoConnect(false);
    WiFi.setAutoReconnect(false);
    digitalWrite(tracePin1_setup, 0);

    delay (200);
    digitalWrite(tracePin1_setup, 1);
    WiFi.begin(mySsid, myKey);
    digitalWrite(tracePin1_setup, 0);
}

// The loop function is called in an endless loop
void loop() {
    digitalWrite(tracePin0_loop, !digitalRead(tracePin0_loop)); 
    //do some stuff
    delay(2); //cpu latency by WiFi (887ms) could occur here, or outside of this function
}

Debug Messages

SDK:2.2.1(cfd48f3)/Core:unspecified=0/lwIP:STABLE-2_1_2_RELEASE/glue:1.1-7-g82abda3/BearSSL:a143020
wifi evt: 7
wifi evt: 7
wifi evt: 7
wifi evt: 7
bcn 0
del if1
usl
mode : null
wifi evt: 8
mode : sta(18:fe:34:e8:1c:32)
add if0
wifi evt: 8
wifi evt: 2
scandone
state: 0 -> 2 (b0)
state: 2 -> 3 (0)
state: 3 -> 5 (10)
add 0
aid 9
cnt

connected with Vodafone-44013764, channel 11
dhcp client start...
wifi evt: 0
ip:192.168.5.56,mask:255.255.255.0,gw:192.168.5.1
wifi evt: 3
pm open,type:2 0

Additional information in the next post

ESP8266-80Mhz
ESP8266-160Mhz

@dsptech
Copy link
Author

dsptech commented May 23, 2019

Additional informations:
A third capture (related to another skecth) show the latency near the "WIFI_EVENT_STAMODE_CONNECTED" event.

As additive test I also checked the Timer 1 interrupt. That interrupt continue to work as expected without any latency (I used it as workaround for time critical functions).

regards.
ESP8266-80-withEvents

@devyte
Copy link
Collaborator

devyte commented May 23, 2019

Much better, thank you.

@devyte
Copy link
Collaborator

devyte commented May 23, 2019

What happens if you:

  1. iadd a WiFi. disconnect() during your startup sequence?
  2. wipe the entire flash instead of just the sketch?
  3. set WiFi.mode(STA) explicitly?

Also, could you please modify the sketch to measure time, e. g. with millis(), instead of toggling pins? Not everyone has a logic analyzer to reproduce in that way.

@devyte devyte added the waiting for feedback Waiting on additional info. If it's not received, the issue may be closed. label May 23, 2019
@dsptech
Copy link
Author

dsptech commented May 24, 2019

Hi @devyte

regarding your questions:

  1. Already tested before opening the issue. Adding WiFi.disconnect(true) or WiFi.disconnect(false) inside the initialization produce no effects.
  2. the flash is always fully erased before the upload
  3. adding WiFi.mode(WIFI_STA) at the end of the initialization produce no effects.

Please, see the next post for the new sketch with trace and time measures.

@dsptech
Copy link
Author

dsptech commented May 24, 2019

This is the requested sketch. The time produced in the logs is referred to the wifi initialization (t=0).
The latency is detected as delay >=10ms between two consecutive calls of the loop function.
The sketch logs are mixed with the debug logs to allow a general view of the situation.

I noticed now the latency always occur when the ssid is recognized, even in case of an uncorrect key.

EDIT: the tests are performed at 160Mhz clock

MCVE Sketch

#include <Arduino.h>
#include "ESP8266WiFi.h"

//-- Pin definitions. Trace purpose only
#define tracePin0_loop  14
#define tracePin1_setup 12
#define tracePin2_event 13

char mySsid[]="xxx";
char myKey[]="yyy";

unsigned long start_ms,lastCheck_ms;
const unsigned long trigger_ms=10; //for latency detection

void recordEvent(WiFiEvent_t event){
    //produce a burst of event+1 pulses on tracePin2_event pin
    for (int i=0; i<=(int)event; i++){
        digitalWrite(tracePin2_event, 1);
        digitalWrite(tracePin2_event, 0);
    }
    Serial.println(String("******* Wifi Event ") + (int) event + " received. Time: " + (millis() - start_ms));

}
void setup() {
    Serial.begin(115200);
    //initialize trace -----------------------
    pinMode(tracePin0_loop, OUTPUT);
    pinMode(tracePin1_setup, OUTPUT);
    pinMode(tracePin2_event, OUTPUT);
    digitalWrite(tracePin0_loop, 0);
    digitalWrite(tracePin1_setup, 0);
    digitalWrite(tracePin2_event, 0);

    delay (10000);
    //end of trace initialization -------------

    Serial.println("******* Set time reference to 0 (Wifi initialization)");
    start_ms = millis();

    digitalWrite(tracePin1_setup, 1);
    WiFi.onEvent(recordEvent);
    WiFi.persistent(false); //avoid delays by flash operations
    WiFi.mode(WIFI_OFF);
    WiFi.setAutoConnect(false);
    WiFi.setAutoReconnect(false);
    //WiFi.disconnect(true);
    //WiFi.mode(WIFI_STA);
    digitalWrite(tracePin1_setup, 0);

    //wait exactly 1000ms from the time reference to facilitate the reading of the log
    delay(1000 - (millis()-start_ms));

    Serial.println(String("******* Wifi.begin call. Time:") + (millis()-start_ms));

    lastCheck_ms = millis();

    digitalWrite(tracePin1_setup, 1);
    WiFi.begin(mySsid, myKey);
    digitalWrite(tracePin1_setup, 0);

 }

// The loop function is called in an endless loop
void loop() {
    unsigned long now,timeFromCheck;

    digitalWrite(tracePin0_loop, !digitalRead(tracePin0_loop));
    now=millis();
    timeFromCheck = now - lastCheck_ms;
    if (timeFromCheck >= trigger_ms)
        Serial.println(String("******* LATENCY ") + timeFromCheck + " ms detected. Time : " +(now - start_ms));
    lastCheck_ms = now;

    //do some stuff
    delay(2); //cpu latency by WiFi (887ms) could occur here, or outside of this function
}

--- Test condition: not existent ssid ----------------------------------
NO ISSUES


--- Test condition: correct SSID but wrong key. Latency occurred. ------

SDK:2.2.1(cfd48f3)/Core:unspecified=0/lwIP:STABLE-2_1_2_RELEASE/glue:1.1-7-g82abda3/BearSSL:a143020
******* Set time reference to 0 (Wifi initialization)
bcn 0
del if1
usl
mode : null
wifi evt: 8
******* Wifi Event 8 received. Time: 2
******* Wifi.begin call. Time:1000
mode : sta(18:fe:34:e8:1c:32)
add if0
wifi evt: 8
******* Wifi Event 8 received. Time: 1003
wifi evt: 2
******* Wifi Event 2 received. Time: 1806
scandone
state: 0 -> 2 (b0)
******* LATENCY 888 ms detected. Time : 4738
state: 2 -> 3 (0)
state: 3 -> 5 (10)
add 0
aid 17
cnt
state: 5 -> 2 (fc0)
rm 0
wifi evt: 1
STA disconnect: 15
******* Wifi Event 1 received. Time: 12783


--- Test condition: correct SSID and key. Latency occurred. ------

SDK:2.2.1(cfd48f3)/Core:unspecified=0/lwIP:STABLE-2_1_2_RELEASE/glue:1.1-7-g82abda3/BearSSL:a143020
wifi evt: 7
******* Set time reference to 0 (Wifi initialization)
bcn 0
del if1
usl
mode : null
wifi evt: 8
******* Wifi Event 8 received. Time: 3
******* Wifi.begin call. Time:1000
mode : sta(18:fe:34:e8:1c:32)
add if0
wifi evt: 8
******* Wifi Event 8 received. Time: 1003
wifi evt: 2
******* Wifi Event 2 received. Time: 1849
scandone
state: 0 -> 2 (b0)
******* LATENCY 888 ms detected. Time : 4739
state: 2 -> 3 (0)
state: 3 -> 5 (10)
add 0
aid 17
cnt

connected with DAP, channel 6
dhcp client start...
wifi evt: 0
******* Wifi Event 0 received. Time: 4771
ip:10.1.1.160,mask:255.255.255.0,gw:10.1.1.1
wifi evt: 3
******* Wifi Event 3 received. Time: 9558
pm open,type:2 0


@dsptech
Copy link
Author

dsptech commented May 24, 2019

Update:
I performed other tests:
A) used three different type of access point (to check for a compatibility issue)
B) changed kind of WiFi: b or g or g/n, WPA or WPA2, TKIP or AES and chosed several channels
C) created weak or strong signals conditions.

The latency always present with a constant duration (885 - 889 ms) in ALL cases.

@dsptech
Copy link
Author

dsptech commented May 27, 2019

hi @devyte,
further updates:
To check if the issue is related to the device configuration, I reverted the the project configuration to "generic ESP12" and tested the configurations below:

  • Clock: 80 and 160Mhz
  • Flash clock: 40 or 80 Mhz
  • Flash Mode (dio or qio) (actually qio/80Mhz)
  • Vtables: Flash or IRAM.
  • All three offered combinations of SDK.
    Result: there are no changes of the issue.

For a comparison test, I installed the 8266 RTOS environment and I performed a pair of tests using the simple_wifi example with the task priority lowered to "1". (0 = idle) . In a such case, for every connection I observed two minimal latencies of only 27ms and 50ms (scheduler freq= 100Hz), 32ms and 75ms (scheduler freq=1Khz).

Regards.

EDIT =05/Jun/19: The RTOS case is not correct. I discovered that the latency sometime occur also in this framework. I suppose a wifi driver issue. I'm still investigating about that.

@readmodifywrite
Copy link

I am also observing this issue. Same 800-900 ms delay.

@readmodifywrite
Copy link

This issue is present at least as far back as 2.4.1. I stopped testing beyond that.

@readmodifywrite
Copy link

Also occurs if you specify channel and BSSID, so it isn't related to doing a scan.

@readmodifywrite
Copy link

readmodifywrite commented Jul 18, 2019

Also tried a test with a timer0 interrupt. It looks like that keeps running all the way through. If anyone needs to ensure real-time behavior during a wifi connect, that might be an option.

Timer1 also works, and may be preferable since timer0 is ostensibly used by wifi already.

@dsptech
Copy link
Author

dsptech commented Jul 18, 2019

espressif/ESP8266_RTOS_SDK#626

Hi @readmodifywrite
I think about espressif wifi driver issues. In the latest ESP8266_RTOS_SDK the issue is heavier due the interrupts (but NMI) totally disabled in that time. I already opened an issues there.
Now they are working on. I "hope" in a fix in the driver.

@readmodifywrite
Copy link

@dsptech Awesome. I'm glad you found this issue, I would have been pretty lost in trying to find out what was going on without it.

@devyte
Copy link
Collaborator

devyte commented Nov 9, 2019

@dsptech a whole lot has changed since this was reported. Please retest with release 2.6.0.

@dsptech
Copy link
Author

dsptech commented Nov 9, 2019

Hi @devyte,
today I upgraded the framework at the current commit: 6626b2f. I'm sorry to say the the issue is still on: latency of 887ms evaluated by millis(). Test conditions: nodemcu 1.0 @160Mhz.
I'm pretty sure that the issue is not of the arduino framework, but it came from the sdk and wifi driver instead, so, I also think that the issue cannot be solved by the arduino team.

To confirm that, I already experienced an heavier wifi issue in the ESP8266 RTOS:
espressif/ESP8266_RTOS_SDK#626 (comment)
(EDIT: They compensated the error in the timing functions few time ago) and one years ago I esperienced a similar issue even with the ESP32 (ESP-IDF): Core 0 monopolized by wifi for 1100ms (I made no further tests)
It's the expressif wifi driver!

@devyte
Copy link
Collaborator

devyte commented Nov 9, 2019

Did you report in the NONOS SDK repo? I suggest doing that, and referencing work being done in the FreeRTOS repo.
Given that there's nothing we can do in the core, I'm closing the issue. If a fix is implemented in the NONOS SDK repo, please make sure to request it be backported to their 2.2.x branch, which is what we use (we do NOT use the sdk 3.x at this time). Once that fix is done, please report here and I'll reopen, or open a new issue and reference this one. We can pull the new sdk 2.2.x with the fix at that point.

@devyte devyte closed this as completed Nov 9, 2019
@dsptech
Copy link
Author

dsptech commented Nov 9, 2019

I've not reported the issue in NONOS SDK reported because many guys has already worked on:
(May 2018) #4891 (comment)
and earlephilhower requested (when closing the issue) to open here a new one (if still on).

Anyway, I assumed that the report in NONOS SDK was already done by the arduino team.
OK, I will forward to them.
thank you.

@dsptech
Copy link
Author

dsptech commented Nov 9, 2019

@devyte
regarding freertos, I may have written wrong. They compensated for an error in all timing functions induced by latency. The latency itself is still on.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
waiting for feedback Waiting on additional info. If it's not received, the issue may be closed.
Projects
None yet
Development

No branches or pull requests

3 participants