Skip to content

WiFiClient::flush() can get stuck indefinitely #4736

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
tko opened this issue Jan 20, 2021 · 15 comments
Closed

WiFiClient::flush() can get stuck indefinitely #4736

tko opened this issue Jan 20, 2021 · 15 comments
Labels
Area: BT&Wifi BT & Wifi related issues Resolution: Expired More info wasn't provided

Comments

@tko
Copy link

tko commented Jan 20, 2021

Hardware:

Board: DOIT ESP32 DEVKIT V1
Core Installation version: 1.0.4
IDE name: Platform.io
Flash Frequency: Can't tell (Crystal is 40MHz?)
PSRAM enabled: Can't tell
Upload Speed: 921600
Computer OS: Mac OSX

Description:

After reading the http response line with WiFiClient::readBytesUntil() calling WiFiClient::flush() hangs indefinitely. Doesn't always happen, best guess is it depends on response length. First noticed with 204 No Content response from an influxdb server.

Sketch:

#include <WiFi.h>
#include <WiFiClient.h>

const char ssid[] = "FIXME";
const char pass[] = "FIXME";

void setup()
{
  Serial.begin(115200);

  WiFiClient client;
  WiFi.begin(ssid, pass);
  WiFi.waitForConnectResult();
  client.connect("httpbin.org", 80);
  client.printf("GET /status/200 HTTP/1.0\r\nHost: httpbin.org\r\n\r\n");

  auto res = client.readStringUntil('\n');
  Serial.println(res);

  client.flush();
  Serial.println("Done.");
}

void loop()
{
}

Debug Messages:

[D][WiFiGeneric.cpp:337] _eventCallback(): Event: 0 - WIFI_READY
[D][WiFiGeneric.cpp:337] _eventCallback(): Event: 2 - STA_START
[D][WiFiGeneric.cpp:337] _eventCallback(): Event: 4 - STA_CONNECTED
[D][WiFiGeneric.cpp:337] _eventCallback(): Event: 7 - STA_GOT_IP
[D][WiFiGeneric.cpp:381] _eventCallback(): STA IP: 192.168.1.2, MASK: 255.255.255.0, GW: 192.168.1.1
HTTP/1.1 204 NO CONTENT
@enwi
Copy link

enwi commented Jan 23, 2021

Having the same issue. The while loop inside WiFiClient::flush() hangs indefinitely due to recv always returning 0, but the previous call to available() returning a value larger 0. Adding some sort of timeout would definitely help.

res = recv(fd(), buf, toRead, MSG_DONTWAIT);

@enwi
Copy link

enwi commented Jan 23, 2021

Ahh I think I found the culprit. The call to WiFiClient::available() inside WiFiClient::flush() refers to the available data inside WiFiClientRxBuffer which could contain data already received from the socket. In that case the socket can't receive any more data, since it was already received by the RXBuffer thus looping forever waiting for data.

I fixed the code by changing the implementation of WiFiclient::flush() to

void WiFiClient::flush() {
    if(available()){
       while(read() >= 0) {};
    }
}

This way the WiFiClientRxBuffer is in charge of handling the data and receiving from the socket.

@Kiznoh Kiznoh added the Type: Bug 🐛 All bugs label Jan 27, 2021
@stale
Copy link

stale bot commented Jun 20, 2021

[STALE_SET] This issue has been automatically marked as stale because it has not had recent activity. It will be closed in 14 days if no further activity occurs. Thank you for your contributions.

@stale stale bot added the Status: Stale Issue is stale stage (outdated/stuck) label Jun 20, 2021
@enwi
Copy link

enwi commented Jun 20, 2021

I dare you to close the issue 🤣

@stale
Copy link

stale bot commented Jun 20, 2021

[STALE_CLR] This issue has been removed from the stale queue. Please ensure activity to keep it openin the future.

@stale stale bot removed the Status: Stale Issue is stale stage (outdated/stuck) label Jun 20, 2021
@Darktemp
Copy link

I had the same issue, that would hand PubSubClient::connected() if the connection was lost and it lead to the same spot in WiFiClient::flush(), where available() reported non-zero byte but recv always returned 0. Checking errno anyways shows this:
[E][WiFiClient.cpp:491] flush(): fail on fd 58, errno: 128, "Socket is not connected"
For now, I just changed if(res < 0) { to if(res <= 0) { but I'm not sure if there are valid scenarios, where res==0 is fine.

@VojtechBartoska
Copy link
Contributor

Hello @tko, sorry for late reply. Are you able to test your issue on development version 2.0.3-RC1 to check if this is still valid? You can take a look on Docs where is explained how to choose development release version in Arduino IDE.

@VojtechBartoska VojtechBartoska added Resolution: Awaiting response Waiting for response of author and removed Type: Bug 🐛 All bugs labels Mar 30, 2022
@tko
Copy link
Author

tko commented Apr 2, 2022

Now I get

...
[  2617][D][WiFiGeneric.cpp:852] _eventCallback(): Arduino Event: 7 - STA_GOT_IP
[  2620][D][WiFiGeneric.cpp:914] _eventCallback(): STA IP: 192.168.1.2, MASK: 255.255.255.0, GW: 192.168.1.1
HTTP/1.1 200 OK
[  3127][E][WiFiClient.cpp:499] flush(): fail on fd 48, errno: 128, "Socket is not connected"
Done.

I suppose it's better for not hanging anymore though the error seems little out of place.

Adding bit more logging I see before flush() calling available() returns 214, and checking connected() returns 0 and logs additionally

[  3164][D][WiFiClient.cpp:528] connected(): Disconnected: RES: 0, ERR: 128

@VojtechBartoska
Copy link
Contributor

Thanks for answer, we will test this.

@VojtechBartoska VojtechBartoska added the Area: BT&Wifi BT & Wifi related issues label Apr 7, 2022
@lemonkey
Copy link

lemonkey commented Aug 27, 2022

I also see this when there are remaining bytes to be read for an HTTPClient request but the client has since disconnected "gracefully". recv returns 0 and flush() gets stuck in a loop since the original value returned by available() (used to set the local variable 'a') is non-zero.

D][HTTPClient.cpp:388] disconnect(): still data in buffer (14), clean up.

Previous versions of HTTPClient.cpp, its disconnect() function didn't call flush, but instead sat in a while loop reading one byte at a time until _client->available() was no longer > 0. But the latest HTTPClient's disconnect() function was updated to call WiFiClient::flush() because the previous version of disconnect was "taking too long" when the number of bytes remaining was large. See also #6277

A similar change was also made to HTTPClient.cpp's end() function e3a5ae4

What's concerning is that according to the man page for recv, it should be returning -1 not 0 when there is an error. And I clearly see errno being set to 128 when currently recv is returning 0.

https://pubs.opengroup.org/onlinepubs/7908799/xns/recv.html

Upon successful completion, recv() returns the length of the message in bytes. If no messages are available to be received and the peer has performed an orderly shutdown, recv() returns 0. Otherwise, -1 is returned and errno is set to indicate the error.

Unless errno 128 (socket is not connected) is considered an orderly shutdown..

@lemonkey
Copy link

@Darktemp

For now, I just changed if(res < 0) { to if(res <= 0) { but I'm not sure if there are valid scenarios, where res==0 is fine.

When res == 0, you can check to see if errno is set. In my testing errno is always 128 when this happens ("Socket is not connected".

@michael-land
Copy link

The problem persists in version 2.0.11

@Shaquum
Copy link

Shaquum commented Oct 2, 2023 via email

@xreef
Copy link

xreef commented Oct 17, 2023

esp32 core 2.0.14 fixes the problem.
Thanks to all

@Parsaabasi Parsaabasi added Resolution: Expired More info wasn't provided and removed Status: Test needed Issue needs testing labels Jan 9, 2025
@Parsaabasi
Copy link

Hello,

I close this because the issue seems fixed. Also, this report contains the release we no longer support. Please try the new versions and in case the issue persists, feel free to reopen it.

Thanks

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Area: BT&Wifi BT & Wifi related issues Resolution: Expired More info wasn't provided
Projects
Development

No branches or pull requests

10 participants