ESP8266httpUpdate.update() incompatible with nginx #1157

skandragon · 2015-12-06T05:46:29Z

I am attempting to perform an upgrade from a server running nginx, and am having no real luck. It appears only some of the data is being transferred to the ESP, and I'm quite sure it's not a problem with nginx:

10.42.0.47 - - [06/Dec/2015:03:42:40 +0000] "GET /explorer/esp.bin HTTP/1.1" 200 69819 "-" "ESP8266-http-Update"
10.42.0.47 - - [06/Dec/2015:03:43:10 +0000] "GET /explorer/esp.bin HTTP/1.1" 200 28939 "-" "ESP8266-http-Update"
10.42.0.47 - - [06/Dec/2015:03:47:26 +0000] "GET /explorer/esp.bin HTTP/1.1" 200 71279 "-" "ESP8266-http-Update"
10.42.0.47 - - [06/Dec/2015:03:52:02 +0000] "GET /explorer/esp.bin HTTP/1.1" 200 28939 "-" "ESP8266-http-Update"
10.42.0.47 - - [06/Dec/2015:05:32:51 +0000] "GET /explorer/esp.bin HTTP/1.1" 200 228000 "-" "ESP8266-http-Update"
10.42.0.47 - - [06/Dec/2015:05:33:43 +0000] "GET /explorer/esp.bin HTTP/1.1" 200 202679 "-" "ESP8266-http-Update"
10.42.0.47 - - [06/Dec/2015:05:34:43 +0000] "GET /explorer/esp.bin HTTP/1.1" 200 28939 "-" "ESP8266-http-Update"
10.42.0.47 - - [06/Dec/2015:05:41:08 +0000] "GET /explorer/esp.bin HTTP/1.1" 200 28939 "-" "ESP8266-http-Update"

Here's a tcpdump of the last connection shown:

05:41:08.622315 IP 10.42.0.47.4099 > 10.42.120.1.80: Flags [S], seq 16134, win 5840, options [mss 1460], length 0
05:41:08.622364 IP 10.42.120.1.80 > 10.42.0.47.4099: Flags [S.], seq 2641987090, ack 16135, win 29200, options [mss 1460], length 0
05:41:08.624701 IP 10.42.0.47.4099 > 10.42.120.1.80: Flags [.], ack 1, win 5840, length 0
05:41:08.626063 IP 10.42.0.47.4099 > 10.42.120.1.80: Flags [P.], seq 1:341, ack 1, win 5840, length 340
05:41:08.626097 IP 10.42.120.1.80 > 10.42.0.47.4099: Flags [.], ack 341, win 30016, length 0
05:41:08.626264 IP 10.42.120.1.80 > 10.42.0.47.4099: Flags [.], seq 1:1461, ack 341, win 30016, length 1460
05:41:08.626293 IP 10.42.120.1.80 > 10.42.0.47.4099: Flags [.], seq 1461:2921, ack 341, win 30016, length 1460
05:41:08.626323 IP 10.42.120.1.80 > 10.42.0.47.4099: Flags [.], seq 2921:4381, ack 341, win 30016, length 1460
05:41:08.626343 IP 10.42.120.1.80 > 10.42.0.47.4099: Flags [.], seq 4381:5841, ack 341, win 30016, length 1460
05:41:08.631852 IP 10.42.0.47.4099 > 10.42.120.1.80: Flags [.], ack 2921, win 2920, length 0
05:41:08.634229 IP 10.42.0.47.4099 > 10.42.120.1.80: Flags [.], ack 5841, win 0, length 0
05:41:08.730753 IP 10.42.0.47.4099 > 10.42.120.1.80: Flags [.], ack 5841, win 5840, length 0
05:41:08.730778 IP 10.42.120.1.80 > 10.42.0.47.4099: Flags [P.], seq 5841:7301, ack 341, win 30016, length 1460
05:41:08.730789 IP 10.42.120.1.80 > 10.42.0.47.4099: Flags [.], seq 7301:8761, ack 341, win 30016, length 1460
05:41:08.730799 IP 10.42.120.1.80 > 10.42.0.47.4099: Flags [.], seq 8761:10221, ack 341, win 30016, length 1460
05:41:08.730807 IP 10.42.120.1.80 > 10.42.0.47.4099: Flags [P.], seq 10221:11681, ack 341, win 30016, length 1460
05:41:08.730814 IP 10.42.0.47.4099 > 10.42.120.1.80: Flags [F.], seq 341, ack 5841, win 5840, length 0
05:41:08.734321 IP 10.42.0.47.4099 > 10.42.120.1.80: Flags [R.], seq 342, ack 7301, win 5840, length 0
05:41:08.735513 IP 10.42.0.47.4099 > 10.42.120.1.80: Flags [R.], seq 341, ack 8761, win 5840, length 0
05:41:08.735533 IP 10.42.0.47.4099 > 10.42.120.1.80: Flags [R.], seq 341, ack 10221, win 5840, length 0
05:41:08.737900 IP 10.42.0.47.4099 > 10.42.120.1.80: Flags [R.], seq 341, ack 11681, win 5840, length 0

It works exactly once after a fresh reboot, but subsequent upgrade attempts fail even if that one succeeds.

I've attached a working tcpdump to https://gist.github.com/skandragon/47d4769ff988a3e1ce65 rather than put it inline here.

The text was updated successfully, but these errors were encountered:

skandragon · 2015-12-06T06:05:26Z

Compiled with UPDATE_SERIAL Serial:

Success (partial log): https://gist.github.com/skandragon/3b22582d3138009cf436

Failure (full log): https://gist.github.com/skandragon/7307d615b20c123ccfab

skandragon · 2015-12-06T06:19:43Z

Printing the error:

[update] Update failed.
ERROR[5]: Stream Read Timeout

I do not see any IP level fragmentation, so that's not an issue.

skandragon · 2015-12-06T08:49:24Z

I am starting to think the general state of TCP connections in ESPland is somewhat broken.

My ultimate goal is to receive RGB data to drive LED arrays, and this will have to come in at a fairly high data rate.

If I limit my data rate to one update per 0.5 seconds (each update consists of 6 bytes of data), then the ESP has no issue. My read is basically this:

read a single byte command.
If it's an LED update command, read 5 more bytes using client.read(buffer, sizeof buffer) where buffer is a char buffer[5].

With a low data rate, this works fairly well, but it will even then eventually have a short read. If I send at a high data rate, I can cause a short read to occur nearly instantly. In all cases, only 1 byte of the 5 is read, according to a printf().

The connection is entirely over a stable wifi connection; there should be no AP switching or dropping, but I'm not positive the ESP is as stable as I'd like it to be here. It is possible it's reconnecting, but I have no idea how to tell this.

igrr · 2015-12-06T08:55:24Z

Socket read can return less bytes than you requested, there is nothing
wrong with it.
If you want "wait until N bytes are available or timeout" behavior, there's
Stream::readBytes function for that.

In any case, please open a separate issue if you find short reads a
problem, and post the sketch which may be used to reproduce.
Thanks.

On Sun, Dec 6, 2015, 11:49 Michael Graff [email protected] wrote:

I am starting to think the general state of TCP connections in ESPland is
somewhat broken.

My ultimate goal is to receive RGB data to drive LED arrays, and this will
have to come in at a fairly high data rate.

If I limit my data rate to one update per 0.5 seconds (each update
consists of 6 bytes of data), then the ESP has no issue. My read is
basically this:

read a single byte command.
If it's an LED update command, read 5 more bytes using client.read(buffer,
sizeof buffer) where buffer is a char buffer[5].

With a low data rate, this works fairly well, but it will even then
eventually have a short read. If I send at a high data rate, I can cause a
short read to occur nearly instantly. In all cases, only 1 byte of the 5 is
read, according to a printf().

The connection is entirely over a stable wifi connection; there should be
no AP switching or dropping, but I'm not positive the ESP is as stable as
I'd like it to be here. It is possible it's reconnecting, but I have no
idea how to tell this.

—
Reply to this email directly or view it on GitHub
#1157 (comment).

Links2004 · 2015-12-06T10:13:25Z

the ESP8266httpUpdate class itself handle only the requesting and header Interpretation.
the data transmission itself is handled in the Update class.
which uses readBytes, the time out for the read is 1000ms.

you can try to enable the debug:
https://github.com/esp8266/Arduino/blob/master/libraries/ESP8266httpUpdate/src/ESP8266httpUpdate.h#L35
https://github.com/esp8266/Arduino/blob/master/cores/esp8266/Updater.cpp#L6

I think there are to possible reasons.

the reported Content-Length in the header is wrong (easy to check with debug on)
for some reasons a timeout course (then its more the TCP part of the ESP)

skandragon · 2015-12-06T23:45:31Z

Looking more closely at the tcpdump when this first fails (not subsequent failures, which are tainted by left over internal state somewhere inside the ESPhttpUpdate class) I see this pattern:

Server sends some number of bytes (1460 bytes usually)
ESP ACKs
Server sends more
ESP ACKs previous bytes again, window size unchanged.
ESP ACKs previous bytes again, window size unchanged.
Server sends more
ESP ACKs the same bytes again, window size unchanged.
<at this point the server thinks the ESPs window is full, so waits, until...>
Server retransmits from the last ACK
ESP ACKs new data, and data flows again

This happens every time there is an issue. It can take a while for the server to re-transmit the un-ack'd data, usually longer than the timeout for the read call, is my guess.

I'm running a 10x second ping to the ESP, and I see no dropped packets. No other device, wifi or ethernet connected, has issues talking with the server running the nginx server. Local connectivity does not seem to be an issue.

This is a tcpdump running from the wifi access point itself:

23:36:18.523452 IP 10.42.120.1.80 > 10.42.0.47.4097: Flags [.], seq 49641:51101, ack 343, win 30016, length 1460
23:36:18.528178 IP 10.42.0.47.4097 > 10.42.120.1.80: Flags [.], ack 45261, win 5840, length 0
23:36:18.528194 IP 10.42.120.1.80 > 10.42.0.47.4097: Flags [.], seq 45261:46721, ack 343, win 30016, length 1460
23:36:18.528204 IP 10.42.120.1.80 > 10.42.0.47.4097: Flags [.], seq 46721:48181, ack 343, win 30016, length 1460
23:36:18.643764 IP 10.42.0.47.4097 > 10.42.120.1.80: Flags [.], ack 46721, win 4380, length 0
23:36:18.643786 IP 10.42.120.1.80 > 10.42.0.47.4097: Flags [.], seq 48181:49641, ack 343, win 30016, length 1460
23:36:18.650908 IP 10.42.0.47.4097 > 10.42.120.1.80: Flags [.], ack 48181, win 4380, length 0
23:36:18.650926 IP 10.42.120.1.80 > 10.42.0.47.4097: Flags [P.], seq 51101:52561, ack 343, win 30016, length 1460
23:36:18.650934 IP 10.42.0.47.4097 > 10.42.120.1.80: Flags [.], ack 48181, win 5840, length 0
23:36:18.655676 IP 10.42.0.47.4097 > 10.42.120.1.80: Flags [.], ack 49641, win 4380, length 0
23:36:18.655692 IP 10.42.120.1.80 > 10.42.0.47.4097: Flags [.], seq 52561:54021, ack 343, win 30016, length 1460
23:36:18.789139 IP 10.42.0.47.4097 > 10.42.120.1.80: Flags [.], ack 49641, win 4380, length 0
23:36:18.789156 IP 10.42.0.47.4097 > 10.42.120.1.80: Flags [.], ack 49641, win 5840, length 0
23:36:28.789872 IP 10.42.0.47.4097 > 10.42.120.1.80: Flags [F.], seq 343, ack 49641, win 5840, length 0
23:36:28.828208 IP 10.42.120.1.80 > 10.42.0.47.4097: Flags [.], ack 344, win 30016, length 0
23:36:29.556215 IP 10.42.120.1.80 > 10.42.0.47.4097: Flags [.], seq 49641:51101, ack 344, win 30016, length 1460
23:36:29.558423 IP 10.42.0.47.4097 > 10.42.120.1.80: Flags [R.], seq 344, ack 51101, win 5840, length 0

zenmanenergy · 2015-12-09T19:54:49Z

I'm having the same problem on a php-apache server. I've traced it down to this same line number in updater.cpp. The solution is over my head, but I can reproduce the problem 100% of the time.

zenmanenergy · 2015-12-09T20:27:34Z

I've made some interesting progress...

Try increasing the stream timeout to 10000 in writeStream in Updater.cpp

size_t UpdaterClass::writeStream(Stream &data) {
data.setTimeout(10000);

Links2004 · 2015-12-10T11:35:50Z

I have done some changes to the Update class to give the IP stack more time to handle the data.
can you please retest it.

may help with esp8266#1157

zenmanenergy · 2015-12-10T17:51:07Z

I have tested this new fix. for the most part it is working well. Thank you Links! The is still an issue with ESP.restart() seems to restart in bootloader mode. So it seems to require a manual reboot.

Links2004 · 2015-12-10T18:11:31Z

the restart is a other problem see #1017 , but there is no solution yet.

Links2004 added component: core component: network labels Dec 10, 2015

Links2004 added a commit to Links2004/Arduino that referenced this issue Dec 10, 2015

give the IP stack more time to handle the data

0389657

may help with esp8266#1157

Links2004 mentioned this issue Dec 10, 2015

fix update #1191

Merged

Links2004 closed this as completed Dec 10, 2015

Links2004 mentioned this issue Dec 24, 2015

HTTP OTA (ESP8266httpUpdate) #1177

Closed

cyberman54 mentioned this issue Sep 22, 2018

[Updater.cpp] Increasing stream timeout to avoid read timeout during OTA flash write espressif/arduino-esp32#1886

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

ESP8266httpUpdate.update() incompatible with nginx #1157

ESP8266httpUpdate.update() incompatible with nginx #1157

skandragon commented Dec 6, 2015

skandragon commented Dec 6, 2015

skandragon commented Dec 6, 2015

skandragon commented Dec 6, 2015

igrr commented Dec 6, 2015

Links2004 commented Dec 6, 2015

skandragon commented Dec 6, 2015

zenmanenergy commented Dec 9, 2015

zenmanenergy commented Dec 9, 2015

Links2004 commented Dec 10, 2015

zenmanenergy commented Dec 10, 2015

Links2004 commented Dec 10, 2015

ESP8266httpUpdate.update() incompatible with nginx #1157

ESP8266httpUpdate.update() incompatible with nginx #1157

Comments

skandragon commented Dec 6, 2015

skandragon commented Dec 6, 2015

skandragon commented Dec 6, 2015

skandragon commented Dec 6, 2015

igrr commented Dec 6, 2015

Links2004 commented Dec 6, 2015

skandragon commented Dec 6, 2015

zenmanenergy commented Dec 9, 2015

zenmanenergy commented Dec 9, 2015

Links2004 commented Dec 10, 2015

zenmanenergy commented Dec 10, 2015

Links2004 commented Dec 10, 2015