
ESP.restart() or auto-reboot after firmware update causes boot loop #7306


Closed
CRCinAU opened this issue May 16, 2020 · 18 comments

Comments

@CRCinAU

CRCinAU commented May 16, 2020

Platform

  • Hardware: D1 Mini
  • Core Version: SDK:2.2.2-dev(38a443e)/Core:3.0.0-dev=30000000/lwIP:IPv6+STABLE-2_1_2_RELEASE/glue:1.2-30-g92add50/BearSSL:5c771be
  • Development Env: PlatformIO
  • Operating System: Fedora

Problem Description

When calling ESP.restart() or when rebooting after uploading firmware, the D1 Mini goes into a reboot loop. Output from the serial console below shows a successful boot on power on, then the first reboot after calling ESP.restart(), then the boot loop.

*WM: AutoConnect
*WM: Connecting as wifi client...
*WM: Status:
*WM: 6
*WM: Using last saved values, should be faster
*WM: Connection result: 
*WM: 3
*WM: IP Address:
*WM: 10.1.1.11
Autoupdate enabled at compile time...
*WM: freeing allocated params!
12991: Checking for update...
 - No Update Available.

13939: Finished auto-update check...

 ets Jan  8 2013,rst cause:2, boot mode:(3,7)

load 0x4010f000, len 3456, room 16 
tail 0
chksum 0x84
csum 0x84
vc5f60e31
~ld

 ets Jan  8 2013,rst cause:2, boot mode:(3,7)

load 0x4010f000, len 3456, room 16 
tail 0
chksum 0x84
csum 0x84
vc5f60e31
~ld

 ets Jan  8 2013,rst cause:2, boot mode:(3,7)

load 0x4010f000, len 3456, room 16 
tail 0
chksum 0x84
csum 0x84
vc5f60e31
~ld

This continues until the reset button is hit on the device and a normal boot occurs.

EDIT: To add context, I did a pio update which pulled down the latest frameworks, then rebuilt all my existing projects and flashed them OTA. After this, all of the D1 Minis that were updated showed this problem.

Current versions:

pio update
Updating tool-scons                      @ 3.30102.0      [Up-to-date]

Platform Manager
================
Platform Espressif 8266
--------
Updating espressif8266                   @ 2.5.1          [Up-to-date]
Updating toolchain-xtensa                @ 2.40802.200502 [Up-to-date]
Updating framework-arduinoespressif8266  @ 3.20701.0      [Up-to-date]
Updating tool-esptool                    @ 1.413.0        [Up-to-date]
Updating tool-esptoolpy                  @ 1.20800.0      [Up-to-date]


Library Manager
===============
Library Storage: /home/netwiz/Documents/ESP8266/lib
Updating ArduinoJson                     @ 6.15.2         [Up-to-date]
Updating DHTStable                       @ 0.2.4          [Up-to-date]
Updating ESP8266-ping                    @ 2.0.1          [Up-to-date]
Updating FastLED_DMA                     @ 0.0.0          [Detached]
Updating IRremoteESP8266                 @ 2.7.6          [Up-to-date]
Updating PubSubClient                    @ 2.7            [Up-to-date]
Updating SimpleTimer                     @ b30890b8f7     [Up-to-date]
Updating SparkFun BME280                 @ 2.0.8          [Up-to-date]
Updating WifiManager                     @ 0.15.0         [Up-to-date]

However, I use this in platformio.ini to pull in the latest framework from this repository:

framework = arduino
platform = espressif8266
platform_packages =
    framework-arduinoespressif8266 @ https://github.com/esp8266/Arduino.git
@CRCinAU
Author

CRCinAU commented May 16, 2020

Annoyingly, if I set -DDEBUG_ESP_CORE, the problem goes away....
EDIT: Spoke too soon - I managed to capture this problem with DEBUG_ESP_CORE set....
EDIT2: And I turned logging off by mistake and missed it, now I can't reproduce :(

I am currently building with the following:

build_flags =
  -DDEBUG_ESP_PORT=Serial
;  -DDEBUG_ESP_SSL
;  -DDEBUG_ESP_TLS_MEM
;  -DDEBUG_ESP_HTTP_CLIENT
;  -DDEBUG_ESP_HTTP_SERVER
  -DDEBUG_ESP_CORE
;  -DDEBUG_ESP_WIFI
;  -DDEBUG_ESP_HTTP_UPDATE
;  -DDEBUG_ESP_UPDATER
;  -DDEBUG_ESP_OTA
;  -D PIO_FRAMEWORK_ARDUINO_LWIP2_HIGHER_BANDWIDTH
  -DPIO_FRAMEWORK_ARDUINO_LWIP2_IPV6_HIGHER_BANDWIDTH
;  -DNDEBUG

Normally, I would build with:

build_flags =
;  -DDEBUG_ESP_PORT=Serial
;  -DDEBUG_ESP_SSL
;  -DDEBUG_ESP_TLS_MEM
;  -DDEBUG_ESP_HTTP_CLIENT
;  -DDEBUG_ESP_HTTP_SERVER
;  -DDEBUG_ESP_CORE
;  -DDEBUG_ESP_WIFI
;  -DDEBUG_ESP_HTTP_UPDATE
;  -DDEBUG_ESP_UPDATER
;  -DDEBUG_ESP_OTA
;  -D PIO_FRAMEWORK_ARDUINO_LWIP2_HIGHER_BANDWIDTH
  -DPIO_FRAMEWORK_ARDUINO_LWIP2_IPV6_HIGHER_BANDWIDTH
  -DNDEBUG

@CRCinAU
Author

CRCinAU commented May 16, 2020

I'm still seeing this on two other D1 Minis that are installed in hard-to-reach places. I can see the onboard LED blinking as each one goes through the reset loop, and the only thing I can easily get to is the power, to turn it off and on again. After I do this, the normal code launches fine until I reboot or send a firmware to it. At that point, we enter the reboot loop again.

Any suggestions on this?

@d-a-v
Collaborator

d-a-v commented May 16, 2020

What are the steps to reproduce?
Is it simply doing an OTA and then calling ESP.restart() (with and without DEBUG_ESP_CORE)?

@CRCinAU
Author

CRCinAU commented May 16, 2020 via email

@devyte
Collaborator

devyte commented May 16, 2020

@CRCinAU you forgot an MCVE to reproduce. Please remember that it must not include 3rd party libs.

@devyte added the "waiting for feedback" label May 16, 2020
@CRCinAU
Author

CRCinAU commented May 17, 2020

Well, here's the annoying part - if I upload my 'basic web update' program that I use as a bootloader, I can flash that and reboot it as many times as I like... That code is at: https://git.crc.id.au/netwiz/ESP8266_Code/src/branch/master/BasicWebUpdate/src/BasicWebUpdate.ino

If I flash the "OutdoorMonitor" code from the same git, then it fails every reboot - even if that's just to upload the new binary via the /update URL on the BasicWebUpdate flash.
Source:
https://git.crc.id.au/netwiz/ESP8266_Code/src/branch/master/OutsideMonitor/src/GPIO_MQTT.ino

The "GarageDoor" code from here also fails to reboot:
https://git.crc.id.au/netwiz/ESP8266_Code/src/branch/master/GarageDoor/src/GarageDoor.ino

These were both working fine before a pio update :(

@CRCinAU
Author

CRCinAU commented May 17, 2020

To try and rule out any issue, I wiped the entire flash by using esptool erase_flash, then wrote a 4Mb blank file, then sent the code to the device again.

I connected to the WifiManager HostAP properly and joined the device to my wifi network fine, but the first reboot sent it into the boot loop again.

Have attached the confirmed failing binary in case that helps.

To confirm, the full source code for this binary is here:
https://git.crc.id.au/netwiz/ESP8266_Code/src/branch/master/OutsideMonitor/src/GPIO_MQTT.ino

It seems that the call to ESP.restart() works, but after that point the device stays in a never-ending boot loop until you power-cycle it or reflash it over USB.

Edit: removed attachment.

@devyte
Collaborator

devyte commented May 17, 2020

Please don't attach binary files.
So: the basic web update example works fine, but the sketch with your code and 3rd party libs doesn't?

@CRCinAU
Author

CRCinAU commented May 17, 2020

Correct. If I flashed the basic web updater, it can reboot via the /reboot web address, and via the /update web address after a firmware load happens. I couldn't make this go into the boot loop.

As soon as I sent the other binary to the /update URL, it went into a boot loop.

@devyte
Collaborator

devyte commented May 17, 2020

Then something in your code or in the 3rd party libs is causing the problem.
If you're in a reboot loop before setup is even called, I suggest looking at the constructors of globally instanced objects. For example, a global object constructor should not access other global object instances, because the order of construction of global objects is not deterministic between translation units.
Or something else in the constructors.
The only other thing that comes to mind is that there's something wrong with your binary and/or eboot, as built by your build system. I suggest rebuilding from the Arduino IDE.
Closing, as this is not a core issue.
If you do reduce the problem to a MCVE that uses only core code, please open a new issue, follow the template instructions, add your code and details, and reference this issue.
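
The global-constructor ordering hazard described above is commonly avoided with the construct-on-first-use idiom. A minimal sketch in plain C++ follows. The `Settings` and `Client` types are hypothetical stand-ins and are not taken from the sketches linked in this issue.

```cpp
#include <string>

// Hypothetical settings holder; stands in for any global that other objects depend on.
struct Settings {
    std::string hostname;
    Settings() : hostname("d1-mini") {}
};

// Construct-on-first-use: the function-local static is initialised the first
// time settings() is called, so callers never see an unconstructed object,
// regardless of which translation unit they live in.
Settings& settings() {
    static Settings instance;
    return instance;
}

// Hypothetical client whose constructor needs Settings.
struct Client {
    std::string id;
    // Safe even when a Client is itself a global: it goes through the accessor
    // instead of touching another global object directly.
    Client() : id("client-" + settings().hostname) {}
};

// A global instance, constructed before main() (before setup() in an Arduino sketch).
Client globalClient;
```

Had `Client` read a plain global `Settings` object defined in a different source file, the result would depend on the order in which the two globals were constructed. The accessor removes that dependency.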

@devyte closed this as completed May 17, 2020
@CRCinAU
Author

CRCinAU commented May 17, 2020

Hmmm - at the moment, the only thing I can see is the upgrades done via pio update which are:

Updating espressif8266                   @ 2.4.0          [2.5.1]
Uninstalling espressif8266 @ 2.4.0: 	[OK]
PlatformManager: Installing espressif8266 @ 2.5.1
espressif8266 @ 2.5.1 has been successfully installed!
Updating toolchain-xtensa                @ 2.40802.191122 [2.40802.200502]
Uninstalling toolchain-xtensa @ 2.40802.191122: 	[OK]
PackageManager: Installing toolchain-xtensa @ 2.40802.200502

The use of the custom framework path to this git should cause the espressif8266 part to be ignored - leaving only the toolchain-xtensa as a possible problem?

As I have a version number on this, I'll try to downgrade it somehow and see what happens...

@CRCinAU
Author

CRCinAU commented May 18, 2020

Ok - I've hit something that I can reproduce....

If I comment out the following lines in my platformio.ini, then the units reboot correctly (after the first boot with default speeds):

;board_build.f_cpu = 160000000L
;board_build.f_flash = 80000000L

I note that I've been using these lines in my platformio.ini for a long time without issue - and even the BasicWebUpdate shown earlier uses these - but it works correctly...

@devyte - Does this ring any bells as to why this would suddenly start causing issues?

@devyte
Collaborator

devyte commented May 18, 2020

Not really.
What happens if you build from the Arduino IDE and choose those params?

@devyte removed the "waiting for feedback" label May 18, 2020
@CRCinAU
Author

CRCinAU commented May 18, 2020

I don't have the Arduino IDE installed, so I'll have to look at doing that from scratch...

On a similar topic, if I call system_update_cpu_freq(160); in setup(), shouldn't this set the CPU speed to 160 MHz? I can see that even if I call this, ESP.getCpuFreqMHz() still returns 80. I'm not sure if this is a problem, or if ESP.getCpuFreqMHz() only returns the boot speed.

Right now, I'm trying to either prove or disprove that building the code to run at 160 MHz is what causes the boot loop...

@devyte
Collaborator

devyte commented May 18, 2020

The cpu freq changes whether you build with 80 or 160. It depends on several things. I'm not sure, but I don't think that changing it should cause a crash.
However, a wrong flash speed can, as can a wrong flash mode.

@CRCinAU
Author

CRCinAU commented May 18, 2020

Here's my current data set:

Build at 80 MHz (ESP.getCpuFreqMHz() shows 80):
Flash -> up ok -> reboot -> up ok

Build at 160 MHz (ESP.getCpuFreqMHz() shows 160):
Flash -> up ok -> reboot -> up ok -> reboot -> up ok -> flash 160 MHz build -> no boot
Power cycle -> up ok -> reboot -> no boot
Power cycle -> up ok -> flash 80 MHz build -> no boot
Power cycle -> up ok -> reboot -> up ok -> reboot -> up ok
Flash 80 MHz build -> up ok -> reboot -> up ok

@rrelande

Very interesting, as I have a similar issue and was not able to reproduce it reliably or find a way to investigate it.

@rrelande

OTA with an 80 MHz CPU and 40 MHz flash is OK; however, I cannot explain why.
