Skip to content

WiFi UDP misuses new/delete and crashes #7558

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
1 task done
davepl opened this issue Dec 6, 2022 · 19 comments
Closed
1 task done

WiFi UDP misuses new/delete and crashes #7558

davepl opened this issue Dec 6, 2022 · 19 comments
Labels
Area: BT&Wifi BT & Wifi related issues Status: Solved

Comments

@davepl
Copy link

davepl commented Dec 6, 2022

Board

All (eg: HeltecWifiKit32)

Device Description

All

Hardware Configuration

All with WiFi

Version

latest development Release Candidate (RC-X)

IDE Name

VSCode

Operating System

FreeRTOS

Flash frequency

40

PSRAM enabled

no

Upload speed

115200

Description

WiFiUdp::parsePacket makes extensive use of new/delete while checking the return value. It crashes in low memory conditions because new doesn't return null, it throws an exception, which is not caught. They could just specify __nothrow but they don't. Or it could be replaced with malloc/free.

I've fixed it privately, which has corrected this crash for me at least, and would like to fix it in the original.

int WiFiUDP::parsePacket(){
  if(rx_buffer)
    return 0;
  struct sockaddr_in si_other;
  int slen = sizeof(si_other) , len;
  char * buf = new char[1460];
  if(!buf){
    return 0;
  }
  if ((len = recvfrom(udp_server, buf, 1460, MSG_DONTWAIT, (struct sockaddr *) &si_other, (socklen_t *)&slen)) == -1){
    delete[] buf;
    if(errno == EWOULDBLOCK){
      return 0;
    }
    log_e("could not receive data: %d", errno);
    return 0;
  }
  remote_ip = IPAddress(si_other.sin_addr.s_addr);
  remote_port = ntohs(si_other.sin_port);
  if (len > 0) {
    rx_buffer = new cbuf(len);
    rx_buffer->write(buf, len);
  }
  delete[] buf;
  return len;
}

Here’s an example from the log:

  #0  0x40084e09:0x3ffea2e0 in panic_abort at /Users/ficeto/Desktop/ESP32/ESP32S2/esp-idf-public/components/esp_system/panic.c:402
  #1  0x4008f1f1:0x3ffea300 in esp_system_abort at /Users/ficeto/Desktop/ESP32/ESP32S2/esp-idf-public/components/esp_system/esp_system.c:128
  #2  0x40094c71:0x3ffea320 in abort at /Users/ficeto/Desktop/ESP32/ESP32S2/esp-idf-public/components/newlib/abort.c:46
  #3  0x400d89af:0x3ffea3a0 in TerminateHandler() at src/main.cpp:471
  #4  0x40165fab:0x3ffea3d0 in __cxxabiv1::__terminate(void (*)()) at /builds/idf/crosstool-NG/.build/HOST-x86_64-apple-darwin12/xtensa-esp32-elf/src/gcc/libstdc++-v3/libsupc++/[eh_terminate.cc:47](http://eh_terminate.cc:47/)
  #5  0x40166012:0x3ffea3f0 in std::terminate() at /builds/idf/crosstool-NG/.build/HOST-x86_64-apple-darwin12/xtensa-esp32-elf/src/gcc/libstdc++-v3/libsupc++/[eh_terminate.cc:57](http://eh_terminate.cc:57/)
  #6  0x40166e27:0x3ffea410 in __cxa_throw at /builds/idf/crosstool-NG/.build/HOST-x86_64-apple-darwin12/xtensa-esp32-elf/src/gcc/libstdc++-v3/libsupc++/[eh_throw.cc:95](http://eh_throw.cc:95/)
  #7  0x401668ea:0x3ffea430 in operator new(unsigned int) at /builds/idf/crosstool-NG/.build/HOST-x86_64-apple-darwin12/xtensa-esp32-elf/src/gcc/libstdc++-v3/libsupc++/[new_op.cc:54](http://new_op.cc:54/)
  #8  0x40166e81:0x3ffea450 in operator new[](unsigned int) at /builds/idf/crosstool-NG/.build/HOST-x86_64-apple-darwin12/xtensa-esp32-elf/src/gcc/libstdc++-v3/libsupc++/[new_opv.cc:32](http://new_opv.cc:32/)
  #9  0x400e2ba1:0x3ffea470 in WiFiUDP::parsePacket() at /Users/dave/.platformio/packages/framework-arduinoespressif32/libraries/WiFi/src/WiFiUdp.cpp:210
  #10 0x400d7c96:0x3ffea4c0 in NTPTimeClient::UpdateClockFromWeb(WiFiUDP*) at include/ntptimeclient.h:109
  #11 0x40082357:0x3ffea590 in NetworkHandlingLoopEntry(void*) at src/main.cpp:392

Sketch

Should be obvious from inspection

Debug Message

#0  0x40084e09:0x3ffea2e0 in panic_abort at /Users/ficeto/Desktop/ESP32/ESP32S2/esp-idf-public/components/esp_system/panic.c:402
  #1  0x4008f1f1:0x3ffea300 in esp_system_abort at /Users/ficeto/Desktop/ESP32/ESP32S2/esp-idf-public/components/esp_system/esp_system.c:128
  #2  0x40094c71:0x3ffea320 in abort at /Users/ficeto/Desktop/ESP32/ESP32S2/esp-idf-public/components/newlib/abort.c:46
  #3  0x400d89af:0x3ffea3a0 in TerminateHandler() at src/main.cpp:471
  #4  0x40165fab:0x3ffea3d0 in __cxxabiv1::__terminate(void (*)()) at /builds/idf/crosstool-NG/.build/HOST-x86_64-apple-darwin12/xtensa-esp32-elf/src/gcc/libstdc++-v3/libsupc++/eh_terminate.cc:47
  #5  0x40166012:0x3ffea3f0 in std::terminate() at /builds/idf/crosstool-NG/.build/HOST-x86_64-apple-darwin12/xtensa-esp32-elf/src/gcc/libstdc++-v3/libsupc++/eh_terminate.cc:57
  #6  0x40166e27:0x3ffea410 in __cxa_throw at /builds/idf/crosstool-NG/.build/HOST-x86_64-apple-darwin12/xtensa-esp32-elf/src/gcc/libstdc++-v3/libsupc++/eh_throw.cc:95
  #7  0x401668ea:0x3ffea430 in operator new(unsigned int) at /builds/idf/crosstool-NG/.build/HOST-x86_64-apple-darwin12/xtensa-esp32-elf/src/gcc/libstdc++-v3/libsupc++/new_op.cc:54
  #8  0x40166e81:0x3ffea450 in operator new[](unsigned int) at /builds/idf/crosstool-NG/.build/HOST-x86_64-apple-darwin12/xtensa-esp32-elf/src/gcc/libstdc++-v3/libsupc++/new_opv.cc:32
  #9  0x400e2ba1:0x3ffea470 in WiFiUDP::parsePacket() at /Users/dave/.platformio/packages/framework-arduinoespressif32/libraries/WiFi/src/WiFiUdp.cpp:210
  #10 0x400d7c96:0x3ffea4c0 in NTPTimeClient::UpdateClockFromWeb(WiFiUDP*) at include/ntptimeclient.h:109
  #11 0x40082357:0x3ffea590 in NetworkHandlingLoopEntry(void*) at src/main.cpp:392

Other Steps to Reproduce

No response

I have checked existing issues, online documentation and the Troubleshooting Guide

  • I confirm I have checked existing issues, online documentation and Troubleshooting guide.
@davepl davepl added the Status: Awaiting triage Issue is waiting for triage label Dec 6, 2022
@lbernstone
Copy link
Contributor

It is pretty easy to submit a pull request by editing the file at https://github.com/espressif/arduino-esp32/blob/master/libraries/WiFi/src/WiFiUdp.cpp

@mrengineer7777
Copy link
Collaborator

@davepl How did you fix it in your code? I think the correct fix here would be to add "__nothrow" to each "new" call. Also we should check if rx_buffer = new cbuf(len); succeeds. I'm comfortable submitting a PR if want help.

@davepl
Copy link
Author

davepl commented Dec 7, 2022 via email

@VojtechBartoska VojtechBartoska added Area: BT&Wifi BT & Wifi related issues Status: In Progress ⚠️ Issue is in progress and removed Status: Awaiting triage Issue is waiting for triage labels Dec 8, 2022
@VojtechBartoska
Copy link
Contributor

Hello @davepl, thanks for your work on the PR. We really appreciate it!

@BOBAH1248
Copy link

According to c++ specification, the construction like

    SomeClass *pSc = new SomeClass;
    if ( NULL == pSc )
    {
        // cope with error
    }

include new [] will not work, because new operator in c++ throws a std::bad_alloc exception instead of returning NULL.
But in the libraries we can see everywhere such
https://github.com/espressif/arduino-esp32/blob/master/libraries/WiFi/src/WiFiUdp.cpp#L45-L46
https://github.com/espressif/arduino-esp32/blob/master/libraries/WebServer/src/WebServer.cpp#L148-L154
and so

PS: The library.properties say, that such mistakes should be forwarded to authors

author=Ivan Grokhotkov
maintainer=Ivan Grokhtkov <[email protected]>

author=Hristo Gochkov
maintainer=Hristo Gochkov <[email protected]>

@TD-er
Copy link
Contributor

TD-er commented Feb 18, 2023

For this you can simply use:

#include <new>

    SomeClass *pSc = new (std::nothrow) SomeClass;

It can also be used on allocating arrays on the heap: https://en.cppreference.com/w/cpp/memory/new/nothrow

The only disadvantage of this (as far as I know) is that MS Intellisense does not recognize this and thus shows you a red underline in VS-code.
I use it in my code a lot and I see no reason why it couldn't be used in these classes.

@BOBAH1248
Copy link

Agreed with @TD-er - I already patched it in my instance.
But I use PlatformIO for managing of dependencies and afraid that it may update packages (include framework-arduinoespressif32) and roll back my patches.
As result, I may again get buggy code.

@mrengineer7777
Copy link
Collaborator

@davepl @BOBAH1248 @TD-er PR 7847 closes this issue.
The nothrow issue will be tracked in #7849. Please make your comments there. I will be submitting a PR to fix it.

@BOBAH1248
Copy link

BOBAH1248 commented Feb 20, 2023

@mrengineer7777 probably your commit related to ESP8266 repo, but not ESP32
While PR 7847 just replace few new-delete on malloc-free

@BOBAH1248
Copy link

BOBAH1248 commented Feb 20, 2023

By the way, during research, I run into that (on regular project without PSRAM) function call heap_caps_get_info told that I can allocate at least 30kB (non fragmented linear memory), while total fragmented free RAM was 42kB, but new throws exception on allocation of byte array 1.5kB.
So, I cannot say that adding of nothrow is enough solution.

@BOBAH1248
Copy link

BOBAH1248 commented Feb 22, 2023

By the way, during research, I run into that (on regular project without PSRAM) function call heap_caps_get_info told that I can allocate at least 30kB (non fragmented linear memory), while total fragmented free RAM was 42kB, but new throws exception on allocation of byte array 1.5kB. So, I cannot say that adding of nothrow is enough solution.

I found what is wrong with my research.
It turns out that heap_caps_get_info returns different results for different flag combinations. And because malloc (underlayed func of new) without PSRAM calls heap_caps_malloc with MALLOC_CAP_DEFAULT + MALLOC_CAP_INTERNAL, so the heap_caps_get_info should be called with the same flags.
As soon I fixed this call, I got adequate results.

@mrengineer7777
Copy link
Collaborator

mrengineer7777 commented Feb 23, 2023

@BOBAH1248 I'm not familiar with heap caps flags and I don't use PSRAM. I do use ESP.getMaxAllocHeap().
Does this mean that the call should be changed from heap_caps_get_largest_free_block(MALLOC_CAP_INTERNAL); to heap_caps_get_largest_free_block(MALLOC_CAP_DEFAULT || MALLOC_CAP_INTERNAL); ?

@BOBAH1248
Copy link

BOBAH1248 commented Feb 24, 2023

@mrengineer7777 during research I collected only general information and found the following:

  • in the esp32 not only one heap
  • without PSRAM, they at least 3 - compatible with mapping to IRAM, DMA compatible and "generic"
  • if I'm correct, all those heaps have MALLOC_CAP_INTERNAL flag
  • the heap functions check all provided CAP-flags
  • if I'm correct, the most largest heap has MALLOC_CAP_DEFAULT flag
  • additionally: the heaps may be overlapped (i.e. one heap may be part of another bigger one, so the total capacity of heaps may be bigger than memory size that they used)

Summarizing:
Call with heap_caps_get_largest_free_block(MALLOC_CAP_DEFAULT | MALLOC_CAP_INTERNAL) (make attention: there not ||, because it will get boolean result, but not flag combination) in the most cases will return the same result as MALLOC_CAP_INTERNAL.
But if you check with MALLOC_CAP_INTERNAL flag, then try call malloc (with MALLOC_CAP_DEFAULT | MALLOC_CAP_INTERNAL), it may fail if the heap with this free block is incompatible with MALLOC_CAP_DEFAULT flag (and new will crash app).
PS: As I wrote before, the heap with MALLOC_CAP_DEFAULT flag is the biggest (if I'm right), so usually malloc will succeed.

By the way, during tests I called heap_caps_get_largest_free_block(MALLOC_CAP_INTERNAL) that return about 40kB. Then I called heap_caps_malloc(32*1024, MALLOC_CAP_INTERNAL) but it failure, while heap_caps_malloc(32*1024, MALLOC_CAP_DEFAULT | MALLOC_CAP_INTERNAL) succeed.
I did not make deep research for this case, but put it aside in my head, which is still more complicated than I discovered.

@mrengineer7777
Copy link
Collaborator

@BOBAH1248 That's helpful. Thank you. Good catch on the single pipe |. After some research I see the reality is quite complicated.

ESP.getMaxAllocHeap() likely doesn't work for SPIRAM/PSRAM. For that probably need ESP.getMinFreePsram().

Must read: https://docs.espressif.com/projects/esp-idf/en/latest/esp32/api-reference/system/mem_alloc.html.
All the magic happens in https://github.com/espressif/esp-idf/blob/master/components/heap/heap_caps.c

There may be unused memory in IRAM and D/IRAM [caution requires 32 bit alignment].

Helpful:
https://github.com/espressif/esp-idf/blob/master/components/heap/heap_caps.c#L211

https://github.com/espressif/arduino-esp32/tree/master/tools/sdk/esp32/include/heap/include
https://github.com/espressif/arduino-esp32/blob/master/tools/sdk/esp32/include/heap/include/esp_heap_caps.h#L59

 * Equivalent semantics to libc malloc(), for capability-aware memory.
void *heap_caps_malloc(size_t size, uint32_t caps);

 *  In IDF, ``free(p)`` is equivalent to ``heap_caps_free(p)``.
void heap_caps_free( void *ptr);

https://github.com/espressif/arduino-esp32/blob/master/tools/sdk/esp32/include/freertos/port/xtensa/include/freertos/portmacro.h#L432

* @note Because the ROM routines don't necessarily handle a stack in external RAM correctly, we force the stack
 * memory to always be internal.
#define portTcbMemoryCaps               (MALLOC_CAP_INTERNAL|MALLOC_CAP_8BIT)
#define portStackMemoryCaps             (MALLOC_CAP_INTERNAL|MALLOC_CAP_8BIT)
#define pvPortMallocTcbMem(size)        heap_caps_malloc(size, portTcbMemoryCaps)
#define pvPortMallocStackMem(size)      heap_caps_malloc(size, portStackMemoryCaps)

@mrengineer7777
Copy link
Collaborator

I found what is wrong with my research.
It turns out that heap_caps_get_info returns different results for different flag combinations. And because malloc (underlayed func of new) without PSRAM calls heap_caps_malloc with MALLOC_CAP_DEFAULT + MALLOC_CAP_INTERNAL, so the heap_caps_get_info should be called with the same flags.
As soon I fixed this call, I got adequate results.

Can you share a sample project that demonstrates the issue?

@mrengineer7777
Copy link
Collaborator

By the way, during tests I called heap_caps_get_largest_free_block(MALLOC_CAP_INTERNAL) that return about 40kB. Then I called heap_caps_malloc(32*1024, MALLOC_CAP_INTERNAL) but it failure, while heap_caps_malloc(32*1024, MALLOC_CAP_DEFAULT | MALLOC_CAP_INTERNAL) succeed.

I recommend calling malloc() instead of heap_caps_malloc().

@BOBAH1248
Copy link

BOBAH1248 commented Feb 24, 2023

Can you share a sample project that demonstrates the issue?

Sorry, I don't keep these experiments on HDD, just preconditions with results in my head only.

I recommend calling malloc() instead of heap_caps_malloc().

It was test and for clarifications I use the underlayed call of malloc.

Due to I write on c++, I use c++ constructions, like std::vector<T> + .resize() or std::unique_ptr<T[]> + new T[...].
But never malloc as is (except cases, when underlayed c-function requires it or with specific CAPs, like DMA or IRAM for processing in IRQ)

@TD-er
Copy link
Contributor

TD-er commented Feb 24, 2023

just preconditions with results in my head only.

I know exactly what you mean... I also often experience runtime failures when trying to allocate large chunks of memory in my brain :)

@mrengineer7777
Copy link
Collaborator

Fixed by #8065. Will be in release 2.0.8.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Area: BT&Wifi BT & Wifi related issues Status: Solved
Projects
None yet
Development

No branches or pull requests

6 participants