-
Notifications
You must be signed in to change notification settings - Fork 13.3k
is free() async in ESP8266 Arduino core? #7089
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Comments
No, This is not the forum to debug this, instead please use https://esp8266.com or https://gitter.im/esp8266/Arduino. |
Thanks for your response. I'm afraid this is related to LWIP2 as the issue does not occur when I switch to LWIP1 (just tested that, because in one of the stack traces which I decoded there was only LWIP stuff going on, no sign of any of my code anywhere). I'm continuously monitoring free heap and the number of queued tasks, around 30k of heap was free all the time. Currently, with LWIP1, im flooding the ESP with 8 WebSocket requests every 25ms(!) without any crashes. Heap decreases to about 26k and recovers to 30k as soon as I close the WebSocket client (which is a JavaScript client in the browser). Max. 1-2 tasks backlog builds up so the ESP can keep up processing the queued tasks without any issues. With LWIP2, the ESP crashes within seconds. The stack trace differs from one crash to the other, as mentioned most of the time it is exception 3 but sometimes it is exception 28 but only when the thing locked up and the watchdog had to kick in and reset it. For me the solution for now is to use "PIO_FRAMEWORK_ARDUINO_LWIP_HIGHER_BANDWIDTH" instead of "PIO_FRAMEWORK_ARDUINO_LWIP2_LOW_MEMORY" but I'd be glad to help tracking this issue down, if you are interested. Let me know if you think that this could still be related with my code, even if LWIP1 fixes it for me. In that case I'd agree that it wouldn't make any sense to discuss it further at this place. |
@mamama1
to:
That way, |
@d-a-v in the asyncwebserver callback (which is isr context, I believe?) I queue the task which is then processed in the processTasks() function like mentioned before... Am I missing something? |
I think you got the interest of a scheduled function: defer the execution of a particular function so it is executed in the main context. |
I will have to look up what Schedule.h and a scheduled function actually is, never heard about that. Also, I'm not sure I 100% understand what you mean (I'm not english native). My callback looks like this (sorry, forgot to post it earlier). so you saying using a scheduled function for WebSocketServer::AddTask(eWebSocketCommand::GET_WIFI, client->id(), TempData, len); would make a difference? Within AddTask only the malloc happens...
Thanks for helping! |
like Also, I don't think lwip2 is the real issue (I am not the most credible one on that).
|
I'll try that, thanks! In general I'm aware of the fact that one shouldn't do any lengthy tasks in the system context. I was just completely stunned because I had the code like it is for months, while working on other parts of the firmware and I had never ever issues with all the JSON parse stuff (I'm aware that the JSON stuff takes quite some time), even when completely flooding the device with requests. The issues began as soon as I started to not complete the whole task directly in the callback (oddly that worked perfectly fine but was ugly and kind of spaghetti code) but when I started to queue a task using AddTask where a malloc() happens and later during processTasks() where the free() happens. So no manipulation within AddTask() (which runs in system context, to my understanding) changed anything regarding the crashes. Only manipulation within processTasks() changed whether the ESP would crash or not. I'll definitely shorten down the stuff which runs during the callback, I'll also try to schedule it like you showed me, that's for sure. Already just because of the fact that it is good practice anyway and should have be done from the beginning. Still I can't wrap my head around why the thing stopped crashing when I stopped freeing the buffer in the user context... (also, as I said, it didn't crash always but only sometimes, so it was not a "hard" error). edit: I've heard for the first time about scheduled functions, I guess documentation is really sparse on that.. |
Sounds a bit like something that's still processing your data did not make a copy of it, but just stored the pointer to it. |
@mamama1 You might want to look at my H4 library which does pretty much what your code does (i.e. its a main loop scheduler / timer) , but for all async tasks, from anywhere, not just Asyncwebserver. It's also stdlib-based and has proven pointer- and crash-free for well over a year now: https://github.com/philbowles/h4plugins |
Hi,
i'm wondering whether free() is async in the Arduino core of the ESP8266 and if it may encounter issues in my special case. As my code consists of tons of classes, cpp and h files, I'll try to first describe the issue here, maybe the root cause will seem completely obvious to you core developers.
If we need to, we can always dig further down.
So I'm having a globally declared vector called "tasks" of a custom struct called "task" of which one of the items is a char *ptr.
The vector is basically a list of tasks which are queued in a callback function of the AsyncWebserver made by @me-no-dev and which then are processed by a processTasks() function which is called in the main loop().
To create a new task, I call my createTask function where I create a task newTask (in the stack) struct element and the I do a malloc for the len of the incoming data buffer from the AsyncWebServer (about 200 bytes, depends on incoming data). The malloc pointer is saved to the char *ptr of the above mentioned, new struct element called "newTask" (which lives in the stack).
When a task is finished during running processTasks(), I do free(tasks.back().ptr) and then tasks.pop_back();
During the free() operation, I sometimes (quite often actually, but absolutely not always) various exceptions. These include exception 3, 9, 28 and/or 29.
The fact that the exception does not occur always and that it depends (often) on how fast I spam the ESP8266 with new tasks, makes me think that the free() might happen later, after the task has been popped from the vector.
So I tried the following, which didn't help either:
char *savePtr;
savePtr = tasks.back().ptr;
tasks.pop_back();
free(savePtr);
When I stop using free, and therefore create a memory leak, the exceptions/crashes stop immediately. I'm guessing that there is happening a free() operation to a place in memory where free() is not possible because either the pointer gets somehow invalid or maybe because there is a bug/design flaw in the malloc/free routines of ESP8266?
I noticed that @me-no-dev implemented a custom LinkedList for a similar task like I have but my brain was about to explode when I tried to wrap it around that sick stuff :-D I suppose there might have been a very good reason for @me-no-dev to implement that LinkedList to accomplish this very similar to mine task (add tasks to the back, pop them from the front). I must say I was using a deque before the vector but behaviour was exactly the same.
Arduino Core Version is the most current (2.6.3), IDE is PlatformIO. I'm using LWIP2_LOW_MEMORY, no other build parameters.
The trace using the exception decoder always pointed around the free() operation. And as mentioned, the ESP stops crashing as soon as I stop freeing the buffer.
Some snippets - this is the struct, I'm talking about:
This is the addTask function:
This ist the processTasks where the malloced buffer gets freed again:
Any hints?
Thanks!
The text was updated successfully, but these errors were encountered: