-
Notifications
You must be signed in to change notification settings - Fork 13.3k
g_cont to unused space in SYS, gives extra 4K free #4553
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Use a linker option to hardcode the address of the g_cont context variable to in a region of RAM allocated for SYS use but which is not presently used. Gives an additional 4K+ of heap to all apps.
I'm sure there will be lively debate about this one since it's "borrowing" from a region the OS has reserved for itself, but which experimentally has been seen to be unused. Wanted to get it in before I forgot as it's so simple a change. We can probably extend the stack to 5K for "free" with this as well since there's something like 5.2KB of unused space in the region. |
We could setup a debug mode in which we install a signature in it and check it every now and then ? |
I think this one needs a bit more delicate considerations and/or configurations. For example: Arduino/libraries/Ticker/Ticker.cpp Line 56 in 559cb35
The callback supplied by user will be executed in the timer callback, which will use the sys context. |
@Adam5Wu you are right. However, declaring large objects like a 2KB array on the stack is against current programming guidelines. |
@devyte HW breakpoint sound like a great idea. I wasn't aware of this. :) |
My current idea is, on yield, mark the g_cont range as protected. If any bytes were to be modified, a hardware exception will be triggered, and we can install a handler, which will remap (or copy) the data to a different memory region. And on schedule, we map (or copy) the g_cont back if it was remapped. Remapping is only done on a case-by-case manner, that is, if sys cont did not overflow during a context switch, then no remapping is done so no performance penalty. In this way, when sys context overflow happens, we can both get a notification, and (likely) keep everything continue to work, with a slight loss of performance. Need to check the feasibility, though. |
Not sure how that's be possible without preallocating the move-to space and negating the purpose. From an exception handler you can't mess with the heap, no? Maybe a simpler check to start:
|
My first thought wqs the same as @earlephilhower 's. Maybe we're misunderstanding your idea @Adam5Wu ? @earlephilhower that's exactly what I said in my previous comment: move the hw breakpoint around on context switch. It does seem viable, right? |
No, I think you guys are right in that we should start with a smaller and achievable goal. But my only concern is that, we don't really know for sure that the sys context memory region is just a stack. That is, overflow can only occur in one direction. Either official confirmation from Espressif can clear up the understanding; or to be safe we need a region based hardware trap, not just a point based... |
@devyte Yup, my bad. We're on the same wavelength here. And, of course, if the exception can push values, we need to put the high water mark before the end/beginning to ensure that there's enough space to safely do whatever pushes needed in the exception handler + any functions it calls. The topic at https://bbs.espressif.com/viewtopic.php?t=8879 has some experimental results on the stack size. |
I bet this will also break any GDB usage since there's only the one BP available in the core and my guess is GDB makes extensive use of it. |
About gdb, if the breakpoint can be set dynamically, it'd be just a matteer of unsetting, or re-setting it, from inside our gdb stubs, when gdb connects |
I can't figure out how to install my own exception vector and have it jump to the standard NMI one, but the following code can set a data BP on an arbitrary address. For stack, 0x3f should probably be changed to be a 16-byte window match).
|
If you don't mind holding this until Monday, I'm going to check whether the region is really only used as stack. |
That would be great, thanks! The breakpoint stuff, though, can be useful even w/o this to report corruption when it happens instead of when the values destroyed in the heap are used. |
After looking at the code, my recommendation is: instead of hardcoding the address, allocate the continuation structure on the sys stack. This would mean that g_cont would become a pointer, rather than a value, but the impact seems to be limited to just two files. To place something on the sys stack, we can wrap I.e. option 1: // Add linker flag, -Wl,-wrap,call_user_start
extern void __real_call_user_start();
extern void ICACHE_RAM_ATTR __wrap_call_user_start();
void ICACHE_RAM_ATTR __wrap_call_user_start()
{
cont_t cont;
g_cont = &cont;
__real_call_user_start();
}
Option 2: extern void call_user_start();
void _entry()
{
// don't do anything fancy here: BSS is not zeroed out, and exception vectors still point to the ROM code
cont_t cont;
g_cont = &cont;
call_user_start();
}
// In linker flags, replace -u call_user_start with -u _entry
// In eagle.app.v6.common.ld, replace ENTRY(call_user_start) with ENTRY(_entry) An additional watchpoint-based stack overflow check (re-configured on context switch) would also be a nice improvement, but can be done in a separate PR. |
@igrr I can do that simply enough, but are you sure that there's no return from call_user_start (so the stack local passed in isn't ever deallocated? I don't know enough about the startup to really trace it myself. |
What's the status here? Do we understand what is to be done? |
@igrr I've not had any success with the wrapped functions for allocating g_cont on the SYS stack. Nothing but WDT resets without any debugging info. Are you sure the stack pointer register is even valid and pointing to the right spot at that point in device lifetime, and that it's not reset by the ROM/SDK code to the base later on (and therefore overwriting cont)? Thx! |
Okay, i'll try to allocate some cycles and see what's wrong there. |
@earlephilhower af09cb5 implements my proposal. |
Thanks, closing this one. |
Found an issue with moving g_cont to sys stack -- SDK WPS function is broken. It is because WPS involves pubkey crypto, which seems to rely on a sizable chunk of stack, which is taken away by g_cont. This results in using WPS always lead to wdt reset. Other SDK functions that involves heavy crypto might also be affected (airkiss? smartconfig? mesh?) |
@igrr @Adam5Wu @earlephilhower edit: cont_t is not simply a struct and really contains the stack itself. I did not realized that before, sorry for the noise. |
Use a linker option to hardcode the address of the g_cont context
variable to in a region of RAM allocated for SYS use but which is not
presently used by the SDK.
Gives an additional 4144 bytes of heap to all applications.