Skip to content

Handle breakpoints via postmortem when GDB is not running. #7704

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 23 commits into from
Closed
Show file tree
Hide file tree
Changes from 1 commit
Commits
Show all changes
23 commits
Select commit Hold shift + click to select a range
026c078
Handle breakpoints via postmortem when GDB is not running.
mhightower83 Nov 12, 2020
fb3fe9d
comment corrections
mhightower83 Nov 13, 2020
02cab45
Merge branch 'master' into pr-bp-4-postmortem
earlephilhower Nov 14, 2020
c58baaa
Reviewer requested changes.
mhightower83 Nov 16, 2020
c47d7a0
Merge branch 'pr-bp-4-postmortem' of github.com:mhightower83/Arduino …
mhightower83 Nov 16, 2020
dd80b84
Corrected build error.
mhightower83 Nov 16, 2020
44e60df
Merge branch 'master' into pr-bp-4-postmortem
mhightower83 Nov 16, 2020
c8d0da1
Moved definitions around to aid in future merges where
mhightower83 Nov 20, 2020
b52648c
Correction for `using _xtos_handler =` in .c files
mhightower83 Nov 20, 2020
dc484cf
Improved comments.
mhightower83 Nov 22, 2020
01cc415
Cleanup comment.
mhightower83 Nov 29, 2020
76f24ec
Merge branch 'master' into pr-bp-4-postmortem
mhightower83 Dec 3, 2020
1950c78
Merge branch 'master' into pr-bp-4-postmortem
mhightower83 Dec 6, 2020
60b658b
Merge branch 'master' into pr-bp-4-postmortem
mhightower83 Dec 11, 2020
79c0d01
replace typdef with using
mhightower83 Dec 11, 2020
b0f7e4f
Merge branch 'master' into pr-bp-4-postmortem
mhightower83 Dec 11, 2020
de1249f
Merge branch 'master' into pr-bp-4-postmortem
mhightower83 Dec 15, 2020
ba887d5
Merge branch 'master' into pr-bp-4-postmortem
mhightower83 Jan 14, 2021
c91113d
Merge branch 'master' into pr-bp-4-postmortem
mhightower83 Mar 14, 2021
2d67987
Merge branch 'master' into pr-bp-4-postmortem
mhightower83 Apr 4, 2021
0404ccd
Merge branch 'master' into pr-bp-4-postmortem
mhightower83 Jul 1, 2021
1623ffc
Merge branch 'master' into pr-bp-4-postmortem
mhightower83 Jul 28, 2021
229e2e5
Merge branch 'master' into pr-bp-4-postmortem
mhightower83 Dec 8, 2021
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
5 changes: 5 additions & 0 deletions cores/esp8266/core_esp8266_main.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -327,6 +327,7 @@ extern "C" void preinit (void)
{
/* do nothing by default */
}
extern "C" void postmortem_init(void);

extern "C" void user_init(void) {
struct rst_info *rtc_info_ptr = system_get_rst_info();
Expand All @@ -342,6 +343,10 @@ extern "C" void user_init(void) {

cont_init(g_pcont);

#if defined(DEBUG_ESP_PORT) || defined(DEBUG_ESP_EXCEPTIONS)
postmortem_init();
#endif

preinit(); // Prior to C++ Dynamic Init (not related to above init() ). Meant to be user redefinable.

ets_task(loop_task,
Expand Down
285 changes: 280 additions & 5 deletions cores/esp8266/core_esp8266_postmortem.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -33,9 +33,32 @@
#include "gdb_hooks.h"
#include "StackThunk.h"
#include "coredecls.h"
#include "esp8266_undocumented.h"
#include "core_esp8266_features.h"

extern "C" {

#if defined(DEBUG_ESP_PORT) || defined(DEBUG_ESP_EXCEPTIONS)
extern void _DebugExceptionVector(void);
extern void _KernelExceptionVector(void);
extern void _DoubleExceptionVector(void);

// Context structure used by Debug, Double, Kernel Exception and unhandled
// exception stubs to build a stack frame with potentially useful addresses for
// EXP Exception Decoder's use. Also used to save some special registers
// that were overwritten by using the `ill` instruction.
struct EXC_CONTEXT{
uint32_t ps; // +0
uint32_t pad1; // +4
uint32_t pad2; // +8
uint32_t excsave2; // +12
uint32_t exccause; // +16
uint32_t epc1; // +20
uint32_t a1; // +24
uint32_t excsave1; // +28, a0 at the time of the event
};
#endif

// These will be pointers to PROGMEM const strings
static const char* s_panic_file = 0;
static int s_panic_line = 0;
Expand Down Expand Up @@ -128,6 +151,67 @@ void __wrap_system_restart_local() {

ets_install_putc1(&uart_write_char_d);

/*
Check for Exceptions mapped into User Exception.
*/
if (rst_info.reason == REASON_EXCEPTION_RST && rst_info.exccause == 0) {
#if defined(DEBUG_ESP_PORT) || defined(DEBUG_ESP_EXCEPTIONS)
if ((rst_info.epc1 - 11u) == (uint32_t)_DoubleExceptionVector) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why 11? what does that magic number mean?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That is the offset into the _DoubleExceptionVector where the ill instruction was executed. I found a better way to deal with that. I'll push later,

struct EXC_CONTEXT *exc_context;
asm volatile("rsr.excsave2 %0\n\t" : "=a" (exc_context) :: "memory");
// Use Special Register values from before the `ill` instruction.
// Note, for a double exception the exccause is that of the second exception.
// epc1 is still the address of the 1st exception, its cause is now unknown.
rst_info.exccause = exc_context->exccause;
rst_info.epc1 = exc_context->epc1;
// Double Exception
if (rst_info.depc) {
ets_printf_P(PSTR("\nDouble Exception"));
// The current "ESP Exception Decoder" does not process the value in depc
// Set a fake epc1 so the decoder identifies the line of the Double Exception
rst_info.epc1 = rst_info.depc;
// The "ESP Exception Decoder" is also helped by updating the
// context saved on the stack.
exc_context->epc1 = rst_info.depc;
}
// BP - Debug Exception
else if (rst_info.epc2) {
if (rst_info.epc2 == 0x4000dc4bu) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is this magic address? What is there?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That is the address in the Boot ROM for the breakpoint instruction that executes for an unhandled exception. I'll replace with a constexpr.

// Unhandled Exceptions land here. 'break 1, 1' in _xtos_unhandled_exception
if (20u == exc_context->exccause) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why 20? what does the magic number mean?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

EXCCAUSE value 20, InstFetchProhibitedCause - An instruction fetch referenced a page mapped with an attribute that does not permit instruction fetch.

I'll add a constexpr for this in the next push.

// This could be the result from a jump or call; a call
// is more likely. eg. calling a null callback function
// For the call case, a0 is likely the return address.
rst_info.epc1 = exc_context->excsave1;
}
ets_printf_P(PSTR("\nXTOS Unhandled exception - BP"));
}
else if (rst_info.epc2 == 0x4000dc3cu) {
// Unhandled interrupts land here. 'break 1, 15' in _xtos_unhandled_interrupt
ets_printf_P(PSTR("\nXTOS Unhandled interrupt - BP"));
// Don't know what should be done here if anything.
}
else {
ets_printf_P(PSTR("\nDebug Exception"));
// The current Exception Decoder does not process the value in epc2
// Set a fake epc1 so the Exception Decoder identifies the correct BP line
rst_info.epc1 = rst_info.epc2;
}
}
// Kernel Exception
else {
ets_printf_P(PSTR("\nKernel Exception"));
}
ets_printf_P(PSTR(" - excsave1=0x%08x excsave2=0x%08x epc1=0x%08x\n"),
exc_context->excsave1, exc_context->excsave2, exc_context->epc1 );
}
#endif
// The GCC divide routine in ROM jumps to the address below and executes ILL (00 00 00) on div-by-zero
// In that case, print the exception as (6) which is IntegerDivZero
if (rst_info.epc1 == 0x4000dce5u) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Per the comment above, this magic number seema to be a ROM address. What is there? a 0x0? Can the value be moved into a constexpr and commented?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, that address has the ill (illegal) instruction. Added comments and constexpr.

rst_info.exccause = 6;
}
}
cut_here();

if (s_panic_line) {
Expand All @@ -144,11 +228,8 @@ void __wrap_system_restart_local() {
ets_printf_P(PSTR("\nAbort called\n"));
}
else if (rst_info.reason == REASON_EXCEPTION_RST) {
// The GCC divide routine in ROM jumps to the address below and executes ILL (00 00 00) on div-by-zero
// In that case, print the exception as (6) which is IntegerDivZero
bool div_zero = (rst_info.exccause == 0) && (rst_info.epc1 == 0x4000dce5);
ets_printf_P(PSTR("\nException (%d):\nepc1=0x%08x epc2=0x%08x epc3=0x%08x excvaddr=0x%08x depc=0x%08x\n"),
div_zero ? 6 : rst_info.exccause, rst_info.epc1, rst_info.epc2, rst_info.epc3, rst_info.excvaddr, rst_info.depc);
rst_info.exccause, rst_info.epc1, rst_info.epc2, rst_info.epc3, rst_info.excvaddr, rst_info.depc);
}
else if (rst_info.reason == REASON_SOFT_WDT_RST) {
ets_printf_P(PSTR("\nSoft WDT reset\n"));
Expand All @@ -157,7 +238,7 @@ void __wrap_system_restart_local() {
ets_printf_P(PSTR("\nStack overflow detected.\n"));
ets_printf_P(PSTR("\nException (%d):\nepc1=0x%08x epc2=0x%08x epc3=0x%08x excvaddr=0x%08x depc=0x%08x\n"),
5 /* Alloca exception, closest thing to stack fault*/, s_stacksmash_addr, 0, 0, 0, 0);
}
}
else {
ets_printf_P(PSTR("\nGeneric Reset\n"));
}
Expand Down Expand Up @@ -322,5 +403,199 @@ void __stack_chk_fail(void) {
while (1); // never reached, needed to satisfy "noreturn" attribute
}

#if defined(DEBUG_ESP_PORT) || defined(DEBUG_ESP_EXCEPTIONS)
/*
Handle breakpoints via postmortem when GDB is not running.

Objective: Add support for Debug, Kernel, and Double Exception reporting
with minimum impact on IRAM usage. Additionally, improve reporting of
Unhandled Exception causes for the disabled interrupt case.

Overview: In the absence of GDB, install a small breakpoint handler that
clears EXCM and executes an illegal instruction. This leverages the existing
exception handling code in the SDK for the `ill` instruction. At postmortem
processing, identify the event and report. This allows for alerting of
embedded breakpoints. In the stack trace, EPC1 is updated with the value of
EPC2. This helps "Exception Decoder" identify the line where the BP
occurred. By using the SDK's handling for `ill`, we get support for handling
the process of getting out of the Cache_Read_Disable state free, without
consuming IRAM.

For breakpoints (BP), the SDK's default behavior is to loop, waiting for an
interrupt >2 until the HWDT kicks in. The current "C++" compiler will embed
breakpoints when it detects a divide by zero at compile time. Are there
other situations where the compiler does this?

Expand technique to preserve more of the current context and to also report
Kernel and Double exceptions through postmortem. As well as enhanced
reporting for XTOS Unhandled Exceptions.

To avoid having to deal with literals and to minimize the use of IRAM, some
jumping around is done to use pockets of available IRAM in each vector
handler's allotted space. Regrettably, the relative jumps needed to ties
these pockets of memory together were computed by hand.

To assist the "ESP Exception Decoder", create a stack frame to store
various special registers for inspection. The expectation is that some of
these values may point to code. This can give insight into what was running
at the time of the crash.
*/

//
// Stage Exception Vector Stubs to be copied into respective vector address
// space in ICACHE/PROGMEM address space.
//
void postmortem_debug_exception_vector(void);
asm( // 16 bytes MAX, using 16
".section .text.postmortem_debug_exception_vector,\"ax\",@progbits\n\t"
".align 4\n\t"
".type postmortem_debug_exception_vector, @function\n\t"
"\n"
"postmortem_debug_exception_vector:\n\t" // Copy destination _DebugExceptionVector
"wsr.excsave2 a0\n\t"
"j . + 32\n\t" // _KernelExceptionVector + 3

// continue here from _KernelExceptionVector
"postmortem_debug_exception_vector_part2:\n\t"
"s32i a0, a1, 16\n\t"
"rsr.ps a0\n\t"
"s32i a0, a1, 0\n\t"
"j . + 86\n\t" // finish with postmortem_double_exception_handler_part2
".size postmortem_debug_exception_vector, .-postmortem_debug_exception_vector\n\t"
);

void postmortem_kernel_exception_handler(void);
asm( // 32 bytes MAX, using 31
".section .text.postmortem_kernel_exception_handler,\"ax\",@progbits\n\t"
".align 4\n\t"
".type postmortem_kernel_exception_handler, @function\n\t"
"\n"
"postmortem_kernel_exception_handler:\n\t" // Copy destination _KernelExceptionVector
"wsr.excsave1 a0\n\t"

"postmortem_kernel_exception_handler_p3:\n\t"
"addi a0, a1, -32\n\t"
"s32i a1, a0, 24\n\t"
"mov.n a1, a0\n\t"
"xsr.excsave2 a0\n\t" // stash pointer to stack data in excsave2
"s32i a0, a1, 12\n\t" // and save previous
"rsr.excsave1 a0\n\t"
"s32i a0, a1, 28\n\t"
"rsr.epc1 a0\n\t"
"s32i a0, a1, 20\n\t"
// save current exccause
"rsr.exccause a0\n\t"
"j . - 54\n\t" // continue at postmortem_debug_exception_vector_part2

".size postmortem_kernel_exception_handler, .-postmortem_kernel_exception_handler\n\t"
);

void postmortem_double_exception_handler(void);
asm( // 16 byte MAX, using 14
".section .text.postmortem_double_exception_handler,\"ax\",@progbits\n\t"
".align 4\n\t"
".type postmortem_double_exception_handler, @function\n\t"
"\n"
"postmortem_double_exception_handler:\n\t" // Copy destination _DoubleExceptionVector
// Common code at _KernelExceptionVector, which is -64 bytes from
// _DoubleExceptionVector.
"j . - 64\n\t" // continue at _KernelExceptionVector

// finish here from _DebugExceptionVector part 2
// _DoubleExceptionVector part 2
"postmortem_double_exception_handler_part2:\n\t"
// User mode, Clear EXCM, and block all interrupts
"movi.n a0, 0x02F\n\t"
"wsr.ps a0\n\t"
"rsync\n\t"
// Let _UserExceptionVector finish processing the exception as an
// exception(0). Adjust results at __wrap_system_restart_local so that `ESP
// Exception Decoder` reports the state before this `ill` instruction.
"ill\n\t"
".size postmortem_double_exception_handler, .-postmortem_double_exception_handler\n\t"
);

/*
Of the Exception Cause table's 64 entries, most are set to
`_xtos_unhandled_exception`. And I assume most of these never occur; however,
at lesat one does exccause 20. This one occurs on a call to a NULL pointer. This
can happen when a function prototype is given the weak attribute but never
defined, or forgetting to test a call back function pointer for NULL before
calling, etc.

A problem with the `_xtos_unhandled_exception` function is that it handles the
event with a Debug Exception BP. If you don't have GDB running you see a HWDT
reset. If interrupts are at INTLEVEL 2 or higher (a BP instruction becomes a
NOP), you still see a HWDT reset regardless of running GDB.
*/
typedef void (* _xtos_handler)(void);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Prefer using= over typedef for type definitions. In the case of a function ptr:

using _xtos_handler = void (*)(void);

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

okay

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought I had this covered, but I don't think so anymore. I think my problem stems from trying to reference xtensa sources that I can find on the Internet to fill in some of these missing function types. In the case of _xtos_handler, I am having trouble. They applied it to two very different function tables and it matches neither exactly.

  1. The first table is a set of ASM functions that do not comply with "C" conventions.
    • For .cpp files the typedef boils down to typedef void (*_xtos_handler)(...);
    • extern _xtos_handler _xtos_exc_handler_table[];
    • Of the set of ASM functions that are used in this table, there is a "C" wrapper function (_xtos_c_wrapper_handler) that references the 2nd table.
  2. The 2nd table of "C" styled exception handling functions, defines its entries with the same typedef.
    • extern _xtos_handler _xtos_c_handler_table[];
    • for the functions in this table I think this would better match the function call:
      • typedef void (_xtos_c_exception_handler_func)(struct __exception_frame *ef, int cause);
      • typedef _xtos_c_exception_handler_func *_xtos_c_handler;
      • extern _xtos_c_handler _xtos_c_handler_table[];

My quandary is:

  1. should I keep the typedef's I found and cast everything to fit? or
  2. Ignore the existing ones and hope those files never need to be added to this code base and define new ones?
    • for the first table I don't see a perfect fit for a non-"C" ASM. I think the current typedef would be fine.

Also affected: #7060 (comment)

static _xtos_handler * const _xtos_exc_handler_table = (_xtos_handler *)0x3FFFC000u;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why this magic address? What is there? Can it be moved out to a define or constexpr with comments?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Dispatch table for EXCCAUSE values. Added constexpr and comments.

#define ROM_xtos_unhandled_exception (reinterpret_cast<_xtos_handler>(0x4000dc44))
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is this magic address? Is it a ROM function? Does it make sense to move to the file where the ROM functions are mapped to addresses?
If not, does it make sense to change this define into a constexpr with a type?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

While _xtos_unhandled_exception is in the linker .ld file, it may have been overridden by the sketch or future SDK(not likely). We require the original Boot ROM function address to limit our override to those old values in the table.

With the new toolchain change constexpr doesn't work for this. It does not like an integer being cast to a function:

constexpr _xtos_handler ROM_xtos_unhandled_exception = (reinterpret_cast<_xtos_handler>(0x4000dc44));

error: 'reinterpret_cast' from integer to pointer

It works if I replace constexpr with const but I was concerned it would take up DRAM. I just tried it and in this case, it doesn't appear to change DRAM usage.

I'll change it to const for now.


void postmortem_xtos_unhandled_exception(void);
asm(
".section .iram.text.postmortem_xtos_unhandled_exception,\"ax\",@progbits\n\t"
".literal_position\n\t"
".align 4\n\t"
".type postmortem_xtos_unhandled_exception, @function\n\t"
"\n"
"postmortem_xtos_unhandled_exception:\n\t"
// Rewind Boot ROM User Exception prologue and continue to
// _KernelExceptionVector such that we look a Debug BP Exception sitting at
// 'break 1, 1' in the _xtos_unhandled_exception handler.
"movi a2, 0x4000dc4bu\n\t"
"wsr.epc2 a2\n\t"
"l32i a2, a1, 20\n\t"
"l32i a3, a1, 24\n\t"
"addmi a1, a1, 0x100\n\t"
"j _KernelExceptionVector\n\t"
".size postmortem_xtos_unhandled_exception, .-postmortem_xtos_unhandled_exception\n\t"
);


static void replace_exception_handler_on_match(
uint32_t cause,
_xtos_handler match,
_xtos_handler replacement) {

_xtos_handler old_wrapper = _xtos_exc_handler_table[cause];
if (old_wrapper == match || NULL == match) {
_xtos_exc_handler_table[cause] = replacement;
}
}

static void install_unhandled_exception_handler(void) {
// Only replace Exception Table entries still using the orignal Boot ROM
// _xtos_unhandled_exception handler.
for (size_t i = 0; i < 64; i++) {
replace_exception_handler_on_match(
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I understand that this 64 is the number of entries in the exception handler? Can the value be moved out to a constexpr or define?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

okay

i,
ROM_xtos_unhandled_exception,
(_xtos_handler)postmortem_xtos_unhandled_exception);
}
}

void postmortem_init(void) {
if (!gdb_present()) {
// Install all three (interdependent) exception handler stubs.
// Length of copy must be rounded up to copy memory in aligned 4 byte
// increments to comply with IRAM memory access requirements.
uint32_t save_ps = xt_rsil(15);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What do 16 and 32 mean? doea it make sense to move to a define. constexpr and comments?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I initially wasn't sure how to generate values at compile time from the ASM blocks sizes. I think I have that resolved.

ets_memcpy((void*)_DebugExceptionVector, (void*)postmortem_debug_exception_vector, 16);
ets_memcpy((void*)_KernelExceptionVector, (void*)postmortem_kernel_exception_handler, 32);
ets_memcpy((void*)_DoubleExceptionVector, (void*)postmortem_double_exception_handler, 16);
asm volatile(
"movi.n a2, 0\n\t"
"wsr.epc2 a2\n\t"
"wsr.excsave2 a2\n\t"
"wsr.depc a2\n\t"
::: "a2"
);
install_unhandled_exception_handler();
xt_wsr_ps(save_ps);
}
}

#endif

};