Interrupts and Multiple Tasks Causing Usage Fault INVPC #63

Closed
cversek opened this issue May 27, 2023 · 14 comments

cversek commented May 27, 2023

Hi, this problem has been plaguing me for a while now and it is very hard for me to pin down the root cause.
My board is an Adafruit STM32F405 Feather. The symptom is a Hard Fault/Usage Fault with INVPC bit set to 1 (found out using J-Link/Ozone debugger).

In my setup, I can cause the fault under the following conditions: at least one external interrupt is triggering at a rate > 200 Hz (a typical rate, not a hard bound), and at least two tasks are running at the same time. The interrupt handler can be dead simple (just incrementing its own counter) with no interaction with the FreeRTOS API. The tasks don't need to interact directly, and no other primitives are needed to cause instability. The interrupt is triggered by another chip pulling the line down; that chip has its own clock and sits on a custom board. The fault usually occurs within a few minutes. If the tasks are switching context faster, the fault tends to happen sooner.

Strangely, I cannot reproduce this fault by using the microcontroller to drive its own interrupt pin directly - this seems pretty stable. Later I will try to drive the interrupt from another microcontroller to see if having independent clocks will cause the fault.

I'm pretty desperate to get this working, any advice would help.

cversek commented May 27, 2023

Update: I was eventually able to trigger the fault using just the microcontroller driving its own interrupt, but it took much longer, about 20 minutes. I will try to boil this down to some basic reproduction code in a little bit.

cversek commented May 27, 2023

OK, this sketch should reproduce the bug with minimal other stuff going on. Note that this is being tested on an Adafruit STM32F405 Feather. In this example, pin 13 ("driver") is wired to pin 11 ("interrupt") to cause the fault within a short, random amount of time (a few minutes at most). Indeed, if the wire is not connected and the interrupt is not triggering, there will (probably) be no fault.

#include <STM32FreeRTOS.h>

//NOTE the DRIVER_PIN is connected to the INTERRUPT_PIN via a wire
const int INTERRUPT_PIN = 11;
const int DRIVER_PIN    = 13;
const int DRIVER_TASK_DELAY_MILLIS = 2;
const int DRIVER_TASK_RANDOM_BLOCKING_MICROS_MAX = 2000;  //this randomness seems to help break things faster
const int COMPETING_TASK_DELAY_MILLIS = 10;               //decreasing this brings fault on faster

volatile unsigned long interrupt_counter = 0;
void pin_ISR(){
  interrupt_counter++;
}

// this task drives a pin periodically with random added blocking 
static void vTask1(void* arg) {
  UNUSED(arg);
  digitalWrite(DRIVER_PIN,HIGH);
  while (1) {
    // toggle the pin
    vTaskDelay(pdMS_TO_TICKS(DRIVER_TASK_DELAY_MILLIS));
    //random delay to simulate work
    delayMicroseconds(random(DRIVER_TASK_RANDOM_BLOCKING_MICROS_MAX));
    // this transition to low state should trigger the interrupt
    digitalWrite(DRIVER_PIN,LOW);
    vTaskDelay(pdMS_TO_TICKS(DRIVER_TASK_DELAY_MILLIS));
    digitalWrite(DRIVER_PIN,HIGH);
  }
}

// this task competes with the driver task vTask1 for context
static void vTask2(void* arg) {
  UNUSED(arg);
  while (1) {
    // Sleep for tiny bit
    vTaskDelay(pdMS_TO_TICKS(COMPETING_TASK_DELAY_MILLIS));
  }
}

// this task shows the interrupt counter
static void vTask3(void* arg) {
  UNUSED(arg);
  while (1) {
    // Sleep for five seconds
    vTaskDelay(pdMS_TO_TICKS(5000));
    Serial.println(interrupt_counter);
  }
}

void setup() {
  pinMode(DRIVER_PIN, OUTPUT);
  digitalWrite(DRIVER_PIN,HIGH);

  Serial.begin(115200);
  while (!Serial){}; //WAIT FOR SERIAL CONNECTION TO OPEN, DEBUG ONLY!

  pinMode(INTERRUPT_PIN, INPUT_PULLUP);
  int intNum = digitalPinToInterrupt(INTERRUPT_PIN);
  attachInterrupt(intNum, pin_ISR, FALLING);

  xTaskCreate(vTask1,NULL,configMINIMAL_STACK_SIZE+50,NULL,1,NULL);
  xTaskCreate(vTask2,NULL,configMINIMAL_STACK_SIZE+50,NULL,1,NULL);
  xTaskCreate(vTask3,NULL,configMINIMAL_STACK_SIZE+500,NULL,1,NULL);

  // start FreeRTOS
  vTaskStartScheduler();

  // should never return
  Serial.println(F("Die"));
  assert_param(false);
}

void loop() {
}

Example output below:

1109
2220
3329
4435
5545
6660
7767
8877
9985

The system goes into hard fault after about 45 seconds, blinking 4 times in a row in a loop. Note that this seems to be deterministic for this choice of starting parameters with my board. A slightly different choice will obviously lead to a different set of numbers and a different time to failure, and it is probably clock rate dependent to some degree.

Of course someone will tell me that this is not a bug and I must be doing something wrong. I would be happy to be proven wrong yet again ;) This is why I have endeavored to provide a method of reproduction.

Member

fpistm commented May 29, 2023

Hi, I guess you are using Serial over USB. In that case, you should follow this:
#16 (comment)

cversek commented May 30, 2023

@fpistm
Hi, so, yes, I am using the USB support: CDC (generic 'Serial' supersede U(S)ART) with USB speed: Low/Full Speed and Newlib Nano settings in Arduino. Additionally since I had this issue, I have been running with this setting in FreeRTOSConfig_Default.h:

#define configMEMMANG_HEAP_NB             3

which is what you are recommending, correct?

These faults seem more subtle than the problems I was having before using the default heap configuration. My actual application requires USB Serial and at least two pin ISRs, and I'd really like to have the ability to use FreeRTOS tasks to manage the Serial resource with mutexes and queues. I'd also like to be able to notify tasks from the ISRs, so I might eventually need some help configuring the interrupt priorities. Unfortunately, I can't even get trivial ISRs working with my toy example and I could not easily find any example code from others using STM32duino/FreeRTOS with user defined ISRs. Thanks for looking into this.

Member

fpistm commented May 30, 2023

@cversek
We provide STM32FreeRTOS "as is"; it is just a port, in Arduino library format, of ST's port of FreeRTOS.
As stated in your previous issue: #58 (comment)

Configuration is required depending on your application, mainly IRQ priorities in a FreeRTOS context.
Ex: #54 (comment)

I will not look into this further, as I have no time for it and it seems to be purely a configuration matter.

cversek commented May 30, 2023

@fpistm
I know you have little time for novices who don't know how to configure their own systems that aren't part of the ST dev board family - I can ask for help elsewhere. But please clarify what you have said...

I'm confused... So are you saying that there is some additional configuration I need to do outside of FreeRTOSConfig_Default.h to make the Arduino USB Serial driver work for this board, possibly because some interrupt(s) it may be using are incompatible with FreeRTOS out of the box? Somehow, this has to do with the heap and Newlib?

cversek commented May 30, 2023

For others running into these issues... I can confirm that removing all USB Serial support does seem to stabilize the system for my toy example problem. Probably not a satisfying path forward for most, but it points to where the issue might be.

@jmailloux

I'm seeing a similar issue to you @cversek. I too am using USB serial prints.
Setting configMEMMANG_HEAP_NB to 3 doesn't seem like a fix. From what I can tell, that still uses newlib malloc / free, but without all of the proper things that need to be done when using newlib's malloc / free (that are in heap_useNewlib_ST.c). You just have to define MALLOCS_INSIDE_ISRs to use heap_useNewlib_ST.c
I believe there is some kind of memory corruption going on. Just haven't been able to identify it yet. This happens to me regardless of whether configMEMMANG_HEAP_NB = 3, or it is left as default.

@jmailloux

I am guessing it has something to do with the fact that malloc is called from an ISR in the USB serial system, so you have to make sure that that ISR has a lower priority than configLIBRARY_MAX_SYSCALL_INTERRUPT_PRIORITY.
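
For readers unfamiliar with the rule being referred to here, the following is a minimal sketch in plain C++ (not code from the library, and not a diagnosis of this particular fault). On Cortex-M a larger priority number means lower urgency, so an ISR is only allowed to use the interrupt-safe FreeRTOS API if its numeric priority is greater than or equal to the max syscall priority. The helper name `may_use_freertos_api` is hypothetical, and the default value of 5 is an assumption; check your own FreeRTOSConfig file.

```cpp
// Assumed default for configLIBRARY_MAX_SYSCALL_INTERRUPT_PRIORITY.
// This value is an assumption; verify it against your configuration.
constexpr unsigned kMaxSyscallPriority = 5;

// Hypothetical helper: returns true when an interrupt with the given
// (unshifted) numeric priority sits at or below the max syscall priority
// in urgency, i.e. its number is greater than or equal to the limit.
constexpr bool may_use_freertos_api(unsigned irq_priority,
                                    unsigned max_syscall = kMaxSyscallPriority) {
  return irq_priority >= max_syscall;
}
```

Under these assumptions, an interrupt at priority 1 would fail the check, while one at priority 6 would pass it.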

@jmailloux

I set configLIBRARY_MAX_SYSCALL_INTERRUPT_PRIORITY to 1 as I believe that is the USB interrupt priority. I still see the problem. I am stuck on this as well.

cversek commented May 30, 2023

UPDATE: I was wrong about this whole comment, see resolution below

@jmailloux
Thanks very much for your reply. I think you gave me some insight into solving my problem. I went hunting for the USB IRQ priority setting and found this line in the STM32duino library provided usbd_conf.h file:

#define USBD_IRQ_PRIO                               1

and I changed it to 6, since I think all the heap touching ISR priorities should be between 5 and 14 with the default priority settings. This seems to stabilize my toy example and I can print the interrupt_counter variable to the USB port. I don't know if this solves all the issues with how the STM32duino default settings may be incompatible with the FreeRTOS port, but it is a good start.

Member

fpistm commented May 31, 2023

I'm currently removing the malloc used in the USB part. Hope this helps.

cversek commented Jun 2, 2023

@jmailloux @fpistm
Unfortunately, I'm having difficulty replicating the stability of the toy example when building on a separate Arduino environment (separate computer, same versions of packages). Even though I am setting USBD_IRQ_PRIO to 6 and configMEMMANG_HEAP_NB to 3, the code still hard faults at the same time (~45 seconds) - even when I remove USB Serial support completely! I am sorry for the misleading test claims I posted before, I was mistaken.

The strange thing is that this example which was unstable at first is now consistently stable on my original test computer - even when I set USBD_IRQ_PRIO back to 1. The truth is that I don't know what is going on with the issue. Something else must have changed at around the same time I was changing the USB serial settings, but I don't know what this could be.

cversek commented Jun 2, 2023

@fpistm @ABOSTM
So I must have based my STM32FreeRTOSConfig.h on a version from after this commit, which brought in this change:

#define configKERNEL_INTERRUPT_PRIORITY   14

That, of course, is wrong, because the value is not shifted into the upper bits of the priority field, which is where the NVIC expects it. REF: "Cortex-M Internal Priority Representation" https://www.freertos.org/RTOS-Cortex-M3-M4.html

By setting that line back to:

#define configKERNEL_INTERRUPT_PRIORITY   ( configLIBRARY_LOWEST_INTERRUPT_PRIORITY << (8 - configPRIO_BITS) )

which evaluates to decimal 240 in my environment, my example code regains stability.
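
To make the arithmetic concrete, here is a minimal sketch in plain C++ (not part of the library) of the shift described above. It assumes 4 implemented priority bits (configPRIO_BITS = 4) and a lowest library priority of 15, which is consistent with the value of 240 reported here. The helper names `to_nvic_priority` and `effective_level` are hypothetical.

```cpp
#include <cstdint>

// Assumed values: 4 implemented priority bits and a lowest library
// priority of 15, consistent with the 240 reported in this thread.
constexpr unsigned kPrioBits = 4;
constexpr unsigned kLowestLibraryPriority = 15;

// Hypothetical helper: shift a library priority (0..15) into the upper
// bits of the 8-bit priority field, as the config macro does.
constexpr std::uint8_t to_nvic_priority(unsigned library_priority) {
  return static_cast<std::uint8_t>(library_priority << (8 - kPrioBits));
}

// Hypothetical helper: the priority level that remains when only the
// top kPrioBits bits of a raw 8-bit value are implemented in hardware.
constexpr unsigned effective_level(std::uint8_t raw) {
  return static_cast<unsigned>(raw) >> (8 - kPrioBits);
}

// The shifted form gives raw value 240, which is level 15 (least urgent).
static_assert(to_nvic_priority(kLowestLibraryPriority) == 240, "15 << 4");
// The unshifted raw value 14 collapses to level 0 (most urgent).
static_assert(effective_level(14) == 0, "low bits are not implemented");
```

Under these assumptions, the raw value 14 falls entirely in the unimplemented low bits, so the kernel interrupts would effectively be configured at level 0, the most urgent level, instead of the least urgent one.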

So I did make a config mistake, but I probably got that mistake from your mistake. We are even ;)

@jmailloux
Please check your config to see if you have this troublesome line.

cversek closed this as completed Jun 2, 2023
cversek referenced this issue Jun 2, 2023
Aim is to have Systick priority higher (lower value) than Ethernet Timer

Signed-off-by: Alexandre Bourdiol <[email protected]>