Skip to content

Micros is not ISR-safe #1680

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
matthijskooijman opened this issue Mar 24, 2022 · 6 comments · Fixed by #1702
Closed

Micros is not ISR-safe #1680

matthijskooijman opened this issue Mar 24, 2022 · 6 comments · Fixed by #1702
Assignees
Milestone

Comments

@matthijskooijman
Copy link
Contributor

Describe the bug

The micros()/getCurrentMicros() implementation cannot currently be used inside an ISR, even though that is an obvious thing to do (quickly grab and store a timestamp in an ISR and then do more processing in the mainloop). When the timing is right (once every ms), the overflow detection in the code does not work properly, and the value returned will be exactly 1000 too low.

The current implementation has two problems:

  • It is not re-entrant: It clears COUNTFLAG at startup and then uses that flag to detect if an overflow happened while reading the timer values. If then another ISR would happen between the overflow (flag set) and reading the flag, and that ISR would call micros(), the flag would be cleared again and the original micros() call would return the wrong value.
  • It is sometimes wrong when called with interrupts disabled (e.g. with PRIMASK set, or inside an ISR that has priority=0 like tye systick ISR): When COUNTFLAG was set, the code would call HAL_GetTick() again, assuming that it would return an updated value.
    However, when interrupts are disabled, the systick overflow interrupt cannot run and HAL_GetTick() just returns the same value again, and micros() returns the wrong value.

To Reproduce

Run the following sketch:

MicrosBackwardsTest.ino.txt

I've tested this on a Nucleo F401 with core 2.2.0, but I think it will be broken on any STM32 (but if the CPU clock is not 84Mhz, the timer period in the sketch might need to be adapted, though it might also break as-is).

This sketch shows both problems above (you can select which by modifying the if(true) in the code).

For the reentrant test, the output is:

Start
OK: 1547
BACKWARDS: 1997 -> 1001
OK: 4538
BACKWARDS: 10998 -> 10003
OK: 941
BACKWARDS: 12997 -> 12002
OK: 4511
BACKWARDS: 21999 -> 21003
OK: 449
BACKWARDS: 22998 -> 22003
OK: 897
BACKWARDS: 24997 -> 24002
OK: 5116
BACKWARDS: 37997 -> 37002
OK: 5557
BACKWARDS: 48998 -> 48002
OK: 5045
BACKWARDS: 58998 -> 58003
OK: 445
BACKWARDS: 59998 -> 59003
OK: 21
BACKWARDS: 60998 -> 60002
OK: 2551
BACKWARDS: 69999 -> 69003
OK: 1443
BACKWARDS: 72998 -> 72002
OK: 5470
BACKWARDS: 83998 -> 83003
OK: 5040
BACKWARDS: 93999 -> 93003
OK: 945

Which shows that the micros run in one loop is sometimes less than the micros in the previous loop. How often this occurs is quite unpredictable, since this needs 3 things to happen at the same time, and I picked the timer interval to be prime and jump around a bit.

For the run-from-ISR test, the output is:

Start
SysTick LOAD (should be 83999): 83999
OK: 795
BACKWARDS: 794999 -> 794000
OK: 943
BACKWARDS: 1737999 -> 1737000
OK: 2830
BACKWARDS: 4565999 -> 4565000
OK: 0
BACKWARDS: 4566999 -> 4566000
OK: 942
BACKWARDS: 5508999 -> 5508000
OK: 943
BACKWARDS: 6451999 -> 6451000
OK: 943

Which shows two subsequent micros() calls are in the reverse order, which happens about once every second (which I think means the window for this problem is about 84 cycles, or 1 us).

Prior to writing this testcase, I have also seen this bug occur in practice (in a machine firmware where it would very rarely occur at a moment that it confused our scheduler, causing it to run one task continuously for about 20s, confusing some sensor we were reading, showing up as invalid sensor readings for 20s every couple of hours, but also sometimes not at all for a few days...).

Expected behavior
When calling micros() twice, the second call should always return equal or more than the first call.

Solutions

My original idea for fixing this was to, rather than using (and importantly, clearing) COUNTFLAG as is done now, instead use the systick ISR pending flag to detect an overflow, and to disable (just) the systick ISR to ensure a consistent read and ensure the pending flag is set. However, it turns out that (unlike how I expected from other platforms), disabling the systick ISR also prevents the pending flag from being set (maybe because systick is really an exception in the core, not an interrupt in the NVIC).

To still be able to use the pending flag, we resorted to using PRIMASK to disable all interrupts instead. See 3devo@efd2c7b for this implementation. However, this has a downside: Disabling all interrupts adds latency to all interrupts and is best avoided. Using SWD I measured that interrupts are disabled for around 67 cycles (bit more or less depending on how you count the enable/disable instructions), which is quite a bit more than I had hoped.

This lead to me thinking about a different approach: Rather than disabling interrupts and checking the flags, just get the ticks value twice, and if it is different, you know that an overflow occurred, and you can use the second ticks value and just have to reload the timer value (where you can assume that no second overflow will happen so soon after, which is safe unless there are ISRs that take >1ms, but even then the returned value might be technically wrong, it will still be a valid time that lies between the moment micros() is called and when it returns, so no backwards time there).

Here's what I wrote for that:

    uint32_t m = HAL_GetTick();
    __IO uint32_t v = SysTick->VAL;
    uint32_t m2 = HAL_GetTick();
    if (m != m2) {
      v = SysTick->VAL;
      m = m2;
    }
    const uint32_t tms = SysTick->LOAD + 1;
    return (m * 1000 + ((tms - v) * 1000) / tms);

This makes more changes than strictly needed (it was based on my PRIMASK-based approach, which needed to move the tms load and tms - VAL subtraction down to minimize the critical area), but the same approach can also be put into the existing code.

Looking around for existing issues, I found that the "query ticks twice and compare" method was actually used in the past already (introduced in 2fa0645 to fix exactly this issue: #4, micros running backwards) but it was replaced with the COUNTFLAG approach in #202. That PR says to improve accuracy of micros(), but looking closely I think it really only makes micros() a bit faster by removing one subroutine call to HAL_GetTicks() (which I think is an optimization better done using inlining or LTO), but it also unintentionally made micros ISR-unsafe.

Given the above, I think the "check ticks twice" approach would be the best one and should be restored.

(Though note that the original implementation had a bug: It did m1 > m0 which works almost always, except when the ticks value overflows from 2³² to 0, so a correct implementation should use m1 != m0)

@fpistm
Copy link
Member

fpistm commented Apr 4, 2022

Hi @matthijskooijman
Yes we agreed with your analyze.
@ABOSTM would like to test some ideas about this and get back soon on this.

The fact accuracy can decrease is no more an issue as the delayMicroseconds() function does not rely anymore on the micros() one.

@ABOSTM
Copy link
Contributor

ABOSTM commented Apr 4, 2022

Hi @matthijskooijman,
Thanks for your detailed analysis
I agree globally with your it.
Concerning the solution, I still think disabling IRQ would be the proper and simpler solution, in condition to restrict the deactivation to the minimum: i.e. reading both systick counter and uwTick.
I will do further testing

@matthijskooijman
Copy link
Contributor Author

Concerning the solution, I still think disabling IRQ would be the proper and simpler solution, in condition to restrict the deactivation to the minimum: i.e. reading both systick counter and uwTick.

Why not do the re-read HAL_GetTicks() solution I suggested? That prevents having to disable IRQs at all, which helps for IRQ latency?

@ABOSTM
Copy link
Contributor

ABOSTM commented Apr 6, 2022

Aim is to combine accuracy, simplicity and easy maintenance, by using ISR restricted to both read operations only (which thus should not be an issue for IRQ latency).
So something like this :

  __disable_irq();
  uint32_t m = HAL_GetTick();
  __IO uint32_t systickVal = SysTick->VAL;
  __enable_irq();

  const uint32_t tms = SysTick->LOAD + 1;
  uint32_t u = tms - systickVal;

  return (m * 1000 + (u * 1000) / tms);

... to be tested

@matthijskooijman
Copy link
Contributor Author

which thus should not be an issue for IRQ latency

But any time you disable interrupts, this introduces a bit of IRQ latency (not much, but for timing-critical stuff, this might cause unwanted jitter in IRQ latency). Or maybe I'm misunderstanding what you mean?

So something like this :

This looks appealingly simple, but this does not handle overflow. Even with IRQs disabled, the timer can overflow between disabling IRQs and reading VAL, so then VAL is low, but the overflow has not been registered in the ticks value yet, and the output is wrong.

So you if you want the "disable IRQ" route, you still have to keep some "overflow" correction, but instead of re-running HAL_GetTicks() (which will just return the same value with IRQs disabled), you'd need to increment the ticks value when an overflow occurred. This is also what I did in 3devo@efd2c7b. In that commit, I also switched from using COUNTFLAG to using PENDSTSET, but looking back I think that would not really be needed (I originally intended to disable just the systick IRQ, but with all interrupts disabled, the non-reentrancy of the COUNTFLAG approach isn't actually a problem anymore).

Since you'll need some kind of overflow detection anyway, with the resulting complexity, IMHO it could be nicer to just call HAL_GetTicks() twice and keep IRQs enabled.

@ABOSTM
Copy link
Contributor

ABOSTM commented Apr 7, 2022

Hi @matthijskooijman,
You convinced me:
I agree, the simple solution I proposed is not working, good catch.
And thus your proposal is the best approach.

Would you push a PR ?

@fpistm fpistm added this to the 2.3.0 milestone Apr 21, 2022
fpistm added a commit to fpistm/Arduino_Core_STM32 that referenced this issue Apr 29, 2022
It is kind of revert commit 15dc04b.

Fixes stm32duino#1680

Signed-off-by: Frederic Pillon <[email protected]>
fpistm added a commit to fpistm/Arduino_Core_STM32 that referenced this issue Apr 29, 2022
It is kind of revert commit 15dc04b.

Note:  `delayMicroseconds()` does not rely anymore on `micros()`
so accuracy is less important.

Fixes stm32duino#1680

Signed-off-by: Frederic Pillon <[email protected]>
fpistm added a commit that referenced this issue May 10, 2022
It is kind of revert commit 15dc04b.

Note:  `delayMicroseconds()` does not rely anymore on `micros()`
so accuracy is less important.

Fixes #1680

Signed-off-by: Frederic Pillon <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants