AttachInterrupt random crashes #1403

Closed
DedeHai opened this issue Jan 10, 2016 · 15 comments
@DedeHai

DedeHai commented Jan 10, 2016

Is it possible that there is an issue with the attachInterrupt function?
I have a setup where a pin is attached to an external signal that triggers approximately every 20 ms. In parallel I send data to a server (Thingspeak) every 15 seconds. I get random crashes, always at the point where data needs to be sent out. If I deactivate the updates to the server, everything runs smoothly. The crashes happen at random times, sometimes after just one minute, sometimes after 30 minutes or even longer.
Any explanation for this behaviour?
The interrupt should be fairly short; it does some calculations but has no loops.

@DedeHai
Author

DedeHai commented Jan 11, 2016

Update:
This is definitely some sort of bug!
I tested today for 12 hours with no crash. What I changed: I deactivate the interrupt before sending data to the server and re-activate it right after the data is sent. Not a single crash in 12 hours.
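
A minimal sketch of that workaround, assuming the sending code lives in a function that is passed in as `sendFn`. `SIGNAL_PIN`, `sendWithInterruptPaused`, `pulseCount` and the `RISING` trigger mode are made-up for illustration and are not taken from the original sketch. The `#else` branch only stubs the Arduino calls so the logic can be compiled off-target. Note that any pulses arriving while the interrupt is detached are lost.

```cpp
#ifdef ARDUINO
#include <Arduino.h>
#else
// Off-target stubs (assumption: only used to compile and test on a PC).
#define ICACHE_RAM_ATTR
#define RISING 1
static bool isrAttached = false;
static int digitalPinToInterrupt(int pin) { return pin; }
static void attachInterrupt(int, void (*)(), int) { isrAttached = true; }
static void detachInterrupt(int) { isrAttached = false; }
#endif

const int SIGNAL_PIN = 4;               // hypothetical input pin
volatile unsigned long pulseCount = 0;  // written only by the ISR

// Short ISR placed in RAM.
void ICACHE_RAM_ATTR pinInterrupt() {
  pulseCount = pulseCount + 1;
}

// Detach the pin interrupt, run the (slow) send function, then re-attach.
// Returns whatever sendFn returned.
bool sendWithInterruptPaused(bool (*sendFn)()) {
  detachInterrupt(digitalPinToInterrupt(SIGNAL_PIN));
  bool ok = sendFn();
  attachInterrupt(digitalPinToInterrupt(SIGNAL_PIN), pinInterrupt, RISING);
  return ok;
}
```

In a real sketch this fragment would be called as `sendWithInterruptPaused(uploadToThingspeak);`, where `uploadToThingspeak` is a hypothetical function wrapping the `client.connect(...)` and `client.print(...)` calls.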
After some more testing I narrowed it down to the two commands that trigger the exception: it always happens either at

client.connect("api.thingspeak.com", 80);

or at:

client.print(header + datastr);

Here is the output I get from my sketch (measured data + timestamp). In this case the exception happened at client.connect() because the '0' gets printed after that.

-30.00 @gmt 21:03:11.492
-29.30 @gmt 21:03:12.492
Thingspeak...012OK
-29.80 @gmt 21:03:13.493
-28.80 @gmt 21:03:14.493
-28.50 @gmt 21:03:15.494
-28.60 @gmt 21:03:16.494
-27.90 @gmt 21:03:17.495
-29.10 @gmt 21:03:18.496
-28.70 @gmt 21:03:19.496
-29.90 @gmt 21:03:20.497
-31.00 @gmt 21:03:21.497
-31.00 @gmt 21:03:22.498
-31.00 @gmt 21:03:23.498
-32.50 @gmt 21:03:24.499
-34.00 @gmt 21:03:25.500
-33.70 @gmt 21:03:26.500
-35.10 @gmt 21:03:27.501
-34.60 @gmt 21:03:28.502
Thingspeak...
Exception (0):
epc1=0x402021d4 epc2=0x00000000 epc3=0x00000000 excvaddr=0x00000000 depc=0x00000000

ctx: sys
sp: 3ffffc30 end: 3fffffb0 offset: 01a0

>>>stack>>>
3ffffdd0: 401070bc 00000000 00000002 0000001c
3ffffde0: ffffffff 00000020 00000001 00000000
3ffffdf0: 00000000 401016fb 00000000 00000022
3ffffe00: 3fffc200 40107084 3fffc258 4000050c
3ffffe10: 400043a3 00000030 00000016 ffffffff
3ffffe20: 60000200 00000002 0001f900 80000000
3ffffe30: 20000000 3fff1544 80000000 203fc240
3ffffe40: 00000000 3fffc6fc fc70ffff 3fff1548
3ffffe50: 000000b4 003fc240 60000600 00000030
3ffffe60: 00000114 40103fdf 00040000 00000001
3ffffe70: 40107120 00000004 00000000 4010710a
3ffffe80: ffffffff 00000020 00000000 3ffed8a0
3ffffe90: 00000000 00000000 0000001f 401059c1
3ffffea0: 4000050c 40107084 3fffc258 4000050c
3ffffeb0: 40000f68 00000030 0000001a ffffffff
3ffffec0: 40000f58 00000000 00000020 00000000
3ffffed0: 01f77462 00000001 3ffedf20 00000008
3ffffee0: 00000020 3fffc6fc fc70ffff 3fffdab0
3ffffef0: 00000000 3fffdcb0 3ffedf00 00000030
3fffff00: 00000000 400042db 40232010 40105d1d
3fffff10: 40004b31 3fff1300 000002f4 003fc000
3fffff20: 40105ff2 40105d4a 3ffecf90 40219dc4
3fffff30: 4020a685 3ffecf90 3ffedef0 06e548da
3fffff40: 3fff1300 00001000 4020ab1e 00000008
3fffff50: 4021c51c 00000000 4020abcb 3ffed044
3fffff60: 3ffedef0 00000001 4020aae0 60000600
3fffff70: 402242b9 3ffedef0 3ffedec8 402242b9
3fffff80: 402242fe 3fffdab0 00000000 3fffdcb0
3fffff90: 3ffedf08 00000000 40000f65 3fffdab0
3fffffa0: 40000f49 00036948 3fffdab0 40000f49
<<<stack<<<

ets Jan 8 2013,rst cause:2, boot mode:(1,7)

ets Jan 8 2013,rst cause:4, boot mode:(1,7)

wdt reset

It does not actually reset here, it just hangs...

@chaeplin
Contributor

Is there a delay in the ISR?

@alltheblinkythings
Contributor

alltheblinkythings commented Jan 12, 2016 via email

@DedeHai
Author

DedeHai commented Jan 12, 2016

Thank you for the hint. I am unfamiliar with deciphering the exception dump info, so I will take a stab at the objdump to get to the root of it.
There are no delay calls in the ISR, no Serial.print calls, and nothing else that uses another interrupt (to my knowledge). It only gathers data, does some calculations and writes the result back into an array. It should all be in RAM but I will double check that.
Also, the crash only happens if the Thingspeak send-out is active. The ISR alone works fine and the sending alone works fine as well, but if I run them both in parallel the crashes start to happen.

@DedeHai
Author

DedeHai commented Jan 12, 2016

Here is what I found (it is not much):
Exception(0) translates to 'Illegal Instruction Cause' if I interpret the xtensa documentation correctly.
The function at the epc1 address is actually the interrupt function. I dropped almost everything from the interrupt to make sure no external functions get called, the exception still keeps happening.
The interrupt function is now only:

void pininterrupt() {
  arrayindex++;
  if (arrayindex >= 50)
  {
    static int measurementindex = 0;
    measurementdata[measurementindex].flag = 1;
    measurementindex++;
    if (measurementindex >= 128) measurementindex = 0;
    arrayindex = 0;
    issampling = false;
  }
}

Here is what I get:

0.00    @GMT 6:28:16.000
Thingspeak...
Exception (0):
epc1=0x40202058 epc2=0x00000000 epc3=0x00000000 excvaddr=0x00000000 depc=0x00000000

ctx: sys
sp: 3ffffc20 end: 3fffffb0 offset: 01a0

>>>stack>>>

.etc.

Checking the objdump at 0x40202058, I find the start of the interrupt function:

40202054:       fe8354          excw
40202057:       3f              .byte 0x3f

40202058 <_Z12pininterruptv>:
    measurementindex++;
    if (measurementindex >= 128) measurementindex = 0;
    return 0;
  }
else return -1;
}
40202058:       fffc21          l32r    a2, 40202048 <__run_user_rf_pre_init+0x4>
4020205b:       143c            movi.n  a4, 49
4020205d:       000232          l8ui    a3, a2, 0
40202060:       743030          extui   a3, a3, 0, 8
40202063:       331b            addi.n  a3, a3, 1
40202065:       743030          extui   a3, a3, 0, 8
40202068:       0020c0          memw
4020206b:       004232          s8i     a3, a2, 0
4020206e:       000232          l8ui    a3, a2, 0
40202071:       743030          extui   a3, a3, 0, 8
40202074:       32b437          bgeu    a4, a3, 402020aa <_Z12pininterruptv+0x52>
40202077:       fff531          l32r    a3, 4020204c <__run_user_rf_pre_init+0x8>

What are the options? Sorry, I am totally lost.

@DedeHai
Author

DedeHai commented Jan 12, 2016

This may be related to #797 and/or #1020.

igrr mentions in #797 (comment) that adding ICACHE_RAM_ATTR does not work.
However, I tried it and it now seems to run for longer than usual.

Bonus question: if a function is called from the interrupt, does that function also have to have the ICACHE_RAM_ATTR or is this done by the compiler? Or is it better to inline the functions called from the interrupt?

@alltheblinkythings
Contributor

alltheblinkythings commented Jan 12, 2016 via email

@DedeHai
Author

DedeHai commented Jan 12, 2016

Thank you so much, it looks like that did the trick.
I checked the objdump again and all the functions I marked with ICACHE_RAM_ATTR are loaded into the RAM (text) section. Sweet.

@gmt

gmt commented Jan 26, 2016

-30.00 @gmt 21:03:11.492

-29.30 @gmt 21:03:12.492

my ears are burning :)

@DedeHai DedeHai closed this as completed Feb 20, 2016
@sohailsadiq

Hi,
Can you explain how to use ICACHE_RAM_ATTR so that functions are loaded into the RAM (text) section?
I am just a beginner with the ESP8266 but I am facing the same problem with a flow sensor: it stops randomly. Can you help?

Thanks

@DedeHai
Author

DedeHai commented Nov 27, 2016

Very easy, just put it in front of your function name. Be aware that all functions called within an interrupt function must reside in RAM, not only the interrupt function itself.
For example:

void ICACHE_RAM_ATTR Second_Tick()
{
...
}

void ICACHE_RAM_ATTR PinInterrupt(void) 
{
...
}
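
To illustrate the point about called functions, here is a minimal sketch in which both the ISR and the helper it calls carry the attribute. `markSample`, `flags`, `writeIndex`, `SIGNAL_PIN` and the `RISING` mode are made-up for illustration; the buffer size of 128 mirrors the ISR posted earlier in this thread. The `#else` branch only defines the attribute away so the logic can be compiled off-target. As far as I know, newer releases of the ESP8266 Arduino core spell the attribute `IRAM_ATTR`.

```cpp
#ifdef ARDUINO
#include <Arduino.h>
#else
#define ICACHE_RAM_ATTR  // no-op when compiled off-target
#endif
#include <stdint.h>

const int BUFFER_SIZE = 128;          // same size as the buffer in the thread's ISR
volatile uint8_t flags[BUFFER_SIZE];  // one flag per measurement slot
volatile int writeIndex = 0;

// Helper called from the ISR: it needs the RAM attribute as well.
void ICACHE_RAM_ATTR markSample() {
  flags[writeIndex] = 1;
  int next = writeIndex + 1;
  if (next >= BUFFER_SIZE) next = 0;  // wrap around without a division
  writeIndex = next;
}

// The ISR itself, also placed in RAM.
void ICACHE_RAM_ATTR pinInterrupt() {
  markSample();
}

#ifdef ARDUINO
const uint8_t SIGNAL_PIN = 4;  // hypothetical input pin

void setup() {
  pinMode(SIGNAL_PIN, INPUT);
  attachInterrupt(digitalPinToInterrupt(SIGNAL_PIN), pinInterrupt, RISING);
}

void loop() {
  // Main-loop work (for example the network upload) goes here.
}
#endif
```

The wrap-around uses a comparison rather than `%` so that the ISR does not depend on any division helper that might live in flash.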

@nsted

nsted commented Dec 19, 2016

So if I use the Wire library inside the interrupt does this mean I need to make a modified version of Wire that uses ICACHE_RAM_ATTR in front of the function names?
Thanks

@igrr
Member

igrr commented Dec 19, 2016

Alternatively, you can disable this interrupt while reading/writing/erasing flash memory. This includes EEPROM, SPIFFS, and WiFi configuration functions.

@mkeyno

mkeyno commented Dec 19, 2016

@DedeHai, @nsted do you know of any reference or tutorial for these kinds of functions (ICACHE_RAM_XXX)?

@DedeHai
Author

DedeHai commented Dec 19, 2016

Alternatively, you can disable this interrupt while reading/writing/erasing flash memory. This includes EEPROM, SPIFFS, and WiFi configuration functions.

Interesting, I did not know that flash access is the reason for this; I thought it was a general requirement of this processor architecture.
In general it is a bad idea to access any communication interface (I2C, SPI, UART etc.) from within an interrupt routine, because communication is about the slowest thing an MCU does. It is much better practice to just set a flag that the interrupt happened and then poll that flag in the main loop. There are exceptions to this of course, but unless you are really careful and knowledgeable about the whole code (including all libraries and core functionality) it is better to keep interrupts very short. I think for the ESP8266 a rule of thumb is 15 µs (correct me if I am wrong).
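
A minimal sketch of the flag-and-poll pattern, under stated assumptions: `onPulse`, `pollPulse`, `pulseSeen`, `pulseCount`, `SENSOR_PIN` and the `FALLING` mode are made-up names and settings for illustration. The `#else` branch only defines the attribute away so the logic can be compiled off-target.

```cpp
#ifdef ARDUINO
#include <Arduino.h>
#else
#define ICACHE_RAM_ATTR  // no-op when compiled off-target
#endif
#include <stdint.h>

// Shared between the ISR and the main loop, so both are volatile.
volatile bool pulseSeen = false;
volatile uint32_t pulseCount = 0;

// ISR: record the event and return immediately.
void ICACHE_RAM_ATTR onPulse() {
  pulseCount = pulseCount + 1;
  pulseSeen = true;
}

// Called from the main loop. Returns true if at least one pulse arrived
// since the last call and stores the running total in *countOut.
bool pollPulse(uint32_t *countOut) {
  if (!pulseSeen) return false;
  pulseSeen = false;  // clear first, so a pulse arriving now sets it again
  *countOut = pulseCount;
  return true;
}

#ifdef ARDUINO
const uint8_t SENSOR_PIN = 5;  // hypothetical input pin

void setup() {
  Serial.begin(115200);
  pinMode(SENSOR_PIN, INPUT_PULLUP);
  attachInterrupt(digitalPinToInterrupt(SENSOR_PIN), onPulse, FALLING);
}

void loop() {
  uint32_t count;
  if (pollPulse(&count)) {
    Serial.println(count);  // slow work happens here, not in the ISR
  }
}
#endif
```

The ISR touches only two variables in RAM, and anything slow (serial output, I2C, network) stays in `loop()`.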

@mkeyno what do you mean by tutorial? It is as simple as putting one keyword in front of your function; I posted an example above. Inside the interrupt, do not call any other functions (except maybe inline functions). I used it here:
Mutatio Firmware: ISR, but it may violate the 15 µs rule; the firmware does not run 100% stable and reboots sometimes. I never investigated why, since it happens rarely (every few hours or even days).
As you can see in the code, I do call another function (writeMeasurement or something) which also has the RAM attribute set.


8 participants