-
-
Notifications
You must be signed in to change notification settings - Fork 1.1k
I2C/Wire library: changes for issue #42 #107
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
…ause of a timeout
Some very strange thing just happened, did a bunch of commits to the master branch just vaporize? Now this PR has a bunch of commits in it that I have nothing to do with. Do I need to rebase this now? @lxrobotics Force-pushing to this repo feels pretty dangerous, if these commits were bad, why weren't they properly reverted instead of rewriting history? |
Hang on, what branch should I even be making this PR for? |
Sorry for the mixup 😳 Everything should be restored now. |
@lxrobotics all good! :-) Thanks for fixing it up 👍 |
Any idea when this PR will be included in the Arduino IDE distribution? Having an I2C timeout is mandatory for robust operation. |
Any news regarding when this pull request will be merged? |
Has anyone besides me run this code in practice? Would be great to get some feedback from testers (besides myself) before it's merged. Especially nice would be testers using clock stretching and multimaster buses. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks like a good change overall, but not ready to merge. I left some comments inline, including some minor improvements but also some bugs and a possible ommission.
There's one more loop that did not get a timeout in twi_readFrom()
:
do {
TWDR = twi_slarw;
} while(TWCR & _BV(TWWC));
Did you intentionally leave that one out? Is there some guarantee that that does not lock up?
Also, in my version I also handled arbitration errors by resetting the hardware, since if those occur due to noise rather than a second master, the hardware ends up waiting for a second master's stop condition indefinitely. Of course, unconditionally resetting on arbitration error is not ok, since that breaks multi-master setups. I suspect that no special handling is really needed: If an arbitration error occurs, this returns an error right now. In the background, the hardware will end up (invisibly :-S) locked, waiting for a stop condition, but that should be detected on the next transaction (which will run into a timeout on the start condition and reset the hardware). This would result in two failed transactions, but at least no lockup. Did you happen to test this scenario?
Hi @facchinm, I think the Also, before it gets merged, would you be okay with me modifying the Wire library usage examples to enable the timeout? And I would be very keen to hear from matthijskooijman before this gets merged, because he's been very involved in its development. |
@greyltc overloading (but also shadowing) it is ok I think, since it applies to a situation that cannot happen with I2C (it's normally needed by APIs like I tested this patch on the top of yours calling diff --git a/libraries/Wire/src/Wire.cpp b/libraries/Wire/src/Wire.cpp
index b446f9a..63062ea 100644
--- a/libraries/Wire/src/Wire.cpp
+++ b/libraries/Wire/src/Wire.cpp
@@ -94,7 +94,7 @@ void TwoWire::setClock(uint32_t clock)
* @param reset_with_timeout if true then TWI interface will be automatically reset on timeout
* if false then TWI interface will not be reset on timeout
*/
-void TwoWire::setWireTimeoutUs(uint32_t timeout, bool reset_with_timeout){
+void TwoWire::setTimeout(uint32_t timeout, bool reset_with_timeout){
twi_setTimeoutInMicros(timeout, reset_with_timeout);
}
@@ -103,14 +103,14 @@ void TwoWire::setWireTimeoutUs(uint32_t timeout, bool reset_with_timeout){
*
* @return true if timeout has occured
*/
-bool TwoWire::getWireTimeoutFlag(void){
+bool TwoWire::getTimeoutFlag(void){
return(twi_manageTimeoutFlag(false));
}
/***
* Clears the TWI timeout flag.
*/
-void TwoWire::clearWireTimeoutFlag(void){
+void TwoWire::clearTimeoutFlag(void){
twi_manageTimeoutFlag(true);
}
diff --git a/libraries/Wire/src/Wire.h b/libraries/Wire/src/Wire.h
index c8b9a3e..64e09d3 100644
--- a/libraries/Wire/src/Wire.h
+++ b/libraries/Wire/src/Wire.h
@@ -55,9 +55,9 @@ class TwoWire : public Stream
void begin(int);
void end();
void setClock(uint32_t);
- void setWireTimeoutUs(uint32_t, bool);
- bool getWireTimeoutFlag(void);
- void clearWireTimeoutFlag(void);
+ void setTimeout(uint32_t timeout = 25000, bool reset_with_timeout = false);
+ bool getTimeoutFlag(void);
+ void clearTimeoutFlag(void);
void beginTransmission(uint8_t);
void beginTransmission(int);
uint8_t endTransmission(void);
*/
-void TwoWire::clearWireTimeoutFlag(void){
+void TwoWire::clearTimeoutFlag(void){
twi_manageTimeoutFlag(true);
}
diff --git a/libraries/Wire/src/Wire.h b/libraries/Wire/src/Wire.h
index c8b9a3e..64e09d3 100644
--- a/libraries/Wire/src/Wire.h
+++ b/libraries/Wire/src/Wire.h
@@ -55,9 +55,9 @@ class TwoWire : public Stream
void begin(int);
void end();
void setClock(uint32_t);
- void setWireTimeoutUs(uint32_t, bool);
- bool getWireTimeoutFlag(void);
- void clearWireTimeoutFlag(void);
+ void setTimeout(uint32_t timeout = 25000, bool reset_with_timeout = false);
+ bool getTimeoutFlag(void);
+ void clearTimeoutFlag(void);
void beginTransmission(uint8_t);
void beginTransmission(int);
uint8_t endTransmission(void); |
But does the Wire.setTimeout() - which comes from the Stream - assumes that the passed argument should be in milliseconds while for I2C we would like to have it as microseconds? Would that be a bit confusing? @facchinm please also take a look at this comment: There is a conversation between @matthijskooijman, @greyltc and me about the Wire.setTimeout() I do not know how to put a direct link to this here on github. It is in the line I posted above and related to the libraries/Wire/src/Wire.h file. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
And I would be very keen to hear from matthijskooijman before this gets merged, because he's been very involved in its development.
Sorry, had been a bit busy :-)
I've left some inline remarks and suggestions (mostly docs, one code improvement that I did not actually test) on the PR, I'll respond to the rest of the discussion separately.
In the latest commit, da7d85a, I took the liberty of fixing some of what I thought were the more severe syntax issues especially in the ISR in twi.c, I hope that's alright.
I think the changes themselves are good, but I would really prefer them to be in a separate commit, to make this change easier to review. In that regard, it would probably be good to rebase this commit on top of latest master and squash it, leaving just one commit for the formatting fixes and one commit for the reset.
/*** | ||
* Returns the TWI timeout flag. | ||
* | ||
* @return true if timeout has occured |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
* @return true if timeout has occured | |
* @return true if timeout has occured since the flag was last cleared. |
* Sets the TWI timeout. | ||
* | ||
* @param timeout a timeout value in microseconds, if zero then timeout checking is disabled | ||
* @param reset_with_timeout if true then TWI interface will be automatically reset on timeout | ||
* if false then TWI interface will not be reset on timeout |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
* Sets the TWI timeout. | |
* | |
* @param timeout a timeout value in microseconds, if zero then timeout checking is disabled | |
* @param reset_with_timeout if true then TWI interface will be automatically reset on timeout | |
* if false then TWI interface will not be reset on timeout | |
* Sets the TWI timeout. | |
* | |
* This limits the maximum time to wait for the TWI hardware. If more time passes, the bus is assumed | |
* to have locked up (e.g. due to noise-induced glitches or faulty slaves) and the transaction is aborted. | |
* Optionally, the TWI hardware is also reset, which can be required to allow subsequent transactions to | |
* succeed in some cases (in particular when noise has made the TWI hardware think there is a second | |
* master that has claimed the bus). | |
* | |
* When a timeout is triggered, a flag is set that can be queried with `getWireTimeoutFlag()` and is cleared | |
* when `clearWireTimeoutFlag()` or `setWireTimeoutUs()` is called. | |
* | |
* Note that this timeout can also trigger while waiting for clock stretching or waiting for a second master | |
* to complete its transaction. So make sure to adapt the timeout to accomodate for those cases if needed. | |
* A typical timeout would be 25ms (which is the maximum clock stretching allowed by the SMBus protocol), | |
* but (much) shorter values will usually also work. | |
* | |
* In the future, a timeout will be enabled by default, so if you require the timeout to be disabled, it is | |
* recommended you disable it by default using `setWireTimeoutUs(0)`, even though that is currently | |
* the default. | |
* | |
* @param timeout a timeout value in microseconds, if zero then timeout checking is disabled | |
* @param reset_with_timeout if true then TWI interface will be automatically reset on timeout | |
* if false then TWI interface will not be reset on timeout |
@@ -373,8 +413,17 @@ void twi_stop(void) | |||
|
|||
// wait for stop condition to be exectued on bus | |||
// TWINT is not set after a stop condition! | |||
volatile uint32_t counter = twi_timeout_us/10ul; // approximate the timeout |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
volatile uint32_t counter = twi_timeout_us/10ul; // approximate the timeout | |
// We cannot use micros() from an ISR, so approximate the timeout with cycle-counted delays | |
const uint8_t us_per_loop = 8; | |
uint32_t counter = (twi_timeout_us + us_per_loop - 1)/us_per_loop; // Round up |
No need to make this volatile
AFAICS, there will not be an ISR interrupting this code and writing to counter
. Also, round up, to make sure the timeout is never shorter than configured. And also use a constant, to prevent duplicating the 10.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does it make sense to use 8 as a value for us_per_loop variable? Is division by 8 simple to perform as it is mostly like shifting the register by three bits?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yup, seems like a good suggestion to me. Suggestion updated.
continue; | ||
if(twi_timeout_us > 0ul){ | ||
if (counter > 0ul){ | ||
_delay_us(10); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
_delay_us(10); | |
_delay_us(us_per_loop); |
* Input reset: true causes this function to reset the twi hardware interface | ||
* Output none | ||
*/ | ||
void twi_handleTimeout(bool reset){ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why have a reset
parameter here? It seems it is always passed twi_do_reset_on_timeout
, so might as well use that global variable directly here?
twi_state = TWI_READY; | ||
} | ||
break; | ||
if (twi_sendStop){ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There should be a space before the {
I think.
Please don't do that (shadowing one API with another), I'm afraid that might cause confusion and needlessly entangles these two APIs. It might not cause problems, but consider some code like this:
Rather than setting the Wire timeout, as a user might expect, this would set the Stream timeout (which is unused, so effectively this is just silently throwing a way the timeout). Without shadowing, this would properly produce a compile error.
That seems like a good approach to me too. Also, an important part of this, I think, is also to:
In my review, I've added some suggestions for more documentation in function comments, which can later be moved into the reference docs and should handle the above points.
Yes, please. This would also help with the "documentation" stuff I mentioned above.
I like explicitness in this sense, making code easier to read, but maybe it's more consistent with other code without the |
The solution from @greyltc branch works nice for me. |
Any idea when this will be merged? The branch works well for me, too. |
Aaaand, it is finally merged 🙂 Thanks everyone for the involvement; we'll be updating the reference documentation in the next days, applying the same approach on other architectures too. |
For future reference, relevant commits are deea929, 38ff552 and f1fe5e8. AFAICS @facchinm has squashed all commits of this PR together, renamed @facchinm you missed this change: - _delay_us(10);
+ _delay_us(us_per_loop); (making the timeout actually slightly off now, since |
Ah right, I totally missed that suggestion 😬 Do you want to push a commit on top with that change? |
I'm in a different project now, so if you could pick it up, that would be great. |
Sure, on it |
While it is nice to have API functions to set/get a timeout, there is no way for the application code to know that they exist.
|
Good plan. Adding Also, the reference docs (https://github.com/arduino/reference-en) still need to be updated (which can probably mostly use the comments from the source files, I think). Anyone care to make these 2 pullrequests? |
On the documentation, should this be written from the perspective of the user or as information for future core developers? I'm assuming the documentation for a WIRE_HAS_TIMEOUT macro would go in constants.adoc ? There are many constants that are not currently in there like: Would it be better to create a separate documentation issue that dealt with all of these so they get fixed together as part of the same PR to fix the documentation for WIRE macro? I've been wanting to get all these macros documented for years as I've worked with several 3rd party platform developers have either left them out, used different names or even implemented them. I'll be happy to do it. I'd just like some guidance as to the best way to proceed with fixing this issue and getting the documentation updated in terms of what the PR(s) should cover. |
Just yesterday (literally) I received yet another email from someone new to Arduino/I2C who was experiencing the I2C lockup problem, asking for my help due to one of my blog posts from two years ago. As I was helping him, we both discovered that the new version (1.8.13) now contains the timeout feature (thanks again greyltc). This is wonderful news, but the fact that it isn't implemented by default doesn't help anyone - new users will still spend an infinite number of hours chasing what appears to be a hardware problem, only to discover (again) that the issue is a well-known bug in the Wire library. FWIW, I have been using the SBWire library for the last two years in all my I2C projects, and I have yet to experience any lockups, and I was getting them daily (and sometimes hourly) beforehand. The SBWire library uses a simple loopcount designed to produce an approximaely 100 uSec timeout, and it has been working fine for me in several different configurations. While I commend everyone involved for finally driving a stake through the heart of this 'problem-that-would-not-die', please understand that without a default timeout all your efforts will have been in vain, except possibly for the few experts who know enough to enable the timeout (and they have all moved to SBWire by now anyway). Frank |
Hello everyone, I 've started to try your solution, I put Wire.setWireTimeout(250,true) in the setup loop, then in the main loop, I also changed the time from 25 to 2500, but the communication with the MPU6050 is not working anymore ,( but LCD works well). So I wonder if i'm not using it in the wrong way, or maybe it's not compatible with the MPU6050. Do you have any idea of what should I try ? Anyway, than you again for all the work you are doing. |
@utis82 , maybe your MPU6050 chip is the cause of the freeze? At least now with the fix your program can go on - however communication with MPU6050 doesn't work. Try to do software reset MPU6050 after freeze - maybe this will help:
If this doesn't work then I think you need to do a hardware reset of MPU6050 by pulling its power supply off and on. You ca use a MOSFET transistor to do that. |
FWIW, I also have an MPU6050 module in my wall-following robot project, along with a host of other I2C devices, and they all seem to be perking along happily with the new Wire library. As always when dealing with hardware/software issues, it is important to simplify the problem to the point where the cause of the problem becomes obvious (often easier to say than do, I know). I would suggest setting up another Arduino controller with just the minimum software needed to communicate with the MPU6050 (not necessarily even JRowbergs stuff - maybe just an example program from wherever you got the MPU6050 module) and no other hardware at all. Keep this configuration until you have either solved the problem or convinced yourself that the MPU6050 isn't the source of the problem. Then change JUST ONE THING! and see what happens. Here are a couple of posts to give you the idea: https://www.fpaynter.com/2019/10/basic-arduino-mpu6050-gy-521-test/ |
last_status is reasonable solution to get the last status of i2c trnasmittion, but in some cases, the last_status is overwritten by calling read/write register API more than once. For instance, init(), startContinuous(), etc. Especially on Arduino IDE 1.8.13, new error code `5` (i2c timeout) is added to Wire.endTransmission() by merging this PR arduino/ArduinoCore-avr#107 . You can distinguish what error occurs and use them depending on the situation.
I2C comms currently seem unusable because the Arduino master may get stuck in one of several while loops during normal usage. This locks up execution with no way to recover (besides having a watchdog timer), see issue #42.
This PR uses @wmarkow's approach of putting timers in all the places we might get stuck while handling the I2C bus, then resetting the interface if one of those timers expires.
I've modified @wmarkow's approach in the following ways:
setClock()
orbegin()
(with an address argument)I've tested this code in my application and it works fine there and prevents my code from locking up (with the stock Wire library it would lock up about once per 200,000 I2C messages).
This adds one new user-facing function to the Wire library:
setTimeoutInMicros()
which takes a uint16 microsecond value. If this function is not called, or called with0
, Wire acts like it previously did, otherwise the I2C interface will be reset if it spends more than the given number of microseconds waiting for some change from an ISR.This timeout approach might be incompatible with clock stretching and a multimaster bus, but it sure is better than an unusable I2C master. Of course, any previous Wire library functionality is maintained if
setTimeoutInMicros()
is not called.