-
-
Notifications
You must be signed in to change notification settings - Fork 7k
I2C/TWI library: Race Condition with Repeated STARTs #2173
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Comments
Here's the modified files. edit: right, so if you try to recreate this, please remove the fix. You won't get the issue if you don't. |
@earlyprogrammer thanks for the explanation and suggestion! I was able to reproduce and test out the patch using a slightly modified Master sketch, as is Slave sketch from: http://forum.arduino.cc/index.php?topic=172296.msg1289288#msg1289288 I'll prepare a pull request with your suggestion shortly. Master // TEST_i2c_Master
// Arduino 1.0.5
// http://forum.arduino.cc/index.php?topic=172296.0
#include <Wire.h>
void setup()
{
Serial.begin(115200);
Serial.println("Start");
Wire.begin();
}
unsigned long okcount=0, errcount=0;
void loop()
{
Wire.beginTransmission(0x2B);
Wire.write(0x11); // for the (simulated) register address.
byte error = Wire.endTransmission(false); // do not stop, to trigger repeated start
if( error != 0) {
Serial.print("endTransmission error = ");
Serial.print(error);
errcount++;
} else {
Wire.requestFrom( 0x2B, 4, true); // release bus after this
byte avail = Wire.available();
if (avail != 4) {
Serial.print("avail = ");
Serial.print( avail);
errcount++;
} else {
char buffer[10];
buffer[0] = Wire.read();
buffer[1] = Wire.read();
buffer[2] = Wire.read();
buffer[3] = Wire.read();
if( buffer[0] != 'A' || buffer[1] != 'B' || buffer[2] != 'C' || buffer[3] != 'D') {
Serial.print("bad data : ");
buffer[4] = '\0';
Serial.print(buffer);
errcount++;
} else {
Serial.print("good response");
okcount++;
}
}
}
Serial.print(", ok=");
Serial.print(okcount);
Serial.print(", err=");
Serial.print(errcount);
Serial.println();
} |
as suggested by @earlyprogrammer in arduino#2173, to ensure TWDR value is set if there is a write collision
as suggested by @earlyprogrammer in #2173, to ensure TWDR value is set if there is a write collision
Closed via #3742 |
I believe I've identified a race condition with repeated STARTS in the TWI library.
(As Master, doesn't seem related to the TWI slave reports I found searching online)
I'm using a modified version of Wire and TWI, I'll discuss my changes later.
The Flow:
Instead of sending a stop, send a repeated start. Note: TWIE is set to 0, so the ISR should not get called when the hardware flags that sending the START has completed.
(ISR)
Next, the flow skips the wait loop, believing that the state is READY.
(readFrom or writeTo)
Finally, put the address in TWDR, then initiate the transmission and reenable the interrupts
(readFrom or writeTo)
CASE 1: intended
The hardware finishes the repeated start before
TWDR = twi_slarw;
Hardware sets TWINT to 0, but because TWIE is disabled, ISR will not be called.
Address will be put in the TWDR, then the hardware will be initiated by setting TWINT to 1.
CASE 2: works
The hardware finishes the repeated start after
TWCR = _BV(TWINT) | _BV(TWEA) | _BV(TWEN) | _BV(TWIE);
The attempt to
TWDR = twi_slarw;
broke because TWINT was still 1, flagging TWWC.The
TWCR = _BV(TWINT) | _BV(TWEA) | _BV(TWEN) | _BV(TWIE);
was presumably ignored because the hardware was already working on a request.Note: TWIE has been set.
When the hardware returns, the ISR will be called because interrupts were enabled.
leading to...
twi_reply(1) ==
TWCR = _BV(TWEN) | _BV(TWIE) | _BV(TWINT) | _BV(TWEA);
meaning the address is put in TWDR, then hardware initiated, just like CASE 1.
This will also clear TWWC, meaning no error is noticed.
It works, so that's fine.
CASE 3: broken
The hardware finishes the repeated start between cases 1 and 2.
The attempt to
TWDR = twi_slarw;
broke because TWINT was still 1, flagging TWWC.Hardware returns, and because ISR was disabled when the repeated start was constructed, the TWINT flag is ignored as it comes in.
Finally,
TWCR = _BV(TWINT) | _BV(TWEA) | _BV(TWEN) | _BV(TWIE);
runs, starting the hardware.But the TWDR was never updated, so the wrong data gets sent (whatever was previously in the TWDR).
TWWC will also stay flagged.
My sniffer would notice an address nack once in about 50 repeat starts. (the address is the first byte sent after the repeated start, so that's the one that is sent incorrectly, and there was no slave at that wrong address.)
My Fix:
I ended up adding this...
polling TWWC until it accepts the TWDR change, effectively removing cases 2 and 3.
(I tried polling TWINT with
while (TWCR & _BV(TWINT));
before updating the data register, but I kept hitting an infinite loop. Not sure why, not completely relevant, but if anyone knows, I am curious.)This fix doesn't quite fit the nice interrupt driven design, but I was having trouble fixing it in a good way.
The end, I'd love to hear from you, hope this was understandable and accurate.
PS: Why I was exploring the TWI and Wire libraries.
We wanted to fit PMBus protocol, meaning we needed 256 byte block write/read. changing the static buffers in Wire and TWI to 256 increases the static memory load from 32_5 to 256_5 > 1250 bytes, a big hit while using the uno.
We don't intend to fully use the asynchronous capabilities of the TWI library, and Wire blocks anyway, so I hacked out the buffers, made TWI write to a buffer created by the application. I also removed most of the slave code (using as demo board for other products, only master is needed.)
I'll attach those files as soon as I figure out how. The simplified versions might be necessary to consistently recreate the issue. I'm hoping a theoretical conversation will be useful either way though.
The text was updated successfully, but these errors were encountered: