-
-
Notifications
You must be signed in to change notification settings - Fork 7k
HardwareSerial::flush hangs on high speed communications. #1742
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Comments
Nice job tracking down this issue! However, the issue you found is a duplicate of #1456, so it was already debugged before (and my pull request #1711 fixes this). It is interesting that you say your improved code makes things faster. Are you saying the queueing of the 9 bytes is now faster, or also the actual transmission. Or, thinking about it, that is basically the same because 1 Mbit/s (== 1us/bit == 10us/byte) the queuing is (currently) slower than the sending. Perhaps you could test my pullrequest to see if this fixes your issue and what the speed of it is? If it's still slower than 87us (or thereabouts), I'm happy to add another patch to my pullrequest to improve the speed, if possible. |
Thanks, yes, it is the same issue. Will take a closer look at the code in your pull request. Need to figure out the easiest way to try out your code, as I am not currently enlisted... Looks like it is more involved change over several files. There are a few things in your changes that I need to look up. Things like the ATOMIC_BLOCK with a nested NONATOMIC_BLOCK, will it restore the atomic part once it leaves the NONATOMIC_BLOCK... My first look through your changes may fix the issue. Just trying to decide if you need that much of the code set in ATOMIC blocks. In fact wondering if you simply need it around these two lines in the write function:
Also a minor thing, but in your HardwareSerial1,2,3 files, don't think you need all of those conditional names for those interrupt vectors. Again will try to figure out good way to get the necessary files to run with 1.5.5 or the nightly build |
The easiest way to get at the code is to clone the repository and get my branch: https://github.com/matthijskooijman/Arduino/tree/ide-1.5.x-hardware-serial You can clone my repository directly:
And then I think you can do
to get the code from my branch. You can then either build the entire IDE and use the git version, or just copy the "hardware/arduino" folder from git and replace it inside the 1.5.5 IDE you have already downloaded. |
Thanks, I did the clone to my machine and then copied the appropriate files to my 1.5.5 IDE (HardwareSerial_._, Arduino.h, usbapi.h). Note when I tried to compile my program, it would not compile as it said _NOP was not defined... To fix, I updated your definition in Arduino.h From:
To:
There was a _ missing from the start of the name, as well as the asm was missing an _ The good news is then I ran my test program and so far I have not hung the board. However the output performance is about the same as the original code base. 148us for the 9 bytes. I tried to include images showing the different Logic Analyzer outputs of a packet, but this editor keeps saying it does not sport my files (tried both .png and .jpg). If you are interested, I did post information about this up on the Trossen forum: |
Hmm, I wonder where those underscores went... Thanks for pointing that out, I'll make sure I fix that in the pullrequest (my local toolchain has a newer libc that already defines _NOP, so it didn't error out here...). As for the performance, I guess that means there is still some room for improvement in the code :-) I haven't got time right now to look at your code in detail and compare it with mine, but I certainly intend to do so later. |
Thanks, This avoids the interrupt overhead. My quick look at the output from avr_objdump -D of the generated elf file shows that the interrupt handler (__vector_37) uses about 86 clocks to handle the interrupt (16 push, 16 pop, ...) plus the actual code in Serial1._tx_udr_empty_irq. I have not counted the clocks in that function yet, but ... But I will be happy with a release that does not hang! Good work |
Ah, I see that now, yes. I guess we could do the same improvement on top of my code to get the speed gain. With your fix, do you get the improved speed consistently? Or are there sometimes a few bits that are slower? With your fix, I would expect the queuing of bytes (e.g. calls to write()) to happen faster than the actual sending of bytes. This means that eventually (and probably after the first couple of bytes already), the direct queueing of bytes cannot happen anymore because the hardware is transmitting one byte and has the next byte queued in UDRE already. When this happens, interrupts will be enabled, which should cause all bytes sent from that point onwards to be slower? But perhaps the extra slowness causes the buffer to be depleted, interrupts disabled and allows a few more bytes to be queued directly again. Also, perhaps the fact that the hardware has two queue bytes could maks this variation in queueing latency so all bytes still are sent at the nominal baudrate? Do you see what I mean here? |
I applied your shortcut to my code and found out that at 1Mbit/s, the queueing of new bytes is exactly fast enough to keep the buffer full (effectively, the UART is just a slight bit faster than the incoming new bytes). However, if I lower the baudrate to 0.9Mbit/s, I can see the behaviour I described above: the supply of bytes catches up with the UART, the interrupt gets enabled and the next byte gets delayed. However, this is just one interupt, which then disables itself again and a few more bytes can be sent back-to-back (I think this is because the interrupt slows down the UART byte supply, but also halts the supply of new bytes. Because the UART can send bytes while the interrupt is running, the UART can catch up). If the bitrate is lowered further, the interrupt gets enabled sooner and more often. At 0.5Mbit/s, the interrupt overhead no longer prevents the full bitrate from being attained. At 0.3Mbit/s, the interrrupt remains enabled all the time. At this speed (and higher speeds), the interrupt overhead and queuing of new bytes still takes up pretty much all the CPU time. At lower speeds, the (relative) overhead decreases to reasonable proportions (though this is including some digitalWrite calls for debugging, so the practical values might be a bit different). This shows that, at speeds from 0.5Mbit to 1Mbit, it could be beneficial to skip the interrupt altogether - since we're using pretty much 100% CPU time already we won't lose any CPU time and gain a bit of serial transmission speed. However, I'm not sure if we can automatically detect this case and I don't reallly like hardcoding a threshold for the bitrate. Also, it's probably not a big deal if the speed is a bit less that the configured speed, especially if at 1Mbit/s the full speed is attained. So much for my observations :-) |
Thanks, It looks like it is performing like I would expect. At the fastest speed it can more or less completely bypass the interrupt handler. At some mid speeds it will be able to stuff a few out every so often and at lower speeds it will output almost everything using the interrupt handler. Will be interesting to try it out with some real world programs. For example may try with one of my Hexapod robots that talk to an SSC-32 controller at 115200. I suspect that most output will go through interrupt handler, but the code does do some computation for each servo before it outputs the position of it, so maybe a few of the characters will be output directly. Again thanks! Thanks for testing this out and your work on this! |
Warning: WIP - I am using 1.5.5 and/or Daily Build. Also working on fixes to Hardware Serial on Arduino. This is a work in progress. Requires changes to Arbotix files. Information in issues: vanadiumlabs/arbotix#4 Also more information up on TrossenRobotics Forum: http://forums.trossenrobotics.com/showthread.php?5786-Arbotix-on-Arduino-1.5(will-be-1.0.2)-IDE/page2 More Information about Hardware Serial issues on 1.5.5 are up on their github: arduino/Arduino#1742 Also in this first pass, added the ability to define for Quad instead of Hex. Note: Has code to bypass HardwareSerial write method which I am testing currently. Later if Arduino fixes issue can turn off and use ::write and ::flush methods.
Hi there; Any solution to fix that in old version ? Thank you for your help |
For the fun of it I started experimenting running some Dynamixel AX-12 servos connected up to an Arduino Mega (1280), with the standard Hardware Serial class, that externally I do a few tweaks to make it work at half duplex. The AX-12 interface runs at the baud rate of 1000000. I was able to get it to start working, but it would hang the processor on a pretty frequent basis. By instrumenting my code by bracketing some code fragments with digitalWrite statements and the use of my Logic Analyzer, I found the hang was in the flush function. So I took a look over the code in HardwareSerial.cpp, I found a race condition in the write function. The problem is that you clear the TXC condition, which at very high baud rates may happen after the last byte was output and the state was already set. Then when you call off to flush, the state will never be set again and you are hung in the loop.
The code also relies on every byte being output from the Interrupt hander, which for high baud rates will slow down. (Tried to add image, but would not take JPG file).
So I rewrote the write function:
So far I have not had an issue with this updated code. Also from the logic Analyzer of the AX-12 buss, the output of a 9 byte packet went from taking about 144.5us to about 87us.
The text was updated successfully, but these errors were encountered: