Skip to content

Fix serial monitor output cut short with multibyte (UTF-8) data #9809

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 3 commits into from
Mar 24, 2020

Conversation

matthijskooijman
Copy link
Collaborator

This fixes #9808, detailed description in the commit messages.

This seems like a class that could use some automated testing, but I did not see any existing tests (none at all in arduino-core, only in app), nor did I find any meaningful documentation about how to run tests (I only found "ant test", which runs all tests and does not show a useful summary to tell if any tests failed AFAICS). Hence, no tests.

@matthijskooijman
Copy link
Collaborator Author

Also, I realize that this is a performance-critical bit of code (there has been quite some testing in the past to ensure that high-bandwidth serial streams from native USB boards are properly handled), but I think none of these changes are significant in that sense (the process is still the same, except that in some cases, there might be one extra loop to process any leftover data).

Also note that the decode method returns some status about why it has terminated (underflow of input, overflow of output), which might be able to detect the "there is only a partial multi-byte character left in the input, so no point in decoding that now"-case earlier (with my code, this is detected by decoding it and noticing that no output is produced, which requires one extra loop iteration). However, the documentation was not clear enough for me to really see how to detect this case, so I just went for the easy route (especially since this is also quite the corner case, I think).

@facchinm facchinm added Component: IDE Serial monitor Tools > Serial Monitor Type: Improvement This proposal is considered to be especially beneficial labels Feb 27, 2020
@facchinm facchinm added this to the Release 1.8.13 milestone Feb 27, 2020
cmaglie and others added 3 commits March 24, 2020 12:16
This makes the code slightly more compact and easier to read.
This fixes a problem with the Serial UTF-8 decoder. This decoding moves
data from char[] buf, into a ByteBuffer inFromSerial, then decodes them
into a CharBuffer outToMessage and converts to a char[] to pass on.

When the buf read contained just over a full buffer worth of bytes and
contained some multi-byte characters, a situation could arise where two
decodes were needed to fill up outToMessage, leaving some data in
inFromSerial. If in this case no data would be left in buf, decoding
would stop until more data came in from serial.

This commit fixes this problem by:
 - Changing the outer loop to continue running when buf is empty, but
   inFromSerial is not.
 - Changing the inner loop to run at least once (so it runs when buf is
   empty, but inFromSerial is no).
 - Breaking out of the outer loop when no characters were produced (this
   handles the case where only an incomplete UTF-8 character remains in
   inFromSerial, which would otherwise prevent the loop from
   terminating.
 - Removes a `if (outToMessage.hasRemaining()` check that is now
   necessarily true if the break was not done.

This fixes arduino#9808.
@cmaglie cmaglie force-pushed the fix-serial-lingering-bytes branch from 1f3819c to f5296a2 Compare March 24, 2020 11:24
@arduino arduino deleted a comment from ArduinoBot Mar 24, 2020
@cmaglie
Copy link
Member

cmaglie commented Mar 24, 2020

I've added the missing unit test, the patch LGTM, I'll merge as soon as the CI pass.

@cmaglie cmaglie self-assigned this Mar 24, 2020
@cmaglie cmaglie merged commit a1d6da9 into arduino:master Mar 24, 2020
@matthijskooijman
Copy link
Collaborator Author

Test looks good, thanks!

@matthijskooijman matthijskooijman deleted the fix-serial-lingering-bytes branch March 28, 2020 16:22
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Component: IDE Serial monitor Tools > Serial Monitor Type: Improvement This proposal is considered to be especially beneficial
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Serial monitor output cut short with multibyte (UTF-8) data
4 participants