Serial monitor output cut short with multibyte (UTF-8) data #9808
matthijskooijman added a commit to matthijskooijman/Arduino that referenced this issue Feb 26, 2020
This fixes a problem with the Serial UTF-8 decoder.

This decoding moves data from char[] buf into a ByteBuffer inFromSerial, then decodes it into a CharBuffer outToMessage and converts that to a char[] to pass on.

When the buf read contained just over a full buffer worth of bytes and contained some multi-byte characters, a situation could arise where two decodes were needed to fill up outToMessage, leaving some data in inFromSerial. If in this case no data was left in buf, decoding would stop until more data came in from serial.

This commit fixes this problem by:
- Changing the outer loop to continue running when buf is empty, but inFromSerial is not.
- Changing the inner loop to run at least once (so it runs when buf is empty, but inFromSerial is not).
- Breaking out of the outer loop when no characters were produced (this handles the case where only an incomplete UTF-8 character remains in inFromSerial, which would otherwise prevent the loop from terminating).
- Removing an `if (outToMessage.hasRemaining())` check that is now necessarily true if the break was not done.

This fixes arduino#9808.
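The four changes listed in the commit message can be sketched roughly as below. This is a standalone illustration, not the code from the commit: the class name `FixedLoopSketch`, the `shown` collector (standing in for the call that hands text to the serial monitor), the helper accessors, and the 128-element buffer sizes are all assumptions made for the example. Only the buffer names and the loop changes come from the text above.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.StandardCharsets;

// Hypothetical stand-in for the serial handler; not the real Serial.java.
class FixedLoopSketch {
  // Assumed capacities; the real sizes are not stated in the commit message.
  private final ByteBuffer inFromSerial = ByteBuffer.allocate(128);
  private final CharBuffer outToMessage = CharBuffer.allocate(128);
  private final CharsetDecoder bytesToStrings = StandardCharsets.UTF_8.newDecoder();
  // Stands in for the serial monitor: collects everything "displayed".
  private final StringBuilder shown = new StringBuilder();

  void serialEvent(byte[] buf) {
    int next = 0;
    // Change 1: keep running while undecoded bytes remain in inFromSerial.
    while (next < buf.length || inFromSerial.position() > 0) {
      // Change 2: do-while, so one decode happens even when buf is used up.
      do {
        // copyNow may be 0 when buf has already been consumed.
        int copyNow = Math.min(buf.length - next, inFromSerial.remaining());
        inFromSerial.put(buf, next, copyNow);
        next += copyNow;
        inFromSerial.flip();
        bytesToStrings.decode(inFromSerial, outToMessage, false);
        inFromSerial.compact(); // undecoded bytes move to the front
      } while (next < buf.length && outToMessage.hasRemaining());

      // Change 3: no output means only an incomplete character is left.
      if (outToMessage.position() == 0) {
        break;
      }

      // Change 4: no hasRemaining() check; the break above guarantees output.
      outToMessage.flip();
      shown.append(outToMessage.toString());
      outToMessage.clear();
    }
  }

  String shown() { return shown.toString(); }
  int pendingBytes() { return inFromSerial.position(); }

  public static void main(String[] args) {
    FixedLoopSketch sketch = new FixedLoopSketch();
    // 'A' followed by only the first byte of the two-byte character U+00B0.
    sketch.serialEvent(new byte[] {'A', (byte) 0xC2});
    System.out.println(sketch.shown() + " / pending " + sketch.pendingBytes());
    // The second byte arrives later and completes the character.
    sketch.serialEvent(new byte[] {(byte) 0xB0});
    System.out.println(sketch.shown() + " / pending " + sketch.pendingBytes());
  }
}
```

The `main` method exercises the incomplete-character case from the third bullet: the lone lead byte stays in `inFromSerial` without making the loop spin, and is decoded once its continuation byte arrives.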
The same sketch can be used to reproduce this on an MKRZERO as well. I've submitted a PR to fix this.
cmaglie pushed a commit to matthijskooijman/Arduino that referenced this issue Mar 24, 2020
cmaglie pushed a commit that referenced this issue Mar 24, 2020
I noticed an issue with the serial monitor, where some data might get stuck and not be displayed in the monitor as expected.
To reproduce, run the following sketch (this probably works on any native-USB Arduino; I tried it on an STM32 board that I had at hand):
This prints just over 128 bytes of data, with a multi-byte character in the first 128 bytes. Then it waits a bit, and prints one more byte.
When running this, the serial monitor will first print:
Then, after 10 seconds, it adds the missing part plus the X:
For this to occur, the Arduino must send its data quite fast (hence the native USB requirement). I reproduced this once on a Uno at 2 Mbps, but could not do so reliably.
The problem is in the code that converts data from bytes to chars, in Arduino/arduino-core/src/processing/app/Serial.java, lines 177 to 195 in b40f54a.
This code reads an arbitrary amount of data using `port.readBytes` into `buf`. It then puts as many bytes as possible into `inFromSerial`, which is then converted to chars that are stored into `outToMessage`, continuing until `outToMessage` is full. That buffer is then passed on to the serial monitor, and the whole process repeats until `buf` is fully read.

However, in the case simulated with this sketch:
- `buf` contains 132 bytes (12 x `0123456789` + `°0123456789`; note that `°` is 2 bytes).
- The first 128 bytes are copied into `inFromSerial`, which are decoded into 127 characters in `outToMessage`.
- The remaining 4 bytes are copied into `inFromSerial`, but just one is decoded into `outToMessage` because then `outToMessage` is full.
- The last 3 bytes (`789`) are still in `inFromSerial`.
- The outer loop then checks `next < buf.length`, which is false (since everything has been read from `buf`), so processing stops.
- Once the `X` is sent, that is copied into `inFromSerial` as well, where the `789` are still waiting, and everything together is decoded and displayed.

To fix this, processing should probably just continue until `inFromSerial` is empty (or actually until no bytes can be read anymore, since there might be a partial UTF-8 character left in `inFromSerial` that must be left there until the rest is received).
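The walkthrough above can be replayed outside the IDE with a small simulation. This is a sketch reconstructed from the description in this issue, not a copy of Serial.java: the class name `StallDemo`, the `shown` collector that stands in for the serial monitor, the helper methods, and the 128-element capacity of both buffers (inferred from the 128-byte and 127-character figures) are assumptions.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.StandardCharsets;

// Hypothetical stand-in for the serial handler; not the real Serial.java.
class StallDemo {
  // Assumed capacities, inferred from the 128/127 figures in the walkthrough.
  private final ByteBuffer inFromSerial = ByteBuffer.allocate(128);
  private final CharBuffer outToMessage = CharBuffer.allocate(128);
  private final CharsetDecoder bytesToStrings = StandardCharsets.UTF_8.newDecoder();
  // Stands in for the serial monitor: collects everything "displayed".
  private final StringBuilder shown = new StringBuilder();

  // Loop shape as described in this issue: stop as soon as buf is used up.
  void serialEvent(byte[] buf) {
    int next = 0;
    while (next < buf.length) {
      while (next < buf.length && outToMessage.hasRemaining()) {
        int copyNow = Math.min(buf.length - next, inFromSerial.remaining());
        inFromSerial.put(buf, next, copyNow);
        next += copyNow;
        inFromSerial.flip();
        bytesToStrings.decode(inFromSerial, outToMessage, false);
        inFromSerial.compact(); // undecoded bytes move to the front
      }
      outToMessage.flip();
      shown.append(outToMessage.toString());
      outToMessage.clear();
    }
  }

  String shown() { return shown.toString(); }
  int pendingBytes() { return inFromSerial.position(); }

  // 12 x "0123456789" followed by "°0123456789": 132 bytes, 131 chars.
  static byte[] sampleBytes() {
    StringBuilder sb = new StringBuilder();
    for (int i = 0; i < 12; i++) sb.append("0123456789");
    sb.append("\u00B00123456789");
    return sb.toString().getBytes(StandardCharsets.UTF_8);
  }

  public static void main(String[] args) {
    StallDemo demo = new StallDemo();
    demo.serialEvent(sampleBytes());
    // Prints: 128 chars shown, 3 bytes stuck
    System.out.println(demo.shown().length() + " chars shown, "
        + demo.pendingBytes() + " bytes stuck");
    // The late 'X' flushes the stuck bytes along with itself.
    demo.serialEvent("X".getBytes(StandardCharsets.UTF_8));
    System.out.println(demo.shown().substring(128)); // Prints: 789X
  }
}
```

The first event leaves `789` in `inFromSerial` because the outer loop only looks at `buf`; the bytes surface only when the next event arrives, which matches the 10-second delay seen in the serial monitor.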