Skip to content

USBSerial loosing characters on Linux #478

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
ktand opened this issue Mar 23, 2019 · 8 comments · Fixed by #480
Closed

USBSerial loosing characters on Linux #478

ktand opened this issue Mar 23, 2019 · 8 comments · Fixed by #480
Labels
bug 🐛 Something isn't working

Comments

@ktand
Copy link
Contributor

ktand commented Mar 23, 2019

Describe the bug
I'm using the Marlin 3d-printer firmware on the Armed 3D-printer board (STM32F407) together with Octoprint running on a Raspberry PI 3 A+. Octoprint request temperature status to be sent from the printer to the Octoprint host on a reqular basis (every 2nd second). A temperature status message looks like this:

T:23.01 /0.00 B:22.88 /0.00 @:0 B@:0

The problem is that sometimes one or more characters of this message is lost. I have tested without Octoprint, using minicom, and the character loss is there as well so this is not a problem with Octoprint.

 T:96.18 /0.00 B:61.56 /0.00 @:0 B@:0
 T:95.66 /0.00 B:61.47 /0.00 @:0 B@:0
 T:90.86 /0.00 B:60.41 /0.00 @:0 B@:0
 T:90.41 /0.00 B:60.34 /0.00 @:0 B@:0
 T:89.93 /0.00 :60.26 /0.00 @:0 B@:0 <-- character lost
 T:88.13 /0.00 B:.85 /0.00 @:0 B@:0 <-- characters lost
 T:87.70 /0.00 B:59.71 /0.00 @:0 B@:0
 T:87.20 /0.00 B:59.59 /0.00 @:0 B@:0
 T:85.52 /0.00 :59.22 /0.00 @:0 B@:0 <-- character lost
 :96.66 /0.00 B:61.65 /0.00 @:0 B@:0 <-- character lost
 T:85.10 /0.00 B:.88 /0.00 @:0 B@:0 <-- characters lost
 T:84.63 /0.00 B:58.76 /0.00 @:0 B@:0

The problem exists in the master branch but not in release 1.5 (I'm currently testing with master @ commit 7feeb09)

The message is composed by calls to both string and single character Stream functions

What is strange is that on Windows no characters are lost.

To Reproduce
I have tried, but failed, to reproduce this in a test program by sending the message above every 2nd second.

Could this be a timing issue? Marlin uses multiple timer interrupts.

I understand this is not easy to investigate but I'll be happy to help test if someone got an idea what could be causing this.

Board (please complete the following information):

  • Name: Armed
@fpistm
Copy link
Member

fpistm commented Mar 24, 2019

Hi @ktand
If issue is not met with 1.5.0, I do not think it is related to timing issue (at least at this time).
Some changes have been done on USB. I think it should be fine to find which commit or sets of commits bring this issue.
Could you made tests to find it (them)?
I guess you should firstly try before the F4 HAL update and then after.
If it is OK after HAL update then, we could look at USB one:
#444
#456
#466

@fpistm fpistm added bug 🐛 Something isn't working waiting feedback Further information is required labels Mar 24, 2019
@ktand
Copy link
Contributor Author

ktand commented Mar 24, 2019

Hi @fpistm

Sure! I will work my way from 1.5.0 and up.

@ktand
Copy link
Contributor Author

ktand commented Mar 24, 2019

@fpistm @ghent360 @makarenya

I have found the commit that introduces this problem and also what causes it. The commit is @eea20cec and the problem is in the CDC_connected() function:

bool CDC_connected()
{
  uint32_t transmitTime = 0;
  if (transmitStart) {
    transmitTime = HAL_GetTick() - transmitStart;
  }
  return hUSBD_Device_CDC.dev_state == USBD_STATE_CONFIGURED
         && transmitTime < USB_CDC_TRANSMIT_TIMEOUT
         && lineState;
}

What happens is that when the if (transmitStart) condition is evaluted, transmitStart may be non-zero causing the conditional statement to be executed. But, sometimes, when the next line, transmitTime = HAL_GetTick() - transmitStart;, is executed the transmitStart might be zero (I guess it's because of a call to USBD_CDC_Transferred() from the USB-interrupt when the last packet is transferred) and then transmitTime will be equal to HAL_GetTick() which is way more than USB_CDC_TRANSMIT_TIMEOUT and the function returns FALSE.

Another thing, not directly related to this bug: If there is another interrupt delaying execution and causing the transmitTime to be > USB_CDC_TRANSMIT_TIMEOUT, which is defined as 3 ticks, then we'll loose another character or string... Maybe the USB_CDC_TRANSMIT_TIMEOUT should be much higher.

@ktand
Copy link
Contributor Author

ktand commented Mar 24, 2019

A simple workaround that fixes the issue of lost characters could be to read transmitStart only once:

bool CDC_connected()
{
  uint32_t transmitTime = transmitStart;
  if (transmitTime) {
    transmitTime = HAL_GetTick() - transmitTime;
  }
  return hUSBD_Device_CDC.dev_state == USBD_STATE_CONFIGURED
         && transmitTime < USB_CDC_TRANSMIT_TIMEOUT
         && lineState;
}

But since I'm not exactly sure how the timeout is supposed to work I'm not sure it's the right solution. It does however fix the character drop for me.

Not sure why this only happens on Linux and not on Windows....

@ghent360
Copy link
Contributor

I was reading your first e-mail and was thinking we need a local variable to save the value, so your proposal is spot on.

Re:timeout - this value depends how often the host reads data over the USB connection. Clearly there are some differences in the serial driver implementations between Windows and Linux. In theory you need much bigger value with low boud rates. For example if you are emulating 9600bps serial, a full USB packet of 64 bytes would take (64*10/9600) or ~67ms to transmit, so the timeout should be 100 or thereabout.

ghent360 added a commit to ghent360/Arduino_Core_STM32 that referenced this issue Mar 25, 2019
@fpistm
Copy link
Member

fpistm commented Mar 25, 2019

Just a quick note to avoid any confusion HAL tick is 1ms. So USB_CDC_TRANSMIT_TIMEOUT is 3ms.
And it can be redefined using the build_opt.h.
Issue using transmit start could be rollover of the register. I know it's very high but this can happen :)
(49d 17h 2m 47s 295 ms)
Anyway this should not happen has HAL_Tick always increment.

@ktand
Copy link
Contributor Author

ktand commented Mar 25, 2019

@ghent360

Re:timeout - this value depends how often the host reads data over the USB connection. Clearly there are some differences in the serial driver implementations between Windows and Linux. In theory you need much bigger value with low boud rates. For example if you are emulating 9600bps serial, a full USB packet of 64 bytes would take (64*10/9600) or ~67ms to transmit, so the timeout should be 100 or thereabout.

Is that really true? Doesn't a USB packat transfer take the same time regardless of baudrate?

@fpistm
Copy link
Member

fpistm commented Mar 25, 2019

@ktand
I think baudrate is useless in VCP.

fpistm pushed a commit that referenced this issue Mar 25, 2019
ghent360 added a commit to ghent360/Arduino_Core_STM32 that referenced this issue Mar 25, 2019
ghent360 added a commit to ghent360/Arduino_Core_STM32 that referenced this issue Mar 25, 2019
fpistm added a commit that referenced this issue Mar 26, 2019
Signed-off-by: Frederic.Pillon <[email protected]>
benwaffle pushed a commit to benwaffle/Arduino_Core_STM32 that referenced this issue Apr 10, 2019
benwaffle pushed a commit to benwaffle/Arduino_Core_STM32 that referenced this issue Apr 10, 2019
Signed-off-by: Frederic.Pillon <[email protected]>
@fpistm fpistm removed the waiting feedback Further information is required label Nov 6, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug 🐛 Something isn't working
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants