Reinitialize serial buffer on every read to avoid concurrent rewrite #481

facchinm · 2019-11-07T09:37:27Z

buffered_ch was being rewritten with data being read on a later Read()

Attention: this implementation "leaks" (ch will be garbage collected every time)
Please profile the memory usage before merging


matteosuppo · 2019-11-07T10:04:50Z

I'm letting it run reading the serial monitor without issues or memory growing. Looks fine to me

facchinm · 2019-11-07T11:21:01Z

Did you try some VERY FAST print like analogReadSerial sketch?

matthijskooijman · 2019-11-07T11:31:05Z

@facchinm can you elaborate on how this fix works exactly? From looking at the code, I cannot see why the original code would be wrong, or why the new code really changes anything (admittedly, I do not remember all details around how golang's slices and arrays work exactly).

facchinm · 2019-11-07T11:39:00Z

@matthijskooijman me neither, but what I experienced was:

if runeValue == utf8.RuneError {
  fmt.Println(ch[i:n]) // <- this holds the right values
  buffered_ch.Write(ch[i:n])
  break
}

... after Read() is executed again, roughly 1 time out of 10 ...

if err == nil {
  fmt.Println(buffered_ch.Bytes()) // <- now it holds ch[:previous_n]
  ch = append(buffered_ch.Bytes(), ch[:n]...)
  n += len(buffered_ch.Bytes())
  buffered_ch.Reset()
}

I thought it was a concurrency issue but it may be more subtle (it surely is)

smellai · 2019-11-07T11:42:54Z

as a reference, expected output (obtained from desktop IDE):

Create Agent output:

Sketch code

matthijskooijman · 2019-11-07T11:47:26Z

fmt.Println(buffered_ch.Bytes()) // <- now it holds ch[:previous_n]

Ah, so you're saying that this holds the bytes from the current ch, indexed by the previous n?

I suspect that this:

buffered_ch.Write(append(ch[i:n]))

does not actually copy values into buffered_ch, but lets buffered_ch point to the same underlying storage as ch. I guess the append() here is intended to produce a copy (but fails). I guess it might be better to fix that here rather than creating a new ch, though I can see why a new ch also works.

However, looking at the docs for bytes.Buffer.Write, it does seem that that is actually intended to copy values, not create a reference: https://golang.org/pkg/bytes/#Buffer.Write
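To make the two hypotheses above concrete, here is a minimal sketch (not taken from the agent's code; the function names are made up for illustration). It checks that bytes.Buffer.Write stores its own copy, and separately that a slice built with append(buf.Bytes(), ...) can end up sharing storage with the buffer when the buffer has spare capacity. Whether this is what actually happened in the agent is not confirmed anywhere in this thread.

```go
package main

import (
	"bytes"
	"fmt"
)

// writeCopies reports whether Buffer.Write stored its own copy of src.
func writeCopies() bool {
	var buf bytes.Buffer
	src := []byte{0xE2, 0x82}
	buf.Write(src)
	src[0] = 0x00 // mutate the source after the write
	return buf.Bytes()[0] == 0xE2
}

// bytesAliases reports whether a slice built by appending to buf.Bytes()
// ends up sharing storage with buf. This relies on the buffer having spare
// capacity, which is an implementation detail of the standard library.
func bytesAliases() bool {
	var buf bytes.Buffer
	buf.Write([]byte{0xE2})             // one pending byte
	ch := append(buf.Bytes(), 'x', 'y') // may reuse the buffer's backing array
	buf.Reset()
	buf.Write([]byte{'Q'}) // a later buffer write
	return ch[0] == 'Q'    // true if ch sees the buffer's write
}

func main() {
	fmt.Println("Write copies:", writeCopies())
	fmt.Println("append(Bytes(), ...) aliases:", bytesAliases())
}
```

If the second check reports true, then a later Read(ch) into such a slice would also overwrite whatever buffered_ch holds, which would be consistent with the symptom described earlier.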

cmaglie · 2019-11-07T12:08:48Z

Is this utf8 conversion really needed here? I would let the agent just send the raw data over the wire and delegate the utf8 conversion to the serial monitor, which would simplify the copy loop a lot. Also, we can receive basically anything from the serial port, even non-UTF8 chars.

matthijskooijman · 2019-11-07T12:11:33Z

Good point. I think this code might still stem from the serial-port-json-server, on which the agent was originally based (I think?), and which actually did some processing of commands in the server as well (so that probably needed this processing). I think there might be some other indirections in the (serial) processing that are not really needed anymore.

facchinm · 2019-11-07T13:06:52Z

@cmaglie the communication layer is plain text so no binary data can be sent over that channel. I think interpreting the output and not sending raw data should be kept but of course if there's a better fix let's do it 😄

matthijskooijman · 2019-11-07T13:13:14Z

What layer is that exactly? Websockets? I think those can be binary as well? Alternatively, you could encode the data using some encoding (base64, or something that only encodes bytes > 127). Adding encoding does require changes on both sides, of course.
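As a rough sketch of the base64 option (the helper names are hypothetical, and nothing here reflects the agent's actual wire protocol), arbitrary serial bytes can be carried over a text-only channel like this:

```go
package main

import (
	"bytes"
	"encoding/base64"
	"fmt"
)

// encodeForText turns raw serial bytes into a plain-ASCII string.
func encodeForText(raw []byte) string {
	return base64.StdEncoding.EncodeToString(raw)
}

// decodeFromText reverses encodeForText on the receiving side.
func decodeFromText(s string) ([]byte, error) {
	return base64.StdEncoding.DecodeString(s)
}

// roundTrips reports whether raw survives an encode/decode cycle unchanged.
func roundTrips(raw []byte) bool {
	decoded, err := decodeFromText(encodeForText(raw))
	return err == nil && bytes.Equal(decoded, raw)
}

func main() {
	raw := []byte{0xFF, 0x00, 0x41} // includes bytes that are not valid UTF-8
	fmt.Println(encodeForText(raw), roundTrips(raw))
}
```

The cost is roughly a third more data on the wire, plus a decoder on the receiving side.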

mastrolinux · 2019-11-07T13:35:31Z

asking our master of Go @matteosuppo, please advise.

mastrolinux · 2019-11-07T13:35:52Z

also if @masci could review this would be awesome.

matteosuppo · 2019-11-07T14:49:28Z

I think that changing the encoding is out of scope for this fix, especially since it's urgent.

Regarding the performance of instantiating a new ch or fixing the append, that's where some tests and benchmarks would shine. But I would delay them to when we refactor this part.

matteosuppo · 2019-11-07T14:58:38Z

@facchinm can we tell Jenkins to build this so that we can also test on Windows?

masci · 2019-11-07T16:20:37Z

The problem is actually here https://github.com/arduino/arduino-create-agent/pull/481/files#diff-59a5ccbf6d889213033f4a5e627e5aadL115

With this assignment

ch = append(buffered_ch.Bytes(), ch[:n]...)

ch shrinks from 1024 to whatever length was read from the serial, so any subsequent call will only work if data read from the serial is smaller than the first read.

In this example I've reproduced the case when the final output after 2 readings is truncated https://play.golang.org/p/_qN2a5SHWtW

The fix in this PR works since ch is re-instantiated at every loop (it doesn't leak btw, it only puts pressure on the GC, no idea about the overall impact) but I'm not sure why the function copies back and forth from buffered_ch in the first place, removing that step would solve the issue without recreating the buffer at each step.
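A small sketch of the shrinking effect, separate from the playground link above (lenAfterRead is a made-up helper, and the buffer here starts empty):

```go
package main

import (
	"bytes"
	"fmt"
)

// lenAfterRead mimics the assignment quoted above for a single read that
// returned n bytes into a fresh 1024-byte ch, with nothing buffered.
func lenAfterRead(n int) int {
	var buffered bytes.Buffer // stands in for buffered_ch, empty here
	ch := make([]byte, 1024)
	ch = append(buffered.Bytes(), ch[:n]...)
	return len(ch)
}

func main() {
	fmt.Println(lenAfterRead(10)) // prints 10
}
```

Since a Read call fills at most len(ch) bytes, the next read in this sketch would be capped at 10 bytes rather than 1024.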

cmaglie · 2019-11-07T16:44:01Z

In this example I've reproduced the case when the final output after 2 readings is truncated https://play.golang.org/p/_qN2a5SHWtW

Your test doesn't exactly reproduce the parsing loop; you forgot to include the utf8 check:

		if runeValue == utf8.RuneError {

that will save the incomplete utf8 char for the next loop, it will be parsed successfully when the "remainder" is received: https://play.golang.org/p/kfJIYUDB_wU
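For readers who cannot open the playground link, the carry-over idea can be sketched as follows. This is a simplified stand-in and not the agent's loop: decodeChunk and feed are hypothetical names, and, like the check quoted above, it treats any utf8.RuneError as "incomplete, wait for more data".

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// decodeChunk decodes as many runes as it can from data and returns the
// decoded text plus the undecoded tail to keep for the next read.
func decodeChunk(data []byte) (string, []byte) {
	i := 0
	for i < len(data) {
		r, size := utf8.DecodeRune(data[i:])
		if r == utf8.RuneError {
			break // assumed to be a partial rune; retry after the next read
		}
		i += size
	}
	return string(data[:i]), data[i:]
}

// feed runs a sequence of simulated serial reads through decodeChunk,
// prepending the leftover bytes from the previous read each time.
func feed(reads [][]byte) string {
	var pending []byte
	out := ""
	for _, read := range reads {
		chunk := append(append([]byte{}, pending...), read...) // fresh copy
		text, rest := decodeChunk(chunk)
		out += text
		pending = append([]byte{}, rest...) // copy so later reads cannot touch it
	}
	return out
}

func main() {
	euro := []byte("€") // three bytes: E2 82 AC
	reads := [][]byte{{'a', euro[0]}, {euro[1], euro[2], 'b'}}
	fmt.Printf("%q\n", feed(reads)) // prints "a€b"
}
```

Note that every step here works on fresh copies, so no slice shares storage with another; that is a deliberate simplification rather than a description of the original code.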

BTW the change in size of ch is real, and that may explain it.

matthijskooijman · 2019-11-08T11:04:25Z

ch shrinks from 1024 to whatever length was read from the serial, so any subsequent call will only work if data read from the serial is smaller than the first read.

Ok, I can see how that is not what this code expected. But would it really break? If ch shrinks, then the maximum read size will indeed shrink, but that should not cause any failures or out of bound writes, right? Or will n, err := p.portIo.Read(ch) still assume the original size somehow?

Eventually, you might end up with a single-byte ch, which makes it a lot more likely that the partial utf-8 rune handling triggers, but this should still work as expected since partial bytes are buffered and added again after the next read. IIUC, this can actually grow ch again then:

ch = append(buffered_ch.Bytes(), ch[:n]...)
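A quick check of the growth case, assuming two leftover bytes in the buffer and a ch that has already shrunk to one byte (lenAfterMerge is a made-up helper):

```go
package main

import (
	"bytes"
	"fmt"
)

// lenAfterMerge returns len(ch) after the merge step, given pending leftover
// bytes and a read of n bytes into a ch of length chLen.
func lenAfterMerge(pending []byte, chLen, n int) int {
	var buffered bytes.Buffer // stands in for buffered_ch
	buffered.Write(pending)
	ch := make([]byte, chLen)
	ch = append(buffered.Bytes(), ch[:n]...)
	return len(ch)
}

func main() {
	// Two buffered bytes of a partial rune plus a one-byte read.
	fmt.Println(lenAfterMerge([]byte{0xE2, 0x82}, 1, 1)) // prints 3
}
```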

matteosuppo · 2019-11-11T08:42:07Z

I'm going to merge so that we can make a test build, and if it works release it. Afterwards we'll allocate some time to better refactor all this

Reinitialize serial buffer on every read to avoid concurrent rewrite

1537486


facchinm assigned rsora Nov 7, 2019

matteosuppo merged commit 2ad557f into arduino:devel Nov 11, 2019

rsora mentioned this pull request Nov 11, 2019

Port fix: Reinitialize serial buffer on every read to avoid concurrent rewrite #486

Merged
