Skip to content

Reinitialize serial buffer on every read to avoid concurrent rewrite #481

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Nov 11, 2019

Conversation

facchinm
Copy link
Member

@facchinm facchinm commented Nov 7, 2019

buffered_ch was being rewritten with data being read on a later Read()

Attention: this implementation "leaks" (ch will be garbage collected every time)
Please profile the memory usage before merging

buffered_ch was being rewritten with data being read on a later Read()

Attention: this implementation "leaks" (ch will be garbage collected every time)
Please profile the memory usage before merging
@matteosuppo
Copy link
Contributor

I'm letting it run reading the serial monitor without issues or memory growing. Looks fine to me

@facchinm
Copy link
Member Author

facchinm commented Nov 7, 2019

Did you try some VERY FAST print like analogReadSerial sketch?

@matthijskooijman
Copy link
Collaborator

@facchinm can you elaborate on how this fix works exactly? From looking at the code, I cannot see why the original code would be wrong, or why the new code really changes anything (admittedly, I do not remember all details around how golang's slices and arrays work exactly).

@facchinm
Copy link
Member Author

facchinm commented Nov 7, 2019

@matthijskooijman me neither, but what I experienced was:

if runeValue == utf8.RuneError {
  fmt.Println(ch[i:n]) // <- this holds the right values
  buffered_ch.Write(ch[i:n])
  break
}

... after Read() is executed again, roughly 1 times out of 10 ...

if err == nil {
  fmt.Println(buffered_ch.Bytes()) // <- now it holds ch[:previous_n]
  ch = append(buffered_ch.Bytes(), ch[:n]...)
  n += len(buffered_ch.Bytes())
  buffered_ch.Reset()
}

I thought it was a concurrency issue but it may be more subtle (it surely is)

@smellai
Copy link
Contributor

smellai commented Nov 7, 2019

as a reference, expected output (obtained from desktop IDE):
68386285-ff979f80-015b-11ea-9a31-51e805b2a97d

Create Agent output:
68386314-0c1bf800-015c-11ea-9ffd-e0b4a4566f2f

Sketch code

@matthijskooijman
Copy link
Collaborator

fmt.Println(buffered_ch.Bytes()) // <- now it holds ch[:previous_n]

Ah, so you're saying that this holds the bytes from the current ch, indexed by the previous n?

I suspect that that this:

buffered_ch.Write(append(ch[i:n]))

does not actually copy values into buffered_ch, but lets bufferd_ch point to the same underlying storage as ch. I guess the append() here is intended to produce a copy (but fails). I guess it might be better to fix that here rather than creating a new ch, though I can see why a new ch also works.

However, looking at the docs for bytes.Buffer.Write, it does seem that that is actually intended to copy values, not create a reference: https://golang.org/pkg/bytes/#Buffer.Write

@cmaglie
Copy link
Member

cmaglie commented Nov 7, 2019

Is this utf8 conversion really needed here? I would let the agent just send the raw data over the wire and delegate the utf8 conversion to the serial monitor, this would simplify a lot the copy-loop. Also from the serial monitor we can receive basically everything, even non-UTF8 chars.

@matthijskooijman
Copy link
Collaborator

Good point. I think this code might still stem from the serial-port-json-server where the agent was originally based on (I think?), which actually did some processing of commands in the server as well (so that probably needed this processing). I think there might be some other indirections in the (serial) processing that are not really needed anymore.

@facchinm
Copy link
Member Author

facchinm commented Nov 7, 2019

@cmaglie the communication layer is plain text so no binary data can be sent over that channel. I think interpreting the output and not sending raw data should be kept but of course if there's a better fix let's do it 😄

@matthijskooijman
Copy link
Collaborator

What layer is that exactly? Websockets? I think those can be binary as well? Alternatively, you could encode the data using some encoding (base64, or something that only encodes bytes > 127). Adding encoding does require changes on both sides, of course.

@mastrolinux
Copy link
Contributor

asking our master of Go @matteosuppo, please advise.

@mastrolinux
Copy link
Contributor

also if @masci could review this would be awesome.

@matteosuppo
Copy link
Contributor

I think that changing the encoding is out of scope for this fix, especially since it's urgent.

Regarding the performance of instantiating a new ch or fixing the append, that's where some tests and benchmarks would shine. But I would delay them to when we refactor this part.

@matteosuppo
Copy link
Contributor

@facchinm can we tell jenkins to build this so that we can test also on windows?

@masci
Copy link

masci commented Nov 7, 2019

The problem is actually here https://github.com/arduino/arduino-create-agent/pull/481/files#diff-59a5ccbf6d889213033f4a5e627e5aadL115

With this assignment

ch = append(buffered_ch.Bytes(), ch[:n]...)

ch shrinks from 1024 to whatever length was read from the serial, so any subsequent call will only work if data read from the serial is smaller than the first read.

In this example I've reproduced the case when the final output after 2 readings is truncated https://play.golang.org/p/_qN2a5SHWtW

The fix in this PR works since ch is re-instantiated at every loop (it doesn't leak btw, it only puts pressure on the GC, no idea about the overall impact) but I'm not sure why the function copies back and forth from buffered_ch in the first place, removing that step would solve the issue without recreating the buffer at each step.

@cmaglie
Copy link
Member

cmaglie commented Nov 7, 2019

In this example I've reproduced the case when the final output after 2 readings is truncated https://play.golang.org/p/_qN2a5SHWtW

your test doesn't reproduce exactly the parsing loop, you forget to insert the utf8 check:

		if runeValue == utf8.RuneError {

that will save the incomplete utf8 char for the next loop, it will be parsed successfully when the "remainder" is received: https://play.golang.org/p/kfJIYUDB_wU

BTW the change in size of ch is real, and that may explain it.

@matthijskooijman
Copy link
Collaborator

ch shrinks from 1024 to whatever length was read from the serial, so any subsequent call will only work if data read from the serial is smaller than the first read.

Ok, I can see how that is not what this code expected. But would it really break? If ch shrinks, then the maximum read size will indeed shrink, but that should not cause any failures or out of bound writes, right? Or will n, err := p.portIo.Read(ch) still assume the original size somehow?

Eventually, you might end up with a single-byte ch, which makes it a lot more likely that the partial utf-8 rune handling triggers, but this should still work as expected since partial bytes are buffered and added again after the next read. IIUC, this can actually grow ch again then:

ch = append(buffered_ch.Bytes(), ch[:n]...)

@matteosuppo
Copy link
Contributor

I'm going to merge so that we can make a test build, and if it works release it. Afterwards we'll allocate some time to better refactor all this

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants