SI-8879 fix quadratic reading time in StreamReader
StreamReader.nextEol used to loop all the way to Eol every time an
element was read. That's very costly when lines are long.
Furthermore, it used to call PagedSeq.length, forcing PagedSeq to
load the whole input in memory, even when a single character was read.
nextEol is now saved as part of the state of StreamReader, and is passed
to child readers when created (as long as we do not read past the end of
the line). Thus it is computed only once per line, whatever the line's length.
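
The caching described above can be sketched as follows. This is a minimal, simplified illustration and not the actual StreamReader source: `EolCacheSketch`, `LineReader`, `knownEol` and the `scanned` counter are hypothetical names introduced here, and a plain String stands in for the paged input.

```scala
object EolCacheSketch {
  // Hypothetical counter: number of characters stepped over while
  // searching for an end of line. Only here to make the cost visible.
  var scanned = 0

  // knownEol is the end-of-line index handed down by the parent reader,
  // or -1 when the parent could not provide one.
  final class LineReader(source: String, val offset: Int, val line: Int, knownEol: Int) {

    // Reuse the parent's index when available; otherwise scan forward
    // once from the current offset.
    lazy val nextEol: Int =
      if (knownEol >= 0) knownEol
      else {
        var i = offset
        while (i < source.length && source.charAt(i) != '\n') {
          scanned += 1
          i += 1
        }
        i
      }

    def atEnd: Boolean = offset >= source.length

    def first: Char = source.charAt(offset)

    // While the child stays on the same line, the cached index is passed
    // along. Stepping over the line break starts a new line, so the child
    // has to compute its own index.
    def rest: LineReader =
      if (atEnd) this
      else if (offset < nextEol) new LineReader(source, offset + 1, line, nextEol)
      else new LineReader(source, offset + 1, line + 1, -1)
  }
}
```

Walking such a reader over "abc\ndef" scans each line once, so the counter ends at 6 instead of growing with the square of the line length.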
With the example in the ticket (SI-8879), we get:
* before:
    User time (seconds): 82.12
    System time (seconds): 0.07
    Elapsed (wall clock) time (h:mm:ss or m:ss): 1:21.52
* after:
    User time (seconds): 1.05
    System time (seconds): 0.06
    Elapsed (wall clock) time (h:mm:ss or m:ss): 0:00.68
* for comparison, using PagedSeqReader directly:
    User time (seconds): 1.06
    System time (seconds): 0.06
    Elapsed (wall clock) time (h:mm:ss or m:ss): 0:00.69
`isDefinedAt` is used instead of `length` so that pages beyond the
tested index do not need to be read. The accompanying test covers only
this part.
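
To illustrate why the choice between the two calls matters on lazily loaded input, here is a small sketch. It does not use the real PagedSeq: `PagedSketch`, `LazyPages`, `pageSize`, `totalPages` and `pagesLoaded` are hypothetical names, and "loading" a page is modelled by incrementing a counter.

```scala
object PagedSketch {
  // Hypothetical stand-in for a paged sequence. Pages are fetched on
  // demand, and pagesLoaded records how many have been read so far.
  final class LazyPages(pageSize: Int, totalPages: Int) {
    var pagesLoaded = 0

    // Load pages in order until the requested page is available
    // or the input is exhausted.
    private def loadUpTo(page: Int): Unit =
      while (pagesLoaded <= page && pagesLoaded < totalPages) pagesLoaded += 1

    // Needs only the pages up to the one containing index.
    def isDefinedAt(index: Int): Boolean =
      index >= 0 && {
        loadUpTo(index / pageSize)
        index < pagesLoaded * pageSize
      }

    // Has to reach the end of the input before it can answer.
    def length: Int = {
      loadUpTo(totalPages - 1)
      pagesLoaded * pageSize
    }
  }
}
```

In this model, checking `isDefinedAt(0)` on a 100-page input loads a single page, whereas calling `length` loads all 100.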