Skip to content

Commit 6c37a6e

Browse files
authored
internal/gogrep: use external MatcherState (#286)
The reason we have to clone rule set every time we do `Engine.Run` is `gogrep.Pattern` internal state. Every gogrep matcher has mutable state that can't be combined with the thread-safe API of `Engine.Run`. Cloning is usually not very expensive, we just copy a few dozens of slices here and there, but it's noticeable for both small and big rule file sizes. If we move the mutable shared state out of the matchers and keep it somewhere in the per-run or per-context scope, we should be able to reuse it without races and there will be no need to copy the matchers/patterns on their own as their internal structure will become immutable (they receive a mutable state via param). My experiments show ~15% speedup on the current stage. I may move gogrep state even higher, to the per-context scope so it's reused even more efficiently. But as a proof of concept we can see that thread-safe version of `Engine.Run` is possible without excessive copying. With small rules file and small target file: ``` name old time/op new time/op delta EngineRun-8 41.4µs ± 3% 21.5µs ± 1% -48.05% (p=0.000 n=10+10) name old alloc/op new alloc/op delta EngineRun-8 17.1kB ± 0% 3.4kB ± 0% -80.14% (p=0.000 n=10+10) name old allocs/op new allocs/op delta EngineRun-8 129 ± 0% 47 ± 0% -63.57% (p=0.000 n=10+10) ``` With big rules file and small targe file: ``` name old time/op new time/op delta EngineRun-8 968µs ± 2% 481µs ± 4% -50.37% (p=0.000 n=9+9) name old alloc/op new alloc/op delta EngineRun-8 429kB ± 0% 13kB ± 0% -96.94% (p=0.000 n=10+10) name old allocs/op new allocs/op delta EngineRun-8 2.64k ± 0% 0.31k ± 0% -88.33% (p=0.000 n=10+10) ``` With small rules file and big target file size: ``` name old time/op new time/op delta EngineRun-8 127µs ± 5% 105µs ± 1% -17.15% (p=0.000 n=9+9) name old alloc/op new alloc/op delta EngineRun-8 22.0kB ± 0% 8.3kB ± 0% -62.27% (p=0.000 n=10+10) name old allocs/op new allocs/op delta EngineRun-8 267 ± 0% 185 ± 0% -30.71% (p=0.000 n=10+10) ``` With huge rules file and big target file size: ``` name old time/op new time/op delta EngineRun-8 4.31ms ± 2% 3.72ms ± 1% -13.71% (p=0.000 n=10+9) name old alloc/op new alloc/op delta EngineRun-8 463kB ± 0% 46kB ± 0% -89.97% (p=0.000 n=10+10) name old allocs/op new allocs/op delta EngineRun-8 3.53k ± 0% 1.20k ± 0% -66.01% (p=0.000 n=10+10) ``` Fixes #285
1 parent a0c5279 commit 6c37a6e

File tree

7 files changed

+200
-194
lines changed

7 files changed

+200
-194
lines changed

internal/gogrep/gogrep.go

+18-3
Original file line numberDiff line numberDiff line change
@@ -33,6 +33,22 @@ func (data MatchData) CapturedByName(name string) (ast.Node, bool) {
3333
return findNamed(data.Capture, name)
3434
}
3535

36+
type MatcherState struct {
37+
Types *types.Info
38+
39+
// node values recorded by name, excluding "_" (used only by the
40+
// actual matching phase)
41+
capture []CapturedNode
42+
43+
pc int
44+
}
45+
46+
func NewMatcherState() MatcherState {
47+
return MatcherState{
48+
capture: make([]CapturedNode, 0, 8),
49+
}
50+
}
51+
3652
type Pattern struct {
3753
m *matcher
3854
}
@@ -46,16 +62,15 @@ func (p *Pattern) NodeTag() nodetag.Value {
4662
}
4763

4864
// MatchNode calls cb if n matches a pattern.
49-
func (p *Pattern) MatchNode(info *types.Info, n ast.Node, cb func(MatchData)) {
50-
p.m.MatchNode(info, n, cb)
65+
func (p *Pattern) MatchNode(state *MatcherState, n ast.Node, cb func(MatchData)) {
66+
p.m.MatchNode(state, n, cb)
5167
}
5268

5369
// Clone creates a pattern copy.
5470
func (p *Pattern) Clone() *Pattern {
5571
clone := *p
5672
clone.m = &matcher{}
5773
*clone.m = *p.m
58-
clone.m.capture = make([]CapturedNode, 0, 8)
5974
return &clone
6075
}
6176

0 commit comments

Comments
 (0)