Skip to content

Optimization: statements reuse previous column name #1711

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 2 commits into
base: master
Choose a base branch
from

Conversation

methane
Copy link
Member

@methane methane commented Apr 27, 2025

Description

#1708 added []mysqlField cache to stmt. It was used only for MariaDB cached metadata.

This commit allows MySQL to also benefit from the metadata cache. If the column names are the same as the cached metadata, it reuses them instead of allocating new strings.

$ ~/go/bin/benchstat master.txt reuse.txt
goos: darwin
goarch: arm64
pkg: github.com/go-sql-driver/mysql
cpu: Apple M1 Pro
                  │ master.txt  │           reuse.txt           │
                  │   sec/op    │   sec/op     vs base          │
ReceiveMetadata-8   1.273m ± 2%   1.269m ± 2%  ~ (p=1.000 n=10)

                  │  master.txt  │              reuse.txt              │
                  │     B/op     │     B/op      vs base               │
ReceiveMetadata-8   88.17Ki ± 0%   80.39Ki ± 0%  -8.82% (p=0.000 n=10)

                  │  master.txt  │             reuse.txt              │
                  │  allocs/op   │ allocs/op   vs base                │
ReceiveMetadata-8   1015.00 ± 0%   16.00 ± 0%  -98.42% (p=0.000 n=10)

Checklist

  • Code compiles correctly
  • Created tests which fail without the change (if possible)
  • All tests passing
  • Extended the README / documentation, if necessary
  • Added myself / the copyright holder to the AUTHORS file

Summary by CodeRabbit

  • Performance Improvements
    • Reduced unnecessary memory allocations when reading column metadata, resulting in improved efficiency during query execution.
  • Tests
    • Updated benchmark timing to more accurately measure only the metadata retrieval and scanning phases.

Copy link

coderabbitai bot commented Apr 27, 2025

Walkthrough

This set of changes modifies how column metadata is managed and benchmarked in the codebase. The readColumns function now accepts an additional parameter for potential reuse of existing metadata, optimizing string allocations when column names remain unchanged. All invocations of readColumns are updated to accommodate the new parameter. In benchmarks, the timing and allocation reporting are adjusted to exclude statement preparation, focusing measurement solely on metadata retrieval and scanning. No exported or public API signatures are altered except for the internal readColumns function.

Changes

File(s) Change Summary
benchmark_test.go Moved b.ReportAllocs() and b.ResetTimer() to just before the benchmark loop to exclude statement preparation from measurements.
connection.go, rows.go Updated calls to readColumns to pass an additional nil argument, reflecting the new method signature.
statement.go Updated readColumns calls to pass stmt.columns as the second argument for potential metadata reuse.
packets.go Modified readColumns to accept an old []mysqlField parameter, reusing string references for unchanged metadata, and updated its signature.

Sequence Diagram(s)

sequenceDiagram
    participant Benchmark
    participant DB
    participant readColumns

    Benchmark->>DB: Prepare statement
    Note right of Benchmark: Timer/allocs not started
    Benchmark->>Benchmark: b.ReportAllocs(), b.ResetTimer()
    loop N times
        Benchmark->>DB: Execute query
        DB->>readColumns: readColumns(count, old)
        readColumns-->>DB: Columns metadata (possibly reused)
        DB-->>Benchmark: Scan result
    end
Loading
sequenceDiagram
    participant DB
    participant readColumns

    DB->>readColumns: readColumns(count, oldColumns)
    alt oldColumns matches
        readColumns->>readColumns: Reuse string references for tableName and name
    else oldColumns mismatch
        readColumns->>readColumns: Allocate new strings
    end
    readColumns-->>DB: Columns metadata
Loading

Possibly related PRs

Poem

A hop, a skip, a clever tweak,
Now columns reuse what they seek.
Benchmarks run with sharpened eye,
Old strings reused, allocations fly by.
With every query, swift and neat—
This code’s a rabbit’s agile feat! 🐇✨


📜 Recent review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 6e944e1 and cf0877b.

📒 Files selected for processing (5)
  • benchmark_test.go (1 hunks)
  • connection.go (2 hunks)
  • packets.go (3 hunks)
  • rows.go (2 hunks)
  • statement.go (2 hunks)
⏰ Context from checks skipped due to timeout of 90000ms (20)
  • GitHub Check: test (macos-latest, 1.24, mariadb-10.5)
  • GitHub Check: test (windows-latest, 1.24, mariadb-10.11)
  • GitHub Check: test (windows-latest, 1.24, mariadb-11.4)
  • GitHub Check: test (windows-latest, 1.24, mariadb-11.2)
  • GitHub Check: test (windows-latest, 1.24, mariadb-10.6)
  • GitHub Check: test (macos-latest, 1.24, 8.4)
  • GitHub Check: test (windows-latest, 1.24, 8.4)
  • GitHub Check: test (macos-latest, 1.24, 5.7)
  • GitHub Check: test (ubuntu-latest, 1.24, 5.7)
  • GitHub Check: test (ubuntu-latest, 1.24, mariadb-10.11)
  • GitHub Check: test (ubuntu-latest, 1.24, mariadb-10.6)
  • GitHub Check: test (macos-latest, 1.24, mariadb-11.4)
  • GitHub Check: test (ubuntu-latest, 1.24, mariadb-10.5)
  • GitHub Check: test (macos-latest, 1.24, 9.0)
  • GitHub Check: test (ubuntu-latest, 1.24, mariadb-11.4)
  • GitHub Check: test (ubuntu-latest, 1.24, 9.0)
  • GitHub Check: test (ubuntu-latest, 1.24, mariadb-11.1)
  • GitHub Check: test (ubuntu-latest, 1.24, mariadb-11.2)
  • GitHub Check: test (ubuntu-latest, 1.24, 8.0)
  • GitHub Check: test (ubuntu-latest, 1.24, 8.4)
🔇 Additional comments (8)
benchmark_test.go (1)

498-499: Improved benchmark measurement accuracy

Relocating b.ReportAllocs() and b.ResetTimer() to this position ensures that allocation tracking and timing are only measured for the actual metadata retrieval and scanning loop, excluding the statement preparation overhead. This change provides more accurate performance metrics that directly reflect the optimization being tested.

statement.go (2)

77-77: Column data reuse optimization properly implemented

Now passing stmt.columns as the second parameter to readColumns enables the reuse of existing column metadata strings, which aligns with PR's goal of reducing allocations when column names match. This is a key part of the optimization strategy.


128-128: Consistent implementation of column data reuse

Updated readColumns call follows the same pattern as the earlier instance in the file, passing stmt.columns to enable string reuse. The assignment on line 131 preserves the updated columns for future use, creating a complete loop of metadata reuse.

connection.go (2)

234-234: Align with the new readColumns API

Updated to match the new readColumns signature, passing nil as there's no existing metadata to reuse in this context. This is appropriate as the first call to readColumns for this statement.


451-451: Consistent API usage for readColumns

Properly updated to pass nil as the second parameter, following the same pattern as other call sites. This ensures consistency across the codebase while maintaining the function's behavior when there's no existing metadata to reuse.

rows.go (2)

189-189: Updated readColumns call for binary rows

Correctly modified to match the updated function signature, passing nil as the second argument. For NextResultSet calls, we don't have previous metadata to reuse, so this is the appropriate approach.


211-211: Consistent pattern for text rows implementation

Updated to match the binary rows implementation, maintaining consistency across similar functions. The nil parameter correctly indicates that no previous column data is available for reuse in this context.

packets.go (1)

705-709: Great optimization to reduce memory allocations!

This change intelligently reuses existing column metadata strings when they match, avoiding unnecessary string allocations. The implementation is clean with three key components:

  1. Adding a new parameter to allow passing existing metadata
  2. Safety check to ensure old metadata matches the expected column count
  3. String comparison and reuse for both tableName and name fields

This approach should significantly reduce memory allocations during repeated queries with similar column structures, as indicated in the PR description (98.42% reduction in allocations).

Also applies to: 737-742, 763-768

✨ Finishing Touches
  • 📝 Generate Docstrings

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share
🪧 Tips

Chat

There are 3 ways to chat with CodeRabbit:

  • Review comments: Directly reply to a review comment made by CodeRabbit. Example:
    • I pushed a fix in commit <commit_id>, please review it.
    • Generate unit testing code for this file.
    • Open a follow-up GitHub issue for this discussion.
  • Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query. Examples:
    • @coderabbitai generate unit testing code for this file.
    • @coderabbitai modularize this function.
  • PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
    • @coderabbitai gather interesting stats about this repository and render them as a table. Additionally, render a pie chart showing the language distribution in the codebase.
    • @coderabbitai read src/utils.ts and generate unit testing code.
    • @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.
    • @coderabbitai help me debug CodeRabbit configuration file.

Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments.

CodeRabbit Commands (Invoked using PR comments)

  • @coderabbitai pause to pause the reviews on a PR.
  • @coderabbitai resume to resume the paused reviews.
  • @coderabbitai review to trigger an incremental review. This is useful when automatic reviews are disabled for the repository.
  • @coderabbitai full review to do a full review from scratch and review all the files again.
  • @coderabbitai summary to regenerate the summary of the PR.
  • @coderabbitai generate docstrings to generate docstrings for this PR.
  • @coderabbitai generate sequence diagram to generate a sequence diagram of the changes in this PR.
  • @coderabbitai resolve resolve all the CodeRabbit review comments.
  • @coderabbitai configuration to show the current CodeRabbit configuration for the repository.
  • @coderabbitai help to get help.

Other keywords and placeholders

  • Add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.
  • Add @coderabbitai summary to generate the high-level summary at a specific location in the PR description.
  • Add @coderabbitai anywhere in the PR title to generate the title automatically.

CodeRabbit Configuration File (.coderabbit.yaml)

  • You can programmatically configure CodeRabbit by adding a .coderabbit.yaml file to the root of your repository.
  • Please see the configuration documentation for more information.
  • If your editor has YAML language server enabled, you can add the path at the top of this file to enable auto-completion and validation: # yaml-language-server: $schema=https://coderabbit.ai/integrations/schema.v2.json

Documentation and Community

  • Visit our Documentation for detailed information on how to use CodeRabbit.
  • Join our Discord Community to get help, request features, and share feedback.
  • Follow us on X/Twitter for updates and announcements.

@coveralls
Copy link

Coverage Status

coverage: 82.413% (+0.003%) from 82.41%
when pulling cf0877b on methane:reuse-metadata
into 6e944e1 on go-sql-driver:master.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants