Skip to content

MariaDB Metadata skipping #1708

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 1 commit into
base: master
Choose a base branch
from
Open

Conversation

rusher
Copy link
Contributor

@rusher rusher commented Apr 24, 2025

Implement MariaDB metadata skipping

This is in replacement for #1692, for a simplier (still requiring lots of changes).
See MariaDB metadata skipping. With this change, MariaDB server won't send metadata when they have not changed, saving client parsing metadata and network.

Even if metadata skipping change is not so big, this rely on big changes :

  • capabilities support
  • EOF packet deprecation makes current implementation to be revised

A benchmark BenchmarkReceiveMetadata has been added to show the difference.
On a local server, difference gain is only a few percent.
on a distant server, difference is huge. Example here with a server next to client :
goos: linux
goarch: amd64
pkg: github.com/go-sql-driver/mysql
cpu: 11th Gen Intel(R) Core(TM) i9-11900K @ 3.50GHz
BenchmarkReceiveMetadata
BenchmarkReceiveMetadata/query

before :
BenchmarkReceiveMetadata/query-16 138 7849971 ns/op 90379 B/op 1015 allocs/op

after change:
BenchmarkReceiveMetadata/query-16 362 2974155 ns/op 33337 B/op 16 allocs/op

Checklist

  • Code compiles correctly
  • Created tests which fail without the change (if possible)
  • All tests passing
  • Extended the README / documentation, if necessary
  • Added myself / the copyright holder to the AUTHORS file

Summary by CodeRabbit

  • New Features

    • Added support for MariaDB-specific extended capabilities and improved capability negotiation during connection handshake.
    • Introduced caching of column metadata for prepared statements to enhance performance.
    • Added a new benchmark to measure performance when retrieving a large number of metadata columns.
  • Improvements

    • Enhanced protocol compliance and robustness by refining how column and row packets are skipped and by improving EOF packet handling.
    • Improved handling of TLS requirements and client-server capability matching during connection setup.
  • Bug Fixes

    • Fixed minor formatting inconsistencies in benchmark tests.

Copy link

coderabbitai bot commented Apr 24, 2025

Walkthrough

This change introduces explicit support for MariaDB's extended capability flags, particularly focusing on the clientCacheMetadata feature. It refactors client-server capability negotiation during the handshake, updates packet parsing and construction to handle new capability flags, and modifies how column metadata is read, cached, or skipped depending on server and client support for metadata caching. Several helper functions are added for skipping protocol packets. Tests and benchmarks are updated to reflect new function signatures and to add coverage for metadata-heavy queries.

Changes

Files/Paths Change Summary
connection.go, connector.go, const.go, packets.go, statement.go, rows.go Refactored capability negotiation to support MariaDB extended capabilities, added/renamed capability flag types and constants, updated handshake and packet parsing logic, introduced metadata caching and conditional column reading/skipping, and replaced generic EOF reading with more specific skipping helpers.
auth_test.go, packets_test.go Updated test calls and assertions to match new function signatures and to verify server and extended capability flags.
benchmark_test.go Fixed formatting in loops and added a new benchmark for metadata-heavy queries.

Sequence Diagram(s)

sequenceDiagram
    participant Client
    participant Connector
    participant Server

    Client->>Connector: Connect()
    Connector->>Server: Send handshake request
    Server-->>Connector: Send handshake packet (includes capabilities)
    Connector->>Connector: Parse server capabilities
    Connector->>Connector: Compute client capabilities (including MariaDB extended)
    Connector->>Server: Send handshake response (with capabilities)
    Server-->>Connector: Authenticate/OK
    Note over Connector: Use extended capabilities to decide on metadata caching
    Connector->>Server: Prepare/Query/Exec
    Server-->>Connector: Send result set header (may include metadata cache flag)
    alt clientCacheMetadata enabled
        Connector->>Connector: Cache/reuse column metadata
    else
        Connector->>Connector: Read/skip columns as usual
    end
    Connector-->>Client: Return results
Loading

Possibly related issues

Possibly related PRs

Suggested reviewers

  • methane

Poem

In burrows deep where packets flow,
A rabbit tweaks the flags to show—
MariaDB’s new tricks in tow,
Metadata cached, columns in a row!
With skipping here and caching there,
The handshake’s smarter, swift, and fair.
🐇✨ Hopping forward, code anew—
This driver’s ready, thanks to you!

✨ Finishing Touches
  • 📝 Generate Docstrings

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share
🪧 Tips

Chat

There are 3 ways to chat with CodeRabbit:

  • Review comments: Directly reply to a review comment made by CodeRabbit. Example:
    • I pushed a fix in commit <commit_id>, please review it.
    • Generate unit testing code for this file.
    • Open a follow-up GitHub issue for this discussion.
  • Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query. Examples:
    • @coderabbitai generate unit testing code for this file.
    • @coderabbitai modularize this function.
  • PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
    • @coderabbitai gather interesting stats about this repository and render them as a table. Additionally, render a pie chart showing the language distribution in the codebase.
    • @coderabbitai read src/utils.ts and generate unit testing code.
    • @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.
    • @coderabbitai help me debug CodeRabbit configuration file.

Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments.

CodeRabbit Commands (Invoked using PR comments)

  • @coderabbitai pause to pause the reviews on a PR.
  • @coderabbitai resume to resume the paused reviews.
  • @coderabbitai review to trigger an incremental review. This is useful when automatic reviews are disabled for the repository.
  • @coderabbitai full review to do a full review from scratch and review all the files again.
  • @coderabbitai summary to regenerate the summary of the PR.
  • @coderabbitai generate docstrings to generate docstrings for this PR.
  • @coderabbitai generate sequence diagram to generate a sequence diagram of the changes in this PR.
  • @coderabbitai resolve resolve all the CodeRabbit review comments.
  • @coderabbitai configuration to show the current CodeRabbit configuration for the repository.
  • @coderabbitai help to get help.

Other keywords and placeholders

  • Add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.
  • Add @coderabbitai summary to generate the high-level summary at a specific location in the PR description.
  • Add @coderabbitai anywhere in the PR title to generate the title automatically.

CodeRabbit Configuration File (.coderabbit.yaml)

  • You can programmatically configure CodeRabbit by adding a .coderabbit.yaml file to the root of your repository.
  • Please see the configuration documentation for more information.
  • If your editor has YAML language server enabled, you can add the path at the top of this file to enable auto-completion and validation: # yaml-language-server: $schema=https://coderabbit.ai/integrations/schema.v2.json

Documentation and Community

  • Visit our Documentation for detailed information on how to use CodeRabbit.
  • Join our Discord Community to get help, request features, and share feedback.
  • Follow us on X/Twitter for updates and announcements.

@rusher rusher mentioned this pull request Apr 24, 2025
5 tasks
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (9)
const.go (1)

76-86: Well-structured implementation of MariaDB extended capabilities.

The introduction of extendedCapabilityFlag type and associated constants properly encapsulates MariaDB-specific extended capabilities, particularly the clientCacheMetadata flag that enables metadata skipping functionality.

Consider adding a brief comment for each extended capability flag to document their purpose, especially for the clientCacheMetadata flag which is central to this PR's implementation.

rows.go (1)

159-159: Updated function call to handle new metadata caching information.

The function call now correctly unpacks the additional boolean return value from readResultSetHeaderPacket() which likely indicates metadata caching presence.

The unused variable (represented by _) indicates metadata caching information that's not needed in this context. Consider adding a brief comment explaining why this information is discarded here but might be used elsewhere.

auth_test.go (1)

137-1346: Comprehensive test updates for authentication methods with capability flags.

All authentication-related tests have been consistently updated to match the new function signature, passing zero values for capability flags which is appropriate for these tests since they're not testing capability negotiation specifically.

Consider adding at least one test case that uses non-zero values for capability flags to explicitly test how capability negotiation affects the authentication process, particularly for metadata caching.

🧰 Tools
🪛 ast-grep (0.31.1)

[warning] 293-293: MinVersionis missing from this TLS configuration. By default, TLS 1.2 is currently used as the minimum when acting as a client, and TLS 1.0 when acting as a server. General purpose web applications should default to TLS 1.3 with all other protocols disabled. Only where it is known that a web server must support legacy clients with unsupported an insecure browsers (such as Internet Explorer 10), it may be necessary to enable TLS 1.0 to provide support. AddMinVersion: tls.VersionTLS13' to the TLS configuration to bump the minimum version to TLS 1.3.
Context: tls.Config{InsecureSkipVerify: true}
Note: [CWE-327]: Use of a Broken or Risky Cryptographic Algorithm [OWASP A03:2017]: Sensitive Data Exposure [OWASP A02:2021]: Cryptographic Failures [REFERENCES]
https://owasp.org/Top10/A02_2021-Cryptographic_Failures

(missing-ssl-minversion-go)


[warning] 665-665: MinVersionis missing from this TLS configuration. By default, TLS 1.2 is currently used as the minimum when acting as a client, and TLS 1.0 when acting as a server. General purpose web applications should default to TLS 1.3 with all other protocols disabled. Only where it is known that a web server must support legacy clients with unsupported an insecure browsers (such as Internet Explorer 10), it may be necessary to enable TLS 1.0 to provide support. AddMinVersion: tls.VersionTLS13' to the TLS configuration to bump the minimum version to TLS 1.3.
Context: tls.Config{InsecureSkipVerify: true}
Note: [CWE-327]: Use of a Broken or Risky Cryptographic Algorithm [OWASP A03:2017]: Sensitive Data Exposure [OWASP A02:2021]: Cryptographic Failures [REFERENCES]
https://owasp.org/Top10/A02_2021-Cryptographic_Failures

(missing-ssl-minversion-go)


[warning] 868-868: MinVersionis missing from this TLS configuration. By default, TLS 1.2 is currently used as the minimum when acting as a client, and TLS 1.0 when acting as a server. General purpose web applications should default to TLS 1.3 with all other protocols disabled. Only where it is known that a web server must support legacy clients with unsupported an insecure browsers (such as Internet Explorer 10), it may be necessary to enable TLS 1.0 to provide support. AddMinVersion: tls.VersionTLS13' to the TLS configuration to bump the minimum version to TLS 1.3.
Context: tls.Config{InsecureSkipVerify: true}
Note: [CWE-327]: Use of a Broken or Risky Cryptographic Algorithm [OWASP A03:2017]: Sensitive Data Exposure [OWASP A02:2021]: Cryptographic Failures [REFERENCES]
https://owasp.org/Top10/A02_2021-Cryptographic_Failures

(missing-ssl-minversion-go)


[warning] 1301-1301: MinVersionis missing from this TLS configuration. By default, TLS 1.2 is currently used as the minimum when acting as a client, and TLS 1.0 when acting as a server. General purpose web applications should default to TLS 1.3 with all other protocols disabled. Only where it is known that a web server must support legacy clients with unsupported an insecure browsers (such as Internet Explorer 10), it may be necessary to enable TLS 1.0 to provide support. AddMinVersion: tls.VersionTLS13' to the TLS configuration to bump the minimum version to TLS 1.3.
Context: tls.Config{InsecureSkipVerify: true}
Note: [CWE-327]: Use of a Broken or Risky Cryptographic Algorithm [OWASP A03:2017]: Sensitive Data Exposure [OWASP A02:2021]: Cryptographic Failures [REFERENCES]
https://owasp.org/Top10/A02_2021-Cryptographic_Failures

(missing-ssl-minversion-go)

benchmark_test.go (2)

402-407: Populate args slice from index 0 to keep invariant obvious

Because the first for starts at index 1, args[0] is assigned only in the nested loop below. The code is correct, but the two-phase “odd indices first, even indices later” pattern is a bit non-obvious to future readers.

-	for i := 1; i < 200; i += 2 {
+	for i := 0; i < 200; i += 2 {

…and swap the later writer loop to fill the odd indices. Functionality is unchanged while intent becomes clearer.


459-512: BenchmarkReceiveMetadata: tighten correctness & reduce noise

  1. db.SetMaxIdleConns(0) immediately followed by db.SetMaxIdleConns(1) cancels itself.
  2. rows.Next()’s boolean return is ignored; if the single row is unexpectedly absent, Scan triggers “row.Scan called without Next”.
  3. The benchmark never checks rows.Err(); silent I/O errors could be missed.
-	db.SetMaxIdleConns(0)
-	db.SetMaxIdleConns(1)
+	db.SetMaxIdleConns(1) // one stable idle conn is enough

-	rows := tb.checkRows(stmt.Query())
-	rows.Next()
-	// Scan the row
-	err := rows.Scan(valuePtrs...)
-	tb.check(err)
-	rows.Close()
+	rows := tb.checkRows(stmt.Query())
+	if !rows.Next() {
+		rows.Close()
+		b.Fatalf("unexpectedly empty result set")
+	}
+	tb.check(rows.Scan(valuePtrs...))
+	tb.check(rows.Close())
+	tb.check(rows.Err())

These tweaks make the benchmark more robust without affecting the performance signal.

connector.go (1)

133-143: TLS enforcement check is welcome – but message could be clearer

The error text is surfaced to application code; clarifying that the server capability flags are missing helps operators:

-	return nil, fmt.Errorf("TLS is required, but server doesn't support it")
+	return nil, fmt.Errorf("TLS requested in DSN but server did not advertise CLIENT_SSL capability")

Minor wording tweak improves diagnosability.

packets.go (3)

282-318: *The helper receives a Config but then ignores it
initClientCapabilities takes cfg *Config, yet most of the checks (compress, TLS, MultiStatements, DBName) use mc.cfg directly. Either:

  1. Drop the parameter and rely on mc.cfg, or
  2. Use the cfg that is passed in to avoid confusion.
-func (mc *mysqlConn) initClientCapabilities(serverCapabilities capabilityFlag, cfg *Config) capabilityFlag {
+func (mc *mysqlConn) initClientCapabilities(serverCapabilities capabilityFlag) capabilityFlag {
...
-    if cfg.ClientFoundRows {
+    if mc.cfg.ClientFoundRows {
...
-    if cfg.compress {
+    if mc.cfg.compress {
...
-    if n := len(cfg.DBName); n > 0 {
+    if n := len(mc.cfg.DBName); n > 0 {

This will also simplify the call-site (mc.initClientCapabilities(serverCapabilities)).


573-604: Variable name ‑ len shadows the built-in
Using len as a local variable hides the predeclared len() function and can trip up future readers (and some IDE/go-analysis tooling).

-    num, _, len := readLengthEncodedInteger(data)
+    num, _, bytesRead := readLengthEncodedInteger(data)
...
-    return int(num), data[len] == 0x01, nil
+    return int(num), data[bytesRead] == 0x01, nil

No behavioural change – just clearer code.


921-928: Blindly discarding a packet may swallow protocol errors
skipEof reads (and discards) one packet when clientDeprecateEOF is not set but never checks that the packet is actually an EOF/OK packet. A malformed server response would pass unnoticed.

-if mc.clientCapabilities&clientDeprecateEOF == 0 {
-    if _, err := mc.readPacket(); err != nil {
-        return err
-    }
+if mc.clientCapabilities&clientDeprecateEOF == 0 {
+    pkt, err := mc.readPacket()
+    if err != nil {
+        return err
+    }
+    if pkt[0] != iEOF && pkt[0] != iOK {
+        return fmt.Errorf("protocol error: expected EOF/OK, got 0x%02x", pkt[0])
+    }
 }
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 0fd55eb and 1907645.

📒 Files selected for processing (9)
  • auth_test.go (14 hunks)
  • benchmark_test.go (3 hunks)
  • connection.go (5 hunks)
  • connector.go (3 hunks)
  • const.go (2 hunks)
  • packets.go (15 hunks)
  • packets_test.go (1 hunks)
  • rows.go (3 hunks)
  • statement.go (4 hunks)
🧰 Additional context used
🧬 Code Graph Analysis (1)
connection.go (1)
dsn.go (1)
  • Config (37-84)
⏰ Context from checks skipped due to timeout of 90000ms (19)
  • GitHub Check: test (ubuntu-latest, 1.23, 9.0)
  • GitHub Check: test (macos-latest, 1.24, mariadb-10.11)
  • GitHub Check: test (windows-latest, 1.24, mariadb-10.6)
  • GitHub Check: test (windows-latest, 1.24, mariadb-10.11)
  • GitHub Check: test (macos-latest, 1.24, 5.7)
  • GitHub Check: test (windows-latest, 1.24, 8.0)
  • GitHub Check: test (windows-latest, 1.24, mariadb-11.1)
  • GitHub Check: test (windows-latest, 1.24, mariadb-11.4)
  • GitHub Check: test (windows-latest, 1.24, 5.7)
  • GitHub Check: test (windows-latest, 1.24, mariadb-11.2)
  • GitHub Check: test (ubuntu-latest, 1.24, mariadb-10.5)
  • GitHub Check: test (ubuntu-latest, 1.24, mariadb-10.11)
  • GitHub Check: test (ubuntu-latest, 1.24, 9.0)
  • GitHub Check: test (macos-latest, 1.24, 9.0)
  • GitHub Check: test (ubuntu-latest, 1.24, mariadb-11.4)
  • GitHub Check: test (ubuntu-latest, 1.24, 5.7)
  • GitHub Check: test (ubuntu-latest, 1.24, 8.4)
  • GitHub Check: test (ubuntu-latest, 1.24, mariadb-11.1)
  • GitHub Check: test (ubuntu-latest, 1.24, 8.0)
🔇 Additional comments (14)
const.go (2)

46-46: Improved type name enhances code clarity.

Renaming clientFlag to capabilityFlag better reflects its purpose as it represents capabilities of both client and server, not just client flags.


49-49: Renamed constant maintains consistent naming with its type.

Renaming clientLongPassword to clientMySQL aligns with the type name change and better represents its meaning as the first capability flag.

packets_test.go (2)

335-335: Updated function call signature to match implementation changes.

The test correctly adapts to the modified readHandshakePacket function signature that now returns additional capability flags.


340-346: Good test coverage for server capabilities.

The added assertions verify that both standard and extended server capabilities are correctly parsed from the handshake packet, which is essential for testing the MariaDB metadata caching feature.

rows.go (2)

116-116: More precise method for skipping result set rows.

Replacing readUntilEOF() with skipResultSetRows() improves code semantics and likely handles metadata caching more efficiently when closing rows.


146-146: Consistent replacement of EOF reading with specialized method.

This change aligns with the previous one, ensuring consistent handling of result sets with potential metadata caching.

auth_test.go (1)

92-92: Updated function signature for handshake response packet.

The call to writeHandshakeResponsePacket now includes the server capability parameters needed for capability negotiation during authentication.

benchmark_test.go (1)

132-147: Minor loop-formatting change is fine

Only whitespace was touched here; functionality and performance stay identical.

statement.go (2)

120-131: Skip-EOF logic may mis-handle MariaDB “metadata skipped” path

When metadataFollows == false, MariaDB does not send an EOF packet; the first packet after the header is already the first row. Calling mc.skipEof() risks consuming that row and corrupting the stream.

Please verify with the MariaDB protocol docs; if confirmed, gate the EOF skip:

-        if err = mc.skipEof(); err != nil {
+        if mc.mc.isMySQL() { // or similar capability check
+            if err = mc.skipEof(); err != nil {
                 return nil, err
             }
         }

Failing to do so will surface as “invalid sequence” or “unexpected field count” on the next read.


73-83: Exec: fast-path skips — good change

Replacing readUntilEOF with explicit skipColumns and skipResultSetRows avoids parsing work and aligns with capability-aware handling. Nice!

connector.go (2)

160-161: HandshakeResponse now forwards server capability info – good

Forward-propagating both capability bitsets keeps the client/server negotiation symmetric and maintainable.


174-177: Compression gating uses negotiated client flags – correct

Switching from mc.flags to mc.clientCapabilities fixes false-negatives when the initial capability mask differed. Looks good.

connection.go (1)

223-242: Conditional metadata path looks solid – just ensure compatibility downstream
The new branch that caches column metadata when clientCacheMetadata is negotiated is well-structured and will save network round-trips. Please double-check every call-site that later inspects stmt.columns; when metadata is skipped, it will now be nil, which may not have been the case before.

packets.go (1)

945-960: EOF detection may mis-classify large data packets
skipResultSetRows treats any packet whose first byte is 0xFE and whose total length < 16 MB (0xffffff) as EOF/OK. According to the protocol, row packets can also legitimately start with 0xFE when the first column’s length-encoded string happens to use the 0xFE marker and the row is small (< 8 bytes/field).

A safer check is “length < 9” (EOF packet’s minimum size) which MySQL itself and other clients use.

@coveralls
Copy link

Coverage Status

coverage: 82.438% (-0.5%) from 82.961%
when pulling 1907645 on rusher:metadataSkip
into 0fd55eb on go-sql-driver:master.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants