Support VECTOR data type #1549

Closed
bgrainger opened this issue Feb 23, 2025 · 8 comments
@bgrainger
Member

Discussed in #1548, originally posted by @harrison314.

Add support for the VECTOR(n) type, which has been available since MariaDB 11.7.

https://mariadb.com/kb/en/vector-overview/

I've solved it with a workaround:

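    using System;
    using System.Buffers.Binary;
    using MySqlConnector;

    internal static class MySqlExtensions
    {
        // Adds a binary parameter holding the vector as consecutive 4-byte little-endian floats.
        public static MySqlParameter AddWithVectorValue(this MySqlParameterCollection parameters, string name, ReadOnlySpan<float> vector)
        {
            byte[] buffer = new byte[vector.Length * sizeof(float)];

            // Write each element explicitly as little-endian so the result does not depend on the CPU.
            for (int i = 0; i < vector.Length; i++)
            {
                BinaryPrimitives.WriteSingleLittleEndian(buffer.AsSpan(i * sizeof(float)), vector[i]);
            }

            MySqlParameter p = parameters.Add(name, System.Data.DbType.Binary);
            p.Value = buffer;

            return p;
        }
    }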
bgrainger changed the title from "Support MariaDB VECTOR data type" to "Support VECTOR data type" on Feb 23, 2025
@bgrainger
Member Author

bgrainger commented Feb 23, 2025

The underlying type should probably be float[] (not ReadOnlySpan<float> as that can't be boxed for MySqlParameter.Value).

Edit: Will use ReadOnlyMemory<float> as per dotnet/runtime#115148 (comment).

Possibly could add GetFieldValue<ReadOnlySpan<float>> to get a view of the row that's been retrieved.

bgrainger added a commit that referenced this issue Feb 23, 2025
@bgrainger
Member Author

The VECTOR data type was added in Connector/NET 8.4.0.

@bgrainger
Member Author

Preliminary tests with MariaDB 11.7 indicate that it doesn't use a dedicated on-the-wire type for VECTOR(n), so all results come back as BLOBs. (Possibly waiting on https://jira.mariadb.org/browse/MDEV-35831.) This means that MySqlDataReader.GetValue(n) will return byte[] and users will have to use Buffer.BlockCopy or MemoryMarshal.Cast<> to get a float[].
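The conversion described above could look roughly like the following. This is only a sketch: `DecodeVector` is a hypothetical helper, not part of the connector, and it assumes the BLOB holds consecutive 4-byte little-endian floats (the same layout the workaround in the first post writes). The hard-coded `blob` array stands in for whatever `MySqlDataReader.GetValue(n)` returns.

```csharp
using System;
using System.Buffers.Binary;
using System.Runtime.InteropServices;

// Hypothetical helper (not a connector API): decodes a VECTOR value that came
// back as a BLOB, assuming consecutive 4-byte little-endian floats.
static float[] DecodeVector(ReadOnlySpan<byte> blob)
{
    if (blob.Length % sizeof(float) != 0)
        throw new ArgumentException("Length must be a multiple of 4 bytes.", nameof(blob));

    if (BitConverter.IsLittleEndian)
    {
        // Fast path: the byte layout already matches the in-memory layout.
        return MemoryMarshal.Cast<byte, float>(blob).ToArray();
    }

    // Portable path: read each element explicitly as little-endian.
    var result = new float[blob.Length / sizeof(float)];
    for (int i = 0; i < result.Length; i++)
        result[i] = BinaryPrimitives.ReadSingleLittleEndian(blob.Slice(i * sizeof(float), sizeof(float)));
    return result;
}

// Stand-in for the byte[] returned by the reader: the little-endian
// encodings of 1.0f and -2.0f.
byte[] blob = { 0x00, 0x00, 0x80, 0x3F, 0x00, 0x00, 0x00, 0xC0 };
float[] vector = DecodeVector(blob);
Console.WriteLine(string.Join(", ", vector));
```

The explicit `BitConverter.IsLittleEndian` check keeps the `MemoryMarshal.Cast<>` shortcut to machines where reinterpreting the bytes is valid, and the fallback reads each element with a fixed byte order.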

bgrainger self-assigned this on Feb 24, 2025
@harrison314

> Possibly could add GetFieldValue<ReadOnlySpan<float>> to get a view of the row that's been retrieved.

I think it's enough to implement float[], because the array will always be allocated anyway. I used ReadOnlySpan<float> in my helper method because it is a generic type. But it may not be suitable for implementing methods in the MySQL connector.

> This means that MySqlDataReader.GetValue(n) will return byte[] and users will have to use Buffer.BlockCopy or MemoryMarshal.Cast<> to get a float[].

MemoryMarshal.Cast<> may not give correct results on all processors, because it reinterprets the bytes in the machine's native byte order (big-endian or little-endian).

@harrison314

Do you plan to release at least a preview NuGet package with vector support in the near future?

@bgrainger
Member Author

@harrison314 Since you mentioned big-endian systems, can you review this one commit? c7bb9b3

(I don't have a big-endian system for testing, so it's mostly a guess that that's the correct code.)

@harrison314

@bgrainger Unfortunately, I also don't have a big-endian system to test on.
I looked at the given commit and tried the opposite direction of the transformation using both BinaryPrimitives and Array.Reverse; they gave the same results, so I think the given code is fine.
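A cross-check of this kind could be sketched as follows. The contents of the referenced commit are not shown in this thread, so both helper names here are hypothetical and only illustrate comparing a `BinaryPrimitives`-based encoding with a `BitConverter` plus `Array.Reverse` encoding.

```csharp
using System;
using System.Buffers.Binary;

// Hypothetical helper: encodes a float as 4 little-endian bytes using BinaryPrimitives.
static byte[] EncodeWithBinaryPrimitives(float value)
{
    var bytes = new byte[sizeof(float)];
    BinaryPrimitives.WriteSingleLittleEndian(bytes, value);
    return bytes;
}

// Hypothetical helper: encodes a float using BitConverter (native byte order),
// reversing the bytes only when running on a big-endian machine.
static byte[] EncodeWithArrayReverse(float value)
{
    byte[] bytes = BitConverter.GetBytes(value);
    if (!BitConverter.IsLittleEndian)
        Array.Reverse(bytes);
    return bytes;
}

float[] samples = { 0f, 1f, -2.5f, float.MaxValue, float.Epsilon };
foreach (float sample in samples)
{
    byte[] a = EncodeWithBinaryPrimitives(sample);
    byte[] b = EncodeWithArrayReverse(sample);
    Console.WriteLine($"{sample}: {(a.AsSpan().SequenceEqual(b) ? "match" : "MISMATCH")}");
}
```

On a little-endian machine the `Array.Reverse` branch never runs, so agreement there shows only that the two approaches match on that architecture. It does not replace a test on real big-endian hardware.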
