Skip to content

Parse SIGNED INTEGER type in MySQL CAST #1739

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Feb 26, 2025
Merged

Conversation

mvzink
Copy link
Contributor

@mvzink mvzink commented Feb 21, 2025

MySQL doesn't have the same set of possible CAST types as for e.g. column definitions. For example, it raises a syntax error for CAST(1 AS INTEGER SIGNED) and instead expects CAST(1 AS SIGNED INTEGER).

We retain the current somewhat permissive datatype parsing behavior (e.g. allowing CAST(1 AS BIGINT) even though MySQL would raise a syntax error), and add datatypes for this specific case (SIGNED [INTEGER] and UNSIGNED [INTEGER]).

To keep the naming consistent across datatypes, we also rename UnsignedInt to IntUnsigned and similar for other unsigned datatypes. This means they display in the same way they are written, i.e. IntUnsigned = INT UNSIGNED instead of UnsignedInt = INT UNSIGNED.

Closes #1589

readysetbot pushed a commit to readysettech/readyset that referenced this pull request Feb 22, 2025
See relevant sqlparser [PR].

[PR]: apache/datafusion-sqlparser-rs#1739

Change-Id: I11a01b48f1649c6530cdbe7bd23c158e9fe4f888
@mvzink mvzink marked this pull request as draft February 24, 2025 15:20
@mvzink mvzink changed the title Parse signed/unsigned integer data types correctly in MySQL CAST Parse signed/unsigned integer data type in MySQL CAST Feb 24, 2025
@mvzink mvzink marked this pull request as ready for review February 24, 2025 17:08
/// Semantically equivalent to `BIGINT`.
///
/// [MySQL CAST]: https://dev.mysql.com/doc/refman/8.4/en/cast-functions.html
Signed(bool),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thinking it feels a unexpected to use Signed(true) to represent SIGNED INTEGER? and the repr assumes a mysql-only type which might not be the case going forward. Feels like it'd be clear to represent them explicitly as SIGNED and SIGNED INTEGER where both can be reused by other dialects later if needed?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure exactly what you mean. Different variants Signed = SIGNED and SignedInteger = SIGNED INTEGER? The only reason I didn't do that is there's already UnsignedInt = INT[EGER] UNSIGNED and it would be even more confusing to have UnsignedInteger = UNSIGNED INTEGER, and very disruptive to change the existing UnsignedInt to IntUnsigned or something.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah I see, that is what you are suggesting. I misread your previous comment, sorry! I'll try that out.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed. Let me know if this is what you had in mind. Definitely could be a bit disruptive to downstream users.

MySQL doesn't have the same set of possible CAST types as for e.g.
column definitions. For example, it raises a syntax error for `CAST(1 AS
INTEGER SIGNED)` and instead expects `CAST(1 AS SIGNED INTEGER)`.

We retain the current somewhat permissive datatype parsing behavior
(e.g. allowing `CAST(1 AS BIGINT)` even though MySQL would raise a
syntax error), and add datatypes for this specific case (`SIGNED
[INTEGER]` and `UNSIGNED [INTEGER]`).

To keep the naming consistent across datatypes, we also rename
`UnsignedInt` to `IntUnsigned` and similar for other unsigned datatypes.
This means they display in the same way they are written, i.e.
`IntUnsigned = INT UNSIGNED` instead of `UnsignedInt = INT UNSIGNED`.

Closes apache#1589
@mvzink mvzink changed the title Parse signed/unsigned integer data type in MySQL CAST Parse SIGNED INTEGER type in MySQL CAST Feb 25, 2025
Copy link
Contributor

@iffyio iffyio left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM! Thanks @mvzink!
cc @alamb

@iffyio iffyio merged commit de4dbc5 into apache:main Feb 26, 2025
9 checks passed
@mvzink mvzink deleted the push-vlqruupookvx branch February 26, 2025 17:09
ayman-sigma pushed a commit to sigmacomputing/sqlparser-rs that referenced this pull request Apr 10, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

MySQL datatype discrepancy between DDL and CAST
2 participants