Skip to content

Throwing on an invalid charset seems a little much for a minor release #807

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
tgriesser opened this issue May 6, 2014 · 8 comments
Closed
Labels

Comments

@tgriesser
Copy link

Ticket #789 added an error thrown when an invalid charset is specified.

Is this a feature to throw? Would doing a a console.error with a message or something along those lines make more sense?

I had for awhile assumed that utf8 was a valid choice for a charset since it's what I would have used in mysql, the docs even have > charset utf8; in multiple examples.

So I'd advertised it in the docs for knex.js as the default for connection settings example, and I'm sure there are a lot of people who still have that as a config option, for example here in the ghost blogging platform.

I'm just afraid this might cause more breakage than it's worth for a minor release, particularly now that the ^ is the default for npm install --save, and since as far as I can tell from mysql docs, utf8 makes sense as a valid value for a charset.

@dougwilson
Copy link
Member

Would doing a a console.error with a message or something along those lines make more sense?

We really won't want to spew out random to people's stderr/stdout pipes.

Is this a feature to throw?

The "feature" is that people had no idea what the valid values were, so it is very dangerous to specify an unknown value, as it would do something you didn't intend silently. Since throwing is a "feature" it was introduced in 2.2, which is according to semver conventions.

I had for awhile assumed that utf8 was a valid choice for a charset since it's what I would have used in mysql

That may be, but the docs for this module showed the valid was "UTF8_GENERAL_CI" and required to be in caps since v2.0.0-alpha2 and clarified it needed to be in caps (which is now no longer the case) since v2.0.0-alpha9 (this is all before v2 was even released). It was documented correctly, but the code was incorrectly ignoring bad values.

the docs even have > charset utf8; in multiple examples.

Those docs are for SET calls in MySQL queries, not for the protocol. The MySQL docs for the actual protocol charset is here: http://dev.mysql.com/doc/refman/5.0/en/charset-applications.html

This client calls it "charset" because that it what it is called on the protocol level (http://dev.mysql.com/doc/internals/en/connection-phase-packets.html#packet-Protocol::Handshake), but it is technically the collation that the protocol is using.

particularly now that the ^ is the default for npm install --save

Perhaps, but not throwing was a bug in the prior releases; all bug fixes are technically breaking, which does suck.

@dougwilson
Copy link
Member

Forgot to address this line:

as far as I can tell from mysql docs, utf8 makes sense as a valid value for a charset.

Remember that "charset" is what the protocol calls it, but what is negotiated is the collation, as in there is no binary value that corresponds to "utf8"; you can see the full list here: https://github.com/felixge/node-mysql/blob/v2.2.0/lib/protocol/constants/charsets.js

@tgriesser
Copy link
Author

We really won't want to spew out random to people's stderr/stdout pipes.

Fair

The "feature" is that people had no idea what the valid values were, so it is very dangerous to specify an unknown value, as it would do something you didn't intend silently. Since throwing is a "feature" it was introduced in 2.2, which is according to semver conventions.

I guess feature depends a lot on one's interpretation. I see it as a breaking change

Remember that "charset" is what the protocol calls it, but what is negotiated is the collation

Right, so if really what you're requesting there is the collation name, not the charset name, then it's the API that's unclear/incorrect.

Would it be possible to add a lookup table between the actual character set names and the collation names, so as to allow a more flexible charset api? Or to instead change charset to collation in the config, as that's really what you're asking for/providing.

@tgriesser
Copy link
Author

If so, I'd gladly work on it and open a pull request

@tgriesser
Copy link
Author

So also, looking at http://dev.mysql.com/doc/internals/en/charsets.html it really looks like the charset value should correspond to things like utf8 latin1, it's just always been that the collation was mislabeled as charset in node-mysql.

So anyway, still think this should be addressed differently than throwing for those putting a valid charset value in a config slot that asks for a charset, not a collation.

@dougwilson
Copy link
Member

I guess feature depends a lot on one's interpretation. I see it as a breaking change

That's fine. It won't be going away, though. 2.2+ will throw on unknown charset settings. Commit 09e3343 is also a bug fix, but will suddenly make stuff error when before it silently did nothing. Adding a throw to fix a bug is a standard thing. Node.js core has added throws when the type to an API in patch releases many times.

An API that, given an unknown value, changes that value to a default value is very clearly broken. Like I said, this is not going to change. You're welcome to spec >= 2 < 2.2 if you like :)

Would it be possible to add a lookup table between the actual character set names and the collation names, so as to allow a more flexible charset api?

Not particularly. If you give "utf8" how would it choose between the following values?

UTF8_GENERAL_CI
UTF8_BIN
UTF8_UNICODE_CI
UTF8_ICELANDIC_CI
UTF8_LATVIAN_CI
UTF8_ROMANIAN_CI
UTF8_SLOVENIAN_CI
UTF8_POLISH_CI
UTF8_ESTONIAN_CI
UTF8_SPANISH_CI
UTF8_SWEDISH_CI
UTF8_TURKISH_CI
UTF8_CZECH_CI
UTF8_DANISH_CI
UTF8_LITHUANIAN_CI
UTF8_SLOVAK_CI
UTF8_SPANISH2_CI
UTF8_ROMAN_CI
UTF8_PERSIAN_CI
UTF8_ESPERANTO_CI
UTF8_HUNGARIAN_CI
UTF8_SINHALA_CI
UTF8_GERMAN2_CI
UTF8_CROATIAN_MYSQL561_CI
UTF8_UNICODE_520_CI
UTF8_VIETNAMESE_CI
UTF8_GENERAL_MYSQL500_CI
UTF8_GENERAL50_CI

If so, I'd gladly work on it and open a pull request

See above.

So anyway, still think this should be addressed differently than throwing for those putting a valid charset value in a config slot that asks for a charset, not a collation.

Yes, the naming is confusing, I agree, but http://dev.mysql.com/doc/internals/en/connection-phase-packets.html#packet-Protocol::Handshake calls it "charset" in the protocol, rather than collation for whatever reason. I don't want to have both a charset and a collation configuration value as aliases, but I have slated to renaming the config to collation in v3.

TL;DR I'm glad this fix has brought to your attention the invalid protocol charset option you were supplying.

@dougwilson
Copy link
Member

@tgriesser feel free to email me (in my GitHub profile) if you are interested in discussing a solution over the phone.

@tgriesser
Copy link
Author

Following up with an email.

dveeden pushed a commit to dveeden/mysql that referenced this issue Jan 31, 2023
* log missing auth plugin name

* refactor auth handling

* auth: fix AllowNativePasswords

* auth: remove plugin name print

* packets: attempt to fix writePublicKeyAuthPacket

* packets: do not NUL-terminate auth switch packets

* move handleAuthResult to auth

* add old_password auth tests

* auth: add empty old_password test

* auth: add cleartext auth tests

* auth: add native auth tests

* auth: add caching_sha2 tests

* rename init and auth packets to documented names

* auth: fix plugin name for switched auth methods

* buffer: optimize default branches

* auth: add tests for switch to caching sha2

* auth: add tests for switch to cleartext password

* auth: add tests for switch to native password

* auth: sync NUL termination with official connectors

* packets: handle missing NUL bytes in AuthSwitchRequests

Updates mysqljs#795
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Development

No branches or pull requests

2 participants