-
Notifications
You must be signed in to change notification settings - Fork 27.4k
Conversation
A period (‘.’) may not begin or end a local-part cf. http://www.mirbsd.org/cvs.cgi/contrib/hosted/tg/mailfrom.php?rev=HEAD and RFC 822 / 5321
A “label” (each of the things between the dots (‘.’) after the ‘@’ in the eMail address) MUST NOT be longer than 63 characters. cf. http://www.mirbsd.org/cvs.cgi/contrib/hosted/tg/mailfrom.php?rev=HEAD and RFC 1035 §2.3.4
RFC 1035 §2.3.4 imposes a maximum length for any DNS object; RFC 5321 §2.3.5 references this (and requires FQDNs, but there have been cases where a TLD had an MX RR and thus eMail addresses like “localpart@tld” are valid). Credits: Natureshadow <[email protected]>
There is no need to make it case-insensitive; the local-part already catches the entire range, and the host part is easily done. Furthermore, this makes the regex locale-independent, avoiding issues with e.g. turkish case conversions. cf. http://www.mirbsd.org/cvs.cgi/contrib/hosted/tg/mailfrom.php?rev=HEAD
Thanks for your pull request. It looks like this may be your first contribution to a Google open source project. Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA). 📝 Please visit https://cla.developers.google.com/ to sign. Once you've signed, please reply here (e.g.
|
1 similar comment
Thanks for your pull request. It looks like this may be your first contribution to a Google open source project. Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA). 📝 Please visit https://cla.developers.google.com/ to sign. Once you've signed, please reply here (e.g.
|
I signed it! (although this diff does not pass threshold of originality, and as such is not covered by copyright protection) |
@mirabilos CLA is required for all Google projects, unfortunately. I wouldn't require it personally. Can you include links to the docs on which these changes are based? And we also need tests for these changes in this file: https://github.com/angular/angular.js/blob/master/test/ng/directive/inputSpec.js#L2784 |
Martin Staffa dixit:
Yeswell, I did sign it, and today I even went and made sure
RFC 822/5321 (SMTP) and RFC 1035 (DNS) are linked from the
When I read that I thought “oh nooooo…” at first, but that Is adding them in a subsequent commit okay with you? I don’t Thanks, //mirabilostarent solutions GmbH |
CLAs look good, thanks! |
1 similar comment
CLAs look good, thanks! |
@@ -24,7 +24,7 @@ var ISO_DATE_REGEXP = /^\d{4,}-[01]\d-[0-3]\dT[0-2]\d:[0-5]\d:[0-5]\d\.\d+(?:[+- | |||
// 9. Fragment | |||
// 1111111111111111 222 333333 44444 555555555555555555555555 666 77777777 8888888 999 | |||
var URL_REGEXP = /^[a-z][a-z\d.+-]*:\/*(?:[^:@]+(?::[^@]+)?@)?(?:[^\s:/?#]+|\[[a-f\d:]+\])(?::\d+)?(?:\/[^?#]*)?(?:\?[^#]*)?(?:#.*)?$/i; | |||
var EMAIL_REGEXP = /^[a-z0-9!#$%&'*+\/=?^_`{|}~.-]+@[a-z0-9]([a-z0-9-]*[a-z0-9])?(\.[a-z0-9]([a-z0-9-]*[a-z0-9])?)*$/i; | |||
var EMAIL_REGEXP = /^[-!#-'*+\/-9=?A-Z^-~]+(\.[-!#-'*+\/-9=?A-Z^-~]+)*@(?=.{1,255}$)[A-Za-z0-9]([A-Za-z0-9-]{0,61}[A-Za-z0-9])?(\.[A-Za-z0-9]([A-Za-z0-9-]{0,61}[A-Za-z0-9])?)*$/; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ranges such as #-'
or \/-9
make this much less readable imo.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Georgios Kalpakas dixit:
Ranges such as
#-'
or\/-9
make this much less readable imo.
OK, I’ll expand them. But then, there are people who can’t read
a regex no matter how legible it is, and others will know about
it…
bye,
//mirabilos
tarent solutions GmbH
Rochusstraße 2-4, D-53123 Bonn • http://www.tarent.de/
Tel: +49 228 54881-393 • Fax: +49 228 54881-235
HRB 5168 (AG Bonn) • USt-ID (VAT): DE122264941
Geschäftsführer: Dr. Stefan Barth, Kai Ebenrett, Boris Esser, Alexander Steeg
@Narretz tests included, let’s wait for Travis and hope I didn’t add a typo somewhere by accident… and that expanding the character class ranges didn’t fuck it up either… |
@Narretz haha “line is too long”, any idea how I can address this test? Inhouse we always exclude checks such as this one from being run on the testsuite, only on the main code. In this special case, the line is legitimately very long as it tests a very long string… which, I suppose, I could concatenate, but… any suggestions? It’s your project after all. |
requested by Georgios Kalpakas in #14719 (let’s hope all the special characters are accepted by JS as-is…)
LGTM (assuming the tests are green). Now, the big question: Is it a BC ? |
@gkalpak still waiting for the tests, I tried to get around the line length issue by concatenating, and had a pasto in them indeed (something I tested from the command-line but mispasted). Now, the big question: what is a “BC”? https://www.mirbsd.org/wtf.cgi?q=BC doesn’t seem to match in this context… |
I just recherched the maximum length of a local-part, and it’s 64 (RFC 5321 §4.5.3.1.1) so I’ll add another change to limit (and check!) that, but I’ll wait for this set of tests to pass first. |
BC means Breaking Change, i.e. if the change will break something that worked before in someone's application. Since this tightens it in terms of the spec, I think it's not BC. BTW, we are going to squash the commits into one before merging (but preserve the commit message contents) |
Lol, sorry for the acronym. So, "BC" is a breaking change. Regarding, RFC 5321 §4.5.3.1.1, it sounds more like a recommendation than a hard limit: Every implementation MUST be able to receive objects of at least It might be better to leave |
No need to be sorry about the acronyms – I actually do collect them. I’d prefer to have the commits not squashed, as they contain detail relevant for each partial change, but… I guess it’s your project. This is not a breaking change because ① it only invalidates some eMail addresses previously passing as valid which are not valid in SMTP, and ② your own documentation – https://docs.angularjs.org/api/ng/input/input%5Bemail%5D – suggests users with specific requirements to use their own checks anyway. I’d not let things unrestricted because the regex used is a “reasonably sane default” for eMail addresses, which omits, for example, all obsolete forms, the quoted localpart form, and the literal addresses like In case you’re curious: there’s actually an eMail address validating exactly what the RFC says, but it’s so large it doesn’t even fit on my (laptop) screen. I can dig it out if needed. I think the use case for this expression is to catch common typos and mistakes. |
We prefer to squash the commit when merging (so our history is cleaner on a higher level), but keep the unsquashed commits on the PR, which we link to from the squashed commit message. So, all the context is preserved and is easy to get to from the final commit.
Point taken 😃 |
OK. Oh great, now we have a line length issue in the regex, out of all places. I’ll try to wrap this. (On the topic of acronyms, if you have any that http://www.mirbsd.org/wtf.htm or https://f-droid.org/repository/browse/?fdid=de.naturalnet.mirwtfapp doesn’t know, send them to me privately.) |
Don't wrap it. Use
|
RFC 5321 §4.5.3.1.1 ⇒ local-part can have up to 64 octets RFC 5321 §4.5.3.1.3 ⇒ path “including the punctuation and element separators” can have up to 256 octets RFC 5321 §4.1.2 specifies path as ‘<’ + mailbox¹ + ‘>’ in the best case, leaving us 254 octets The limitation of the total path size to 254 octets leaves at most 252 octets (one local-part, one ‘@’) for the domain, which means we don’t need to explicitly check the domain size any more (removing the assertion after the ‘@’). ① RFC 821/5321 “mailbox” is the same as RFC 822 “addr-spec”
Thanks, that indeed worked. I think we’re ready for review then. |
**Limit size of local-part and total path size in eMail addresses** RFC 5321 §4.5.3.1.1 ⇒ local-part can have up to 64 octets RFC 5321 §4.5.3.1.3 ⇒ path “including the punctuation and element separators” can have up to 256 octets RFC 5321 §4.1.2 specifies path as ‘<’ + mailbox¹ + ‘>’ in the best case, leaving us 254 octets The limitation of the total path size to 254 octets leaves at most 252 octets (one local-part, one ‘@’) for the domain, which means we don’t need to explicitly check the domain size any more (removing the assertion after the ‘@’). ① RFC 821/5321 “mailbox” is the same as RFC 822 “addr-spec” **Optimise eMail address regex for speed** There is no need to make it case-insensitive; the local-part already catches the entire range, and the host part is easily done. Furthermore, this makes the regex locale-independent, avoiding issues with e.g. turkish case conversions. cf. http://www.mirbsd.org/cvs.cgi/contrib/hosted/tg/mailfrom.php?rev=HEAD **Limit eMail address total host part length** RFC 1035 §2.3.4 imposes a maximum length for any DNS object; RFC 5321 §2.3.5 references this (and requires FQDNs, but there have been cases where a TLD had an MX RR and thus eMail addresses like “localpart@tld” are valid). Credits: Natureshadow <[email protected]> **Limit eMail address individual host part length** A “label” (each of the things between the dots (‘.’) after the ‘@’ in the eMail address) MUST NOT be longer than 63 characters. cf. http://www.mirbsd.org/cvs.cgi/contrib/hosted/tg/mailfrom.php?rev=HEAD and RFC 1035 §2.3.4 **Fix eMail address local-part validation** A period (‘.’) may not begin or end a local-part cf. http://www.mirbsd.org/cvs.cgi/contrib/hosted/tg/mailfrom.php?rev=HEAD and RFC 822 / 5321 Closes #14719
These commits address a few shortcomings of the current eMail address regular expression used for form validation: