-
Notifications
You must be signed in to change notification settings - Fork 27.4k
Why does Angular need its own angular.uppercase and lowercase methods? #11387
Comments
The reason is documented in a source comment: Lines 161 to 163 in e5d1d65
It's also documented on MDN: https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/String/toLocaleLowerCase Lastly, I think the StackOverflow answers were pretty accurate. |
Neither the comment nor the SO answers are informative.
How do those answers answer it? How does the source comment answer it? Moreover, the comment and the answers contradict each other. In theory, toLowerCase() & toUpperCase() shouldn't be locale-sensitive. That's what one of the answers says. And your MDN link says the same thing. But the comment states something very opposite:
Again, in which exactly browsers does this happen? |
Searching on Google, this seems like a general problem due to the Turkish language itself - this article seems to explain it: http://www.i18nguy.com/unicode/turkish-i18n.html If I had to guess, this is a problem with every browser. |
The general problem exists, that's true. But who said that it (still) exists also in JavaScript? Who faced it during say last 10 years? |
I doubt it was pulled from thin air - that is not how the Angular team functions. |
The hacks and workarounds for IE8 are being removed from the code of Angular now. But this thing looks like a workaround for a bug in some even older browser. |
@thorn0 why not testing on your end (you seem to be from Turkey) and sending a PR to remove those work-arounds if no longer necessary? I'm pretty sure that everyone would be more than happy to remove unnecessary code. |
I wanted to know why this code is there. If the issue still exists (if ever existed) in some browser, it'd be nice to know about it. I'm going to check it in different browsers, but of course I don't have access to all the needed OS+device+browser combinations. |
@thorn0 AFAIK Turkish locale was the only / main reason. So if you can confirm that this bug doesn't affect users of modern browsers and submit a PR to see if all the tests are passing on CI this could potentially be merging. But as you are saying, no one will be able to re-test this on all the possible devices in the wild, so we might potentially introduce a breaking change here.... |
BTW, another library containing code like that is Google Closure Library. See these lines. Supposedly, it came to Angular from there. |
I think this can be split into two issues
The former point should be easy to decide if someone can verify that this is no longer an issue with the supported browsers |
In case it helps, the code was added to the angular.js file by @mhevery in Oct 2010 as part of this commit "create HTML sanitizer to allow inclusion of untrusted HTML in safe manner". Everything that I could find on the topic only mentioned this being a problem in Java. The ECMAScript standards (3 and 5) state that the toUpperCase and toLowerCase functions are not locale sensitive and the toLocaleUpperCase and toLocaleLowerCase functions are locale sensitive. So it doesn't seem to be required, but I don't know how to test on multiple browsers in the Turkish locale. |
I think the right thing to do for now would be just undocumenting |
The basic issue is that when malicious code writes If you can prove that in JS |
I notice that there are various places in the sanitizer that "don't" use our custom lowercase. E.g. https://github.com/angular/angular.js/blob/master/src/ngSanitize/sanitize.js#L378 |
As far as I can tell, AngularJS and the Closure Library are the only libraries that include this workaround, and nowhere else on the Internet is it mentioned that this problem ever existed in the JS world, only in Java. |
The `manualLowercase` & `manualUppercase` functions were inspired by Google Caja code. Caja is written in Java, though, where problems with `toLowerCase` working differently in Turkish locale are well known[1]. In JavaScript `String#toLowerCase` is defined in the ECMAScript spec and all implementations are required to lowercase I in the same way, regardless of the current locale. Differences may (and do) happen only in `String#toLocaleLowerCase`. Other libraries doing string normalization, like jQuery or DOMPurify don't apply special lowercasing logic in a Turkish environment. Therefore, the `manualLowercase` & `manualUppercase` logic is dead code in AngularJS and can be removed. Also, the `manualLowercase` & `manualUppercase` functions are incomplete; they only lowercase ASCII characters which is different to native `String#toLowerCase`. Since those functions are used in many places in the library, they would break a lot of code. For example, the lowercase filter would not lowercase Ω to ω but leave it as Ω. [1] https://garygregory.wordpress.com/2015/11/03/java-lowercase-conversion-turkey/ Ref angular#11387
The `manualLowercase` & `manualUppercase` functions were inspired by Google Caja code. Caja is written in Java, though, where problems with `toLowerCase` working differently in Turkish locale are well known[1]. In JavaScript `String#toLowerCase` is defined in the ECMAScript spec and all implementations are required to lowercase I in the same way, regardless of the current locale. Differences may (and do) happen only in `String#toLocaleLowerCase`. Other libraries doing string normalization, like jQuery or DOMPurify don't apply special lowercasing logic in a Turkish environment. Therefore, the `manualLowercase` & `manualUppercase` logic is dead code in AngularJS and can be removed. Also, the `manualLowercase` & `manualUppercase` functions are incomplete; they only lowercase ASCII characters which is different to native `String#toLowerCase`. Since those functions are used in many places in the library, they would break a lot of code. For example, the lowercase filter would not lowercase Ω to ω but leave it as Ω. [1] https://garygregory.wordpress.com/2015/11/03/java-lowercase-conversion-turkey/ Ref angular#11387
The `manualLowercase` & `manualUppercase` functions were inspired by Google Caja code. Caja is written in Java, though, where problems with `toLowerCase` working differently in Turkish locale are well known[1]. In JavaScript `String#toLowerCase` is defined in the ECMAScript spec and all implementations are required to lowercase I in the same way, regardless of the current locale. Differences may (and do) happen only in `String#toLocaleLowerCase`. Other libraries doing string normalization, like jQuery or DOMPurify don't apply special lowercasing logic in a Turkish environment. Therefore, the `manualLowercase` & `manualUppercase` logic is dead code in AngularJS and can be removed. Also, the `manualLowercase` & `manualUppercase` functions are incomplete; they only lowercase ASCII characters which is different to native `String#toLowerCase`. Since those functions are used in many places in the library, they would break a lot of code. For example, the lowercase filter would not lowercase Ω to ω but leave it as Ω. [1] https://garygregory.wordpress.com/2015/11/03/java-lowercase-conversion-turkey/ Ref angular#11387
The `manualLowercase` & `manualUppercase` functions were inspired by Google Caja code. Caja is written in Java, though, where problems with `toLowerCase` working differently in Turkish locale are well known[1]. In JavaScript `String#toLowerCase` is defined in the ECMAScript spec and all implementations are required to lowercase I in the same way, regardless of the current locale. Differences may (and do) happen only in `String#toLocaleLowerCase`. Other libraries doing string normalization, like jQuery or DOMPurify don't apply special lowercasing logic in a Turkish environment. Therefore, the `manualLowercase` & `manualUppercase` logic is dead code in AngularJS and can be removed. Also, the `manualLowercase` & `manualUppercase` functions are incomplete; they only lowercase ASCII characters which is different to native `String#toLowerCase`. Since those functions are used in many places in the library, they would break a lot of code. For example, the lowercase filter would not lowercase Ω to ω but leave it as Ω. [1] https://garygregory.wordpress.com/2015/11/03/java-lowercase-conversion-turkey/ Ref angular#11387
The `manualLowercase` & `manualUppercase` functions were inspired by Google Caja code. Caja is written in Java, though, where problems with `toLowerCase` working differently in Turkish locale are well known[1]. In JavaScript `String#toLowerCase` is defined in the ECMAScript spec and all implementations are required to lowercase I in the same way, regardless of the current locale. Differences may (and do) happen only in `String#toLocaleLowerCase`. Other libraries doing string normalization, like jQuery or DOMPurify don't apply special lowercasing logic in a Turkish environment. Therefore, the `manualLowercase` & `manualUppercase` logic is dead code in AngularJS and can be removed. Also, the `manualLowercase` & `manualUppercase` functions are incomplete; they only lowercase ASCII characters which is different to native `String#toLowerCase`. Since those functions are used in many places in the library, they would break a lot of code. For example, the lowercase filter would not lowercase Ω to ω but leave it as Ω. [1] https://garygregory.wordpress.com/2015/11/03/java-lowercase-conversion-turkey/ Ref angular#11387
The `manualLowercase` & `manualUppercase` functions were inspired by Google Caja code. Caja is written in Java, though, where problems with `toLowerCase` working differently in Turkish locale are well known[1]. In JavaScript `String#toLowerCase` is defined in the ECMAScript spec and all implementations are required to lowercase I in the same way, regardless of the current locale. Differences may (and do) happen only in `String#toLocaleLowerCase`. Other libraries doing string normalization, like jQuery or DOMPurify don't apply special lowercasing logic in a Turkish environment. Therefore, the `manualLowercase` & `manualUppercase` logic is dead code in AngularJS and can be removed. Also, the `manualLowercase` & `manualUppercase` functions are incomplete; they only lowercase ASCII characters which is different to native `String#toLowerCase`. Since those functions are used in many places in the library, they would break a lot of code. For example, the lowercase filter would not lowercase Ω to ω but leave it as Ω. [1] https://garygregory.wordpress.com/2015/11/03/java-lowercase-conversion-turkey/ Ref angular#11387
The `manualLowercase` & `manualUppercase` functions were inspired by Google Caja code. Caja is written in Java, though, where problems with `toLowerCase` working differently in Turkish locale are well known[1]. In JavaScript `String#toLowerCase` is defined in the ECMAScript spec and all implementations are required to lowercase I in the same way, regardless of the current locale. Differences may (and do) happen only in `String#toLocaleLowerCase`. Other libraries doing string normalization, like jQuery or DOMPurify don't apply special lowercasing logic in a Turkish environment. Therefore, the `manualLowercase` & `manualUppercase` logic is dead code in AngularJS and can be removed. Also, the `manualLowercase` & `manualUppercase` functions are incomplete; they only lowercase ASCII characters which is different to native `String#toLowerCase`. Since those functions are used in many places in the library, they would break a lot of code. For example, the lowercase filter would not lowercase Ω to ω but leave it as Ω. [1] https://garygregory.wordpress.com/2015/11/03/java-lowercase-conversion-turkey/ Ref angular#11387
The `manualLowercase` & `manualUppercase` functions were inspired by Google Caja code which worked around Java issues where problems with `toLowerCase` working differently in Turkish locale are well known[1]. In JavaScript `String#toLowerCase` is defined in the ECMAScript spec and all implementations are required to lowercase I in the same way, regardless of the current locale. Differences may (and do) happen only in `String#toLocaleLowerCase`. The mirroring of the Java workarounds in Caja was needed due to an old Rhino bug. Rhino is a pre-Nashorn JavaScript interpreter written in Java and it used to delegate `String.prototype.toLowerCase` to `java.lang.String.toLowerCase`. This has since been long fixed. Other libraries doing string normalization, like jQuery or DOMPurify don't apply special lowercasing logic in a Turkish environment. Therefore, the `manualLowercase` & `manualUppercase` logic is dead code in AngularJS and can be removed. Also, the `manualLowercase` & `manualUppercase` functions are incomplete; they only lowercase ASCII characters which is different to native `String#toLowerCase`. Since those functions are used in many places in the library, they would break a lot of code. For example, the lowercase filter would not lowercase Ω to ω but leave it as Ω. [1] https://garygregory.wordpress.com/2015/11/03/java-lowercase-conversion-turkey/ Closes #15890 Ref #11387
What's wrong with the standard
toUpperCase
andtoLowerCase
methods?Looks like there was some issue with the Turkish locale, but does it still exist in the browsers supported by Angular? I asked about it on StackOverflow, but didn't get an answer.
The text was updated successfully, but these errors were encountered: