Why does Angular need its own angular.uppercase and lowercase methods? #11387

thorn0 · 2015-03-21T14:05:32Z

What's wrong with the standard toUpperCase and toLowerCase methods?
Looks like there was some issue with the Turkish locale, but does it still exist in the browsers supported by Angular? I asked about it on StackOverflow, but didn't get an answer.

realityking · 2015-03-21T15:31:56Z

The reason is documented in a source comment:

angular.js/src/Angular.js

Lines 161 to 163 in e5d1d65

    // String#toLowerCase and String#toUpperCase don't produce correct results in browsers with Turkish
    // locale, for this reason we need to detect this case and redefine lowercase/uppercase methods
    // with correct but slower alternatives.

It's also documented on MDN: https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/String/toLocaleLowerCase

Lastly, I think the StackOverflow answers were pretty accurate.

thorn0 · 2015-03-21T15:49:25Z

Neither the comment nor the SO answers are informative.
The question is

"In exactly which JS engines are toLowerCase() & toUpperCase() locale-sensitive?"

How do those answers answer it? How does the source comment answer it? Moreover, the comment and the answers contradict each other. In theory, toLowerCase() & toUpperCase() shouldn't be locale-sensitive. That's what one of the answers says, and your MDN link says the same thing. But the comment states the exact opposite:

String#toLowerCase and String#toUpperCase don't produce correct results in browsers with Turkish locale

Again, in exactly which browsers does this happen?

wesleycho · 2015-03-21T17:41:33Z

Searching on Google, this seems like a general problem due to the Turkish language itself - this article seems to explain it: http://www.i18nguy.com/unicode/turkish-i18n.html

If I had to guess, this is a problem with every browser.

thorn0 · 2015-03-21T17:54:52Z

The general problem exists, that's true. But who said that it (still) exists in JavaScript as well? Who has run into it during, say, the last 10 years?

wesleycho · 2015-03-21T18:01:03Z

I doubt it was pulled from thin air - that is not how the Angular team functions.

thorn0 · 2015-03-21T18:01:09Z

The hacks and workarounds for IE8 are being removed from the code of Angular now. But this thing looks like a workaround for a bug in some even older browser.

pkozlowski-opensource · 2015-03-21T18:10:57Z

@thorn0 why not test on your end (you seem to be from Turkey) and send a PR to remove those workarounds if they are no longer necessary? I'm pretty sure that everyone would be more than happy to remove unnecessary code.

thorn0 · 2015-03-21T18:24:53Z

I wanted to know why this code is there. If the issue still exists (if it ever existed) in some browser, it'd be nice to know about it. I'm going to check it in different browsers, but of course I don't have access to all the needed OS+device+browser combinations.

pkozlowski-opensource · 2015-03-21T18:30:57Z

@thorn0 AFAIK the Turkish locale was the only / main reason. So if you can confirm that this bug doesn't affect users of modern browsers and submit a PR to see if all the tests pass on CI, this could potentially be merged. But as you say, no one will be able to re-test this on all the possible devices in the wild, so we might potentially introduce a breaking change here....

thorn0 · 2015-03-21T18:32:40Z

BTW, another library containing code like that is Google Closure Library. See these lines. Supposedly, it came to Angular from there.

lgalfaso · 2015-03-30T09:42:14Z

I think this can be split into two issues:

Why does Angular need an implementation of toUpperCase and toLowerCase?
Why does Angular need to have a public method that exposes this?

The former should be easy to decide if someone can verify that this is no longer an issue with the supported browsers.
The latter would be hard to remove, but I would be OK with deprecating it (even if it still works).

ryanhart2 · 2015-10-04T07:50:08Z

In case it helps, the code was added to the angular.js file by @mhevery in Oct 2010 as part of this commit "create HTML sanitizer to allow inclusion of untrusted HTML in safe manner".

Everything that I could find on the topic only mentioned this being a problem in Java. The ECMAScript standards (3 and 5) state that the toUpperCase and toLowerCase functions are not locale sensitive and the toLocaleUpperCase and toLocaleLowerCase functions are locale sensitive.

So it doesn't seem to be required, but I don't know how to test on multiple browsers in the Turkish locale.
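
The distinction the spec draws can be checked directly in a console. The snippet below is only an illustrative sketch: the variable names are made up for this example, and the Turkish results assume an engine built with locale (Intl/ICU) support, which is not guaranteed everywhere.

```javascript
// Illustrative check; variable names are hypothetical, not from AngularJS.

// Locale-insensitive methods: the spec requires the same result everywhere.
const plainLower = 'I'.toLowerCase();
const plainUpper = 'i'.toUpperCase();

// Locale-sensitive methods, with the Turkish locale requested explicitly.
// Assumption: the engine applies Turkish case tailoring for the 'tr' tag.
const turkishLower = 'I'.toLocaleLowerCase('tr');
const turkishUpper = 'i'.toLocaleUpperCase('tr');

console.log(plainLower === 'i');         // true
console.log(plainUpper === 'I');         // true
console.log(turkishLower === '\u0131');  // dotless ı where 'tr' tailoring is supported
console.log(turkishUpper === '\u0130');  // dotted İ where 'tr' tailoring is supported
```

If the first two lines ever printed false in some browser under a Turkish system locale, that browser would be the one the workaround exists for.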

thorn0 · 2015-10-24T12:09:29Z

I think the right thing to do for now would be just undocumenting angular.uppercase and angular.lowercase without removing them.

mhevery · 2017-03-21T03:54:52Z

The basic issue is that when malicious code writes SCRIPT we need to detect it and take it out. The way we do it is to do toLowerCase on it, but that will produce scrıpt (notice no dot on ı) in some locales and then 'scrıpt' === 'script' fails which means that malicious code gets by the sanitizer.

If you can prove that in JS "I".toLowerCase() === "i" is true in all locales, then the code is safe to remove. This code was added at the request of the Google security team.
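
The failure mode described above can be sketched as follows. This is not the AngularJS sanitizer; `BLOCKED_TAGS`, `isBlocked`, `specLower` and `turkishLower` are hypothetical names for illustration, and the Turkish behaviour is simulated by asking for the 'tr' locale explicitly, which assumes an engine with locale support.

```javascript
// Hypothetical blocklist check, NOT the actual AngularJS sanitizer code.
const BLOCKED_TAGS = new Set(['script', 'style']);

// `lower` is injected so the two lowercasing strategies can be compared.
function isBlocked(tagName, lower) {
  return BLOCKED_TAGS.has(lower(tagName));
}

// Spec-defined, locale-insensitive lowercasing.
const specLower = (s) => s.toLowerCase();

// Simulates a lowercasing routine that applies Turkish rules
// (assumption: the engine supports tailoring for the 'tr' tag).
const turkishLower = (s) => s.toLocaleLowerCase('tr');

const caughtBySpec = isBlocked('SCRIPT', specLower);
const caughtByTurkish = isBlocked('SCRIPT', turkishLower);

console.log(caughtBySpec);     // true: 'SCRIPT' becomes 'script'
console.log(caughtByTurkish);  // false where 'tr' tailoring yields 'scrıpt'
```

The open question in this thread is whether plain `toLowerCase()` ever behaves like `turkishLower` in a real browser.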

petebacondarwin · 2017-03-21T06:43:43Z

I notice that there are various places in the sanitizer that "don't" use our custom lowercase. E.g. https://github.com/angular/angular.js/blob/master/src/ngSanitize/sanitize.js#L378
Does this need to be fixed?

thorn0 · 2017-03-21T09:18:03Z

As far as I can tell, AngularJS and the Closure Library are the only libraries that include this workaround, and nowhere else on the Internet is it mentioned that this problem ever existed in the JS world, only in Java.

The `manualLowercase` & `manualUppercase` functions were inspired by Google Caja code. Caja is written in Java, though, where problems with `toLowerCase` working differently in Turkish locale are well known[1]. In JavaScript `String#toLowerCase` is defined in the ECMAScript spec and all implementations are required to lowercase I in the same way, regardless of the current locale. Differences may (and do) happen only in `String#toLocaleLowerCase`.

Other libraries doing string normalization, like jQuery or DOMPurify, don't apply special lowercasing logic in a Turkish environment. Therefore, the `manualLowercase` & `manualUppercase` logic is dead code in AngularJS and can be removed.

Also, the `manualLowercase` & `manualUppercase` functions are incomplete; they only lowercase ASCII characters, which is different to native `String#toLowerCase`. Since those functions are used in many places in the library, they would break a lot of code. For example, the lowercase filter would not lowercase Ω to ω but leave it as Ω.

[1] https://garygregory.wordpress.com/2015/11/03/java-lowercase-conversion-turkey/

Ref angular#11387

The `manualLowercase` & `manualUppercase` functions were inspired by Google Caja code which worked around Java issues where problems with `toLowerCase` working differently in Turkish locale are well known[1]. In JavaScript `String#toLowerCase` is defined in the ECMAScript spec and all implementations are required to lowercase I in the same way, regardless of the current locale. Differences may (and do) happen only in `String#toLocaleLowerCase`.

The mirroring of the Java workarounds in Caja was needed due to an old Rhino bug. Rhino is a pre-Nashorn JavaScript interpreter written in Java and it used to delegate `String.prototype.toLowerCase` to `java.lang.String.toLowerCase`. This has since been long fixed.

Other libraries doing string normalization, like jQuery or DOMPurify, don't apply special lowercasing logic in a Turkish environment. Therefore, the `manualLowercase` & `manualUppercase` logic is dead code in AngularJS and can be removed.

Also, the `manualLowercase` & `manualUppercase` functions are incomplete; they only lowercase ASCII characters, which is different to native `String#toLowerCase`. Since those functions are used in many places in the library, they would break a lot of code. For example, the lowercase filter would not lowercase Ω to ω but leave it as Ω.

[1] https://garygregory.wordpress.com/2015/11/03/java-lowercase-conversion-turkey/

Closes #15890
Ref #11387
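
The incompleteness mentioned in the commit message is easy to reproduce with an ASCII-only lowercasing routine. The function below is a minimal sketch of that technique, written for this example under the name `asciiLowercase`; it is not claimed to be the exact code that was removed from AngularJS.

```javascript
// Sketch of an ASCII-only lowercase, in the spirit of the removed workaround.
// `asciiLowercase` is a hypothetical name, not the AngularJS function.
function asciiLowercase(s) {
  // Setting bit 5 (value 32) maps 'A'..'Z' (65..90) to 'a'..'z' (97..122).
  return s.replace(/[A-Z]/g, (ch) => String.fromCharCode(ch.charCodeAt(0) | 32));
}

const omega = '\u03A9'; // GREEK CAPITAL LETTER OMEGA

const asciiTag = asciiLowercase('SCRIPT'); // ASCII letters are handled
const asciiOmega = asciiLowercase(omega);  // non-ASCII letters are left alone
const nativeOmega = omega.toLowerCase();   // native method lowercases it

console.log(asciiTag);                  // 'script'
console.log(asciiOmega === omega);      // true: still the capital letter
console.log(nativeOmega === '\u03C9');  // true: lowercase omega
```

Such a routine is predictable for tag names, but it is not a drop-in replacement for `String#toLowerCase` on user-visible text.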

Narretz added needs: investigation frequency: low component: misc core severity: inconvenient labels Mar 21, 2015

Narretz added this to the Ice Box milestone Mar 21, 2015

thorn0 mentioned this issue Jan 15, 2016

docs(uppercase, lowercase): undocument these artifacts #13779

Closed

petebacondarwin closed this as completed in 6a92e91 Jan 19, 2016

adgoncal mentioned this issue Mar 25, 2016

[Docs] Why is lowercase/uppercase deprecation not in changelog? #14316

Closed

mgol mentioned this issue Apr 5, 2017

chore(*): remove manualLowercase & manualUppercase functions #15890

Merged

3 tasks
