Skip to content

Schematic update recorder indices need to be different for BOM files #14558

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
devversion opened this issue May 29, 2019 · 4 comments
Closed
Labels
area: @angular-devkit/schematics feature Issue that requests a new feature
Milestone

Comments

@devversion
Copy link
Member

devversion commented May 29, 2019

🐞 Bug report

Package

@angular-devkit/schematics

Is this a regression?

No

Description

Currently schematics which leverage the public UpdateRecorder API to make source-file transformations will get different results for source-files which contain a BOM.

This is because the CLI magically tries to account for the BOM in source-files by always adding 1 to passed indices. This is unexpected and causes confusing behavior as this implies that the UpdateRecorder only expects character indices for visible characters.

Schematics that currently determine the character indices by parsing the source-files (e.g. through TypeScript compiler API or fs.readFileSync) will get broken behavior for source-files that contain a BOM as the replacements are accidentally shifted by one.

We need to do one of the following:

  1. Either provide a clear API description explaining the story with BOM and what indices the
    update recorder expects. This forces schematic authors to know about BOM and to strip the BOM when reading source-files.

  2. A long-term solution where the UpdateRecorder expects character indices which are truly based on the source-file. No magic differentiation between BOM / non-BOM files.

@alan-agius4 wrote

The main reason for this is that when a developer is writing a schematic they shouldn't need to compute the offset based if a file has a BOM or not and is handled out of the box.

I want to add to that: I see the point of doing that, but it is simply resulting in magic behavior as schematic indices are not determined by just looking at the visible rendered characters but rather by parsed source-files where the BOM is a valid invisible character that still adds to the amount characters.

e.g. fs.readFileSync('./with_bom.txt', 'utf8').length will respect the BOM characters and therefore it's common that indices passed to the update recorder will be truly absolute.

Ideally the magic differentation between BOM and non-BOM files would have not happened at all, but for now the easiest way to fix this in a backwards-compatible way (where older CLI versions are used) is to strip the BOM when parsing the TypeScript source-files.

We need to find a proper solution for this in the future though. It adds unnecessary bloat and confusion to schematic authors as they need to know about different behavior with BOM / and need to handle BOM characters in a special way for now. See the unnecessary bloat we need to add:

A clear and concise description of the problem...

🔬 Minimal Reproduction

See test-case in angular/angular#30719 or various other affected issues:

🔥 Exception or Error

Replacements are shifted by one character if the source-file contains a BOM. Can result in
broken static-query migration where ViewChild will be converted to VViewChild or in a broken lazy route syntax migration where routes are incorrectly converted (#14551)

@devversion devversion added freq1: low Only reported by a handful of users who observe it rarely severity3: broken area: @angular-devkit/schematics needs: discussion On the agenda for team meeting to determine next steps labels May 29, 2019
@ngbot ngbot bot added this to the needsTriage milestone May 29, 2019
@devversion devversion changed the title Schematic update recorder indices are different for BOM files Schematic update recorder indices need to be different for BOM files May 29, 2019
@alan-agius4 alan-agius4 added feature Issue that requests a new feature and removed freq1: low Only reported by a handful of users who observe it rarely needs: discussion On the agenda for team meeting to determine next steps severity3: broken labels May 30, 2019
@ngbot ngbot bot modified the milestones: needsTriage, Backlog May 30, 2019
@alan-agius4
Copy link
Collaborator

We should probably return the contents of the file already without a BOM.

//cc @clydin, feel free to add more details :)

@devversion
Copy link
Member Author

devversion commented May 30, 2019

@alan-agius4 would you think this is a feature? I definitely think the severity is confusing and might even go through as a bug as it is not clearly documented what the UpdateRecorder expects.

Magic behavior (how it is right now) is not always good, especially since I said above that in most cases the BOM is part of the characters in a file unless the BOM is explicitly stripped off.

@clydin
Copy link
Member

clydin commented Feb 9, 2024

The readText and readJson methods on Tree instances will automatically handle the presence of the BOM and remove as needed via the internal use of a TextDecoder. If a schematic uses these methods than the indices will match and no offsetting will be required. If a schematic continues to use the lower-level read method which provides the raw byte data, then the presence of a BOM will need to be handled manually by such a schematic. Any existing schematics outside of the CLI (these have already been converted) should probably update to readText/readJson where appropriate.

@clydin clydin closed this as completed Feb 9, 2024
@angular-automatic-lock-bot
Copy link

This issue has been automatically locked due to inactivity.
Please file a new issue if you are encountering a similar or related problem.

Read more about our automatic conversation locking policy.

This action has been performed automatically by a bot.

@angular-automatic-lock-bot angular-automatic-lock-bot bot locked and limited conversation to collaborators Mar 11, 2024
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
area: @angular-devkit/schematics feature Issue that requests a new feature
Projects
None yet
Development

No branches or pull requests

3 participants