clang: Tokenize more lazily. #1466

Merged 1 commit on Dec 14, 2018

emilio (Contributor) commented on Dec 14, 2018

Instead of converting all the tokens to utf-8 beforehand, which is costly, and
allocating a new vector unconditionally (on top of the one clang already
allocates), just do the tokenization more lazily.

There's actually only one place in the codebase that needs the utf-8 string;
all the others can just work with the byte slice from clang.

This should cause no behavior change, other than being faster. In particular,
this halves the time spent on my machine on the test case from #1465.

I'm not completely sure that this is going to be enough to make it acceptable,
but we should probably do it regardless.
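
To illustrate the idea described above, here is a minimal sketch of lazy tokenization. It is not the code from this pull request: `RawToken`, `tokens`, and `contains_token` are hypothetical names, and the whitespace splitter is only a stand-in for clang's tokenizer. The point is that tokens are yielded one at a time as borrowed byte slices, and UTF-8 conversion happens only for the caller that asks for it.

```rust
use std::borrow::Cow;

/// Hypothetical stand-in for a token handed back by clang. The spelling is
/// kept as a borrowed byte slice, so no String is allocated up front.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct RawToken<'a> {
    spelling: &'a [u8],
}

impl<'a> RawToken<'a> {
    pub fn new(spelling: &'a [u8]) -> Self {
        RawToken { spelling }
    }

    /// Cheap accessor: most callers can compare bytes directly.
    pub fn spelling(&self) -> &'a [u8] {
        self.spelling
    }

    /// Convert to UTF-8 only on demand. Valid UTF-8 is borrowed without
    /// copying; invalid sequences are replaced (lossy conversion).
    pub fn spelling_utf8(&self) -> Cow<'a, str> {
        String::from_utf8_lossy(self.spelling)
    }
}

/// Toy tokenizer (an assumption for this sketch, not clang's): lazily yields
/// whitespace-separated tokens without collecting them into a new vector.
pub fn tokens<'a>(source: &'a [u8]) -> impl Iterator<Item = RawToken<'a>> + 'a {
    source
        .split(|b| b.is_ascii_whitespace())
        .filter(|s| !s.is_empty())
        .map(RawToken::new)
}

/// A caller that only needs bytes: no UTF-8 conversion and no intermediate
/// Vec, and the search stops at the first match.
pub fn contains_token(source: &[u8], needle: &[u8]) -> bool {
    tokens(source).any(|t| t.spelling() == needle)
}
```

With this shape, a check such as `contains_token(src, b"__attribute__")` touches only bytes, while the one consumer that needs text calls `spelling_utf8()` on the specific token it cares about.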

highfive commented

Warning

  • These commits modify unsafe code. Please review it carefully!

emilio merged commit eb97c14 into rust-lang:master on Dec 14, 2018
emilio deleted the token-lazy branch on December 14, 2018 10:59
MihirLuthra added a commit to fortanix/rust-mbedtls that referenced this pull request Feb 7, 2022
Although, bindgen needs .enable_function_attribute_detection()
to process __attribute__((__warn_unused_result__)) because parsing
attrs can be really slow in certain cases. Benches were performed
to confirm our case doesn't face that issue.

References:
rust-lang/rust-bindgen#2149
rust-lang/rust-bindgen#1465
rust-lang/rust-bindgen#1466
rust-lang/rust-bindgen#1467