Support larger sizes in rmeta tables and incr comp alloc offsets #666
Comments
This issue is not meant to be used for technical discussion. There is a Zulip stream for that. Use this issue to leave procedural comments, such as volunteering to review, indicating that you second the proposal (or third, etc), or raising a concern that you would like to be addressed. cc @rust-lang/compiler @rust-lang/compiler-contributors
@rustbot second
Adapt table sizes to the contents

This is an implementation of rust-lang/compiler-team#666. The objective of this PR is to permit the rmeta format to accommodate larger crates that need offsets larger than a `u32` can store, without compromising performance for crates that do not need such range. The second commit is a number of tiny optimization opportunities I noticed while looking at perf recordings of the first commit.

The rmeta tables need to have fixed-size elements to permit lazy random access. But the size only needs to be fixed _per table_, not per element type. This PR adds another `usize` to the table header which indicates the table element size. As each element of a table is set, we keep track of the widest encoded table value, then don't bother encoding all the unused trailing bytes on each value. When decoding table elements, we copy them to a full-width array if they are not already full-width.

`LazyArray` needs some special treatment. Most other values that are encoded in tables are indexes or offsets, and those tend to be small, so we get to drop a lot of zero bytes off the end. But `LazyArray` encodes _two_ small values in a fixed-width table element: a position and a length. The treatment described above can trim zero bytes off the length, but any nonzero length shields the position bytes from the optimization. To improve this, we interleave the bytes of the position and the length. This change is responsible for about half of the crate metadata win on many crates.

Fixes rust-lang/rust#112934 (probably)
Fixes rust-lang/rust#103607
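To make the two ideas above concrete, here is a minimal, hypothetical sketch (all names invented; this is not the actual rustc encoder): a table whose per-element width is derived from the widest value actually stored, decoding that widens elements back to full width, and a `LazyArray`-style (position, length) pair whose bytes are interleaved so trailing zero bytes can still be trimmed.

```rust
/// Encode `values` as a table whose element width is the number of bytes the
/// widest value needs; the width is recorded in a one-byte header.
fn encode_table(values: &[u64], out: &mut Vec<u8>) {
    let max = values.iter().copied().max().unwrap_or(0);
    // Bytes needed for the widest value (at least 1).
    let width = (((64 - max.leading_zeros()) as usize + 7) / 8).max(1);
    out.push(width as u8);
    for v in values {
        // Little-endian, so the dropped trailing bytes are the high zero bytes.
        out.extend_from_slice(&v.to_le_bytes()[..width]);
    }
}

/// Decode element `i` by copying it into a full-width buffer first.
fn decode_element(table: &[u8], i: usize) -> u64 {
    let width = table[0] as usize;
    let start = 1 + i * width;
    let mut full = [0u8; 8];
    full[..width].copy_from_slice(&table[start..start + width]);
    u64::from_le_bytes(full)
}

/// Interleave the bytes of a (position, length) pair so that a small nonzero
/// length does not shield the position's zero bytes from the trimming above.
fn interleave(position: u32, length: u32) -> [u8; 8] {
    let (p, l) = (position.to_le_bytes(), length.to_le_bytes());
    let mut out = [0u8; 8];
    for i in 0..4 {
        out[2 * i] = p[i];
        out[2 * i + 1] = l[i];
    }
    out
}

fn main() {
    let mut buf = Vec::new();
    encode_table(&[3, 17, 300], &mut buf);
    // The widest value (300) needs 2 bytes, so every element costs 2 bytes.
    assert_eq!(buf.len(), 1 + 3 * 2);
    assert_eq!(decode_element(&buf, 2), 300);

    // With position = 300 and length = 2, every nonzero byte ends up at the
    // front of the interleaved pair, so its trailing zeros can be trimmed too.
    assert_eq!(interleave(300, 2), [44, 2, 1, 0, 0, 0, 0, 0]);
}
```

Without interleaving, the same pair laid out as position-then-length would keep a nonzero length byte at the end of the element, preventing any trimming of the position's zero bytes.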
Closing as accepted (we should maybe automate "close after 10 days"...)
Proposal
Rustc stores a lot of lengths in `u32`, which reduces the size of data structures and on-disk formats that contain these lengths. But this strategy limits the space of programs that can be compiled. In general, this is a gamble that our user base will benefit more from the performance improvement of picking a small fixed-size type than the implementation limitation this imposes will cause trouble.

We have a few reports of running into the `u32` limit for rmeta table offsets and incr comp allocation offsets:

- `Result::unwrap()` on an `Err` value: TryFromIntError(()) rust#112934
- `[21u8; u32::MAX as usize]` rust#111613

This is a proposal to find a way to lift the implementation limitation imposed by using `u32` in these two places, and thus be able to compile the programs reported in these issues.

Current implementations of these are:
While the goal of this MCP is to support compiling more programs, it is important that the additional space of supported programs not impose a measurable cost on the average user according to our benchmark suite. In that sense, the perf implications of solutions to this MCP should be evaluated on their own, as opposed to as if they are the cost of adding a feature.
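As a purely illustrative sketch of the failure mode behind the first report above (the function name is hypothetical, not rustc's actual code), the panic comes from narrowing a `usize` position to `u32` with `try_from(...).unwrap()`, which only succeeds while the position fits in 32 bits:

```rust
/// Hypothetical stand-in for the places where a byte position is narrowed to
/// `u32` before being written into an rmeta table or incr comp structure.
fn store_offset_u32(position: usize) -> u32 {
    // Succeeds only while `position <= u32::MAX`; beyond that it panics with
    // "called `Result::unwrap()` on an `Err` value: TryFromIntError(())".
    u32::try_from(position).unwrap()
}

fn main() {
    assert_eq!(store_offset_u32(1024), 1024);
    // On a 64-bit target this call would reproduce the reported panic:
    // store_offset_u32(u32::MAX as usize + 1);
}
```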
Mentors or Reviewers
Process
The main points of the Major Change Process are as follows:
- File an issue describing the proposal.
- A compiler team member or contributor who is knowledgeable in the area can second by writing `@rustbot second`.
  - Finding a "second" suffices for internal changes. If however, you are proposing a new public-facing feature, such as a `-C flag`, then full team check-off is required.
  - Compiler team members can initiate a check-off via `@rfcbot fcp merge` on either the MCP or the PR.
- Once an MCP is seconded, the Final Comment Period begins. If no concerns are raised, the MCP is considered approved.

You can read more about Major Change Proposals on forge.
Comments
This issue is not meant to be used for technical discussion. There is a Zulip stream for that. Use this issue to leave procedural comments, such as volunteering to review, indicating that you second the proposal (or third, etc), or raising a concern that you would like to be addressed.