|
| 1 | +//! Inplace iterate-and-collect specialization for `Vec` |
| 2 | +//! |
| 3 | +//! Note: This documents Vec internals, some of the following sections explain implementation |
| 4 | +//! details and are best read together with the source of this module. |
| 5 | +//! |
| 6 | +//! The specialization in this module applies to iterators in the shape of |
| 7 | +//! `source.adapter().adapter().adapter().collect::<Vec<U>>()` |
| 8 | +//! where `source` is an owning iterator obtained from [`Vec<T>`], [`Box<[T]>`][box] (by conversion to `Vec`) |
| 9 | +//! or [`BinaryHeap<T>`], the adapters each consume one or more items per step |
| 10 | +//! (represented by [`InPlaceIterable`]), provide transitive access to `source` (via [`SourceIter`]) |
| 11 | +//! and thus the underlying allocation. And finally the layouts of `T` and `U` must |
| 12 | +//! have the same size and alignment, this is currently ensured via const eval instead of trait bounds |
| 13 | +//! in the specialized [`SpecFromIter`] implementation. |
| 14 | +//! |
| 15 | +//! [`BinaryHeap<T>`]: crate::collections::BinaryHeap |
| 16 | +//! [box]: crate::boxed::Box |
| 17 | +//! |
| 18 | +//! By extension some other collections which use `collect::<Vec<_>>()` internally in their |
| 19 | +//! `FromIterator` implementation benefit from this too. |
| 20 | +//! |
| 21 | +//! Access to the underlying source goes through a further layer of indirection via the private |
| 22 | +//! trait [`AsVecIntoIter`] to hide the implementation detail that other collections may use |
| 23 | +//! `vec::IntoIter` internally. |
| 24 | +//! |
| 25 | +//! In-place iteration depends on the interaction of several unsafe traits, implementation |
| 26 | +//! details of multiple parts in the iterator pipeline and often requires holistic reasoning |
| 27 | +//! across multiple structs since iterators are executed cooperatively rather than having |
| 28 | +//! a central evaluator/visitor struct executing all iterator components. |
| 29 | +//! |
| 30 | +//! # Reading from and writing to the same allocation |
| 31 | +//! |
| 32 | +//! By its nature collecting in place means that the reader and writer side of the iterator |
| 33 | +//! use the same allocation. Since `try_fold()` (used in [`SpecInPlaceCollect`]) takes a |
| 34 | +//! reference to the iterator for the duration of the iteration that means we can't interleave |
| 35 | +//! the step of reading a value and getting a reference to write to. Instead raw pointers must be |
| 36 | +//! used on the reader and writer side. |
| 37 | +//! |
| 38 | +//! That writes never clobber a yet-to-be-read item is ensured by the [`InPlaceIterable`] requirements. |
| 39 | +//! |
| 40 | +//! # Layout constraints |
| 41 | +//! |
| 42 | +//! [`Allocator`] requires that `allocate()` and `deallocate()` have matching alignment and size. |
| 43 | +//! Additionally this specialization doesn't make sense for ZSTs as there is no reallocation to |
| 44 | +//! avoid and it would make pointer arithmetic more difficult. |
| 45 | +//! |
| 46 | +//! [`Allocator`]: core::alloc::Allocator |
| 47 | +//! |
| 48 | +//! # Drop- and panic-safety |
| 49 | +//! |
| 50 | +//! Iteration can panic, requiring dropping the already written parts but also the remainder of |
| 51 | +//! the source. Iteration can also leave some source items unconsumed which must be dropped. |
| 52 | +//! All those drops in turn can panic which then must either leak the allocation or abort to avoid |
| 53 | +//! double-drops. |
| 54 | +//! |
| 55 | +//! This is handled by the [`InPlaceDrop`] guard for sink items (`U`) and by |
| 56 | +//! [`vec::IntoIter::forget_allocation_drop_remaining()`] for remaining source items (`T`). |
| 57 | +//! |
| 58 | +//! [`vec::IntoIter::forget_allocation_drop_remaining()`]: super::IntoIter::forget_allocation_drop_remaining() |
| 59 | +//! |
| 60 | +//! # O(1) collect |
| 61 | +//! |
| 62 | +//! The main iteration itself is further specialized when the iterator implements |
| 63 | +//! [`TrustedRandomAccessNoCoerce`] to let the optimizer see that it is a counted loop with a single |
| 64 | +//! [induction variable]. This can turn some iterators into a noop, i.e. it reduces them from O(n) to |
| 65 | +//! O(1). This particular optimization is quite fickle and doesn't always work, see [#79308] |
| 66 | +//! |
| 67 | +//! [#79308]: https://github.com/rust-lang/rust/issues/79308 |
| 68 | +//! [induction variable]: https://en.wikipedia.org/wiki/Induction_variable |
| 69 | +//! |
| 70 | +//! Since unchecked accesses through that trait do not advance the read pointer of `IntoIter` |
| 71 | +//! this would interact unsoundly with the requirements about dropping the tail described above. |
| 72 | +//! But since the normal `Drop` implementation of `IntoIter` would suffer from the same problem it |
| 73 | +//! is only correct for `TrustedRandomAccessNoCoerce` to be implemented when the items don't |
| 74 | +//! have a destructor. Thus that implicit requirement also makes the specialization safe to use for |
| 75 | +//! in-place collection. |
| 76 | +//! Note that this safety concern is about the correctness of `impl Drop for IntoIter`, |
| 77 | +//! not the guarantees of `InPlaceIterable`. |
| 78 | +//! |
| 79 | +//! # Adapter implementations |
| 80 | +//! |
| 81 | +//! The invariants for adapters are documented in [`SourceIter`] and [`InPlaceIterable`], but |
| 82 | +//! getting them right can be rather subtle for multiple, sometimes non-local reasons. |
| 83 | +//! For example `InPlaceIterable` would be valid to implement for [`Peekable`], except |
| 84 | +//! that it is stateful, cloneable and `IntoIter`'s clone implementation shortens the underlying |
| 85 | +//! allocation which means if the iterator has been peeked and then gets cloned there no longer is |
| 86 | +//! enough room, thus breaking an invariant ([#85322]). |
| 87 | +//! |
| 88 | +//! [#85322]: https://github.com/rust-lang/rust/issues/85322 |
| 89 | +//! [`Peekable`]: core::iter::Peekable |
| 90 | +//! |
| 91 | +//! |
| 92 | +//! # Examples |
| 93 | +//! |
| 94 | +//! Some cases that are optimized by this specialization, more can be found in the `Vec` |
| 95 | +//! benchmarks: |
| 96 | +//! |
| 97 | +//! ```rust |
| 98 | +//! # #[allow(dead_code)] |
| 99 | +//! /// Converts a usize vec into an isize one. |
| 100 | +//! pub fn cast(vec: Vec<usize>) -> Vec<isize> { |
| 101 | +//! // Does not allocate, free or panic. On optlevel>=2 it does not loop. |
| 102 | +//! // Of course this particular case could and should be written with `into_raw_parts` and |
| 103 | +//! // `from_raw_parts` instead. |
| 104 | +//! vec.into_iter().map(|u| u as isize).collect() |
| 105 | +//! } |
| 106 | +//! ``` |
| 107 | +//! |
| 108 | +//! ```rust |
| 109 | +//! # #[allow(dead_code)] |
| 110 | +//! /// Drops remaining items in `src` and if the layouts of `T` and `U` match it |
| 111 | +//! /// returns an empty Vec backed by the original allocation. Otherwise it returns a new |
| 112 | +//! /// empty vec. |
| 113 | +//! pub fn recycle_allocation<T, U>(src: Vec<T>) -> Vec<U> { |
| 114 | +//! src.into_iter().filter_map(|_| None).collect() |
| 115 | +//! } |
| 116 | +//! ``` |
| 117 | +//! |
| 118 | +//! ```rust |
| 119 | +//! let vec = vec![13usize; 1024]; |
| 120 | +//! let _ = vec.into_iter() |
| 121 | +//! .enumerate() |
| 122 | +//! .filter_map(|(idx, val)| if idx % 2 == 0 { Some(val+idx) } else {None}) |
| 123 | +//! .collect::<Vec<_>>(); |
| 124 | +//! |
| 125 | +//! // is equivalent to the following, but doesn't require bounds checks |
| 126 | +//! |
| 127 | +//! let mut vec = vec![13usize; 1024]; |
| 128 | +//! let mut write_idx = 0; |
| 129 | +//! for idx in 0..vec.len() { |
| 130 | +//! if idx % 2 == 0 { |
| 131 | +//! vec[write_idx] = vec[idx] + idx; |
| 132 | +//! write_idx += 1; |
| 133 | +//! } |
| 134 | +//! } |
| 135 | +//! vec.truncate(write_idx); |
| 136 | +//! ``` |
| 137 | +use core::iter::{InPlaceIterable, SourceIter, TrustedRandomAccessNoCoerce}; |
| 138 | +use core::mem::{self, ManuallyDrop}; |
| 139 | +use core::ptr::{self}; |
| 140 | + |
| 141 | +use super::{InPlaceDrop, SpecFromIter, SpecFromIterNested, Vec}; |
| 142 | + |
| 143 | +/// Specialization marker for collecting an iterator pipeline into a Vec while reusing the |
| 144 | +/// source allocation, i.e. executing the pipeline in place. |
| 145 | +#[rustc_unsafe_specialization_marker] |
| 146 | +pub(super) trait InPlaceIterableMarker {} |
| 147 | + |
| 148 | +impl<T> InPlaceIterableMarker for T where T: InPlaceIterable {} |
| 149 | + |
| 150 | +impl<T, I> SpecFromIter<T, I> for Vec<T> |
| 151 | +where |
| 152 | + I: Iterator<Item = T> + SourceIter<Source: AsVecIntoIter> + InPlaceIterableMarker, |
| 153 | +{ |
| 154 | + default fn from_iter(mut iterator: I) -> Self { |
| 155 | + // See "Layout constraints" section in the module documentation. We rely on const |
| 156 | + // optimization here since these conditions currently cannot be expressed as trait bounds |
| 157 | + if mem::size_of::<T>() == 0 |
| 158 | + || mem::size_of::<T>() |
| 159 | + != mem::size_of::<<<I as SourceIter>::Source as AsVecIntoIter>::Item>() |
| 160 | + || mem::align_of::<T>() |
| 161 | + != mem::align_of::<<<I as SourceIter>::Source as AsVecIntoIter>::Item>() |
| 162 | + { |
| 163 | + // fallback to more generic implementations |
| 164 | + return SpecFromIterNested::from_iter(iterator); |
| 165 | + } |
| 166 | + |
| 167 | + let (src_buf, src_ptr, dst_buf, dst_end, cap) = unsafe { |
| 168 | + let inner = iterator.as_inner().as_into_iter(); |
| 169 | + ( |
| 170 | + inner.buf.as_ptr(), |
| 171 | + inner.ptr, |
| 172 | + inner.buf.as_ptr() as *mut T, |
| 173 | + inner.end as *const T, |
| 174 | + inner.cap, |
| 175 | + ) |
| 176 | + }; |
| 177 | + |
| 178 | + let len = SpecInPlaceCollect::collect_in_place(&mut iterator, dst_buf, dst_end); |
| 179 | + |
| 180 | + let src = unsafe { iterator.as_inner().as_into_iter() }; |
| 181 | + // check if SourceIter contract was upheld |
| 182 | + // caveat: if they weren't we might not even make it to this point |
| 183 | + debug_assert_eq!(src_buf, src.buf.as_ptr()); |
| 184 | + // check InPlaceIterable contract. This is only possible if the iterator advanced the |
| 185 | + // source pointer at all. If it uses unchecked access via TrustedRandomAccess |
| 186 | + // then the source pointer will stay in its initial position and we can't use it as reference |
| 187 | + if src.ptr != src_ptr { |
| 188 | + debug_assert!( |
| 189 | + unsafe { dst_buf.add(len) as *const _ } <= src.ptr, |
| 190 | + "InPlaceIterable contract violation, write pointer advanced beyond read pointer" |
| 191 | + ); |
| 192 | + } |
| 193 | + |
| 194 | + // Drop any remaining values at the tail of the source but prevent drop of the allocation |
| 195 | + // itself once IntoIter goes out of scope. |
| 196 | + // If the drop panics then we also leak any elements collected into dst_buf. |
| 197 | + // |
| 198 | + // Note: This access to the source wouldn't be allowed by the TrustedRandomIteratorNoCoerce |
| 199 | + // contract (used by SpecInPlaceCollect below). But see the "O(1) collect" section in the |
| 200 | + // module documenttation why this is ok anyway. |
| 201 | + src.forget_allocation_drop_remaining(); |
| 202 | + |
| 203 | + let vec = unsafe { Vec::from_raw_parts(dst_buf, len, cap) }; |
| 204 | + |
| 205 | + vec |
| 206 | + } |
| 207 | +} |
| 208 | + |
| 209 | +fn write_in_place_with_drop<T>( |
| 210 | + src_end: *const T, |
| 211 | +) -> impl FnMut(InPlaceDrop<T>, T) -> Result<InPlaceDrop<T>, !> { |
| 212 | + move |mut sink, item| { |
| 213 | + unsafe { |
| 214 | + // the InPlaceIterable contract cannot be verified precisely here since |
| 215 | + // try_fold has an exclusive reference to the source pointer |
| 216 | + // all we can do is check if it's still in range |
| 217 | + debug_assert!(sink.dst as *const _ <= src_end, "InPlaceIterable contract violation"); |
| 218 | + ptr::write(sink.dst, item); |
| 219 | + // Since this executes user code which can panic we have to bump the pointer |
| 220 | + // after each step. |
| 221 | + sink.dst = sink.dst.add(1); |
| 222 | + } |
| 223 | + Ok(sink) |
| 224 | + } |
| 225 | +} |
| 226 | + |
| 227 | +/// Helper trait to hold specialized implementations of the in-place iterate-collect loop |
| 228 | +trait SpecInPlaceCollect<T, I>: Iterator<Item = T> { |
| 229 | + /// Collects an iterator (`self`) into the destination buffer (`dst`) and returns the number of items |
| 230 | + /// collected. `end` is the last writable element of the allocation and used for bounds checks. |
| 231 | + /// |
| 232 | + /// This method is specialized and one of its implementations makes use of |
| 233 | + /// `Iterator::__iterator_get_unchecked` calls with a `TrustedRandomAccessNoCoerce` bound |
| 234 | + /// on `I` which means the caller of this method must take the safety conditions |
| 235 | + /// of that trait into consideration. |
| 236 | + fn collect_in_place(&mut self, dst: *mut T, end: *const T) -> usize; |
| 237 | +} |
| 238 | + |
| 239 | +impl<T, I> SpecInPlaceCollect<T, I> for I |
| 240 | +where |
| 241 | + I: Iterator<Item = T>, |
| 242 | +{ |
| 243 | + #[inline] |
| 244 | + default fn collect_in_place(&mut self, dst_buf: *mut T, end: *const T) -> usize { |
| 245 | + // use try-fold since |
| 246 | + // - it vectorizes better for some iterator adapters |
| 247 | + // - unlike most internal iteration methods, it only takes a &mut self |
| 248 | + // - it lets us thread the write pointer through its innards and get it back in the end |
| 249 | + let sink = InPlaceDrop { inner: dst_buf, dst: dst_buf }; |
| 250 | + let sink = |
| 251 | + self.try_fold::<_, _, Result<_, !>>(sink, write_in_place_with_drop(end)).unwrap(); |
| 252 | + // iteration succeeded, don't drop head |
| 253 | + unsafe { ManuallyDrop::new(sink).dst.offset_from(dst_buf) as usize } |
| 254 | + } |
| 255 | +} |
| 256 | + |
| 257 | +impl<T, I> SpecInPlaceCollect<T, I> for I |
| 258 | +where |
| 259 | + I: Iterator<Item = T> + TrustedRandomAccessNoCoerce, |
| 260 | +{ |
| 261 | + #[inline] |
| 262 | + fn collect_in_place(&mut self, dst_buf: *mut T, end: *const T) -> usize { |
| 263 | + let len = self.size(); |
| 264 | + let mut drop_guard = InPlaceDrop { inner: dst_buf, dst: dst_buf }; |
| 265 | + for i in 0..len { |
| 266 | + // Safety: InplaceIterable contract guarantees that for every element we read |
| 267 | + // one slot in the underlying storage will have been freed up and we can immediately |
| 268 | + // write back the result. |
| 269 | + unsafe { |
| 270 | + let dst = dst_buf.offset(i as isize); |
| 271 | + debug_assert!(dst as *const _ <= end, "InPlaceIterable contract violation"); |
| 272 | + ptr::write(dst, self.__iterator_get_unchecked(i)); |
| 273 | + // Since this executes user code which can panic we have to bump the pointer |
| 274 | + // after each step. |
| 275 | + drop_guard.dst = dst.add(1); |
| 276 | + } |
| 277 | + } |
| 278 | + mem::forget(drop_guard); |
| 279 | + len |
| 280 | + } |
| 281 | +} |
| 282 | + |
| 283 | +/// Internal helper trait for in-place iteration specialization. |
| 284 | +/// |
| 285 | +/// Currently this is only implemented by [`vec::IntoIter`] - returning a reference to itself - and |
| 286 | +/// [`binary_heap::IntoIter`] which returns a reference to its inner representation. |
| 287 | +/// |
| 288 | +/// Since this is an internal trait it hides the implementation detail `binary_heap::IntoIter` |
| 289 | +/// uses `vec::IntoIter` internally. |
| 290 | +/// |
| 291 | +/// [`vec::IntoIter`]: super::IntoIter |
| 292 | +/// [`binary_heap::IntoIter`]: crate::collections::binary_heap::IntoIter |
| 293 | +/// |
| 294 | +/// # Safety |
| 295 | +/// |
| 296 | +/// In-place iteration relies on implementation details of `vec::IntoIter`, most importantly that |
| 297 | +/// it does not create references to the whole allocation during iteration, only raw pointers |
| 298 | +#[rustc_specialization_trait] |
| 299 | +pub(crate) unsafe trait AsVecIntoIter { |
| 300 | + type Item; |
| 301 | + fn as_into_iter(&mut self) -> &mut super::IntoIter<Self::Item>; |
| 302 | +} |
0 commit comments