-
Notifications
You must be signed in to change notification settings - Fork 3.4k
Optimize map unions to avoid building long lists #14215
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Changes from 1 commit
e3fb8ed
5255a75
9d74860
ec96574
55ae816
6e489d4
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -1278,8 +1278,130 @@ defmodule Module.Types.Descr do | |
|
||
defp map_only?(descr), do: empty?(Map.delete(descr, :map)) | ||
|
||
# Union is list concatenation | ||
defp map_union(dnf1, dnf2), do: dnf1 ++ (dnf2 -- dnf1) | ||
defp map_union(dnf1, dnf2) do | ||
# Union is just concatenation, but we rely on some optimization strategies to | ||
# avoid the list to grow when possible | ||
|
||
# first pass trying to identify patterns where two maps can be fused as one | ||
with [{tag1, pos1, []}] <- dnf1, | ||
[{tag2, pos2, []}] <- dnf2, | ||
strategy when strategy != nil <- map_union_optimization_strategy(tag1, pos1, tag2, pos2) do | ||
case strategy do | ||
:all_equal -> | ||
dnf1 | ||
|
||
:any_map -> | ||
[{:open, %{}, []}] | ||
|
||
{:one_key_difference, key, v1, v2} -> | ||
new_pos = Map.put(pos1, key, union(v1, v2)) | ||
[{tag1, new_pos, []}] | ||
|
||
:left_subtype_of_right -> | ||
dnf2 | ||
|
||
:right_subtype_of_left -> | ||
dnf1 | ||
end | ||
else | ||
# otherwise we just concatenate and remove structural duplicates | ||
_ -> dnf1 ++ (dnf2 -- dnf1) | ||
end | ||
end | ||
|
||
defp map_union_optimization_strategy(tag1, pos1, tag2, pos2) | ||
defp map_union_optimization_strategy(tag, pos, tag, pos), do: :all_equal | ||
defp map_union_optimization_strategy(:open, empty, _, _) when empty == %{}, do: :any_map | ||
defp map_union_optimization_strategy(_, _, :open, empty) when empty == %{}, do: :any_map | ||
|
||
defp map_union_optimization_strategy(tag, pos1, tag, pos2) | ||
when map_size(pos1) == map_size(pos2) do | ||
:maps.iterator(pos1) | ||
|> :maps.next() | ||
|> do_map_union_optimization_strategy(pos2, :all_equal) | ||
end | ||
|
||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. This does not handle cases where the open map on the left has some extra fields that are set to Example:
Similarly, what if we have a larger (in size) open map as Example:
I don't think those are case that necessarily need to be covered, but adding those tests to highlight it would prevent us discovering this again. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Oh I was totally forgetting about
Yeah this is an optimization supposed to deal with some "obvious" cases that happen frequently, so it might be OK not to catch all cases (we're not dealing with negs either). But in this case it might be possible to implement in the current pass with something like:
The map size issue is a real problem though... Perhaps by changing the internal representation to store There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
I think it is fine because our goal is to traverse the smallest map for performance. The full algorithm does require traversing both sides but the point here is precisely to not implement the full algorithm. :) There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Sounds good! I was more thinking if we get bottlenecks in the future due to |
||
defp map_union_optimization_strategy(:open, pos1, _, pos2) | ||
when map_size(pos1) <= map_size(pos2) do | ||
:maps.iterator(pos1) | ||
|> :maps.next() | ||
|> do_map_union_optimization_strategy(pos2, :right_subtype_of_left) | ||
end | ||
|
||
defp map_union_optimization_strategy(_, pos1, :open, pos2) | ||
when map_size(pos1) >= map_size(pos2) do | ||
:maps.iterator(pos2) | ||
|> :maps.next() | ||
|> do_map_union_optimization_strategy(pos1, :right_subtype_of_left) | ||
|> case do | ||
:right_subtype_of_left -> :left_subtype_of_right | ||
nil -> nil | ||
end | ||
end | ||
|
||
defp map_union_optimization_strategy(_, _, _, _), do: nil | ||
|
||
defp do_map_union_optimization_strategy(:none, _, status), do: status | ||
|
||
defp do_map_union_optimization_strategy({key, v1, iterator}, pos2, status) do | ||
with %{^key => v2} <- pos2, | ||
next_status when next_status != nil <- map_union_next_strategy(key, v1, v2, status) do | ||
do_map_union_optimization_strategy(:maps.next(iterator), pos2, next_status) | ||
else | ||
_ -> nil | ||
end | ||
end | ||
|
||
defp map_union_next_strategy(key, v1, v2, status) | ||
|
||
# structurally equal values do not impact the ongoing strategy | ||
defp map_union_next_strategy(_key, same, same, status), do: status | ||
|
||
defp map_union_next_strategy(key, v1, v2, :all_equal) do | ||
if key != :__struct__, do: {:one_key_difference, key, v1, v2} | ||
end | ||
|
||
defp map_union_next_strategy(_key, v1, v2, {:one_key_difference, _, d1, d2}) do | ||
# we have at least two key differences now, we switch strategy | ||
# if both are subtypes in one direction, keep checking | ||
cond do | ||
trivial_subtype?(d1, d2) and trivial_subtype?(v1, v2) -> :left_subtype_of_right | ||
trivial_subtype?(d2, d1) and trivial_subtype?(v2, v1) -> :right_subtype_of_left | ||
true -> nil | ||
end | ||
end | ||
|
||
defp map_union_next_strategy(_key, v1, v2, :left_subtype_of_right) do | ||
if trivial_subtype?(v1, v2), do: :left_subtype_of_right | ||
end | ||
|
||
defp map_union_next_strategy(_key, v1, v2, :right_subtype_of_left) do | ||
if trivial_subtype?(v2, v1), do: :right_subtype_of_left | ||
end | ||
|
||
# cheap to compute sub-typing | ||
# a trivial subtype is always a subtype, but not all subtypes are subtypes | ||
defp trivial_subtype?(_, :term), do: true | ||
defp trivial_subtype?(same, same), do: true | ||
|
||
defp trivial_subtype?(%{} = left, %{} = right) | ||
when map_size(left) == 1 and map_size(right) == 1 do | ||
case {left, right} do | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I went for an even simpler version with just shallow comparisons (except for the structural comparison on top) There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Suggestion, let's go with the regular subtyping? We apply this on very few cases and once we start having structs nested inside structs, this will make a difference? In other words, let's go with the "slow" but more general and less code version and we optimize it again in the future? Especially to avoid getting the logic wrong here... :) There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. This makes total sense, especially since And perhaps we'll be able to make |
||
{%{atom: _}, %{atom: {:negation, neg}}} when neg == %{} -> | ||
true | ||
|
||
{%{map: _}, %{map: [{:open, pos, []}]}} when pos == %{} -> | ||
true | ||
|
||
{%{bitmap: bitmap1}, %{bitmap: bitmap2}} -> | ||
(bitmap1 &&& bitmap2) === bitmap2 | ||
|
||
_ -> | ||
false | ||
end | ||
end | ||
|
||
defp trivial_subtype?(_, _), do: false | ||
|
||
# Given two unions of maps, intersects each pair of maps. | ||
defp map_intersection(dnf1, dnf2) do | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe we can encapsulate this and use it on
map_normalize
too?There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, you mean instead of the
map_non_negated_fuse
/fusible_maps
/map_non_negated_fuse_pair
part right?Seems it's doing very similar stuff, let's see!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, but I can also be the one giving that a try. We can merge this once the comments above are addressed, then I can work on the map_normalize and you work on tuples? :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think I got it actually! Will push in a sec.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ec96574
I didn't spend time on renaming but I think we can probably use the term
fuse
/fusion
etc more for consistency.Feel free to rename as you see fit!