Redesign Tuples with HList-like structure #2199
Conversation
No idea - GitHub gets messy with reordered commits in rebases, I've noticed... Could be?

Otherwise, just change the hash of the latest commit and force push; that should get it going again. Also, CI runs a maximum of 20 jobs, which currently means 3 commits concurrently.

> On Fri, Apr 7, 2017, 09:36 Olivier Blanvillain wrote:
> @felixmulder any idea why the CI decided not to test the latest commit?
It always does that. GitHub has this stupid notion that the commits of a PR should be displayed in timestamp order, instead of in history order. It's only a display thing, though, so I doubt it would affect your CI (it certainly never affected ours).
```diff
@@ -6,7 +6,7 @@ case class Var() extends Expr
 object Analyzer {
   def substitution(expr: Expr, cls: (Var,Var)): Expr =
     expr match {
-      case cls._2 => cls._1 // source of the error
+      case e if e == cls._2 => cls._1 // source of the error
```
Why this change?
Since `_2` is now added through an implicit conversion, the pattern used in this case is no longer stable. (see commit message)
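The stability issue can be illustrated with a small sketch (`Pair` and `substitute` here are hypothetical stand-ins, not code from the PR): a pattern like `case cls._2 =>` only typechecks when `cls._2` is a stable path, and once `_2` comes from an implicit conversion it is a method call, so an explicit equality guard is needed instead.

```scala
object StablePatternDemo {
  // stand-in for the (Var, Var) pair in the original test
  case class Pair(a: Int, b: Int)

  def substitute(x: Int, cls: Pair): Int =
    x match {
      // an equality guard replaces the stable-identifier pattern `case cls.b =>`
      case y if y == cls.b => cls.a
      case other           => other
    }

  def main(args: Array[String]): Unit = {
    assert(substitute(2, Pair(10, 2)) == 10) // matches, substituted
    assert(substitute(3, Pair(10, 2)) == 3)  // no match, unchanged
    println("ok")
  }
}
```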
tests/run/double-pattern-type.scala (Outdated)

```scala
      }
      c2 match {
        case C2(a, b, c) =>
```
Could you detail what this one expands to?
That's unchanged. (The first commits come from #1938, which this PR is rebased on, to be able to test case classes with size > 23.)
tests/tuples/tuple-accessors.scala (Outdated)

```scala
  v22: Int

// We can't go above that since patDef is implemented with direct _i calls.
// It's should be possible to update the desugaring to use var + pattern
```
Typo: “it should”
```scala
  (1d, 2d, 3d)

val l3: (String, Boolean, Double, Double, Double) =
  l1 ++ l2
```
❤️
Great news!! Will everything from shapeless for operating on HLists the way we used to also be available?
I'd like to point you to https://github.com/szeiger/ErasedTypes which I wrote half a decade ago with the purpose of eventually replacing Scala tuples with HArrays. It contains abstractions over tuple arity that may be useful to add (although some will not work in the same way with Dotty making arbitrary type projections illegal). It also has a clean separation between cons-style HLists and flat HArrays (but I recommend looking at typical use cases to determine whether or not this should be preferred over the implementation here which combines the two).
tests/tuples/tuple-accessors.scala (Outdated)

```diff
@@ -120,7 +120,7 @@ object Test {
   v22: Int

 // We can't go above that since patDef is implemented with direct _i calls.
-// It's should be possible to update the desugaring to use var + pattern
+// Its should be possible to update the desugaring to use var + pattern
```
It?
Force-pushed a792788 to 5b48d46, then 967bd9d to 6214895.
@DanyMariaLee It seems like the current trend is to remove things from the standard library, not add more, so I'm not really sure shapeless-style operations on tuples would have a place in the standard library. Miles has been talking for a while about a possible shapeless redesign based on Symmetric Monoidal Categories (see this branch) that would allow operations like HList#length and Coproduct#length to be factored out into a single implementation over a common structure. The tuple types introduced in this PR would fit perfectly with this design!
@szeiger Thanks for pointing out your HArrays project, it's something I studied briefly before I started working on this. I think the main difference with this PR is that here nothing is intended to be exposed to users besides the tuple types themselves.

The question of array vs linked list representation is easily answered with a binary compatibility requirement: it wouldn't make much sense to use a flat representation up to size 22, then switch to linked afterward. However, if we lower that limit to a more reasonable size, then the question opens again :)
This change is needed for scala/scala3#2199. The idea is to have `type Tuple2[A, B] = (A, B) // Desugars into TupleCons[A, TupleCons[B, Unit]]` in Predef, so that users can refer to "a tuple of size 2" either with `(A, B)` or `Tuple2[A, B]`. It's still possible to refer to the actual underlying class with the explicit prefix, `scala.Tuple2`. Because this code is inside the scala package and mixes the two syntaxes, it breaks with the above PR...
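As a rough illustration of the encoding this commit message describes, here is a minimal sketch (this `TupleCons` is a simplified stand-in, not dotty's actual definition): a pair `(a, b)` becomes a chain of cons cells terminated by `Unit`, and `Tuple2[A, B]` is just an alias over that shape.

```scala
object TupleConsDemo {
  // simplified stand-in for the dotty-library TupleCons class
  final case class TupleCons[H, T](head: H, tail: T)

  // the alias described above: "a tuple of size 2" as nested cons cells
  type Tuple2Alias[A, B] = TupleCons[A, TupleCons[B, Unit]]

  // what `(1, "a")` would desugar into under this scheme
  val t: Tuple2Alias[Int, String] = TupleCons(1, TupleCons("a", ()))

  def main(args: Array[String]): Unit = {
    assert(t.head == 1)
    assert(t.tail.head == "a")
    println("ok")
  }
}
```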
When it comes to such low-level constructs as tuples you may not want to hide too much of the details for performance reasons. I haven't done any benchmarking, but exposing the TupleN classes should in theory enable better optimizations because you get monomorphic dispatch.

The motivation for having both HList and HArray in my implementation was not a mixed implementation of tuples depending on size. It doesn't do that (tuples are always flat), and for predictable performance I think it shouldn't do it. But at the type level HArrays are nested, so if you already have a proper HList type you can use that as part of the HArray type, i.e. "here's a flat implementation that corresponds to the following HList". Building up from Nat to HList and finally HArray was a natural thing to do given that my goal was not only to remove the Tuple22 limit but also to allow useful abstractions over arity.

Why would you still have special implementations up to 22 here anyway? I think you should lower that number considerably. 22 was originally chosen as large enough to represent what Scala needed, not because of performance considerations. The latter should be the driving factor for determining a good size for Dotty if you support arbitrary tuple sizes anyway. You are not bound by binary compatibility and you have Scalafix to take care of source compatibility.
While 22 wasn't originally chosen due to performance considerations, it would be nice to keep the performance properties of existing code. I'd prefer to keep the 22 limit in place and not lower it.
The current very practical reason is that the dotty test suite currently uses the 2.11.x standard library compiled by scalac. Methods like Map.apply and Map.head accept and return scala.Tuple2 instances, which is handled in this PR by preserving binary compatibility.
I definitely agree. I did a quick search on a large corpus of open-source Scala projects (~600K LOC), and every tuple instantiated, pattern-matched on, or accessed with _i explicitly was a tuple of size 6 or below. Above that, it's only generic operations (for example, in Play), for which both array and linked list representations should be superior. It also seems that the majority of generic operations are based on head & tail, which obviously work better on a linked representation.
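A minimal sketch of why head & tail generic operations fit a linked representation (names here are assumptions for illustration, not the PR's code): each step of a generic traversal only peels off one cons cell, so operations like `length` are a simple recursion.

```scala
object GenericOpsDemo {
  // simplified stand-in for the cons-based tuple structure
  final case class TupleCons[H, T](head: H, tail: T)

  // a generic operation expressed purely with head & tail:
  // each recursive step is O(1) on a linked representation
  def length(t: Any): Int = t match {
    case TupleCons(_, tail) => 1 + length(tail)
    case _                  => 0 // Unit terminates the chain
  }

  def main(args: Array[String]): Unit = {
    val t = TupleCons(1, TupleCons("a", TupleCons(true, ())))
    assert(length(t) == 3)
    println("ok")
  }
}
```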
The tuple interface has no methods, so there shouldn't be any dynamic dispatch happening on these tuples. Non-generic operations with statically known length use a cast and a direct call; generic operations must rely on pattern matching. One option would be case classes up to size 6, then a linked list representation above that.

I should probably plug in a few benchmarks I did just on the representations themselves: bench.pdf. Each representation is special-cased up to size 4 with case classes.
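The two run-time representations under discussion can be sketched as follows (a hedged illustration under assumed names like `Tup2` and `TupleXXL`, not the PR's actual classes): small arities get dedicated case classes with direct field access, while large arities share one case class wrapping an `Array` and accessed by index.

```scala
object RepresentationDemo {
  // small arity: a dedicated case class, fields accessed with direct calls
  final case class Tup2[A, B](_1: A, _2: B)

  // large arity: a single case class wrapping an Array, indexed with a cast
  final case class TupleXXL(elems: Array[Any]) {
    def apply(i: Int): Any = elems(i)
  }

  def main(args: Array[String]): Unit = {
    val small = Tup2(1, "a")
    val big   = TupleXXL(Array.tabulate(25)(i => i: Any)) // size 25 > 22
    assert(small._1 == 1)
    assert(big(24) == 24)
    println("ok")
  }
}
```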
Force-pushed b11a2f7 to 303a7d3.
Is this ready for review? If so, it would be good to rebase first.
Changes are as follows:
- New types for tuples in the dotty library: `Tuple = Unit | TupleCons Head Tail <: Tuple`
- Desugaring uses this structure instead of scala.TupleN for types and expressions
- A new TupleRewrites phase rewrites the HList structure into scala.TupleN-like case classes (for small arities) or a case class wrapping an Array (for large arities)
To stay binary compatible with scala 2.11 binaries, we hijack the unpickling of types to fake an HList structure. Later on in erasure, HList types are erased back to scala.TupleN (for N <= 22). In addition, because the actual scala.TupleN classes are neither `<: Tuple` nor `<: TupleCons`, these types need to be systematically erased to Object. On top of all these carefully compensating hacks, this also imposes a new constraint: the dotty-library, which defines several `Tuple/TupleCons` methods, should now *always* be compiled by dotty itself. Indeed, when compiled with scalac, these types would stay as they are instead of being erased to Object.
I couldn't get rid of the `dotty-library` project completely; it's still used the first time sbt-bridge is compiled with dotty. The dependencies are as follows:
- (no dep) compile `dotty-compiler` with scalac [0]
- (no dep) compile `dotty-library` with scalac [1]
- ([0], [1]) → compile `dotty-sbt-bridge` with dotty [2]
- ([0], [2]) → compile `dotty-library-bootstrapped` with dotty + sbt

After that, [1] should never be used again.
- dotc/scala-collections.blacklist: Blacklist Unit and TupleN from the scala-collections test.
- tests/neg/i1653.scala: Having an additional superclass on Unit seems to affect the error here (not sure why).
- tests/pos/t0301.scala: Since `_2` is now added through an implicit conversion, the pattern used in this case is no longer stable.
- tests/pos/tcpoly_bounds1.scala: This test uses the fact that `(A, B) <: Product2[A, B]`, which is no longer the case.
- tests/repl/patdef.check: Having the `_i` via implicit conversions removes the @unchecked here.
- tests/run/tuple-match.scala: Replace Tuple1 in pattern matching with an explicit TupleCons pattern. TupleN can still be used in apply position (defined as defs in DottyPredef), but not in unapply position.
- tests/run/unapply.scala: Also uses the fact that `(A, B) <: Product2[A, B]`.
- tests/run/withIndex.scala: Might be a limitation in typer?:

```
14 |    case _: Array[Tuple2[_, _]] => true
   |            ^^^^^^^^^^^^
   | unreducible application of higher-kinded type [A, B] => dotty.TupleCons[A, dotty.TupleCons[B, Unit]] to wildcard arguments
```
The LogPendingSubTypesThreshold change is required for tests on size ~25 tuples that would otherwise trigger -Yno-deep-subtypes.
Because all tests depend on dotty-library-bootstrapped, publishLocal is always needed to run tests.
@odersky yes! The last missing part is to update pattern matching exhaustivity. The current implementation is special-cased for tuples; to obtain the same warnings when using hlists this code would need to be either generalised for GADTs or special-cased for the new structure.
AFAIK in scalac, 22 was actually chosen due to the Scala runtime size. The blowup in jar size for the various Tuple/Product/Function etc. classes that get generated is polynomial relative to the arity, and once you start going above 22 you start adding megs onto the Scala runtime.
I think there should be an exception for shapeless-style operations because they tend to be central to how the language is used. A lot of the things which shapeless provides, if they were baked into the actual scalac backend, would be a lot more performant, and they would also remove a huge amount of boilerplate. That doesn't mean the entirety of shapeless should be in scalac, but a lot of the tuple/case class conversion functions and other methods of abstracting over arity would, I think, be welcome. It's painful to see languages like Ruby and Python able to do something in one line which Scala doesn't have an answer for without shapeless (or it does, but it takes 20 lines).
Impressive work! I have lots of comments on details but the general approach looks good.
One overall remark: have a closer look at what goes on in `Definitions`. It seems a lot of the things I criticise really come down to the fact that you don't make optimal use of the things defined there.
```scala
case 0 => unitLiteral
case _ if ctx.mode is Mode.Type =>
  // Transforming Tuple types: (T1, T2) → TupleCons[T1, TupleCons[T2, Unit]]
  val nil: Tree = TypeTree(defn.UnitType)
```
Use `hnilType` for consistency with `hconsType`?
```scala
ts.foldRight(nil)(hconsType)
case _ =>
  // Transforming Tuple trees: (T1, T2, ..., TN) → TupleCons(T1, TupleCons(T2, ... (TupleCons(TN, ())))
  val nil: Tree = unitLiteral
```
Again for naming consistency, use `hnil`, `hcons`?
```diff
@@ -140,7 +140,7 @@ object Config {
   final val LogPendingUnderlyingThreshold = 50

   /** How many recursive calls to isSubType are performed before logging starts. */
-  final val LogPendingSubTypesThreshold = 50
+  final val LogPendingSubTypesThreshold = 70
```
Why was this increased?
I have some tests with size-24 tuples that trigger this limit. I'm going to revert this change and run these tests without `-Yno-deep-subtypes`.
```scala
lazy val ProductNType = mkArityArray("scala.Product", MaxTupleArity, 0)
lazy val TupleNType = mkArityArray("scala.Tuple", MaxImplementedTupleArity, 1)
lazy val TupleNSymbol = TupleNType.map(t => if (t == null) t else t.classSymbol)
lazy val DottyTupleNType = mkArityArray("dotty.DottyTuple", MaxImplementedTupleArity, 1)
```
Defined like this, `TupleNSymbol` is an `Array[AnyRef]`. It's better to use `if (t == null) NoSymbol`, which makes it an `Array[Symbol]`.
```scala
lazy val TupleNType = mkArityArray("scala.Tuple", MaxImplementedTupleArity, 1)
lazy val TupleNSymbol = TupleNType.map(t => if (t == null) t else t.classSymbol)
lazy val DottyTupleNType = mkArityArray("dotty.DottyTuple", MaxImplementedTupleArity, 1)
lazy val DottyTupleNModule = DottyTupleNType.map(t => if (t == null) t else t.classSymbol.companionModule.symbol)
```
Ditto.
```diff
@@ -758,10 +758,18 @@ class Typer extends Namer with TypeAssigner with Applications with Implicits wit
   else errorType(i"wrong number of parameters, expected: ${protoFormals.length}", tree.pos)

   /** Is `formal` a product type which is elementwise compatible with `params`? */
-  def ptIsCorrectProduct(formal: Type) = {
+  def ptIsCorrectProduct(formal: Type): Boolean = {
+    val cons = defn.TupleConsType.symbol
```
`defn.TupleConsClass` (once it is introduced, see other comment)
```scala
val cons = defn.TupleConsType.symbol

// Flatten types nested in TupleCons as a List[Type].
def tupleType(t: Type, acc: List[Type] = Nil): List[Type] =
```
Rename to `tupleTypeArgs`.
```diff
@@ -0,0 +1,471 @@
+package dotty
```
I am not sure about the `src_dotc` scheme or the other aspects of the build. Let's discuss this aspect with the dotty team as a whole.
```scala
def Tuple1[A](a: A): TC[A, Unit] = TC(a, ())

implicit class Tuple2Assessors[A, B](l: TC[A, TC[B, Unit]]) {
```
`Assessors`? Should it not be `Accessors`?
What’s the status of this PR? Also, this might be out of the scope of this PR, but I think you should not stop midway and do the same for sealed traits ;)
@julienrf Lots of the complexity of this PR is due to having dotty as a cross-compiled scalac/dotty project. Since going for a bootstrap-only compiler is very close on the roadmap, I'm going to wait for that before coming back to this PR. Sum types will hopefully be the follow-up :)
Actually, I hit the 22-field limit in some real-world scenarios (when I had to write large case classes to parse large CSVs).
Hi guys,
I wonder if it is possible to create a typesafe `apply` on tuples. Something like:

```scala
trait Tuple { self: A =>
  def apply[I <: Int & Singleton, X](i: I)
           (implicit ev: TupleIndexer.Aux[A, I, X]): X = ev(self)
}
```

So you could do the following at compile time:

```scala
val t: (Int, String, Double)
t(0) // type: Int
t(1) // type: String
t(2) // type: Double
t(3) // ERROR: does not compile
```
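The `TupleIndexer` idea in the question can be sketched with a plain type-class chain (everything below — `Indexer`, the Peano-style `Nat` indices standing in for `Int & Singleton`, the `at` helper — is a hypothetical illustration, not an existing API): resolution walks the cons structure at compile time, so an out-of-range index simply fails to find an instance.

```scala
object TypedIndexDemo {
  // simplified stand-in for the cons-based tuple structure
  final case class TupleCons[H, T](head: H, tail: T)

  // Peano-style indices stand in for the question's `Int & Singleton`
  sealed trait Nat
  sealed trait Zero extends Nat
  sealed trait Succ[N <: Nat] extends Nat

  trait Indexer[L, N <: Nat] { type Out; def apply(l: L): Out }
  object Indexer {
    type Aux[L, N <: Nat, X] = Indexer[L, N] { type Out = X }
    // index 0 picks the head
    implicit def atZero[H, T]: Aux[TupleCons[H, T], Zero, H] =
      new Indexer[TupleCons[H, T], Zero] {
        type Out = H
        def apply(l: TupleCons[H, T]): H = l.head
      }
    // index n+1 recurses into the tail
    implicit def atSucc[H, T, N <: Nat, X](
        implicit rest: Aux[T, N, X]): Aux[TupleCons[H, T], Succ[N], X] =
      new Indexer[TupleCons[H, T], Succ[N]] {
        type Out = X
        def apply(l: TupleCons[H, T]): X = rest(l.tail)
      }
  }

  def at[L, N <: Nat](l: L)(implicit ix: Indexer[L, N]): ix.Out = ix(l)

  def main(args: Array[String]): Unit = {
    val t = TupleCons(1, TupleCons("a", ()))
    val i: Int    = at[TupleCons[Int, TupleCons[String, Unit]], Zero](t)
    val s: String = at[TupleCons[Int, TupleCons[String, Unit]], Succ[Zero]](t)
    assert(i == 1 && s == "a")
    // at[..., Succ[Succ[Zero]]](t) would not compile: no Indexer instance
    println("ok")
  }
}
```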
@joan38 It's postponed until scalac can emit TASTY and we can recompile everything from TASTY, given that the stdlib and thus the entire world has a binary dependency on Tuple2. Until then, the hacks needed in this PR to retain binary compatibility are a bit too much IMO.
Thanks @OlivierBlanvillain for the update. So I guess until the 2.14 milestone appears? This feature is really promising.
I am worried that this change will slow down the compilation speed.
@Glavo - don't be :) Also, dotty has a benchmarking suite that is being used for a lot of pull requests. So now the compiler team can actually know if something would slow down the compiler before merging.
@OlivierBlanvillain now that we have another implementation of the same idea, can we close this PR? Or is there something useful that was not covered by the current implementation?
Maybe we should extract the test cases |
@nicolasstucki @allanrenucci Could you point us to the other implementation?
4 of these tests are failing; I put them in pending for now. Interestingly, each of them fails in a different way:
- Compiler crash: tests/pending/tuple-patmat-extract.scala
- Compilation error: tests/pending/tuple-for-comprehension.scala
- Runtime error: tests/pending/tuple-erased.scala
- Doesn't pass Ycheck: tests/pending/tuple-cons.scala
There is also #5182, where the macro implementation changed completely.
Add tuple >22 test cases from #2199
How will this interact with the standard collections?
Has this been added to Dotty already?
Yes.
This PR redesigns Tuples to use an HList-like structure instead of flat case classes (`scala.TupleN`):

These types (and values) are user-facing: desugaring turns `(a, b, ...)` into `TupleCons[a, TupleCons[b, ...Unit]]` (and `TupleCons(a, TupleCons(b, ...()))`). At run-time, tuples are represented using case classes for small arities and an Array for large arities.

A new mini-phase named `TupleRewrites` takes care of optimizing `apply` and `unapply` expressions corresponding to creation of and pattern matching on tuples. To preserve binary compatibility, `Scala2Unpickler` is modified to transform tuple types to the new representation; `TypeErasure` then takes care of erasing (small) tuple types back to `scala.TupleN`.

TODO:
- `PatmatExhaustivityTests`: the exhaustivity checks need to be updated to properly handle the new representation.
- `runCompilerFromInterface`: this now needs an explicit dotty-library dependency. It used to be inherited "automatically" from the `dependsOn(dotty-library)` in `dotty-compiler`.
- `_i` accessors are still going to be capped at 22. We could lift that by desugaring patDef using vars and pattern matching instead.