Synthesize Representable type class #3663

OlivierBlanvillain · 2017-12-13T09:38:23Z

This PR is the first step towards integrating parts of shapeless' generic programming facilities into Dotty.

A Representable type class is added in package dotty.generic, alongside Sum and Prod types, to implement the equivalent of shapeless.Generic. Representable[A] is synthesized as a fallback in implicit search when A is a case class or a sealed trait. To make that possible, child annotations on sealed classes are now added in typer instead of PostTyper.

Plans for future work are as follows:

Include metadata in the generic representation (starting with labels)
Support higher-kinded types via Representable1, Sum1 and Prod1. Early (library-based) experiments show that this can be done very elegantly in Dotty by using higher-kinded counterparts of Representable, Sum and Prod, as done in GHC generics.
Experiment with offloading more of the type class derivation work to the compiler (doing less in implicit search). The idea would be to require users to write their derivation in a ReprFold type class (similar to shapeless.TypeClass), and then implement the actual fold in a compiler-generated Deriving type class:

// Similar to shapeless.TypeClass
trait ReprFold[TC[_]] {
  def imap[A, B](fa: TC[A])(f: A => B)(g: B => A): TC[B] // cats.Invariant
  def unit: TC[Unit]                                     // cats.Cartesian
  def product[A, B](fa: TC[A], fb: TC[B]): TC[(A, B)]    // cats.Cartesian
  def sum[A, B](fa: => TC[A], fb: => TC[B]): TC[Either[A, B]] // ??
}

trait Deriving[A] {
  type Mk[TC[_]] // = implicit (TC[Int], TC[String]) => TC[A]
                 // for A = sealed trait Bar
                 // case class Bis(i: Int)    extends Bar
                 // case class Buz(s: String) extends Bar

  def materialize[TC[_]]
    (implicit g: ReprFold[TC]): Mk[TC] // Compiler generated
}
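
As a rough illustration of the plan above, the sketch below instantiates ReprFold for a hypothetical Show type class and writes by hand what a compiler-generated materialize might produce for the Bar example. Show, showFold and deriveBar are illustrative names that are not part of this PR, and representing Bis and Buz directly by their single field is an assumption made to keep the example short.

```scala
object ReprFoldSketch {
  // Same shape as the ReprFold trait proposed above.
  trait ReprFold[TC[_]] {
    def imap[A, B](fa: TC[A])(f: A => B)(g: B => A): TC[B]
    def unit: TC[Unit]
    def product[A, B](fa: TC[A], fb: TC[B]): TC[(A, B)]
    def sum[A, B](fa: => TC[A], fb: => TC[B]): TC[Either[A, B]]
  }

  // Hypothetical user type class, used only for illustration.
  trait Show[A] { def show(a: A): String }

  // The user-written part: how Show folds over units, products and sums.
  implicit val showFold: ReprFold[Show] = new ReprFold[Show] {
    def imap[A, B](fa: Show[A])(f: A => B)(g: B => A): Show[B] =
      new Show[B] { def show(b: B): String = fa.show(g(b)) }
    def unit: Show[Unit] =
      new Show[Unit] { def show(u: Unit): String = "" }
    def product[A, B](fa: Show[A], fb: Show[B]): Show[(A, B)] =
      new Show[(A, B)] {
        def show(p: (A, B)): String = fa.show(p._1) + "," + fb.show(p._2)
      }
    def sum[A, B](fa: => Show[A], fb: => Show[B]): Show[Either[A, B]] =
      new Show[Either[A, B]] {
        def show(e: Either[A, B]): String =
          e.fold(a => fa.show(a), b => fb.show(b))
      }
  }

  sealed trait Bar
  case class Bis(i: Int) extends Bar
  case class Buz(s: String) extends Bar

  // Hand-written approximation of what the compiler could generate for Bar:
  // given instances for the leaf types, fold them into an instance for Bar.
  def deriveBar[TC[_]](tcInt: TC[Int], tcString: TC[String])(
      implicit fold: ReprFold[TC]): TC[Bar] =
    fold.imap[Either[Int, String], Bar](fold.sum(tcInt, tcString)) {
      case Left(i)  => Bis(i)
      case Right(s) => Buz(s)
    } {
      case Bis(i) => Left(i)
      case Buz(s) => Right(s)
    }

  val showInt: Show[Int] = new Show[Int] { def show(a: Int): String = a.toString }
  val showString: Show[String] = new Show[String] { def show(a: String): String = a }

  val showBar: Show[Bar] = deriveBar(showInt, showString)
}
```

With this split, the user only writes showFold once per type class, while the per-data-type plumbing (the two functions passed to imap) is what the compiler would be expected to generate.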

This commit breaks things, for instance pos/i1795.scala, which fails during pickling... To be investigated later.

milessabin · 2017-12-13T09:53:00Z

This looks interesting, but I recommend against committing to a concrete representation type, or if you do, go with Tuple2/Unit and Either/Nothing, at least as an interim measure, or perhaps make this parameterizable.
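
To make the Tuple2/Unit and Either/Nothing suggestion concrete, a representation along those lines might look like the sketch below. The first-order shape of Representable is assumed to mirror shapeless.Generic, and Shape, Circle, Rect, rectRepr and shapeRepr are hypothetical names chosen for the example.

```scala
object TupleReprSketch {
  // Assumed first-order shape of the type class, mirroring shapeless.Generic.
  trait Representable[A] {
    type Repr
    def to(a: A): Repr
    def from(r: Repr): A
  }

  sealed trait Shape
  final case class Circle(radius: Double) extends Shape
  final case class Rect(width: Double, height: Double) extends Shape

  // Products: right-nested Tuple2 terminated by Unit.
  val rectRepr: Representable[Rect] { type Repr = (Double, (Double, Unit)) } =
    new Representable[Rect] {
      type Repr = (Double, (Double, Unit))
      def to(r: Rect): Repr = (r.width, (r.height, ()))
      def from(r: Repr): Rect = Rect(r._1, r._2._1)
    }

  // Sums: right-nested Either terminated by Nothing.
  val shapeRepr: Representable[Shape] {
    type Repr = Either[Circle, Either[Rect, Nothing]]
  } =
    new Representable[Shape] {
      type Repr = Either[Circle, Either[Rect, Nothing]]
      def to(s: Shape): Repr = s match {
        case c: Circle => Left(c)
        case r: Rect   => Right(Left(r))
      }
      def from(r: Repr): Shape = r match {
        case Left(c)         => c
        case Right(Left(x))  => x
        case Right(Right(n)) => n // n has type Nothing, so this case never runs
      }
    }
}
```

No new Sum or Prod types are needed here; the representation reuses types that every Scala program already has.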

The big issue is with kinding as you've already started to observe with Representable1 etc. I believe that the right approach here is to address kind polymorphism head on.

Also note that shapeless's TypeClass type class is really very limited. One thing that we immediately encountered was a need to be able to thread auxiliary type classes through a derivation. Various attempts were made to generalize TypeClass to support this, but none of them proved satisfactory. The current shapeless model using Lazy (byname in Dotty) and implicit resolution has proved to be a great deal more flexible in practice.

While I support the idea of making functionality of this sort a language intrinsic, I recommend against baking in an implementation which commits to replicating boilerplate at the kind level and limits derivations to such a simple form.

OlivierBlanvillain · 2017-12-13T10:24:56Z

@milessabin Thanks for your input!

> The big issue is with kinding as you've already started to observe with Representable1 etc. I believe that the right approach here is to address kind polymorphism head on.

I want to experiment with the GHC.generics approach where higher kinded sum & product are used for both ground types and HKT, something along these lines:

sealed trait Prod[X]
final case class PCons[H[_], T[t] <: Prod[t], X](head: H[X], tail: T[X]) extends Prod[X]
final case class PNil[X]() extends Prod[X]

sealed trait Sum[X]
sealed trait SCons[H[_], T[t] <: Sum[t], X] extends Sum[X]
final case class SLeft[H[_], T[t] <: Sum[t], X](head: H[X]) extends SCons[H, T, X]
final case class SRight[H[_], T[t] <: Sum[t], X](tail: T[X]) extends SCons[H, T, X]
sealed trait SNil[X] extends Sum[X]

trait Representable[A] {
  type Repr[t] <: Sum[t] | Prod[t]

  def to[T](a: A): Repr[T]
  def from[T](r: Repr[T]): A
}

trait Representable1[A[_]] {
  type Repr[t] <: Sum[t] | Prod[t]

  def to[T](a: A[T]): Repr[T]
  def from[T](r: Repr[T]): A[T]
}

// Syntax for ground types
type &:[H, T[t] <: Prod[t]] = [X] => PCons[[Y] => H, T, X]
type |:[H, T[t] <: Sum[t]] = [X] => SCons[[Y] => H, T, X]

// Syntax for HKT
type :&:[H[_], T[t] <: Prod[t]] = [X] => PCons[H, T, X]
type :|:[H[_], T[t] <: Sum[t]] = [X] => SCons[H, T, X]

type Id[t] = t
type Const[t] = [X] => t

sealed trait Tree[T]
case class Leaf[T](t: T) extends Tree[T]
case class Node[T](l: Tree[T], r: Tree[T]) extends Tree[T]

Representable1[Node] { type Repr = Tree :&: Tree :&: PNil }
Representable1[Leaf] { type Repr = Id :&: PNil }
Representable1[Tree] { type Repr = Leaf :|: Node :|: SNil }

Representable[Node[A]] { type Repr = Tree[A] &: Tree[A] &: PNil }
Representable[Leaf[A]] { type Repr = A &: PNil }
Representable[Tree[A]] { type Repr = Leaf[A] |: Node[A] |: SNil }

> Also note that shapeless's TypeClass type class is really very limited. One thing that we immediately encountered was a need to be able to thread auxiliary type classes through a derivation.

Are you referring to cases where additional/external type classes are mixed in during derivation? Is it common in practice?

milessabin · 2017-12-13T11:01:56Z

I find something that stops at * -> * a bit unsatisfying ... it'd be much nicer to have a uniform solution for, eg. Tuple2[A, B] etc. I'd also like to experiment with a "no representation type" approach via Church encoding or similar. Anyhow, I'd say it's not a good idea to bake in something so similar to shapeless now that it's clearer what shapeless's limitations in practice are.

> Are you referring to cases where additional/external type classes are mixed in during derivation? Is it common in practice?

Yes, very. The most common is to thread Witness/ValueOf through when working with values produced by shapeless's LabelledGeneric to get hold of terms corresponding to the singleton label types, but there are plenty of others. See shapeless's implementation of Scrap Your Boilerplate, or this recently contributed implementation of recursion schemes. Neither of these could be implemented in terms of shapeless's TypeClass.

julienrf · 2017-12-22T16:51:13Z

Following a discussion I had with @OlivierBlanvillain, I’d like to share a more elaborate example than Show and discuss some points that I think are important to address.

This example is inspired by a library that describes data types and then derives typeclass instances from these descriptions (API doc is here). For simplicity, only record types (case classes) are considered; the case of sum types is similar but essentially uses Either instead of Tuple2.

Consider the following typeclass for serializing/deserializing data into/from JSON documents:

trait Codec[A] {
  def encode(a: A): Json
  def decode(json: Json): Either[ValidationErrors, A]
}

Here is how we would manually define an instance of Codec[User]:

case class User(name: String, age: Int)

object User {
  implicit val codec: Codec[User] =
    Codec.obj2(
      "name" -> Codec.string,
      "age" -> Codec.integer
    ) { case (n, a) => User(n, a) } { user => (user.name, user.age) }
}

It assumes that the following operations are available:

object Codec {
  /** JSON String */
  implicit def string: Codec[String] = …
  /** JSON number */
  implicit def integer: Codec[Int] = …
  /** JSON object with two fields */
  def obj2[A, B, C](
    fieldA: (String, Codec[A]), fieldB: (String, Codec[B])
  )(
    f: (A, B) => C
  )(
    g: C => (A, B)
  ): Codec[C]
}

Ideally, we would like generically derived instances of Codec to be exactly like User.codec.

However, with the shapeless.Generic approach it does not seem possible because this approach abstracts over the arity of the case classes by using an inductive representation of record fields (with an HList). Consequently, derived instances use several intermediate transformations. Here are the required implicit definitions to make it possible to generically derive typeclass instances:

trait DerivedCodec[A] {
  def codec: Codec[A]
}

object DerivedCodec {
  /** Base rule: derives a codec for a case class with exactly one field */
  implicit def singletonField[L <: Symbol, A](implicit
    fieldLabel: ValueOf[L],
    fieldCodec: Codec[A]
  ): DerivedCodec[FieldType[L, A] :: HNil] = new DerivedCodec[FieldType[L, A] :: HNil] {
    def codec = Codec.obj1(fieldLabel.value.name -> fieldCodec).invmap(a => field[L](a) :: HNil)(_.head)
  }

  /** Induction rule: derives a codec for a case class with n + 1 fields, given a derived codec for a case class with n fields */
  implicit def consField[L <: Symbol, H, T <: HList](implicit
    fieldLabel: ValueOf[L],
    fieldCodec: Codec[H],
    tailDerivedCodec: DerivedCodec[T]
  ): DerivedCodec[FieldType[L, H] :: T] = new DerivedCodec[FieldType[L, H] :: T] {
    def codec =
      Codec.obj1(fieldLabel.value.name -> fieldCodec).zip(tailDerivedCodec.codec)
        .invmap { case (h, t) => field[L](h) :: t } { ht => (ht.head, ht.tail) }
  }

  /** Derives a codec for a case class `A`, given a derived codec for its generic representation `R` */
  implicit def hlistToCaseClass[A, R](implicit
    gen: LabelledGeneric.Aux[A, R],
    derivedCodec: DerivedCodec[R]
  ): DerivedCodec[A] = new DerivedCodec[A] {
    def codec = derivedCodec.codec.invmap(gen.from)(gen.to)
  }
}

This example uses HList, FieldType and LabelledGeneric from shapeless, and ValueOf from SIP-23. It also assumes the following operations on Codec:

trait Codec[A] {
  /** combines `this` codec with `that` codec */
  def zip[B](that: Codec[B]): Codec[(A, B)] = …
  /** transforms this `Codec[A]` into a `Codec[B]` by using a pair of inverse functions */
  def invmap[B](f: A => B)(g: B => A): Codec[B] = …
}

object Codec {
  /** JSON object with one field of type `A` */
  def obj1[A](field: (String, Codec[A])): Codec[A] = …
}
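
For readers who want to run the idea, here is a toy model of these operations. It is only a sketch under stated assumptions: Json is replaced by a three-case ADT (JStr, JNum, JObj), ValidationErrors by a List[String], and the bodies of zip, invmap and obj1 are one plausible implementation, not the library's.

```scala
object CodecSketch {
  // Toy stand-ins (assumptions) for the Json and ValidationErrors types.
  sealed trait Json
  final case class JStr(value: String) extends Json
  final case class JNum(value: Int) extends Json
  final case class JObj(fields: Map[String, Json]) extends Json
  type Errors = List[String]

  trait Codec[A] { self =>
    def encode(a: A): Json
    def decode(json: Json): Either[Errors, A]

    // Runs both codecs on the same document and accumulates errors.
    def zip[B](that: Codec[B]): Codec[(A, B)] = new Codec[(A, B)] {
      def encode(ab: (A, B)): Json =
        (self.encode(ab._1), that.encode(ab._2)) match {
          case (JObj(x), JObj(y)) => JObj(x ++ y)
          case (x, _)             => x // toy model: only objects are merged
        }
      def decode(json: Json): Either[Errors, (A, B)] =
        (self.decode(json), that.decode(json)) match {
          case (Right(a), Right(b)) => Right((a, b))
          case (Left(e1), Left(e2)) => Left(e1 ++ e2)
          case (Left(e), _)         => Left(e)
          case (_, Left(e))         => Left(e)
        }
    }

    // Re-targets the codec through a pair of inverse functions.
    def invmap[B](f: A => B)(g: B => A): Codec[B] = new Codec[B] {
      def encode(b: B): Json = self.encode(g(b))
      def decode(json: Json): Either[Errors, B] = self.decode(json).map(f)
    }
  }

  object Codec {
    val string: Codec[String] = new Codec[String] {
      def encode(s: String): Json = JStr(s)
      def decode(json: Json): Either[Errors, String] = json match {
        case JStr(s) => Right(s)
        case _       => Left(List("expected string"))
      }
    }
    val integer: Codec[Int] = new Codec[Int] {
      def encode(n: Int): Json = JNum(n)
      def decode(json: Json): Either[Errors, Int] = json match {
        case JNum(n) => Right(n)
        case _       => Left(List("expected number"))
      }
    }
    // JSON object with a single named field.
    def obj1[A](field: (String, Codec[A])): Codec[A] = new Codec[A] {
      def encode(a: A): Json = JObj(Map(field._1 -> field._2.encode(a)))
      def decode(json: Json): Either[Errors, A] = json match {
        case JObj(fs) if fs.contains(field._1) => field._2.decode(fs(field._1))
        case _ => Left(List("missing field " + field._1))
      }
    }
  }

  case class User(name: String, age: Int)

  // Built the way an inductive derivation would: obj1, zip, then invmap.
  val composed: Codec[User] =
    Codec.obj1("name" -> Codec.string)
      .zip(Codec.obj1("age" -> Codec.integer))
      .invmap { case (n, a) => User(n, a) } { u => (u.name, u.age) }
}
```

Even in this small model, every encode or decode of a User passes through a tuple and two single-field codecs before reaching the case class, which is the overhead discussed in this comment.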

If we derive a Codec[User] using the implicit rules, it produces the following codec:

  hlistToCaseClass(
    <compiler-synthesized>,
    consField(
      'name,
      Codec.string,
      singletonField('age, Codec.integer)
    )
  )

Which, in turn, expands to:

  Codec.obj1('name.name -> Codec.string)
    .zip(Codec.obj1('age.name -> Codec.integer).invmap(a => field['age](a) :: HNil)(_.head))
    .invmap { case (h, t) => field['name](h) :: t } { ht => (ht.head, ht.tail) }
    .invmap { case n :: a :: HNil => User(n, a) } { user => user.name :: user.age :: HNil }

(I removed the intermediate DerivedCodec and just kept Codec)

As we can see, the derived instance uses a lot of intermediate transformations (invmap calls). These transformations are necessary for two reasons:

to convert to/from the HList based generic representation of the case class
to progress between two induction steps

We might be able to inline the HList construction and extraction so that it doesn’t exist at runtime, but I’m wondering how we could get rid of the intermediate transformations caused by the inductive derivation process. The invmap and zip operations are user-defined, and the compiler has no knowledge of how to rewrite them.

Let’s try to manually rewrite the induction step without using zip (by inlining it) and invmap:

  implicit def consField[L <: Symbol, H, T <: HList](implicit
    fieldLabel: ValueOf[L],
    fieldCodec: Codec[H],
    tailDerivedCodec: DerivedCodec[T]
  ): DerivedCodec[FieldType[L, H] :: T] = new DerivedCodec[FieldType[L, H] :: T] {
    def codec = new Codec[FieldType[L, H] :: T] {
      def encode(ht: FieldType[L, H] :: T): Json =
        Json.obj(fieldLabel.value.name -> fieldCodec.encode(ht.head.value))
          .merge(tailDerivedCodec.codec.encode(ht.tail))
      def decode(json: Json): Either[ValidationErrors, FieldType[L, H] :: T] = {
        val headResult =
          json match {
            case JsonObject(fields) if fields.contains(fieldLabel.value.name) =>
              fieldCodec.decode(fields.get(fieldLabel.value.name))
            case _ =>
              Left(MissingField(fieldLabel.value.name))
          }
        val tailResult = tailDerivedCodec.codec.decode(json)
        (headResult, tailResult) match {
          case (Right(h), Right(t)) => Right(field[L](h) :: t)
          case (Left(e),  Right(_)) => Left(e)
          case (Right(_), Left(e))  => Left(e)
          case (Left(e1), Left(e2)) => Left(e1.concat(e2))
        }
      }
    }
  }

The derived codec would still be less performant than the manually written one (the derived one uses Codec.obj1 two times and merges their behaviors whereas the manually written one directly uses Codec.obj2).

Note that we even have to expand the code for combining Either[ValidationErrors, H] and Either[ValidationErrors, T] into Either[ValidationErrors, H :: T], just to avoid calling invmap. It seems that this generic representation is the root of our problems.

DavidGregory084 · 2017-12-25T01:21:16Z

/cc @fommil as I think he will be interested in this

fommil · 2017-12-25T11:06:24Z

The alternative to case classes (now mothballed because scalameta macros were abandoned) was written up at https://vovapolu.github.io/scala/stalagmite/perf/2017/09/02/stalagmite-performance.html

I'd love to return to it.

My new approach to typeclass derivation at the point of data definition is at https://gitlab.com/fommil/scalaz-deriving. I'll be writing a chapter in my book soon, plus hopefully giving a talk at lambdaconf.

OlivierBlanvillain · 2019-01-28T10:35:50Z

Subsumed by #5540

OlivierBlanvillain added 30 commits December 12, 2017 11:39

Register children in Typed instead of PostTyper

4d80935

This commit breaks things, for instance pos/i1795.scala that fail during pickling... To be investigated later.

Add skeleton of generic library

1e7ab29

Add implicit search hook

e08a98c

Synthesise Repr type alias for products

800bc05

Implement selectorName using productAccessorName

40a8ee4

Synthesise to and from for products

659ec4e

Add generic-sum test

8936872

Synthesise type Repr for sums

f43fa1a

Synthesise to/from for sums

5a2726e

Simplify Product.from using a Match instead

89ce33f

Document sum case

3263d44

Rewrite sum.to using a match

01742e5

Refactoring, root qualify names form dotty.generic._

145ca11

Wip porting shapeless tests

450efee

Fix "_ is not a type name" on objects

a585c36

Add asSeenFrom for sums

d6e1e75

Swap product/sum cases in pattern match

3ddb304

Handle case objects

471fb3f

Handle varargs

8fbb760

Remove warnings using unchecked

e461835

Reuse instantiate from patmat exhaustivity

175f022

Update representable tests

86b3a16

Use lower kinded Sum/Prod for Representatble

a33dcfa

Use fullyDefinedType to synthesise type arguments

1bd9e71

Adapt to make typer aware of the new contraints

e35ef67

Update representable.scala

09eb241

Replace rootQual by tpd.ref :)

62c0037

Fix a genLoad bug with .widen

a24612b

Split synthesizedRepresentable into 3 methods

e53e690

Strip "../" from test output

ab52281

OlivierBlanvillain added 2 commits December 12, 2017 16:12

Cleanup

460cf6d

Add documentation

2858e53

OlivierBlanvillain self-assigned this Jan 18, 2018

odersky added the stat:on hold label Jan 12, 2019

OlivierBlanvillain closed this Jan 28, 2019
