Improve decompilation to generate compilable sources #4526

Closed
nicolasstucki opened this issue May 14, 2018 · 6 comments

Comments

@nicolasstucki
Contributor

nicolasstucki commented May 14, 2018

Currently the decompilation printer scala.tasty.util.ShowSourceCode is the main step in the decompilation of class files. It is a glorified TASTY tree pretty printer. It is still quite rough, but it prints a fully typed version of the sources (all types are inferred, all implicits are resolved, ...). There are still two kinds of issues to iron out: (i) resugaring and (ii) leaking internals.

  • (i) We need to print some trees that were generated by desugaring back into source-like code. For example, we currently do this for lambdas.
  • (ii) The TASTY trees do not exactly match Scala source. We have some internal flags that do not make sense in source code and should not be printed. Fields are defined in the class body rather than in the constructor, where they should appear in the sources. There are many more small details like these.

The goal is to have a decompiler that produces sources that can be recompiled and generate the same program. We expect to get there in small incremental steps, supporting more and more features.

Currently, the simplest way to test whether some code is decompiled correctly is to create a file Foo.scala (or use any of the available tests in tests/pos), write some code in it, and run the following commands from the dotty project:

sbt
;dotc -d output Foo.scala; dotc -decompile -d output -classpath output Foo; dotc -d output2 output/decompiled.scala

When an issue is fixed, a regression test must be added by placing a .scala source in tests/pos along with a .decompiled file with the same name. For example, tests/pos/lambda.scala and tests/pos/lambda.decompiled.

You can test this file by running sbt testFromTasty (or sbt "testFromTasty XYZ", where XYZ is the name of the test). If the test fails, a .decompiled.out file will be generated beside the .decompiled file. You can simply rename it if the output of the current run was actually correct.

Note that this file is currently sensitive to spaces at the end of a line (painful and needs to be fixed) and that FromTastyTests appends to any existing .decompiled.out, which is problematic if you still have one from a previous FromTastyTests run. There is also no way to run only a single test under FromTastyTests, which is another improvement that should be made.
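
One possible way to remove the sensitivity to trailing spaces is to normalize both the expected and the actual text before comparing them. The following is only a sketch: the object and method names are hypothetical and this is not how FromTastyTests currently compares files.

```scala
// Hypothetical helper, not part of the dotty test infrastructure.
object DecompiledCompare {
  // Strip spaces and tabs at the end of every line, keeping the line structure.
  def stripTrailingWhitespace(text: String): String =
    text.split("\n", -1).map(_.replaceAll("[ \\t]+$", "")).mkString("\n")

  // Two texts are considered equal if they differ only in trailing whitespace.
  def sameModuloTrailingWhitespace(expected: String, actual: String): Boolean =
    stripTrailingWhitespace(expected) == stripTrailingWhitespace(actual)
}
```

With such a check, a .decompiled file containing `class Foo() { ` (with a trailing space) would match output that has no trailing space on that line.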

@nicolasstucki
Contributor Author

nicolasstucki commented May 14, 2018

Here is a simple one that fails

class Foo
object Foo

generating

class Foo() {}object Foo {}

It is just missing a \n between the top-level statements.
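
As a rough illustration of the fix, the printer only needs to join the printed top-level statements with a line break. This sketch works on already-printed strings and uses hypothetical names. It is not the actual ShowSourceCode code.

```scala
// Hypothetical sketch, not the actual ShowSourceCode implementation.
object TopLevelPrinter {
  // Each element is the already-printed source of one top-level statement.
  def printTopLevel(stats: List[String]): String =
    stats.mkString("\n")
}
```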

@nicolasstucki
Contributor Author

nicolasstucki commented May 14, 2018

class Foo(x: Int)

generates

/** Decompiled from output/Foo.class */
class Foo(x: Int) { 
  private[this] val x: Int
}

An additional statement is created internally to represent the field of the constructor argument. We just need to look at its flags and filter it out if it is one of those internally generated fields.
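
The filtering could look roughly like the sketch below. The tree and flag types are simplified stand-ins invented for illustration (including the flag name ParamAccessor used here). They are not the scala.tasty API.

```scala
object ParamFieldFilter {
  // Hypothetical, simplified stand-ins for flags and definitions.
  sealed trait Flag
  case object PrivateLocal extends Flag   // models private[this]
  case object ParamAccessor extends Flag  // models a field generated for a constructor parameter

  final case class ValDef(name: String, tpe: String, flags: Set[Flag])

  // Keep only the members that should be printed in the class body,
  // dropping fields that merely mirror a constructor parameter.
  def userVisibleMembers(body: List[ValDef]): List[ValDef] =
    body.filterNot(_.flags.contains(ParamAccessor))
}
```

Under this model, the generated `private[this] val x: Int` would be dropped from the body of `class Foo(x: Int)`, while fields written by the user are kept.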

@nicolasstucki nicolasstucki changed the title Improve decompilation printer to generate compilable sources Improve decompilation to generate compilable sources May 14, 2018
@nicolasstucki
Contributor Author

package bar
class Foo {
  protected[bar] def foo(): Int = 0
}

Decompiling with dotc -decompile ... bar.Foo produces

/** Decompiled from output/bar/Foo.class */
package bar {
  class Foo() { 
    protected def foo(): Int = 0
  }
}

It is missing the bar in protected[bar]. This information can be found in sym.privateWithin of the sym: Symbol of the definition.
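
A minimal sketch of how the qualifier could be printed, assuming the information from sym.privateWithin has already been reduced to an optional scope name. The Option[String] model and the method name are assumptions made for illustration only.

```scala
object AccessModifierPrinter {
  // Hypothetical model: `privateWithin` holds the name of the scope
  // the modifier is qualified with, if there is one.
  def showProtected(privateWithin: Option[String]): String =
    privateWithin match {
      case Some(scope) => s"protected[$scope]"
      case None        => "protected"
    }
}
```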

@elektronaut0815

Trying Varargs:

package bar
class Foo {
  def printAll(strings: String*) = strings.map(println)
}

produces

/** Decompiled from output/bar/Foo.class */
package bar {
  class Foo() { 
    def printAll(strings: Seq[String] @scala.annotation.internal.Repeated()): 
      Seq[Unit]
     = 
      strings.map[Unit, Seq[Unit]]((x: Any) => println(x))(
        collection.Seq.canBuildFrom[Unit]
      )
  }
}

Compiling this works, but the annotation should not appear.
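
One way to resugar this would be to recognize a Seq type carrying the Repeated annotation and print it with the `*` suffix instead. The type model below is a simplified, hypothetical one written for this example. The real printer works on TASTY types.

```scala
object RepeatedParamPrinter {
  // Hypothetical, simplified type model (not the scala.tasty API).
  sealed trait Tpe
  final case class Named(name: String) extends Tpe
  final case class Applied(tycon: String, arg: Tpe) extends Tpe
  final case class Annotated(underlying: Tpe, annot: String) extends Tpe

  val RepeatedAnnot = "scala.annotation.internal.Repeated"

  def show(tpe: Tpe): String = tpe match {
    // Seq[T] @Repeated is printed back as the source-level T*
    case Annotated(Applied("Seq", elem), RepeatedAnnot) => show(elem) + "*"
    // Any other annotated type keeps its annotation
    case Annotated(underlying, annot) => show(underlying) + " @" + annot + "()"
    case Applied(tycon, arg) => tycon + "[" + show(arg) + "]"
    case Named(name) => name
  }
}
```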

@schneist

schneist commented May 15, 2018

class Foo {

  def justdoit (f : Either[Int,String]) : String = {
    f match {
      case Left(i) => i.toString
      case Right(s) => s
    }

  }
}

produces

/** Decompiled from output/Foo.class */
class Foo() {
  def justdoit(f: Either[Int, String]): String =
    {
      f match
        {
          case Left.unapply[Int, String](i @ _): Left[Int, String] =>
            i.toString()
          case Right.unapply[Int, String](s @ _): Right[Int, String] =>
            s: String
        }
    }
}

@elektronaut0815

Calling varargs method:

package bar
class Foo {
  def printAll(strings: String*) = strings.foreach(println)

  def printDefault() = printAll("One", "Two", "Three")
}

produces

/** Decompiled from output/bar/Foo.class */
package bar {
  class Foo() { 
    def printAll(strings: Seq[String] @scala.annotation.internal.Repeated()): 
      Unit
     = strings.foreach[Unit]((x: Any) => println(x))
    def printDefault(): Unit = 
      this.printAll(["One","Two","Three" : String]: String*)
  }
}

but should be

def printDefault(): Unit =
      this.printAll(Seq("One","Two","Three"): _*)
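
For reference, the two call forms are equivalent in source code, which is why the second one is a valid target for the printer. The example below uses a hypothetical joinAll method (instead of println) so that the result can be compared, and it uses the `: _*` syntax shown above.

```scala
object VarargsCall {
  // A varargs method, analogous to printAll but returning a value.
  def joinAll(strings: String*): String = strings.mkString(",")

  // Passing the arguments one by one...
  def direct(): String = joinAll("One", "Two", "Three")

  // ...is equivalent to passing an existing sequence with `: _*`.
  def expanded(): String = joinAll(Seq("One", "Two", "Three"): _*)
}
```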

fschueler added a commit to fschueler/dotty that referenced this issue May 15, 2018
…la#4526)

This fixes printing of classes with companion objects so that they don't appear on the same line.
Example case:

class Foo
object Foo

will be decompiled into

class Foo() {}
object Foo {}

where before they would be printed on the same line, failing compilation.
nicolasstucki added a commit that referenced this issue May 15, 2018
Fix printing of classes with companion objects in the decompiler (#4526)
nicolasstucki added a commit to dotty-staging/dotty that referenced this issue Aug 8, 2018
nicolasstucki added a commit that referenced this issue Aug 9, 2018
tuvior pushed a commit to tuvior/dotty that referenced this issue Oct 17, 2018
Remove annotation from function declaration

Explicitly type non sequence literal arguments as varargs in function call (_*)

3 participants