add scripting support similar to scala2 scripting #11180

Closed
wants to merge 12 commits into from

Conversation

philwalk
Contributor

@philwalk philwalk commented Jan 20, 2021

This proposal is related to these issues:

#10491
#4747
#10858

Design Rationale

In order for scala scripting to seriously contend with python or bash as a general scripting solution, scripts must be able to leverage the large catalog of available java and scala libraries without always having to specify the required classpath. Other scripting languages rely on a set of "installed" libraries, providing a default configuration. In the case of JVM languages, that implies the need for a default classpath.

Other scripting environments also typically provide a way to install libraries, making them available for import at runtime. Examples of package manager tools include

npm (node)
apt (bash)
pip (python)

A comparable capability for managing a scala scripting environment might be based on coursier, sbt and/or mill. For a jvm library to be "installed" means:

  1. library jar files and transitive dependencies, if not already present, are downloaded to the filesystem
  2. one or more named or default classpaths are updated

This proposal specifies a means for specifying named classpaths, but does not provide a package management tool.

Proposed features for the scala3 scripting solution:

  1. script hash-bang line is ignored by the compiler (no error message)
  2. generate a jar file in the script directory, like scala2 -save option, leveraging scalac -d <scriptname.jar>
  3. specify named runtime classpath at compile time (Class-Path: property added to compiled jar manifest)
  4. Main-Class property is written to the jar manifest
  5. run from compiled jar if newer than script file, otherwise recompile
  6. compiled jar is self-contained, and can be executed via java -jar myScript.jar
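Feature 5 above (run from the jar only when it is newer than the script) could be sketched roughly as follows. This is an illustration rather than the prototype's code: the object name `JarCache` and the method `needsCompile` are hypothetical, and the caller is assumed to already know both paths.

```scala
import java.nio.file.{Files, Path}

object JarCache {
  /** True when the script must be (re)compiled: the jar is missing,
   *  or it is not strictly newer than the script source. */
  def needsCompile(script: Path, jar: Path): Boolean =
    !Files.exists(jar) || {
      val jarTime    = Files.getLastModifiedTime(jar)
      val scriptTime = Files.getLastModifiedTime(script)
      // compareTo > 0 means the jar is newer; anything else triggers a recompile
      jarTime.compareTo(scriptTime) <= 0
    }
}
```

Treating equal timestamps as stale is a conservative choice for filesystems with coarse timestamp resolution.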

The proposed design involves a single jvm startup latency on each script execution.

Benchmark of prototyped solution

A prototype for the proposed solution (based on dotty compiler as of 2021-01-12) requires 4 jvm startups:

  1. compile to jar
  2. analyze the script source to determine main class
  3. add Class-Path and Main-Class properties to compiled jar
  4. execute the compiled jar

The benchmark script:

#!/usr/bin/env scala3.sh

import java.nio.file.Paths

object S3files {
  def main(args:Array[String]):Unit = {
    // listFiles returns an Array[File], so no asScala conversion is needed
    Paths.get(".").toFile.listFiles.filter { _.isDirectory }.foreach { dir =>
      printf("%s\n",dir)
    }
  }
}

A rough guess is that the prototype benchmark overstates startup latency by about 300 milliseconds on my system (java 11.0.9). This was estimated by averaging total jvm run times for java -version over 100 runs.

The actual script startup latency for the prototype's compile versus cached runs:

  • average run time over 10 compile runs: 6.7105 seconds
  • average run time over 100 cached runs: 0.6436 seconds

The challenging parts (for me) are to figure out how to leverage the existing dotty compiler capabilities to:

  1. discover script main class name (if defined)
  2. write a self-contained jar file with supplemented manifest capabilities

A quick and dirty alternative would be to use existing prototype hacks.

Feedback and opinions are welcome!

@philwalk philwalk changed the title initial limited checkin of idea add scripting support similar to scala2 scripting Feb 3, 2021
@anatoliykmetyuk
Contributor

Thanks for the proposal @philwalk!

discover script main class name (if defined)

This capability is already present in the existing scripting driver: dotty/compiler/src/dotty/tools/scripting/ScriptingDriver.scala:44

This proposal specifies a means for specifying named classpaths, but does not provide a package management tool.

I believe if we implement 3rd party library support from scripts, it should also support library fetching. It is tedious to manage classpaths manually. We were considering adding something like Ammonite's magic imports – but the semantics is not mature enough, so we can only support Maven libraries (and not references to other scripts) in that way.

Most probably a scripting solution with such support will require integration with Coursier and hence would be good to split into a separate project that depends on the compiler.

  1. pass @<argsFile> and -color:* options to compiler
  2. add -save|-savecompiled option
  3. recognize scripts with #!.*scala regardless of extension
  4. if <scriptPath>.jar file is newer than <scriptPath> execute it (no compile)
  5. set -Dscript.name for both script execution paths (script or .jar)
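Item 3 above (recognizing a script by its `#!.*scala` first line rather than by extension) might look something like this. It is a minimal sketch, not the code in this PR; `ScriptDetect` and both method names are hypothetical.

```scala
import java.nio.charset.StandardCharsets
import java.nio.file.{Files, Path}

object ScriptDetect {
  // Anchored at the start of the line: "#!" followed by anything containing "scala"
  private val ShebangScala = "^#!.*scala".r

  def isScalaShebang(firstLine: String): Boolean =
    ShebangScala.findFirstIn(firstLine).isDefined

  /** Reads only the first line of the file, ignoring the file extension. */
  def isScalaScript(file: Path): Boolean = {
    val reader = Files.newBufferedReader(file, StandardCharsets.UTF_8)
    try Option(reader.readLine()).exists(isScalaShebang)
    finally reader.close()
  }
}
```

With this check, a file whose first line is `#!/usr/bin/env scala3.sh` is treated as a script whatever its name, while one starting with `#!/bin/bash` is not.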

changes to dotty.tools.scripting package:

  1. moved mainMethod execution from ScriptingDriver to Main
  2. detectMainMethod also detects main class name
  3. on -save option:
     a. generate same-name jar file in <scriptPath> parent directory
     b. "java.class.path" appended to context classpath with deduplication
     c. write "Main-Class" and "Class-Path" to jar manifest
  4. additional compiler args splitting and filtering
  5. throw sys.error if no .class files are generated (e.g., if script source is blank)
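For readers unfamiliar with jar manifests, step 3c above can be illustrated with the standard `java.util.jar` API. This is a sketch under assumptions, not the writeJar code from this PR: `ManifestSketch` and `buildManifest` are hypothetical names, and the classpath entries are assumed to already be in the space-free URL form that the Class-Path attribute expects.

```scala
import java.util.jar.{Attributes, Manifest}

object ManifestSketch {
  def buildManifest(mainClass: String, classPathEntries: Seq[String]): Manifest = {
    val manifest = new Manifest()
    val attrs = manifest.getMainAttributes
    // Manifest-Version must be present, otherwise the attributes are not written out
    attrs.put(Attributes.Name.MANIFEST_VERSION, "1.0")
    attrs.put(Attributes.Name.MAIN_CLASS, mainClass)
    // Class-Path is a space-separated list; distinct drops duplicates, keeping first-seen order
    attrs.put(Attributes.Name.CLASS_PATH, classPathEntries.distinct.mkString(" "))
    manifest
  }
}
```

A manifest built this way can be passed to the `JarOutputStream(OutputStream, Manifest)` constructor, which is what makes the resulting jar runnable with java -jar.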

added new tests to verify the following:

  1. hash bang section is ignored by compiler
  2. main class name in stack dump is as expected when main class is declared in script
  3. main class name in stack dump is as expected when main class is not declared in script
  4. script.name property matches scriptFile.getName
  5. without -save option, no jar file is generated
  6. -save option causes jar file with expected name to be generated
  7. generated jar file is directly executable via "java -jar <scriptFile>.jar"
@philwalk
Contributor Author

philwalk commented Feb 9, 2021

@anatoliykmetyuk - I didn't see your feedback this morning before I pushed code.

The implementation provided in this PR is compatible with your suggestion regarding 3rd party tools. What I am submitting provides a maximally simple near-term default scripting environment that isn't dependent on other tools, such as sbt or coursier. The manual parts can be automated, and it can be extended in a backward-compatible way. I have been using this approach with scala2 for 9 years, and it's quite robust.

This PR adds a SCALA_OPTS environment variable (see dist/bin/scala line 98).
A general scripting solution requires use of an @<argsfile> for specifying the classpath, due to console line-length limitations that prevent use of -classpath.

My default scripting environment has two components:

export SCALA_OPTS="@$HOME/.scalaScriptClasspath -save"
$HOME/.scalaScriptClasspath  # defines a very large classpath

The .scalaScriptClasspath file is automatically generated by an sbt target in my project.
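To make the arrangement above concrete, here is a rough sketch of how a SCALA_OPTS value with an @file argument could be expanded into plain compiler arguments. It is only an illustration: `ArgsFileSketch`, `splitOpts` and `expand` are hypothetical names, and the args file is assumed to contain whitespace-separated tokens with no quoting, which may not match how the real launcher or compiler parses it.

```scala
import java.nio.charset.StandardCharsets
import java.nio.file.{Files, Paths}

object ArgsFileSketch {
  /** Splits an options string on whitespace (assumption: no quoted arguments). */
  def splitOpts(opts: String): List[String] =
    opts.trim.split("\\s+").toList.filter(_.nonEmpty)

  /** Replaces each "@path" argument with the tokens read from that file. */
  def expand(args: List[String]): List[String] =
    args.flatMap {
      case arg if arg.startsWith("@") && arg.length > 1 =>
        val bytes = Files.readAllBytes(Paths.get(arg.drop(1)))
        splitOpts(new String(bytes, StandardCharsets.UTF_8))
      case arg => List(arg)
    }
}
```

Keeping the long classpath in a file sidesteps the console line-length limits mentioned above, because only the short @path token appears on the command line.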

With this arrangement, I can use scala scripts the same way I use python scripts, the only difference being the system used to install and update packages and libraries. In addition to fetching libraries, it would update the default @.

To attract python scripters we need to offer a tool (probably coursier) that provides manual "install" and "update" commands. We can also offer automatic library fetching, but it should be optional, especially if it requires non-standard "magic" import syntax.

If I understand it correctly, ammonite imports provide a way to resolve library versions within the script, but ammonite doesn't work on Windows, so it's a non-starter for writing portable scripts that run in most OS environments, which is my primary development requirement.

@philwalk
Contributor Author

philwalk commented Feb 10, 2021

I created a new PR and closed this one, based on your comment about 3rd party tools. See #11379

This capability is already present in the existing scripting driver: dotty/compiler/src/dotty/tools/scripting/ScriptingDriver.scala:44

@anatoliykmetyuk - it turns out that we also needed the "Main-Class" name in addition to the main method, so the return value was changed to a tuple containing both. Because the main class name is needed in the writeJar method, I moved detectMainMethod to Main.scala and have now renamed it detectMainClassAndMethod for clarity.

A few questions:

  1. how do I resolve the conflicts so this will run compile checks? The conflict may be the result of a missing 'blog' directory in my sandbox, or may be caused by checking code in from a Windows system. I will investigate.
  2. when I push to philwalk/dotty, the checks immediately fail with no useful information (that I can find) about why. Any suggestions you might have are welcome.
  3. is there a page somewhere that tells how to run tests in a docker container? I would like to use ci.yaml to verify the build before committing changes. BTW, all tests pass on my Windows 10 workstation, but a few fail in the WSL Ubuntu environment.

Thanks for the help!

@philwalk
Contributor Author

philwalk commented Feb 10, 2021

Most probably a scripting solution with such support will require integration with Coursier and hence would be good to split into a separate project that depends on the compiler.

@anatoliykmetyuk I like the idea of having scripting as a separate project that depends on the compiler, although in order for the solution to run in a single jdk process, it needs to be able to have the compiler run some code on its behalf, like this PR does. That suggests that I should move detectMainClassAndMethod back to ScriptingDriver and add the main class name to the parameters passed to Main. I will investigate.

Hopefully the release will include minimal scripting support so scripters can start migrating from scala2.

@philwalk
Contributor Author

This is superseded by PR #11379, a redesign intended to provide a base design that can be easily leveraged by 3rd party tool developers. It implements all the basic features familiar to scala2 script developers.

@philwalk philwalk deleted the extend-scripting-capability branch February 10, 2021 23:39