-
Notifications
You must be signed in to change notification settings - Fork 1k
Guide on binary compatibility for library authors #881
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Changes from 12 commits
0f11be5
8746ef6
dbe3fbc
61695cf
8aeac02
4b23622
3cbd13d
e5f1019
1d1c480
3710908
756fab8
a41fbfc
6acae7c
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,222 @@ | ||
--- | ||
layout: singlepage-overview | ||
title: Binary Compatibility for library authors | ||
|
||
partof: binary-compatibility | ||
permalink: /overviews/core/:title.html | ||
--- | ||
|
||
**By Jacob Wang** | ||
|
||
## Introduction | ||
|
||
A diverse and comprehensive set of libraries is important to any productive software ecosystem. While it is easy to develop and distribute Scala libraries, good library authorship goes | ||
beyond just writing code and publishing it. | ||
|
||
In this guide, we will cover the important topic of **Binary Compatibility**: | ||
|
||
* How binary incompatibility can cause production failures in your applications | ||
* How to avoid breaking binary compatibility | ||
* How to reason about and communicate the impact of their code changes | ||
|
||
Before we start, let's understand how code is compiled and executed on the Java Virtual Machine (JVM). | ||
|
||
## The JVM execution model | ||
|
||
Scala is compiled to a platform-independent format called **JVM bytecode** and stored in `.class` files. | ||
These class files are collated in JAR files for distribution. | ||
|
||
When some code depends on a library, its compiled bytecode references the library's bytecode. | ||
The library's bytecode is referenced by its class/method signatures and loaded lazily | ||
by the JVM classloader during runtime. If a class or method matching the signature is not found, | ||
an exception is thrown. | ||
|
||
As a result of this execution model: | ||
|
||
* We need to provide the JARs of every library used in our dependency tree when starting an application since the library's bytecode is only referenced, not merged into its user's bytecode | ||
* A missing class/method problem may only surface after the application has been running for a while, due to lazy loading. | ||
|
||
Common exceptions from classloading failures includes | ||
`InvocationTargetException`, `ClassNotFoundException`, `MethodNotFoundException`, and `AbstractMethodError`. | ||
|
||
Let's illustrate this with an example: | ||
|
||
Consider an application `App` that depends on `A` which itself depends on library `C`. When starting the application we need to provide the class files | ||
for all of `App`, `A` and `C` (something like `java -cp App.jar:A.jar:C.jar:. MainClass`). If we did not provide `C.jar` or if we provided a `C.jar` that does not contain some classes/methods | ||
which `A` calls, we will get classloading exceptions when our code attempts to invoke the missing classes/methods. | ||
|
||
These are what we call **Linkage Errors** -- errors that happen when the compiled bytecode references a name that cannot be resolved during runtime. | ||
|
||
### What about Scala.js and Scala Native? | ||
Similarly to the JVM, Scala.js and Scala Native have their respective equivalents of .class files, namely .sjsir files and .nir files. Similarly to .class files, they are distributed in .jars and linked together at the end. | ||
|
||
However, contrary to the JVM, Scala.js and Scala Native link their respective IR files at link time, so eagerly, instead of lazily at run-time. Failure to correctly link the entire program results in linking errors reported while trying to invoke `fastOptJS`/`fullOptJS` or `nativeLink`. | ||
|
||
Besides that difference in the timing of linkage errors, the models are extremely similar. **Unless otherwise noted, the contents of this guide apply equally to the JVM, Scala.js and Scala Native.** | ||
|
||
Before we look at how to avoid binary incompatibility errors, let us first | ||
establish some key terminologies we will be using for the rest of the guide. | ||
|
||
## What are Evictions, Source Compatibility and Binary Compatibility? | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I think there's some disconnect between this section and the previous one. It would be good to add some paragraph connecting the two so that the user clearly sees how this one links to the previous one. This eases readability. |
||
|
||
### Evictions | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I like how you explain evictions, well done. |
||
When a class is needed during execution, the JVM classloader loads the first matching class file from the classpath (any other matching class files are ignored). | ||
Because of this, having multiple versions of the same library in the classpath is generally undesirable: | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. This mixes up different things. Consider replacing that paragraph with something more obviously accurate, such as
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. What's inaccurate about the paragraph? I want to minimize the amount of keywords mentioned especially when they doesn't help the key message of the guide (e.g. OSGI). I don't think "only one version of any given library" is accurate. If I have There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. @jatcwang I think Sébastien's suggestion here makes sense, just make sure you add a paragraph for "(such as OSGi)" so that it has less importance. But the wording is superior in precision 😄. |
||
|
||
* Need to fetch and bundle multiple library versions when only one is actually used | ||
* Unexpected runtime behavior if the order of class files changes | ||
|
||
Therefore, build tools like SBT and Gradle will pick one version and **evict** the rest when resolving JARs to use for compilation and packaging. | ||
By default they pick the latest version of each library, but it is possible to specify another version if required. | ||
|
||
### Source Compatibility | ||
Two library versions are **Source Compatible** with each other if switching one for the other does not incur any compile errors or unintended behavioral changes (semantic errors). | ||
For example, If we can upgrade `v1.0.0` of a dependency to `v1.1.0` and recompile our code without any compilation errors or semantic errors, `v1.1.0` is source compatible with `v1.0.0`. | ||
|
||
### Binary Compatibility | ||
Two library versions are **Binary Compatible** with each other if the compiled bytecode of these versions can be interchanged without causing Linkage Errors. | ||
|
||
### Relationship between source and binary compatibility | ||
While breaking source compatibility often results in binary incompatibilities as well, they are actually orthogonal -- breaking one does not imply breaking the other. | ||
|
||
#### Forwards and Backwards Compatibility | ||
|
||
There are two "directions" when we describe compatibility of a library release: | ||
|
||
**Backwards Compatible** means that a newer library version can be used in an environment where an older version is expected. When talking about binary and source compatibility, | ||
this is the common and implied direction. | ||
|
||
**Forwards Compatible** means that an older library can be used in an environment where a newer version is expected. | ||
Forward compatibility is generally not upheld for libraries. | ||
|
||
Let's look at an example where library `A v1.0.0` is compiled with library `C v1.1.0`. | ||
|
||
{: style="width: 65%; margin: auto; display: block"} | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I like this example 👍 |
||
|
||
`C v1.1.0 ` is **Forwards Binary Compatible** with `v1.0.0` if we can use `v1.0.0`'s JAR at runtime instead of `v1.1.0`'s JAR without any linkage errors. | ||
|
||
`C v1.2.0 ` is **Backwards Binary Compatible** with `v1.1.0` if we can use `v1.2.0`'s JAR at runtime instead of `v1.1.0`'s JAR without any linkage errors. | ||
|
||
## Why binary compatibility matters | ||
|
||
Binary Compatibility matters because breaking binary compatibility has bad consequences on the ecosystem around the software. | ||
|
||
* End users have to update versions transitively in all their dependency tree such that they are binary compatible. This process is time-consuming and error-prone, and it can change the semantics of end program. | ||
* Library authors need to update their library dependencies to avoid "falling behind" and causing dependency hell for their users. Frequent binary breakages increase the effort required to maintain libraries. | ||
|
||
Constant binary compatibility breakages in libraries, especially ones that are used by other libraries, is detrimental to our ecosystem as they require time | ||
and effort from end users and maintainers of dependent libraries to resolve. | ||
|
||
Let's look at an example where binary incompatibility can cause grief and frustration: | ||
|
||
### An example of "Dependency Hell" | ||
|
||
Our application `App` depends on library `A` and `B`. Both `A` and `B` depends on library `C`. Initially, both `A` and `B` depends on `C v1.0.0`. | ||
|
||
{: style="width: 50%; margin: auto; display: block;"} | ||
|
||
Sometime later, we see `B v1.1.0` is available and upgrade its version in our build. Our code compiles and seems to work so we push it to production and go home for dinner. | ||
|
||
Unfortunately at 2am, we get frantic calls from customers saying that our application is broken! Looking at the logs, you find lots of `NoSuchMethodError` are being thrown by some code in `A`! | ||
|
||
{: style="width: 50%; margin: auto; display: block;"} | ||
|
||
Why did we get a `NoSuchMethodError`? Remember that `A v1.0.0` is compiled with `C v1.0.0` and thus calls methods available in `C v1.0.0`. | ||
While `B v1.1.0` and `App` has been recompiled with `C v2.0.0`, `A v1.0.0`'s bytecode hasn't changed - it still calls the method that is now missing in `C v2.0.0`! | ||
|
||
This situation can only be resolved by ensuring that the chosen version of `C` is binary compatible with all other evicted versions of `C` in your dependency tree. In this case, we need a new version of `A` that depends | ||
on `C v2.0.0` (or any other future `C` version that is binary compatible with `C v2.0.0`). | ||
|
||
Now imagine if `App` is more complex with lots of dependencies themselves depending on `C` (either directly or transitively) - it becomes extremely difficult to upgrade any dependencies because it now | ||
pulls in a version of `C` that is incompatible with the rest of `C` versions in our dependency tree! | ||
|
||
In the example below, we cannot upgrade to `D v1.1.1` because it will transitively pull in `C v2.0.0`, causing breakages | ||
due to binary incompatibility. This inability to upgrade any packages without breaking anything is commonly known as **Dependency Hell**. | ||
|
||
 | ||
|
||
How can we, as library authors, spare our users of runtime errors and dependency hell? | ||
|
||
* Use **Migration Manager** (MiMa) to catch unintended binary compatibility breakages before releasing a new library version | ||
* **Avoid breaking binary compatibility** through careful design and evolution of your library interfaces | ||
* Communicate binary compatibility breakages clearly through **versioning** | ||
|
||
## MiMa - Checking binary compatibility against previous library versions | ||
|
||
The [Migration Manager for Scala](https://github.com/lightbend/migration-manager) (MiMa) is a tool for diagnosing binary incompatibilities between different library versions. | ||
It works by comparing the class files of two provided JARs and report any binary incompatibilities found. | ||
Both backwards and forwards binary incompatibility can be detected by swapping input order of the JARs. | ||
|
||
By incorporating [MiMa SBT plugin](https://github.com/lightbend/migration-manager/wiki/Sbt-plugin) into your SBT build, you can easily check whether | ||
you have accidentally introduced binary incompatible changes. Detailed instruction on how to use the SBT plugin can be found in the link. | ||
|
||
We strongly encourage every library author to incorporate MiMa into their continuous integration and release workflow. | ||
|
||
Detecting backwards source compatibility is difficult with Scala due to language features like implicit | ||
and named parameters. The best approximation to checking backwards source compatibility is running | ||
both forwards and backwards binary compatibility check, as this can detect most cases | ||
of source-incompatible changes. For example, adding/removing public class members is a source | ||
incompatible change, and will be caught through forward + backward binary compatibility check. | ||
|
||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Someone at some point mentioned that MiMa can be used as the best approximation of source (in)compatibility detection. I think it was a comment in this very same PR. It may be interesting mentioning that here and also explaining that detecting source incompatibility automatically in Scala is extremely difficult because of complex scoping and certain language features like implicits, named parameters, etc. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I probably said that. Basically, checking both forward and backward bincompat with MiMa is the closest tool we've got to checking backward source compatibility. Because it prevents, for example, from adding public methods, which is one of those things that can break source compat. |
||
## Evolving code without breaking binary compatibility | ||
|
||
Binary compatibility breakages can often be avoided through the careful use of certain Scala features | ||
as well as some techniques you can apply when modifying code. | ||
|
||
For example, the use of these language features are a common source of binary compatibility breakages | ||
in library releases: | ||
|
||
* Default parameter values for methods or classes | ||
* Case classes | ||
|
||
You can find detailed explanations, runnable examples and tips to maintain binary compatibility in [Binary Compatibility Code Examples & Explanation](https://github.com/jatcwang/binary-compatibility-guide). | ||
|
||
Again, we recommend using MiMa to double check that you have not broken binary compatibility after making changes. | ||
|
||
## Versioning Scheme - Communicating compatibility breakages | ||
|
||
Library authors use versioning schemes to communicate compatibility guarantees between library releases to their users. Versioning schemes like [Semantic Versioning](http://semver.org/)(SemVer) allow | ||
users to easily reason about the impact of updating a library, without needing to read the detailed release notes. | ||
|
||
In the following section, we will outline a versioning scheme based on Semantic Versioning that we **strongly encourage** you to adopt for your libraries. The rules listed below are **in addition** to | ||
Semantic Versioning v2.0.0. | ||
|
||
### Recommended Versioning Scheme | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I would also like to encourage projects to document their versioning scheme and binary compatibility rules. Can we add a short section suggesting how a project might do this? If it follows this scheme, we should recommend linking to this document in the README. If not, the README should ideally include a short section explaining the rules for that project. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I'm trying to keep the word count down to not dilute the main takeaways of this tutorial. I think having another "maintaining an OSS Scala project" guide will be a good place to cover topics such as publishing to repositories and documenting versioning schemes. Speaking of which, I think it's a good idea to name this versioning scheme (e.g. "Scala Binary SemVer") instead of "that versioning scheme mentioned in that tutorial" There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I agree, while @gmethvin's suggestion is good I think it may be too much for this guide. |
||
|
||
* If backward **binary compatibility** is broken, **major version number** must be increased. | ||
* If backward **source compatibility** is broken, **minor version number** must be increased. | ||
* A change in **patch version number** signals **neither binary nor source incompatibility**. | ||
According to SemVer, patch versions should contain only bug fixes that fix incorrect behavior so major behavioral | ||
change in method/classes should result in a minor version bump. | ||
* When major version is `0`, a minor version bump **may contain both source and binary breakages**. | ||
|
||
Some examples: | ||
|
||
* `v1.0.0 -> v2.0.0` is <span style="color: #c10000">binary incompatible</span>. | ||
End users and library maintainers need to update all their dependency graph to remove all dependency on `v1.0.0`. | ||
* `v1.0.0 -> v1.1.0` is <span style="color: #2b2bd4">binary compatible</span>. Classpath can safely contain both `v1.0.0` and `v1.1.0`. End user may need to fix minor source breaking changes introduced | ||
* `v1.0.0 -> v1.0.1` is <span style="color: #2b2bd4">binary compatible</span>. This is a safe upgrade that does not introduce binary or source incompatibilities. | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. is source and binary compatible There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. done :) |
||
* `v0.4.0 -> v0.5.0` is <span style="color: #c10000">binary incompatible</span>. | ||
End users and library maintainers need to update all their dependency graph to remove all dependency on `v0.4.0`. | ||
* `v0.4.0 -> v0.4.1` is <span style="color: #2b2bd4">binary compatible</span>. Classpath can safely contain both `v1.0.0` and `v1.1.0`. End user may need to fix minor source breaking changes introduced | ||
|
||
Many libraries in the Scala ecosystem has adopted this versioning scheme. A few examples are [Akka](http://doc.akka.io/docs/akka/2.5/scala/common/binary-compatibility-rules.html), | ||
[Cats](https://github.com/typelevel/cats#binary-compatibility-and-versioning) and [Scala.js](https://www.scala-js.org/). | ||
|
||
## Conclusion | ||
|
||
Why is binary compatibility so important such that we recommend using the major version number to track it? | ||
|
||
From our [example](#why-binary-compatibility-matters) above, we have learned two important lessons: | ||
|
||
* Binary incompatibility releases often lead to dependency hell, rendering your users unable to update any of their libraries without breaking their application. | ||
* If a new library version is binary compatible but source incompatible, the user can fix the compile errors and their application should work. | ||
|
||
Therefore, **binary incompatible releases should be avoided if possible** and unambiguously documented | ||
when they happen, warranting the use of the major version number. Users of your library can then enjoy | ||
simple version upgrades and have clear warnings when they need to align library versions in their dependency tree | ||
due to a binary incompatible release. | ||
|
||
If we follow all recommendations laid out in this guide, we as a community can spend less time untangling dependency hell and more time building cool things! | ||
|
||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,10 @@ | ||
top to bottom direction | ||
|
||
component "A v1.0.0" as a | ||
component "C v1.1.0" as c1 | ||
component "C v1.0.0" as c0 | ||
component "C v1.2.0" as c2 | ||
|
||
a -down-> c1 | ||
c0 -[hidden]> c1 | ||
c1 -[hidden]> c2 |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,25 @@ | ||
component "App" as app | ||
component "A 1.0.0" as a1 | ||
component "B 1.0.0" as b1 | ||
component "C 1.0.0" as c_1 | ||
component "D 1.1.0" as d | ||
component "E 1.5.0" as e | ||
component "C 1.0.0" as c_2 | ||
component "C 1.0.0" as c_3 | ||
|
||
app -down-> a1 | ||
app -down-> b1 | ||
app -down-> d | ||
a1 -down-> c_1 | ||
b1 -down-> c_2 | ||
d -down-> e | ||
e -down-> c_3 | ||
|
||
cloud "Available Updates" { | ||
component "D 1.1.1" as d2 | ||
component "E 1.5.1" as e_new | ||
component "C 2.0.0" as c_new_1 | ||
|
||
d2 -down-> e_new | ||
e_new -down-> c_new_1 | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There should be a short subsection about the model of other platforms (Scala.js and Scala Native). Here is a draft:
Linking model of Scala.js and Scala Native
Similarly to the JVM, Scala.js and Scala Native have their respective equivalents of .class files, namely .sjsir files and .nir files. Similarly to .class files, they are distributed in .jars, and linked together at the end.
However, contrary to the JVM, Scala.js and Scala Native link their respective IR files at link time, so eagerly, instead of lazily at run-time. Failure to correctly link the entire program results in linking errors reported while trying to invoke
fastOptJS
/fullOptJS
ornativeLink
.Besides that difference in the timing of linkage errors, the models are extremely similar. Unless otherwise noted, the contents of this guide apply equally to the JVM, Scala.js and Scala Native.