Introduction of Dockerfile for building and running CBMC in a Docker container. #5715

Merged
NlightNFotis merged 3 commits into diffblue:develop from NlightNFotis:dockerfile
Jan 12, 2021

Conversation

NlightNFotis
Contributor

@NlightNFotis NlightNFotis commented Jan 4, 2021

Hi,

This PR adds support for building and running CBMC in a docker container.

I would appreciate all feedback.

  • Each commit message has a non-empty body, explaining why the change was made.
  • Methods or procedures I have added are documented, following the guidelines provided in CODING_STANDARD.md.
  • The feature or user visible behaviour I have added or modified has been documented in the User Guide in doc/cprover-manual/
  • Regression or unit tests are included, or existing tests cover the modified code (in this case I have detailed which ones those are in the commit message).
  • My commit message includes data points confirming performance improvements (if claimed).
  • My PR is restricted to a single feature or bugfix.
  • White-space or formatting changes outside the feature-related changed lines are in commits of their own.

@codecov

codecov bot commented Jan 4, 2021

Codecov Report

Merging #5715 (fbcc05b) into develop (c50c69d) will not change coverage.
The diff coverage is n/a.

Impacted file tree graph

@@           Coverage Diff            @@
##           develop    #5715   +/-   ##
========================================
  Coverage    69.50%   69.50%           
========================================
  Files         1243     1243           
  Lines       100690   100690           
========================================
  Hits         69989    69989           
  Misses       30701    30701           
Flag         Coverage Δ
cproversmt2  43.15% <ø> (ø)
regression   66.40% <ø> (ø)
unit         32.25% <ø> (ø)

Flags with carried forward coverage won't be shown.


Continue to review full report at Codecov.

Legend:
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update c50c69d...fbcc05b. Read the comment docs.

@NlightNFotis
Contributor Author

So, updates on some of the points raised so far, and the future direction of it:

  1. After a discussion with Hannes, it became clear that the image shouldn't be this big - it is only so large because the binaries I copied from my system are debug binaries, some of them over 100 MiB. There are two ways to take this: first, copying non-debug binaries should directly reduce the image size, but that leaves us dependent on the platform the binaries were built on (they need to be compatible with the container distro). So a better approach overall might be to build inside the container using multi-stage builds (see the sketch after this list). I will be investigating this next.
  2. I will also try to find which relevant tools are missing and install them in the container image (gcc, gdb, binutils, buildutils, etc.).
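
For illustration, a minimal multi-stage Dockerfile along the lines of point 1 could look like the sketch below. The base image, package list, and paths are assumptions made for the sketch and are not necessarily what this PR ends up using.

    # Stage 1: build CBMC from source inside a throwaway build image.
    FROM ubuntu:20.04 AS builder
    ENV DEBIAN_FRONTEND=noninteractive
    RUN apt-get update && \
        apt-get install -y --no-install-recommends \
            g++ gcc flex bison make cmake ninja-build patch && \
        rm -rf /var/lib/apt/lists/*
    COPY . /tmp/cbmc
    WORKDIR /tmp/cbmc
    RUN cmake -S . -Bbuild -G Ninja -DCMAKE_BUILD_TYPE=Release && \
        cmake --build build

    # Stage 2: copy only the built binaries into a slim runtime image,
    # so the compilers and the build tree never reach the final image.
    FROM ubuntu:20.04
    # gcc is kept because cbmc calls out to it for preprocessing sources.
    RUN apt-get update && \
        apt-get install -y --no-install-recommends gcc && \
        rm -rf /var/lib/apt/lists/*
    COPY --from=builder /tmp/cbmc/build/bin/ /usr/local/bin/
    ENTRYPOINT ["cbmc"]

With something of this shape, docker build -t cbmc . from the repository root would produce the final image containing only the second stage.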

@NlightNFotis
Contributor Author

Update: yesterday @thomasspriggs and I managed to get multi-stage builds working. The PR is more or less ready for proper review now, but there's at least one more thing I would like to do before I consider it finished: add documentation for using the Dockerfile.

@NlightNFotis NlightNFotis force-pushed the dockerfile branch 2 times, most recently from 545ae4d to 0b2fb74 Compare January 6, 2021 13:52
@NlightNFotis NlightNFotis marked this pull request as ready for review January 6, 2021 13:53
@NlightNFotis NlightNFotis changed the title First draft of a dockerfile for CBMC. Introduction of Dockerfile for building and running CBMC in a Docker container. Jan 6, 2021
Contributor

I thought you'd add the CI tests for this in this PR. I know we track these as separate work items, but that doesn't mean we need to create a PR for each, right? The problem with merging it as-is is that right now it's untested code.

@NlightNFotis NlightNFotis requested a review from a team as a code owner January 8, 2021 11:52
@NlightNFotis NlightNFotis force-pushed the dockerfile branch 2 times, most recently from ce33d7e to eaef117 Compare January 8, 2021 12:00
@@ -349,3 +349,24 @@ jobs:
- uses: actions/checkout@v2
- name: Check for unused irep ids
run: ./scripts/string_table_check.sh

Contributor

⛏️ Might be worth stating in the commit title that this is a PR check. Otherwise it might be confused with the release actions when looking through the commit history later. Something like "Add github action on PRs to check the Dockerfile".
⛏️ The commit title should not end with a full stop. See point 4, "Do not end the subject line with a period", in How to Write a Git Commit Message.

Contributor

LGTM

@NlightNFotis NlightNFotis force-pushed the dockerfile branch 2 times, most recently from 510e8bb to 9e27b60 Compare January 8, 2021 14:59
@tautschnig
Collaborator

So, updates on some of the points raised so far, and the future direction of it:

  1. After a discussion with Hannes, it became clear that the image shouldn't be this big - it is only so large because the binaries I copied from my system are debug binaries, some of them over 100 MiB. There are two ways to take this: first, copying non-debug binaries should directly reduce the image size, but that leaves us dependent on the platform the binaries were built on (they need to be compatible with the container distro). So a better approach overall might be to build inside the container using multi-stage builds. I will be investigating this next.

You might be interested in #5646.

dpkg-dev
COPY . /tmp/cbmc
WORKDIR /tmp/cbmc
RUN cmake -S . -Bbuild -G Ninja -DCMAKE_BUILD_TYPE=Release -DCMAKE_C_COMPILER=/usr/bin/gcc -DCMAKE_CXX_COMPILER=/usr/bin/g++ && cd build; ninja -j2
Collaborator

I don't think we need the -DCMAKE_*_COMPILER settings as those will be no different from the system defaults. And ninja -C build -j2 would spare you the cd build.

Or cmake --build build
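
Putting the two suggestions together, the build step could be collapsed to something like the following (a sketch only; the flags actually kept in the final Dockerfile may differ):

    RUN cmake -S . -Bbuild -G Ninja -DCMAKE_BUILD_TYPE=Release && \
        cmake --build build -j2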

Contributor Author

Hm, when it comes to the WORKDIR I think it might be better to keep it. From the Docker documentation it seems there's no longer any great need to avoid extra instructions, since only a few of them introduce new layers, and I think it's cleaner this way.

What do you think about this?

Add a Dockerfile that allows CBMC to be built and
run in a Docker container.

The Dockerfile is set up to allow multi-stage
builds, so that CBMC is built in one container and
the build artifacts are then copied into another.

This adds a github action that builds CBMC in a
Docker container, and then exercises a small suite
of smoke tests on it to make sure that the image
is sound.
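
For reference, a PR-triggered job of that shape, appended under the existing jobs: section of the workflow, might look roughly like the sketch below. The job name, image tag, and smoke-test commands are illustrative rather than copied from the commit, and it assumes the image's entrypoint is the cbmc binary.

    check-docker-image:
      runs-on: ubuntu-20.04
      steps:
        - uses: actions/checkout@v2
        - name: Build the CBMC Docker image
          run: docker build -t cbmc:pr-check .
        - name: Smoke-test the image
          run: |
            docker run --rm cbmc:pr-check --version
            printf 'int main() { int x; __CPROVER_assert(x == x, "trivial"); }\n' > smoke.c
            docker run --rm -v "$PWD":/mnt -w /mnt cbmc:pr-check smoke.c
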
@martin-cs
Collaborator

@NlightNFotis : what is the use-case for this? If it is intended to be a way to distribute CBMC then should creating it be added to the CI or release automation?

@NlightNFotis
Contributor Author

Hi @martin-cs

Yes, my understanding is that this is one potential distribution method for us to maintain. The plan for this is as follows:

  1. There's going to be a github action check, run per PR, which just builds CBMC under Docker and runs some basic image sanity checks (this is only to ensure that the image itself isn't broken by the changes introduced), and
  2. There's going to be follow-up work (most likely in a different PR) whose job is to push a newly built image (corresponding to the newly released version) to Dockerhub.

Step 1 is present in this PR if you want to review it (it's the last commit).

@martin-cs
Collaborator

@NlightNFotis Thanks for the fast response. At the risk of showing my ignorance about docker...

  1. Why not run the regression and unit tests in the docker container? They are our usual criteria for "is the system working".
  2. If so, why not simply have one of the CI builds be in a docker container and then export the relevant container once everything has run? That way we treat docker as 'just another platform' and have the same levels of testing and assurance on it.

How often will the docker image be produced and 'released' to dockerhub? Every PR? Every merge? Every release?

@NlightNFotis
Contributor Author

Hi @martin-cs, no problem, these are all good questions. Regarding:

  1. We thought it was unnecessary. The reasoning is that the Docker container is nothing special, just an Ubuntu 20.04 build (which we already build and test in CI), so for reasons of economy of time (and a simpler first implementation) we felt we could make do with a very small suite of smoke tests that tells us whether the build is broken. An extensive test suite run inside the Docker container wouldn't be terribly useful: it would just prolong the build without offering much more information than we already get from the other jobs.
  2. This is what the next PR is aiming for. However, I don't think it's a good idea to export the container at every PR (by "export" I'm assuming we both mean pushing the image to Dockerhub), both because the image might be broken and because it would produce a huge (i.e. spammy) collection of low-quality releases. A better idea would be to have an action that, on each new CBMC release, builds a fresh Docker image and pushes that, similar to how we update the Homebrew package for CBMC at release (see the sketch after this list).
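
A release-triggered publish workflow could look roughly like the sketch below; the action versions, secret names, and image name are assumptions made for illustration and are not part of this PR.

    on:
      release:
        types: [published]

    jobs:
      publish-docker-image:
        runs-on: ubuntu-20.04
        steps:
          - uses: actions/checkout@v2
          - uses: docker/login-action@v1
            with:
              username: ${{ secrets.DOCKERHUB_USERNAME }}
              password: ${{ secrets.DOCKERHUB_TOKEN }}
          - uses: docker/build-push-action@v2
            with:
              context: .
              push: true
              tags: diffblue/cbmc:${{ github.event.release.tag_name }}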

@martin-cs
Collaborator

  1. Could you move the Ubuntu 20.04 build /into/ the docker container or is that not how it works? Or take the binaries from that build that have been tested?
  2. Given the discussion about rolling releases, I think a docker container per release makes much more sense than per PR.

Out of interest and a desire to keep a vague track of how people use CBMC, is this driven by specific user requests or because "we" feel like it is something that should be available?

@NlightNFotis
Contributor Author

Hi @martin-cs,

  1. Well, in theory yes. Taking the binaries out might be a little awkward, but I think it's doable. My preference though would be for the existing CI infrastructure not to change, for one reason: consistency. I think it would be odd if the macOS and Windows builds happened outside containers while the Linux build ran inside one.

Moving the binaries from the other build into the container is harder, though, as that would involve different VMs (the ones spawned by the different github actions), so we would need a way to store these binaries before they are either copied into the container or mounted into it from another directory (see the sketch after this list).

  2. Yes, this is the plan now.
  3. Yes, this is driven by a specific user request for these images.
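
One way to pass binaries between jobs, sketched purely as an illustration (the artifact and directory names here are made up), would be the standard upload/download-artifact actions:

    # In the Ubuntu 20.04 job that builds and tests CBMC:
    - uses: actions/upload-artifact@v2
      with:
        name: cbmc-binaries
        path: build/bin/

    # In a later job that assembles the Docker image from those binaries:
    - uses: actions/download-artifact@v2
      with:
        name: cbmc-binaries
        path: docker-context/bin/
    - run: docker build -t cbmc:ci docker-context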

As an aside, we were experimenting with speeding up the build inside the container by using ccache; if we manage that, it might also become easier to include more elaborate testing (by virtue of dropping the total build time). But that is still heavily experimental, so we would rather proceed without ccache or heavier testing for now; it can be added at a later point.
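
For the record, wiring ccache into the container build would look roughly like the sketch below; the BuildKit cache mount and the CMake launcher flags are assumptions about how it could be done, not something this PR implements.

    # Assumes BuildKit with the docker/dockerfile:1.2 syntax, which provides
    # RUN cache mounts; the ccache state then persists across image rebuilds.
    RUN apt-get update && apt-get install -y --no-install-recommends ccache
    RUN --mount=type=cache,target=/root/.ccache \
        cmake -S . -Bbuild -G Ninja -DCMAKE_BUILD_TYPE=Release \
              -DCMAKE_C_COMPILER_LAUNCHER=ccache \
              -DCMAKE_CXX_COMPILER_LAUNCHER=ccache && \
        cmake --build build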

@martin-cs
Collaborator

@NlightNFotis This all sounds good, thanks.

I would dispute a little that consistency is a good thing in the CI builds; I think we want diverse builds. It would be infeasibly expensive to try all possible combinations, but I think the set of builds should cover all of the options (if not all of the combinations). Michael has two commits for this happening with SAT solvers, and I believe some of the build options have this kind of diversity as well. So I would be OK with one of the Linux builds being in a Docker container that we then keep.

@NlightNFotis
Contributor Author

@martin-cs Would it be okay if we do it at a later time?

My plan for this is to get ccache working reliably inside the container; if we manage that, then we could port bits of the existing Ubuntu job to ensure that the container-based build and the VM-based one are at parity. Otherwise I'm slightly nervous about making this change now (it looks like a relatively big change, with a lot of scope for breaking the existing CI infrastructure).

@martin-cs
Collaborator

@NlightNFotis Sure! I'm not in charge of CI or release, so, do as you wish. 😄

@NlightNFotis NlightNFotis merged commit 291c973 into diffblue:develop Jan 12, 2021
@NlightNFotis NlightNFotis deleted the dockerfile branch January 12, 2021 16:19