|
| 1 | +\page cpp-api-and-modularisation libcprover-cpp and Modularisation |
| 2 | + |
| 3 | +**Date**: 22 Jun 2023 |
| 4 | +**Updated**: 22 Jun 2023 |
| 5 | +**Author**: Fotis Koutoulakis, [email protected] |
| 6 | +**Domain**: Architecture, API |
| 7 | +**Description**: This document outlines our thinking about the rearchitecting of |
| 8 | + CBMC using the C++ API (`libcprover-cpp`) as the central module and the |
| 9 | + transitioning of other tools to use that as a basis. |
| 10 | + |
| 11 | +## Motivation && Current State of Affairs |
| 12 | + |
| 13 | +CProver is a collection of tools fitting various program analysis needs. CProver |
| 14 | +evolved out of the codebase of the model-checking tool for C (`CBMC`) and has |
| 15 | +since been extended with various front-ends, back-ends and auxiliary |
| 16 | +tools. |
| 17 | + |
| 18 | +During this time, the repository has grown organically, using some guidelines |
| 19 | +for development that were based on tradition and intuition rather than some |
| 20 | +agreed architectural approach. This development model has been successful for most |
| 21 | +of CProver's life, based on its nature as a hybrid industrial/academic and |
| 22 | +experimental/applied tool. However, this has had the side-effect of accruing |
| 23 | +some code duplication and technical debt. Consequently, the codebase is complex |
| 24 | +and difficult to understand and develop for. This is a large barrier to new |
| 25 | +developers, to new features, and to improving and fixing the existing CProver |
| 26 | +tools. |
| 27 | + |
| 28 | +The above concerns have generated discussions about the breaking down of |
| 29 | +CProver into modules, with cleaner interfaces and tighter boundary control |
| 30 | +between them, making the code easier to integrate in other projects (by making |
| 31 | +the various component modules easier to combine and reuse) and making the |
| 32 | +codebase easier to understand and maintain. |
| 33 | + |
| 34 | +Separating functionality into distinct functional units |
| 35 | +would also remove duplication and bloat, and prevent issues with the |
| 36 | +same flag behaving in different ways across CProver tools. |
| 37 | + |
| 38 | +## The Plan Going Forward |
| 39 | + |
| 40 | +Given the above outline, we have reached a point where we are strongly motivated |
| 41 | +to take action to better componentise and modularise CProver. What we mean by |
| 42 | +"better componentise and modularise" is that right now, even though there |
| 43 | +exists some structure between different CBMC components (at the class level |
| 44 | +or at the source file level), the different components aren't cleanly |
| 45 | +separated in terms of boundaries/concerns, which hinders their reusability |
| 46 | +and understandability. |
| 47 | + |
| 48 | +It is also an opportune time for us, given the existence of `libcprover-cpp`, |
| 49 | +the C++ API that we built to support interfacing with Rust (for now; other |
| 50 | +languages may follow in the future). We can use this as the basis of |
| 51 | +development for an API exposing the interfaces of the various other modules, |
| 52 | +and incrementally refactor those modules into the better-defined shape we |
| 53 | +want them to take. |
| 54 | + |
| 55 | +Of course, this is a project that is massive in scope and at risk of |
| 56 | +further scope creep. We acknowledge that carrying out what we have |
| 57 | +described above is going to be a multi-year effort from our end, |
| 58 | +and that we will need community alignment to achieve the outcome we are |
| 59 | +looking for. |
| 60 | + |
| 61 | +This is why we are looking into testing the approach on a smaller component |
| 62 | +first, to get a better feel for the amount of effort and any challenges |
| 63 | +lurking in the dark. |
| 64 | + |
| 65 | +## `goto-bmc` and `libcprover-cpp` |
| 66 | + |
| 67 | +One of the objectives of our modularisation efforts is to decouple the |
| 68 | +various components `CBMC` is based on (front-ends, back-ends, etc.) to allow |
| 69 | +for reuse/recombination. As a first step towards the larger effort, we wanted |
| 70 | +a tool focusing only on running symex (the backend of our analysis engine) |
| 71 | +on a GOTO-binary that has been preprocessed into an appropriate form. |
| 72 | + |
| 73 | +We took the first steps for that in [cbmc#7762](https://github.com/diffblue/cbmc/pull/7762). |
| 74 | + |
| 75 | +The aim of the PR was not only to bring the narrower-scope tool |
| 76 | +to life, but also to see if we could expose just enough of the |
| 77 | +process to the C++ API and use that as the basis of the new tool. |
| 78 | + |
| 79 | +This whole process has been very informative: we found out not only that |
| 80 | +we *can* use the C++ API in that capacity, but also that extending the API |
| 81 | +as and when we need to, and doing the various refactorings to the other tools |
| 82 | +on a Just-In-Time basis, is viable. |
| 83 | + |
| 84 | +There have been, however, some limitations: |
| 85 | + |
| 86 | +* The C++ API is still nascent, and as such its support for various workflows |
| 87 | + is just not present (yet). We can (and do) build things fast, as and |
| 88 | + when we need them, but at the time of this writing it is nowhere near |
| 89 | + feature complete enough to cover all of a user's workflows. |
| 90 | + |
| 91 | +* Still on the C++ API, its usage as the implementation basis for the new |
| 92 | + tool `goto-bmc` means that the code structure and patterns in `goto-bmc` |
| 93 | + are going to differ from other existing tools in the CProver suite. |
| 94 | + |
| 95 | +* CProver tools have primarily been based on textual output to report on the |
| 96 | + results of their function (be it analysis, or transformations, etc). This |
| 97 | + has not been a problem up until this point (with the caveat that occasionally |
| 98 | + requests for support of new textual formats come up and adding support for |
| 99 | + those has become a laborious process). |
| 100 | + |
| 101 | + There is a need, however, for a separation of concerns between the production |
| 102 | + of results by the analysis engine and the presentation layer for those results. |
| 103 | + |
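
To make the last limitation above more concrete, the following is a minimal sketch of what separating result production from result presentation could look like. Every name in it (`property_resultt`, `run_analysis_stub`, `result_presentert` and the two presenters) is hypothetical and invented for illustration; none of them is part of `libcprover-cpp` or any existing CProver tool, and the stubbed results are made-up sample data.

```cpp
#include <cassert>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical types for illustration only; not part of libcprover-cpp.
enum class property_statust { PASS, FAIL };

struct property_resultt
{
  std::string property_id;
  std::string description;
  property_statust status;
};

// Production side: returns structured data and never formats any text.
// Stubbed here with made-up sample results instead of a real analysis run.
std::vector<property_resultt> run_analysis_stub()
{
  return {
    {"main.assertion.1", "assertion x > 0", property_statust::PASS},
    {"main.assertion.2", "assertion y != 0", property_statust::FAIL}};
}

// Presentation side: each output format is one implementation of this
// interface, so adding a format does not touch the analysis code.
class result_presentert
{
public:
  virtual ~result_presentert() = default;
  virtual std::string
  render(const std::vector<property_resultt> &results) const = 0;
};

class plain_text_presentert : public result_presentert
{
public:
  std::string
  render(const std::vector<property_resultt> &results) const override
  {
    std::ostringstream out;
    for(const auto &r : results)
    {
      out << '[' << r.property_id << "] " << r.description << ": "
          << (r.status == property_statust::PASS ? "SUCCESS" : "FAILURE")
          << '\n';
    }
    return out.str();
  }
};

class csv_presentert : public result_presentert
{
public:
  std::string
  render(const std::vector<property_resultt> &results) const override
  {
    std::ostringstream out;
    for(const auto &r : results)
    {
      out << r.property_id << ','
          << (r.status == property_statust::PASS ? "PASS" : "FAIL") << '\n';
    }
    return out.str();
  }
};
```

A caller would obtain the results once and hand them to whichever presenter it needs, for example `csv_presentert{}.render(run_analysis_stub())`. Under this assumed design, supporting a new textual format means adding one more presenter class rather than changing the code that produces the results.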
| 104 | +We are working towards addressing these teething problems, but until they |
| 105 | +are resolved, we have to accept some compromises in the architecture |
| 106 | +of the code while we iterate on and stabilise several of the new or |
| 107 | +refactored parts. |
| 108 | + |
| 109 | +Be advised that some questionable-looking constructs may pop up |
| 110 | +in a limited number of locations in the codebase. We only ask for |
| 111 | +some patience while we are working out the best way to refactor them into |
| 112 | +an architecture that is more cohesive with the long term vision for the |
| 113 | +platform. |
| 114 | + |
| 115 | +From our end, we will do our best to avoid any spillover effects to |
| 116 | +other areas of the codebase, and to avoid introducing any behavioural |
| 117 | +regressions while we are implementing the above plan. Any constructs |
| 118 | +that may look "questionable" will be marked as such |
| 119 | +and accompanied by an explanation of why the decision was made. |