Skip to content

Piped process handler for CBMC/SMT solver IPC #6151

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 4 commits into from
Jul 26, 2021

Conversation

NlightNFotis
Copy link
Contributor

@NlightNFotis NlightNFotis commented May 25, 2021

This PR is related to #6134 .

This is adding a piped process handler class, with the bulk of it extracted/re-arranged
from src/memory-analyzer/gdb_api.cpp.

The idea is that we want to abstract away that interface (since it is code that has been
validated once, and it's known to be working - or have been in the past) and use it to
allow IPC with SMT solvers in the new incremental SMT backend (described in #6134).
Additionally, the gdb_api.cpp is going to be adapted at a later time to use this interface.

  • Each commit message has a non-empty body, explaining why the change was made.
  • Methods or procedures I have added are documented, following the guidelines provided in CODING_STANDARD.md.
  • The feature or user visible behaviour I have added or modified has been documented in the User Guide in doc/cprover-manual/
  • Regression or unit tests are included, or existing tests cover the modified code (in this case I have detailed which ones those are in the commit message).
  • My commit message includes data points confirming performance improvements (if claimed).
  • My PR is restricted to a single feature or bugfix.
  • White-space or formatting changes outside the feature-related changed lines are in commits of their own.

/// \author Diffblue Ltd.

#ifdef _WIN32
// Windows includes go here
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FWIW I think this PR either needs the windows code OR a deliberate and acknowledged decision that some functionality is simply not going to be supported on Windows. I'm fine with either but I think this PR is the decision point.

( Or, maybe, a "Linux" build on WSL as the "windows version")

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We are planning to do implementation for both Windows and non-windows OS. We are just doing the non-windows support first. I guess splitting up the PRs vs a single larger PR is a separate discussion.


while(p)
{
res = reinterpret_cast<char **>(realloc(res, sizeof(char *) * ++n_spaces));
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Couldn't you have a std::vector<char *> to do the segmentation and then create the char * array at the end?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I would have split std::string into std::vector<string> and then converted that into the required char** style array.

process_statet process_state;

public:
bool send(const std::string &message);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In some implementations of the STL it used to be possible to create an ostream or an istream based on a file descriptor. If that were possible it would give a much cleaner and more consistent interface to the pipe.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

std::string statement = "(echo \"hi\")";
std::string termination_statement = "(exit)";
piped_processt process = piped_processt(binary);
// Wait a moment for z3 to start
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Eh? Why? Pipes should be reliable and asynchronous.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pipes are asynchronous, but the pipes here are non-blocking. This means if you try to read immediately the other process may have not written anything to the pipe yet and you can read nothing (and then fail the following test).

Having the ability to read from a pipe that is not ended/closed is significant to being able to interact (e.g. with a solver) without requiring it to be terminated/closed after each request for a response. That said, without an explicit synchronisation system (yet?) it will not always be clear when the pipe is ready to be read. An alternate is to read lines, but this requires knowing how many lines to read, which may not be known in advance (e.g. how many lines make up a trace?) or doing parsing/checking after each line to know when to continue/stop. (This PR is clearly WIP, mostly I believe created to test the CI and various builds, it is not ready for review or merging.)


TEST_CASE("We create a pipe, interact", "[core][util][piped_process]")
{
std::string binary = "z3 -in";
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could we use CPROVER's SMT solver instead of Z3 so that the unit tests don't add another dependency?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The problem is that depending on the build type, the location of the SMT solver will be different (in an out-of-source cmake build vs. a make one) so this looks hairy to implement.

We will probably revisit this decision in a bit.

Copy link
Contributor

@thomasspriggs thomasspriggs left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some initial comments.

FILE *response_stream;
FILE *command_stream;
int pipe_input[2];
int pipe_output[2];
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 These streams and pipes are operating system specific implementation details. It might be cleaner to use the "PImpl" coding pattern to keep all the #ifdef _WIN32 stuff in the .cpp file. This should:

  • Make the header file easier to read for programmers of other parts of the code base who just want to use this API.
  • Mean that we can keep the windows specific includes in the .cpp file, if we need fields with types based on windows.h when we come to do the windows support. This should also reduce compile times for the rest of the code base by avoiding putting windows.h in our header file.

This seems like a reasonable introduction to the code pattern if you aren't familiar with it - https://en.cppreference.com/w/cpp/language/pimpl We should be able to just leave off the std::experimental::propagate_const stuff it mentions.

@codecov
Copy link

codecov bot commented May 26, 2021

Codecov Report

Merging #6151 (814d951) into develop (7486222) will increase coverage by 0.17%.
The diff coverage is 47.51%.

Impacted file tree graph

@@             Coverage Diff             @@
##           develop    #6151      +/-   ##
===========================================
+ Coverage    75.25%   75.43%   +0.17%     
===========================================
  Files         1454     1461       +7     
  Lines       160796   161662     +866     
===========================================
+ Hits        121006   121942     +936     
+ Misses       39790    39720      -70     
Impacted Files Coverage Δ
...bytecode_convert_method/convert_invoke_dynamic.cpp 0.00% <ø> (ø)
unit/catch/catch.hpp 32.19% <ø> (+0.02%) ⬆️
unit/util/piped_process.cpp 28.57% <28.57%> (ø)
unit/util/sharing_map.cpp 98.14% <66.66%> (ø)
src/util/piped_process.cpp 75.00% <75.00%> (ø)
...it/testing-utils/require_vectors_equal_unordered.h 100.00% <100.00%> (ø)
unit/util/dense_integer_map.cpp 100.00% <100.00%> (ø)
unit/util/small_map.cpp 100.00% <100.00%> (ø)
unit/util/string_utils/split_string.cpp 100.00% <100.00%> (ø)
src/ansi-c/literals/convert_float_literal.cpp 73.80% <0.00%> (-19.05%) ⬇️
... and 134 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 6397c1e...814d951. Read the comment docs.

else
{
// parent process here
// Close pipes to be used by the child process
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

❔ Is this closing whole pipes or just access to specific ends of specific pipes by specific processes?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Specific ends of specific pipes for the process involved.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This just closes one end of the pipe.

# include <util/piped_process.h>

TEST_CASE(
"We create a pipe and we can read from it",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
"We create a pipe and we can read from it",
"Creating a sub process and reading its output.",

}

// Default state
process_state = process_statet::NOT_CREATED;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⛏️ Might be cleaner to put this field initialisation into the constructors initialiser list.

{
enum class process_statet
{
NOT_CREATED,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Can we get rid of this NOT_CREATED status, based on the piped_processt not being constructed if the construction fails? This status is unreachable on successful return of the constructor, based on all error conditions throwing an exception. The CREATED state could then be renamed to RUNNING, or we could simplify things even further by using an is_stopped bool instead of an enum.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There may be some potential in having an enum with values like RUNNING, STOPPED, and ERROR, but agree that NOT_CREATED is probably not useful.

@NlightNFotis NlightNFotis force-pushed the pipes_infr branch 3 times, most recently from 8037d97 to 4fe6690 Compare June 7, 2021 09:28
@NlightNFotis NlightNFotis force-pushed the pipes_infr branch 2 times, most recently from 4832ee8 to d55b8ce Compare June 7, 2021 09:37
else
{
// parent process here
// Close pipes to be used by the child process
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Specific ends of specific pipes for the process involved.

@@ -252,3 +255,23 @@ std::string wrap_line(

return result;
}

char **string_to_cstr_array(const std::string &s, const char *delims)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this should be named better - the name doesn't really imply that this is actually splitting the string... Also, given what this is used for (splitting a command string intro arguments) it's probably not complete/safe either - it doesn't handle any kind of quoting of spaces which would/might be needed for safety... e.g. if the command string is something like "mysolver --config-file '/path/with spaces/config'" I think this would break....

REQUIRE(response == std::string("hi"));
REQUIRE(response.length() == 2);

res = process.send(termination_statement);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⛏️ Can we have a different const variable with a different name, rather than reusing an existing variable?

piped_processt process = piped_processt(binary);

piped_processt::process_send_responset res = process.send(statement);
REQUIRE(res == piped_processt::process_send_responset::SUCCEEDED);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 If you inline process.send(statement) into this requirement, then catch will give more informative error messages in the case where this fails. This is because the error messaging will show process.send(statement) on that side of the equals instead of just res. REQUIRE(process.send(statement) == piped_processt::process_send_responset::SUCCEEDED);

process_statet get_status();

/// See if this process can receive data from the other process.
/// \param timeout Amount of time to wait before timing out and
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

❓ What unit of time is timeout in? This should be documented, if not part of the parameter name.

@NlightNFotis NlightNFotis force-pushed the pipes_infr branch 12 times, most recently from 8a84963 to f6d6e92 Compare June 10, 2021 14:13
@NlightNFotis NlightNFotis force-pushed the pipes_infr branch 3 times, most recently from f8192da to 93c3e14 Compare June 10, 2021 14:29
@NlightNFotis NlightNFotis marked this pull request as ready for review June 10, 2021 15:45
# include <testing-utils/use_catch.h>
# include <util/optional.h>
# include <util/piped_process.h>
# include <util/string_utils.h>
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

❔ Won't these 4 includes be required regardless of whether the tests are being run on Windows or not?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Probably. Although without a Windows implementation this does not matter either way.

{
// can_receive(PIPED_PROCESS_INFINITE_TIMEOUT) waits an ubounded time until
// there is some data
can_receive(PIPED_PROCESS_INFINITE_TIMEOUT);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⛏️ Seems like fairly gratuitous macro usage to me. Why not just use an ordinary constant, defined in either the function or the class. Such as - const auto infinite_timeout = optionalt<std::size_t>{};

/// Wait for the pipe to be ready, waiting specified time between
/// checks. Will return when the pipe is ready or the other process
/// is not in a process_statet::CREATED state (i.e. error has occured).
/// \param wait_time Time spent in usleep() between checks of can_receive(0)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⛏️ Please can you document the unit of time here? I think I can safely assume this isn't Planck Time, but is it seconds, microseconds, milliseconds, etc?

@TGWDB TGWDB mentioned this pull request Jun 15, 2021
7 tasks
Copy link
Contributor

@thomasspriggs thomasspriggs left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My remaining comments amount to nit picks and ideas for improvement. I think this looks pretty good overall. My preference would be to get this PR merged sooner, rather than later and to perform any further improvements as follow up PRs.

std::vector<std::string> commands;
commands.push_back("/bin/echo");
commands.push_back(to_be_echoed);
piped_processt process = piped_processt(commands);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 The command vector could be initialised inline in order to avoid mutating a vector with push_back. For example - piped_processt process{{"/bin/echo", to_be_echoed}};

@NlightNFotis NlightNFotis force-pushed the pipes_infr branch 2 times, most recently from 0c5b91c to 1e2419c Compare June 16, 2021 16:16
{ \
}

class piped_processt
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

📖 Give an overview (with example) of how this API should be used.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's now under doc/API/util/piped_process.md.

"[core][util][piped_process]")
{
std::vector<std::string> commands;
commands.push_back("z3");
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How do we tell CBMC developers who want to run the unit tests that they need Z3?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This (and all other tests referencing z3) is now hidden behind a hide tag . - [.z3].

An ordinary invocation of bin/unit without any tags will skip all of these tests. I will modify the CI scripts to invoke the unit test binary with the z3 flag passed as well - this we felt is the best compromise between not requiring z3 for the unit tests for an ordinary user, and testing the functionality we need to test in CI as well as locally.

NlightNFotis and others added 2 commits July 15, 2021 16:27
Integration of our piped-process handling class into CBMC.
A prototype of this was developed as a standalone executable,
and after the proof of concept was validated, it was refactored
and integrated into CBMC.

Co-authored-by: Fotis Koutoulakis <[email protected]>
Co-authored-by: Thomas Given-Wilson <[email protected]>
Add validation code for the `piped_processt` class, in
the form of Catch unit tests, under `unit/util/`.

Co-authored-by: Fotis Koutoulakis <[email protected]>
Co-authored-by: Thomas Given-Wilson <[email protected]>
@NlightNFotis NlightNFotis requested a review from a team as a code owner July 15, 2021 15:27
@@ -107,6 +107,7 @@ jobs:
- name: Run unit tests
run: |
make -C unit test
make -C unit test "[z3]"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm curious why this is in some of the builds but not others. On the one hand I understand that testing this is largely unrelated to the various flavours of make/ubuntu and so testing once per OS is sufficient. On the other hand, this makes the tests between OS (versions) less consistent and maybe has potential for lack of clarity.

{
enum class process_statet
{
NOT_CREATED,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There may be some potential in having an enum with values like RUNNING, STOPPED, and ERROR, but agree that NOT_CREATED is probably not useful.

# include <testing-utils/use_catch.h>
# include <util/optional.h>
# include <util/piped_process.h>
# include <util/string_utils.h>
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Probably. Although without a Windows implementation this does not matter either way.

…e `[z3]` tag.

This allows us to run the (now hidden) z3 jobs, so that `z3` is not
required for a argument-less invocation of the unit-test binary.
Copy link
Contributor

@chrisr-diffblue chrisr-diffblue left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

While it would be nice to have the windows implementation ready as well, this is here for a dedicated purpose that still an ongoing task, so I'm OK with it going in as is.

@NlightNFotis
Copy link
Contributor Author

Hi @tautschnig, would it be possible to review this PR at your earliest convenience?

We are blocked on a code owner's review from you to get this moving forward.

Thanks for your time 🙏🏻

@NlightNFotis NlightNFotis merged commit 7601ad5 into diffblue:develop Jul 26, 2021
@NlightNFotis NlightNFotis deleted the pipes_infr branch July 26, 2021 12:45
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants