Skip to content

Rfc sat backend mergesat #6088

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed

Conversation

conp-solutions
Copy link

This is a RFC patch to investigate whether moving the SAT backend to a more recent solver would be beneficial. This patch set is scappy. In case the integration of the new solver results in performance benefits, required changes could be worked into the used MergeSat solver, and this patch series could be cleaned up.

I am not aware how the performance of the SAT backend can be tested. Please advice. Thanks!

  • Each commit message has a non-empty body, explaining why the change was made.
  • Methods or procedures I have added are documented, following the guidelines provided in CODING_STANDARD.md.
  • The feature or user visible behaviour I have added or modified has been documented in the User Guide in doc/cprover-manual/
  • Regression or unit tests are included, or existing tests cover the modified code (in this case I have detailed which ones those are in the commit message).
  • My commit message includes data points confirming performance improvements (if claimed).
  • My PR is restricted to a single feature or bugfix.
  • White-space or formatting changes outside the feature-related changed lines are in commits of their own.

@@ -62,7 +62,7 @@ if("${CMAKE_CXX_COMPILER_ID}" STREQUAL "Clang" OR
set(CMAKE_CXX_FLAGS_RELEASE "-O2")
set(CMAKE_CXX_FLAGS_RELWITHDEBINFO "-O2 -g")
# Enable lots of warnings
set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -Wall -Wpedantic -Werror -Wno-deprecated-declarations -Wswitch-enum")
set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -Wall -Wpedantic -Wno-deprecated-declarations -Wswitch-enum")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The policy of the current build is to error on warnings, this should probably be discussed independently of other PRs. If this is required for this PR to work, I would recommend explaining why the warnings cannot be avoided.

Comment on lines +122 to +133
@echo "Downloading MergeSat (replacing Minisat 2.2.1)"
@for i in $$(seq 1 3) ; do \
$(DOWNLOADER) \
http://ftp.debian.org/debian/pool/main/m/minisat2/minisat2_2.2.1.orig.tar.gz && \
https://github.com/conp-solutions/mergesat/archive/refs/tags/v3.0.tar.gz && \
exit 0 ; \
$(RM) minisat2_2.2.1.orig.tar.gz ; \
$(RM) v3.0.tar.gz ; \
if [ $$i -lt 3 ] ; then echo "Re-trying in 10 seconds" 1>&2 ; sleep 10 ; fi ; \
done ; exit 1
@$(TAR) xfz minisat2_2.2.1.orig.tar.gz
@$(TAR) xfz v3.0.tar.gz # will create mergesat-3.0
@rm -Rf ../minisat-2.2.1
@mv minisat2-2.2.1 ../minisat-2.2.1
@(cd ../minisat-2.2.1; patch -p1 < ../scripts/minisat-2.2.1-patch)
@rm minisat2_2.2.1.orig.tar.gz
@mv mergesat-3.0 ../minisat-2.2.1
@rm v3.0.tar.gz
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This appears to be editing and replacing the current MiniSAT build process. I would much prefer to see a new build path for MergeSat (like the ones for CaDiCaL or glucose) and then if this has much improved performance we can simply change the default. Also see recent PRs for documenting and updating alternative build systems (e.g. #6075 #6047).

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes; absolutely.

@@ -5,7 +5,7 @@ BUILD_ENV = AUTO
ifeq ($(BUILD_ENV),MSVC)
#CXXFLAGS += /Wall /WX
else
CXXFLAGS += -Wall -pedantic -Werror -Wno-deprecated-declarations -Wswitch-enum
CXXFLAGS += -Wall -pedantic -Wno-deprecated-declarations -Wswitch-enum
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Again, this should be separate.

@@ -17,7 +17,7 @@ endif
ifneq ($(MINISAT2),)
MINISAT2_SRC=sat/satcheck_minisat2.cpp
MINISAT2_INCLUDE=-I $(MINISAT2)
MINISAT2_LIB=$(MINISAT2)/minisat/simp/SimpSolver$(OBJEXT) $(MINISAT2)/minisat/core/Solver$(OBJEXT)
MINISAT2_LIB=$(MINISAT2)/minisat/simp/SimpSolver$(OBJEXT) $(MINISAT2)/minisat/core/Solver$(OBJEXT) $(MINISAT2)/minisat/utils/ccnr$(OBJEXT) $(MINISAT2)/minisat/utils/Options$(OBJEXT) $(MINISAT2)/minisat/utils/System$(OBJEXT)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Again, editing the MiniSAT and not providing a new option.

Comment on lines +204 to +212
$(MINISAT2)/minisat/utils/ccnr$(OBJEXT): $(MINISAT2)/minisat/utils/ccnr.cc
$(CXX) $(CP_CXXFLAGS) /w /nologo /c /EHsc $< /Fo$@

$(MINISAT2)/minisat/utils/Options$(OBJEXT): $(MINISAT2)/minisat/utils/Options.cc
$(CXX) $(CP_CXXFLAGS) /w /nologo /c /EHsc $< /Fo$@

$(MINISAT2)/minisat/utils/System$(OBJEXT): $(MINISAT2)/minisat/utils/System.cc
$(CXX) $(CP_CXXFLAGS) /w /nologo /c /EHsc $< /Fo$@

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Again, editing.

@conp-solutions
Copy link
Author

Thanks for the comments. I fully understand that I should create a separate back-end. I can happily to so, in case the performance of the proposed SAT back-end exceeds the performance of the current one.

This PR is an RFC. In case we decide to move forward, I will address all the comments. I could adapt MergeSat so that dropping "-Werror" is not required any more (or provide a CBMC local patch as is done for other solvers).

Until then, I would love to understand whether there is an agreed way to measure the performance of a new back-end. Thanks!

To be able to build less quality code, allow compilation warnings.

This is only meant as a helper step to allow test new SAT backends.

Signed-off-by: Norbert Manthey <[email protected]>
MergeSat is a recent SAT solver that fits the MiniSat interface. This
change uses MergeSat in place of Minisat2. For this test, no deep changes
are performed. Besides replacing the package to be downloaded, the
required build dependencies are adapted.

To use MergeSat as a proper SAT backend, some extensions might be dropped.
Especially being able to print memory usage is not helpful for CBMC, but
requires pulling in a new dependency.

Signed-off-by: Norbert Manthey <[email protected]>
MergeSat is based on the MiniSat 2.2 interface. Hence, setPolarity needs
a bool instead of an lbool.

Signed-off-by: Norbert Manthey <[email protected]>
@conp-solutions conp-solutions force-pushed the rfc-sat-backend-mergesat branch from e4b1bde to 8d60dae Compare May 7, 2021 19:52
@TGWDB
Copy link
Contributor

TGWDB commented May 10, 2021

Thanks for the comments. I fully understand that I should create a separate back-end. I can happily to so, in case the performance of the proposed SAT back-end exceeds the performance of the current one.

This PR is an RFC. In case we decide to move forward, I will address all the comments. I could adapt MergeSat so that dropping "-Werror" is not required any more (or provide a CBMC local patch as is done for other solvers).

Great.

Until then, I would love to understand whether there is an agreed way to measure the performance of a new back-end. Thanks!

Recent performance work has explored some different benchmarks including: full runs on regression tests, runs over open source repositories (with cbmc validation checks), and clear test examples. An example of a PR with some performance data can be found here: #5964

@martin-cs
Copy link
Collaborator

@tautschnig has probably the best scripting for doing large scale performance evaluation. But I think this might be a little bit of a red herring.

There are already a number of back-ends that often have better performance than the default one. However MiniSAT remains the default because:

  1. It's simple and requires little external dependency.
  2. Its performance is remarkably consistent and scales predictably.

So, patches adding new solvers (SAT, SMT, others) are most welcome and should be merged eagerly. Changing the default is going to require a lot of benchmarking, including against the SMT solvers and a lot of time to see if there are performance regression in actual applications.

PS Is the IPASIR format still a going concern? That might be an easier route to integrate new solvers.

@tautschnig
Copy link
Collaborator

Here are results for two of the categories of SV-COMP, with Mergesat on the x axis of the scatter plots, and Minisat on the y axis. As can be seen, the results vary in both directions. On other categories the results are more consistent in that Mergesat is about 10% slower, but that slowdown does not affect overall scores. Hence, no clear-cut "solver X is uniformly better," which probably is the expected result anyway.

Screenshot 2021-05-14 at 11 19 08

Screenshot 2021-05-14 at 11 19 23

What is much better about Mergesat, of course, is that it is maintained :-)

@tautschnig tautschnig removed their assignment May 14, 2021
@TGWDB
Copy link
Contributor

TGWDB commented May 26, 2021

@conp-solutions Just checking in to see where you are with this PR/RFC.

@conp-solutions
Copy link
Author

conp-solutions commented May 26, 2021 via email

@TGWDB
Copy link
Contributor

TGWDB commented May 26, 2021

Closing as recommended by opening author. Look forward to the new SAT back end in the future.

@TGWDB TGWDB closed this May 26, 2021
@TGWDB TGWDB mentioned this pull request Jun 9, 2021
6 tasks
@tautschnig tautschnig mentioned this pull request Nov 4, 2021
4 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants