Replace test.pl by Python-based implementation, possibly within pytest #1924

tautschnig · 2018-03-12T11:05:24Z

See #1907 for discussion.

owen-mc-diffblue · 2018-03-20T09:10:28Z

I've looked into pytest-xdist, which is a plugin for pytest to enable parallel running of tests. It seems to work. We have quite a small set of regression tests so it isn't a great comparison. Also, one of the tests takes much longer than the others (10:00 out of 11:39). Parallelising (with automatically chosen number of cores) reduces the time from 11:39 to 10:27. If I take out that long test then it reduces the time from 1:20 to 0:30. (All times reported are the "real" time obtained from time.)

tautschnig · 2018-03-20T15:14:14Z

@owen-jones-diffblue Thanks a lot for following up! Which test is it that vastly dominates the run-time? Maybe we've been working hard on solving the wrong problem...

owen-mc-diffblue · 2018-03-20T15:17:57Z

@tautschnig This was on a private repo, where the tests have already been written using pytest. And yes, we have plans for making that test a lot faster.

karkhaz · 2019-02-18T14:33:37Z

Soon come ;-)

usage: test.py [-h] -c CMD [-C] [-T] [-F] [-K] [-I TAG] [-X TAG] [-i REGEX]
               [-j N] [-n] [-p] [-s S] [-f]
               [DIRS [DIRS ...]]

Run regression tests according to a test.desc spec

optional arguments:
  -h, --help            show this help message and exit
  -i REGEX, --ignore REGEX
                        options in test.desc matching REGEX are ignored
  -j N, --jobs N        run N parallel jobs. Default: $TESTPL_JOBS if set,
                        else 1
  -n, --dry-run         print the tests that would be run, don't actually run
                        them
  -p, --print-logs      print the logs of each failed test (if any)
  -s S, --suffix S      append S to all output and log files. Allows
                        concurrent testing of the same desc file with
                        different commands or options, as runs with different
                        suffixes will operate independently and keep
                        independent logs
  -f, --forward         forward the test name to CMD

required arguments:
  -c CMD, --command CMD
                        run tests on CMD

test selection:
  DIRS                  run only tests in DIRS
  -C, --core            run essential tests (default if no tests are selected)
  -T, --thorough        run expensive tests
  -F, --future          run checks for future features
  -K, --known-bug       run tests associated with known bugs
  -I TAG, --include TAG
                        run only tests that have the given secondary tag (can
                        be passed multiple times)
  -X TAG, --exclude TAG
                        exclude tests that have the given secondary tag (can
                        be passed multiple times)

test.py expects a test.desc file in each subdirectory. The file test.desc
follows the format specified below. Any line starting with // will be ignored.

<level> [<tag1> [<tag2>...]]
<main source>
<options>
<activate-multi-line-match>
<required patterns>
--
<disallowed patterns>
--
<comment text>

where
  <level>                         is one of CORE, THOROUGH, FUTURE or KNOWNBUG
  <tag1>, <tag2> ... <tagn>       are zero or more space-separated secondary tags, for use with the -I and -X parameters
  <main source>                   is a file with extension .c/.i/.gb/.cpp/.ii/.xml/.class/.jar
  <options>                       additional options to be passed to CMD
  <activate-multi-line-match>     The fourth line can optionally be activate-multi-line-match, if this is the
                                  case then each of the rules will be matched over multiple lines (so can contain)
                                  the new line symbol (\n) inside them.
  <required patterns>             one or more lines of regualar expressions that must occur in the output
  <disallowed patterns>           one or more lines of expressions that must not occur in output
  <comment text>                  free form text

owen-mc-diffblue · 2019-02-19T09:30:48Z

@karkhaz Have you rewritten test.pl using python? Or are you using a framework, like pytest? (I had to briefly check that it wasn't the 1st of April.)

karkhaz · 2019-02-19T09:43:32Z

@owen-jones-diffblue It's not finished, but I'm rewriting it from scratch in python, and trying to ensure that it doesn't use any packages from outside the standard library. I actually thought about waiting until April to post the PR, just to be contrary; doing it twice just might make it a tradition...

chrisr-diffblue · 2019-02-19T17:17:06Z

One thing to consider (if this is possible to fix in the new test runner) is that some of the current regression tests are not thread safe (e.g. if you try to distribute running the tests over multiple cpu's), particularly where one regression directory contains multiple '.desc' files. One such example is regression/goto-cc-cbmc/mixed-c-library-goto-main/*.desc - if the two .desc files in that directory happen to get run in parallel, then it's a race as to which test generates a .gb file that gets analysed. Fixing that would be awesome :)

owen-mc-diffblue · 2019-02-19T17:21:40Z

@karkhaz On April 1st you can announce that rewriting test.pl was too hard, so you're rewriting cbmc in perl instead 😄.

karkhaz · 2019-02-19T17:27:05Z

@chrisr-diffblue are multiple .desc files the only situation where this happens? Would it suffice for me to parallelise all directories, but to always serialize .desc files within one directory?

chrisr-diffblue · 2019-02-19T17:34:33Z

I think thats the main threading issue I'm aware of, so yes, that would probably be one approach - though that does still mean that if you want the test runner to leave behind the intermediate files (e.g. the .gb files for debugging purposes, you have similar confusing behaviour. Probably the most robust approach would be to do something like generate a new temporary working directory for each .desc, and then invoke the appropriate chain,sh or tool from within that working directory (fixing up file paths and include paths as appropriate....). e.g. for a file foo.desc, the runner would create a foo.run directory, which after running the test would contain any intermediate files, plus the foo.out. Cleaning up after a test run would then just be an rm -r of foo.run.

chrisr-diffblue · 2019-02-19T17:38:04Z

Oh, and "Default: $TESTPL_JOBS if set" - are we keeping the "PL" for sentimental reasons, or should that be "TESTPY_JOBS" now :-)

karkhaz · 2019-02-19T19:07:13Z

Thank you, a fresh directory for each .desc file sounds sensible---I'll do that. My job is a lot easier due to the plethora of regression tests, which serve to test regressions between test.pl and test.py as well as regressions of cbmc :)

$TESTPL_JOBS: this was intentional. My rationale for this and other features is that I'd prefer not to break people's scripts. So at least for this first implementation, I'm going to strive to keep the command-line interface and other external settings identical. On the other hand, if there are long-standing annoyances with test.pl, this could certainly be an opportunity to finally fix them cleanly, so please feel free to point me to any issues that I might be able to resolve. The parallel .desc file issue is a nice example.

chrisr-diffblue · 2019-02-19T21:51:45Z

As far as TESTPL_JOBS is concerned, I was mostly just being tongue-in-cheek - keeping as its current name seems entirely sensible. Its great to see someone finally tackling this and helping out everyone in the processes!

hannes-steffenhagen-diffblue · 2019-02-26T11:32:48Z

Just a stray comment: One mild annoyance with the way test.pl right now is the colouring, which is unconditionally applied, which is fine if you're viewing the test log on travis or on your terminal, but a nuisance if you're writing to a file or viewing the log on AWS CodeBuild. I'd like to change the behaviour to

By default, only colour on terminal output (and even then only terminals that support colour codes if we have a sensible way of detecting that without additional dependencies)
Have an environment variable (and possibly a flag) to override that. TESTPL_COLOUR=on|off|auto, something along those lines, when you actually want to write the colour codes despite not writing to terminal?

TGWDB · 2021-06-14T14:21:55Z

Closing since I guess this is no longer happening. Please reopen if you believe this is erroneous, or have the mythical python implementation working.

hannes-steffenhagen-diffblue mentioned this issue Mar 12, 2018

Use environment variable for jobs in test.pl #1925

Merged

karkhaz mentioned this issue Apr 1, 2019

Reimplement CBMC in POSIX shell #4473

Closed

TGWDB closed this as completed Jun 14, 2021

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Replace test.pl by Python-based implementation, possibly within pytest #1924

Replace test.pl by Python-based implementation, possibly within pytest #1924

tautschnig commented Mar 12, 2018

owen-mc-diffblue commented Mar 20, 2018 •

edited

Loading

tautschnig commented Mar 20, 2018

owen-mc-diffblue commented Mar 20, 2018

karkhaz commented Feb 18, 2019

owen-mc-diffblue commented Feb 19, 2019

karkhaz commented Feb 19, 2019

chrisr-diffblue commented Feb 19, 2019

owen-mc-diffblue commented Feb 19, 2019

karkhaz commented Feb 19, 2019

chrisr-diffblue commented Feb 19, 2019

chrisr-diffblue commented Feb 19, 2019

karkhaz commented Feb 19, 2019

chrisr-diffblue commented Feb 19, 2019

hannes-steffenhagen-diffblue commented Feb 26, 2019

TGWDB commented Jun 14, 2021

Replace test.pl by Python-based implementation, possibly within pytest #1924

Replace test.pl by Python-based implementation, possibly within pytest #1924

Comments

tautschnig commented Mar 12, 2018

owen-mc-diffblue commented Mar 20, 2018 • edited Loading

tautschnig commented Mar 20, 2018

owen-mc-diffblue commented Mar 20, 2018

karkhaz commented Feb 18, 2019

owen-mc-diffblue commented Feb 19, 2019

karkhaz commented Feb 19, 2019

chrisr-diffblue commented Feb 19, 2019

owen-mc-diffblue commented Feb 19, 2019

karkhaz commented Feb 19, 2019

chrisr-diffblue commented Feb 19, 2019

chrisr-diffblue commented Feb 19, 2019

karkhaz commented Feb 19, 2019

chrisr-diffblue commented Feb 19, 2019

hannes-steffenhagen-diffblue commented Feb 26, 2019

TGWDB commented Jun 14, 2021

owen-mc-diffblue commented Mar 20, 2018 •

edited

Loading