Use more efficient algorithm to calculate dominators #1787

hannes-steffenhagen-diffblue · 2018-01-31T17:15:27Z

Use Lengauer&Tarjan's algorithm to calculate the tree of
immediate dominators instead of the previous algorithm,
which gives us a significant performance improvement.

Also, instead of storing the dominators in a std::set we
now just store the immediate dominators and make lookups
in there instead. This is a significant performance improvement
as long as the dominators set are only iterated over or only
queried infrequently, as is currently the case, and also
saves a lot of memory in the case of large functions (previous
memory usage was quadratic with function size, now it is linear).

In cases where a lot of queries are made against the same set of dominators,
they can still be copied to a local std::set prior to that.

chrisr-diffblue · 2018-01-31T17:20:23Z

Would it be possible to split this into two commits (still all in the one PR), with the first commit being the change to Lengauer&Tarjan's algorithm, and then the second commit being the changes around the use of std::set ? That might make it a little clearer to review the changes?

mgudemann · 2018-02-01T08:23:12Z

two regression tests fail due to core dumps

  Running full_slice1/test.descbash: line 1: 12676 Aborted                 (core dumped) ../../../src/cbmc/cbmc --full-slice --property main.assertion.1 --unwind 1 'main.c' > 'test.out' 2>&1
  [FAILED]
  Running full_slice2/test.descbash: line 1: 12681 Aborted                 (core dumped) ../../../src/cbmc/cbmc --full-slice --property main.assertion.2 --unwind 1 'main.c' > 'test.out' 2>&1
  [FAILED]

caused by Error: attempt to compare iterators from different sequences.

mgudemann · 2018-02-01T08:24:17Z

out of interest, what is the performance improvement ?

martin-cs · 2018-02-01T08:26:14Z

out of interest, what is the performance improvement ?

And the benchmarks? My guess is that it is substantial for larger programs.

tautschnig

I need to dig in my emails as some years ago I had changed the implementation according to some paper, but completely failed to document this in the source (and git log doesn't seem to be helpful either). Using dominator trees is definitively a major step forward, though.

tautschnig · 2018-02-01T08:27:28Z

src/analyses/cfg_dominators.h

+  enum class Constness
+  {
+    Constant,
+    NotConstant


No CaMeLCase, please.

tautschnig · 2018-02-01T08:32:15Z

src/analyses/cfg_dominators.h

-    bool changed=false;
-    typename cfgt::nodet &node=cfg[cfg.entry_map[current]];
-    if(node.dominators.empty())
+    void dfs(node_indext root)


Is this not depth-first search on a grapht? If it isn't, please add comments. If it is, please move the implementation to util/graph.h.

It is dfs on a grapht, yes. How would you like it moved to graph.h? Like

template<typename Graph, typename Function> void depth_first_search(const Graph& graph, typename graph::index_t start_node, Function function);?

Couldn't it become a member of grapht?

Sure, what I meant was if you wanted it to take a "per node work" function as a parameter.

Yes, your suggestion is good, except I’d recommend making it a member function.

tautschnig · 2018-02-01T08:37:34Z

unit/analyses/cfg_dominators.cpp

+      REQUIRE(log_doms.find(dominators_data.data[4]) != log_doms.end());
+    }
+  }
+}


Missing end-of-line.

tautschnig · 2018-02-01T09:06:31Z

So here's what I found in my emails: the present implementation follows Cooper et al. "A Simple, Fast Dominance Algorithm", which claims to be faster in practice than Lengauer&Tarjan (even though worst-case complexity is incomparable). The changes to follow Cooper et al. were done in e6f36d9.

The move to Cooper et al. was motivated by poor performance observed while running my Debian experiments. I'd thus echo the request to provide details on benchmarking that has been done.

hannes-steffenhagen-diffblue · 2018-02-01T16:19:41Z

@mgudemann W.r.t the regression test failure, unfortunately I have a bit of a case of "doesn't happen on my machine here". This error is raised by my set iterators when iterators created from different sets are compared against each other; If this error is raised either my logic when creating these iterators is screwy or somewhere in the code iterators are used in an unsound manner. I'll investigate.

hannes-steffenhagen-diffblue · 2018-02-01T16:27:49Z

@tautschnig As to the Cooper algorithm: The current version is not a correct implementation of it: they use a special data structure to perform fast set intersections using some knowledge about how the graph is traversed, while we do not. This data structure is the main reason the algorithm runs fast on small graphs.

Second, the problem this PR is addressing (or at least trying to start to address) is the performance with large graphs, which Cooper suggests will be worse than Tarjan; Cooper suggests their algorithm runs better on practical graphs, where they consider "practical" to mean with < 1000 nodes. The graphs we were having troubles with are significantly larger than that.

hannes-steffenhagen-diffblue · 2018-02-01T17:15:54Z

@tautschnig @martin-cs @mgudemann

Regarding benchmarks, I've created some artificial test cases looking like this:

// v1
#include <assert.h>
int main(void) {
  int x;
  x ^= -253;
  assert(x > 0);
}

// v2
#include <assert.h>
int main(void) {
  int x;
  if (x < 0) { x = x ^ -965; } else { x = x ^ -189; }
  assert(x > 0);
}

Where the middle line is repeated (with different numbers) N times. I've tested with N=100, 1000 and 10000 (called small, medium and large respectively). I've run time goto-analyzer $file --dependence-graph --show on each file with the current code, with the tarjan algorithm and with the tarjan algorithm + the lazy set version, 10 times each collecting the mean and standard deviation. All times in seconds.

File	Original	Tarjan Algorithm	Tarjan with lazy set
v1/small.c	mean=0.371, sd=0.01370320	mean=0.370, sd=0.01490712	mean=0.353, sd=0.008232726
v1/medium.c	mean=0.560, sd=0.02000000	mean=0.571, sd=0.02424413	mean=0.520, sd=0.011547005
v1/large.c	mean=6.291, sd=0.74570548	mean=5.609, sd=0.32188507	mean=2.729, sd=0.019119507
v2/small.c	mean=0.425, sd=0.02415229	mean=0.426, sd=0.01955050	mean=0.408, sd=0.006324555
v2/medium.c	mean=2.018, sd=0.01549193	mean=1.571, sd=0.08556349	mean=1.259, sd=0.011972190
v2/large.c	mean=169.184, sd=28.12850203	mean=96.477, sd=14.38272424	mean=55.728, sd=0.396226647

As you can see, with large graphs the difference is quite substantial (overall 3 times faster in the v2/large case). I've also used callgrind to get some more detailed information, which is where I noticed that in the tarjan case we spend nearly all the time in the last step building the sets, which is how I got the idea to use lazy sets (my tarjan without lazy sets version is almost identical, but uses normal std::sets in assign_dominators).

Here are the results of running the v2/large set with callgrind. Notice that the current version spends nearly 70% of the time in the fixedpoint step, with the tarjan+lazy set version it's only 0.1% even though the overall running time is significantly lower. This performance gained at that point is partially paid back in parts of the code using the dominators because of the linear lookup time, but the differential in my test cases implies it's still worth doing.

Current version:

With tarjan algorithm:

With tarjan + lazy set:

Of course there's a risk that these test files happen to be pathological cases, @martin-cs suggested I try to run these against small/medium open source projects which I'm going to do tomorrow. Please let me know if there's any other statistic you're interested in.

tautschnig · 2018-02-01T19:33:55Z

For benchmarking, would you be able to play with packages from http://debian-fortification.net/? The files in the various goto-binaries.tar.bz2 should do.

tautschnig · 2018-02-02T13:06:15Z

As the current interface at http://debian-fortification.net/ is very cumbersome I am attaching a list of URLs of >9000 goto-binary archives: all-binaries-latest.txt

hannes-steffenhagen-diffblue · 2018-02-02T14:52:09Z

@tautschnig Thanks, I'm looking into one of those right now.

martin-cs · 2018-02-02T16:00:35Z

@tautschnig For general context it might be useful to know the original purpose of this change was to handle very large goto-programs. For example up to 50,000 instructions. These occur when you inline large sections of programs (long story; can explain offline). Thus we need to think not only about the practical speed for the usual use cases but also about the asymptotic behaviour.

tautschnig · 2018-02-02T19:05:35Z

@martin-cs Understood, and you’ll find code of this kind in the collection of binaries/Debian packages - it’s really been 10,000+ instruction functions that prompted me to first debug the scalability failures and then implement Cooper et al.

hannes-steffenhagen-diffblue · 2018-02-05T09:55:19Z

@tautschnig Do you remember one or two in particular that were causing problems?

tautschnig · 2018-02-05T10:02:21Z

@tautschnig Do you remember one or two in particular that were causing problems?

I'm afraid I don't, and I don't think I would have any logs dating back >5 years. But anyway don't focus on that too much: just run your experiments on as many of those binaries, which in turn yields real-world data in addition to the data obtained on your artificially crafted code.

hannes-steffenhagen-diffblue · 2018-02-07T11:35:56Z

@tautschnig

I've been checking some of the examples you gave me, here are my results

file	Original	Tarjan Algorithm	Tarjan with lazy set
gzip	mean=1.864, sd=0.0275681	mean=1.863, sd=0.05498485	mean=1.892, sd=0.01549193
gzip (inlined)	OOM	OOM	~20 minutes
dmenu	mean=0.217, sd=0.009486833	mean=0.217, sd=0.006749486	mean=0.216, sd=0.00843274
dmenu (inlined)	mean=14.019, sd=0.08646772	mean=9.381, sd=0.05646041	mean=7.972, sd=0.03994441
grep	mean=9.603, sd=0.1028537	mean=9.448, sd=0.1350555	mean=9.448, sd=0.08689713
git	mean=67.886, sd=0.6079876	mean=66.864, sd=0.2416241	mean=67.231, sd=0.580181
nc	mean=0.401, sd=0.0218327	mean=0.402, sd=0.02440401	mean=0.393, sd=0.01888562
nc (inlined)	mean=1.151, sd=0.01197219	mean=1.083, sd=0.008232726	mean=1.057, sd=0.0105935

Methodology as before, "inlined" means that the binary was produced from the original with goto-instrument --function-inline main (see table below for size differences - note especially that inlined gzip has a 100 fold size increase over the original, which gives an explanation for the huge difference in runtime). I didn't inline all of them as even with the smaller binaries I've been running into memory problems, but I think this much testing should be sufficient to notice a trend. In general, for "normal" binaries with mostly small functions we can see that the difference in overall runtime between the 3 versions is too small to conclusively say that one is better than the other, but for the inlined versions which have huge functions we can see some significant performance differences favouring the new version.

file	original	inlined
gzip	2.2M	225M
dmenu	356K	2.0M
nc	496K	608K

tautschnig · 2018-02-07T13:43:10Z

Thanks a lot for undertaking these experiments!!! If you have produced any scripts that could be of repeated value, feel free to drop them in the scripts/ folder (duplicate work is to be avoided...).

As all CI tasks are failing, could you please address this? Then a proper review on my part is due...

hannes-steffenhagen-diffblue · 2018-02-07T15:40:01Z

@tautschnig The travis and AppVeyor tasks are failing for different reasons. I know why the AppVeyor one is failing and it's easy enough to address, I'm still investigating the travis one.

tautschnig · 2018-02-07T18:50:41Z

Running in gdb (when building using -D_GLIBCXX_DEBUG) makes debugging the iterator comparison problems pretty easy, but I suppose that's the obvious part.

hannes-steffenhagen-diffblue · 2018-02-09T18:16:52Z

Not quite "done done", but this should only require minor changes to pass the remaining tests.

tautschnig

By the author's own admission, this code is "ununderstandable" - that needs to be fixed before a serious review can be conducted.

I'm all for improving the performance, but replacing a rather simple code by something "ununderstandable" to handle some edge case (which is what I consider 10k+-statement functions) doesn't seem to be justified.

tautschnig · 2018-02-10T16:03:17Z

src/analyses/cfg_dominators.h

@@ -16,20 +16,163 @@ Author: Georg Weissenbacher, [email protected]
 #include <list>
 #include <map>
 #include <iosfwd>
-#include <cassert>
+#include <algorithm>
+#include <fstream>


Why would that be necessary here?

Leftover from debugging, removing

tautschnig · 2018-02-10T16:04:47Z

src/analyses/cfg_dominators.h

+template<typename T, typename node_indext>
+struct dominatorst {
+  using datat = dominators_datat<T, node_indext>;
+  static constexpr node_indext npos = std::numeric_limits<node_indext>::max();


As far as I recall that doesn't work with Visual Studio.

It does work with VS2015+; In this case this can just be changed to const but there was some discussion on moving the minimum requirement to VS2015.

If that can be done for AppVeyor then surely that's fine - but for now the build fails at that line. Or is it just a missing #include <limits>?

AppVeyor was failing because it didn't recognize constexpr, but is indeed missing, thanks for spotting this

tautschnig · 2018-02-10T16:06:33Z

src/analyses/cfg_dominators.h

+
+  datat *dominators_data;
+  node_indext node_index;
+  mutable std::size_t cached_distance;


Shouldn't this be a class with those three being made private? Certainly true for a pointer.

There's no reason for it not to be, so sure I can change this.

tautschnig · 2018-02-10T16:13:25Z

src/analyses/cfg_dominators.h

+      //
+      INVARIANT(data == other.data,
+                "iterators from different sets are not comparable");
+      return data != other.data || current_index != other.current_index;


This check suggests that the INVARIANT above could be violated.

True, fixing this.

tautschnig · 2018-02-10T16:15:47Z

src/analyses/cfg_dominators.h

+  struct fixedpointt {
+    explicit fixedpointt(cfg_dominators_templatet &cfg_dominators)
+        : cfg_dominators(cfg_dominators),
+          dfsCounter(0),


No caMelCaSe, please.

tautschnig · 2018-02-10T16:20:01Z

src/analyses/cfg_dominators.h

+    P &program) {
+  // Dominator Tree according to Lengauer and Tarjan
+  // "A fast algorithm for finding dominators in a flow graph"
+  // This is ununderstandable without reading the paper!


That's not a good job done, then.

What should happen: surely it is possible to give the gist of the algorithm in a few sentences. Each of the key steps of the algorithm should then be commented on below.

I'll see if/how I can clear this up, this will take a while though. This also relates to the eval/compress etc functions algorithms

tautschnig · 2018-02-10T16:24:04Z

src/analyses/cfg_dominators.h

+      cfg_dominators.cfg.entry_map[cfg_dominators.entry_node] + 1;
+  dfsCounter = 0;
+  dfs(root);
+  for (node_indext i = dfsCounter; i >= 2; --i) {


Why is it 2? I'm mainly asking as the original paper might assume indices starting from 1 (and I don't understand the code so I can't tell what it is supposed to do).

tautschnig · 2018-02-10T16:25:52Z

src/analyses/cfg_dominators.h

-void dominators_pretty_print_node(const T &node, std::ostream &out)
-{
+template<class T>
+void dominators_pretty_print_node(const T &node, std::ostream &out) {


No whitespace games, please.

tautschnig · 2018-02-10T16:26:34Z

src/analyses/cfg_dominators.h

+template<class P, class T, bool post_dom>
+void cfg_dominators_templatet<P, T, post_dom>::output(std::ostream &out) const {
+  for (const auto &node : cfg.entry_map) {
+    auto n = node.first;


As above: don't change what doesn't need changing. In particular: whitespace.

Sorry, that was an accident

tautschnig · 2018-02-10T16:28:30Z

src/analyses/cfg_dominators.h

+    // stored in this set are passed to find.
+    // The Debug libstdc++ will (correctly!) run into an assertion failure
+    // using std::find. std::less for some reason doesn't trigger this assertion
+    // failure, so we use this as an ugly workaround until that code is fixed.


Why is fixing the root cause more problematic?

tautschnig · 2018-02-14T20:09:27Z

With regard to the linter: I see little risk in adding // NOLINTNEXTLINE(whitespace/braces) as that wouldn't really silence more than what it is expected to silence. If that still seems too much of a concern, then someone needs to sit down and fix the linter.

tautschnig

So we've got the code reformatted, but the "ununderstandable" bit remains, which equals the absence of comments and documentation.

tautschnig · 2018-02-14T20:13:15Z

src/analyses/cfg_dominators.h

+        label(cfg_dominators.cfg.size() + 1),
+        semi(cfg_dominators.cfg.size() + 1),
+        size(cfg_dominators.cfg.size() + 1),
+        bucket(cfg_dominators.cfg.size() + 1)


So what's the magic behind the + 1?

tautschnig · 2018-02-14T20:15:07Z

src/analyses/cfg_dominators.h

    }

-    // compute intersection of predecessors
-    for(const auto &edge : (post_dom ? node.out : node.in))
+    void compress(node_indext v)


I had feared as much: pretty much all of my previous comments are gone as they got killed by reformatting. Those comments were: where is the documentation?

tautschnig · 2018-02-14T20:17:12Z

src/analyses/cfg_dominators.h

-  const goto_programt::targett& target,
-  std::ostream& out)
+  const goto_programt::targett &target,
+  std::ostream &out)


Correct, but unnecessary.

tautschnig · 2018-02-14T20:17:44Z

src/analyses/cfg_dominators.h

+typedef cfg_dominators_templatet<const goto_programt,
+                                 goto_programt::const_targett,
+                                 true>
+  cfg_post_dominatorst;


I think there are no changes here.

tautschnig · 2018-02-14T20:26:08Z

The code was deliberately written to be as close to the paper as possible, to allow easy cross checking against it. I'm not opposed to clearing it up as much as it is possible to do so without destroying that connection, but I'm not sure if the readability would be substantially improved that way.

Sticking close to the paper is great, and I wouldn't want you to diverge from that. But adding comments in between should still be feasible? I'm not asking for long essays, just a single sentence describing the steps. Maybe that's just me, but what I'm doing in symex_other.cpp of #1800 is of that nature: a paragraph upfront and then link the individual steps to that overall story.

hannes-steffenhagen-diffblue · 2018-02-15T10:03:58Z

@tautschnig I think that is a good idea, I have been looking into the paper again to see if I can give a good summary of the steps.

danpoe · 2018-02-15T11:06:29Z

I would be in favor of keeping the original dominator implementation around as well. Then users could either hardcode which one to use, or we could make it switchable via commandline (as suggested by @chrisr-diffblue).

Making it switchable would probably require an abstract dominator factory as users of the dominator computation need the ability to create dominator classes. We could add a commandline option with which to select the dominator computation implementation, and then based on that instantiate a concrete dominator factory. We would then pass it along to all sites where dominator objects are currently created, and the dominator objects would be created via the factory. Any thoughts?

hannes-steffenhagen-diffblue · 2018-02-15T13:41:34Z

@tautschnig @danpoe Has provided some documentation, would you mind taking a look at it?

tautschnig · 2018-02-15T13:46:07Z

Will review in due course; just one remark: I'd appreciate if "Merge pull request #1 from danpoe/tmp/develop_dominators_fake_set" were removed so that the commit history brought in doesn't have merge commits.

hannes-steffenhagen-diffblue · 2018-02-15T13:50:31Z

I will change history

tautschnig · 2018-02-15T13:54:54Z

src/analyses/cfg_dominators.h

+  const_iterator find(const T &node) const
+  {
+    std::less<T> less;
+    // FIXME This works around a bug in other parts of the code in particular,


So I would still like to understand why that's being worked around here rather than fixing it at the root of the problem?

"Bug" is maybe the wrong word here. The "T" here is going to be a const_iterator of a std::list (this is how const_targett in goto_program_template.h is defined). Normally this doesn't have an ordering, only equality which is only defined on iterators of different lists. In other parts of the code a custom ordering for list is defined (based on the address of the thing the iterator is pointing to). We can't redefine equality like that though, because the iterator already has a defined equality.

"Fixing the root issue" in this case would mean rewriting all the code that uses iterators to identify locations in the goto code to no longer do that (and use something else instead). Regardless of whether or not that's a good idea, this would be a major change in the code and should therefore be in a different PR.

What raised my eyebrows was the part below: "in particular, dependence_graph.cpp, where iterators to different lists than those that
are stored in this set are passed to find." That sounds like a problem that's avoidable, as opposed to what you are saying above, which seems inherent in the design of the data structures?

If I'm understanding the dependence_graph.cpp correctly, then this 'property' of comparing items from different iterators is a fairly fundamental aspect of how dependence_graph works with goto programs, in that it is keeping around sets of const_targett where they may be from different goto-functions (and hence, different iterators) in the case of cross-function dependencies. It seems that dependence_graph.cpp already uses a similar pattern to @hannes-steffenhagen-diffblue in various places, e.g. dep_graph_domaint::merge is using < for comparisons when merging dependencies (OK, so that's partly just the merge algorithm but also means it tolerates const_targetts in different functions/iterators).

I think any attempt to fixup dependence_graph.cpp would definitely be outside the scope of this particular PR, but we can always raise an issue to cover that and come back to that in a different PR if people feel strongly enough that it should be fixed (of course, any out-of-tree code that builds on dependence_graph.cpp might end up also having to be fixed up in that case too). As it is, I think this PR is probably at least in keeping with the current implementation of dependence_graph.cpp.

However, as @martin-cs has mentioned, it's probably worth holding off merging this though until we find out if our end user gets any benefit in their use-case.

tautschnig · 2018-02-15T18:42:30Z

src/analyses/cfg_dominators.h

 template <class P, class T, bool post_dom>
 void cfg_dominators_templatet<P, T, post_dom>::output(std::ostream &out) const
 {
  for(const auto &node : cfg.entry_map)
  {
-    auto n=node.first;
+    auto n = node.first;


Seems unnecessary.

tautschnig · 2018-02-15T18:42:47Z

src/analyses/cfg_dominators.h

+typedef cfg_dominators_templatet<const goto_programt,
+                                 goto_programt::const_targett,
+                                 true>
+  cfg_post_dominatorst;


Seems unnecessary to change.

tautschnig · 2018-02-15T18:44:43Z

src/analyses/cfg_dominators.h

 template <class T>
 void dominators_pretty_print_node(const T &node, std::ostream &out)
 {
  out << node;
 }

+/// Pretty-print a single node.
+/// \param node: node to print


It's actually called target (and thus trips up the doxygen check)

martin-cs · 2018-02-16T12:06:53Z

Can we wait merging this until I've heard definitively from the end user? I'm not sure if this fully solves their problem. Cheers, - Martin

tautschnig · 2018-02-16T12:37:21Z

Marking do-not-merge per @martin-cs' request.

Use Lengauer&Tarjan's algorithm to calculate the tree of immediate dominators instead of the previous algorithm, which gives us a significant performance improvement. Also, instead of storing the dominators in a std::set we now just store the immediate dominators and make lookups in there instead. This is a significant performance improvement as long as the dominators set are only iterated over or only queried infrequently, as is currently the case, and also saves a lot of memory in the case of large functions (previous memory usage was quadratic with function size, now it is linear). In cases where a lot of queries are made against the same set of dominators, they can still be copied to a local std::set prior to that.

This is a workaround to a bug that gets triggered in dependence_graph.cpp - in the function dep_graph_domaint::control_dependencies, find is called on cfg_dominators_templatet::target_sett, with an iterator parameter called "from". It is not ensured that this iterator is from the same list as the iterators within the dominators set, this is a problem according to C++ Draft Standard N3960: 24.2.1 An iterator j is called reachable from an iterator i if and only if there is a finite sequence of applications of the expression ++i that makes i == j. If j is reachable from i, they refer to elements of the same sequence. 24.2.5 The domain of == for forward iterators is that of iterators over the same underlying sequence.

smowton

Some suggestions; let's get this in as dominators vs. dominator trees is a major win memory-wise (and apparently time-wise too)

smowton · 2019-03-20T13:49:25Z

src/analyses/cfg_dominators.h

+        dfs_counter(0),
+        // Data structures have size cfg.size() + 1 as node indices are 1-based
+        // to match the paper of Lengauer/Tarjan.
+        parent(cfg_dominators.cfg.size() + 1),


Replace these 9 vectors with a vector-of-structs

smowton · 2019-03-20T13:50:20Z

src/analyses/cfg_dominators.h

+    explicit fixedpointt(cfg_dominators_templatet &cfg_dominators)
+      : cfg_dominators(cfg_dominators),
+        dfs_counter(0),
+        // Data structures have size cfg.size() + 1 as node indices are 1-based


Is it important that the reserved 0 is lower than all the allocated indices (e.g. so that the null node <= the root which is <= all other nodes?) If not suggest just using 0-based indices and note the deviation from the paper.

@smowton I think not, IIRC I just got fed up trying to translate the logic from the paper and eventually just opted to just replicate their logic exactly, including indexing.

smowton · 2019-03-20T13:52:15Z

src/analyses/cfg_dominators.h

+    link(parent[w], w);
+
+    // Step 3
+    // Implicitely define immediate dominator


smowton · 2019-03-20T13:53:10Z

src/analyses/cfg_dominators.h

+    // ugly workaround until that code is fixed.
+
+    // NOLINTNEXTLINE
+    return std::find_if(cbegin(), cend(), [&](const T &other_node) {


How about just comparing by location number? Those should be unique (if not, the caller must run goto_programt::update() first)

Or, for other T, require that the supplied T provide some accessor that yields a unique ID.

That might work, I don't think I was aware of location numbers at the time or that they were meant to be unique.

smowton · 2019-03-20T13:54:08Z

unit/analyses/cfg_dominators.cpp

+
+\*******************************************************************/
+
+#include <string>


Testing: how about retaining the existing dominator algorithm, which is simple enough to check by inspection, randomly generating a couple of hundred medium-sized graphs and checking they give identical results?

@smowton That's more or less what I did manually when testing this IIRC, but doing this as part of automated testing of course makes a lot of sense

martin-cs · 2019-03-21T20:41:06Z

@tautschnig : this should not be blocked on my account any more. I no longer have responsibility for the end-user application that originally started this. Maybe @chrisr-diffblue can give a more up-to-date comment. IIRC this patch didn't solve the problem and I'm not sure if I knew why. If anyone really cares I can try to find the original correspondence on the issue.

hannes-steffenhagen-diffblue requested review from chrisr-diffblue, martin-cs, peterschrammel, smowton and thk123 as code owners January 31, 2018 17:15

tautschnig reviewed Feb 1, 2018

View reviewed changes

hannes-steffenhagen-diffblue force-pushed the develop_dominators_fake_set branch from 8e85f41 to 6c7382a Compare February 9, 2018 18:12

tautschnig requested changes Feb 10, 2018

View reviewed changes

hannes-steffenhagen-diffblue force-pushed the develop_dominators_fake_set branch 2 times, most recently from bc18e85 to dac13b4 Compare February 13, 2018 14:01

hannes-steffenhagen-diffblue force-pushed the develop_dominators_fake_set branch from a40aa2b to 026bd5b Compare February 14, 2018 17:10

tautschnig requested changes Feb 14, 2018

View reviewed changes

hannes-steffenhagen-diffblue force-pushed the develop_dominators_fake_set branch from ca4d25c to 0bff1f3 Compare February 15, 2018 13:53

tautschnig reviewed Feb 15, 2018

View reviewed changes

tautschnig added the do not merge label Feb 16, 2018

hannes-steffenhagen-diffblue force-pushed the develop_dominators_fake_set branch 6 times, most recently from 9a4a0ec to e0ba995 Compare February 16, 2018 14:53

hannes-steffenhagen-diffblue and others added 3 commits February 16, 2018 14:54

Documentation for cfg dominator class

4f89a2b

hannes-steffenhagen-diffblue force-pushed the develop_dominators_fake_set branch from e0ba995 to 4f89a2b Compare February 16, 2018 14:55

tautschnig assigned hannes-steffenhagen-diffblue Feb 21, 2018

smowton reviewed Mar 20, 2019

View reviewed changes

smowton mentioned this pull request Aug 21, 2019

Dominator analysis: analyse on basic block granularity #5042

Open

hannes-steffenhagen-diffblue closed this Feb 4, 2021


		\*******************************************************************/

		#include <string>

Use more efficient algorithm to calculate dominators #1787

Use more efficient algorithm to calculate dominators #1787

Conversation

hannes-steffenhagen-diffblue commented Jan 31, 2018

chrisr-diffblue commented Jan 31, 2018

mgudemann commented Feb 1, 2018 • edited Loading

mgudemann commented Feb 1, 2018

martin-cs commented Feb 1, 2018 via email

tautschnig left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

tautschnig commented Feb 1, 2018 • edited Loading

hannes-steffenhagen-diffblue commented Feb 1, 2018

hannes-steffenhagen-diffblue commented Feb 1, 2018

hannes-steffenhagen-diffblue commented Feb 1, 2018 • edited Loading

tautschnig commented Feb 1, 2018

tautschnig commented Feb 2, 2018

hannes-steffenhagen-diffblue commented Feb 2, 2018

martin-cs commented Feb 2, 2018 via email

tautschnig commented Feb 2, 2018

hannes-steffenhagen-diffblue commented Feb 5, 2018

tautschnig commented Feb 5, 2018

hannes-steffenhagen-diffblue commented Feb 7, 2018

tautschnig commented Feb 7, 2018

hannes-steffenhagen-diffblue commented Feb 7, 2018

tautschnig commented Feb 7, 2018

hannes-steffenhagen-diffblue commented Feb 9, 2018

tautschnig left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

tautschnig commented Feb 14, 2018

tautschnig left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

tautschnig commented Feb 14, 2018

hannes-steffenhagen-diffblue commented Feb 15, 2018

danpoe commented Feb 15, 2018

hannes-steffenhagen-diffblue commented Feb 15, 2018

tautschnig commented Feb 15, 2018

hannes-steffenhagen-diffblue commented Feb 15, 2018

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

martin-cs commented Feb 16, 2018 via email

tautschnig commented Feb 16, 2018

smowton left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

mgudemann commented Feb 1, 2018 •

edited

Loading

tautschnig commented Feb 1, 2018 •

edited

Loading

hannes-steffenhagen-diffblue commented Feb 1, 2018 •

edited

Loading