Skip to content

cfgt: index entry-map by instruction location number, not the instruction itself #5049

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged

Conversation

smowton
Copy link
Contributor

@smowton smowton commented Aug 22, 2019

Since function location numbers are almost always very dense, we can use a vector rather than a map to get from an instruction to its corresponding cfgt node. The introduced dense_integer_mapt for now has just enough functionality to do this, but could be extended in future to a general-purpose flat-map class a la boost::flat_map.

@smowton
Copy link
Contributor Author

smowton commented Aug 22, 2019

On thinking again, I'll make a "give cfgt a public interface" PR and then base this one on that

@smowton smowton force-pushed the smowton/feature/dominators-avoid-std-map branch from 6e06a40 to 17e9761 Compare August 22, 2019 16:45
@smowton smowton force-pushed the smowton/feature/dominators-avoid-std-map branch from 17e9761 to f55f0cf Compare August 22, 2019 16:52
@smowton
Copy link
Contributor Author

smowton commented Aug 22, 2019

This is now rebased on #5051 which does all the cleanup work that doesn't depend on the underlying data structure; the final commit that does the real work is now much briefer

@smowton smowton force-pushed the smowton/feature/dominators-avoid-std-map branch from f55f0cf to 8f6c82b Compare August 23, 2019 15:46
@peterschrammel peterschrammel changed the title cfgt: index entry-map by instruction location number, not the instruction itself cfgt: index entry-map by instruction location number, not the instruction itself [depends: 5051] Aug 23, 2019
{
}

template <typename Iter>
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some documentation for this function would be appreciated

/// Functor to convert cfg nodes into dense integers, used by \ref cfg_baset.
/// Default implementation: the identity function.
template <class T>
class cfg_instruction_to_dense_integert
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This could do with some unit tests.

@smowton smowton force-pushed the smowton/feature/dominators-avoid-std-map branch from 8f6c82b to be7b9da Compare August 27, 2019 08:01
@smowton smowton changed the title cfgt: index entry-map by instruction location number, not the instruction itself [depends: 5051] cfgt: index entry-map by instruction location number, not the instruction itself Aug 27, 2019
@smowton smowton force-pushed the smowton/feature/dominators-avoid-std-map branch 2 times, most recently from 6d96310 to af101ef Compare August 27, 2019 16:03
Copy link
Contributor

@allredj allredj left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️
This PR failed Diffblue compatibility checks (cbmc commit: af101ef).
Build URL: https://travis-ci.com/diffblue/test-gen/builds/124782661
Status will be re-evaluated on next push.
Common spurious failures include: the cbmc commit has disappeared in the mean time (e.g. in a force-push); the author is not in the list of contributors (e.g. first-time contributors); compatibility was already broken by an earlier merge.

Copy link
Collaborator

@martin-cs martin-cs left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Only a very small part touches my area of responsibility and that seems fine.

@@ -663,6 +663,9 @@ void dump_ct::cleanup_decl(

tmp.add(goto_programt::make_end_function());

// goto_program2codet requires valid location numbers:
tmp.update();

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems like it might fix an obscure bug or two...

This is required because otherwise a cfgt<..., goto_programt::targett> wouldn't be able to
accept const_targett as an argument to get_node_index(...) etc, which is undesirable. In
particular its own entry_mapt::keys() would not be usable. In general this ought to be some
constified version of 'I', but since base `cfgt` is currently restricted to either targett
or const_targett then statically specifying const_targett suffices.
@smowton smowton force-pushed the smowton/feature/dominators-avoid-std-map branch 2 times, most recently from d56ce9e to b8a4382 Compare September 27, 2019 16:03
Copy link
Contributor

@allredj allredj left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️
This PR failed Diffblue compatibility checks (cbmc commit: d56ce9e).
Build URL: https://travis-ci.com/diffblue/test-gen/builds/129480980
Status will be re-evaluated on next push.
Common spurious failures include: the cbmc commit has disappeared in the mean time (e.g. in a force-push); the author is not in the list of contributors (e.g. first-time contributors); compatibility was already broken by an earlier merge.

@smowton smowton force-pushed the smowton/feature/dominators-avoid-std-map branch from b8a4382 to 1b5b0ae Compare September 30, 2019 09:26
Copy link
Contributor

@chrisr-diffblue chrisr-diffblue left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good, though I agree with @peterschrammel that some unit tests would be good to have.

@smowton
Copy link
Contributor Author

smowton commented Sep 30, 2019

@chrisr-diffblue added, had forgotten to git add...

@smowton smowton force-pushed the smowton/feature/dominators-avoid-std-map branch from 5cfd752 to 41585c5 Compare September 30, 2019 10:09
Copy link
Contributor

@danpoe danpoe left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@@ -187,19 +187,19 @@ class cfg_baset:public grapht< cfg_base_nodet<T, I> >
/// in that particular case you should just use `cfg.get_node(i)`). Storing
/// node indices saves a map lookup, so it can be worthwhile when you expect
/// to repeatedly look up the same program point.
entryt get_node_index(const I &program_point) const
entryt get_node_index(const goto_programt::const_targett &program_point) const
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could use pass-by-value 🍐

{
return entry_map.at(program_point);
}

/// Get the CFG graph node relating to \p program_point.
nodet &get_node(const I &program_point)
nodet &get_node(const goto_programt::const_targett &program_point)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🍐

{
return (*this)[get_node_index(program_point)];
}

/// Get the CFG graph node relating to \p program_point.
const nodet &get_node(const I &program_point) const
const nodet &get_node(const goto_programt::const_targett &program_point) const
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🍐

// Main store: contains <key, value> pairs, where entries at positions with
// no corresponding key are default-initialised and entries with a
// corresponding key but no value set yet have the correct key but a default-
// initialized key. Both of these are skipped by this type's iterators.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

initialized key -> initialized value

This gives a map-like interface but vector-like lookup cost, ideal for
mapping keys that have a bijection with densely packed integers, such as
the location numbers of a well-formed goto_modelt or goto_programt.
This saves a lot of allocations, memory, and time when the cfg is frequently queried.
@smowton smowton force-pushed the smowton/feature/dominators-avoid-std-map branch from 41585c5 to 33ceb0a Compare September 30, 2019 10:39
Copy link
Contributor

@allredj allredj left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️
This PR failed Diffblue compatibility checks (cbmc commit: 41585c5).
Build URL: https://travis-ci.com/diffblue/test-gen/builds/129671093
Status will be re-evaluated on next push.
Common spurious failures include: the cbmc commit has disappeared in the mean time (e.g. in a force-push); the author is not in the list of contributors (e.g. first-time contributors); compatibility was already broken by an earlier merge.

@codecov-io
Copy link

codecov-io commented Sep 30, 2019

Codecov Report

Merging #5049 into develop will increase coverage by 0.04%.
The diff coverage is 100%.

Impacted file tree graph

@@            Coverage Diff             @@
##           develop   #5049      +/-   ##
==========================================
+ Coverage    66.95%     67%   +0.04%     
==========================================
  Files         1146    1147       +1     
  Lines        93823   93842      +19     
==========================================
+ Hits         62823   62881      +58     
+ Misses       31000   30961      -39
Flag Coverage Δ
#cproversmt2 42.72% <98.43%> (+0.1%) ⬆️
#regression 63.5% <100%> (+0.04%) ⬆️
#unit 31.96% <92.3%> (+0.09%) ⬆️
Impacted Files Coverage Δ
src/util/dense_integer_map.h 100% <100%> (ø)
.../goto-instrument/goto_instrument_parse_options.cpp 58.05% <100%> (+0.31%) ⬆️
src/goto-instrument/dump_c.cpp 74.58% <100%> (-0.14%) ⬇️
src/goto-programs/cfg.h 80.21% <100%> (+4.54%) ⬆️
src/goto-symex/goto_symex.cpp 99.5% <0%> (-0.14%) ⬇️
...c/java_bytecode/java_string_library_preprocess.cpp 96.82% <0%> (-0.02%) ⬇️
src/goto-symex/goto_symex.h 92.3% <0%> (ø) ⬆️
src/goto-instrument/points_to.h 0% <0%> (ø) ⬆️
src/analyses/cfg_dominators.h 100% <0%> (ø) ⬆️
... and 6 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update d8ed6fe...33ceb0a. Read the comment docs.

Copy link
Contributor

@allredj allredj left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

✔️
Passed Diffblue compatibility checks (cbmc commit: 33ceb0a).
Build URL: https://travis-ci.com/diffblue/test-gen/builds/129674156

const auto &instructions = goto_program.instructions;
possible_keys.reserve(
std::distance(instructions.begin(), instructions.end()));
for(auto it = instructions.begin(); it != instructions.end(); ++it)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why std::distance? On a list that's linear.
.size() is constant since C++11.

@smowton smowton merged commit cd54630 into diffblue:develop Sep 30, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants