Feature filter goto symbol table [blocks: #2649] #2680

mmuesly · 2018-08-03T17:29:43Z

I had the desire to filter the output of a goto symbol table for some certain flags.
This simplifies the process, to determine which symbols are a good choice to pass to the memory-analyzer in #2649.

This PR allows, to use --filter-symbol-table together with --list-symbol-table or --show-symbol-table to filter the printed symbols by flags.

See the content of the HELP_SHOW_SYMBOL_TABLE macro for more context on allowed flags.

I had the need, to filter the output of the symbol table for a certain type of symbols. So, this commit adds an additional flag for goto-instrument and cbmc to filter the output of the show-symbol-table and list-symbol-table command. Further, this commit moves the commandflag defintion for show_symbol_table into its own macro. Then this new macro is included into cbmc_parse_options and goto_instrument_parse_options. The same is done for the help output. This commit keeps the --list-symbols flag for backwards compatibility, but also added --list-symbol-table with the same behavior, to align flag names with other show/list functions.

smowton · 2018-08-04T20:13:00Z

Thinking it might be better to do this by using the json-ui and filtering in a script, rather than implement a filtering mini language that will inevitably be less than complete?

martin-cs · 2018-08-06T12:03:20Z

Yeah... I'm kind of with @smowton on this one... I'm not sure what value this gives above and beyond filtering things externally.

mmuesly · 2018-08-06T15:12:11Z

Well as said, I just want to filter the printed symbols. CBMC already loads an internal representation of the symbols, so if I filter the internal representation before printing, I still keep all the nice output formatting and stuff existing, while I don't have to reload the transformed plaintext or JSON output into a new data model, do my filtering externally and then write formatting code for results. (Yes, I am aware that this is done with a few lines of python code, if using a JSON representation).

Here is the story, I think that evolves, if I do this in external scripts:
For sure, I am able to write grep scripts or any other external filter. Next, my colleague needs to do the same thing, so he writes his external scripts or a derivate of my scripts and soon, we all spend considerable amounts of time writing external scripts and maintaining them, whenever someone changes the output format, instead of using a flag, built-in to CBMC that is updated when output is changed.

Regarding the completeness of the filter language:
Currently, it matches all flags owned by symbolt. It is just inevitably incomplete, if you add flags without caring for the output, but that is a process problem, not a domain problem in my opinion. Anyhow, if a filter option is missing, I would prefer changing probably a few lines of code in an existing approach, than writing one from scratch.

Apart from that, I haven't seen json-ui support for symbol tables, so I would have to write that first and that is not going to happen.
Adding the suggested change set, is the fastest and most reliable way to solve the problem from my point of view in a reusable way. Probably it is still less than perfect, but acceptable from my point of view for basic filter needs.

It is up to you, to scope out the user experience you want to provide for CBMC.
I am happy to address the clang format and lint issues occurring during CI checks, once there is a decision, whether this PR will be merged or not.

smowton · 2018-08-07T08:59:52Z

If I make a PR later this week for json-ui to do something sensible with the symbol table would that help?

martin-cs · 2018-08-07T13:12:54Z

@mmuesly I see your issue and as someone who has written exactly the kind of scripts you're talking about (and then had to maintain them when someone changed the output!) I am sympathetic to your problem. The idea of the XML output (and later, the JSON output) was to give a low-pain, more reliable route for interfacing other tools by giving structured output and so skipping the parsing step. IMHO the text output is really for humans and at "human scale" ( @smowton @tautschnig and others may well disagree with this ). So I'm a little reluctant to support patches that do significant "text UI" work.

Would having example / sample scripts in contrib/ improve things? @smowton : opinions?

smowton · 2018-08-07T13:16:28Z

That sounds ideal -- proper (tm) JSON / XML / your-serialization-format-here output, and ship (and TEST / CI) some example scripts

mmuesly · 2018-08-07T13:33:11Z

As long as there is the filter script somewhere shipped in the code base and still a CI test in place, I am happy to accept that instead.

That said, I still don't see the value for doing this in the post ui processing, if I might filter the central data structure before converting it to a proper ui. This is not really a ui dependent functionality. I just want to cut down the data stream, that ends up in the output.

It just effects the text ui right now, because there is missing ui support for other uis in this functionality in general right now. I would have build this in a cross functional ui filter otherwise following the DRY principal. That said, I still see room in this PR for code reduction and might offer to invest more time into reducing the code duplication in check_commandline_for_show_symbol_flags.

So stepping back from the ui output format discussion, the question might be reduced to:

Do we want one filter in place filtering all uis and anyone can easily use his favorite ui,
or do we want to write a post process filter for every ui?

I would choose option one, but if you want to maintain option two, I am happy with that. As long as I have one out of the box working script for filtering at hand, whenever I need it, my requirements are met.

smowton · 2018-08-07T13:46:26Z

For me the big difference is if we commit the filter that's set up for the particular needs of your workflow, then some second user comes along and says "I want a slightly different filter," then (a) we have to advise them to edit the cbmc source code to achieve that, which may be difficult if they're not a C++ programmer / are using a binary distribution / want to do something that's a pain in C++ but easy in languages oriented towards string processing, and (b) if we accept their filter too, we end up either committing a cornucopia of different rather specific filters, or implementing what amounts to a query language. It seems to me if a user can throw 5 lines of Python or a jq one-liner together instead to implement their particular filtering needs then we have a much friendlier and more general solution. I don't care what the interchange format is between cbmc and Python, but of the two we have widest support for now (JSON and XML), JSON seems nicest.

mmuesly · 2018-08-07T18:35:19Z

Thanks for the explanation. I can follow your argumentation.

But there are two further points I want to make:
First, I am currently trying to contribute my workflow as an extension to the cbmc code base, and you cannot feed non symbol variables to the memory-analyzer at the moment. There are people in favor for merging the gdb interconection.
So user will need some assistance to figure out, what they want to feed in. We might discuss, that this filter should go into the memory-analyzer rather than into cbmc and goto-instrument.

Second, if your main concern is about different people suggesting different filters and you really see that a realistic scenario, why don't you create an api, that allows to load custom binary filter before outputting the goto program? That would be a really nice thing.
There is a lot of structural information in the goto-modelt and symbolt, that I have to reconstruct expensively for more complex filtering, but get close to free, when writing a C++ filter.
I tried to write the memory-analyzer as an independent tool on top of filters, instead of intermingle it with the goto-modelt, but it is much easier to intermingle it as I can use the full support of the goto-modelt data structure. And it was the same experience writing this filter.

TGWDB · 2023-05-03T15:22:56Z

Closing due to age (no further comment on PR content), please reopen with rebase on develop if you intent to continue this work.

mmuesly added 2 commits August 3, 2018 13:23

Add regression test for filter in show_symbol_table

ffdf57a

mmuesly requested review from chrisr-diffblue, kroening, martin-cs, peterschrammel, pkesseli, smowton and tautschnig as code owners August 3, 2018 17:29

tautschnig self-assigned this Nov 10, 2018

tautschnig changed the title ~~Feature filter goto symbol table~~ Feature filter goto symbol table [blocks: #2649] Nov 10, 2018

tautschnig added the blocker label Nov 10, 2018

tautschnig added the dependent - do not merge label Feb 25, 2019

tautschnig changed the title ~~Feature filter goto symbol table [blocks: #2649]~~ Feature filter goto symbol table [depends-on: #4261, blocks: #2649] Feb 25, 2019

tautschnig changed the title ~~Feature filter goto symbol table [depends-on: #4261, blocks: #2649]~~ Feature filter goto symbol table [blocks: #2649] Nov 9, 2022

tautschnig removed the dependent - do not merge label Nov 9, 2022

TGWDB closed this May 3, 2023

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Feature filter goto symbol table [blocks: #2649] #2680

Feature filter goto symbol table [blocks: #2649] #2680

Uh oh!

mmuesly commented Aug 3, 2018

Uh oh!

smowton commented Aug 4, 2018

Uh oh!

martin-cs commented Aug 6, 2018

Uh oh!

mmuesly commented Aug 6, 2018 •

edited

Loading

Uh oh!

smowton commented Aug 7, 2018

Uh oh!

martin-cs commented Aug 7, 2018

Uh oh!

smowton commented Aug 7, 2018

Uh oh!

mmuesly commented Aug 7, 2018

Uh oh!

smowton commented Aug 7, 2018

Uh oh!

mmuesly commented Aug 7, 2018

Uh oh!

TGWDB commented May 3, 2023

Uh oh!

Uh oh!

Feature filter goto symbol table [blocks: #2649] #2680

Feature filter goto symbol table [blocks: #2649] #2680

Uh oh!

Conversation

mmuesly commented Aug 3, 2018

Uh oh!

smowton commented Aug 4, 2018

Uh oh!

martin-cs commented Aug 6, 2018

Uh oh!

mmuesly commented Aug 6, 2018 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

smowton commented Aug 7, 2018

Uh oh!

martin-cs commented Aug 7, 2018

Uh oh!

smowton commented Aug 7, 2018

Uh oh!

mmuesly commented Aug 7, 2018

Uh oh!

smowton commented Aug 7, 2018

Uh oh!

mmuesly commented Aug 7, 2018

Uh oh!

TGWDB commented May 3, 2023

Uh oh!

Uh oh!

mmuesly commented Aug 6, 2018 •

edited

Loading