Sanitizer error in Threading Building Blocks #1476

Closed
helen-23 opened this issue Aug 8, 2020 · 4 comments
Comments

@helen-23
Contributor

helen-23 commented Aug 8, 2020

When running the titan_new DLA benchmark variants with the stratixiv_arch.timing.xml architecture through the Titan flow, VPR completed successfully, but AddressSanitizer reported a few runtime errors inside TBB. It also reported some memory leaks.

Expected Behaviour

VPR should run without any runtime errors when the sanitizer build is turned on.

Current Behaviour

The sanitizer currently reports the following 2 issues:

tbb_error
memory_leak

Possible Solution

  • There are many online discussions of this issue. Most suggest that it is a defect in TBB and that Intel may have fixed it in later versions, so updating TBB in the future might resolve it.

  • Perhaps turning threading off?

Steps to Reproduce

  1. Check out my branch, "vqm2bliff_one_lut_removal", which contains all changes required to run the DLA circuit
  2. unzip the attached DLA circuit (DLA_BSC)
  3. Run titan_flow.py with DLA_BSC and the SIV architecture capture. Do NOT run titan_flow.py with the sanitizer build turned on, because the hash function currently triggers an integer overflow in its multiply-add. Be sure to turn on the --fit and --gen_post_fit_netlist options, because the DLA circuits need a post-fit netlist for VPR. An example command looks like the following:

<titan_dir>/scripts/titan_flow.py \
-q DLA_BSC/quartus2_proj/DLA.qpf \
-a <vtr_root_dir>/vtr_flow/arch/stratixiv_arch.timing.xml \
--fit \
--gen_post_fit_netlist \
--titan_dir <titan_dir> \
--vqm2blif_dir <vtr_root_dir>/build/utils/vqm2blif \
--quartus_dir /tools/intel/install/fpga/18.1/standard/quartus/bin \
--vpr_dir <vtr_root_dir>/vpr

  4. Now, with the sanitizer build turned on, run VPR with the generated post-fit BLIF and the provided vpr.sdc (DLA_BSC/run_flow/vpr.sdc). An example command looks like the following:

<vtr_root_dir>/vpr/vpr \
<vtr_root_dir>/vtr_flow/arch/stratixiv_arch.timing.xml \
DLA_stratixiv_post_fit.blif \
--sdc_file DLA_BSC/run_flow/vpr.sdc \
--route_chan_width 300 \
--max_router_iterations 400 \
--timing_analysis on \
--timing_report_npaths 1000

Context

Your Environment

  • VTR revision used: 8.0
  • Operating System and version: Linux Ubuntu 18.04.4 LTS (Bionic Beaver)
  • Compiler version:

Attachments

DLA_BSC.zip

@vaughnbetz
Contributor

Thanks Helen. @kmurray, any ideas? If this is a flaw in TBB, we could suppress the error. Helen is running VPR with multi-threading off, so Tatum is presumably not talking to TBB much in this case either.

@vaughnbetz
Contributor

vaughnbetz commented Aug 13, 2020

This may be due to the lack of a sanitizer suppression file in CI (vtr_flow.py issue) which is tracked in another issue: #1470

@kmurray
Contributor

kmurray commented Sep 3, 2020

I've seen this before; it seems to be latent within TBB. It goes away if we build without TBB, so I don't think it's something we can fix. A suppression file is likely the best approach.
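
For reference, a minimal sketch of what a suppression-based workaround could look like for the memory leak reports, assuming they come from stack frames that match libtbb. The temporary file and the `leak:libtbb` pattern are illustrative assumptions, not the contents of VTR's actual suppression file. `LSAN_OPTIONS` with `suppressions=` and `print_suppressions=` is the standard LeakSanitizer mechanism.

```shell
# Hypothetical suppression file; the path and pattern are assumptions for illustration.
supp_file="$(mktemp)"

# Each line has the form "leak:<pattern>"; leaks whose stack trace contains a
# module, function, or source file name matching the pattern are not reported.
printf 'leak:libtbb\n' > "$supp_file"

# Point LeakSanitizer at the file and silence the summary of matched suppressions.
export LSAN_OPTIONS="suppressions=${supp_file}:print_suppressions=0"

echo "$LSAN_OPTIONS"

# A sanitizer-built vpr would then be run as in the reproduction steps, e.g.:
#   <vtr_root_dir>/vpr/vpr <arch> DLA_stratixiv_post_fit.blif ...
```

This only hides the leak reports. The other runtime errors would need their own suppression entries for whichever sanitizer emits them.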

@vaughnbetz
Contributor

Thanks Kevin. It seems to have gone away once we fixed the suppression file. At least I haven't heard any more about it, which I'm interpreting as it being gone :).
