-
Notifications
You must be signed in to change notification settings - Fork 415
Parallel router: baseline #2337
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Now this PR contains a number of cleanups and API changes:
Should format, squash and rebase. I'll run vtr_reg_strong with Oops: there's no TBB in the runners |
574bcbf
to
87b2b50
Compare
I think the tatum nondeterminism still exists with |
I think the tatum nondeterminism was not due to a data race (spent a lot of time on it with Intel Inspector) but the placer behavior depending on the ordering of Parallel execution in |
a lot of failures due to memory and runtime increase? |
Turns out NetPinTimingInvalidator doesn't do anything when incremental STA is not enabled, but it does keep a cache to avoid doing nothing several times. It's a lot of trouble to keep that cache in a multithreaded and generic way and keep the performance. Added a NoopInvalidator for the general case of non-incremental timing analysis, this should help with the place time failures? |
I would add |
The strong_odin failure doesn't make a lot of sense: the input cmdline has |
Maybe it's a two phase test where it runs the router in find minimum channel width mode, and then re-runs at a channel width of 26? |
Indeed it was a bug I accidentally introduced while rebasing. |
I can't reproduce the place time failures on wintermute. Probably better to disable tbb for CI and figure out another way to integrate tbb tests. |
ca0d135
to
d151a28
Compare
f662632
to
ae98395
Compare
* may be called concurrently */ | ||
#ifdef TATUM_USE_TBB | ||
tbb::concurrent_vector<EdgeId> invalidated_edges_; | ||
tatum::util::linear_map<EdgeId, size_t> edge_invalidated_; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this can be a char (to save memory) (explain in the comments why that's thread safe)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good. Biggest feedback: comment functions (in header, or at static declaration) as much as you can.
I agree making a unit test for the partition tree is good, but that can wait for another PR.
Please summarize and link to your QoR data. QoR results master vs. this branch with parallel off, and again with parallel on. |
Looks like there are a bunch of build failures in the test CI too |
🤦 |
This commit has a bunch of cleanups piggybacked onto the baseline parallel router: * Use RRNodeId where possible * Remove dead code and docs left from breadth-first router * Fix nondeterminism in parallel tatum * Fix performance penalty from NetPinTimingInvalidator
Seems like this is working (CI already fixed by you @duck2)? |
Yes, the ducktape seems to make it work. Yosys didn't get built in |
Thanks @duck2 . |
Description
Add a spatial partitioning-based parallel router. This version is very simple, it builds a PartitionTree which recursively partitions the netlist into groups (
PartitionTreeNode
s) of non-overlapping nets, then uses a thread pool to route the groups.Related Issue
Motivation and Context
More CPUs, runtime go down
How Has This Been Tested?
Ran vtr_reg_strong and titan_quick_qor with
--router_algorithm parallel
.Types of changes
Checklist: