Commit 3f8d8e4

2 parents 37c5fe1 + 85dfc29 commit 3f8d8e4

46 files changed: +1773 -203 lines changed

doc/src/vpr/command_line_usage.rst

Lines changed: 40 additions & 5 deletions
@@ -926,6 +926,13 @@ If any of init_t, exit_t or alpha_t is specified, the user schedule, with a fixe
 
     **Default:** ``move_block_type``
 
+.. option:: --place_quench_only {on | off}
+
+    If this option is set to ``on``, the placer skips the annealing phase and performs only the placement quench.
+    This option is useful when the quality of the initial placement is good enough that there is no need to perform the
+    annealing phase.
+
+    **Default:** ``off``
 
 
 .. option:: --placer_debug_block <int>
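
In simulated annealing, a quench is commonly a final pass at zero temperature in which only non-worsening moves are accepted. Assuming `--place_quench_only on` amounts to running just that pass, the difference from regular annealing can be sketched with a generic Metropolis acceptance test. The function and parameter names below are illustrative and are not taken from the VPR source.

```cpp
#include <cassert>
#include <cmath>

// Generic Metropolis acceptance test (illustrative, not VPR code).
//   delta_cost     : cost(after move) - cost(before move)
//   temperature    : current annealing temperature; 0 models the quench
//   uniform_sample : a random draw from [0, 1)
bool accept_move(double delta_cost, double temperature, double uniform_sample) {
    assert(temperature >= 0.0);
    if (delta_cost <= 0.0) {
        return true; // Non-worsening moves are always accepted.
    }
    if (temperature == 0.0) {
        return false; // Quench: worsening moves are never accepted.
    }
    // Annealing: worsening moves are accepted with probability exp(-dC / T).
    return uniform_sample < std::exp(-delta_cost / temperature);
}
```

Under this reading, a quench-only run behaves like a greedy local search that starts from the initial placement, which is why it only pays off when that placement is already good.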
@@ -1188,15 +1195,43 @@ Analytical Placement is generally split into three stages:
 
 Analytical Placement is experimental and under active development.
 
-.. option:: --ap_global_placer {quadratic-bipartitioning-lookahead | quadratic-flowbased-lookahead}
+.. option:: --ap_analytical_solver {qp-hybrid | lp-b2b}
+
+    Controls which Analytical Solver the Global Placer will use in the AP Flow.
+    The Analytical Solver solves for a placement which optimizes some objective
+    function, ignoring the FPGA legality constraints. This provides a
+    "lower-bound" solution. The Global Placer will legalize this solution and
+    feed it back to the analytical solver to make its solution more legal.
+
+    * ``qp-hybrid`` Solves for a placement that minimizes the quadratic HPWL of
+      the flat placement using a hybrid clique/star net model (as described in
+      FastPlace :cite:`Viswanathan2005_FastPlace`).
+      Uses the legalized solution as anchor-points to pull the solution to a
+      more legal solution (similar to the approach from SimPL :cite:`Kim2013_SimPL`).
+
+    * ``lp-b2b`` Solves for a placement that minimizes the linear HPWL of the
+      flat placement using the Bound2Bound net model (as described in Kraftwerk2 :cite:`Spindler2008_Kraftwerk2`).
+      Uses the legalized solution as anchor-points to pull the solution to a
+      more legal solution (similar to the approach from SimPL :cite:`Kim2013_SimPL`).
+
+    **Default:** ``lp-b2b``
+
+.. option:: --ap_partial_legalizer {bipartitioning | flow-based}
 
-    Controls which Global Placer to use in the AP Flow.
+    Controls which Partial Legalizer the Global Placer will use in the AP Flow.
+    The Partial Legalizer legalizes a placement generated by an Analytical Solver.
+    It is used within the Global Placer to guide the solver to a more legal
+    solution.
 
-    * ``quadratic-bipartitioning-lookahead`` Use a Global Placer which uses a quadratic solver and a bi-partitioning lookahead legalizer. Anchor points are used to spread the solved solution to the legalized solution.
+    * ``bipartitioning`` Creates minimum windows around over-dense regions of
+      the device and bi-partitions the atoms in these windows such that each
+      region is no longer over-dense and the atoms are in tiles that they can
+      be placed into.
 
-    * ``quadratic-flowbased-lookahead`` Use a Global Placer which uses a quadratic solver and a multi-commodity-flow-based lookahead legalizer. Anchor points are used to spread the solved solution to the legalized solution.
+    * ``flow-based`` Flows atoms from regions that are overfilled to regions that
+      are underfilled.
 
-    **Default:** ``quadratic-bipartitioning-lookahead``
+    **Default:** ``bipartitioning``
 
 .. option:: --ap_full_legalizer {naive | appack}
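
The ``lp-b2b`` description says the Bound2Bound net model lets the solver minimize linear HPWL. One way to see why is that, with the commonly stated B2B weight w = 1 / ((P - 1) * |x_i - x_j|) on edges between a net's boundary pins and every other pin, the quadratic cost of a net equals its HPWL at the current pin positions. The sketch below checks that identity along one axis. The weight scaling is an assumption (conventions differ between papers), and none of the names come from the VPR source.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Half-perimeter wirelength of one net along a single axis.
double hpwl_1d(const std::vector<double>& x) {
    assert(!x.empty());
    auto mm = std::minmax_element(x.begin(), x.end());
    return *mm.second - *mm.first;
}

// Quadratic cost of the same net under a Bound2Bound-style model:
// edges connect the two boundary pins (min and max) to each other and to
// every inner pin, each with weight 1 / ((P - 1) * |x_i - x_j|).
double b2b_quadratic_cost_1d(const std::vector<double>& x) {
    assert(x.size() >= 2);
    const std::size_t num_pins = x.size();
    auto mm = std::minmax_element(x.begin(), x.end());
    const std::size_t lo = static_cast<std::size_t>(mm.first - x.begin());
    const std::size_t hi = static_cast<std::size_t>(mm.second - x.begin());
    const double eps = 1e-9; // Guards against division by zero for coincident pins.

    auto edge_cost = [&](std::size_t i, std::size_t j) {
        const double dist = std::abs(x[i] - x[j]);
        const double weight = 1.0 / ((num_pins - 1) * std::max(dist, eps));
        return weight * dist * dist;
    };

    double cost = edge_cost(lo, hi);
    for (std::size_t i = 0; i < num_pins; i++) {
        if (i != lo && i != hi) {
            cost += edge_cost(lo, i) + edge_cost(hi, i);
        }
    }
    return cost;
}
```

For pins at 1, 4, 6 and 10, both functions return 9. Because the weights depend on the current positions, a solver using this model has to recompute them each time it re-solves.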

doc/src/z_references.bib

Lines changed: 43 additions & 0 deletions
@@ -436,3 +436,46 @@ @inproceedings{kosar2024parallel
   booktitle={The 23rd International Conference on Field-Programmable Technology},
   year={2024}
 }
+
+@article{Viswanathan2005_FastPlace,
+  author={Viswanathan, N. and Chu, C.C.-N.},
+  journal={IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems},
+  title={{FastPlace}: efficient analytical placement using cell shifting, iterative local refinement, and a hybrid net model},
+  year={2005},
+  volume={24},
+  number={5},
+  month=may,
+  pages={722--733},
+  keywords={Clustering algorithms;Partitioning algorithms;Algorithm design and analysis;Integrated circuit interconnections;Large-scale systems;Minimization;Delay;Simulated annealing;Iterative algorithms;Acceleration;Analytical placement;computer-aided design;net models;standard cell placement},
+  doi={10.1109/TCAD.2005.846365}
+}
+
+@article{Kim2013_SimPL,
+  author={Kim, Myung-Chul and Lee, Dong-Jin and Markov, Igor L.},
+  journal={Commun. ACM},
+  title={{SimPL}: an algorithm for placing {VLSI} circuits},
+  year={2013},
+  issue_date={June 2013},
+  publisher={Association for Computing Machinery},
+  address={New York, NY, USA},
+  volume={56},
+  number={6},
+  issn={0001-0782},
+  doi={10.1145/2461256.2461279},
+  month=jun,
+  pages={105--113},
+  numpages={9}
+}
+
+@article{Spindler2008_Kraftwerk2,
+  author={Spindler, Peter and Schlichtmann, Ulf and Johannes, Frank M.},
+  journal={IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems},
+  title={Kraftwerk2—A Fast Force-Directed Quadratic Placement Approach Using an Accurate Net Model},
+  year={2008},
+  volume={27},
+  number={8},
+  month=aug,
+  pages={1398--1411},
+  keywords={Cost function;Central Processing Unit;Runtime;Quality control;Convergence;Computational efficiency;Integrated circuit synthesis;Stochastic processes;Circuit simulation;Bound2Bound;force-directed;half-perimeter wirelength (HPWL);Kraftwerk2;quadratic placement},
+  doi={10.1109/TCAD.2008.925783}
+}

libs/EXTERNAL/libcatch2

libs/libvtrutil/src/vtr_thread_pool.h

Lines changed: 159 additions & 0 deletions
@@ -0,0 +1,159 @@
+#pragma once
+
+/**
+ * @file vtr_thread_pool.h
+ * @brief A generic thread pool for parallel task execution
+ */
+
+#include <thread>
+#include <queue>
+#include <mutex>
+#include <condition_variable>
+#include <memory>
+#include <atomic>
+#include <functional>
+#include <cstddef>
+#include <vector>
+#include "vtr_log.h"
+#include "vtr_time.h"
+
+namespace vtr {
+
+/**
+ * A thread pool for parallel task execution. It is a naive
+ * implementation which uses a queue for each thread and assigns
+ * tasks in a round robin fashion.
+ *
+ * Example usage:
+ *
+ * vtr::thread_pool pool(4);
+ * pool.schedule_work([]{
+ *     // Task body
+ * });
+ * pool.wait_for_all(); // There's no API to wait for a single task
+ */
+class thread_pool {
+  private:
+    /* Thread-local data */
+    struct ThreadData {
+        std::thread thread;
+        /* Per-thread task queue */
+        std::queue<std::function<void()>> task_queue;
+
+        /* Threads wait on cv for a stop signal or a new task
+         * queue_mutex is required for condition variable */
+        std::mutex queue_mutex;
+        std::condition_variable cv;
+        bool stop = false;
+    };
+
+    /* Container for thread-local data */
+    std::vector<std::unique_ptr<ThreadData>> threads;
+    /* Used for round-robin scheduling */
+    std::atomic<size_t> next_thread{0};
+    /* Used for wait_for_all */
+    std::atomic<size_t> active_tasks{0};
+
+    /* Condition variable for wait_for_all */
+    std::mutex completion_mutex;
+    std::condition_variable completion_cv;
+
+  public:
+    thread_pool(size_t thread_count) {
+        threads.reserve(thread_count);
+
+        for (size_t i = 0; i < thread_count; i++) {
+            auto thread_data = std::make_unique<ThreadData>();
+
+            /* Capture the raw pointer by value: thread_data is moved from below. */
+            ThreadData* td = thread_data.get();
+            thread_data->thread = std::thread([td]() {
+                while (true) {
+                    std::function<void()> task;
+
+                    { /* Wait until a task is available or stop signal is received */
+                        std::unique_lock<std::mutex> lock(td->queue_mutex);
+
+                        td->cv.wait(lock, [td]() {
+                            return td->stop || !td->task_queue.empty();
+                        });
+
+                        if (td->stop && td->task_queue.empty()) {
+                            return;
+                        }
+
+                        /* Fetch a task from the queue */
+                        task = std::move(td->task_queue.front());
+                        td->task_queue.pop();
+                    }
+
+                    vtr::Timer task_timer;
+                    task();
+                }
+            });
+
+            threads.push_back(std::move(thread_data));
+        }
+    }
+
+    template<typename F>
+    void schedule_work(F&& f) {
+        active_tasks++;
+
+        /* Round-robin thread assignment */
+        size_t thread_idx = (next_thread++) % threads.size();
+        auto thread_data = threads[thread_idx].get();
+
+        auto task = [this, f = std::forward<F>(f)]() {
+            vtr::Timer task_timer;
+
+            try {
+                f();
+            } catch (const std::exception& e) {
+                VTR_LOG_ERROR("Thread %zu failed task with error: %s\n",
+                              std::hash<std::thread::id>{}(std::this_thread::get_id()), e.what());
+                throw; /* Escapes the worker thread, so std::terminate is called */
+            } catch (...) {
+                VTR_LOG_ERROR("Thread %zu failed task with unknown error\n",
+                              std::hash<std::thread::id>{}(std::this_thread::get_id()));
+                throw;
+            }
+
+            if (--active_tasks == 0) {
+                std::lock_guard<std::mutex> lock(completion_mutex); /* Avoids a lost wakeup in wait_for_all */
+                completion_cv.notify_all();
+            }
+        };
+
+        /* Queue new task */
+        {
+            std::lock_guard<std::mutex> lock(thread_data->queue_mutex);
+            thread_data->task_queue.push(std::move(task));
+        }
+        thread_data->cv.notify_one();
+    }
+
+    void wait_for_all() {
+        std::unique_lock<std::mutex> lock(completion_mutex);
+        completion_cv.wait(lock, [this]() { return active_tasks == 0; });
+    }
+
+    ~thread_pool() {
+        /* Stop all threads */
+        for (auto& thread_data : threads) {
+            {
+                std::lock_guard<std::mutex> lock(thread_data->queue_mutex);
+                thread_data->stop = true;
+            }
+            thread_data->cv.notify_one();
+        }
+
+        for (auto& thread_data : threads) {
+            if (thread_data->thread.joinable()) {
+                thread_data->thread.join();
+            }
+        }
+    }
+};
+
+} // namespace vtr
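
The design in this header (one queue per worker, round-robin assignment, and a counter that `wait_for_all` sleeps on) can be tried outside of VTR with a standalone sketch. `mini_pool` below is a hypothetical stand-in, not the committed class: it drops the logging and timing, takes `std::function<void()>` directly, assumes tasks do not throw, and guards the pending-task counter with the same mutex that the waiter sleeps on, which is a common way to rule out missed wakeups.

```cpp
#include <atomic>
#include <cassert>
#include <condition_variable>
#include <cstddef>
#include <functional>
#include <memory>
#include <mutex>
#include <queue>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical, simplified stand-in for the pool in this header.
class mini_pool {
    struct Worker {
        std::queue<std::function<void()>> tasks; // Guarded by m.
        std::mutex m;
        std::condition_variable cv;
        bool stop = false; // Guarded by m.
        std::thread thread;
    };

    std::vector<std::unique_ptr<Worker>> workers_;
    std::size_t next_ = 0;    // Round-robin cursor; scheduling thread only.
    std::size_t pending_ = 0; // Guarded by done_m_.
    std::mutex done_m_;
    std::condition_variable done_cv_;

    void run(Worker* w) {
        for (;;) {
            std::function<void()> task;
            {
                std::unique_lock<std::mutex> lock(w->m);
                w->cv.wait(lock, [w] { return w->stop || !w->tasks.empty(); });
                if (w->tasks.empty()) return; // Stop requested and queue drained.
                task = std::move(w->tasks.front());
                w->tasks.pop();
            }
            task(); // Assumed not to throw in this sketch.
            std::lock_guard<std::mutex> lock(done_m_);
            if (--pending_ == 0) done_cv_.notify_all();
        }
    }

  public:
    explicit mini_pool(std::size_t thread_count) {
        assert(thread_count > 0);
        for (std::size_t i = 0; i < thread_count; i++) {
            workers_.push_back(std::make_unique<Worker>());
            Worker* w = workers_.back().get(); // Stable address, captured by value.
            w->thread = std::thread([this, w] { run(w); });
        }
    }

    void schedule_work(std::function<void()> f) {
        {
            std::lock_guard<std::mutex> lock(done_m_);
            ++pending_;
        }
        Worker* w = workers_[next_++ % workers_.size()].get();
        {
            std::lock_guard<std::mutex> lock(w->m);
            w->tasks.push(std::move(f));
        }
        w->cv.notify_one();
    }

    void wait_for_all() {
        std::unique_lock<std::mutex> lock(done_m_);
        done_cv_.wait(lock, [this] { return pending_ == 0; });
    }

    ~mini_pool() {
        for (auto& w : workers_) {
            {
                std::lock_guard<std::mutex> lock(w->m);
                w->stop = true;
            }
            w->cv.notify_one();
        }
        for (auto& w : workers_) w->thread.join();
    }
};
```

A typical use schedules independent tasks that write to disjoint or atomic state and then calls `wait_for_all()` before reading the results. Round-robin assignment keeps scheduling cheap, but one slow task can leave work stranded behind it in that worker's queue, since nothing is stolen by idle workers.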

vpr/src/analytical_place/analytical_placement_flow.cpp

Lines changed: 2 additions & 1 deletion
@@ -131,7 +131,8 @@ static PartialPlacement run_global_placer(const t_ap_opts& ap_opts,
         return p_placement;
     } else {
         // Run the Global Placer
-        std::unique_ptr<GlobalPlacer> global_placer = make_global_placer(ap_opts.global_placer_type,
+        std::unique_ptr<GlobalPlacer> global_placer = make_global_placer(ap_opts.analytical_solver_type,
+                                                                         ap_opts.partial_legalizer_type,
                                                                          ap_netlist,
                                                                          prepacker,
                                                                          atom_nlist,
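
This hunk replaces one combined global-placer choice with two independent ones, so any solver can be paired with any partial legalizer. A minimal sketch of that split is shown below. The option strings and defaults come from the documentation change in this commit, but the enum, struct, and function names are hypothetical and do not reflect the actual VPR types.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>

// Hypothetical enums mirroring the two new command-line options.
enum class AnalyticalSolver { QpHybrid, LpB2B };
enum class PartialLegalizer { Bipartitioning, FlowBased };

// Maps a --ap_analytical_solver value to its enum.
AnalyticalSolver parse_analytical_solver(const std::string& s) {
    if (s == "qp-hybrid") return AnalyticalSolver::QpHybrid;
    if (s == "lp-b2b") return AnalyticalSolver::LpB2B;
    throw std::invalid_argument("unknown analytical solver: " + s);
}

// Maps a --ap_partial_legalizer value to its enum.
PartialLegalizer parse_partial_legalizer(const std::string& s) {
    if (s == "bipartitioning") return PartialLegalizer::Bipartitioning;
    if (s == "flow-based") return PartialLegalizer::FlowBased;
    throw std::invalid_argument("unknown partial legalizer: " + s);
}

// The two choices are stored separately; the defaults match the documented ones.
struct GlobalPlacerConfig {
    AnalyticalSolver solver = AnalyticalSolver::LpB2B;
    PartialLegalizer legalizer = PartialLegalizer::Bipartitioning;
};
```

With two values per option, this gives four solver and legalizer combinations, where the removed `--ap_global_placer` option offered only two fixed pairings.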
