
Commit c06800f

Merge pull request #1796 from aman26kbm/vtr_power_estimation
Modifications to the power estimation flow and documentation
2 parents e4351bb + 26259af commit c06800f

2 files changed, +42 -28 lines changed

doc/src/vtr/power_estimation/index.rst

Lines changed: 21 additions & 11 deletions
@@ -43,6 +43,16 @@ $VTR_ROOT/vtrflow/tech/*
 See :ref:`power_technology_properties` for information on how to generate an XML file for your own SPICE technology model.

+In this mode, VTR will run ODIN->ABC->ACE->VPR. The ACE stage is additional and specific to this power estimation flow. Using run_vtr_flow.py will automatically run ACE 2.0 to generate activity information and a new BLIF file (see :ref:`power_ace` for details).
+
+The final power estimates will be available in a file named <circuit_name>.power in the result directory.
+
+Here is an example command:
+
+.. code-block::
+
+    $VTR_ROOT/vtr_flow/scripts/run_vtr_flow.py ../benchmarks/verilog/diffeq1.v ../arch/timing/k6_frac_N10_frac_chain_depop50_mem32K_40nm.xml -power -cmos_tech ../tech/PTM_45nm/45nm.xml -temp_dir power_try_45nm
+
 VPR
 ~~~

@@ -133,7 +143,7 @@ where
 * ``<activities.act>``: Is the activity file to be created.
 * ``<new.blif>``: The new BLIF file.

-This will be functionally identical in function to the ABC blif; however, since ABC does not maintain internal node names, a new BLIF must be produced with node names that match the activity file.
+This will be functionally identical in function to the ABC blif; however, since ABC does not maintain internal node names, a new BLIF must be produced with node names that match the activity file. This BLIF file is fed to the subsequent parts of the flow (VPR). If run_vtr_flow.py is used (which runs ACE 2.0 underneath when the options mentioned earlier, such as -power, are given), the flow copies this ACE 2.0-generated BLIF file (<circuit_name>.ace.blif) to <circuit_name>.pre-vpr.blif and then launches VPR with the new file.

 Users may wish to use their own activity estimation tool.
 The produced activity file must contain one line for each net in the BLIF file, in the following format::
@@ -202,7 +212,7 @@ Other methods of estimation:
 ``specify-size``
-~~~~~~~~~~~~~~~~
+""""""""""""""""
 This estimation method provides a detailed transistor level modelling of CLBs, and will provide the most accurate power estimations.
 For each ``pb_type``, power estimation accounts for the following components (see :numref:`fig_power_sample_block`).

@@ -257,13 +267,13 @@ If necessary, the user can seperate a port into multiple ports with different wi
 For all child ``pb_types``, the algorithm performs a recursive call.
 Eventually ``pb_types`` will be reached that have no children.
 These are primitives, such as flip-flops, LUTs, or other hard-blocks.
-The power model includes functions to perform transistor-level power estimation for flip-flops and LUTs.
+The power model includes functions to perform transistor-level power estimation for flip-flops and LUTs. Note that, by default, the power model does not include power estimation for the single-bit adders commonly found in the logic blocks of modern FPGAs.
 If the user wishes to use a design with other primitive types (memories, multipliers, etc), they must provide an equivalent function.
-If the user makes such a function, the ``power_calc_primitive`` function should be modified to call it.
+If the user makes such a function, the ``power_usage_primitive`` function should be modified to call it.
 Alternatively, these blocks can be configured to use higher-level power estimation methods.

 ``auto-size``
-~~~~~~~~~~~~~
+""""""""""""""""
 This estimation method also performs detailed transistor-level modelling.
 It is almost identical to the ``specify-size`` method described above.
 The only difference is that the local wire capacitance and buffers are automatically inserted for all pins, when necessary.
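
The recursion over ``pb_types`` described in this hunk can be illustrated with a minimal, self-contained sketch. Everything in it is an assumption made for illustration: ``PbType``, ``PrimitiveEstimator``, ``estimate_power`` and ``demo_cluster_power`` are hypothetical names, the power values are placeholders, and none of this is VPR's actual data structure or the real ``power_usage_primitive`` function. It only shows the idea that non-primitive blocks sum their children, while primitives dispatch to a per-type estimator that the user may need to supply.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical stand-in for a pb_type node (not VPR's real type).
struct PbType {
    std::string name;
    std::string primitive_kind;    // e.g. "lut" or "ff"; unused for non-primitives
    std::vector<PbType> children;  // empty for primitives
};

// Hypothetical per-primitive estimator returning power in placeholder units.
using PrimitiveEstimator = std::function<double(const PbType&)>;

// Recursively estimate power: primitives dispatch to an estimator by kind,
// all other pb_types sum the estimates of their children.
double estimate_power(const PbType& pb,
                      const std::map<std::string, PrimitiveEstimator>& estimators) {
    if (pb.children.empty()) {
        auto it = estimators.find(pb.primitive_kind);
        if (it == estimators.end()) {
            // Mirrors the documented requirement: unknown primitive types
            // need a user-provided estimation function.
            throw std::runtime_error("no estimator for primitive kind: " + pb.primitive_kind);
        }
        return it->second(pb);
    }
    double total = 0.0;
    for (const PbType& child : pb.children) {
        total += estimate_power(child, estimators);
    }
    return total;
}

// Build a toy cluster with two BLEs (one LUT and one flip-flop each) and
// estimate it with placeholder per-primitive values.
double demo_cluster_power() {
    PbType lut{"lut4", "lut", {}};
    PbType ff{"ff", "ff", {}};
    PbType ble{"ble", "", {lut, ff}};
    PbType clb{"clb", "", {ble, ble}};

    std::map<std::string, PrimitiveEstimator> estimators;
    estimators["lut"] = [](const PbType&) { return 2.0; };  // placeholder value
    estimators["ff"] = [](const PbType&) { return 1.0; };   // placeholder value
    return estimate_power(clb, estimators);
}
```

In this sketch, supporting a new primitive type (for example a multiplier) means registering one more entry in ``estimators``; in the real code base the text above says the equivalent step is modifying ``power_usage_primitive``.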
@@ -274,7 +284,7 @@ This is equivalent to using the ``specify-size`` method with the ``wire_length=a
 Although not as accurate as user-provided buffer and wire sizes, it is capable of automatically capturing trends in power dissipation as architectures are modified.

 ``pin-toggle``
-~~~~~~~~~~~~~~
+""""""""""""""""
 This method allows users to specify the dynamic power of a block in terms of the energy per toggle (in Joules) of each input, output or clock pin for the ``pb_type``.
 The static power is provided as an absolute value (in Watts).
 This is done using the following construct:
@@ -304,7 +314,7 @@ It is assumed that the power usage specified here includes power of all child ``
 No further recursive power estimation will be performed.

 ``C-internal``
-~~~~~~~~~~~~~~
+""""""""""""""""
 This method allows the users to specify the dynamic power of a block in terms of the internal capacitance of the block.
 The activity will be averaged across all of the input pins, and will be supplied with the internal capacitance to the standard equation:

@@ -327,7 +337,7 @@ It is assumed that the power usage specified here includes power of all child ``
 No further recursive power estimation will be performed.

 ``absolute``
-~~~~~~~~~~~~
+""""""""""""""""
 This method is the most basic power estimation method, and allows users to specify both the dynamic and static power of a block as absolute
 values (in Watts).
 This is done using the following construct:
@@ -345,12 +355,12 @@ It is assumed that the power usage specified here includes power of all child ``
 No further recursive power estimation will be performed.

 Global Routing
---------------
+~~~~~~~~~~~~~~

 Global routing consists of switch boxes and input connection boxes.

 Switch Boxes
-~~~~~~~~~~~~
+""""""""""""""""

 Switch boxes are modelled as the following components (:numref:`fig_power_sb`):

@@ -389,7 +399,7 @@ The user may override this method by providing the buffer size as shown below:
 The size is the drive strength of the buffer, relative to a minimum-sized inverter.

 Input Connection Boxes
-~~~~~~~~~~~~~~~~~~~~~~
+""""""""""""""""""""""

 Input connection boxes are modelled as the following components (:numref:`fig_power_cb`):

vpr/src/power/power_sizing.cpp

Lines changed: 21 additions & 17 deletions
@@ -79,24 +79,28 @@ static double power_count_transistors_connectionbox() {
     auto& power_ctx = g_vpr_ctx.power();

     auto type = find_most_common_block_type(device_ctx.grid);
-    VTR_ASSERT(type->pb_graph_head->num_input_ports == 1);
-    inputs = type->pb_graph_head->num_input_pins[0];
-
-    /* Buffers from Tracks */
-    buffer_size = power_ctx.commonly_used->max_seg_to_IPIN_fanout
-                  * (power_ctx.commonly_used->NMOS_1X_C_d
-                     / power_ctx.commonly_used->INV_1X_C_in)
-                  / power_ctx.arch->logical_effort_factor;
-    buffer_size = std::max(1.0F, buffer_size);
-    transistor_cnt += power_ctx.solution_inf.channel_width
-                      * power_count_transistors_buffer(buffer_size);
-
-    /* Muxes to IPINs */
-    transistor_cnt += inputs
-                      * power_count_transistors_mux(
-                          power_get_mux_arch(power_ctx.commonly_used->max_IPIN_fanin,
-                                             power_ctx.arch->mux_transistor_size));

+    //For each port on the most common block, look at the number of
+    //input pins this port has and estimate the transistor count based
+    //on the size muxes that drive these input pins.
+    for (int i = 0; i < type->pb_graph_head->num_input_ports; i++) {
+        inputs = type->pb_graph_head->num_input_pins[i];
+
+        /* Buffers from Tracks */
+        buffer_size = power_ctx.commonly_used->max_seg_to_IPIN_fanout
+                      * (power_ctx.commonly_used->NMOS_1X_C_d
+                         / power_ctx.commonly_used->INV_1X_C_in)
+                      / power_ctx.arch->logical_effort_factor;
+        buffer_size = std::max(1.0F, buffer_size);
+        transistor_cnt += power_ctx.solution_inf.channel_width
+                          * power_count_transistors_buffer(buffer_size);
+
+        /* Muxes to IPINs */
+        transistor_cnt += inputs
+                          * power_count_transistors_mux(
+                              power_get_mux_arch(power_ctx.commonly_used->max_IPIN_fanin,
+                                                 power_ctx.arch->mux_transistor_size));
+    }
     return transistor_cnt;
 }
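
The effect of replacing the single-port assertion with a per-port loop can be seen in a small standalone sketch. It is not the VPR code: ``BlockPorts``, ``transistors_per_buffer``, ``transistors_per_mux`` and ``count_connection_box_transistors`` are hypothetical names, and the constant costs are placeholders standing in for ``power_count_transistors_buffer`` and ``power_count_transistors_mux``, whose real results depend on architecture data. The sketch follows the structure of the new loop, where both the track buffer term and the IPIN mux term are accumulated once per input port.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for the most common block type: one entry per
// input port, holding that port's number of input pins.
struct BlockPorts {
    std::vector<int> input_pins_per_port;
};

// Placeholder cost models (assumed constants, not VPR's real sizing).
double transistors_per_buffer() { return 4.0; }
double transistors_per_mux() { return 10.0; }

// Accumulate the connection box transistor estimate over every input port,
// mirroring the shape of the loop introduced in the diff above.
double count_connection_box_transistors(const BlockPorts& block, int channel_width) {
    double transistor_cnt = 0.0;
    for (int inputs : block.input_pins_per_port) {
        // Buffers from tracks: one buffer per routing track.
        transistor_cnt += channel_width * transistors_per_buffer();
        // Muxes to IPINs: one mux per input pin on this port.
        transistor_cnt += inputs * transistors_per_mux();
    }
    return transistor_cnt;
}
```

With these placeholder costs, a block whose input ports have 4 and 2 pins and a channel width of 10 no longer trips an assertion; each port contributes its own buffer and mux terms to the total.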
