Skip to content

Equivalent Tiles placement #559

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 7 commits into from

Conversation

acomodi
Copy link
Collaborator

@acomodi acomodi commented May 8, 2019

This PR introduces two main changes:

  • Move all top level pb_type specific tags to tile tags in the architecture XML. This basically makes all pb_types in the <complexblocklist> of two types: normal and primitive.
  • Implement Equivalent Tiles placement capability. This means that, if the <tile> tag in the XML specifies possible equivalent tiles, these can be used during placement.

These changes still need refinement and a review stage to check the direction of this work.

Description

Tile tag

The change related to the tile tags is needed to clearly separate the concept of pb_type and top level pb_type. A top level pb_type, in fact, is a physical configurable block of the FPGA that has to be treated differently from the lower level pb_types.
In order to do so, a new tag has been added to the XML, namely the tiles tag. It contains all the information relative to the ex-top level pb_types (e.g. fc, pinlocations, switchblock_locations).
This change is addressed in the read_xml_arch_file.cpp and physical_types.h.
The current PR is going to fail on CI as the current test architectures do not include the <tiles> tag.

Equivalent Tile placement

The second change addresses the Equivalent Tiles placement.
What happens is that the placement stage now does not rely strictly on the clusterer decisions: each block belongs to a top level pb_type (now called just tile), which can be swapped with a corresponding equivalent during placement (if equivalent tiles are present).
There are checks that do ensure that, during the swap stage, blocks are placed only in correct locations (same tiles, or equivalent tiles).

When the placement has finished, the clustering information need to change, as it could happen that a CLB (Clustered Block) that refers to a specific type (e.g. SLICEL), is placed in an equivalent tile (e.g. SLICEM). This situation generates conflicts during the routing stage, as, even being equivalent, the two tiles have different pin configurations.

To address this issue, the clustering context has to be changed with the information about the new pin mappings for the blocks that have been swapped in equivalent tiles. This happens in the background, therefore there was no need to apply changes at routing stage.

Related Issue

The issue that is addressed with this PR is: #513

Motivation and Context

The motivation of moving the top level pb_type definition to another location in the architecture definition is to have a clearer separation on the two concepts of top level pb_type and normal/primitive pb_type.

Instead, the Equivalent Tiles feature is an improvement in VPR. With the ability to use equivalent locations, placer can choose for a better solution when trying to find the best placement, reducing its final cost.
Moreover, this solution can overcome the fact that the packer is not aware of the possibility to use an equivalent tile (and it should correctly not be aware of that) and the job is left to the placer. In fact, the packer may always choose to implement a cluster using a certain tile instead of another, even though there are other tiles that could be used.
In this case, there could be designs which would consume more resources (tiles) than the available ones and the placer, if allowed, could place the exceeding blocks in the equivalent locations.

How Has This Been Tested?

There is a very initial implementation of a test which was needed during development. It should be made more robust.

Types of changes

  • Bug fix (change which fixes an issue)
  • New feature (change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to change)

Checklist:

  • My change requires a change to the documentation
  • I have updated the documentation accordingly
  • I have added tests to cover my changes
  • All new and existing tests passed

@probot-autolabeler probot-autolabeler bot added lang-cpp C/C++ code libarchfpga Library for handling FPGA Architecture descriptions tests VPR VPR FPGA Placement & Routing Tool VTR Flow VTR Design Flow (scripts/benchmarks/architectures) labels May 8, 2019
@probot-autolabeler probot-autolabeler bot added infra Project Infrastructure lang-perl Perl code lang-python Python code scripts Utility & Infrastructure scripts labels May 16, 2019
@acomodi acomodi force-pushed the equivalent-tiles branch from cb782c1 to f069dce Compare May 16, 2019 11:30
@acomodi acomodi changed the title WIP [DNM]: Equivalent Tiles placement WIP: Equivalent Tiles placement May 16, 2019
@probot-autolabeler probot-autolabeler bot added lang-shell Shell scripts (bash etc.) Odin Odin II Logic Synthesis Tool: Unsorted item labels May 16, 2019
@acomodi
Copy link
Collaborator Author

acomodi commented May 16, 2019

Hi @kmurray @vaughnbetz,
as reported in #513 (comment), this is the PR with the equivalent tiles placement.

I have added a python script to modify all the test architecture to move the top level-only pb_type tags/attribute to the new tiles tags. This way CI passes on all regression tests.

@kmurray
Copy link
Contributor

kmurray commented May 16, 2019

@acomodi There is actually already a script we use for upgrading architecture files. Can you add your steps to upgrade as another pass there?

@acomodi
Copy link
Collaborator Author

acomodi commented May 16, 2019

@kmurray Sure, I can add the procedure in there. Should I also run the script and push all the modified architectures as well?
Because the problem is that, for now, I have modified the test scripts to apply a temporary upgrade of the architectures to let CI pass with the new tiles format.
Changes are here:

What they do is generating the architecture with the tiles tag and feed it to VPR.

@acomodi
Copy link
Collaborator Author

acomodi commented May 17, 2019

Hi @kmurray,

I have run two test designs with and without Equivalent Placement (EP).

Without EP (SymbiFlow@fa93ec2):

PicoSoC Murax
Initial placement cost 4260.92 1420.89
Final placement cost 753.324 171.815

With EP (antmicro@635a591):

PicoSoC Murax
Initial placement cost 4260.92 1420.89
Final placement cost 485.257 160.581

A note to be said is that there was timing placement has not been used, only bounding box has been adopted.

@vaughnbetz
Copy link
Contributor

That certainly looks promising.
Can you run the new code on the VTR benchmarks and the Titan benchmarks and report QoR? Should be the same, except CPU time, which hopefully is also essentially the same. I want to check that nothing bad has happened to the flow with no placement equivalent blocks.

@mithro
Copy link
Contributor

mithro commented May 17, 2019

@vaughnbetz - Just FYI, I created two issues about getting the two benchmarks that @acomodi mentioned into the normal VtR benchmarks. You can find them at #584 and #582 .

@acomodi
Copy link
Collaborator Author

acomodi commented May 28, 2019

Hi @vaughnbetz @kmurray,
Below there is an initial vtr-qor comparison between this PR and master.

  • master revision: f1f6b71 (it is dirty as I had a non-relative and uncommited change in the master branch)
  • changes: ce7b7d7

Placement time is increased by ~3% even without equivalent tiles. I think it can be easily taken down to be even with master.

For what regards routing I am unsure why there is an increase in the time. I need to investigate whether my changes do affect placement even if there are no equivalent tiles.

If you could provide some initial feedback with a review it would be helpful to see if this PR is going in the right direction.

Comparison file: comparison-place-vtr-qor.xlsx

@vaughnbetz
Copy link
Contributor

For the QoR results:
routed wirelength and critical path delay are unchanged, which is good. This strongly implies the placement is the same, as most circuits have identical results on these metrics with and without the change.

Placement time is up a bit as you noted (+3.8%). Would be good to get back if possible.

The route time does look to be up consistently. I'm not sure why -- I also wouldn't have expected this change to affect route time. Given the quality (critical path delay, routed wirelength) matches, it isn't that the router is facing a more difficult problem -- it is just taking longer. The router doesn't look at the grid very much so hopefully profiling the code would show where the new hot spot is.

I'd also run the Titan benchmarks to see if they show the same behaviour.

I haven't looked at the code yet and won't have time to for a few days anyway.

@acomodi
Copy link
Collaborator Author

acomodi commented May 30, 2019

Hi @vaughnbetz,
Ok, I will proceed by running the Titan then, and try to getting the placer timings even with master than.

@acomodi acomodi force-pushed the equivalent-tiles branch 3 times, most recently from 945cf51 to f5ef6c4 Compare June 1, 2019 12:31
@acomodi
Copy link
Collaborator Author

acomodi commented Jun 1, 2019

@vaughnbetz, I have ran the TItan benchmarks and got the results. I will paste also the parsed results as there were some failures on 2 tests:

  • LU230 failed on basiline with error code 134;
  • gaussian failed on changes with error code 137.

Both of them seemed to be failing during placement. I need to check if my changes are somehow affecting non-equivalent tiles, but the fact that LU230 passed with my changes and failed on baseline is curious. I need to investigate the reasons behind this.

Comparison file: comparison-titan.xlsx

parse_results baseline: baseline.txt
parse_results changes: changes.txt

Run-times (excluding the two failed experiments) seems to be even between changes and baseline

@vaughnbetz
Copy link
Contributor

The QoR results look good, except for the two failures which are both placement failures.
Can you attach the full vpr output logs for them?

@acomodi acomodi changed the title WIP: Equivalent Tiles placement Equivalent Tiles placement Jun 6, 2019
@vaughnbetz
Copy link
Contributor

QoR looks good; thanks.
@kmurray I think this is in your hands now to

  1. check you're aligned with the overall architecture specification approach and
  2. that the vtr 8.0 tagging is done as this is big enough that the plan is for it to be post-8.0

@acomodi acomodi force-pushed the equivalent-tiles branch from f5ef6c4 to e1fbaef Compare June 24, 2019 09:51
@acomodi
Copy link
Collaborator Author

acomodi commented Jun 25, 2019

@kmurray Ping

@acomodi acomodi force-pushed the equivalent-tiles branch from e1fbaef to bbc3140 Compare July 1, 2019 09:33
@acomodi acomodi force-pushed the equivalent-tiles branch from bbc3140 to 2c4fb1e Compare July 16, 2019 09:12
@kmurray
Copy link
Contributor

kmurray commented Jul 30, 2019

@acomodi Can you update this PR to be in sync with master?

I'll try to give a full review soon.

acomodi added 5 commits July 31, 2019 10:21
This commit introduces two major features:
- introduce the tile concept in the architecture XML. Top level pb_type
is discarded and all the top level pb_type information are moved into
the tile tags. This will cause CI build to fail as all the architectures
do not include the tiles tag right now.

- introduce the possibility to place blocks in equivalent tiles (SLICEL
blocks into SLICEM ones). According to the XML architecture description
there could be tiles equivalent to others that can be used during the
placement step (this can bring to better placement solutions)

Signed-off-by: Alessandro Comodi <[email protected]>
I have also changes travis.yml to install the lxml python package
needed by the script

Signed-off-by: Alessandro Comodi <[email protected]>
Signed-off-by: Alessandro Comodi <[email protected]>
Signed-off-by: Alessandro Comodi <[email protected]>
@acomodi acomodi force-pushed the equivalent-tiles branch from 2c4fb1e to 2daf153 Compare July 31, 2019 08:42
@acomodi
Copy link
Collaborator Author

acomodi commented Jul 31, 2019

@kmurray Done, CI is currently failing as I removed my changes in the ODIN_II testing script which modified the architecture to allow having the required tiles tag.
There is also an issue with the C++ tests, presumably for the same reason.

acomodi added 2 commits July 31, 2019 13:36
Signed-off-by: Alessandro Comodi <[email protected]>
Signed-off-by: Alessandro Comodi <[email protected]>
@acomodi acomodi force-pushed the equivalent-tiles branch from c40cb50 to 4f59266 Compare July 31, 2019 14:10
Copy link
Contributor

@kmurray kmurray left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @acomodi for putting this together. I think this is an important feature to get into VPR (its been requested by a variety of users over the years).

I've included detailed inline comments below. However I think there are some high-level issues which will need to be reworked before this can be merged, which I outline below. The main one is the distinction between the logical and physical type of a clustered block.

If you have thoughts/comments/questions we can discuss them to come up with a plan to move forward on this.

Logical vs Physical Types
I think we need to make a much clearer distinction between the 'logical block type', and the 'physical tile type'. At the moment they are both combined in t_type_ptr which I think is quite confusing and difficult to understand, particularly now that this PR defines equivalencies between them.

This is somewhat a historical artifact of how this code has evolved. Before this PR there was no distinction between these (logical block type and physical tile were the same). However now that we are making them non-identical I think it is important to distinguish between them explicitly.

More concretely, we need to split out the physical tile information into a separate data structure:

  • Pin-out to RR graph/Pin locations/Fc/switchblock locations
  • width/height/capacity/area
    We should probably call this something like t_physical_tile.

That would leave t_type_ptr to represent the 'logical' type of a block produced by the packer, containing:

  • pb_type
  • pb_graph_head

The current approach of keeping everything in the t_type_ptr and making some t_type_ptrs 'equivalent' to others is confusing, and results in some complexity when asking what the 'type' of a block in the netlist is. I think separating them out makes the dependencies between placement locations and logical types much clearer.

Location of equivalent type state
Currently in this PR, which tile type is physically implementing a particular block is stored in the ClusteredNetlist. The general goal of the ClusteredNetlist is to represent the logical state of the clustered netlist, and not to store information about placement decisions (i.e. which tile type is implementing a particular logical block). This is related to the logical/physical distinction discussed above.

If we split out the logical/physical types into separate data structures (as discussed above), I think we should keep clb_nlist.block_type() returning the logical block type of a block in the netlist.
The information about what physical tile type is currently used to implement that block should be moved to the placement context.

XML Format
As discussed here we probably want to re-use <direct> to describe the pin mappings between tiles and types. Currently you use <map>, which is functionally the same as <direct> but with a different name. I think we should just use <direct> to reduce the cognitive load on end users and re-use that existing concept.

Also (as discussed in the comments), I think using <site> instead of <mode> more clearly communicates what is being modelled.


"""
This script is intended to modify the architecture description file to be compliant with
the new format.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is actually already a script for upgrading architecture files (vtr_flow/script/upgrade_arch.py). This should be moved into that script (as another pass).

arch_name=$(basename ${arches%.*})

TEST_FULL_REF="${bench_name}/${circuit_name}/${arch_name}"
DIR="${NEW_RUN_DIR}/${TEST_FULL_REF}"
mkdir -p $DIR

arch_cmd=""
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As noted below, we should put this upgrade step in upgrade_arch.py and then upgrade all of the architecture files in the VTR source tree. That should avoid the need to do this type of on-the-fly architecture file modification (which would appear rather mysterious to end users).

int num_equivalent_tiles = 0;
std::unordered_map<int, t_type_descriptor*> equivalent_tiles; /* [0..num_equivalent_tiles-1] */
std::unordered_map<int, std::unordered_map<int, int>> equivalent_tile_pin_mapping; /* [0..num_equivalent_tiles-1][0..num_pins-1] */
std::unordered_map<int, std::unordered_map<int, int>> equivalent_tile_inverse_pin_mapping; /* [0..num_equivalent_tiles-1][0..num_pins-1] */
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Keeping equivalent bi-directional mappings like this (i.e. equivalent_tile_pin_mapping and equivalent_tile_inverse_pin_mapping) in sync requires special care. A common source of future bug is if the mapping is updated in one direction but not the other.

We should consider using vtr::bimap for this purpose as it provides an interface which ensures the forward and inverse mappings are always consistent (you can see its usage for this purpose here for atom related lookups).

@@ -595,6 +609,13 @@ struct t_type_descriptor /* TODO rename this. maybe physical type descriptor or
t_pb_type* pb_type = nullptr;
t_pb_graph_node* pb_graph_head = nullptr;

/* Equivalent tiles information */
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My initial intuition is that the equivalency information should really be part of a different data structure. That is we should separate out the physical 'tile' and logical block 'type' parts.

For instance a 'tile' would specify:

  • A set of pins
  • Fc
  • Capacity
  • height/width
  • area

A block type woud specify:

  • The top-level pb_type/pb_graph_head

/* Returns the indices of pins that contain a clock for this physical logic block */
std::vector<int> get_clock_pins_indices() const;

/* Returns a boolean set to True if the input index belongs to an available tile, False otherwise */
bool is_available_tile_index(int index_to_check) const;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is_available is a confusing name (e.g. available in current placement? legal? not clear). I think what you probably mean is is_valid or is_legal.

</tile>
<tile name="BLK_IG-SLICEL">
<equivalent_tiles>
<mode name="BLK_IG-SLICEM">
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would much prefer the naming and structure suggested in #513:

    <tile name="MLAB_tile" width="1" height="1" capacity="1" area="XXX">
        <input name="inputs" num_pins="50"/>
        <input name="clk" num_pins="2"/>
        <output name="outputs" num_pins="50"/>
        <pinlocations ...>
        <fc ...>
        <equivalent_sites> <!-- NOTE: each site must specify it's pin mapping -->
            <site pb_type="LAB">
                <direct from="MLAB_tile.inputs" to="LAB.inputs"/> <!-- Note LAB inputs are equivalent -->
                <direct from="MLAB_tile.clk" to="LAB.clk"/>
                <direct from="LAB.outputs" to="MLAB_tile.outpus"/>
            </site>
            <site pb_type="MLAB">
                <direct from="MLAB_tile.inputs[19:0]" to="MLAB.addr"/> <!-- Note MLAB inputs are not equivalent -->
                <direct from="MLAB_tile.inputs[29:20]" to="MLAB.data_in"/>
                <direct from="MLAB_tile.clk" to="MLAB.clk"/>
                <direct from="MLAB.data_out" to="MLAB_tile.outputs[9:0]"/>
            </site>
        </equivalent_sites>
    </tile>

The main points are as follows:

  1. Explicit pin listing
    By explicitly listing the set of pins we explicitly define the external routing interface of the tile (this aligns with what I discussed earlier about making the logical block type and physical tile types distinct, otherwise calling things 'equivalent when they really may not be is confusing). It also makes it straight forward to specify a superset of pins which may not exist on any particular pb_type.

  2. Use site instead of mode.
    Mode has a very specific meaning within the <pb_type> hierarchy that I don't think we want to bring into <tiles> (e.g. no hierarchical modes). I think 'site' captures how we are modelling this (i.e. a location the placer can put something), rather than a particular mode (which as packing connotations). While there may not always be a perfect distinction between packing/placement decisions in the actual implementation, it is a distinction we make in the code, so I think it makes sense to have the reflected here.

  3. Use direct instead of map
    Functionally your <map> tags are acting as <direct>s. To avoid overloading the user writing these files with yet another concept its probably better to re-use the <direct> concept and syntax. However we would only want to support <direct> (not <complete> or <mux>).

@@ -430,7 +430,10 @@
#system "cp $odin2_base_config"

my $architecture_file_path_new = "$temp_dir$architecture_file_name";
copy( $architecture_file_path, $architecture_file_path_new );
my $ret = `$vtr_flow_path/scripts/add_tiles.py --arch_xml $architecture_file_path > $architecture_file_path_new`;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As mentioned elsewhere this kind of on the fly architecture conversion will appear quite mysterious to end users.

We should put the architecture file upgrade code in upgrade_arch.py and then upgrade all the checked-in architectures.


# Script parameters
#script_params=""
script_params = -track_memory_usage -lut_size 1 -starting_stage vpr
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a particular reason for using -lut_size 1?

Type->height = get_attribute(Node, "height", loc_data, OPTIONAL).as_uint(1);
Type->area = get_attribute(Node, "area", loc_data, OPTIONAL).as_float(UNDEFINED);

if (atof(Prop) < 0) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should be Type->area < 0?

@@ -2638,7 +2656,7 @@ static void ProcessComplexBlocks(pugi::xml_node Node,
/* Alloc the type list. Need one additional t_type_desctiptors:
* 1: empty psuedo-type
*/
*NumTypes = count_children(Node, "pb_type", loc_data) + 1;
*NumTypes = count_children(Node, "tile", loc_data) + 1;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Probably also want a call to expect_children since only 'tile' tags are expected

@acomodi
Copy link
Collaborator Author

acomodi commented Aug 1, 2019

Thanks for the detailed feedback @kmurray.
As you pointed out, one of the major issues right now is the distinction between physical and logical types. I think that the best approach moving forward is to focus firstly on this part.

To do so, I believe that the first thing to do is to forget for a while about equivalent placement, and start digging into the code and well-separate the two entities (physical/logical).

I think that this could be done by initially creating an identical structure to t_type_ptr (as you suggested t_physical_tile_ptr) and rename also t_type_ptr into sth like t_logical_tile_ptr.
Initially both the tile_ptr structures will be the same and they would be adopted in their respective areas. Once everything is stable there will be a next phase in which both the data structures will be pruned to eliminate all the unnecessary members (for instance capacity, area and so on will be deleted from the logical type).
This should leave us with a the wanted separation of the physical and logical tiles entities.

Once this first step is achieved, the equivalence of the tiles becomes the new focus and will be applied to the physical tiles only.

I am going to first tackle your inline review comments and then proceed with the physical/logical separation. Do you think this is the right way to go?

@kmurray
Copy link
Contributor

kmurray commented Aug 1, 2019

As you pointed out, one of the major issues right now is the distinction between physical and logical types. I think that the best approach moving forward is to focus firstly on this part.

I agree focusing on this first is the right approach, and your proposed approach of splitting the types incrementally seems reasonable.

I'd probably use the terms 'physical tile', and 'logical block' to make the distinctions even clearer and add 'type' to both to clarify that they reflect types of things rather than individual tile/block instances (e.g. t_physical_tile_type, t_logical_block_type).

I am going to first tackle your inline review comments and then proceed with the physical/logical separation. Do you think this is the right way to go?

I'll leave this up to you judgment. I think the re-achitecting of the code will obsolete some of the comments so your call whether to fix them first or after. Others could be addressed either now or later.

@acomodi
Copy link
Collaborator Author

acomodi commented Aug 16, 2019

@kmurray I have a question on the RR graph Pin-out: Where and how is this information processed?
My concern is on how to relate the pins of a logic block to the actual pins of the physical tile.

@kmurray
Copy link
Contributor

kmurray commented Aug 16, 2019

@acomodi Thats a good question, and gets to the heart of the logical vs physical discussion.

Current
Currently this information is stored in the t_type_descriptor::pinloc, which is initialized during pb graph construction from the architecture specification.

During RR graph construction, pin class information (t_type_descriptor::class_inf) is used to construct the appropriate SOURCE/SINK RR nodes. Which are then hooked up to IPIN/OPIN RR nodes according to their location. See for instance the creation of connections between SOURCE/SINK and IPIN/OPINs here.

Future
For the approach we want to take (splitting apart the logical block pins from the physical tile pins) we will need the following changes:

  1. Drive RR graph IPIN/OPIN construction off the physical tile description (pin location, Fc etc).
  2. Drive RR graph SOURCE/SINK construction off of the logical type pin descriptions
  3. The connectivity between the created SOURCE/SINK and IPIN/OPIN is then specified via the <direct>s in the new <tile> tags.

In terms of how best to do that, here are my thoughts:

  • At each physical tile, we should create a unique set of SOURCE/SINK nodes for each logical type supported (i.e. <site> in the specification, so for example a set of SOURCE/SINK nodes for the LAB site, and another set for the MLAB site in the example specification above) according the the relevant <pb_type>s logical pin specifications (accounting for things like pin equivalence).
  • Each physical tile will then have a single set of pins (IPIN/OPIN) derived from <tile>'s input/output specifications.
  • The per-site <direct> specifications in the <tile> specify which edges should be created between a tiles IPIN/OPINs and the site's SOURCE/SINKs.

With that RR graph structure we can build a single RR graph and control whether the router is routing to one site (e.g LAB) or the other (e.g. MLAB) simply by changing the net's source/sink RR node targets, depending on which site was picked during placement (i.e. by changing device_ctx.net_rr_terminals).

@acomodi
Copy link
Collaborator Author

acomodi commented Aug 19, 2019

@kmurray Thanks for the comment. I think I understood the strategy that we are going to adopt, I still need to figure out how to best implement it practically.

At the moment I am still struggling with the first step of splitting t_type_descriptor. The main problem I have stumbled upon is the fact that it seems that logical and physical types seem to be highly entangled and I am working on how to best split them without affecting functional behavior as well as run-time.
For example (this is just one of many), I assume (correct me if I am wrong), that the device grid only contains physical_tile_types, but, in many cases, the grid is used to retrieve the pb_graph_node of a specific location (this should happen only after placement). To solve this issue I am thinking to use the placement information instead (e.g. by adding a new member to place_ctx.grid_blocks[x][y]).

One more question is: does it sound good to you to add the t_logical_block_type_descriptors in the device_ctx alongside with the the physical_tiles?
It may be a trivial question, but the fact is that I am unsure of where is the best location to retrieve the logical block types information needed in many of the steps of vpr.
For instance, it happens several times that the pb_type or pb_graph_head is retrieved from the device_ctx list of t_type_descriptors. Having another list containing all the logical blocks may come in hand.

@kmurray
Copy link
Contributor

kmurray commented Aug 19, 2019

Thanks for following up @acomodi.

I think it makes the most sense to tackle this incrementally.

1. Functionality preserving refactoring (no new capabilities)
Initially I'd recommend focusing on just refactoring the data structures to split the logical/physical info into separate structs, and not consider equivalencies yet. That is, assume we still have a one-to-one mapping from logical block type to physical tile types, something like:

struct t_logical_block_type {
    //Like t_type_descriptor but without pinlocation etc info
    //...
   
    //Index into device_ctx.tile_types (one-to-one mapping for now)
    int physical_tile_type_index;
};

struct t_physical_tile_type {
    //Contains the 'interface' data like pinlocations etc. from t_type_descriptor
    //...

   //Index into device_ctx.block_types (one-to-one mapping for now)
   int logical_block_type_index
};

struct DeviceContext {
    std::vector<t_logical_block_type> block_types;
    std::vector<t_physical_tile_type> tile_types;
};

It is then straightforward to look-up the associated tile/block types:

//Initialize physical_tile_a and block_type_b somehow (e.g. from netlist, or from grid)
t_physical_tile_type* tile_type_a = device_ctx.grid[x][y].type
t_logical_block_type* block_type_b = clb_nlist.block_type(blk_b);

//The logical block type of tile_type_a
t_logical_block_type* block_type_a = &device_ctx.block_types[tile_type_a.logical_block_type_index];

//The physical tile type of block_type_b
t_physical_tile_type* tile_type_b = &device_ctx.tile_types[block_type_b.physical_tile_type_index];

Of course it would probably be better to hide the details of this data structure tracing in subroutines (to avoid scattering it all through the code-base):

t_logical_block_type* block_type_a = logical_block_type(tile_type_a);
t_physical_tile_type* tile_type_b = physical_tile_type(block_type_b);

Since a common case would be going from a ClusterBlockId -> t_physical_tile_type, it would make sense to also encapsulate this:

t_physical_tile_type* physical_tile_type(ClusterBlockId blk) {
    t_logical_block_type* blk_type = cluster_ctx.clb_nlist.block_type(blk);
    return physical_tile_type(blk_type);
}

At this point we would have seperated the logical/physical information, but from an external perspective the behaviour of VPR should be identical (no changes to architecture format, algorithms, result quality etc.).

2. Extension to equivalent placement sites
Then in the future (with equivalent placement site support) this can be extended to do the look-up based on placement information (since there is no longer a 1:1 mapping):

t_physical_tile_type* physical_tile_type(ClusterBlockId blk) {
    //Use the placement_ctx to find out in which physical tile type
    //blk has been placed, and return its type.
   auto loc = place_ctx.block_loc[blk];
   return place_ctx.grid[loc.x][loc.y].type;
}

@acomodi
Copy link
Collaborator Author

acomodi commented Aug 20, 2019

@kmurray Sounds very good to me.
I was thinking also to create another PR with the first step only, as it will affect quite a lot source files of the code-base. Therefore, having both the equivalent placement sites and the structural separation changes highly increases complexity of this PR.

IMHO, even being the two changes (equivalent placement and physical/logical separation) highly related, it would be better to separate them as the first one does not depend on the equivalent tiles placement, while the opposite is not true.

@kmurray
Copy link
Contributor

kmurray commented Aug 20, 2019

I was thinking also to create another PR with the first step only

Yep that makes sense to me. It'll help keep things more contained and also make it easier to review.

@acomodi
Copy link
Collaborator Author

acomodi commented Oct 4, 2019

Closing this PR as it is superseded by #988

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
infra Project Infrastructure lang-cpp C/C++ code lang-perl Perl code lang-python Python code lang-shell Shell scripts (bash etc.) libarchfpga Library for handling FPGA Architecture descriptions Odin Odin II Logic Synthesis Tool: Unsorted item scripts Utility & Infrastructure scripts tests VPR VPR FPGA Placement & Routing Tool VTR Flow VTR Design Flow (scripts/benchmarks/architectures)
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants