fix `run_vtr_task.py` getting stuck if an input has an error #1950

sdamghan · 2022-01-04T19:55:22Z

Description

The run_vtr_flow.py script gets stuck on calling get_memory_usage(logfile) when the logfile does not exist. This PR fixes the bug, in addition to showing the maximum memory of all VTR flow stages instead of only the VPR stage in the final output.

Related Issue

Issue #1949

Motivation and Context

If an input file has errors and the VTR front-ends fail, the run_vtr_task script gets stuck and the terminal does not come back to the prompt

How Has This Been Tested?

Types of changes

Bug fix (change which fixes an issue)
New feature (change which adds functionality)
Breaking change (fix or feature that would cause existing functionality to change)

Checklist:

My change requires a change to the documentation
I have updated the documentation accordingly
I have added tests to cover my changes
All new and existing tests passed

sdamghan · 2022-01-06T15:17:07Z

@vaughnbetz - this PR is about a minor change to the run_vtr_flow.py script. According to the related issue, the script failed to exit and return to prompt in a terminal if an input design has an error. This was because the get_memory_usage routine was specifically looking for the "vpr.out" while no such file has been generated. I have added a helper routine to first look at all VTR stages output files, and then choose the maximum RSS among all of them (not only vpr).
Would you mind letting me know if you have any comments/reviews on the new code?

vaughnbetz · 2022-01-06T15:30:10Z

vtr_flow/scripts/run_vtr_flow.py

@@ -525,10 +561,10 @@ def vtr_command_main(arg_list, prog=None):
        seconds = datetime.now() - start

        print(
-            "{status} (took {time}, vpr run consumed {max_mem} memory)".format(
+            "{status} (took {time}, {stage[0]} run consumed {stage[1]} memory)".format(


Good to say this was the overall peak stage in this message.

vaughnbetz · 2022-01-06T15:31:54Z

The code looks good.
The new code prints the maximum memory of any stage, and which stage hit that peak, which is probably a good improvement vs. the old code that printed the maximum memory of vpr all the time. However, we should check it doesn't break any of our qor parsing, where we usually check maximum vpr memory. 1

Can you check that the maximum vpr memory will always be parsed out properly (i.e. it's taken from the vpr.out file, rather than from the run_vtr_task output)? If it isn't going to be parsed properly, we should print the peak memory of all stages, and hte overall maximum, so we can still reliably parse vpr peak memory in QoR runs.
I think with this new feature it would be good to add overall peak memory for the entire run and which stage had the peak into the QoR parsing scripts, so we could start using that in regtests and QoR checks. Not essential in this PR, but if it's not hard I think it would be good to add right now while we're thinking of it.

… not generated - show the maximum memory usage of all VTR stages in the run_vtr_flow output Signed-off-by: Seyed Alireza Damghani <[email protected]>

sdamghan · 2022-01-07T16:02:35Z

@vaughnbetz - thanks for the comments. The code added to the run_vtr_flow.py script prints the stage that achieved the overall VTR flow memory peak and the output would be something like below:

k4_N10_memSize16384_memData64/ch_intrinsics_modified		OK (took 1.43 seconds, overall memory peak 39.45 MiB consumed by abc run)
k6_N10_mem32K_40nm/mkPktMerge		OK (took 15.38 seconds, overall memory peak 61.15 MiB consumed by vpr run)
k6_N10_mem32K_40nm/single_ff		OK (took 0.18 seconds, overall memory peak 32.87 MiB consumed by abc run)
k6_N10_mem32K_40nm/single_ff		OK (took 0.18 seconds, overall memory peak 32.57 MiB consumed by abc run)
k6_N10_mem32K_40nm/single_wire		OK (took 0.18 seconds, overall memory peak 32.83 MiB consumed by abc run)
k6_N10_mem32K_40nm/single_wire		OK (took 0.18 seconds, overall memory peak 32.54 MiB consumed by abc run)
k6_N10_mem32K_40nm/diffeq1		OK (took 6.08 seconds, overall memory peak 38.75 MiB consumed by vpr run)
k6_N10_mem32K_40nm/diffeq1		OK (took 5.95 seconds, overall memory peak 38.86 MiB consumed by vpr run)
k6_N10_mem32K_40nm/ch_intrinsics		OK (took 1.77 seconds, overall memory peak 37.22 MiB consumed by abc run)
k6_N10_mem32K_40nm/ch_intrinsics		OK (took 1.73 seconds, overall memory peak 37.00 MiB consumed by abc run)
k4_N10_memSize16384_memData64/single_ff		OK (took 0.18 seconds, overall memory peak 32.78 MiB consumed by abc run)
k4_N10_memSize16384_memData64/single_wire		OK (took 0.25 seconds, overall memory peak 32.91 MiB consumed by abc run)
k4_N10_memSize16384_memData64/diffeq1		OK (took 3.25 seconds, overall memory peak 37.66 MiB consumed by abc run)
k4_N10_memSize16384_memData64/ch_intrinsics		OK (took 1.14 seconds, overall memory peak 39.46 MiB consumed by abc run)
Elapsed time: 37.95 seconds

However, if you would like to have all memory peaks printed separately (something like: OK(took 15.38 seconds, VTR stages peak memories[odin: 12.82 MiB, abc: 32.78 MiB, vpr: 61.15 MiB])), it would be a simple change to the output file. Additionally, The new changes also do not affect the parsing of QoR results since they are parsed from the ".out" files, not from the run_vtr_flow.py script file.

Regarding the second comment, I added two new columns to the QoR results, named vtr_max_mem_stage and vtr_max_mem. The first column shows the stage that achieved the maximum memory peak in the entire VTR flow and the latter one shows the amount of the peak memory in human-readable format (i.e., 37.66 MiB instead of 38560).

It would be great if you could let me know your thoughts. I have also uploaded the new QoR results of a vtr_reg_basic/basic_timing task as an example for your review.
basic_timing.txt

vaughnbetz · 2022-01-07T17:47:24Z

Thanks Seyed. I think this output is good and once CI is green it can be merged.

aman26kbm · 2022-01-08T15:29:52Z

@sdamghan can we merge this?

sdamghan · 2022-01-08T18:46:37Z

@vaughnbetz Thanks,
@aman26kbm here we are

aman26kbm · 2022-01-11T02:13:34Z

Thanks, @sdamghan. Appreciate it!

sdamghan force-pushed the get_max_mem branch from 5d113ce to fa41fff Compare January 5, 2022 15:11

sdamghan added the kokoro:force-run label Jan 5, 2022

symbiflow-robot removed the kokoro:force-run label Jan 5, 2022

sdamghan force-pushed the get_max_mem branch 2 times, most recently from 09e2fe9 to 7789fb5 Compare January 5, 2022 23:43

vaughnbetz reviewed Jan 6, 2022

View reviewed changes

[Infra]: - fix get_memory_usage getting stuck when vpr output file is…

8d281fb

… not generated - show the maximum memory usage of all VTR stages in the run_vtr_flow output Signed-off-by: Seyed Alireza Damghani <[email protected]>

sdamghan force-pushed the get_max_mem branch from 7789fb5 to 8d281fb Compare January 7, 2022 16:02

vaughnbetz merged commit 372c028 into verilog-to-routing:master Jan 8, 2022

sdamghan mentioned this pull request Jan 11, 2022

error at odin stage doesn't kill process #1949

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

fix `run_vtr_task.py` getting stuck if an input has an error #1950

fix `run_vtr_task.py` getting stuck if an input has an error #1950

Uh oh!

sdamghan commented Jan 4, 2022

Uh oh!

sdamghan commented Jan 6, 2022

Uh oh!

vaughnbetz Jan 6, 2022

Uh oh!

vaughnbetz commented Jan 6, 2022

Uh oh!

sdamghan commented Jan 7, 2022

Uh oh!

vaughnbetz commented Jan 7, 2022

Uh oh!

aman26kbm commented Jan 8, 2022

Uh oh!

sdamghan commented Jan 8, 2022

Uh oh!

aman26kbm commented Jan 11, 2022

Uh oh!

Uh oh!

fix run_vtr_task.py getting stuck if an input has an error #1950

fix run_vtr_task.py getting stuck if an input has an error #1950

Uh oh!

Conversation

sdamghan commented Jan 4, 2022

Description

Related Issue

Motivation and Context

How Has This Been Tested?

Types of changes

Checklist:

Uh oh!

sdamghan commented Jan 6, 2022

Uh oh!

vaughnbetz Jan 6, 2022

Choose a reason for hiding this comment

Uh oh!

vaughnbetz commented Jan 6, 2022

Uh oh!

sdamghan commented Jan 7, 2022

Uh oh!

vaughnbetz commented Jan 7, 2022

Uh oh!

aman26kbm commented Jan 8, 2022

Uh oh!

sdamghan commented Jan 8, 2022

Uh oh!

aman26kbm commented Jan 11, 2022

Uh oh!

Uh oh!

fix `run_vtr_task.py` getting stuck if an input has an error #1950

fix `run_vtr_task.py` getting stuck if an input has an error #1950