Skip to content

fix run_vtr_task.py getting stuck if an input has an error #1950

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Jan 8, 2022

Conversation

sdamghan
Copy link
Member

@sdamghan sdamghan commented Jan 4, 2022

Description

The run_vtr_flow.py script gets stuck on calling get_memory_usage(logfile) when the logfile does not exist. This PR fixes the bug, in addition to showing the maximum memory of all VTR flow stages instead of only the VPR stage in the final output.

Related Issue

Issue #1949

Motivation and Context

If an input file has errors and the VTR front-ends fail, the run_vtr_task script gets stuck and the terminal does not come back to the prompt

How Has This Been Tested?

Types of changes

  • Bug fix (change which fixes an issue)
  • New feature (change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to change)

Checklist:

  • My change requires a change to the documentation
  • I have updated the documentation accordingly
  • I have added tests to cover my changes
  • All new and existing tests passed

@sdamghan
Copy link
Member Author

sdamghan commented Jan 6, 2022

@vaughnbetz - this PR is about a minor change to the run_vtr_flow.py script. According to the related issue, the script failed to exit and return to prompt in a terminal if an input design has an error. This was because the get_memory_usage routine was specifically looking for the "vpr.out" while no such file has been generated. I have added a helper routine to first look at all VTR stages output files, and then choose the maximum RSS among all of them (not only vpr).
Would you mind letting me know if you have any comments/reviews on the new code?

@@ -525,10 +561,10 @@ def vtr_command_main(arg_list, prog=None):
seconds = datetime.now() - start

print(
"{status} (took {time}, vpr run consumed {max_mem} memory)".format(
"{status} (took {time}, {stage[0]} run consumed {stage[1]} memory)".format(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good to say this was the overall peak stage in this message.

@vaughnbetz
Copy link
Contributor

The code looks good.
The new code prints the maximum memory of any stage, and which stage hit that peak, which is probably a good improvement vs. the old code that printed the maximum memory of vpr all the time. However, we should check it doesn't break any of our qor parsing, where we usually check maximum vpr memory. 1

  1. Can you check that the maximum vpr memory will always be parsed out properly (i.e. it's taken from the vpr.out file, rather than from the run_vtr_task output)? If it isn't going to be parsed properly, we should print the peak memory of all stages, and hte overall maximum, so we can still reliably parse vpr peak memory in QoR runs.
  2. I think with this new feature it would be good to add overall peak memory for the entire run and which stage had the peak into the QoR parsing scripts, so we could start using that in regtests and QoR checks. Not essential in this PR, but if it's not hard I think it would be good to add right now while we're thinking of it.

… not generated

	 - show the maximum memory usage of all VTR stages in the run_vtr_flow output

Signed-off-by: Seyed Alireza Damghani <[email protected]>
@sdamghan
Copy link
Member Author

sdamghan commented Jan 7, 2022

@vaughnbetz - thanks for the comments. The code added to the run_vtr_flow.py script prints the stage that achieved the overall VTR flow memory peak and the output would be something like below:

k4_N10_memSize16384_memData64/ch_intrinsics_modified		OK (took 1.43 seconds, overall memory peak 39.45 MiB consumed by abc run)
k6_N10_mem32K_40nm/mkPktMerge		OK (took 15.38 seconds, overall memory peak 61.15 MiB consumed by vpr run)
k6_N10_mem32K_40nm/single_ff		OK (took 0.18 seconds, overall memory peak 32.87 MiB consumed by abc run)
k6_N10_mem32K_40nm/single_ff		OK (took 0.18 seconds, overall memory peak 32.57 MiB consumed by abc run)
k6_N10_mem32K_40nm/single_wire		OK (took 0.18 seconds, overall memory peak 32.83 MiB consumed by abc run)
k6_N10_mem32K_40nm/single_wire		OK (took 0.18 seconds, overall memory peak 32.54 MiB consumed by abc run)
k6_N10_mem32K_40nm/diffeq1		OK (took 6.08 seconds, overall memory peak 38.75 MiB consumed by vpr run)
k6_N10_mem32K_40nm/diffeq1		OK (took 5.95 seconds, overall memory peak 38.86 MiB consumed by vpr run)
k6_N10_mem32K_40nm/ch_intrinsics		OK (took 1.77 seconds, overall memory peak 37.22 MiB consumed by abc run)
k6_N10_mem32K_40nm/ch_intrinsics		OK (took 1.73 seconds, overall memory peak 37.00 MiB consumed by abc run)
k4_N10_memSize16384_memData64/single_ff		OK (took 0.18 seconds, overall memory peak 32.78 MiB consumed by abc run)
k4_N10_memSize16384_memData64/single_wire		OK (took 0.25 seconds, overall memory peak 32.91 MiB consumed by abc run)
k4_N10_memSize16384_memData64/diffeq1		OK (took 3.25 seconds, overall memory peak 37.66 MiB consumed by abc run)
k4_N10_memSize16384_memData64/ch_intrinsics		OK (took 1.14 seconds, overall memory peak 39.46 MiB consumed by abc run)
Elapsed time: 37.95 seconds

However, if you would like to have all memory peaks printed separately (something like: OK(took 15.38 seconds, VTR stages peak memories[odin: 12.82 MiB, abc: 32.78 MiB, vpr: 61.15 MiB])), it would be a simple change to the output file. Additionally, The new changes also do not affect the parsing of QoR results since they are parsed from the ".out" files, not from the run_vtr_flow.py script file.

Regarding the second comment, I added two new columns to the QoR results, named vtr_max_mem_stage and vtr_max_mem. The first column shows the stage that achieved the maximum memory peak in the entire VTR flow and the latter one shows the amount of the peak memory in human-readable format (i.e., 37.66 MiB instead of 38560).

It would be great if you could let me know your thoughts. I have also uploaded the new QoR results of a vtr_reg_basic/basic_timing task as an example for your review.
basic_timing.txt

@vaughnbetz
Copy link
Contributor

Thanks Seyed. I think this output is good and once CI is green it can be merged.

@aman26kbm
Copy link
Contributor

@sdamghan can we merge this?

@vaughnbetz vaughnbetz merged commit 372c028 into verilog-to-routing:master Jan 8, 2022
@sdamghan
Copy link
Member Author

sdamghan commented Jan 8, 2022

@vaughnbetz Thanks,
@aman26kbm here we are

@aman26kbm
Copy link
Contributor

Thanks, @sdamghan. Appreciate it!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants