
Very long runtime in parse_vtr_flow #1647


Closed
litghost opened this issue Jan 28, 2021 · 19 comments · Fixed by #1658

@litghost
Collaborator

litghost commented Jan 28, 2021

  %Own   %Total  OwnTime  TotalTime  Function (filename:line)                                                                                                                                                                                                                             
 51.00%  58.00%    6.26s     7.17s   parse_vtr_flow (vtr/parse_vtr_flow.py:75)
 26.00%  26.00%    2.59s     2.59s   parse_vtr_flow (vtr/parse_vtr_flow.py:72)
 14.00%  14.00%    1.04s     1.04s   parse_vtr_flow (vtr/parse_vtr_flow.py:73)
  7.00%   7.00%   0.810s    0.810s   regex (vtr/inspect.py:39)
  2.00%   2.00%   0.150s    0.150s   parse_vtr_flow (vtr/parse_vtr_flow.py:76)
  0.00%   0.00%   0.100s    0.100s   regex (vtr/inspect.py:37)
  0.00%   0.00%   0.050s    0.050s   parse_vtr_flow (vtr/parse_vtr_flow.py:74)
  0.00% 100.00%   0.000s    11.00s   parse_task (vtr/parse_vtr_task.py:207)
  0.00% 100.00%   0.000s    11.00s   run_tasks (run_reg_test.py:315)
  0.00% 100.00%   0.000s    11.00s   <module> (run_reg_test.py:342)
  0.00% 100.00%   0.000s    11.00s   vtr_command_main (run_vtr_task.py:246)
  0.00% 100.00%   0.000s    11.00s   run_tasks (run_vtr_task.py:297)
  0.00% 100.00%   0.000s    11.00s   parse_tasks (vtr/parse_vtr_task.py:186)
  0.00% 100.00%   0.000s    11.00s   vtr_command_main (run_reg_test.py:183)

@vaughnbetz
Contributor

May be due to the huge number of deprecation warnings from rr-graph, based on what Xifan has said.

@litghost
Collaborator Author

Long stack trace:

  %Own   %Total  OwnTime  TotalTime  Function (filename:line)                                                                                                                                                                                                                             
 54.00%  62.00%   31.22s    35.09s   parse_vtr_flow (vtr/parse_vtr_flow.py:75)
 25.00%  25.00%   13.63s    13.69s   parse_vtr_flow (vtr/parse_vtr_flow.py:72)
 10.00%  10.00%    4.89s     4.89s   parse_vtr_flow (vtr/parse_vtr_flow.py:73)
  8.00%   8.00%    3.68s     3.68s   regex (vtr/inspect.py:39)
  3.00%   3.00%    1.02s     1.02s   parse_vtr_flow (vtr/parse_vtr_flow.py:76)
  0.00%   0.00%   0.310s    0.310s   parse_vtr_flow (vtr/parse_vtr_flow.py:74)
  0.00%   0.00%   0.190s    0.190s   regex (vtr/inspect.py:37)
  0.00%   0.00%   0.040s    0.040s   decode (codecs.py:321)
  0.00%   0.00%   0.020s    0.020s   decode (codecs.py:323)
  0.00% 100.00%   0.000s    55.00s   <module> (run_reg_test.py:342)
  0.00% 100.00%   0.000s    55.00s   vtr_command_main (run_reg_test.py:183)
  0.00% 100.00%   0.000s    55.00s   vtr_command_main (run_vtr_task.py:246)
  0.00% 100.00%   0.000s    55.00s   run_tasks (run_reg_test.py:315)
  0.00% 100.00%   0.000s    55.00s   parse_task (vtr/parse_vtr_task.py:207)
  0.00% 100.00%   0.000s    55.00s   run_tasks (run_vtr_task.py:297)
  0.00% 100.00%   0.000s    55.00s   parse_tasks (vtr/parse_vtr_task.py:186)

@tangxifan
Contributor

I have run the vtr_strong regression test on my local machine (64-core 2.6 GHz CPU), pushing the test run onto 32 cores.
The runtime results I get are:

  • vpr runs finished in ~120s
  • result parser takes ~60s

I am trying to get a detailed profiling report like the stack trace but have no clue how to generate one. If you can share the command to run, I will try.

@litghost
Collaborator Author

> I am trying to get a detailed profiling report like the stack trace but have no clue how to generate one. If you can share the command to run, I will try.

I used https://github.com/benfred/py-spy
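
The exact command used for the profiles above isn't stated in the thread; a typical py-spy invocation looks like the following (the <PID> placeholder is an example, not taken from this issue):

pip install py-spy

# Attach to the running Python parse process and sample it live
# (the %Own/%Total tables above are the kind of output py-spy top prints):
py-spy top --pid <PID>

# Or grab a one-shot stack trace of where the process is currently stuck:
py-spy dump --pid <PID>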

@litghost changed the title from "Very long runtime in kokoro_log_reader" to "Very long runtime in parse_vtr_flow" on Feb 4, 2021
@vaughnbetz
Contributor

@jgoeders can you or someone on your team take a look? This is a long time for parsing, given log files that aren't that big. We had an issue with lots of warnings that was exacerbating this, but even after fixing that we have slow parse times (parsing takes 50% of the time of a total vpr run).
@litghost speculates that this may be a regular expression with a back-reference (which can get slow). The stack trace indicates the regex is the slow part; that shouldn't be fundamental, but can easily happen with some forms of regex coding.
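
As a minimal illustration of why a regex can dominate parse time (this is not the actual pattern from the VTR parser; back-references and nested quantifiers are both common culprits), a pattern like the one below makes Python's backtracking engine do exponential work on a line that almost matches:

import re
import time

# Illustration only -- NOT a pattern from parse_vtr_flow. Nested quantifiers
# (and, similarly, back-references) can trigger catastrophic backtracking
# when a line nearly matches but ultimately fails.
pathological = re.compile(r"(a+)+$")

line = "a" * 28 + "b"  # can never match, but every split of the a's gets tried
start = time.time()
pathological.search(line)
print("search on a %d-character line took %.1f s" % (len(line), time.time() - start))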

@tangxifan
Contributor

Procedures to replicate my tests:

  • Check out the VTR GitHub repo: git clone https://github.com/verilog-to-routing/vtr-verilog-to-routing.git
  • Compile the VTR project: make -j
  • Run the strong regression tests: time python3 run_reg_test.py vtr_reg_strong -j 32 (you may use another number if your local machine does not have 32 cores available).

@jgoeders
Contributor

jgoeders commented Feb 8, 2021

I can't reproduce. I've added a PR (#1658) that prints the parse time; the CI run shows 618s of vpr and 14s of parsing. https://github.com/verilog-to-routing/vtr-verilog-to-routing/pull/1658/checks?check_run_id=1852010637

I see similar results on two different local machines, testing with -j8 and -j20.

Can anyone else reproduce this locally?

@tangxifan
Contributor

@jgoeders I managed to reproduce your results on my local machine. I may have done something wrong last time. Let me investigate and get back to you later.

@tangxifan
Contributor

I just did some tests to find out where the runtime difference is coming from. Here is what I have found:

  • I have managed to reproduce both my previous results and Jeff's results.
  • It seems that the hard disk has a large impact on the parsing flow.

On my local machine, there are two types of hard disks:

  • An NFS-mounted hard disk, where data are transferred over the network. This is where I normally work. The runtime results of running the VTR strong regression tests (with -j 32) are as follows.
<VTR runs>
custom_grid:                           multiple_io_types.xml/raygentop.v/common                                                                                             		OK (took 119.45 seconds)
**Elapsed time: 128.03 seconds**

Parsing test results...
scripts/parse_vtr_task.py -l /research/ece/lnis/USERS/tang/tmp/vtr-verilog-to-routing/vtr_flow/tasks/regression_tests/vtr_reg_strong/task_list.txt
**Elapsed time: 36.08 seconds**
  • A local SSD disk, used as scratch space. The runtime results of running the VTR strong regression tests (with -j 32) are as follows.
<VTR runs>
custom_grid:                           multiple_io_types.xml/raygentop.v/common                                                                                             		OK (took 116.00 seconds)
**Elapsed time: 123.05 seconds**

Parsing test results...
scripts/parse_vtr_task.py -l /var/tmp/xtang/vtr-verilog-to-routing/vtr_flow/tasks/regression_tests/vtr_reg_strong/task_list.txt
**Elapsed time: 11.28 seconds**

This is what I have found so far. I am not sure how the hard disk is organized on kokoro. VTR nightly tests have a long runtime. I can run the nightly test on my local machine (using the SSD disk) and see if the result parsing scales.

@jgoeders
Contributor

jgoeders commented Feb 8, 2021

Thanks for the additional info. Both of the local machines I tried have SSDs, so it sounds like this is bound by disk I/O. Not sure if there's much to be done to speed it up, but I'll take a look.

@vaughnbetz
Contributor

We might be able to put some files we parse on a diet. Sarah is simplifying and summarizing the clustering log file output. Are we parsing any big files other than the log file? Is some of the log file too big?

@litghost
Collaborator Author

litghost commented Feb 9, 2021

> I am not sure how the hard disk is organized on kokoro

Kokoro disks should be pretty fast, but if we are disk limited and it is taking between 10 and 60 seconds on local machines, we probably need to put the logs on a diet (as Vaughn suggested).

@litghost
Collaborator Author

litghost commented Feb 9, 2021

For additional context relevant to this issue, kokoro was spending ~1.5 hours or more parsing nightly logs in some instances.

@jgoeders
Contributor

jgoeders commented Feb 9, 2021

My plan is to take a look at the code and make sure we aren't reading the files more than once, and that there aren't any unnecessarily complicated regexes. Aside from that, I'm not sure there's much to do besides slimming the logs.

@jgoeders
Contributor

jgoeders commented Mar 6, 2021

Looks like we were reading each log file multiple times: once for each thing we were looking for. I'm fixing this in #1658 so that it groups parse items by filename and reads each file only once.
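
A rough sketch of that approach (the function name and the (metric, filename, pattern) shape are made up for illustration, not the actual parse_vtr_flow code): group the requested patterns by the log file they target, then make a single pass over each file, testing every pattern on every line.

import re
from collections import defaultdict

def parse_logs_once(parse_items):
    # parse_items: iterable of (metric_name, log_filename, regex_string) tuples.
    by_file = defaultdict(list)
    for metric, filename, pattern in parse_items:
        by_file[filename].append((metric, re.compile(pattern)))

    results = {}
    for filename, compiled in by_file.items():
        with open(filename) as log:
            for line in log:                    # each log file is read exactly once
                for metric, regex in compiled:  # every pattern is checked per line
                    match = regex.search(line)
                    if match:
                        results[metric] = match.group(1)
    return results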

@tangxifan Can you see if this fix is faster for you?

@tangxifan
Contributor

Hi @jgoeders,
I just ran the vtr_reg_strong tests on both a RAID disk and an SSD disk.
The runtime for parsing results has been reduced by about 50%.

Results on a RAID disk (was 36.08 seconds)

<VTR runs>
**Elapsed time: 139.21 seconds**

Parsing test results...
scripts/parse_vtr_task.py -l /research/ece/lnis/USERS/tang/tmp/vtr-verilog-to-routing/vtr_flow/tasks/regression_tests/vtr_reg_strong/task_list.txt
**Elapsed time: 13.06 seconds**

Results on a SSD (was 11.28 seconds)

<VTR runs>
**Elapsed time: 128.81 seconds**

Parsing test results...
scripts/parse_vtr_task.py -l /var/tmp/xtang/vtr-verilog-to-routing/vtr_flow/tasks/regression_tests/vtr_reg_strong/task_list.txt
**Elapsed time: 6.33 seconds**

@jgoeders
Contributor

jgoeders commented Mar 8, 2021

Great, thanks for running that! Looks like this is a solid improvement.

It is now reading the files one line at a time and matching each line against all of the patterns we are looking for. I tried reading the entire file at once and doing a single regex match on the entire file for each pattern, but that didn't seem to improve performance at all, so I'll leave it at this.
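
For comparison, the whole-file variant described above would look roughly like the sketch below (again with made-up names, not the actual code): read the complete log into memory and run each pattern once over the full text; line-anchored patterns then need re.MULTILINE.

import re

def parse_log_whole_file(filename, patterns):
    # patterns: {metric_name: regex_string}; shape is hypothetical.
    with open(filename) as log:
        text = log.read()                  # one read, whole file held in memory
    results = {}
    for metric, pattern in patterns.items():
        match = re.search(pattern, text, re.MULTILINE)  # one scan per pattern
        if match:
            results[metric] = match.group(1)
    return results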

I'm hesitant to merge this until we fix the nightly so it passes (#1666). I don't want to push changes to the parser while we are still debugging QoR issues that use the parser.

@vaughnbetz
Contributor

Thanks @jgoeders. Nightly should pass now; Aman had some stale QoR data in a new regtest he added, and he has disabled that test until he has time to get the QoR data updated. So if you do a PR, the nightly should pass and we can merge this.

@jgoeders
Contributor

@vaughnbetz Hmm, my PR #1658 is still failing the nightly. Not sure if this is from my changes or not, but it doesn't look like master is passing the nightly either.
