Commit ae7f468

[NewPM] Fix MergeFunctions scheduling
MergeFunctions (as well as HotColdSplitting and IROutliner) are incorrectly scheduled under the new pass manager. The code makes it look like they run towards the end of the module optimization pipeline (as they should), while in reality they run at the start. This is because the OptimizePM populated around them is only scheduled later.

I'm fixing this by moving these three passes after OptimizePM, which avoids splitting the function pass pipeline. It doesn't seem important to me that some of the function passes run after these late module passes.

Differential Revision: https://reviews.llvm.org/D115098
1 parent a25111c commit ae7f468

2 files changed: 19 additions, 26 deletions
llvm/lib/Passes/PassBuilderPipelines.cpp

Lines changed: 17 additions & 17 deletions
@@ -1182,23 +1182,6 @@ PassBuilder::buildModuleOptimizationPipeline(OptimizationLevel Level,
 
   addVectorPasses(Level, OptimizePM, /* IsFullLTO */ false);
 
-  // Split out cold code. Splitting is done late to avoid hiding context from
-  // other optimizations and inadvertently regressing performance. The tradeoff
-  // is that this has a higher code size cost than splitting early.
-  if (EnableHotColdSplit && !LTOPreLink)
-    MPM.addPass(HotColdSplittingPass());
-
-  // Search the code for similar regions of code. If enough similar regions can
-  // be found where extracting the regions into their own function will decrease
-  // the size of the program, we extract the regions, a deduplicate the
-  // structurally similar regions.
-  if (EnableIROutliner)
-    MPM.addPass(IROutlinerPass());
-
-  // Merge functions if requested.
-  if (PTO.MergeFunctions)
-    MPM.addPass(MergeFunctionsPass());
-
   // LoopSink pass sinks instructions hoisted by LICM, which serves as a
   // canonicalization pass that enables other optimizations. As a result,
   // LoopSink pass needs to be a very late IR pass to avoid undoing LICM
@@ -1226,6 +1209,23 @@ PassBuilder::buildModuleOptimizationPipeline(OptimizationLevel Level,
   for (auto &C : OptimizerLastEPCallbacks)
     C(MPM, Level);
 
+  // Split out cold code. Splitting is done late to avoid hiding context from
+  // other optimizations and inadvertently regressing performance. The tradeoff
+  // is that this has a higher code size cost than splitting early.
+  if (EnableHotColdSplit && !LTOPreLink)
+    MPM.addPass(HotColdSplittingPass());
+
+  // Search the code for similar regions of code. If enough similar regions can
+  // be found where extracting the regions into their own function will decrease
+  // the size of the program, we extract the regions, a deduplicate the
+  // structurally similar regions.
+  if (EnableIROutliner)
+    MPM.addPass(IROutlinerPass());
+
+  // Merge functions if requested.
+  if (PTO.MergeFunctions)
+    MPM.addPass(MergeFunctionsPass());
+
   if (PTO.CallGraphProfile)
     MPM.addPass(CGProfilePass());
 

llvm/test/Transforms/PhaseOrdering/X86/merge-functions.ll

Lines changed: 2 additions & 9 deletions
@@ -90,15 +90,8 @@ bb3: ; preds = %bb1, %bb2
 
 define i1 @test2(i32 %c) {
 ; CHECK-LABEL: @test2(
-; CHECK-NEXT: entry:
-; CHECK-NEXT: [[SWITCH_TABLEIDX:%.*]] = add i32 [[C:%.*]], -100
-; CHECK-NEXT: [[TMP0:%.*]] = icmp ult i32 [[SWITCH_TABLEIDX]], 20
-; CHECK-NEXT: [[SWITCH_CAST:%.*]] = trunc i32 [[SWITCH_TABLEIDX]] to i20
-; CHECK-NEXT: [[SWITCH_DOWNSHIFT:%.*]] = lshr i20 -490991, [[SWITCH_CAST]]
-; CHECK-NEXT: [[TMP1:%.*]] = and i20 [[SWITCH_DOWNSHIFT]], 1
-; CHECK-NEXT: [[SWITCH_MASKED:%.*]] = icmp ne i20 [[TMP1]], 0
-; CHECK-NEXT: [[I_0:%.*]] = select i1 [[TMP0]], i1 [[SWITCH_MASKED]], i1 false
-; CHECK-NEXT: ret i1 [[I_0]]
+; CHECK-NEXT: [[TMP2:%.*]] = tail call i1 @test1(i32 [[TMP0:%.*]]) #[[ATTR0:[0-9]+]]
+; CHECK-NEXT: ret i1 [[TMP2]]
 ;
 entry:
   %i = alloca i8, align 1