Commit fb6f60d

[flang][cuda][NFC] Use NVVM VoteBallotOp (#134307)
`llvm.nvvm.vote.ballot.sync` has its own operation so use it in lowering.
1 parent de40f61 commit fb6f60d
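
For readers unfamiliar with the intrinsic, the following is a host-side sketch of what a warp ballot computes. It assumes the usual CUDA semantics: bit i of the 32-bit result is set when lane i is selected by the mask and its predicate is true. The function `emulateBallot` is a hypothetical name used only for illustration; it is not part of Flang, MLIR, or NVVM. The predicate is modeled as `bool`, which mirrors the `i1` conversion of the second argument in this change.

```cpp
#include <array>
#include <cstdint>

// Illustrative host-side model of a warp ballot (hypothetical helper,
// not Flang or NVVM code). Assumes a 32-lane warp.
//   mask : bit i selects lane i as a participant
//   pred : pred[i] is the predicate supplied by lane i
// Returns a word whose bit i is set when lane i participates and its
// predicate is true.
std::uint32_t emulateBallot(std::uint32_t mask,
                            const std::array<bool, 32> &pred) {
  std::uint32_t result = 0;
  for (unsigned lane = 0; lane < 32; ++lane) {
    const bool participates = ((mask >> lane) & 1u) != 0;
    if (participates && pred[lane])
      result |= (std::uint32_t{1} << lane);
  }
  return result;
}
```

On a real device every participating lane receives the same result word; the loop above only models how that word is assembled.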

2 files changed: +6 -3 lines changed

flang/lib/Optimizer/Builder/IntrinsicCall.cpp

Lines changed: 5 additions & 2 deletions
@@ -6534,8 +6534,11 @@ mlir::Value
 IntrinsicLibrary::genVoteBallotSync(mlir::Type resultType,
                                     llvm::ArrayRef<mlir::Value> args) {
   assert(args.size() == 2);
-  return genVoteSync(builder, loc, "llvm.nvvm.vote.ballot.sync",
-                     builder.getI32Type(), args);
+  mlir::Value arg1 =
+      builder.create<fir::ConvertOp>(loc, builder.getI1Type(), args[1]);
+  return builder
+      .create<mlir::NVVM::VoteBallotOp>(loc, resultType, args[0], arg1)
+      .getResult();
 }
 
 // MATCH_ANY_SYNC

flang/test/Lower/CUDA/cuda-device-proc.cuf

Lines changed: 1 addition & 1 deletion
@@ -303,7 +303,7 @@ end subroutine
 ! CHECK-LABEL: func.func @_QPtestvote()
 ! CHECK: fir.call @llvm.nvvm.vote.all.sync
 ! CHECK: fir.call @llvm.nvvm.vote.any.sync
-! CHECK: fir.call @llvm.nvvm.vote.ballot.sync
+! CHECK: %{{.*}} = nvvm.vote.ballot.sync %{{.*}}, %{{.*}} : i32
 
 ! CHECK-DAG: func.func private @__ldca_i4x4_(!fir.ref<!fir.array<4xi32>>, !fir.ref<!fir.array<4xi32>>)
 ! CHECK-DAG: func.func private @__ldcg_i4x4_(!fir.ref<!fir.array<4xi32>>, !fir.ref<!fir.array<4xi32>>)