Commit 57ed628

[memprof] Speed up caller-callee pair extraction (Part 2) (llvm#116441)
This patch further speeds up the extraction of caller-callee pairs from the profile.

Recall that we reconstruct a call stack by traversing the radix tree from one of its leaf nodes toward a root. The implication is that when we decode many different call stacks, we end up visiting nodes near the root(s) repeatedly. That in turn adds many duplicates to our data structure:

  DenseMap<uint64_t, SmallVector<CallEdgeTy, 0>> Calls;

only to be deduplicated later with sort+unique for each vector.

This patch makes the extraction process more efficient by keeping track of the indices of the radix tree array we've visited so far and terminating the traversal as soon as we encounter an element we have already visited.

Note that even with this improvement, we still add at least one caller-callee pair to the data structure above for each call stack, because we do need to add a caller-callee pair for the leaf node with the callee GUID being 0.

Without this patch, it takes 4 seconds to extract caller-callee pairs from a large MemProf profile. This patch shortens that to 900ms.
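The early-termination idea described above can be sketched outside of LLVM. The following is a minimal illustration, not the MemProf implementation: `Node`, `Edge`, and `collectEdges` are hypothetical names, the tree is modeled with plain parent indices rather than the real radix tree encoding, and `std::vector<bool>` stands in for `llvm::BitVector`.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// Hypothetical simplified node; the real profile uses a packed radix tree.
struct Node {
  uint64_t Guid; // identifier of the function for this frame
  int Parent;    // index of the caller's node, or -1 at a root
};

using Edge = std::pair<uint64_t, uint64_t>; // (caller GUID, callee GUID)

// Walks from Leaf toward a root, appending one edge per node reached.
// The edge is recorded before the visited check, because a node that was
// already visited may still be reached from a new child. The walk stops
// once it reaches a node that an earlier walk has already marked.
inline void collectEdges(const std::vector<Node> &Tree, int Leaf,
                         std::vector<bool> &Visited, std::vector<Edge> &Out) {
  assert(Leaf >= 0 && static_cast<std::size_t>(Leaf) < Tree.size());
  assert(Visited.size() == Tree.size());

  uint64_t CalleeGuid = 0; // the leaf frame has no callee, so use GUID 0
  for (int I = Leaf; I != -1; I = Tree[I].Parent) {
    Out.emplace_back(Tree[I].Guid, CalleeGuid);
    if (Visited[I])
      break; // everything above this node was emitted by an earlier walk
    Visited[I] = true;
    CalleeGuid = Tree[I].Guid;
  }
}
```

In this sketch, two call stacks that share a common prefix near the root emit the shared edges only once, while each walk still contributes at least the leaf edge with callee GUID 0, which matches the note in the commit message.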
1 parent b1fa9d1 commit 57ed628

2 files changed: +19 -3 lines changed

llvm/include/llvm/ProfileData/MemProf.h

Lines changed: 17 additions & 2 deletions
@@ -1,6 +1,7 @@
 #ifndef LLVM_PROFILEDATA_MEMPROF_H_
 #define LLVM_PROFILEDATA_MEMPROF_H_
 
+#include "llvm/ADT/BitVector.h"
 #include "llvm/ADT/MapVector.h"
 #include "llvm/ADT/STLForwardCompat.h"
 #include "llvm/ADT/STLFunctionalExtras.h"
@@ -971,11 +972,16 @@ struct CallerCalleePairExtractor {
   // A map from caller GUIDs to lists of call sites in respective callers.
   DenseMap<uint64_t, SmallVector<CallEdgeTy, 0>> CallerCalleePairs;
 
+  // The set of linear call stack IDs that we've visited.
+  BitVector Visited;
+
   CallerCalleePairExtractor() = delete;
   CallerCalleePairExtractor(
       const unsigned char *CallStackBase,
-      llvm::function_ref<Frame(LinearFrameId)> FrameIdToFrame)
-      : CallStackBase(CallStackBase), FrameIdToFrame(FrameIdToFrame) {}
+      llvm::function_ref<Frame(LinearFrameId)> FrameIdToFrame,
+      unsigned RadixTreeSize)
+      : CallStackBase(CallStackBase), FrameIdToFrame(FrameIdToFrame),
+        Visited(RadixTreeSize) {}
 
   void operator()(LinearCallStackId LinearCSId) {
     const unsigned char *Ptr =
@@ -1004,6 +1010,15 @@ struct CallerCalleePairExtractor {
       LineLocation Loc(F.LineOffset, F.Column);
       CallerCalleePairs[CallerGUID].emplace_back(Loc, CalleeGUID);
 
+      // Keep track of the indices we've visited. If we've already visited the
+      // current one, terminate the traversal. We will not discover any new
+      // caller-callee pair by continuing the traversal.
+      unsigned Offset =
+          std::distance(CallStackBase, Ptr) / sizeof(LinearFrameId);
+      if (Visited.test(Offset))
+        break;
+      Visited.set(Offset);
+
       Ptr += sizeof(LinearFrameId);
       CalleeGUID = CallerGUID;
     }
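
The `Offset` computation in the hunk above turns a byte pointer into an element index into the radix tree array. A minimal standalone sketch of that arithmetic follows; `elementIndex` is a hypothetical helper name, and the 32-bit width of `LinearFrameId` is an assumption made for this illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <iterator>

// Assumption for this sketch: each radix tree element is 32 bits wide.
using LinearFrameId = uint32_t;

// Converts a byte pointer inside the array into an element index by
// dividing the byte distance from the base by the element size.
inline std::size_t elementIndex(const unsigned char *Base,
                                const unsigned char *Ptr) {
  assert(Ptr >= Base && "pointer must not precede the array base");
  return static_cast<std::size_t>(std::distance(Base, Ptr)) /
         sizeof(LinearFrameId);
}
```

Under that assumption, a pointer 8 bytes past the base maps to index 2, which is the bit that a visited set sized by element count would test and set.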

llvm/lib/ProfileData/InstrProfReader.cpp

Lines changed: 2 additions & 1 deletion
@@ -1678,7 +1678,8 @@ IndexedMemProfReader::getMemProfCallerCalleePairs() const {
   assert(Version == memprof::Version3);
 
   memprof::LinearFrameIdConverter FrameIdConv(FrameBase);
-  memprof::CallerCalleePairExtractor Extractor(CallStackBase, FrameIdConv);
+  memprof::CallerCalleePairExtractor Extractor(CallStackBase, FrameIdConv,
+                                               RadixTreeSize);
 
   // The set of linear call stack IDs that we need to traverse from. We expect
   // the set to be dense, so we use a BitVector.
