Skip to content

Commit 0a2be46

Browse files
committed
Modules: Invalidate out-of-date PCMs as they're discovered
Leverage the InMemoryModuleCache to invalidate a module the first time it fails to import (and to lock a module as soon as it's built or imported successfully). For implicit module builds, this optimizes importing deep graphs where the leaf module is out-of-date; see example near the end of the commit message. Previously the cache finalized ("locked in") all modules imported so far when starting a new module build. This was sufficient to prevent loading two versions of the same module, but was somewhat arbitrary and hard to reason about. Now the cache explicitly tracks module state, where each module must be one of: - Unknown: module not in the cache (yet). - Tentative: module in the cache, but not yet fully imported. - ToBuild: module found on disk could not be imported; need to build. - Final: module in the cache has been successfully built or imported. Preventing repeated failed imports avoids variation in builds based on shifting filesystem state. Now it's guaranteed that a module is loaded from disk exactly once. It now seems safe to remove FileManager::invalidateCache, but I'm leaving that for a later commit. The new, precise logic uncovered a pre-existing problem in the cache: the map key is the module filename, and different contexts use different filenames for the same PCM file. (In particular, the test Modules/relative-import-path.c does not build without this commit. r223577 started using a relative path to describe a module's base directory when importing it within another module. As a result, the module cache sees an absolute path when (a) building the module or importing it at the top-level, and a relative path when (b) importing the module underneath another one.) The "obvious" fix is to resolve paths using FileManager::getVirtualFile and change the map key for the cache to a FileEntry, but some contexts (particularly related to ASTUnit) have a shorter lifetime for their FileManager than the InMemoryModuleCache. This is worth pursuing further in a later commit; perhaps by tying together the FileManager and InMemoryModuleCache lifetime, or moving the in-memory PCM storage into a VFS layer. For now, use the PCM's base directory as-written for constructing the filename to check the ModuleCache. Example ======= To understand the build optimization, first consider the build of a module graph TU -> A -> B -> C -> D with an empty cache: TU builds A' A' builds B' B' builds C' C' builds D' imports D' B' imports C' imports D' A' imports B' imports C' imports D' TU imports A' imports B' imports C' imports D' If we build TU again, where A, B, C, and D are in the cache and D is out-of-date, we would previously get this build: TU imports A imports B imports C imports D (out-of-date) TU builds A' A' imports B imports C imports D (out-of-date) builds B' B' imports C imports D (out-of-date) builds C' C' imports D (out-of-date) builds D' imports D' B' imports C' imports D' A' imports B' imports C' imports D' TU imports A' imports B' imports C' imports D' After this commit, we'll immediateley invalidate A, B, C, and D when we first observe that D is out-of-date, giving this build: TU imports A imports B imports C imports D (out-of-date) TU builds A' // The same graph as an empty cache. A' builds B' B' builds C' C' builds D' imports D' B' imports C' imports D' A' imports B' imports C' imports D' TU imports A' imports B' imports C' imports D' The new build matches what we'd naively expect, pretty closely matching the original build with the empty cache. rdar://problem/48545366 llvm-svn: 355778
1 parent 8bef5cd commit 0a2be46

File tree

18 files changed

+345
-132
lines changed

18 files changed

+345
-132
lines changed

clang/include/clang/Serialization/ASTReader.h

Lines changed: 4 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -2231,6 +2231,10 @@ class ASTReader
22312231
// Read a path
22322232
std::string ReadPath(ModuleFile &F, const RecordData &Record, unsigned &Idx);
22332233

2234+
// Read a path
2235+
std::string ReadPath(StringRef BaseDirectory, const RecordData &Record,
2236+
unsigned &Idx);
2237+
22342238
// Skip a path
22352239
static void SkipPath(const RecordData &Record, unsigned &Idx) {
22362240
SkipString(Record, Idx);

clang/include/clang/Serialization/InMemoryModuleCache.h

Lines changed: 54 additions & 29 deletions
Original file line numberDiff line numberDiff line change
@@ -30,54 +30,79 @@ namespace clang {
3030
/// Critically, it ensures that a single process has a consistent view of each
3131
/// PCM. This is used by \a CompilerInstance when building PCMs to ensure that
3232
/// each \a ModuleManager sees the same files.
33-
///
34-
/// \a finalizeCurrentBuffers() should be called before creating a new user.
35-
/// This locks in the current PCMs, ensuring that no PCM that has already been
36-
/// accessed can be purged, preventing use-after-frees.
3733
class InMemoryModuleCache : public llvm::RefCountedBase<InMemoryModuleCache> {
3834
struct PCM {
3935
std::unique_ptr<llvm::MemoryBuffer> Buffer;
4036

41-
/// Track the timeline of when this was added to the cache.
42-
unsigned Index;
37+
/// Track whether this PCM is known to be good (either built or
38+
/// successfully imported by a CompilerInstance/ASTReader using this
39+
/// cache).
40+
bool IsFinal = false;
41+
42+
PCM() = default;
43+
PCM(std::unique_ptr<llvm::MemoryBuffer> Buffer)
44+
: Buffer(std::move(Buffer)) {}
4345
};
4446

4547
/// Cache of buffers.
4648
llvm::StringMap<PCM> PCMs;
4749

48-
/// Monotonically increasing index.
49-
unsigned NextIndex = 0;
50-
51-
/// Bumped to prevent "older" buffers from being removed.
52-
unsigned FirstRemovableIndex = 0;
53-
5450
public:
55-
/// Store the Buffer under the Filename.
51+
/// There are four states for a PCM. It must monotonically increase.
52+
///
53+
/// 1. Unknown: the PCM has neither been read from disk nor built.
54+
/// 2. Tentative: the PCM has been read from disk but not yet imported or
55+
/// built. It might work.
56+
/// 3. ToBuild: the PCM read from disk did not work but a new one has not
57+
/// been built yet.
58+
/// 4. Final: indicating that the current PCM was either built in this
59+
/// process or has been successfully imported.
60+
enum State { Unknown, Tentative, ToBuild, Final };
61+
62+
/// Get the state of the PCM.
63+
State getPCMState(llvm::StringRef Filename) const;
64+
65+
/// Store the PCM under the Filename.
5666
///
57-
/// \pre There is not already buffer is not already in the cache.
67+
/// \pre state is Unknown
68+
/// \post state is Tentative
5869
/// \return a reference to the buffer as a convenience.
59-
llvm::MemoryBuffer &addBuffer(llvm::StringRef Filename,
60-
std::unique_ptr<llvm::MemoryBuffer> Buffer);
70+
llvm::MemoryBuffer &addPCM(llvm::StringRef Filename,
71+
std::unique_ptr<llvm::MemoryBuffer> Buffer);
72+
73+
/// Store a just-built PCM under the Filename.
74+
///
75+
/// \pre state is Unknown or ToBuild.
76+
/// \pre state is not Tentative.
77+
/// \return a reference to the buffer as a convenience.
78+
llvm::MemoryBuffer &addBuiltPCM(llvm::StringRef Filename,
79+
std::unique_ptr<llvm::MemoryBuffer> Buffer);
80+
81+
/// Try to remove a buffer from the cache. No effect if state is Final.
82+
///
83+
/// \pre state is Tentative/Final.
84+
/// \post Tentative => ToBuild or Final => Final.
85+
/// \return false on success, i.e. if Tentative => ToBuild.
86+
bool tryToDropPCM(llvm::StringRef Filename);
6187

62-
/// Try to remove a buffer from the cache.
88+
/// Mark a PCM as final.
6389
///
64-
/// \return false on success, iff \c !isBufferFinal().
65-
bool tryToRemoveBuffer(llvm::StringRef Filename);
90+
/// \pre state is Tentative or Final.
91+
/// \post state is Final.
92+
void finalizePCM(llvm::StringRef Filename);
6693

67-
/// Get a pointer to the buffer if it exists; else nullptr.
68-
llvm::MemoryBuffer *lookupBuffer(llvm::StringRef Filename);
94+
/// Get a pointer to the pCM if it exists; else nullptr.
95+
llvm::MemoryBuffer *lookupPCM(llvm::StringRef Filename) const;
6996

70-
/// Check whether the buffer is final.
97+
/// Check whether the PCM is final and has been shown to work.
7198
///
72-
/// \return true iff \a finalizeCurrentBuffers() has been called since the
73-
/// buffer was added. This prevents buffers from being removed.
74-
bool isBufferFinal(llvm::StringRef Filename);
99+
/// \return true iff state is Final.
100+
bool isPCMFinal(llvm::StringRef Filename) const;
75101

76-
/// Finalize the current buffers in the cache.
102+
/// Check whether the PCM is waiting to be built.
77103
///
78-
/// Should be called when creating a new user to ensure previous uses aren't
79-
/// invalidated.
80-
void finalizeCurrentBuffers();
104+
/// \return true iff state is ToBuild.
105+
bool shouldBuildPCM(llvm::StringRef Filename) const;
81106
};
82107

83108
} // end namespace clang

clang/lib/Frontend/CompilerInstance.cpp

Lines changed: 1 addition & 5 deletions
Original file line numberDiff line numberDiff line change
@@ -62,11 +62,7 @@ CompilerInstance::CompilerInstance(
6262
Invocation(new CompilerInvocation()),
6363
ModuleCache(SharedModuleCache ? SharedModuleCache
6464
: new InMemoryModuleCache),
65-
ThePCHContainerOperations(std::move(PCHContainerOps)) {
66-
// Don't allow this to invalidate buffers in use by others.
67-
if (SharedModuleCache)
68-
getModuleCache().finalizeCurrentBuffers();
69-
}
65+
ThePCHContainerOperations(std::move(PCHContainerOps)) {}
7066

7167
CompilerInstance::~CompilerInstance() {
7268
assert(OutputFiles.empty() && "Still output files in flight?");

clang/lib/Serialization/ASTReader.cpp

Lines changed: 26 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -92,6 +92,7 @@
9292
#include "llvm/ADT/None.h"
9393
#include "llvm/ADT/Optional.h"
9494
#include "llvm/ADT/STLExtras.h"
95+
#include "llvm/ADT/ScopeExit.h"
9596
#include "llvm/ADT/SmallPtrSet.h"
9697
#include "llvm/ADT/SmallString.h"
9798
#include "llvm/ADT/SmallVector.h"
@@ -2359,6 +2360,7 @@ ASTReader::ReadControlBlock(ModuleFile &F,
23592360
RecordData Record;
23602361
unsigned NumInputs = 0;
23612362
unsigned NumUserInputs = 0;
2363+
StringRef BaseDirectoryAsWritten;
23622364
while (true) {
23632365
llvm::BitstreamEntry Entry = Stream.advance();
23642366

@@ -2559,7 +2561,9 @@ ASTReader::ReadControlBlock(ModuleFile &F,
25592561
ImportedName, /*FileMapOnly*/ true);
25602562

25612563
if (ImportedFile.empty())
2562-
ImportedFile = ReadPath(F, Record, Idx);
2564+
// Use BaseDirectoryAsWritten to ensure we use the same path in the
2565+
// ModuleCache as when writing.
2566+
ImportedFile = ReadPath(BaseDirectoryAsWritten, Record, Idx);
25632567
else
25642568
SkipPath(Record, Idx);
25652569

@@ -2624,6 +2628,9 @@ ASTReader::ReadControlBlock(ModuleFile &F,
26242628
break;
26252629

26262630
case MODULE_DIRECTORY: {
2631+
// Save the BaseDirectory as written in the PCM for computing the module
2632+
// filename for the ModuleCache.
2633+
BaseDirectoryAsWritten = Blob;
26272634
assert(!F.ModuleName.empty() &&
26282635
"MODULE_DIRECTORY found before MODULE_NAME");
26292636
// If we've already loaded a module map file covering this module, we may
@@ -4180,6 +4187,14 @@ ASTReader::ReadASTCore(StringRef FileName,
41804187

41814188
assert(M && "Missing module file");
41824189

4190+
bool ShouldFinalizePCM = false;
4191+
auto FinalizeOrDropPCM = llvm::make_scope_exit([&]() {
4192+
auto &MC = getModuleManager().getModuleCache();
4193+
if (ShouldFinalizePCM)
4194+
MC.finalizePCM(FileName);
4195+
else
4196+
MC.tryToDropPCM(FileName);
4197+
});
41834198
ModuleFile &F = *M;
41844199
BitstreamCursor &Stream = F.Stream;
41854200
Stream = BitstreamCursor(PCHContainerRdr.ExtractPCH(*F.Buffer));
@@ -4246,6 +4261,7 @@ ASTReader::ReadASTCore(StringRef FileName,
42464261

42474262
// Record that we've loaded this module.
42484263
Loaded.push_back(ImportedModule(M, ImportedBy, ImportLoc));
4264+
ShouldFinalizePCM = true;
42494265
return Success;
42504266

42514267
case UNHASHED_CONTROL_BLOCK_ID:
@@ -4309,7 +4325,7 @@ ASTReader::readUnhashedControlBlock(ModuleFile &F, bool WasImportedBy,
43094325
// validation will fail during the as-system import since the PCM on disk
43104326
// doesn't guarantee that -Werror was respected. However, the -Werror
43114327
// flags were checked during the initial as-user import.
4312-
if (getModuleManager().getModuleCache().isBufferFinal(F.FileName)) {
4328+
if (getModuleManager().getModuleCache().isPCMFinal(F.FileName)) {
43134329
Diag(diag::warn_module_system_bit_conflict) << F.FileName;
43144330
return Success;
43154331
}
@@ -9099,6 +9115,14 @@ std::string ASTReader::ReadPath(ModuleFile &F, const RecordData &Record,
90999115
return Filename;
91009116
}
91019117

9118+
std::string ASTReader::ReadPath(StringRef BaseDirectory,
9119+
const RecordData &Record, unsigned &Idx) {
9120+
std::string Filename = ReadString(Record, Idx);
9121+
if (!BaseDirectory.empty())
9122+
ResolveImportedPath(Filename, BaseDirectory);
9123+
return Filename;
9124+
}
9125+
91029126
VersionTuple ASTReader::ReadVersionTuple(const RecordData &Record,
91039127
unsigned &Idx) {
91049128
unsigned Major = Record[Idx++];

clang/lib/Serialization/ASTWriter.cpp

Lines changed: 3 additions & 3 deletions
Original file line numberDiff line numberDiff line change
@@ -4622,9 +4622,9 @@ ASTFileSignature ASTWriter::WriteAST(Sema &SemaRef,
46224622
WritingAST = false;
46234623
if (SemaRef.Context.getLangOpts().ImplicitModules && WritingModule) {
46244624
// Construct MemoryBuffer and update buffer manager.
4625-
ModuleCache.addBuffer(OutputFile,
4626-
llvm::MemoryBuffer::getMemBufferCopy(
4627-
StringRef(Buffer.begin(), Buffer.size())));
4625+
ModuleCache.addBuiltPCM(OutputFile,
4626+
llvm::MemoryBuffer::getMemBufferCopy(
4627+
StringRef(Buffer.begin(), Buffer.size())));
46284628
}
46294629
return Signature;
46304630
}

clang/lib/Serialization/InMemoryModuleCache.cpp

Lines changed: 47 additions & 16 deletions
Original file line numberDiff line numberDiff line change
@@ -11,39 +11,70 @@
1111

1212
using namespace clang;
1313

14+
InMemoryModuleCache::State
15+
InMemoryModuleCache::getPCMState(llvm::StringRef Filename) const {
16+
auto I = PCMs.find(Filename);
17+
if (I == PCMs.end())
18+
return Unknown;
19+
if (I->second.IsFinal)
20+
return Final;
21+
return I->second.Buffer ? Tentative : ToBuild;
22+
}
23+
1424
llvm::MemoryBuffer &
15-
InMemoryModuleCache::addBuffer(llvm::StringRef Filename,
16-
std::unique_ptr<llvm::MemoryBuffer> Buffer) {
17-
auto Insertion = PCMs.insert({Filename, PCM{std::move(Buffer), NextIndex++}});
18-
assert(Insertion.second && "Already has a buffer");
25+
InMemoryModuleCache::addPCM(llvm::StringRef Filename,
26+
std::unique_ptr<llvm::MemoryBuffer> Buffer) {
27+
auto Insertion = PCMs.insert(std::make_pair(Filename, std::move(Buffer)));
28+
assert(Insertion.second && "Already has a PCM");
1929
return *Insertion.first->second.Buffer;
2030
}
2131

32+
llvm::MemoryBuffer &
33+
InMemoryModuleCache::addBuiltPCM(llvm::StringRef Filename,
34+
std::unique_ptr<llvm::MemoryBuffer> Buffer) {
35+
auto &PCM = PCMs[Filename];
36+
assert(!PCM.IsFinal && "Trying to override finalized PCM?");
37+
assert(!PCM.Buffer && "Trying to override tentative PCM?");
38+
PCM.Buffer = std::move(Buffer);
39+
PCM.IsFinal = true;
40+
return *PCM.Buffer;
41+
}
42+
2243
llvm::MemoryBuffer *
23-
InMemoryModuleCache::lookupBuffer(llvm::StringRef Filename) {
44+
InMemoryModuleCache::lookupPCM(llvm::StringRef Filename) const {
2445
auto I = PCMs.find(Filename);
2546
if (I == PCMs.end())
2647
return nullptr;
2748
return I->second.Buffer.get();
2849
}
2950

30-
bool InMemoryModuleCache::isBufferFinal(llvm::StringRef Filename) {
31-
auto I = PCMs.find(Filename);
32-
if (I == PCMs.end())
33-
return false;
34-
return I->second.Index < FirstRemovableIndex;
51+
bool InMemoryModuleCache::isPCMFinal(llvm::StringRef Filename) const {
52+
return getPCMState(Filename) == Final;
53+
}
54+
55+
bool InMemoryModuleCache::shouldBuildPCM(llvm::StringRef Filename) const {
56+
return getPCMState(Filename) == ToBuild;
3557
}
3658

37-
bool InMemoryModuleCache::tryToRemoveBuffer(llvm::StringRef Filename) {
59+
bool InMemoryModuleCache::tryToDropPCM(llvm::StringRef Filename) {
3860
auto I = PCMs.find(Filename);
39-
assert(I != PCMs.end() && "No buffer to remove...");
40-
if (I->second.Index < FirstRemovableIndex)
61+
assert(I != PCMs.end() && "PCM to remove is unknown...");
62+
63+
auto &PCM = I->second;
64+
assert(PCM.Buffer && "PCM to remove is scheduled to be built...");
65+
66+
if (PCM.IsFinal)
4167
return true;
4268

43-
PCMs.erase(I);
69+
PCM.Buffer.reset();
4470
return false;
4571
}
4672

47-
void InMemoryModuleCache::finalizeCurrentBuffers() {
48-
FirstRemovableIndex = NextIndex;
73+
void InMemoryModuleCache::finalizePCM(llvm::StringRef Filename) {
74+
auto I = PCMs.find(Filename);
75+
assert(I != PCMs.end() && "PCM to finalize is unknown...");
76+
77+
auto &PCM = I->second;
78+
assert(PCM.Buffer && "Trying to finalize a dropped PCM...");
79+
PCM.IsFinal = true;
4980
}

clang/lib/Serialization/ModuleManager.cpp

Lines changed: 11 additions & 15 deletions
Original file line numberDiff line numberDiff line change
@@ -118,6 +118,8 @@ ModuleManager::addModule(StringRef FileName, ModuleKind Type,
118118
// contents, but we can't check that.)
119119
ExpectedModTime = 0;
120120
}
121+
// Note: ExpectedSize and ExpectedModTime will be 0 for MK_ImplicitModule
122+
// when using an ASTFileSignature.
121123
if (lookupModuleFile(FileName, ExpectedSize, ExpectedModTime, Entry)) {
122124
ErrorStr = "module file out of date";
123125
return OutOfDate;
@@ -159,16 +161,21 @@ ModuleManager::addModule(StringRef FileName, ModuleKind Type,
159161
// Load the contents of the module
160162
if (std::unique_ptr<llvm::MemoryBuffer> Buffer = lookupBuffer(FileName)) {
161163
// The buffer was already provided for us.
162-
NewModule->Buffer = &ModuleCache->addBuffer(FileName, std::move(Buffer));
164+
NewModule->Buffer = &ModuleCache->addBuiltPCM(FileName, std::move(Buffer));
163165
// Since the cached buffer is reused, it is safe to close the file
164166
// descriptor that was opened while stat()ing the PCM in
165167
// lookupModuleFile() above, it won't be needed any longer.
166168
Entry->closeFile();
167169
} else if (llvm::MemoryBuffer *Buffer =
168-
getModuleCache().lookupBuffer(FileName)) {
170+
getModuleCache().lookupPCM(FileName)) {
169171
NewModule->Buffer = Buffer;
170172
// As above, the file descriptor is no longer needed.
171173
Entry->closeFile();
174+
} else if (getModuleCache().shouldBuildPCM(FileName)) {
175+
// Report that the module is out of date, since we tried (and failed) to
176+
// import it earlier.
177+
Entry->closeFile();
178+
return OutOfDate;
172179
} else {
173180
// Open the AST file.
174181
llvm::ErrorOr<std::unique_ptr<llvm::MemoryBuffer>> Buf((std::error_code()));
@@ -186,7 +193,7 @@ ModuleManager::addModule(StringRef FileName, ModuleKind Type,
186193
return Missing;
187194
}
188195

189-
NewModule->Buffer = &getModuleCache().addBuffer(FileName, std::move(*Buf));
196+
NewModule->Buffer = &getModuleCache().addPCM(FileName, std::move(*Buf));
190197
}
191198

192199
// Initialize the stream.
@@ -198,7 +205,7 @@ ModuleManager::addModule(StringRef FileName, ModuleKind Type,
198205
ExpectedSignature, ErrorStr)) {
199206
// Try to remove the buffer. If it can't be removed, then it was already
200207
// validated by this process.
201-
if (!getModuleCache().tryToRemoveBuffer(NewModule->FileName))
208+
if (!getModuleCache().tryToDropPCM(NewModule->FileName))
202209
FileMgr.invalidateCache(NewModule->File);
203210
return OutOfDate;
204211
}
@@ -263,17 +270,6 @@ void ModuleManager::removeModules(
263270
mod->setASTFile(nullptr);
264271
}
265272
}
266-
267-
// Files that didn't make it through ReadASTCore successfully will be
268-
// rebuilt (or there was an error). Invalidate them so that we can load the
269-
// new files that will be renamed over the old ones.
270-
//
271-
// The ModuleCache tracks whether the module was successfully loaded in
272-
// another thread/context; in that case, it won't need to be rebuilt (and
273-
// we can't safely invalidate it anyway).
274-
if (LoadedSuccessfully.count(&*victim) == 0 &&
275-
!getModuleCache().tryToRemoveBuffer(victim->FileName))
276-
FileMgr.invalidateCache(victim->File);
277273
}
278274

279275
// Delete the modules.
Lines changed: 2 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,2 @@
1+
// A
2+
#include "B.h"
Lines changed: 2 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,2 @@
1+
// B
2+
#include "C.h"
Lines changed: 2 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,2 @@
1+
// C
2+
#include "D.h"
Lines changed: 3 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,3 @@
1+
module A { header "A.h" }
2+
module B { header "B.h" }
3+
module C { header "C.h" }

0 commit comments

Comments
 (0)