Skip to content

Use orphaned docs as part of GC calculation #80

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 5 commits into from
Oct 23, 2018
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
Original file line number Diff line number Diff line change
Expand Up @@ -26,7 +26,7 @@ public interface LruDelegate {
/** Enumerates all the targets in the QueryCache. */
void forEachTarget(Consumer<QueryData> consumer);

long getTargetCount();
long getSequenceNumberCount();

/** Enumerates sequence numbers for documents not associated with a target. */
void forEachOrphanedDocumentSequenceNumber(Consumer<Long> consumer);
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -109,7 +109,7 @@ public int getDocumentsRemoved() {

/** Given a percentile of target to collect, returns the number of targets to collect. */
int calculateQueryCount(int percentile) {
long targetCount = delegate.getTargetCount();
long targetCount = delegate.getSequenceNumberCount();
return (int) ((percentile / 100.0f) * targetCount);
}

Expand Down Expand Up @@ -216,15 +216,13 @@ private Results runGarbageCollection(SparseArray<?> liveTargetIds) {
int numDocumentsRemoved = removeOrphanedDocuments(upperBound);
long removedDocumentsTs = System.currentTimeMillis();

// TODO(gsoltis): post-compaction?

if (Logger.isDebugEnabled()) {
String desc = "LRU Garbage Collection:\n";
desc += "\tCounted targets in " + (countedTargetsTs - startTs) + "ms\n";
desc +=
String.format(
Locale.ROOT,
"\tDetermined least recently used %d sequence numbers in %dms",
"\tDetermined least recently used %d sequence numbers in %dms\n",
sequenceNumbers,
(foundUpperBoundTs - countedTargetsTs));
desc +=
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -82,8 +82,14 @@ public void forEachTarget(Consumer<QueryData> consumer) {
}

@Override
public long getTargetCount() {
return persistence.getQueryCache().getTargetCount();
public long getSequenceNumberCount() {
long targetCount = persistence.getQueryCache().getTargetCount();
long orphanedCount[] = new long[1];
forEachOrphanedDocumentSequenceNumber(
sequenceNumber -> {
orphanedCount[0]++;
});
return targetCount + orphanedCount[0];
}

@Override
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -70,8 +70,14 @@ public LruGarbageCollector getGarbageCollector() {
}

@Override
public long getTargetCount() {
return persistence.getQueryCache().getTargetCount();
public long getSequenceNumberCount() {
long targetCount = persistence.getQueryCache().getTargetCount();
long orphanedDocumentCount =
persistence
.query(
"SELECT COUNT(*) FROM (SELECT sequence_number FROM target_documents GROUP BY path HAVING COUNT(*) = 1 AND target_id = 0)")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't know SQLite, and I assume you tried "SELECT COUNT() FROM target_documents GROUP BY path HAVING COUNT() = 1 AND target_id = 0"?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I did indeed try that first. I believe COUNT(*) in the projection interacts poorly with GROUP BY in terms of getting what we want here. Using the subquery technique works, and in retrospect is I think a little more obvious when compared with the similar query used to find the sequence numbers.

.firstValue(row -> row.getLong(0));
return targetCount + orphanedDocumentCount;
}

@Override
Expand Down