Parity: URLCache #2341

millenomi · 2019-06-13T18:19:32Z

This strictly implements the class. We need a separate adoption patch to hook it up to _HTTPProtocol, URLSession and URLSessionConfiguration. It has on-disk and in-memory caching. This also fixes a couple issues with archiving.

Implement it all.

millenomi · 2019-06-13T18:28:03Z

cc @drodriguez @karthikkeyan

millenomi · 2019-06-13T18:28:16Z

@swift-ci please test linux

millenomi · 2019-06-13T18:28:58Z

I'm strongly considering _FTPProtocol not employing caching.

millenomi · 2019-06-13T19:12:48Z

I mean obviously _HTTPURLProtocol/_FTPURLProtocol, whoops.

karthikkeyan

Nice implementation @millenomi I really like it.

The only improvement I would suggest is to reduce the number of loops in the evictFrom* methods. Overall the methods run in O(n) time, but I would try to reduce the number of loops if possible; that can be done in subsequent PRs, or I will try when I get some spare time.

Foundation/URLCache.swift

karthikkeyan · 2019-06-14T04:41:45Z

Foundation/URLCache.swift

+
+    private func identifier(for request: URLRequest) -> String? {
+        guard let url = request.url?.absoluteString else { return nil }
+        return Data(url.utf8).base64EncodedString()


Good use of base64EncodedString(), I totally forgot about the existence of this API when I was implementing.

Just one thing about this implementation: what if the same URL is used with different HTTP methods? Doesn't this implementation produce the same identifier for different URLRequests?

Yes, this identifier should not be returned for many kinds of URLRequest. Some methods should not be cached (PUT, POST?), and some HTTP headers might instruct the cache not to store the response. I also think that fragments in a URL should not be taken into account while caching, and I'm not sure about query parameters.

I also don't like that it is unbounded: a URL can be up to 4KB in size or something like that, but file names cannot be that long on some systems.

It is a simple start, but I would not mind a TODO here.
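To make the concerns above concrete, here is a minimal sketch of an identifier that includes the HTTP method, ignores the fragment, and has a bounded length. It is not the PR's code: `cacheIdentifier(for:)` is a hypothetical name, treating only GET and HEAD as cacheable is an assumption, and the FNV-1a hash stands in for whatever digest a real implementation would choose (a 64-bit hash can collide, so a production cache would still need to compare the stored URL).

```swift
import Foundation
#if canImport(FoundationNetworking)
import FoundationNetworking
#endif

// Hypothetical helper: returns nil for requests this sketch assumes are not cacheable.
func cacheIdentifier(for request: URLRequest) -> String? {
    // Assumption: only GET and HEAD responses are candidates for caching.
    let method = (request.httpMethod ?? "GET").uppercased()
    guard method == "GET" || method == "HEAD" else { return nil }

    // Drop the fragment; it is never sent to the server.
    guard let url = request.url,
          var components = URLComponents(url: url, resolvingAgainstBaseURL: true) else { return nil }
    components.fragment = nil
    guard let normalized = components.string else { return nil }

    // 64-bit FNV-1a over "METHOD URL" keeps the result at 16 hex digits or fewer,
    // so it is always short enough to use as a file name.
    var hash: UInt64 = 0xcbf29ce484222325
    for byte in "\(method) \(normalized)".utf8 {
        hash ^= UInt64(byte)
        hash = hash &* 0x100000001b3
    }
    return String(hash, radix: 16)
}
```

Query parameters are deliberately left in the key here, since the thread leaves that question open.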

Foundation/URLCache.swift

karthikkeyan · 2019-06-14T06:45:05Z

@millenomi Something like this,

func evictFromMemoryCacheAssumingLockHeld(maximumSize: Int) {
    // First pass: record each entry's cost in order and total them.
    var totalSize = 0
    var costs: [Int] = []
    for identifier in inMemoryCacheOrder {
        let cost = inMemoryCacheContents[identifier]!.cost
        costs.append(cost)
        totalSize += cost
    }

    guard totalSize > maximumSize else { return }

    // Second pass: evict from the front of the order until the cache fits.
    var evictedCount = 0
    for (index, cost) in costs.enumerated() {
        inMemoryCacheContents.removeValue(forKey: inMemoryCacheOrder[index])
        totalSize -= cost
        evictedCount += 1
        if totalSize <= maximumSize {
            break
        }
    }
    // removeFirst keeps the property an Array and stays in bounds even when every entry is evicted.
    inMemoryCacheOrder.removeFirst(evictedCount)
}

This way we can reduce the 6 different loops down to just 2. I am not sure if it is worth doing, but what do you think about the above approach?
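For anyone who wants to try the idea outside of URLCache, the following is a self-contained sketch. `MemoryCacheSketch` and its `Entry` type are hypothetical stand-ins for the cache's private state; only the property names are borrowed from the snippet above. It also exercises the edge case where every entry has to go (`maximumSize` of 0).

```swift
// Hypothetical stand-in for the cache's in-memory bookkeeping.
final class MemoryCacheSketch {
    struct Entry { let cost: Int }

    var inMemoryCacheOrder: [String] = []
    var inMemoryCacheContents: [String: Entry] = [:]

    func insert(_ identifier: String, cost: Int) {
        inMemoryCacheOrder.append(identifier)
        inMemoryCacheContents[identifier] = Entry(cost: cost)
    }

    func evict(maximumSize: Int) {
        // Pass 1: total cost of everything currently stored.
        var totalSize = inMemoryCacheOrder.reduce(0) { $0 + (inMemoryCacheContents[$1]?.cost ?? 0) }

        // Pass 2: drop entries from the front until the total fits.
        var evictedCount = 0
        while totalSize > maximumSize, evictedCount < inMemoryCacheOrder.count {
            let identifier = inMemoryCacheOrder[evictedCount]
            if let entry = inMemoryCacheContents.removeValue(forKey: identifier) {
                totalSize -= entry.cost
            }
            evictedCount += 1
        }
        inMemoryCacheOrder.removeFirst(evictedCount)
    }
}
```

Whether the front of `inMemoryCacheOrder` really holds the entries that should be evicted first depends on how the real cache maintains that array; the sketch simply assumes it does.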

drodriguez

I'm concerned that there's no locking around the disk operations. There are moments when the disk files are removed and created, and I don't think some of those operations will behave as expected when several threads are trying to store or retrieve cached contents for the same key.

It is also not ideal that the disk operations block the caller. But once this is in, I can propose a lazy implementation for the disk part, and check that it works correctly.

Thanks for the work!
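As an illustration of the locking concern raised in this comment, here is a minimal sketch that serializes disk reads, writes, and removals behind a single `NSLock`. `LockedDiskStore` is a hypothetical type, not part of the PR, and it assumes keys are already safe to use as file names.

```swift
import Foundation

// Hypothetical store: one lock guards every file operation in the directory.
final class LockedDiskStore {
    private let directory: URL
    private let lock = NSLock()

    init(directory: URL) throws {
        self.directory = directory
        try FileManager.default.createDirectory(at: directory, withIntermediateDirectories: true)
    }

    // Assumption: `key` is a valid file name.
    private func fileURL(for key: String) -> URL {
        return directory.appendingPathComponent(key)
    }

    func store(_ data: Data, for key: String) throws {
        lock.lock()
        defer { lock.unlock() }
        // .atomic writes to a temporary file and renames it into place.
        try data.write(to: fileURL(for: key), options: .atomic)
    }

    func data(for key: String) -> Data? {
        lock.lock()
        defer { lock.unlock() }
        return try? Data(contentsOf: fileURL(for: key))
    }

    func remove(_ key: String) {
        lock.lock()
        defer { lock.unlock() }
        try? FileManager.default.removeItem(at: fileURL(for: key))
    }
}
```

A single lock keeps callers blocked while the disk is busy, which is the second concern in the comment; moving the work onto a serial queue would be one way to address that, at the cost of a more involved API.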

drodriguez · 2019-06-14T19:58:02Z

Foundation/URLResponse.swift

-        }
+
+        let nsmimetype = aDecoder.decodeObject(of: NSString.self, forKey: "NS.mimeType")
+        self.mimeType = nsmimetype as String?


Nit: This optional member uses one code pattern (directly assigning nil), while the two below use another code pattern (if let). As far as I see there's no difference in results, but using two different patterns might indicate intention. I would recommend using the same pattern in all three.

drodriguez · 2019-06-14T20:09:58Z

TestFoundation/TestURLCache.swift

+        try FileManager.default.removeItem(at: writableTestDirectoryURL)
+        try FileManager.default.createDirectory(at: writableTestDirectoryURL, withIntermediateDirectories: true)
+
+        XCTAssertNil(cache.cachedResponse(for: request))


Wouldn’t it be better to create a second cache in the same directory and query that new cache? That forces it to use the disk, since the in-memory contents of the first cache are gone.

drodriguez · 2019-06-14T20:15:38Z

TestFoundation/TestURLCache.swift

+        super.setUp()
+
+        let pid = ProcessInfo.processInfo.processIdentifier
+        writableTestDirectoryURL = URL(fileURLWithPath: NSTemporaryDirectory()).appendingPathComponent("org.swift.TestFoundation.TestURLCache.\(pid)")


processIdentifier will not change from test to test. It might be better to use ProcessInfo.processInfo.globallyUniqueString instead, to avoid reusing the same directory again and again.
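A minimal sketch of that suggestion follows. `makeUniqueTestDirectoryURL()` is a hypothetical helper name; the directory prefix is the one used in the diff above.

```swift
import Foundation

// Hypothetical helper: builds a fresh directory URL on every call.
func makeUniqueTestDirectoryURL() -> URL {
    // globallyUniqueString returns a new value each time it is read,
    // unlike processIdentifier, which is constant for the whole test run.
    let unique = ProcessInfo.processInfo.globallyUniqueString
    return URL(fileURLWithPath: NSTemporaryDirectory())
        .appendingPathComponent("org.swift.TestFoundation.TestURLCache.\(unique)")
}
```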

Foundation/URLCache.swift

drodriguez · 2019-06-14T23:16:06Z

Foundation/URLCache.swift

+
+        inMemoryCacheLock.performLocked {
+            if inMemoryCacheContents[identifier] != nil {
+                inMemoryCacheOrder.removeAll(where: { $0 == identifier })


Since there should be only one, maybe avoid the removeAll, and because we know it should exist if it is present in the dictionary, we can use forced unwrap.

inMemoryCacheOrder.remove(at: inMemoryCacheOrder.firstIndex(where: { $0 == identifier })!)
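If the forced unwrap feels too risky, the same single-removal idea can be written without it. This is only a sketch; `removeFirstOccurrence(of:from:)` is a hypothetical helper, not something in the PR.

```swift
// Hypothetical helper: removes at most one matching element and reports whether it found one.
@discardableResult
func removeFirstOccurrence(of identifier: String, from order: inout [String]) -> Bool {
    // firstIndex(of:) stops at the first match instead of scanning the whole array.
    guard let index = order.firstIndex(of: identifier) else { return false }
    order.remove(at: index)
    return true
}
```

The trade-off is that a missing identifier, which would indicate the array and dictionary have drifted apart, is silently tolerated instead of trapping.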

drodriguez · 2019-06-14T23:18:02Z

Foundation/URLCache.swift

+            inMemoryCacheOrder = []
+        }
+
+        evictFromDiskCache(maximumSize: 0)


I wonder if removing the directory and recreating it would be faster, instead of enumerating and deleting each file one by one.
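A sketch of the remove-and-recreate approach, under the assumption that the cache owns the whole directory. `resetDirectory(at:)` is a hypothetical name, and no claim is made here about which approach is actually faster.

```swift
import Foundation

// Hypothetical helper: empties a directory by deleting and recreating it.
func resetDirectory(at url: URL) throws {
    let fileManager = FileManager.default
    // Removing the directory deletes all of its contents in one call.
    if fileManager.fileExists(atPath: url.path) {
        try fileManager.removeItem(at: url)
    }
    try fileManager.createDirectory(at: url, withIntermediateDirectories: true)
}
```

Between the two calls the directory briefly does not exist, so concurrent readers or writers would need to be excluded while this runs.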

drodriguez · 2019-06-14T23:48:42Z

Foundation/URLCache.swift

+                if entry.value.date > date {
+                    identifiersToRemove.insert(entry.key)
+                }
+            }


let identifiersToRemove = inMemoryCacheContents.values.filter { $0.date > date }.map { $0.identifier }

drodriguez · 2019-06-14T23:51:51Z

Foundation/URLCache.swift

+            for toRemove in identifiersToRemove {
+                inMemoryCacheContents.removeValue(forKey: toRemove)
+            }
+            inMemoryCacheOrder.removeAll { identifiersToRemove.contains($0) }


I think it can be done by iterating over the array and using the dictionary for lookups, instead of ending up doing O(n*m) work in this final step.

inMemoryCacheOrder.removeAll { identifier in
    if inMemoryCacheContents[identifier]!.date > date {
        inMemoryCacheContents.removeValue(forKey: identifier)
        return true
    } else {
        return false
    }
}

This way, the code above is unnecessary.
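The single-pass version can be tried in isolation with the sketch below. `OrderedCacheSketch` and its `Entry` type are hypothetical; only the two property names come from the diff. Unlike the suggestion above, it skips identifiers that are missing from the dictionary instead of force-unwrapping.

```swift
import Foundation

// Hypothetical stand-in for the cache's in-memory bookkeeping.
final class OrderedCacheSketch {
    struct Entry { let date: Date }

    var inMemoryCacheOrder: [String] = []
    var inMemoryCacheContents: [String: Entry] = [:]

    func removeEntries(since date: Date) {
        // One walk over the array; each dictionary lookup and removal is O(1) on average.
        inMemoryCacheOrder.removeAll { identifier in
            guard let entry = inMemoryCacheContents[identifier], entry.date > date else {
                return false
            }
            inMemoryCacheContents.removeValue(forKey: identifier)
            return true
        }
    }
}
```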

drodriguez · 2019-06-15T00:01:46Z

Foundation/URLCache.swift

+
+            for url in urlsToRemove {
+                try? FileManager.default.removeItem(at: url)
+            }


From the implementation of enumerateDiskEntries, since the contentsOfDirectory(at:…) are generated before invoking the block, I think you can merge both loops into one and forget about urlsToRemove.

enumerateDiskEntries { entry, _ in
    if entry.date > date {
        try? FileManager.default.removeItem(at: url)
    }
}
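Since the signature of enumerateDiskEntries is not shown in this thread, here is a standalone sketch of the same "snapshot the listing, then delete in one loop" idea using only FileManager. `removeFiles(in:modifiedAfter:)` is a hypothetical helper, and using the file's modification date as the entry date is an assumption made purely for illustration.

```swift
import Foundation

// Hypothetical helper: deletes files newer than `date` in a single loop.
func removeFiles(in directory: URL, modifiedAfter date: Date) {
    let fileManager = FileManager.default
    let keys: [URLResourceKey] = [.contentModificationDateKey]

    // The listing is fully built before the loop starts, so deleting inside the loop is safe.
    guard let urls = try? fileManager.contentsOfDirectory(at: directory,
                                                          includingPropertiesForKeys: keys) else { return }
    for url in urls {
        // Assumption: modification date stands in for the cached entry's date.
        guard let modified = (try? url.resourceValues(forKeys: Set(keys)))?.contentModificationDate else {
            continue
        }
        if modified > date {
            try? fileManager.removeItem(at: url)
        }
    }
}
```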

millenomi · 2019-07-10T17:11:14Z

Reviving this.

millenomi · 2019-07-10T17:13:33Z

Actually: I'm going to close this, and post a consolidated PR that includes the HTTP correctness fixes discussed above (but not your suggestions yet, @drodriguez — I'm going to go through them.)

millenomi · 2019-07-10T17:21:18Z

Moved to #2401.

Parity: URLCache

df5326d

Implement it all.

karthikkeyan reviewed Jun 14, 2019

drodriguez reviewed Jun 15, 2019

millenomi mentioned this pull request Jul 10, 2019

Parity: URLCache adoption for HTTP and FTP invocations. #2401

Merged

millenomi closed this Jul 10, 2019
