Local disk cache: part 1 #530
base: master
Force-pushed from 6939738 to 1046709.
go/pkg/diskcache/diskcache.go
for i := 0; i < 256; i++ {
	_ = os.MkdirAll(filepath.Join(root, fmt.Sprintf("%02x", i)), os.ModePerm)
}
_ = filepath.WalkDir(root, func(path string, d fs.DirEntry, err error) error {
Can this be run in parallel for each 00-ff directory inside the for-loop above? 256 synchronous directory walks don't look great, especially when each directory will contain thousands of files.
DiskCache::New should be multithreaded and use as much of the available IO as possible. An FS walk on POSIX OSes may be pretty fast, but Windows is very slow when walking directories with thousands of files.
Yep, absolutely. Done.
go/pkg/diskcache/diskcache.go
return err
}
defer out.Close()
_, err = io.Copy(out, in)
Can we move the file copy function into a separate source file (similar to atim_*) so it can be altered for darwin and windows to make use of clonefile and the upcoming Windows DevDrive CoW? clonefile preserves the chmod, so it can be a single call. We also don't need to do a manual buffer copy here; just let the OS perform the file copy and update the chmod if required. It should be much faster.
CloneFile utils can be found here as an example:
https://github.com/git-lfs/git-lfs/blob/main/tools/util_darwin.go
https://github.com/git-lfs/git-lfs/blob/main/tools/util_linux.go
Great idea, thank you! I didn't know of clonefile. I renamed the system-specific files to sys_* and changed the header to match, so that this change can be nicely done in a followup PR. It is a very contained optimization which doesn't affect the cache layout, so it's backward-compatible, and hence imo it belongs in a followup.
Force-pushed from c22848c to 21666d4.
Thanks for the implementation!
go/pkg/diskcache/diskcache.go
if err != nil {
	return err
}
// Required sanity check: sometimes the copy pretends to succeed, but doesn't, if
Can you expand on that? That's news to me. Or do you mean in the general case of a race condition?
In general, if we write to a file while concurrently deleting it, the Copy might not return an error (even if no bytes were actually copied), and neither might the Stat. But the Stat will return a different size than expected (usually 0).
But the comment made me notice that io.Copy actually returns the number of bytes copied and seems to be correct, so I amended the sanity check to only verify the copy's return value, which is faster.
go/pkg/diskcache/diskcache.go
}
it := iUntyped.(*qitem)
it.mu.RLock()
if err := copyFile(d.getPath(k), path, dg.Size); err != nil {
This simplifies the control flow:
it.mu.RLock()
err := copyFile(d.getPath(k), path, dg.Size)
it.mu.RUnlock()
if err != nil {
Yep, thank you.
go/pkg/diskcache/diskcache.go
prefixDir := filepath.Join(root, fmt.Sprintf("%02x", i))
go func() {
	defer wg.Done()
	_ = os.MkdirAll(prefixDir, os.ModePerm)
There are extreme cases where this could fail (e.g. disk full). I'd recommend using https://pkg.go.dev/golang.org/x/sync/errgroup#WithContext for error management instead of logging and silently swallowing the failure.
Yeah, basically my main problem here was that I didn't want to return an error from New, because that would mean that the client Opt can return an error on Apply, and the code currently doesn't support that.
To be fair, that is the right thing to do, imo -- options can, in general, fail. So I'd propose a separate independent PR to add an error return type to Apply, and then I can propagate all the errors properly from here. @mrahs does that SGTY?
Another idea (pushing the problem further down the line) would be to return an error from New now, but then log + ignore it in Apply. Which is what I'll do for now, until @mrahs weighs in on whether he's okay with me changing the Apply type.
Thank you!
I'd push error handling onto the user by making the client.Opt accept a *DiskCache instance.
func (d *DiskCache) StoreCas(dg digest.Digest, path string) error {
	if dg.Size > int64(d.maxCapacityBytes) {
		return fmt.Errorf("blob size %d exceeds DiskCache capacity %d", dg.Size, d.maxCapacityBytes)
A more graceful behavior is to try to trigger a synchronous GC first before erroring out.
Why? This will fail for sure regardless of GC. Note that we compare with maxCapacityBytes, not the current sizeBytes.
}
it.mu.Lock()
defer it.mu.Unlock()
_, exists := d.store.LoadOrStore(it.key, it)
d.store is already synchronized by it.mu. I'd vote to make store a normal map[digest]*qitem instead; it would be faster.
Oh, no, store cannot be synchronized by it.mu because it.mu is new (that would be a race condition). It could, in theory, be a normal map synchronized by mu instead (the DiskCache class variable that we use to protect queue), but that would be slower than using a sync map:
The Map type is optimized for two common use cases: (1) when the entry for a given key is only ever written once but read many times, as in caches that only grow, or (2) when multiple goroutines read, write, and overwrite entries for disjoint sets of keys. In these two cases, use of a Map may significantly reduce lock contention compared to a Go map paired with a separate Mutex or RWMutex.
(from https://pkg.go.dev/sync#Map)
We have a classical case 1 here, where almost always a key is only written once but read many times from concurrent threads. A sync map should greatly reduce lock contention.
go/pkg/diskcache/sys_windows.go
// This will return correct values only if `fsutil behavior set disablelastaccess 0` is set.
// Tracking of last access time is disabled by default on Windows.
func GetLastAccessTime(path string) (time.Time, error) {
I'd recommend instead func FileInfoToAccessTime(i *os.FileInfo) time.Time, and keep the os.Stat() call at the call site.
Done.
}

// If the digest exists in the disk cache, copy the file contents to the given path.
func (d *DiskCache) LoadCas(dg digest.Digest, path string) bool {
I'd propose to rename from LoadCas to CopyFromCache
I realize the sister function is called StoreCas. I don't know if that's nomenclature reused in this code base; if so, ignore this comment.
Yeah, Load* and Store* are consistent with sync.Map's naming, we also sometimes use Get/Update. I'll leave the naming decisions to @mrahs.
I think the cache can have three main methods: putting bytes, getting bytes, and copying bytes to a file path. Since the cache deals with bytes, I'm inclined towards Write(key, Reader), Read(key, Writer), Copy(key, filepath), but Store, Load, LoadInto/CopyTo also sound reasonable. I made another comment on generalizing this method, which would avoid having to choose suffixes.
	}()
}
wg.Wait()
go res.gc()
I think gc should only be called if len(res.queue) > X && res.sizeBytes > Y && res.queue[0].lat < time.Now().Sub(Z), to reduce unnecessary processing.
Not sure I understand -- gc is a daemon thread; it blocks on the GC request channel (gcReq) and only works if there is something to do (total size is over capacity). And we only send a GC request to the channel when we reach capacity. The current line only starts the daemon thread. Did you mean changes to the gc thread itself, or to the places where a GC request is added to the channel?
Force-pushed from 16916b3 to 7b8156c, then from 7b8156c to df6b425.
I'm a little concerned about the whole "scan this huge dir on startup" process. In Chromium, Reclient is auto started/stopped during each autoninja invocation. This is different compared to how Goma worked: there was a separate goma_ctl ensure_start command that kept the proxy in the background until a user terminated it manually, which is why re-reading the local disk cache wasn't an issue.

Does this mean reproxy startup (and the autoninja call in general) will be delayed by the overall time to scan the "local cache" directory? Or is this a separate process right now? @ola-rozenfeld can you please do some tests around this area?

I can think of these options to improve the situation:
- Run the "local cache" scan in background and just "fail" cache lookups until it's ready (maybe with some separate status). This is not perfect, because the scan will be even slower when the actual build starts.
- Use some kind of a simple "database" file that can be hot-loaded at startup and then updated/synced in the background (a proto file for example, or maybe some Go-specific serialization).
I agree and was about to suggest the same thing. In isolated, I used the second option. A very simple JSON file worked fine, really, just keeping objects in an LRU list. Then a priority queue is not needed; just mutate the slice in place. Go is surprisingly fast at this; I would be surprised if it were not faster than using a priority queue. One lock instead of many.
The isolated code soft-failed when a file was missing / corrupted on fetch.
Only "once in a while" would you scan the whole directory to make sure there are no forgotten files.
	"time"
)

func FileInfoToAccessTime(info fs.FileInfo) time.Time {
I'd vote to not export this function.
	}
}
// Randomly access and store files from different threads.
eg, _ := errgroup.WithContext(context.Background())
If there's no context, use eg := errgroup.Group{}
return filepath.WalkDir(prefixDir, func(path string, d fs.DirEntry, err error) error {
	// We log and continue on all errors, because cache read errors are not critical.
	if err != nil {
		return fmt.Errorf("error reading cache directory: %v", err)
s/%v/%w/g?
if err != nil {
	return fmt.Errorf("error parsing cached file name %s: %v", path, err)
}
atime, err := getLastAccessTime(path)
It's likely already cached in d.Info(), I'd inline this function here to save one stat per file.
	return uint64(atomic.LoadInt64(&d.sizeBytes))
}

func (d *DiskCache) getKeyFromFileName(fname string) (key, error) {
This doesn't need to be a member function.
	testGcTicks chan uint64
}

func New(ctx context.Context, root string, maxCapacityBytes uint64) (*DiskCache, error) {
I think walking the disk on initialization is going to introduce a performance hit proportional to the size of the cache. I like the idea of persisting the state to a file and reloading that single file on startup. This has two main advantages:
- cheaper startup cost
- access times are persisted as known by the cache, regardless of the nuances of the underlying filesystem (e.g. may be mounted with noatime)
	return filepath.Join(d.root, k.digest.Hash[:2], fmt.Sprintf("%s.%d", k.digest.Hash[2:], k.digest.Size))
}

func (d *DiskCache) StoreCas(dg digest.Digest, path string) error {
I think generalizing this method to Write(dg digest.Digest, r io.Reader) error provides several advantages:
- works for both CAS and action results
- can potentially work out of the box with digested virtual inputs (Allow clients to provide InputDigests to avoid unnecessary transfer #525)
- delegates serialization to the user, which reduces the dependencies of this pkg

We should also document that the cache does not enforce any relationship between dg and the reader's bytes.
	return nil
}

func (d *DiskCache) gc() {
I think this implementation may have the following issues:
- if the capacity is not sufficient for the workload, eviction becomes part of every update.
- having three different locks acquired and released independently is prone to races; e.g. attempting to load an item after it's been deleted from the ordered list, but before it's been deleted from the map, will fail to update the index in the list.
- the client must remember to Shutdown the cache to avoid resource leaks.

The capacity issue can be mitigated by evicting a percentage of the cache rather than just enough to meet the threshold. This accommodates several new items before triggering another eviction.

The synchronization issue can be traded off against a simpler implementation that uses a single lock. Updating the list and the map should be an atomic operation, in which case the entire list is shared and therefore becomes the bottleneck. Then again, hits are bounded by IO anyway. Misses, on the other hand, can be faster with less locking, but misses are always followed by updates, which are bounded by IO. I think IO is more of a bottleneck than synchronization, so it's worth going for a simple locking flow.

How bad would it be if eviction were blocking? I think if eviction saturates IO bandwidth it should be the same whether it's async or sync. Maybe blocking would be faster if we consider disk locality. I also think that most users would prefer using a large cache to avoid eviction anyway. Without a background eviction task, the cache would be simpler to work with.
*bump*
Yes, I'm really sorry for disappearing -- I've had a very hectic few weeks and didn't have time for this. This effort is something I'm doing voluntarily in my spare time, and it is not currently prioritized by EngFlow. I will be back to working on this in the next week or two!
Thanks @ola-rozenfeld, much appreciated
Introducing the diskcache package, CAS-only (Action Cache to be added in part 2).
Split out from #529 which has the whole thing, for reference.