Use crdt for the MFS? #454
Sorry for the confusing write-up - I'm kinda tired. Just mentioning you, @hsanjuan, as you're doing a lot of work on the cluster and might want to chime in on the idea. :)
I think it's a good idea; I was just thinking the same thing a few days ago in a different context (not MFS, some IPLD stuff). It's been bothering me all along that IPFS isn't a filesystem, despite the name, because of exactly this problem: CIDs change with every change of content, and that's unavoidable. So as a rule, every time you use Merkle hashing in any context, you need an out-of-band key-value store of some sort to map an unchanging key to a constantly-changing "head" hash, right? It's like working with linked lists in a Lisp: you usually add things to a list by prepending, so you get a new head, and in that case the out-of-band storage is a variable containing a simple in-memory pointer. Every time you prepend to the list, you update the pointer to point to the new head.

In general, in IPLD we could try to build the same old data structures but replace pointers with CIDs. (Another thing that worries me is how impractical that is: CIDs are so much larger, and indirection is so much more expensive, that you can't afford to build a cons cell containing only two CIDs and build up linked lists with them, right? Or binary trees, or any of those other conventional fine-grained data structures. So in practice, IPLD is used only for larger json-ish structures, right? A concrete sketch of such a cell is at the end of this comment.)

Anyway, in a distributed system, instead of variables to store pointers, you have things like the DHT and IPNS: just another out-of-band place to "name" a changing CID with a constant key. So what about IPNS? Well, the key is a cryptographic signing key, and by default it has only one owner: only one computer (and probably only one person, in practice) is authorized to publish a new key-value mapping for that particular key. That's quite a serious limitation! But gpg can work with team keys (https://medium.com/slalom-build/how-to-use-gpg-to-securely-share-secrets-with-your-team-c09c50fe77e3 etc.). Maybe IPNS could too? But only when the number of users with write access is small, and it does not solve the collision problem. As long as collisions are detected and resolved manually, maybe that's ok (as with git).

Then yeah, another choice is a G-set CRDT. I think (but haven't verified the details) that Berty is using one to share a channel amongst n users, at least in the case of a multi-user channel. It would solve the multiple-users-with-write-access problem quite nicely for MFS. But then why not go all the way and support versioning too? All the versions that were ever in the CRDT are still there, unless garbage collection or history compaction is possible, right?

And if we're going to support versioning, I would suggest getting experience with Fossil/Venti on Plan 9 first. (I have also not gotten that far yet, to build that kind of storage server, because 9front doesn't have it; but I want to.) From what I've read, this combination is built up in a similar way: first you have a simple hash store, so that you can write blocks of content and use the hash as the key to retrieve them later. (That's Venti.) Extremely simple. Then Fossil is a filesystem built on top. So I think Venti could be replaced with IPFS as it is today, and maybe MFS could be like Fossil. At any rate, a real filesystem needs to be mountable, and efficient enough to actually use as a filesystem, i.e. I should be able to store my whole home directory that way if I want to.
I think 9P is a good way to make it mountable (never mind the Linux VFS bugs; they will just have to get fixed), and Fossil provides an example of how version control could be done using ordinary file operations, rather than with a whole independent set of shell commands the way git is implemented. I guess Filecoin was supposed to actually be a filesystem, but is it really? It must be using CRDTs, right? (Although as an example of how to build a filesystem, it's the least affordable one that I've ever seen. Just read the minimum system requirements 🤦)
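To make the cons-cell point above concrete, here is a minimal sketch (using go-cid; illustrative only) of what "pointers replaced by CIDs" looks like, and where the mutable head has to live:

```go
package sketch

import (
	cid "github.com/ipfs/go-cid"
)

// Cons is a linked-list cell with CIDs in place of pointers - exactly the
// fine-grained structure argued above to be impractical: each link costs a
// full multihash, and each traversal step is a block fetch.
type Cons struct {
	Head cid.Cid // CID of the value block
	Tail cid.Cid // CID of the next cell; cid.Undef marks the empty list
}

// The "variable holding the head pointer" has no place inside the Merkle
// structure itself; it needs out-of-band mutable storage (IPNS, a DHT
// record, or - as proposed in this issue - a CRDT entry).
var listHead cid.Cid
```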
My line of thinking on how to make a go-ds-crdt-based filesystem (to implement something like MFS) was to store paths as:
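For instance (an illustrative sketch - the exact keys and CIDs here are made up), each full path becomes a datastore key whose value is the file's CID:

```
/docs/readme.md    -> bafy...readme
/docs/img/cat.png  -> bafy...cat
```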
The fact that badger has optimized prefix queries would allow listing all the keys under a given path prefix. I have, however, reservations about whether this would scale well (i.e. I have to query everything to figure out what folders and files exist directly under "/"), and there are also a bunch of design questions. So yeah, I don't know if this is the best fit for a general-purpose replicated MFS (it is perhaps better for other use cases that are more limited in terms of possible paths/layout etc.).
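A sketch of what such a prefix listing looks like against a go-datastore-backed store (assuming the v0.5+ API where calls take a context; the in-memory MapDatastore stands in for the crdt/badger store):

```go
package main

import (
	"context"
	"fmt"

	ds "github.com/ipfs/go-datastore"
	dsq "github.com/ipfs/go-datastore/query"
)

func main() {
	ctx := context.Background()
	store := ds.NewMapDatastore() // stand-in for the go-ds-crdt/badger store

	// Illustrative entries: full path as key, CID string as value.
	_ = store.Put(ctx, ds.NewKey("/docs/readme.md"), []byte("bafy...readme"))
	_ = store.Put(ctx, ds.NewKey("/docs/img/cat.png"), []byte("bafy...cat"))

	// List everything under /docs - badger makes this prefix scan cheap,
	// but note it returns the whole subtree, not just the direct children.
	res, err := store.Query(ctx, dsq.Query{Prefix: "/docs"})
	if err != nil {
		panic(err)
	}
	entries, _ := res.Rest()
	for _, e := range entries {
		fmt.Println(e.Key, "->", string(e.Value))
	}
}
```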
Hey @hsanjuan, first I think using paths with a leading / is bad for several reasons.
Not sure how performant something like split() would be in this database... however, we could do the following to avoid it: leverage the fact that "." is a reserved directory name, so it cannot appear as a directory name in regular, valid paths:
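Something like this (illustrative keys; the exact encoding is just a sketch):

```
docs/.            -> ""              (marker: folder "docs" exists)
docs/readme.md    -> bafy...readme
docs/img/.        -> ""              (marker: folder "docs/img" exists)
docs/img/cat.png  -> bafy...cat
```

The "." entries make folders (including empty ones) enumerable without parsing every key.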
That's still pretty human-readable, but we would need an entry for each folder. Would it be "safe" to just create folder entries on each file add, which get discarded as duplicates by the CRDT? :)
Not sure whether that's actually an issue… Google Drive actually allows you to create folders with exactly the same name, and it gets mounted all the time. 🤷🏼 I think it's up to the applications to detect that and somehow show both in the view of the folder.
@hsanjuan I think the added benefit would be that the cluster could offer a "files" command as well, providing the same naming scheme on top of the normal pins. I think this would solve ipfs-cluster/ipfs-cluster#903. For IPFS clients, this would make it possible to display/explore the content of clusters read-only. If we create a simple yaml/json under collab.ipfscluster.io for this, users could just explore the clusters shared there as a list - or manually add clusters/lists to their clients.
@hsanjuan btw, IIRC you currently cannot pin the same CID under multiple names. I think this may be an issue here... as the path would best be stored in the name itself, keeping the file structures in the cluster and in the MFS "compatible" and also readable. A different option would be to use key-value elements for that, like a list of paths, but that would be pretty awful to parse and read in the current views on it in cluster-ctl.
@Jorropo and I briefly discussed the idea of using a CRDT in the MFS.
The idea is to avoid storing directories as trees, in favor of a highly concurrent, mutable representation of the files inside the MFS:
Using the current implementation for IPFS-Cluster, each file would be added as an individual pin, and its path would be stored inside the name field. Folders would exist simply because there are files in them (git style).
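A minimal sketch of what such a pin entry could look like (a simplified illustration, not the actual ipfs-cluster API):

```go
// Pin is a simplified stand-in for a cluster pin entry (illustrative type).
type Pin struct {
	Cid  string // CID of the file's UnixFS node
	Name string // full MFS path, e.g. "/photos/2021/cat.png"
}

// "Folders" are implicit: /photos and /photos/2021 exist only because
// some pin's Name places a file under them, as in git.
var pinset = []Pin{
	{Cid: "bafy...cat", Name: "/photos/2021/cat.png"},
	{Cid: "bafy...dog", Name: "/photos/2021/dog.png"},
}
```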
This allows multiple write, rewrite, move, and rename operations to happen concurrently - and it would not even be limited to a single node: as in ipfs-cluster, multiple nodes could be allowed to write to the same pinset, based on trust. So a user could link up multiple nodes and have them share the same MFS.
Analogous to creating an archive, the user could choose to "freeze" a folder (and its file/folder structure). This would require taking a partial lock locally (e.g. keyed on a hash of the directory path) on that section of the MFS, to block any further operations. Then all files and folders, as currently represented by the pinset, would be linked into a UnixFS CID and added as a regular pin.
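A rough sketch of that freeze flow (every name here is hypothetical; this is not an existing API, just the shape of the operations it would need):

```go
package sketch

import (
	"context"

	cid "github.com/ipfs/go-cid"
)

// CrdtMFS names the operations a freeze would need; all methods hypothetical.
type CrdtMFS interface {
	LockPrefix(dir string) (unlock func())                                         // partial lock on one subtree
	ListPrefix(ctx context.Context, dir string) (map[string]cid.Cid, error)        // path -> CID under dir
	BuildUnixFSDir(ctx context.Context, files map[string]cid.Cid) (cid.Cid, error) // link entries into a UnixFS DAG
	Pin(ctx context.Context, c cid.Cid, name string) error                         // add as a regular pin
}

// Freeze snapshots dir into an immutable UnixFS CID while writes are blocked.
func Freeze(ctx context.Context, fs CrdtMFS, dir string) (cid.Cid, error) {
	unlock := fs.LockPrefix(dir)
	defer unlock()

	files, err := fs.ListPrefix(ctx, dir)
	if err != nil {
		return cid.Undef, err
	}
	root, err := fs.BuildUnixFSDir(ctx, files)
	if err != nil {
		return cid.Undef, err
	}
	return root, fs.Pin(ctx, root, dir) // e.g. named after the frozen path
}
```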
Additionally, the "frozen" pin of the folder could be shown in the GUI as a version with a timestamp when displaying the folder.
Rationale
Apart from the already mentioned main advantages - reduced overhead for concurrent access to the MFS, the ability to sync multiple clients in a distributed fashion on a single MFS (view), etc. - it would also strictly separate the "CID" representation of a folder from the MFS representation of a folder.
I talked with various people taking their first steps with IPFS, and they had a hard time understanding the CID concept as an immutable data structure.
The default assumption is that a CID identifies a fixed point in the network where you provide the most recent version of some content (which is actually what IPNS does). So they are very surprised when they create a folder for a friend, share the CID with them, put a new file into the folder, and "the new file is not appearing" in the friend's view of the folder.
IPNS integration in the IPFS GUI
I think it would also allow easier IPNS integration in the GUI. The UX would be much better if users could see "their folder" in all of their clients, "freeze" the folder into a CID, and then have a way to move the IPNS entry from the last version to the new version of the folder - like a button "update IPNS" next to the version.
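Under the hood, that "update IPNS" button would roughly amount to republishing the key to point at the freshly frozen CID - for example via the HTTP API client (a sketch, assuming go-ipfs-api and a local daemon on the default port):

```go
package sketch

import (
	shell "github.com/ipfs/go-ipfs-api"
)

// UpdateIPNS points the node's IPNS key at the freshly frozen folder CID.
func UpdateIPNS(frozenCid string) error {
	sh := shell.NewShell("localhost:5001")
	// Publish under the node's own key; "" selects the default "self" key.
	return sh.Publish("", "/ipfs/"+frozenCid)
}
```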