This repository has been archived by the owner on Feb 8, 2023. It is now read-only.

[braindump] How to improve the mounting of go-ipfs as a regular filesystem #448

Open
markg85 opened this issue Jun 24, 2021 · 7 comments
Labels
need/triage Needs initial labeling and prioritization

Comments

@markg85

markg85 commented Jun 24, 2021

Hi,

Just the other day I had a very interesting discussion with @Jorropo and @RubenKelevra on Discord about improving the native mounting of IPFS folders on your system.

This is a large braindump-like post. Any feedback you can provide is very much appreciated!
All of us in the discussion have our own, very different reasons for wanting this, so I hope I'm writing this down in the most appropriate and complete way. If not, please feel free to tell me what I should add.

The current limitations

  1. The mounting is very limited in terms of metadata properties (UnixFS v1.5 is supposedly fixing this?)
  2. If you have, in your own network, a server that runs IPFS, you cannot easily mount IPFS folders from there on another PC
  3. If you want a "cloud drive" on IPFS, say akin to OneDrive/Dropbox/..., then you currently need to keep all your data public and unencrypted to make it even remotely possible.
  4. (yes this was a reason) You cannot boot from IPFS
  5. File permissions

Wishes
These are just the ideas we had. There are likely a whole lot more.

  1. Mount IPFS folder over network
  2. Make it possible to boot from IPFS
  3. Have file encryption support, making a Dropbox-like service with private data possible
  4. Saturate network card speed for local network traffic. If you have a 1Gbit network and your IPFS node has the data you want locally (on its host storage), then the network speed should be 1Gbit or close to the theoretical limit. Protocols like SMB and NFS struggle with this too.
  5. Allow browsing NFS files on the web too, again like Dropbox. This means securing your files (aka encryption). The best example of this in the IPFS world is probably Peergos
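
As a rough yardstick for wish 4, the ceiling on a 1Gbit link can be worked out from the framing overhead. The sketch below assumes a standard 1500-byte MTU and IPv4 + TCP without options; the constant and function names are illustrative only, and jumbo frames, TCP options or IPv6 would shift the numbers slightly.

```python
# Back-of-the-envelope ceiling for TCP payload throughput on Ethernet.
# Assumptions (not from the discussion): 1500-byte MTU, IPv4, TCP without options.

IP_TCP_HEADERS = 20 + 20             # IPv4 header + TCP header, in bytes
ETHERNET_OVERHEAD = 8 + 14 + 4 + 12  # preamble/SFD + header + FCS + inter-frame gap


def max_payload_bytes_per_s(link_bits_per_s: float, mtu: int = 1500) -> float:
    """Return the best-case application payload rate in bytes per second."""
    payload_per_frame = mtu - IP_TCP_HEADERS
    wire_bytes_per_frame = mtu + ETHERNET_OVERHEAD
    return link_bits_per_s / 8 * payload_per_frame / wire_bytes_per_frame


if __name__ == "__main__":
    rate = max_payload_bytes_per_s(1_000_000_000)
    # Roughly 118.7 MB/s of payload out of a raw 125 MB/s.
    print(f"{rate / 1e6:.1f} MB/s")
```

So "close to the theoretical limit" on 1Gbit means a bit under 119 MB/s of file data, before any protocol overhead from the mount layer itself.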

There are many more limitations and wishes once you keep brainstorming about this; those are the most prominent ones we discussed.
So, let's solve these! :) I'm going to ignore UnixFS v1.5 for the rest of this post, as I'm assuming it will be in place for any of the future solutions.

Potential solutions

Disclaimer: there is no easy solution. Anything you can imagine is at the very best a huge task and at the very worst a monstrous task.

NFS server

A potential solution to this would be to write an NFS server application that exposes IPFS data via NFS. Internally, the server would use the IPFS API to get all the data it needs.
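
To make the idea a bit more concrete, here is a minimal sketch of how a few NFS procedures might be translated into calls against the IPFS daemon's HTTP RPC API for MFS. The mapping itself, the function names and the assumption that the daemon listens on the default 127.0.0.1:5001 are illustrative; only the three `files/*` endpoints and their options are taken from the existing API. It builds request URLs and nothing more; it is not an NFS server.

```python
from urllib.parse import urlencode
from urllib.request import Request, urlopen

API = "http://127.0.0.1:5001/api/v0"  # assumed default RPC address


def rpc_url(nfs_proc: str, path: str, offset: int = 0, count: int = 0) -> str:
    """Translate an NFS procedure name into an MFS RPC URL (hypothetical mapping)."""
    if nfs_proc == "GETATTR":
        endpoint, params = "files/stat", [("arg", path)]
    elif nfs_proc == "READDIR":
        endpoint, params = "files/ls", [("arg", path), ("long", "true")]
    elif nfs_proc == "READ":
        endpoint = "files/read"
        params = [("arg", path), ("offset", offset), ("count", count)]
    else:
        raise ValueError(f"unsupported NFS procedure: {nfs_proc}")
    return f"{API}/{endpoint}?{urlencode(params)}"


def call(url: str) -> bytes:
    """Send the request as POST, which the RPC API expects. Needs a running daemon."""
    with urlopen(Request(url, method="POST")) as resp:
        return resp.read()
```

For example, `rpc_url("READ", "/docs/a.txt", offset=4096, count=1024)` yields a `files/read` URL for a 1 KiB slice of the file. Write procedures, file handles and caching are where the real work would be.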
Pros

  • Make use of NFS clients (any) to mount IPFS shares hosted by an NFS server.
  • Allows you to boot from IPFS
  • Wide support (Windows, Mac, Linux and even Android)
  • Lots of server applications out there to take inspiration from

Cons

  • A very old protocol with a lot of legacy
  • Tremendously huge to implement
  • Difficult to get performant (in terms of network throughput)
  • Lacks encryption, though NFS over TLS is possible.

Personally, I like the concept of NFS but not the complexity. It has grown over the decades and no longer fits this day and age. Perhaps it's time for something new?

Invent a new network protocol

In other projects this would sound insane. In IPFS, where the whole protocol is reinvented, this doesn't sound so strange to me. It definitely is a huge task on its own, but it does allow us to make something extensible and built for today's desires. The following is again a braindump list of concepts it should support; it's far from anywhere near complete. I'm not daring to write a list of features for this as it would only be sorely incomplete.

How to proceed from here on forth?

What I would really like to do is have a braindump session somewhere with the people involved in this. Just brainstorm a little about the features we want to have and how we can get there. So if you are interested and willing to attend, let it be known in the comments! For example, this could tie in very nicely with this 2021 theme proposal from @obo20.

From there on, once we have a global idea of what we want, it's much easier to see if someone can make a proof of concept of something. Right now it's so broad and so vague that it's difficult to start anywhere at all.

So I'm just tagging the people here that could be interested. Please add more in the comments if you think someone else should be aware of it.
@lidel @Stebalien @aschmahmann @autonome @momack2

Let's get this ball rolling, get some clarity on where we want to go and perhaps make some proof of concepts :)

Cheers,
Mark

@markg85 markg85 added the need/triage Needs initial labeling and prioritization label Jun 24, 2021
@Jorropo

Jorropo commented Jun 24, 2021

I would just add a "too long; didn't read" summary:

TL;DR

Embed an NFS server in the IPFS daemon that exposes the MFS (potentially as an optionally built feature).

@Stebalien
Member

I'm really not seeing how we go from the motivations/limitations to "therefore NFS". From what I can tell, I can remove IPFS from the proposal, just use NFS (and maybe btrfs if you want deduplication) and end up with the same thing.

Additionally:

  1. NFS won't solve the encryption/privacy problem so the node would have to be private.
  2. NFS doesn't solve the file permission/metadata problem (these still need to be stored in unixfs).

Really, we need:

  1. File encryption/ACLs for privacy (so you don't need a private network).
  2. Mount support that doesn't suck (prefetching, low-latency, non-buggy, etc.).
  3. Improved GC so you can actually cache data and delete it.
  4. Some form of multi-writer filesystem. Either peergos, or something like textile buckets.

But these are all pretty large projects.

@markg85
Author

markg85 commented Jun 25, 2021

File encryption/ACLs for privacy (so you don't need a private network).

Definitely!
I'm still puzzling over how, conceptually, such a feature could be supported though.
You could go a couple of different routes with this one alone:

  1. Implement it in IPFS commands, so that you get something like, for example, ipfs add <file> --encryption-key <aes key>. Likewise for getting a file. A benefit here is that you can also call this construct through the web API, making the encryption fully transparent for API users (they only need the AES key, the IPFS client does the encryption/decryption)
  2. Regard IPFS as just the transport layer and implement it on top of IPFS. The downside is that you'd likely need a separate executable to manage the encryption and decryption. Good for separation of responsibilities, not so user-friendly.

Just brainstorming here. There are probably more ways to go about this.

@Stebalien
Member

Stebalien commented Jun 26, 2021 via email

@markg85
Author

markg85 commented Jun 26, 2021

We'd likely avoid asking users to pass keys directly.

That seems fair and wise :)

But:

  1. Directories would include the symmetric keys necessary to read their children.

Isn't that very unsafe? That key, in my mind, is an AES-128 or AES-256 key. If the directory stores it (so within the IPLD object that represents a directory), then anyone knowing the Qm... of that directory can get the key. I don't see how that can be safe.
I'm probably missing something here?

What I could see potentially working is for the directory to store an asymmetric public key of the one that's allowed to read it. The contents of that directory would then be encrypted with an AES-256 key (which itself is derived from the user's private key + that public key). Reading it would then require the reader to derive an AES-256 key from that public key + the reader's private key (this is akin to https://developer.mozilla.org/en-US/docs/Web/API/SubtleCrypto/deriveKey).

A benefit here is that you never need to store the AES key itself. You always derive it from a public+private combination. For IPFS it means keeping the same security semantics as it currently has for the private key. A potential way to use it on the API side of things would be to say something like:
ipfs add <file> --encrypt --target-publickey <publickey>
Or when nothing is provided like:
ipfs add <file> --encrypt
Then your own public key is used, making you the one able to decrypt it.
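
The derive-instead-of-store idea can be sketched with a Diffie-Hellman exchange. The example below is a toy, standard-library-only X25519 (following the ladder described in RFC 7748) plus an HKDF-SHA256 step, to show that writer and reader end up with the same 32-byte key without that key ever being stored. It assumes the reader also knows the writer's public key, which is an addition to what is described above; the function names and the `info` label are made up; and the code is not constant-time, so anything real should use a vetted crypto library instead.

```python
import hashlib
import hmac
import os

P = 2**255 - 19                          # field prime of Curve25519
A24 = 121665                             # curve constant used by the ladder
BASE_POINT = (9).to_bytes(32, "little")  # standard X25519 base point


def x25519(scalar: bytes, u: bytes) -> bytes:
    """Toy X25519 scalar multiplication (Montgomery ladder, NOT constant-time)."""
    k = bytearray(scalar)
    k[0] &= 248; k[31] &= 127; k[31] |= 64   # clamp the scalar
    k_int = int.from_bytes(k, "little")
    x1 = int.from_bytes(u, "little") & ((1 << 255) - 1)
    x2, z2, x3, z3, swap = 1, 0, x1, 1, 0
    for t in reversed(range(255)):
        bit = (k_int >> t) & 1
        if swap ^ bit:
            x2, x3, z2, z3 = x3, x2, z3, z2
        swap = bit
        a, b = (x2 + z2) % P, (x2 - z2) % P
        aa, bb = a * a % P, b * b % P
        e = (aa - bb) % P
        da, cb = (x3 - z3) * a % P, (x3 + z3) * b % P
        x3 = (da + cb) ** 2 % P
        z3 = x1 * (da - cb) ** 2 % P
        x2 = aa * bb % P
        z2 = e * (aa + A24 * e) % P
    if swap:
        x2, z2 = x3, z3
    return (x2 * pow(z2, P - 2, P) % P).to_bytes(32, "little")


def public_key(private: bytes) -> bytes:
    return x25519(private, BASE_POINT)


def derive_aes256_key(my_private: bytes, their_public: bytes) -> bytes:
    """Derive a 32-byte key from a DH shared secret via HKDF-SHA256 (one block)."""
    shared = x25519(my_private, their_public)
    prk = hmac.new(b"\x00" * 32, shared, hashlib.sha256).digest()           # extract
    return hmac.new(prk, b"ipfs-dir-key" + b"\x01", hashlib.sha256).digest()  # expand


if __name__ == "__main__":
    writer, reader = os.urandom(32), os.urandom(32)
    same = derive_aes256_key(writer, public_key(reader)) == derive_aes256_key(reader, public_key(writer))
    print("both sides derived the same key:", same)
```

Note that this only covers a single reader per derived key; sharing a directory with several readers would need one wrapped key per reader or a different scheme.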

  1. We'd either pass the root key as a form of bearer token (/ipfs/QmFoobar#SomeKey) or make a root "ACL" object (encrypting the symmetric key to various identity keys, either using peer IDs or DIDs or something like that).

Could you elaborate a bit more on how this is supposed to work? It sounds interesting but I'm missing some details that make it snap into place for me :)
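
One possible reading of the bearer-token form quoted above: the key rides in the fragment of the path, which HTTP clients do not send to the server, so a gateway would only ever see the CID. This is an interpretation, not a confirmed design; `split_capability` is a hypothetical helper, and the key encoding (unpadded base64url here) is an assumption.

```python
import base64
from typing import Tuple
from urllib.parse import urlsplit


def split_capability(link: str) -> Tuple[str, bytes]:
    """Split '/ipfs/<cid>#<key>' into the content path and the raw key bytes."""
    parts = urlsplit(link)
    if not parts.path.startswith("/ipfs/") or not parts.fragment:
        raise ValueError("expected /ipfs/<cid>#<key>")
    # Restore the base64 padding that is usually stripped from URL tokens.
    padded = parts.fragment + "=" * (-len(parts.fragment) % 4)
    return parts.path, base64.urlsafe_b64decode(padded)


if __name__ == "__main__":
    token = base64.urlsafe_b64encode(b"0123456789abcdef").rstrip(b"=").decode()
    path, key = split_capability(f"/ipfs/QmFoobar#{token}")
    print(path, len(key))
```

Only `path` would be requested from the network; `key` stays with the client and is used to decrypt the root locally.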

@RangerMauve

Slightly tangential, but I've been using FTP servers running on localhost which get mounted as network drives by OSs. https://github.com/RangerMauve/hyper-gateway-ftp

This is usually a lot easier to put together compared to FUSE since support is a lot more straightforward across operating systems.

@ianopolous
Member

Thanks for the (7 months ago) mention @markg85. I just wanted to mention we also have a FUSE mount in Peergos so you can mount any folder you have access to, including writable ones (in that case writes are sent back to your Peergos instance). This could be with a local Peergos instance, which then acts as a cache, or without (via a public URL to one).

The other neat thing we have is post-quantum block-level access control (you have to auth to retrieve a block over bitswap). This means none of the ciphertext IPLD blocks in Peergos are actually public unless you want them to be. For more details see https://peergos.org/posts/bats
