This repository has been archived by the owner on Feb 8, 2023. It is now read-only.

Filesystem backed ipfs-watch #434

Open
Stebalien opened this issue Jun 19, 2020 · 2 comments
Labels
kind/discussion Topical discussion; usually not changes to codebase

Comments

@Stebalien
Member

Stebalien commented Jun 19, 2020

Proposal: a tool for watching a folder and making it available over IPFS (a) without having to re-add everything any time a file changes and (b) without having to duplicate the file data on disk.

Users currently use the "filestore" feature to add files to go-ipfs without storing the data on disk twice. Unfortunately, the filestore doesn't integrate very well with go-ipfs as-is because go-ipfs expects pinned data to remain available, while files on disk can change. Furthermore, re-syncing a large directory into go-ipfs can be quite expensive.

IMO, the best solution would be to not use the go-ipfs daemon and instead implement an ipfs-watch tool. It would:

  1. Monitor a directory for changes.
  2. When a file is added, it would chunk, hash, and index the file into a database (e.g., sqlite). Then it would use MFS (go-mfs) to add the file to an IPFS directory.
  3. When a file is removed/changed, it would remove references to the file, and remove the file from the IPFS directory structure using MFS.
  4. Finally, whenever the IPFS directory structure changes, the resulting root hash would be (a) printed on standard out and (b) published to IPNS.
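
As a rough illustration of the "chunk, hash" part of step 2, here is a minimal sketch. It assumes a fixed-size chunker and uses a plain hex SHA-256 digest as a stand-in for a real CID (a real CID also encodes the codec and multihash prefix). The names `chunk_file` and `CHUNK_SIZE` are made up for this example and are not part of go-ipfs or go-mfs.

```python
import hashlib
from pathlib import Path
from typing import Iterator, Tuple

# Assumption: fixed-size chunks of 256 KiB; the proposal does not name a chunker.
CHUNK_SIZE = 256 * 1024


def chunk_file(path: Path, chunk_size: int = CHUNK_SIZE) -> Iterator[Tuple[int, str]]:
    """Yield (offset, digest) for each chunk of the file, in file order.

    The digest is a hex SHA-256 of the chunk bytes, standing in for a CID.
    """
    offset = 0
    with path.open("rb") as f:
        while True:
            data = f.read(chunk_size)
            if not data:
                break
            yield offset, hashlib.sha256(data).hexdigest()
            offset += len(data)
```

Only the offsets and digests are kept, so the file data itself stays where it is on disk, which is point (b) of the proposal.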

The database schema would be:

  • Table: files
    • filename (primary key)
    • modtime
  • Table: blocks
    • id (primary key)
    • cid (indexed)
    • filename (indexed)
    • offset
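
A sketch of the schema above in SQLite, using Python's standard `sqlite3` module. The column types, the index names, and the helpers `open_index` and `locate_block` are assumptions for illustration; the proposal only lists the columns and which ones are indexed.

```python
import sqlite3

# "offset" is quoted because OFFSET is also an SQL keyword.
SCHEMA = """
CREATE TABLE IF NOT EXISTS files (
    filename TEXT PRIMARY KEY,
    modtime  REAL NOT NULL
);
CREATE TABLE IF NOT EXISTS blocks (
    id       INTEGER PRIMARY KEY,
    cid      TEXT    NOT NULL,
    filename TEXT    NOT NULL,
    "offset" INTEGER NOT NULL
);
CREATE INDEX IF NOT EXISTS blocks_cid      ON blocks(cid);
CREATE INDEX IF NOT EXISTS blocks_filename ON blocks(filename);
"""


def open_index(path: str = ":memory:") -> sqlite3.Connection:
    """Open (or create) the index database and make sure the tables exist."""
    db = sqlite3.connect(path)
    db.executescript(SCHEMA)
    return db


def locate_block(db: sqlite3.Connection, cid: str):
    """Return (filename, offset) for one file that contains this block, or None."""
    return db.execute(
        'SELECT filename, "offset" FROM blocks WHERE cid = ? LIMIT 1', (cid,)
    ).fetchone()
```

The index on `cid` is what would let a block request be answered by reading the original file at the stored offset, and the index on `filename` keeps the per-file `DELETE` cheap.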

Events:

  • On start:
    • scan for changed files, comparing with the mod times in the database.
  • On add/update:
    • Add the file to the files table.
    • Run DELETE FROM blocks where filename=filename (just in case)
    • Chunk the file, adding each block to the blocks table.
    • Link the file into MFS.
  • On remove:
    • Run DELETE FROM blocks where filename=filename
    • Remove the file from the files table.
    • Unlink the file from MFS.
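
The events above could be sketched roughly as follows. This is not a real implementation: `mfs_link` and `mfs_unlink` are hypothetical placeholders for whatever go-mfs integration the tool would use, the "CID" is a plain hex SHA-256 of each fixed-size chunk, and removing files that vanished while the tool was stopped is an extra assumption on top of the "On start" step.

```python
import hashlib
import os
import sqlite3

CHUNK_SIZE = 256 * 1024  # assumption: fixed-size chunks


def open_index(path=":memory:"):
    db = sqlite3.connect(path)
    db.executescript("""
        CREATE TABLE IF NOT EXISTS files (filename TEXT PRIMARY KEY, modtime REAL);
        CREATE TABLE IF NOT EXISTS blocks (id INTEGER PRIMARY KEY, cid TEXT,
                                           filename TEXT, "offset" INTEGER);
        CREATE INDEX IF NOT EXISTS blocks_cid ON blocks(cid);
        CREATE INDEX IF NOT EXISTS blocks_filename ON blocks(filename);
    """)
    return db


def mfs_link(filename):
    """Hypothetical placeholder: link the file into the MFS directory."""


def mfs_unlink(filename):
    """Hypothetical placeholder: unlink the file from the MFS directory."""


def on_add_or_update(db, root, filename):
    path = os.path.join(root, filename)
    modtime = os.stat(path).st_mtime
    with db:  # one transaction: the file row and its blocks change together
        db.execute("INSERT OR REPLACE INTO files (filename, modtime) VALUES (?, ?)",
                   (filename, modtime))
        # Drop stale blocks first ("just in case").
        db.execute("DELETE FROM blocks WHERE filename = ?", (filename,))
        offset = 0
        with open(path, "rb") as f:
            while chunk := f.read(CHUNK_SIZE):
                digest = hashlib.sha256(chunk).hexdigest()  # stand-in for a CID
                db.execute('INSERT INTO blocks (cid, filename, "offset") VALUES (?, ?, ?)',
                           (digest, filename, offset))
                offset += len(chunk)
    mfs_link(filename)


def on_remove(db, filename):
    with db:
        db.execute("DELETE FROM blocks WHERE filename = ?", (filename,))
        db.execute("DELETE FROM files WHERE filename = ?", (filename,))
    mfs_unlink(filename)


def scan_on_start(db, root):
    """Re-index files whose mod time differs from the one in the database."""
    known = dict(db.execute("SELECT filename, modtime FROM files"))
    seen = set()
    for dirpath, _dirs, names in os.walk(root):
        for name in names:
            rel = os.path.relpath(os.path.join(dirpath, name), root)
            seen.add(rel)
            if known.get(rel) != os.stat(os.path.join(root, rel)).st_mtime:
                on_add_or_update(db, root, rel)
    # Assumption: files that disappeared while we were not watching are removed.
    for gone in known.keys() - seen:
        on_remove(db, gone)
```

A file watcher (inotify on Linux, for example) would call `on_add_or_update` and `on_remove` as events arrive, after `scan_on_start` has caught up on anything missed.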

Prior art and related:

@Stebalien Stebalien added the kind/discussion Topical discussion; usually not changes to codebase label Jun 19, 2020
@markg85

markg85 commented Jun 24, 2021

I like this idea! It makes it possible to have a folder "synced" on IPFS.

But.. I don't really understand the part where it makes it available on IPFS. You say "use MFS (go-mfs) to add the file to an IPFS directory". How does that magically work? Where's the glue that makes it available on IPFS?

Also, why is there a need for a database in this logic? IPFS internally stores data, can't that be (ab)used to store this too?

Edit.
What you propose is - on Linux at least - conceptually not that difficult. I use this very same logic with inotify, where I'm watching one folder for changes and indexing those files in a SQLite database. It allows me to say things like "hey google, ask to play " :)

@TheDiscordian
Member

> But.. I don't really understand the part where it makes it available on IPFS. You say "use MFS (go-mfs) to add the file to an IPFS directory". How does that magically work? Where's the glue that makes it available on IPFS?

I believe the idea is something similar to what ipfs-sync does, just without using the HTTP API. If you add a directory like /home/user/Documents/MyIPFSWebsite, it's mirrored into MFS as /ipfs-sync/MyIPFSWebsite. As for MFS, it's not magic: MFS works sorta like a pin, so the node makes the data available like it would any other data.
