Extract data from the ipfs-search database, for phun and profit.
- Python 3
- pipenv
pipenv shell
python extractor.py
- Behold result.
- Tweak parameters (in script).
(ipfs-search-extractor) $ python extractor.py | bzip2 -c > exports/ipfs-search-2018.json.bz2
131645 documents written in 57.59049701690674
First item: 2018-01-16T18:46:00Z
Last item: 2018-12-31T23:58:57Z
[
"QmbAvZoiPvAaLY6vFyQSxAaMhzSa5vp2CDi4LzRejpw9DZ",
"xkcd: Brontosaurus",
"2018-01-16T18:46:00Z"
],
[
"QmcZ2a1tQpDUoDFGHhXs6Ga795LAbX2t4FEuTBYWxLYuUP",
"Botany Readings",
"2018-01-16T18:46:15Z"
],
...
For efficiency reasons, field names are omitted: each document is a positional array, in the order shown below. JSON is used mainly to avoid encoding issues.
[
"<CID>",
"<title>",
"<first-seen>"
]
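
To make the positional format concrete, here is a minimal sketch of reading an export back into named fields. It is not part of the extractor. It assumes the decompressed file is a single JSON array of three-element arrays (the sample output suggests this but does not confirm it), and the names `Record`, `parse_first_seen` and `load_records` are made up for illustration.

```python
import bz2
import json
from datetime import datetime, timezone
from typing import Iterator, NamedTuple


class Record(NamedTuple):
    """One exported document, with the field names the export leaves out."""
    cid: str
    title: str
    first_seen: datetime


def parse_first_seen(value: str) -> datetime:
    # Timestamps in the sample output look like 2018-01-16T18:46:00Z
    # (UTC, whole seconds); other shapes would raise ValueError here.
    parsed = datetime.strptime(value, "%Y-%m-%dT%H:%M:%SZ")
    return parsed.replace(tzinfo=timezone.utc)


def load_records(path: str) -> Iterator[Record]:
    # Assumption: the whole export is one JSON array of [cid, title, first-seen].
    with bz2.open(path, "rt", encoding="utf-8") as handle:
        for cid, title, first_seen in json.load(handle):
            yield Record(cid, title, parse_first_seen(first_seen))
```

For example, `for record in load_records("exports/ipfs-search-2018.json.bz2"): print(record.cid, record.title)` would print one line per document. Note that `json.load` reads the entire file into memory, which is fine for the smaller exports but may be heavy for the largest one.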
- 2018 (5.97 MiB)
- 2019 (10.24 MiB, 131645 documents)
- 2020 (20.12 MiB, 450436 documents)
- 2021, until 10-8 (456.52 MiB, 10485760 documents)

Note that the great majority of files on ipfs-search.com seem not to have an extracted title!