Choose object storage provider for haikuports #141
Comments
My preference:
I've reached out to Wasabi to try to get "actual" bandwidth utilization numbers. They don't publish them in our portal (but I sure as hell know they look at them, since they have cut us off before due to egress).
Why do you have Backblaze as "20-25 a month"? If we factor in the CDN with free egress then shouldn't it be storage costs only, and thus be equivalent to Wasabi + CDN?
"Free egress up to 3x their average monthly storage amount. Egress over average stored is $0.01/GiB." EDIT: I did that math wrong. Let's re-run the cost numbers, assuming 6TiB egress and 400GiB storage.
Backblaze:
Storj:
Wasabi:
Telnyx:
Backblaze + bunny.net CDN seems like the best deal tbh, with controlled risk. The bunny.net CDN could cut that 6TiB way down to "a few TiB or less" on all providers, but it's unknown how efficient their caching is in our use case.
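For anyone re-running the numbers, the Backblaze egress rule quoted above can be sketched as a small calculation. This is only a rough sketch under the thread's assumptions (400GiB stored, 6TiB egress pulled directly from the bucket with no CDN in front), reading the quote as "egress beyond 3x stored is billed at $0.01/GiB". The function and parameter names are made up for illustration, and storage and API charges are left out.

```python
GIB_PER_TIB = 1024


def billable_egress_gib(stored_gib, egress_gib, free_multiple=3):
    """Egress beyond the free allowance (free_multiple x average stored data)."""
    free_gib = stored_gib * free_multiple
    return max(0, egress_gib - free_gib)


def egress_cost_usd(stored_gib, egress_gib, price_per_gib=0.01, free_multiple=3):
    """Cost of the billable egress only; storage and API calls are not included."""
    return billable_egress_gib(stored_gib, egress_gib, free_multiple) * price_per_gib


stored = 400               # GiB stored, as assumed in the thread
egress = 6 * GIB_PER_TIB   # 6TiB of egress, expressed in GiB
print(billable_egress_gib(stored, egress))
print(round(egress_cost_usd(stored, egress), 2))
```

With a CDN in front that pulls with free egress, `egress_gib` would be whatever the CDN actually fetches from the bucket, which is the unknown discussed above.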
EDIT - Actual worst-case egress bandwidth numbers:
For me, while I think reliability is important, it is not the end of the world if we get cut off and need to relocate. However, how do we keep control of our packages? I.e. is there going to be a backup or a primary source for them?
The nice thing about S3 is it actually gets easier to back things up. Today we have the automatic "compress all the artifacts, encrypt them, and upload to an S3 bucket" backup system. That doesn't work for huge things though, since I really don't want to work with 300GiB tar deltas 😅
In the model where some object storage provider is the source of truth, we really just need to rclone the bucket "somewhere" else. Historically I've just rcloned to a dedicated bit of local storage at my house as a cold backup (you could do the same). Like rsync, rclone only transfers files that are new or changed, so it's bandwidth-friendly after the initial clone. rclone also lets you sync between storage providers, and it supports a TON of them. We actually have an rclone container ready to go today that will do that to Storj. We can make some fixes though to make it more generic.
I also have rclonefs, which will (theoretically) let us mount S3 buckets as FUSE storage mounts on each k8s node so we can (theoretically) offer S3 buckets over rsync to mirrors from pods running on any k8s node. (FUSE in k8s is weird though, and we need an elevated security context.)
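As a rough illustration of the "rclone the bucket somewhere else" idea, a backup job could be a thin wrapper around `rclone sync`. This is a minimal sketch, not the existing rclone container: the remote name `primary:`, the bucket name, and the local path are hypothetical placeholders, and it assumes rclone is installed with the remotes already configured.

```python
import subprocess


def rclone_sync_command(source, dest, dry_run=True):
    """Build an `rclone sync` argv that makes dest match source.

    Note: `rclone sync` deletes files in dest that no longer exist in source,
    so a dry run first is a sensible default.
    """
    cmd = ["rclone", "sync", source, dest]
    if dry_run:
        cmd.append("--dry-run")
    return cmd


def run_backup(source, dest, dry_run=True):
    # check=True raises CalledProcessError if rclone exits non-zero.
    return subprocess.run(rclone_sync_command(source, dest, dry_run), check=True)


if __name__ == "__main__":
    # Hypothetical remote and path, for illustration only.
    print(" ".join(rclone_sync_command("primary:example-bucket", "/mnt/cold-backup")))
```

The destination can be a local path or another configured remote, which is what makes provider-to-provider copies possible.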
Agree. Definitely the biggest pain point of object storage. I really like the pricing of Telnyx, but the whole "per million API hits" thing makes me nervous on something complex and large like haikuports.
Agree. Let's strike DO off the list. They had some appealing things to them, but needing to groom a whole gaggle of buckets to get reasonable pricing is too much lift. I'm tired of forming infrastructure "around" providers' weird limitations.
I updated #141 (comment) with the pricing based on the actual worst-case bandwidth numbers I saw on Digital Ocean.
Oh, and I just looked at the Wasabi bill. It does list "908.40 API requests" for the month. I'm guessing that's in thousands given the decimal point, so 908,400 makes more sense.
Looks like the preferred option is Backblaze + Bunny then?
Agree. I think Backblaze + Bunny is going to be the cheapest combo. Bunny will cut down the transfer by 50%, so that $30 / month should be the "worst case".
Ryan went ahead and entered our billing info. I went ahead and deployed a temporary VM at Digital Ocean to shovel artifacts over to Backblaze. I'm going to start with the Haiku repos themselves since they're an easy (smaller) test of data before moving on to haikuports.
That's not great and definitely false advertising...
I went ahead and put the Haiku repo over onto Backblaze. We already blew past the "free tier" of Class C API calls during the last sync. 😮‍💨 I'm about to head out of town and will be back Sunday, so here are the important facts:
If the 💩 hits the fan, you can take the following actions to undo the migration to Backblaze:
With haikuporter's support of S3, we need to choose an object storage provider.
For context, this will be replacing our Digital Ocean volume block attachment, which costs $25 / month for 250GiB.
Assuming ~400GiB stored and 2TiB of egress a month (which gives us a lot of headroom)
Assuming 35 million API ops a month (17M Class A, 17M Class B)
$16-35 / month likely as we grow. Risk of pulling too much egress and getting shut off.
Likely $11-20 / month. The per-API-operation charge is a big risk: haikuporter, hpkgbouncer, etc. all hit the APIs.
Likely $12-20 / month
$16-24 / month
Digital Ocean Spaces: ~$23 per month for 400GiB + 2TiB transfer. We don't like having to have multiple buckets to get reasonable pricing.
Notes: We don't have to go all-in on a single S3 provider. Haiku can remain at Wasabi, and haikuports can be "wherever". We can run one deployment of hpkgbouncer per repo.
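Since per-operation pricing is called out as a risk above, here is a rough sketch of how the 17M Class A + 17M Class B assumption could be turned into a monthly figure for a given provider. The per-million prices below are placeholders, NOT real prices from any provider; plug in the numbers from the pricing page being evaluated. The function name is made up for illustration.

```python
def api_ops_cost_usd(ops_by_class, price_per_million_by_class):
    """Sum the monthly API charge across operation classes."""
    return sum(
        ops / 1_000_000 * price_per_million_by_class[cls]
        for cls, ops in ops_by_class.items()
    )


# Operation counts assumed in this issue.
ops = {"A": 17_000_000, "B": 17_000_000}

# Placeholder prices in USD per million operations; not real provider pricing.
placeholder_prices = {"A": 1.00, "B": 0.10}

print(round(api_ops_cost_usd(ops, placeholder_prices), 2))
```

Re-running it with double the operation count is a quick way to see how exposed a provider's pricing is to haikuporter and hpkgbouncer traffic growing.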