Scale database reads to replicas in Rails
🍊 Battle-tested at Instacart
Add this line to your application’s Gemfile:
gem "distribute_reads"
Makara does most of the work. First, update database.yml to use it:
default: &default
  url: postgresql-makara:///
  makara:
    sticky: true
    connections:
      - role: master
        name: primary
        url: <%= ENV["DATABASE_URL"] %>
      - name: replica
        url: <%= ENV["REPLICA_DATABASE_URL"] %>

development:
  <<: *default

production:
  <<: *default
Note: You can use the same instance for the primary and replica in development.
By default, all reads go to the primary instance. To use the replica, do:
distribute_reads { User.count }
Works with multiple queries as well.
distribute_reads do
  User.find_each do |user|                 # replica
    user.orders_count = user.orders.count  # replica
    user.save!                             # primary
  end
end
Distribute all reads in a job with:
class TestJob < ApplicationJob
  distribute_reads

  def perform
    # ...
  end
end
You can pass any of the options described below as well.
Active Record uses lazy evaluation, which can delay the execution of a query until after the distribute_reads block has finished. In this case, the primary will be used.
users = distribute_reads { User.where(orders_count: 1) } # not executed yet
Call to_a or load inside the block to ensure the query runs on a replica.
users = distribute_reads { User.where(orders_count: 1).to_a }
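To see why the lazy case ends up on the primary, here is a minimal, self-contained sketch. It is a toy model only: ToyRouter and ToyRelation are hypothetical stand-ins written for this illustration, not how distribute_reads, Makara, or Active Record are implemented. The point is that routing depends on when the query actually runs, not on where the relation was built.

```ruby
# Toy model only: ToyRouter and ToyRelation are hypothetical names,
# not part of distribute_reads, Makara, or Active Record.
module ToyRouter
  @target = :primary

  class << self
    attr_reader :target

    # Route "queries" to the replica for the duration of the block,
    # then restore the previous target.
    def distribute_reads
      previous = @target
      @target = :replica
      yield
    ensure
      @target = previous
    end
  end
end

# Stand-in for a lazy relation: the "query" only runs on the first to_a call.
class ToyRelation
  def to_a
    @records ||= ["ran on #{ToyRouter.target}"]
  end
end

lazy  = ToyRouter.distribute_reads { ToyRelation.new }       # nothing executed yet
eager = ToyRouter.distribute_reads { ToyRelation.new.to_a }  # executed inside the block

puts lazy.to_a.first  # prints "ran on primary"
puts eager.first      # prints "ran on replica"
```

The lazy relation leaves the block unexecuted, so by the time to_a runs the router has already switched back to the primary.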
You can automatically load relations returned from distribute_reads blocks by creating an initializer with:
DistributeReads.eager_load = true
Raise an error when replica lag is too high (specified in seconds):
distribute_reads(max_lag: 3) do
  # raises DistributeReads::TooMuchLag
end
Instead of raising an error, you can fall back to the primary:
distribute_reads(max_lag: 3, lag_failover: true) do
  # ...
end
If you have multiple databases, this only checks lag on the ActiveRecord::Base connection. Specify the connections to check with:
distribute_reads(max_lag: 3, lag_on: [ApplicationRecord, LogRecord]) do
  # ...
end
Note: If lag on any connection exceeds the max lag and lag failover is used, all connections will use their primary.
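The lag rules above can be summarized as a small decision function. This is a rough sketch of the documented behavior, not the gem's code: toy_route and ToyTooMuchLag are hypothetical names, and the lag values are passed in by hand rather than measured from a database.

```ruby
# Toy model of the documented lag rules. toy_route and ToyTooMuchLag are
# hypothetical names for illustration, not part of distribute_reads.
class ToyTooMuchLag < StandardError; end

# lags: hash of connection name => replication lag in seconds.
# Returns the target used by every connection in the block.
def toy_route(lags, max_lag:, lag_failover: false)
  worst = lags.values.max
  return :replica if worst <= max_lag  # all connections are within the limit
  return :primary if lag_failover      # one laggy connection sends all to primary

  raise ToyTooMuchLag, "replica lag of #{worst}s exceeds #{max_lag}s"
end

puts toy_route({ "ApplicationRecord" => 1.2, "LogRecord" => 0.4 }, max_lag: 3)
# prints "replica"
puts toy_route({ "ApplicationRecord" => 1.2, "LogRecord" => 5.0 }, max_lag: 3, lag_failover: true)
# prints "primary"
```

Without lag_failover, the second call would raise instead, which mirrors the DistributeReads::TooMuchLag behavior described earlier.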
If no replicas are available, primary is used. To prevent this situation from overloading the primary, you can raise an error instead.
distribute_reads(failover: false) do
  # raises DistributeReads::NoReplicasAvailable
end
Change the defaults:
DistributeReads.default_options = {
  lag_failover: true,
  failover: false
}
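As a sketch of how defaults and per-call options could combine, the snippet below assumes that options passed to an individual call take precedence over the defaults. That precedence is an assumption for illustration, and TOY_DEFAULTS and toy_effective_options are hypothetical names, not the gem's API.

```ruby
# Hypothetical stand-in for the defaults set above; not the gem's own constant.
TOY_DEFAULTS = { lag_failover: true, failover: false }.freeze

# Assumption: options given on a call override the defaults key by key.
def toy_effective_options(**overrides)
  TOY_DEFAULTS.merge(overrides)
end

p toy_effective_options                               # defaults only
p toy_effective_options(failover: true, max_lag: 3)   # per-call overrides win
```

Under this assumption, the second call keeps lag_failover from the defaults, replaces failover, and adds max_lag.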
Messages about failover are logged to the Active Record logger by default. Set a different logger with:
DistributeReads.logger = Logger.new(STDERR)
Or use nil to disable logging.
At some point, you may wish to distribute reads by default.
DistributeReads.by_default = true
To make queries go to primary, use:
distribute_reads(primary: true) do
  # ...
end
Get replication lag in seconds
DistributeReads.replication_lag
Most of the time, Makara does a great job automatically routing queries to replicas. If it incorrectly routes a query to primary, you can use:
distribute_reads(replica: true) do
  # send all queries in block to replica
end
Rails 6+ has native support for replicas 🎉
ActiveRecord::Base.connected_to(role: :reading) do
  # do reads
end
However, it’s not able to do automatic statement-based routing like Makara yet.
Thanks to TaskRabbit for Makara, Sherin Kurian for the max lag option, and Nick Elser for the write-through cache.
View the changelog
Everyone is encouraged to help improve this project. Here are a few ways you can help:
- Report bugs
- Fix bugs and submit pull requests
- Write, clarify, or fix documentation
- Suggest or add new features
To get started with development and testing:
git clone https://github.com/ankane/distribute_reads.git
cd distribute_reads
createdb distribute_reads_test_primary
createdb distribute_reads_test_replica
bundle install
bundle exec rake test