Definitions for Grafana, Grafana Agent, Mimir, Loki, and Tempo on Google Kubernetes Engine. For convenience, this repository also includes definitions for Traefik ingress, the K3s Helm Controller, cert-manager, and the Kubernetes Dashboard.
We use cert-manager configured with a DNS-01 solver using Cloudflare and Let's Encrypt. You'll likely need to edit this configuration based on your setup. For our case, create a `cloudflare-api-token-secret.yaml` file under `/cert-manager` based on the template in `cloudflare-api-token-secret.example.yaml`. Replace the `api-token` field based on the following guide: https://cert-manager.io/docs/configuration/acme/dns01/cloudflare/
You should now be ready to install cert-manager with `kubectl apply -k cert-manager`. If the install fails, try re-running it a few times. This tends to happen because required resources are still pending while kustomize is deploying.
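The "re-run it a few times" advice can be scripted. The following is a minimal sketch, not part of this repository: `retry` is a hypothetical helper name, and the attempt count and delay are assumptions you should tune for your cluster.

```shell
# retry MAX DELAY CMD...: run CMD until it succeeds, at most MAX times,
# sleeping DELAY seconds between attempts.
# Returns 0 on success, otherwise the exit status of the last failed attempt.
# (Hypothetical helper; not shipped with this repository.)
retry() {
  max=$1
  delay=$2
  shift 2
  attempt=1
  while :; do
    "$@" && return 0
    status=$?          # exit status of the failed command
    if [ "$attempt" -ge "$max" ]; then
      echo "retry: giving up after $attempt attempts" >&2
      return "$status"
    fi
    attempt=$((attempt + 1))
    sleep "$delay"
  done
}
```

With the helper defined, something like `retry 5 10 kubectl apply -k cert-manager` would attempt the install up to five times, ten seconds apart.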
We use the K3s Helm Controller to get a `HelmChart` CRD, which gives us a more Kubernetes-native way of deploying Helm charts. Install the controller with `kubectl apply -k helm-controller`.
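For readers who have not seen a `HelmChart` resource before, here is a rough sketch of what one looks like. It is illustrative only and not taken from this repository: the resource name, both namespaces, and the values are assumptions, and the script simply writes the manifest to a temporary file so you can inspect it.

```shell
# Write an illustrative HelmChart manifest to a temp file for inspection.
# Everything marked "assumption" is a placeholder, not this repo's config.
out="${TMPDIR:-/tmp}/example-helmchart.yaml"
cat > "$out" <<'EOF'
apiVersion: helm.cattle.io/v1
kind: HelmChart
metadata:
  name: example-grafana           # assumption: hypothetical resource name
  namespace: kube-system          # assumption: namespace watched by the controller
spec:
  repo: https://grafana.github.io/helm-charts
  chart: grafana
  targetNamespace: observability  # assumption: where the release is installed
  valuesContent: |-
    replicas: 1
EOF
echo "wrote $out"
```

The controller watches for resources of this kind and runs the corresponding Helm install, so a chart can be managed with `kubectl apply` like any other manifest.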
Install the Grafana Agent operator, which sets up required CRDs such as `ServiceMonitor` for Traefik and other components: `kubectl apply -f observability/grafana-agent-operator.yaml`
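Because later components depend on these CRDs, it can help to wait until a CRD is registered before applying anything that uses it. This is a small sketch under assumptions: `wait_for_crd` is a hypothetical helper, and the default 60 second timeout is arbitrary.

```shell
# wait_for_crd NAME [TIMEOUT]: block until the named CRD reports the
# Established condition, or fail once TIMEOUT (default 60s) elapses.
# (Hypothetical helper; not shipped with this repository.)
wait_for_crd() {
  kubectl wait --for=condition=Established "crd/$1" --timeout="${2:-60s}"
}
```

For example, `wait_for_crd servicemonitors.monitoring.coreos.com` waits for the `ServiceMonitor` CRD, assuming the operator manifest registers it under the usual `monitoring.coreos.com` group.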
We use the open source edge router Traefik for ingress, which comes with a built-in dashboard.
- Create a file called `secret.yaml` under `traefik` based on `secret.example.yaml`. Follow the instructions to set up the login
- Update the `/traefik/dashboard.yaml` file with your domain
- Install the Traefik ingress controller: `kubectl apply -k traefik`
- Go to https://traefik.observability.your-domain.com to see your dashboard
Our observability stack (`/observability`) consists of Grafana, Mimir (distributed), Loki (simple scalable) and Tempo (monolithic). Tempo will soon be switched to distributed, while Loki will remain in simple scalable mode since it can already handle 100GB+/day of logs. We also include Grafana Agents (`/observability/ingest`) pre-configured for use with Grafana Faro. We use GCP for our buckets (GCS) and Workload Identity Federation to connect Kubernetes service accounts with GCP IAM service accounts.
Before continuing, please ensure that Workload Identity Federation has been configured on your cluster as it is not enabled by default at the time of writing: https://cloud.google.com/kubernetes-engine/docs/how-to/workload-identity
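One way to check this is to read the cluster's workload pool, which is empty when Workload Identity Federation is not enabled. The sketch below assumes `gcloud` is installed and authenticated; the function names are hypothetical and the cluster name and location are yours to supply.

```shell
# workload_pool CLUSTER LOCATION: print the cluster's workload identity pool
# (normally PROJECT_ID.svc.id.goog), or nothing if the feature is disabled.
# (Hypothetical helpers; not shipped with this repository.)
workload_pool() {
  gcloud container clusters describe "$1" --location "$2" \
    --format='value(workloadIdentityConfig.workloadPool)'
}

# wi_enabled CLUSTER LOCATION: succeed only if a workload pool is configured.
# Returns 2 if the gcloud call itself fails.
wi_enabled() {
  pool=$(workload_pool "$1" "$2") || return 2
  [ -n "$pool" ]
}
```

A call such as `wi_enabled my-cluster us-central1 || echo "enable Workload Identity first"` (with your own cluster name and location) gives a quick yes or no before you continue.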
- Update `/observability/setup-gcp.sh` with your `PROJECT_ID`. This script will create GCS buckets, IAM roles, IAM service accounts and IAM policy bindings. It requires that `gcloud` and `gsutil` are installed. Please read through the entire script before running it
- Create `grafana-auth.yaml` in `/observability` based on `grafana-auth.example.yaml` and edit as appropriate
- Edit `grafana.yaml`, changing the certificates and ingress routes as appropriate for your domain
- Edit `mimir.prod.yaml`, `loki.prod.yaml` and `tempo.prod.yaml`, replacing `PROJECT_ID_HERE` in the service account annotation with your GCP project ID
- Install the stack with `kubectl apply -k observability`
- If desired, you can use Grafana Faro with Grafana Agents. Edit and install the ingest Grafana Agents with `kubectl apply -k observability/ingest`
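The `PROJECT_ID_HERE` replacement in the three production files can be done in one pass rather than by hand. This is a minimal sketch, assuming the files live in the `observability` directory and that the placeholder appears literally as `PROJECT_ID_HERE`; `set_project_id` is a hypothetical helper name.

```shell
# set_project_id PROJECT_ID [DIR]: replace every PROJECT_ID_HERE placeholder
# in the Mimir, Loki and Tempo production files under DIR
# (default: observability).
# GCP project IDs only contain lowercase letters, digits and hyphens,
# so the value is safe to use directly in the sed replacement.
# (Hypothetical helper; not shipped with this repository.)
set_project_id() {
  project_id=$1
  dir=${2:-observability}
  for f in mimir.prod.yaml loki.prod.yaml tempo.prod.yaml; do
    tmp=$(mktemp) || return 1
    # Write to a temp file first so a failed sed never truncates the original.
    if sed "s/PROJECT_ID_HERE/$project_id/g" "$dir/$f" > "$tmp"; then
      cat "$tmp" > "$dir/$f"
      rm -f "$tmp"
    else
      rm -f "$tmp"
      return 1
    fi
  done
}
```

Running `set_project_id my-gcp-project` from the repository root, followed by `git diff`, lets you review the substitutions before applying the stack.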
The Kubernetes Dashboard is a web-based interface for your cluster. "It allows users to manage applications running in the cluster and troubleshoot them, as well as manage the cluster itself."
- Update `/kube-dashboard/ingress.yaml` with your domain
- Install the Kubernetes dashboard with `kubectl apply -k kube-dashboard`
- Run `kube-dashboard/get-auth.sh` and follow the instructions to set up your Kubernetes config with a token for logging in
- Go to https://kube.observability.speechify.dev and use your Kubernetes config (usually `~/.kube/config`) to log in