[sophora-server][sophora-cluster-common] update pre-stop hook to 2.0.0 (#87)

* [sophora-server] update pre-stop hook to 2.0.0 and adapt chart to the changes introduced by the new version

* fix cli call

* [sophora-cluster-common] add alert for situations with multiple primary servers and change alerting runbook to include the new ways to disable switching

* restore the GitHub warning markdown format
philmtd authored Apr 16, 2024
1 parent dc2d065 commit a7d263c
Showing 11 changed files with 152 additions and 31 deletions.
2 changes: 1 addition & 1 deletion charts/sophora-cluster-common/Chart.yaml
@@ -2,5 +2,5 @@ apiVersion: v2
name: sophora-cluster-common
description: A Helm chart containing some common resources useful for Sophora cloud setups
type: application
- version: 1.0.2
+ version: 1.1.0
appVersion: "4"
62 changes: 51 additions & 11 deletions charts/sophora-cluster-common/alerting-runbook.md
@@ -17,29 +17,69 @@ replication will happen to other running servers, if there are any.
* Check if the deployment has been uninstalled by mistake
* Check whether the server might have crashed
* Check the server logs for error messages
* Check if it would be possible to elect another cluster server to the primary. This should be done carefully to
  ensure no data is lost.
* Try to restart the server, if it is running but unresponsive
* Restore the server from a working backup

### SophoraServerNotInSync

**Severity:** high

**Summary:** The Sophora server is not in sync. This is concluded by comparing the server's *SourceTime* with the
SourceTime of the primary server. The SourceTime is the timestamp of the latest event that occurred on the primary
server. Usually, the SourceTimes of the servers should not diverge much and should stay equal when compared over a
short time frame.

**Remediation steps:**

* Check if the primary server logged a message containing "ReplicationMaster stopped" or "StagingMaster stopped". If
  yes: the primary server needs to be restarted. If "ReplicationMaster stopped" is logged, this needs to happen
  **without electing another server to the primary**. The last part is absolutely critical to preventing data loss.
  Depending on the version of the Server Helm Chart you are using, there are two options to ensure this (see the
  annotation sketch after this list):
  * Server Helm Chart 2.1.0 and later: give the server's Pod the
    annotation `prestop.server.sophora.cloud/switch-enabled: "false"`.
  * Before 2.1.0: as the servers automatically switch using a shutdown hook, a workaround is to exec into the
    container and replace the shutdown hook located in the `/tools/` directory with an empty executable file before
    restarting the server.

  Note that working with Sophora will not be possible for a few minutes during the restart. If the error persists,
  check the logs of the primary server for errors hinting at the root cause of the problem.
* Check if there is a large replication queue (e.g. due to a large amount of imports), which would result in a short
  replication delay
* Check whether the not-in-sync server is in an erroneous state and stopped receiving replication messages
* Check for network connection issues between the server and the primary server
* Check the server's and the primary server's logs for errors or warnings
* Restart the server
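
As a sketch, disabling the switch comes down to a single Pod annotation. Only the annotation key below is taken from
this runbook; how the annotation is attached (e.g. via the chart's pod template or `kubectl annotate`) depends on
your setup:

```yaml
# Fragment of the server Pod's metadata; the annotation is read by
# pre-stop hook 2.0.0 and later.
metadata:
  annotations:
    # "false" prevents this server from initiating a cluster switch
    # when it shuts down.
    prestop.server.sophora.cloud/switch-enabled: "false"
```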

### MultiplePrimarySophoraServers

**Severity:** critical

**Summary:** The Sophora Cluster has more than one server claiming to be the primary server. Write operations with
client tools will likely lead to inconsistencies in the entire Sophora cluster that need to be resolved manually.

**Remediation steps:**

* Check if a cluster switch is in progress and taking longer than expected to complete
* Restart all servers which should not be primary. To prevent these servers from switching automatically, give their
  pods the annotation `prestop.server.sophora.cloud/switch-enabled: "false"` (*)
* Check the server logs for error messages
* Make sure the servers are started in the correct order. Currently, servers can only have one remote server
  configured. This means that in a scenario with three or more cluster servers, a server may mistakenly assume it
  should start as primary.
* Make sure the PDB is configured to let only one cluster server be down at the same time. This should prevent
  multiple primaries if the remote servers of each server are configured correctly in a loop (e.g. 3 <- 1 <- 2 <- 3);
  see the PDB sketch below.
* If there are inconsistencies in the cluster (e.g. documents created in both primaries), see if these can be
  resolved manually. Otherwise, restore the servers from a backup.

(*) This works starting with Server Helm Chart 2.1.0, which ships pre-stop hook 2.0.0. On earlier versions, this can
only be achieved by replacing the shutdown hook located in the `/tools/` directory of the server's container with an
empty executable file.
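
A PodDisruptionBudget that allows at most one cluster server to be down at a time could look like the following
sketch (the name and selector label are placeholders, not taken from these charts):

```yaml
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: sophora-cluster-servers  # placeholder name
spec:
  # Allow at most one cluster server to be voluntarily disrupted at a time.
  maxUnavailable: 1
  selector:
    matchLabels:
      app.kubernetes.io/name: sophora-server  # placeholder label
```
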
@@ -28,6 +28,15 @@ spec:
          summary: Server is not in sync
          description: The server "{{`{{ $labels.pod }}`}}" is not in sync for more than 2 minutes.
          runbook_url: 'https://github.com/subshell/helm-charts/blob/main/charts/sophora-cluster-common/alerting-runbook.md'
      - alert: MultiplePrimarySophoraServers
        for: 2m
        expr: 'count(sophora_server_replication_mode == 1) > 1'
        labels:
          severity: critical
        annotations:
          summary: The Sophora Cluster has more than one server claiming to be the primary.
          description: There is more than one primary server in the cluster for more than 2 minutes.
          runbook_url: 'https://github.com/subshell/helm-charts/blob/main/charts/sophora-cluster-common/alerting-runbook.md'
{{- end }}
{{- with .Values.prometheusRules.rules }}
{{ tpl (toYaml .) $ | nindent 8 }}
2 changes: 1 addition & 1 deletion charts/sophora-server/Chart.yaml
@@ -15,7 +15,7 @@ type: application
# This is the chart version. This version number should be incremented each time you make changes
# to the chart and its templates, including the app version.
# Versions are expected to follow Semantic Versioning (https://semver.org/)
- version: 2.0.0
+ version: 2.1.0

# This is the version number of the application being deployed. This version number should be
# incremented each time you make changes to the application. Versions are not expected to
61 changes: 47 additions & 14 deletions charts/sophora-server/README.md
@@ -10,7 +10,7 @@ In later chart versions this will be the default.

## Postgres connection

Starting with Sophora 5, the installation requires postgres.
You can provide credentials via a secret: `sophora.server.persistence.postgres.secret`.
To enable the postgres version store, set `sophora.server.persistence.postgres.versionStoreEnabled` to `true`.
For all other configuration options use `sophora.server.properties`.
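
For illustration, the values mentioned above might be wired together as in this sketch; the structure beneath
`secret` is an assumption, so check the chart's `values.yaml` for the authoritative layout:

```yaml
sophora:
  server:
    persistence:
      postgres:
        # Existing Secret holding the Postgres credentials (assumed keys).
        secret:
          name: sophora-postgres-credentials  # placeholder
        # Enables the postgres version store.
        versionStoreEnabled: true
```
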
@@ -24,9 +24,9 @@ It's also possible to use postgres as your jcr repository. To use postgres with
Cluster servers require one statefulset per instance. Deploy multiple statefulsets to create an actual Sophora cluster.
Therefore `replicaCount` only supports `0` and `1`.

#### Pod Anti-Affinity

To prevent multiple cluster servers from being scheduled on the same k8s node, you can use the podAntiAffinity. By
default, you can write the following in your values file:

```yaml
# (snippet collapsed in this diff view; see the sketch below)
```

@@ -53,33 +53,66 @@ You could also use a different `topologyKey` in order to make sure that deployments are spread also across unique zones or regions.

This is only necessary for cluster servers as there are usually only two of them, and you would want to ensure that
in case of a node failure, at least one cluster server remains running.
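
Since the snippet above is collapsed in this view, here is a sketch of what such a values stanza typically looks like
in Kubernetes terms; the placement of the `affinity` key and the label are assumptions, not taken from this chart:

```yaml
affinity:
  podAntiAffinity:
    requiredDuringSchedulingIgnoredDuringExecution:
      - labelSelector:
          matchLabels:
            app.kubernetes.io/name: sophora-server  # placeholder label
        # Disallow two pods matching the label above on the same node.
        topologyKey: kubernetes.io/hostname
```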

#### Taint tolerations

Kubernetes allows selecting a node to schedule a pod based on different criteria. If one wants to make sure a pod is
only scheduled on a certain node, one shall set a node affinity. If the node shall be exclusive for this kind of pod,
there is the possibility to taint the node and provide the pods with a set of tolerations to tolerate the taint. In
Sophora, one may use this to provide a separate node pool for a certain type of Sophora servers exclusively.
Further information on how taints work: [kubernetes.io/Taints and Tolerations](https://kubernetes.io/docs/concepts/scheduling-eviction/taint-and-toleration/#example-use-cases)
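
As an illustration only (the taint key, value, and node label are placeholders, not chart defaults), the pod spec for
an exclusive node pool could combine a toleration with a node affinity:

```yaml
# Assumes the nodes were tainted and labelled beforehand, e.g. with
#   sophora.example/server-pool=true:NoSchedule  (taint)
#   sophora.example/server-pool=true             (label)
tolerations:
  - key: "sophora.example/server-pool"
    operator: "Equal"
    value: "true"
    effect: "NoSchedule"
affinity:
  nodeAffinity:
    requiredDuringSchedulingIgnoredDuringExecution:
      nodeSelectorTerms:
        - matchExpressions:
            - key: sophora.example/server-pool
              operator: In
              values: ["true"]
```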

#### Server pre-stop lifecycle hook

All Sophora cluster servers are equipped with a pre-stop lifecycle hook that is executed when the pod is about to
shut down due to a user request, an uninstallation, the Kubernetes scheduler deciding to move it to another node, etc.

If the server to be shut down is the primary server, the hook will initiate a cluster switch to one of the other available
servers, if there are any. Before switching, it filters the list of available replicas to find those suitable to switch to.

The behaviour of the hook can be manipulated using the following **optional** annotations on the server's Pods:

1. `prestop.server.sophora.cloud/switch-enabled: "<true|false>"`
2. `prestop.server.sophora.cloud/is-switch-target: "<true|false>"`

The first annotation controls whether the server shutting down should switch.
In some edge cases, it might be useful to shut down a server without switching.

The second annotation can be used to specify whether the annotated server should be a valid switch target server.
If set to `false`, the tool will not switch to that server.

Both annotations default to `true` if they are not specified or their value cannot be parsed as a boolean, because
switches should generally happen and should only be deactivated for maintenance, recovery, or similar scenarios.

For this to work, the server's pod requires a service account with the permission to `get` and `list` Pods and services
in the namespace the server runs in. The SA, Role and Role Binding are created automatically.
The creation of these resources can be controlled with the `serviceAccount:` section in the values file.
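
As a sketch, both annotations could be set together, assuming the chart exposes a way to add pod annotations (the
`podAnnotations` key here is an assumption; check the chart's values file for the actual mechanism):

```yaml
podAnnotations:
  # This server may initiate a cluster switch when shutting down (default).
  prestop.server.sophora.cloud/switch-enabled: "true"
  # Other servers must not switch to this server.
  prestop.server.sophora.cloud/is-switch-target: "false"
```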

#### Server mode pod labels

Cluster servers run a sidecar container which continuously labels the pods with their server mode
to make it possible to create a service which always points to the current primary server.

For the sidecar to work, the server requires a service account with the permission to `get` and `patch` Pods
in the namespace the server runs in. The SA, Role and Role Binding are created automatically by this chart.
The creation of these resources can be controlled with the `serviceAccount:` section in the values file.
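
In RBAC terms, the permissions described above amount to a Role roughly like the following sketch (the name is a
placeholder; the chart derives the real one, suffixed with `-server-mode-labeler`):

```yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: sophora-server-server-mode-labeler  # placeholder name
rules:
  - apiGroups: [""]
    resources: ["pods"]
    # Read the pod, then patch its server-mode label.
    verbs: ["get", "patch"]
```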


## Notable Changes

## 2.1.0
Updates the pre-stop hook to version 2.0.0 and configures it accordingly.
Please note that this now involves the creation of another Role and RoleBinding for this specific use case, so that
the hook can get information through the Kubernetes API. If you don't manage the Service Account through this Helm
chart, you may need to configure it manually to provide the required permissions.

## 2.0.0 (Breaking changes)

> [!WARNING]
> Please read this information carefully before updating!

* Renamed `serverModeLabeler.enabledOnClusterServers` to `serverModeLabeler.enabled`
* Removed `serverModeLabeler.createServiceAccount` in favour of `serviceAccount.create`
* Renamed `sidecars` to `extraContainers`
* Create `serviceAccount` by default even if `serverModeLabeler.enabled` is set to `false`
* Names of `Role` and `RoleBinding` have been suffixed with `-server-mode-labeler`.
16 changes: 16 additions & 0 deletions charts/sophora-server/templates/role-prestop-hook.yaml
@@ -0,0 +1,16 @@
{{- if and .Values.preStop.enabled .Values.serviceAccount.create -}}
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: {{ include "common.safeSuffixFullname" (list . "prestop-hook") }}
  labels: {{- include "sophora-server.labels" . | nindent 4 }}
rules:
  - apiGroups:
      - ""
    resources:
      - "pods"
      - "services"
    verbs:
      - "get"
      - "list"
{{- end }}
15 changes: 15 additions & 0 deletions charts/sophora-server/templates/rolebinding-prestop-hook.yaml
@@ -0,0 +1,15 @@
{{- if and .Values.preStop.enabled .Values.serviceAccount.create -}}
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: {{ include "common.safeSuffixFullname" (list . "prestop-hook") }}
  labels: {{- include "sophora-server.labels" . | nindent 4 }}
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: {{ include "common.safeSuffixFullname" (list . "prestop-hook") }}
subjects:
  - kind: ServiceAccount
    name: {{ include "sophora-server.fullname" . }}
    namespace: {{ .Release.Namespace }}
{{- end }}
14 changes: 11 additions & 3 deletions charts/sophora-server/templates/statefulset.yaml
@@ -245,20 +245,28 @@ spec:
optional: false
{{- end }}
{{ if and (eq .Values.sophora.server.isClusterServer true) (.Values.sophora.server.authentication.secret) -}}
- - name: SOPHORAUSERNAME
+ - name: SOPHORA_USERNAME # required for the preStop hook
valueFrom:
secretKeyRef:
key: {{ .Values.sophora.server.authentication.secret.usernameKey }}
name: {{ .Values.sophora.server.authentication.secret.name }}
optional: false
- - name: SOPHORAPASSWORD
+ - name: SOPHORA_PASSWORD # required for the preStop hook
valueFrom:
secretKeyRef:
key: {{ .Values.sophora.server.authentication.secret.passwordKey }}
name: {{ .Values.sophora.server.authentication.secret.name }}
optional: false
- name: LOG_MODE # used by the preStop hook to configure JSON logging
value: "prod"
- name: POD_NAME # required for the preStop hook
valueFrom:
fieldRef:
fieldPath: metadata.name
- name: POD_NAMESPACE # required for the preStop hook
valueFrom:
fieldRef:
fieldPath: metadata.namespace
{{- end }}
{{ if .Values.sophora.server.env -}}
{{- toYaml .Values.sophora.server.env | nindent 12 }}
@@ -312,7 +320,7 @@ spec:
[
"/bin/sh",
"-c",
"/tools/sophora-prestop switch --serverUrl=http://localhost:1196 1> /proc/1/fd/1 2> /proc/1/fd/2",
"/tools/sophora-prestop switch --server-url=http://localhost:1196 1> /proc/1/fd/1 2> /proc/1/fd/2",
]
{{- end }}
resources:
2 changes: 1 addition & 1 deletion charts/sophora-server/values.yaml
@@ -14,7 +14,7 @@ preStop:
image:
repository: docker.subshell.com/tools/sophora-prestop
pullPolicy: IfNotPresent
tag: "1.2.0"
tag: "2.0.0"

serverModeLabeler:
image: