Merge branch 'master' into add_tool_argocd
nherment authored Dec 6, 2024
2 parents 2a90b6b + 9afa83f commit 9decbcb
Showing 155 changed files with 104,742 additions and 1,096 deletions.
9 changes: 9 additions & 0 deletions .github/workflows/build-and-test.yaml
@@ -5,7 +5,16 @@ name: Build and test HolmesGPT
 on: [push, pull_request, workflow_dispatch]

 jobs:
+  check:
+    name: Pre-commit checks
+    runs-on: ubuntu-latest
+    steps:
+      - uses: actions/checkout@v4
+      - uses: actions/setup-python@v5
+      - uses: pre-commit/[email protected]
+
   build:
+    needs: check
     strategy:
       matrix:
         python-version: ["3.9", "3.10", "3.11", "3.12"]
9 changes: 9 additions & 0 deletions .pre-commit-config.yaml
@@ -0,0 +1,9 @@
repos:
  - repo: https://github.com/python-poetry/poetry
    rev: 1.8.4
    hooks:
      - id: poetry-check
      - id: poetry-lock
        pass_filenames: false
        args:
          - --no-update
1,412 changes: 727 additions & 685 deletions poetry.lock

Large diffs are not rendered by default.

3 changes: 2 additions & 1 deletion pyproject.toml
@@ -35,7 +35,7 @@ boto3 = "^1.34.145"
 setuptools = "^72.1.0"
 aiohttp = "^3.10.2"
 cachetools = "^5.5.0"
-playwright = "^1.48.0"
+playwright = "1.48.0"
 bs4 = "^0.0.2"
 markdownify = "^0.13.1"
 starlette = "^0.40"
@@ -46,6 +46,7 @@ pytest-xdist = "^3.6.1"
 ruff = "^0.7.3"
 braintrust = "^0.0.168"
 autoevals = "^0.0.103"
+pre-commit = "^4.0.1"

 [build-system]
 requires = ["poetry-core"]
7 changes: 0 additions & 7 deletions tests/fixtures/test_chat/7_get_pod_events/kubectl_events.txt

This file was deleted.

25 changes: 0 additions & 25 deletions tests/fixtures/test_chat/7_get_pod_events/test_case.yaml

This file was deleted.

69 changes: 69 additions & 0 deletions tests/llm/README.md
@@ -0,0 +1,69 @@


There are two types of test cases:

- *ask holmes*: tests the ask holmes functionality. Only a single question is supported; back-and-forth conversation is not supported or tested.
- *investigate*: tests the ability of Holmes to investigate issues reported by Alertmanager.

## How to write a test case

### ask_holmes

#### 1. Create a test folder

Add a new folder to `tests/llm/fixtures/test_ask_holmes`. For example:

```sh
mkdir tests/llm/fixtures/test_ask_holmes/999_my_test_case
```

#### 2. Add a test case definition

In this folder, add a `test_case.yaml` file:

```yaml
user_prompt: 'Is pod xyz healthy? '
expected_output: "Yes, pod xyz is healthy. It is running and there are no errors in the logs."
retrieval_context:
  - Any element of context. This will inform the evaluation score 'context'
  - These context elements are expected to be present in the output
evaluation: # expected evaluation scores. The test will fail unless the LLM scores at least the following:
  faithfulness: 0.5 # defaults to 0.3
  context: 0 # defaults to 0
before-test: kubectl apply -f manifest.yaml
after-test: kubectl delete -f manifest.yaml
```
The above file requires a `manifest.yaml` to deploy resources to your Kubernetes cluster. Add that file, along with any other file required by the `before-test` and `after-test` commands.

Here are the possible fields in the `test_case.yaml` file:

| Field | Type | Required/optional | Example value | Description |
|-------------------|------------------|-------------------|-----------------------------------------------------------------------------|------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| user_prompt | str | Required | Is pod xyz healthy? | The user prompt |
| expected_output | str | Required | Yes, pod xyz is healthy. It is running and there are no errors in the logs. | The expected answer from the LLM |
| retrieval_context | List[str] | Optional | - pod xyz is running and healthy - there are no errors in the logs | Context that the LLM is expected to have used in its answer. If present, this generates a 'context' score proportional to the number of matching context elements found in the LLM's output. |
| evaluation | Dict[str, float] | Optional | evaluation: <br/> faithfulness: 1 <br/> context: 1 <br/> | The minimum expected scores. The test will fail unless these are met. Set to 0 for unstable tests. |
| before-test | str | Optional | kubectl apply -f manifest.yaml | A command to run before the LLM evaluation. The CWD for this command is the fixture's folder. This step is skipped unless the `RUN_LIVE` environment variable is set. |
| after-test | str | Optional | kubectl delete -f manifest.yaml | A command to run after the LLM evaluation. The CWD for this command is the fixture's folder. Typically cleans up after the `before-test` command. This step is skipped unless the `RUN_LIVE` environment variable is set. |
| generate_mocks | bool | Optional | True | Whether the test suite should generate mock files. Existing mock files are overwritten. |
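
The field table above can be read as a small data model. The sketch below is only an illustration of that reading, not the test suite's actual code: `AskHolmesTestCase`, `parse_test_case` and `failing_scores` are hypothetical names, the input is assumed to be the mapping a YAML loader would return for `test_case.yaml`, and the `generate_mocks` default of `False` is an assumption. The evaluation defaults (0.3 for `faithfulness`, 0 for `context`) come from the comments in the example file.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional

# Defaults taken from the comments in the example test_case.yaml.
DEFAULT_EVALUATION = {"faithfulness": 0.3, "context": 0.0}


@dataclass
class AskHolmesTestCase:  # hypothetical name, for illustration only
    user_prompt: str
    expected_output: str
    retrieval_context: List[str] = field(default_factory=list)
    evaluation: Dict[str, float] = field(default_factory=dict)
    before_test: Optional[str] = None
    after_test: Optional[str] = None
    generate_mocks: bool = False  # assumed default; the table does not state one


def parse_test_case(raw: dict) -> AskHolmesTestCase:
    """Build a test case from the mapping a YAML loader would return."""
    for required in ("user_prompt", "expected_output"):
        if required not in raw:
            raise ValueError(f"missing required field: {required}")
    return AskHolmesTestCase(
        user_prompt=raw["user_prompt"],
        expected_output=raw["expected_output"],
        retrieval_context=list(raw.get("retrieval_context") or []),
        # Explicit scores override the defaults.
        evaluation={**DEFAULT_EVALUATION, **(raw.get("evaluation") or {})},
        before_test=raw.get("before-test"),
        after_test=raw.get("after-test"),
        generate_mocks=bool(raw.get("generate_mocks", False)),
    )


def failing_scores(actual: Dict[str, float], minimums: Dict[str, float]) -> Dict[str, float]:
    """Return the scores that fall below their minimum (empty means pass)."""
    return {
        name: actual.get(name, 0.0)
        for name, minimum in minimums.items()
        if actual.get(name, 0.0) < minimum
    }
```

With this reading, a test case that sets `faithfulness: 0.5` fails when the LLM scores 0.4 on faithfulness, and setting a minimum to 0 means that score can never fail the test.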


#### 3. Run the test

Run the following:

```sh
UPLOAD_DATASET=1 RUN_LIVE=1 pytest ./tests/llm/test_ask_holmes.py -k 999_my_test_case
```

The test passes or fails depending on whether the evaluation scores are high enough. If the test fails,

# Environment variables

| Name | Example | Description |
|--------------------|-------------------------------------|----------------------------------------------------------------------------------------------------------------------------------------|
| RUN_LIVE | RUN_LIVE=1 | Enables the execution of `before-test` and `after-test` commands to set up any remote resources. This also ignores any mock files. |
| BRAINTRUST_API_KEY | BRAINTRUST_API_KEY=sk-1dh1...swdO02 | The braintrust API key you get from your account. Log in at https://www.braintrust.dev -> top right persona logo -> settings -> API keys. |
| UPLOAD_DATASET | UPLOAD_DATASET=1 | Synchronise the dataset from the local machine to braintrust. This is usually safe as datasets are separated by branch name. |
| EXPERIMENT_ID | EXPERIMENT_ID=nicolas_gemini_v1 | Override the experiment name in Braintrust. Helps with identifying and comparing experiments. Must be unique across ALL experiments. |
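
As a rough illustration of how `RUN_LIVE` gates the `before-test` and `after-test` commands, here is a minimal sketch. It is not the test suite's implementation: `run_live_enabled` and `run_hook` are hypothetical helpers, and treating any non-empty value as "set" is an assumption. The command runs through the shell with the fixture folder as its working directory, as the field table describes.

```python
import os
import subprocess
from typing import Callable, Mapping, Optional


def run_live_enabled(env: Mapping[str, str] = os.environ) -> bool:
    # Assumption: any non-empty value of RUN_LIVE counts as "set".
    return bool(env.get("RUN_LIVE"))


def run_hook(
    command: Optional[str],
    fixture_dir: str,
    env: Mapping[str, str] = os.environ,
    runner: Callable[..., object] = subprocess.run,
) -> bool:
    """Run a before-test/after-test command; return True if it ran."""
    if not command or not run_live_enabled(env):
        return False  # skipped: no command defined, or RUN_LIVE is not set
    # The command's working directory is the fixture folder.
    runner(command, shell=True, cwd=fixture_dir, check=True)
    return True
```

The `runner` parameter exists only so the gating logic can be exercised without a cluster; by default it is `subprocess.run`.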
@@ -0,0 +1,23 @@
{"toolset_name":"kubernetes/core","tool_name":"kubectl_find_resource","match_params":{"kind":"pod","keyword":"ip-172-31-8-128.us-east-2.compute.internal"}}
stdout:
default alertmanager-robusta-kube-prometheus-st-alertmanager-0 2/2 Running 0 3d22h 172.31.5.200 ip-172-31-8-128.us-east-2.compute.internal <none> <none> alertmanager=robusta-kube-prometheus-st-alertmanager,app.kubernetes.io/instance=robusta-kube-prometheus-st-alertmanager,app.kubernetes.io/managed-by=prometheus-operator,app.kubernetes.io/name=alertmanager,app.kubernetes.io/version=0.26.0,apps.kubernetes.io/pod-index=0,controller-revision-hash=alertmanager-robusta-kube-prometheus-st-alertmanager-57cd7fb46f,statefulset.kubernetes.io/pod-name=alertmanager-robusta-kube-prometheus-st-alertmanager-0
default analytics-exporter-fast-8cf8c9446-6rqwc 0/1 CrashLoopBackOff 1061 (2m51s ago) 3d18h 172.31.15.122 ip-172-31-8-128.us-east-2.compute.internal <none> <none> app=analytics-exporter-fast,pod-template-hash=8cf8c9446
default customer-relations-webapp-5d98ffcfd-nj5gs 0/1 ImagePullBackOff 0 3d18h 172.31.14.171 ip-172-31-8-128.us-east-2.compute.internal <none> <none> app=customer-relations,pod-template-hash=5d98ffcfd,visualize=true
default db-certs-authenticator-7ffd769f48-d9pxl 0/1 CrashLoopBackOff 886 (69s ago) 3d18h 172.31.3.214 ip-172-31-8-128.us-east-2.compute.internal <none> <none> app=flask,pod-template-hash=7ffd769f48
default java-api-checker-9pj7k 0/1 Error 0 3d18h 172.31.12.200 ip-172-31-8-128.us-east-2.compute.internal <none> <none> batch.kubernetes.io/controller-uid=ea3f2c52-3382-4cbc-8958-41832511a3e7,batch.kubernetes.io/job-name=java-api-checker,controller-uid=ea3f2c52-3382-4cbc-8958-41832511a3e7,job-name=java-api-checker
default java-api-checker-vzm7z 0/1 Error 0 3d18h 172.31.13.205 ip-172-31-8-128.us-east-2.compute.internal <none> <none> batch.kubernetes.io/controller-uid=ea3f2c52-3382-4cbc-8958-41832511a3e7,batch.kubernetes.io/job-name=java-api-checker,controller-uid=ea3f2c52-3382-4cbc-8958-41832511a3e7,job-name=java-api-checker
default logging-agent 0/1 Init:CrashLoopBackOff 1067 (15s ago) 3d18h 172.31.1.249 ip-172-31-8-128.us-east-2.compute.internal <none> <none> <none>
default prometheus-robusta-kube-prometheus-st-prometheus-0 2/2 Running 0 3d22h 172.31.11.168 ip-172-31-8-128.us-east-2.compute.internal <none> <none> app.kubernetes.io/instance=robusta-kube-prometheus-st-prometheus,app.kubernetes.io/managed-by=prometheus-operator,app.kubernetes.io/name=prometheus,app.kubernetes.io/version=2.48.1,apps.kubernetes.io/pod-index=0,controller-revision-hash=prometheus-robusta-kube-prometheus-st-prometheus-55d87c869b,operator.prometheus.io/name=robusta-kube-prometheus-st-prometheus,operator.prometheus.io/shard=0,prometheus=robusta-kube-prometheus-st-prometheus,statefulset.kubernetes.io/pod-name=prometheus-robusta-kube-prometheus-st-prometheus-0
default robusta-forwarder-89f44d49b-fxtrh 1/1 Running 0 3d22h 172.31.3.106 ip-172-31-8-128.us-east-2.compute.internal <none> <none> app=robusta-forwarder,pod-template-hash=89f44d49b
default robusta-kube-prometheus-st-operator-7fc5db7f4d-dr46l 1/1 Running 0 3d22h 172.31.6.195 ip-172-31-8-128.us-east-2.compute.internal <none> <none> app.kubernetes.io/instance=robusta,app.kubernetes.io/managed-by=Helm,app.kubernetes.io/part-of=kube-prometheus-stack,app.kubernetes.io/version=55.7.0,app=kube-prometheus-stack-operator,chart=kube-prometheus-stack-55.7.0,heritage=Helm,pod-template-hash=7fc5db7f4d,release=robusta
default robusta-prometheus-node-exporter-t2b5k 1/1 Running 0 3d22h 172.31.8.128 ip-172-31-8-128.us-east-2.compute.internal <none> <none> app.kubernetes.io/component=metrics,app.kubernetes.io/instance=robusta,app.kubernetes.io/managed-by=Helm,app.kubernetes.io/name=prometheus-node-exporter,app.kubernetes.io/part-of=prometheus-node-exporter,app.kubernetes.io/version=1.7.0,controller-revision-hash=7bf445876b,helm.sh/chart=prometheus-node-exporter-4.24.0,jobLabel=node-exporter,pod-template-generation=1,release=robusta
default search-engine-service 0/1 Running 0 3d18h 172.31.11.151 ip-172-31-8-128.us-east-2.compute.internal <none> <none> <none>
kube-system aws-node-m47xg 2/2 Running 0 25d 172.31.8.128 ip-172-31-8-128.us-east-2.compute.internal <none> <none> app.kubernetes.io/instance=aws-vpc-cni,app.kubernetes.io/name=aws-node,controller-revision-hash=54f5998898,k8s-app=aws-node,pod-template-generation=1
kube-system ebs-csi-controller-7bb676b68d-cs2gx 6/6 Running 0 25d 172.31.12.254 ip-172-31-8-128.us-east-2.compute.internal <none> <none> app.kubernetes.io/component=csi-driver,app.kubernetes.io/managed-by=EKS,app.kubernetes.io/name=aws-ebs-csi-driver,app.kubernetes.io/version=1.35.0,app=ebs-csi-controller,pod-template-hash=7bb676b68d
kube-system ebs-csi-node-pgrvq 3/3 Running 0 25d 172.31.2.194 ip-172-31-8-128.us-east-2.compute.internal <none> <none> app.kubernetes.io/component=csi-driver,app.kubernetes.io/managed-by=EKS,app.kubernetes.io/name=aws-ebs-csi-driver,app.kubernetes.io/version=1.35.0,app=ebs-csi-node,controller-revision-hash=6bc69bc4b9,pod-template-generation=1
kube-system eks-pod-identity-agent-vgz8h 1/1 Running 0 25d 172.31.8.128 ip-172-31-8-128.us-east-2.compute.internal <none> <none> app.kubernetes.io/instance=eks-pod-identity-agent,app.kubernetes.io/name=eks-pod-identity-agent,controller-revision-hash=74bcb67854,pod-template-generation=1
kube-system kube-proxy-l7vqp 1/1 Running 0 25d 172.31.8.128 ip-172-31-8-128.us-east-2.compute.internal <none> <none> controller-revision-hash=6b64cc6947,k8s-app=kube-proxy,pod-template-generation=1
sock-shop user-5bd96d75fb-ld8xv 1/1 Running 0 3d18h 172.31.0.106 ip-172-31-8-128.us-east-2.compute.internal <none> <none> name=user,pod-template-hash=5bd96d75fb
sock-shop user-db-5dc5c5f488-dw6xw 1/1 Running 0 3d18h 172.31.0.66 ip-172-31-8-128.us-east-2.compute.internal <none> <none> name=user-db,pod-template-hash=5dc5c5f488

stderr:
@@ -0,0 +1,3 @@
{"toolset_name":"kubernetes/core","tool_name":"kubectl_find_resource","match_params":{"kind":"node","keyword":"ip-172-31-8-128.us-east-2.compute.internal"}}
NAME STATUS ROLES AGE VERSION INTERNAL-IP EXTERNAL-IP OS-IMAGE KERNEL-VERSION CONTAINER-RUNTIME LABELS
ip-172-31-8-128.us-east-2.compute.internal Ready <none> 25d v1.30.4-eks-a737599 172.31.8.128 3.147.70.176 Amazon Linux 2 5.10.225-213.878.amzn2.x86_64 containerd://1.7.11 beta.kubernetes.io/arch=amd64,beta.kubernetes.io/instance-type=t3.medium,beta.kubernetes.io/os=linux,eks.amazonaws.com/capacityType=ON_DEMAND,eks.amazonaws.com/nodegroup-image=ami-067ed4d12a282fb31,eks.amazonaws.com/nodegroup=nicolas-node-group,failure-domain.beta.kubernetes.io/region=us-east-2,failure-domain.beta.kubernetes.io/zone=us-east-2a,k8s.io/cloud-provider-aws=02bcd7cbb8e774ede4606ab79260ae31,kubernetes.io/arch=amd64,kubernetes.io/hostname=ip-172-31-8-128.us-east-2.compute.internal,kubernetes.io/os=linux,node.kubernetes.io/instance-type=t3.medium,topology.ebs.csi.aws.com/zone=us-east-2a,topology.k8s.aws/zone-id=use2-az1,topology.kubernetes.io/region=us-east-2,topology.kubernetes.io/zone=us-east-2a
@@ -0,0 +1,2 @@
{"toolset_name":"kubernetes/core","tool_name":"kubectl_find_resource","match_params":{"kind":"pod","name":"ip-172-31-8-128.us-east-2.compute.internal"}}
Error from server (NotFound): pods "ip-172-31-8-128.us-east-2.compute.internal" not found
@@ -0,0 +1,6 @@
{"toolset_name":"kubernetes/live-metrics","tool_name":"kubectl_top_pods","match_params":{}}
Command `kubectl top pods -A` failed with return code 1
stdout:

stderr:
error: Metrics API not available
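
The mock fixtures above share a simple layout: a first line of JSON naming the toolset, the tool and the parameters to match, followed by the recorded tool output. A minimal sketch of reading that layout could look like this; `ToolMock` and `parse_tool_mock` are hypothetical names, and the real test suite may parse these files differently.

```python
import json
from typing import Any, Dict, NamedTuple


class ToolMock(NamedTuple):  # hypothetical name, for illustration only
    toolset_name: str
    tool_name: str
    match_params: Dict[str, Any]
    output: str


def parse_tool_mock(text: str) -> ToolMock:
    """Split a mock file into its JSON header line and the recorded output."""
    header, _, output = text.partition("\n")
    meta = json.loads(header)
    return ToolMock(
        toolset_name=meta["toolset_name"],
        tool_name=meta["tool_name"],
        match_params=meta.get("match_params", {}),
        output=output,
    )
```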
@@ -8,6 +8,3 @@ retrieval_context:
 evaluation:
   answer_relevancy: 0
   faithfulness: 0
-  contextual_precision: 0
-  contextual_recall: 0
-  contextual_relevancy: 0
20 changes: 20 additions & 0 deletions tests/llm/fixtures/test_ask_holmes/07_high_latency/helm/Dockerfile
@@ -0,0 +1,20 @@
FROM python:3.10-slim

# Set working directory
WORKDIR /app

# Copy requirements.txt
COPY requirements.txt .

# Install dependencies
RUN pip install --no-cache-dir -r requirements.txt

# Copy the FastAPI app
COPY . .

# Expose the ports
EXPOSE 8000 8001

# Run the FastAPI app
CMD ["uvicorn", "app:app", "--host", "0.0.0.0", "--port", "8000"]
