Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
Merge branch 'master' into add_tool_argocd
Showing 155 changed files with 104,742 additions and 1,096 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -5,7 +5,16 @@ name: Build and test HolmesGPT | |
on: [push, pull_request, workflow_dispatch] | ||
|
||
jobs: | ||
check: | ||
name: Pre-commit checks | ||
runs-on: ubuntu-latest | ||
steps: | ||
- uses: actions/checkout@v4 | ||
- uses: actions/setup-python@v5 | ||
- uses: pre-commit/[email protected] | ||
|
||
build: | ||
needs: check | ||
strategy: | ||
matrix: | ||
python-version: ["3.9", "3.10", "3.11", "3.12"] | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,9 @@ | ||
repos: | ||
- repo: https://github.com/python-poetry/poetry | ||
rev: 1.8.4 | ||
hooks: | ||
- id: poetry-check | ||
- id: poetry-lock | ||
pass_filenames: false | ||
args: | ||
- --no-update |
2 changes: 0 additions & 2 deletions
tests/fixtures/test_chat/7_get_pod_events/kubectl_find_resource.txt
This file was deleted.
6 changes: 0 additions & 6 deletions
tests/fixtures/test_chat/8_multi_container_pod/kubectl_logs_grep.txt
This file was deleted.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,69 @@ | ||
|
||
|
||
There are 2 types of test cases: | ||
|
||
- *ask holmes*: tests the ask holmes functionality. Only a single question is supported; back-and-forth conversation is not supported or tested. | ||
- *investigate*: tests the ability of Holmes to investigate issues reported by Alertmanager | ||
|
||
## How to write a test case | ||
|
||
### ask_holmes | ||
|
||
#### 1. Create a test folder | ||
|
||
Add a new folder to `tests/llm/fixtures/test_ask_holmes`. For example: | ||
|
||
```sh | ||
mkdir tests/llm/fixtures/test_ask_holmes/999_my_test_case | ||
``` | ||
|
||
#### 2. Add a test case definition | ||
|
||
In this folder, add a `test_case.yaml` file: | ||
|
||
```yaml | ||
user_prompt: 'Is pod xyz healthy? ' | ||
expected_output: "Yes, pod xyz is healthy. It is running and there are no errors in the logs." | ||
retrieval_context: | ||
- Any element of context. This will inform the evaluation score 'context' | ||
- These context elements are expected to be present in the output | ||
evaluation: # expected evaluation scores. The test will fail unless the LLM scores at least the following: | ||
faithfulness: 0.5 # defaults to 0.3 | ||
context: 0 # defaults to 0 | ||
before-test: kubectl apply -f manifest.yaml | ||
after-test: kubectl delete -f manifest.yaml | ||
``` | ||
The file above relies on a `manifest.yaml` to deploy resources to your Kubernetes cluster. Add that file and any other file required by the `before-test` and `after-test` commands. | ||
|
||
Here are the possible fields in the `test_case.yaml` file: | ||
|
||
| Field | Type | Required/optional | Example value | Description | | ||
|-------------------|------------------|-------------------|-----------------------------------------------------------------------------|------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------| | ||
| user_prompt | str | Required | Is pod xyz healthy? | The user prompt | | ||
| expected_output | str | Required | Yes, pod xyz is healthy. It is running and there are no errors in the logs. | The expected answer from the LLM | | ||
| retrieval_context | List[str] | Optional | - pod xyz is running and healthy - there are no errors in the logs | Context that the LLM is expected to have used in its answer. If present, this generates a 'context' score proportional to the number of matching context elements found in the LLM's output. | | ||
| evaluation | Dict[str, float] | Optional | evaluation: <br/> faithfulness: 1 <br/> context: 1 <br/> | The minimum expected scores. The test will fail unless these are met. Set to 0 for unstable tests. | | ||
| before-test | str | Optional | kubectl apply -f manifest.yaml | A command to run before the LLM evaluation. The CWD for this command is the same folder as the fixture. This step is skipped unless the `RUN_LIVE` environment variable is set. | | ||
| after-test | str | Optional | kubectl delete -f manifest.yaml | A command to run after the LLM evaluation. The CWD for this command is the same folder as the fixture. Typically cleans up any before-test action. This step is skipped unless the `RUN_LIVE` environment variable is set. | | ||
| generate_mocks | bool | Optional | True | Whether the test suite should generate mock files. Existing mock files are overwritten. | | ||
|
||
|
||
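As an illustration of the `before-test` and `after-test` behaviour described in the table above, here is a minimal Python sketch. It is not the test suite's actual code: the function name `run_hook` is hypothetical, and treating any non-empty `RUN_LIVE` value as "set" is an assumption.

```python
import os
import subprocess
from pathlib import Path
from typing import Optional


def run_hook(command: Optional[str], fixture_dir: Path) -> bool:
    """Run a before-test/after-test command from the fixture folder.

    Returns True if the command ran, False if it was skipped.
    """
    # Assumption: any non-empty RUN_LIVE value enables the hooks.
    if not command or not os.environ.get("RUN_LIVE"):
        return False
    # shell=True because the YAML field holds a full command line;
    # cwd makes relative paths such as manifest.yaml resolve in the fixture folder.
    subprocess.run(command, shell=True, cwd=fixture_dir, check=True)
    return True
```

With `check=True`, a failing setup command raises `subprocess.CalledProcessError` instead of letting the evaluation run against a half-prepared cluster; whether the real suite behaves this way is not stated here.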
#### 3. Run the test | ||
|
||
Run the following: | ||
|
||
```sh | ||
UPLOAD_DATASET=1 RUN_LIVE=1 pytest ./tests/llm/test_ask_holmes.py -k 999_my_test_case | ||
``` | ||
|
||
The test passes or fails depending on whether the evaluation scores reach the minimums set in `evaluation`. | ||
|
||
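The pass/fail rule can be pictured with the following sketch. It assumes the defaults mentioned in the example `test_case.yaml` (0.3 for `faithfulness`, 0 for `context`); the names `DEFAULT_MINIMUMS` and `failing_scores` are hypothetical, and treating a missing score as 0 is an assumption rather than documented behaviour.

```python
from typing import Dict, List

# Defaults taken from the comments in the example test_case.yaml.
DEFAULT_MINIMUMS: Dict[str, float] = {"faithfulness": 0.3, "context": 0.0}


def failing_scores(scores: Dict[str, float], evaluation: Dict[str, float]) -> List[str]:
    """Return the names of the scores that fall below their minimum."""
    # Minimums from the test case override the defaults.
    minimums = {**DEFAULT_MINIMUMS, **evaluation}
    return [
        name
        for name, minimum in minimums.items()
        # Assumption: a score that was not produced counts as 0.
        if scores.get(name, 0.0) < minimum
    ]
```

For example, `failing_scores({"faithfulness": 0.4, "context": 0.0}, {"faithfulness": 0.5})` returns `["faithfulness"]`, so that test case would fail.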
# Environment variables | ||
|
||
| Name | Example | Description | | ||
|--------------------|-------------------------------------|----------------------------------------------------------------------------------------------------------------------------------------| | ||
| RUN_LIVE | RUN_LIVE=1 | Enables the execution of `before-test` and `after-test` commands to set up any remote resources. This also ignores any mock files. | | ||
| BRAINTRUST_API_KEY | BRAINTRUST_API_KEY=sk-1dh1...swdO02 | The Braintrust API key for your account. Log in at https://www.braintrust.dev -> top right persona logo -> settings -> API keys. | | ||
| UPLOAD_DATASET | UPLOAD_DATASET=1 | Synchronise the dataset from the local machine to braintrust. This is usually safe as datasets are separated by branch name. | | ||
| EXPERIMENT_ID | EXPERIMENT_ID=nicolas_gemini_v1 | Override the experiment name in Braintrust. Helps with identifying and comparing experiments. Must be unique across ALL experiments. | |
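
To make the table above concrete, the variables could be gathered into one settings object as sketched below. The class `RunSettings` and the function `load_settings` are hypothetical names, and reading any non-empty value as "enabled" is an assumption; the real test suite may parse these differently.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class RunSettings:
    run_live: bool
    upload_dataset: bool
    braintrust_api_key: Optional[str]
    experiment_id: Optional[str]


def load_settings(env: Mapping[str, str] = os.environ) -> RunSettings:
    """Read the test-related environment variables into one object."""
    return RunSettings(
        # Assumption: any non-empty value switches a flag on.
        run_live=bool(env.get("RUN_LIVE")),
        upload_dataset=bool(env.get("UPLOAD_DATASET")),
        # Empty strings are normalised to None.
        braintrust_api_key=env.get("BRAINTRUST_API_KEY") or None,
        experiment_id=env.get("EXPERIMENT_ID") or None,
    )
```

Passing a plain dictionary instead of `os.environ` keeps the function easy to exercise in unit tests.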
File renamed without changes.
23 changes: 23 additions & 0 deletions
tests/llm/fixtures/test_ask_holmes/01_how_many_pods/kubectl_find_resource_pod_by_keyword.txt
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,23 @@ | ||
{"toolset_name":"kubernetes/core","tool_name":"kubectl_find_resource","match_params":{"kind":"pod","keyword":"ip-172-31-8-128.us-east-2.compute.internal"}} | ||
stdout: | ||
default alertmanager-robusta-kube-prometheus-st-alertmanager-0 2/2 Running 0 3d22h 172.31.5.200 ip-172-31-8-128.us-east-2.compute.internal <none> <none> alertmanager=robusta-kube-prometheus-st-alertmanager,app.kubernetes.io/instance=robusta-kube-prometheus-st-alertmanager,app.kubernetes.io/managed-by=prometheus-operator,app.kubernetes.io/name=alertmanager,app.kubernetes.io/version=0.26.0,apps.kubernetes.io/pod-index=0,controller-revision-hash=alertmanager-robusta-kube-prometheus-st-alertmanager-57cd7fb46f,statefulset.kubernetes.io/pod-name=alertmanager-robusta-kube-prometheus-st-alertmanager-0 | ||
default analytics-exporter-fast-8cf8c9446-6rqwc 0/1 CrashLoopBackOff 1061 (2m51s ago) 3d18h 172.31.15.122 ip-172-31-8-128.us-east-2.compute.internal <none> <none> app=analytics-exporter-fast,pod-template-hash=8cf8c9446 | ||
default customer-relations-webapp-5d98ffcfd-nj5gs 0/1 ImagePullBackOff 0 3d18h 172.31.14.171 ip-172-31-8-128.us-east-2.compute.internal <none> <none> app=customer-relations,pod-template-hash=5d98ffcfd,visualize=true | ||
default db-certs-authenticator-7ffd769f48-d9pxl 0/1 CrashLoopBackOff 886 (69s ago) 3d18h 172.31.3.214 ip-172-31-8-128.us-east-2.compute.internal <none> <none> app=flask,pod-template-hash=7ffd769f48 | ||
default java-api-checker-9pj7k 0/1 Error 0 3d18h 172.31.12.200 ip-172-31-8-128.us-east-2.compute.internal <none> <none> batch.kubernetes.io/controller-uid=ea3f2c52-3382-4cbc-8958-41832511a3e7,batch.kubernetes.io/job-name=java-api-checker,controller-uid=ea3f2c52-3382-4cbc-8958-41832511a3e7,job-name=java-api-checker | ||
default java-api-checker-vzm7z 0/1 Error 0 3d18h 172.31.13.205 ip-172-31-8-128.us-east-2.compute.internal <none> <none> batch.kubernetes.io/controller-uid=ea3f2c52-3382-4cbc-8958-41832511a3e7,batch.kubernetes.io/job-name=java-api-checker,controller-uid=ea3f2c52-3382-4cbc-8958-41832511a3e7,job-name=java-api-checker | ||
default logging-agent 0/1 Init:CrashLoopBackOff 1067 (15s ago) 3d18h 172.31.1.249 ip-172-31-8-128.us-east-2.compute.internal <none> <none> <none> | ||
default prometheus-robusta-kube-prometheus-st-prometheus-0 2/2 Running 0 3d22h 172.31.11.168 ip-172-31-8-128.us-east-2.compute.internal <none> <none> app.kubernetes.io/instance=robusta-kube-prometheus-st-prometheus,app.kubernetes.io/managed-by=prometheus-operator,app.kubernetes.io/name=prometheus,app.kubernetes.io/version=2.48.1,apps.kubernetes.io/pod-index=0,controller-revision-hash=prometheus-robusta-kube-prometheus-st-prometheus-55d87c869b,operator.prometheus.io/name=robusta-kube-prometheus-st-prometheus,operator.prometheus.io/shard=0,prometheus=robusta-kube-prometheus-st-prometheus,statefulset.kubernetes.io/pod-name=prometheus-robusta-kube-prometheus-st-prometheus-0 | ||
default robusta-forwarder-89f44d49b-fxtrh 1/1 Running 0 3d22h 172.31.3.106 ip-172-31-8-128.us-east-2.compute.internal <none> <none> app=robusta-forwarder,pod-template-hash=89f44d49b | ||
default robusta-kube-prometheus-st-operator-7fc5db7f4d-dr46l 1/1 Running 0 3d22h 172.31.6.195 ip-172-31-8-128.us-east-2.compute.internal <none> <none> app.kubernetes.io/instance=robusta,app.kubernetes.io/managed-by=Helm,app.kubernetes.io/part-of=kube-prometheus-stack,app.kubernetes.io/version=55.7.0,app=kube-prometheus-stack-operator,chart=kube-prometheus-stack-55.7.0,heritage=Helm,pod-template-hash=7fc5db7f4d,release=robusta | ||
default robusta-prometheus-node-exporter-t2b5k 1/1 Running 0 3d22h 172.31.8.128 ip-172-31-8-128.us-east-2.compute.internal <none> <none> app.kubernetes.io/component=metrics,app.kubernetes.io/instance=robusta,app.kubernetes.io/managed-by=Helm,app.kubernetes.io/name=prometheus-node-exporter,app.kubernetes.io/part-of=prometheus-node-exporter,app.kubernetes.io/version=1.7.0,controller-revision-hash=7bf445876b,helm.sh/chart=prometheus-node-exporter-4.24.0,jobLabel=node-exporter,pod-template-generation=1,release=robusta | ||
default search-engine-service 0/1 Running 0 3d18h 172.31.11.151 ip-172-31-8-128.us-east-2.compute.internal <none> <none> <none> | ||
kube-system aws-node-m47xg 2/2 Running 0 25d 172.31.8.128 ip-172-31-8-128.us-east-2.compute.internal <none> <none> app.kubernetes.io/instance=aws-vpc-cni,app.kubernetes.io/name=aws-node,controller-revision-hash=54f5998898,k8s-app=aws-node,pod-template-generation=1 | ||
kube-system ebs-csi-controller-7bb676b68d-cs2gx 6/6 Running 0 25d 172.31.12.254 ip-172-31-8-128.us-east-2.compute.internal <none> <none> app.kubernetes.io/component=csi-driver,app.kubernetes.io/managed-by=EKS,app.kubernetes.io/name=aws-ebs-csi-driver,app.kubernetes.io/version=1.35.0,app=ebs-csi-controller,pod-template-hash=7bb676b68d | ||
kube-system ebs-csi-node-pgrvq 3/3 Running 0 25d 172.31.2.194 ip-172-31-8-128.us-east-2.compute.internal <none> <none> app.kubernetes.io/component=csi-driver,app.kubernetes.io/managed-by=EKS,app.kubernetes.io/name=aws-ebs-csi-driver,app.kubernetes.io/version=1.35.0,app=ebs-csi-node,controller-revision-hash=6bc69bc4b9,pod-template-generation=1 | ||
kube-system eks-pod-identity-agent-vgz8h 1/1 Running 0 25d 172.31.8.128 ip-172-31-8-128.us-east-2.compute.internal <none> <none> app.kubernetes.io/instance=eks-pod-identity-agent,app.kubernetes.io/name=eks-pod-identity-agent,controller-revision-hash=74bcb67854,pod-template-generation=1 | ||
kube-system kube-proxy-l7vqp 1/1 Running 0 25d 172.31.8.128 ip-172-31-8-128.us-east-2.compute.internal <none> <none> controller-revision-hash=6b64cc6947,k8s-app=kube-proxy,pod-template-generation=1 | ||
sock-shop user-5bd96d75fb-ld8xv 1/1 Running 0 3d18h 172.31.0.106 ip-172-31-8-128.us-east-2.compute.internal <none> <none> name=user,pod-template-hash=5bd96d75fb | ||
sock-shop user-db-5dc5c5f488-dw6xw 1/1 Running 0 3d18h 172.31.0.66 ip-172-31-8-128.us-east-2.compute.internal <none> <none> name=user-db,pod-template-hash=5dc5c5f488 | ||
|
||
stderr: |
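
The mock fixtures in this commit appear to share one layout: a first line of JSON metadata (`toolset_name`, `tool_name`, `match_params`) followed by the recorded tool output. A minimal sketch of reading such a file, assuming that layout holds for every fixture; `ToolMock` and `parse_mock` are hypothetical names, not part of the test suite:

```python
import json
from dataclasses import dataclass
from typing import Any, Dict


@dataclass
class ToolMock:
    toolset_name: str
    tool_name: str
    match_params: Dict[str, Any]
    output: str


def parse_mock(text: str) -> ToolMock:
    """Split a mock file into its JSON header line and the recorded output."""
    # Everything before the first newline is the metadata header.
    header, _, output = text.partition("\n")
    meta = json.loads(header)
    return ToolMock(
        toolset_name=meta["toolset_name"],
        tool_name=meta["tool_name"],
        # Assumption: a missing match_params means "match any parameters".
        match_params=meta.get("match_params", {}),
        output=output,
    )
```

A caller would typically pass the result of `Path(...).read_text()` for one of the `.txt` fixtures and compare `tool_name` and `match_params` against the tool call being mocked.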
File renamed without changes.
3 changes: 3 additions & 0 deletions
tests/llm/fixtures/test_ask_holmes/01_how_many_pods/kubectl_get_node.txt
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,3 @@ | ||
{"toolset_name":"kubernetes/core","tool_name":"kubectl_find_resource","match_params":{"kind":"node","keyword":"ip-172-31-8-128.us-east-2.compute.internal"}} | ||
NAME STATUS ROLES AGE VERSION INTERNAL-IP EXTERNAL-IP OS-IMAGE KERNEL-VERSION CONTAINER-RUNTIME LABELS | ||
ip-172-31-8-128.us-east-2.compute.internal Ready <none> 25d v1.30.4-eks-a737599 172.31.8.128 3.147.70.176 Amazon Linux 2 5.10.225-213.878.amzn2.x86_64 containerd://1.7.11 beta.kubernetes.io/arch=amd64,beta.kubernetes.io/instance-type=t3.medium,beta.kubernetes.io/os=linux,eks.amazonaws.com/capacityType=ON_DEMAND,eks.amazonaws.com/nodegroup-image=ami-067ed4d12a282fb31,eks.amazonaws.com/nodegroup=nicolas-node-group,failure-domain.beta.kubernetes.io/region=us-east-2,failure-domain.beta.kubernetes.io/zone=us-east-2a,k8s.io/cloud-provider-aws=02bcd7cbb8e774ede4606ab79260ae31,kubernetes.io/arch=amd64,kubernetes.io/hostname=ip-172-31-8-128.us-east-2.compute.internal,kubernetes.io/os=linux,node.kubernetes.io/instance-type=t3.medium,topology.ebs.csi.aws.com/zone=us-east-2a,topology.k8s.aws/zone-id=use2-az1,topology.kubernetes.io/region=us-east-2,topology.kubernetes.io/zone=us-east-2a |
2 changes: 2 additions & 0 deletions
tests/llm/fixtures/test_ask_holmes/01_how_many_pods/kubectl_get_pod.txt
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,2 @@ | ||
{"toolset_name":"kubernetes/core","tool_name":"kubectl_find_resource","match_params":{"kind":"pod","name":"ip-172-31-8-128.us-east-2.compute.internal"}} | ||
Error from server (NotFound): pods "ip-172-31-8-128.us-east-2.compute.internal" not found |
6 changes: 6 additions & 0 deletions
tests/llm/fixtures/test_ask_holmes/01_how_many_pods/kubectl_top_pods.txt
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,6 @@ | ||
{"toolset_name":"kubernetes/live-metrics","tool_name":"kubectl_top_pods","match_params":{}} | ||
Command `kubectl top pods -A` failed with return code 1 | ||
stdout: | ||
|
||
stderr: | ||
error: Metrics API not available |
File renamed without changes.
20 changes: 20 additions & 0 deletions
tests/llm/fixtures/test_ask_holmes/07_high_latency/helm/Dockerfile
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,20 @@ | ||
FROM python:3.10-slim | ||
|
||
# Set working directory | ||
WORKDIR /app | ||
|
||
# Copy requirements.txt | ||
COPY requirements.txt . | ||
|
||
# Install dependencies | ||
RUN pip install --no-cache-dir -r requirements.txt | ||
|
||
# Copy the FastAPI app | ||
COPY . . | ||
|
||
# Expose the ports | ||
EXPOSE 8000 8001 | ||
|
||
# Run the FastAPI app | ||
CMD ["uvicorn", "app:app", "--host", "0.0.0.0", "--port", "8000"] | ||
|