Merge pull request #180 from getanteon/develop
Docs Update
kursataktas authored Aug 8, 2024
2 parents eb23447 + d43951e commit b4a7ebe
Showing 4 changed files with 168 additions and 103 deletions.
79 changes: 51 additions & 28 deletions Alaz-Architecture.md → ARCHITECTURE.md
@@ -1,53 +1,71 @@
# Alaz Architecture

<!-- vim-markdown-toc GFM -->

- [1. Kubernetes Client](#1-kubernetes-client)
- [2. Container Runtimes (`containerd`)](#2-container-runtimes-containerd)
- [3. eBPF Programs](#3-ebpf-programs)
- [Note](#note)
- [How to Build](#how-to-build)
- [How to Deploy](#how-to-deploy)

<!-- vim-markdown-toc -->

Alaz is designed to run in a Kubernetes cluster as an agent, deployed as a DaemonSet (one instance runs on each cluster node).

It watches and pulls data from the cluster to provide visibility into it.

It gathers information from three different sources:

## 1- Kubernetes Client

## 1. Kubernetes Client

Using the Kubernetes client, it polls different types of events related to Kubernetes resources, such as **ADD, UPDATE, DELETE** events for K8s resources like **Pods, Deployments, Services**, etc.

Packages used:
- `k8s.io/api/core/v1`
- `k8s.io/apimachinery/pkg/util/runtime`
- `k8s.io/client-go`
We use the following packages:

- `k8s.io/api/core/v1`
- `k8s.io/apimachinery/pkg/util/runtime`
- `k8s.io/client-go`

## 2. Container Runtimes (`containerd`)

## 2- Container Runtimes (containerd)
There are different types of container runtimes available for K8s clusters, such as containerd, CRI-O, and Docker.
By connecting to the chosen container runtime's socket, Alaz can gather more detailed information about the containers running on a node:

- log directory of the container,
- information related to its sandbox,
- pid,
- cgroups
- environment variables
- ...
- etc.

> We do not currently use container runtime data because today's objectives do not need it. It will be used later to collect more detailed data.
## 3- eBPF Programs
## 3. eBPF Programs

In Alaz's eBPF directory there are a couple of **eBPF programs written in C using libbpf**.
In Alaz's eBPF directory there are a couple of eBPF programs written in C using libbpf.

To compile these programs, we have an **eBPF-builder image** with the necessary dependencies installed, such as **clang, llvm, libbpf and go**.
To compile these programs, we have an **eBPF-builder image** with the necessary dependencies installed, such as clang, llvm, libbpf and go.

eBPF programs are compiled in this container using the [Cilium bpf2go package](https://github.com/cilium/ebpf/tree/main/cmd/bpf2go).
> eBPF programs are compiled in this container using the [Cilium bpf2go package](https://github.com/cilium/ebpf/tree/main/cmd/bpf2go).

Using a `go generate` directive with `bpf2go`, we compile the eBPF programs and generate the Go helper files needed to interact with them. With these helpers we can:

- Link the program to a tracepoint or a kprobe.
- Read BPF maps from user space and pass the data on for interpretation.

The packages used from Cilium are:
- `github.com/cilium/ebpf/link`
- `github.com/cilium/ebpf/perf`
- `github.com/cilium/ebpf/rlimit`
The packages used from Cilium are:

eBPF programs:
- `tcp_state`: Detects newly established, closed, and listening TCP connections. The number of sockets associated with the program's PID depends on the remote IP address. Keeping this data together with the file descriptor is useful.
- `l7_req`: Monitors both incoming and outgoing payloads by tracking the `write` and `read` syscalls and uprobes. The data is then aggregated with `tcp_state` data, which lets us determine who sent which request to where.

Current programs are generally attached to kernel tracepoints like:
- `github.com/cilium/ebpf/link`
- `github.com/cilium/ebpf/perf`
- `github.com/cilium/ebpf/rlimit`

eBPF programs:

- `tcp_state`: Detects newly established, closed, and listening TCP connections. The number of sockets associated with the program's PID depends on the remote IP address. Keeping this data together with the file descriptor is useful.
- `l7_req`: Monitors both incoming and outgoing payloads by tracking the `write` and `read` syscalls and uprobes. The data is then aggregated with `tcp_state` data, which lets us determine who sent which request to where.
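
The aggregation between `l7_req` and `tcp_state` can be pictured as a lookup keyed by process ID and file descriptor. The following is only a minimal user-space sketch of that idea, not Alaz's actual code: the types `sockKey`, `connInfo`, `l7Event`, and `connTable` and their fields are hypothetical names invented for illustration.

```go
package main

import "fmt"

// sockKey identifies a socket by its owning process and file descriptor.
// Hypothetical type, not taken from the Alaz source.
type sockKey struct {
	PID uint32
	FD  uint64
}

// connInfo is a simplified stand-in for what a tcp_state event reports.
type connInfo struct {
	Src string
	Dst string
}

// l7Event is a simplified stand-in for a payload event from l7_req.
type l7Event struct {
	PID    uint32
	FD     uint64
	Method string
	Path   string
}

// connTable tracks currently open connections keyed by (PID, FD).
type connTable map[sockKey]connInfo

// established records a connection when it is opened.
func (t connTable) established(pid uint32, fd uint64, c connInfo) {
	t[sockKey{PID: pid, FD: fd}] = c
}

// closed forgets a connection so a reused FD is not matched to stale endpoints.
func (t connTable) closed(pid uint32, fd uint64) {
	delete(t, sockKey{PID: pid, FD: fd})
}

// attribute joins an L7 event with the connection it was sent on.
func (t connTable) attribute(ev l7Event) (string, bool) {
	c, ok := t[sockKey{PID: ev.PID, FD: ev.FD}]
	if !ok {
		return "", false
	}
	return fmt.Sprintf("%s -> %s %s %s", c.Src, c.Dst, ev.Method, ev.Path), true
}

func main() {
	table := connTable{}
	table.established(1234, 7, connInfo{Src: "10.0.0.1:41000", Dst: "10.0.0.2:80"})

	ev := l7Event{PID: 1234, FD: 7, Method: "GET", Path: "/health"}
	if line, ok := table.attribute(ev); ok {
		fmt.Println(line)
	}

	table.closed(1234, 7)
	if _, ok := table.attribute(ev); !ok {
		fmt.Println("no open connection for this (PID, FD)")
	}
}
```

Keying on the file descriptor as well as the PID matters because one process can hold many sockets at once, so the PID alone cannot say which connection a payload belongs to.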

Current programs are generally attached to kernel tracepoints like:

```
tracepoint/syscalls/sys_enter_write (l7_req)
@@ -64,25 +82,30 @@ tracepoint/syscalls/sys_exit_connect (tcp_state)
```

uprobes:

```
SSL_write
SSL_read
crypto/tls.(*Conn).Write
crypto/tls.(*Conn).Read
```

#### Note:
Uretprobes crash Go applications. (https://github.com/iovisor/bcc/issues/1320)
### Note

Uretprobes crash Go applications. See <https://github.com/iovisor/bcc/issues/1320>.

As a workaround, we disassemble the executable, find the addresses of its return instructions, and attach classic uprobes to them.

## How to Build Alaz
## How to Build

Alaz embeds the compiled eBPF programs. After compilation in the eBPF-builder finishes, the compiled programs are placed in the project tree.

Using Go's **//go:embed** directive, we embed the *.o* files and load them into the kernel using the [Cilium eBPF package](https://github.com/cilium/eBPF).
Using Go's **//go:embed** directive, we embed the _.o_ files and load them into the kernel using the [Cilium eBPF package](https://github.com/cilium/eBPF).

Because the compiled programs are embedded, we then build Alaz more or less like an ordinary Go application.

#### How to Deploy Alaz
## How to Deploy

Alaz is deployed as a privileged DaemonSet on the cluster. It must run as a privileged container because it needs read access to the host machine's `/proc` directory.

Alaz's `serviceAccount` must also be associated with `ClusterRole` and `ClusterRoleBinding` resources so that it can talk to the K8s API server.
114 changes: 70 additions & 44 deletions README.md
@@ -1,15 +1,13 @@

<h1 align="center">Alaz - Anteon (Formerly Ddosify) eBPF Agent for Kubernetes Monitoring</h1>

<p align="center">
<a href="https://github.com/getanteon/alaz/blob/master/LICENSE" target="_blank"><img src="https://img.shields.io/badge/LICENSE-AGPL--3.0-orange?style=for-the-badge&logo=none" alt="alaz license" /></a>
<a href="https://discord.com/invite/9KdnrSUZQg" target="_blank"><img src="https://img.shields.io/discord/898523141788287017?style=for-the-badge&logo=discord&label=DISCORD" alt="Anteon discord server" /></a>
<a href="https://hub.docker.com/r/ddosify/alaz" target="_blank"><img src="https://img.shields.io/docker/v/ddosify/alaz?style=for-the-badge&logo=docker&label=docker&sort=semver" alt="alaz docker image" /></a>
</p>
<h1 align="center">Alaz - Anteon eBPF Agent for Kubernetes Monitoring</h1>

<p align="center">
<img src="https://raw.githubusercontent.com/getanteon/anteon/master/assets/anteon_service_map.png" alt="Anteon Kubernetes Monitoring Service Map" />
<i>Anteon automatically generates a Service Map of your K8s cluster with its eBPF agent Alaz, without code instrumentation or sidecars, so you can easily find the bottlenecks in your system. Red lines indicate high latency between services.</i>
<a href="https://github.com/getanteon/alaz/blob/master/LICENSE" target="_blank"><img src="https://img.shields.io/badge/LICENSE-AGPL--3.0-orange?style=for-the-badge&logo=none" alt="alaz license" /></a>
<a href="https://discord.com/invite/9KdnrSUZQg" target="_blank"><img src="https://img.shields.io/discord/898523141788287017?style=for-the-badge&logo=discord&label=DISCORD" alt="Anteon discord server" /></a>
<a href="https://hub.docker.com/r/ddosify/alaz" target="_blank"><img src="https://img.shields.io/docker/v/ddosify/alaz?style=for-the-badge&logo=docker&label=docker&sort=semver" alt="alaz docker image" /></a>

<i>Anteon (formerly Ddosify) automatically generates a Service Map of your K8s cluster with its eBPF agent Alaz, without code instrumentation or sidecars, so you can easily find the bottlenecks in your system. Red lines indicate high latency between services.</i>

</p>

<h2 align="center">
@@ -19,62 +17,89 @@
<a href="https://discord.com/invite/9KdnrSUZQg" target="_blank">Discord</a>
</h2>

<details>
<summary>Table of Contents</summary>

<!-- vim-markdown-toc GFM -->

- [What is Alaz?](#what-is-alaz)
- [Features](#features)
- [🚀 Getting Started](#-getting-started)
- [☁️ For Anteon Cloud](#-for-anteon-cloud)
- [Using the kubectl](#using-the-kubectl)
- [Using the Helm](#using-the-helm)
- [🏠 For Anteon Self-Hosted](#-for-anteon-self-hosted)
- [Using the kubectl](#using-the-kubectl-1)
- [Using the Helm](#using-the-helm-1)
- [🧹 Cleanup](#-cleanup)
- [Supported Protocols](#supported-protocols)
- [Limitations](#limitations)
- [Encryption Libraries](#encryption-libraries)
- [Contributing](#contributing)
- [Communication](#communication)
- [License](#license)

<!-- vim-markdown-toc -->

## What is Alaz?

[Alaz](https://github.com/getanteon/alaz) is an open-source Anteon eBPF agent that can inspect and collect Kubernetes (K8s) service traffic without the need for code instrumentation, sidecars, or service restarts. This is possible due to its use of eBPF technology.
[**Alaz**](https://github.com/getanteon/alaz) is an open-source Anteon eBPF agent that can inspect and collect Kubernetes (K8s) service traffic without the need for code instrumentation, sidecars, or service restarts. This is possible due to its use of eBPF technology.

Alaz can create a **Service Map** that helps identify golden signals and problems like:

- High latencies between K8s services
- Detect 5xx HTTP status codes
- Detect 5xx HTTP status codes
- Detect Idle / Zombie services
- Detect slow SQL queries

Additionally, Anteon tracks and displays live data on your cluster instances' CPU, memory, disk, and network usage. All of the dashboards are generated out of the box, and you can create alerts based on these metric values. Check out the [docs](https://getanteon.com/docs/) for more.
Additionally, Anteon tracks and displays live data on your cluster instances' CPU, memory, disk, and network usage. All of the dashboards are generated out of the box, and you can create alerts based on these metric values. Check out the [documentation](https://getanteon.com/docs/) for more.

<p align="center">
<img src="https://raw.githubusercontent.com/getanteon/anteon/master/assets/anteon_metrics.png" alt="Anteon Kubernetes Monitoring Metrics" />
<i>Anteon tracks and displays live data on your cluster instances' CPU, memory, disk, and network usage.</i>
</p>


➡️ For more information about Anteon, see [Anteon](https://github.com/getanteon/anteon).
➡️ See [Anteon repository](https://github.com/getanteon/anteon) for more information.

## Features

**Low-Overhead:**
**Low-Overhead**

Inspect and collect K8s service traffic without the need for code instrumentation, sidecars, or service restarts.

**Effortless:**
**Effortless**

Anteon will create the Service Map and Metrics Dashboard that help identify golden signals and issues such as high latencies, 5xx errors, and zombie services.

**Prometheus Compatible:**
**Prometheus Compatible**

Gather system information and resources via the Prometheus Node Exporter, which is readily available on the agent.

**Cloud or On-premise:**
**Cloud or On-premise**

Export metrics to [Anteon Cloud](https://getanteon.com), or install the [Anteon Self-Hosted](https://getanteon.com/docs/self-hosted/) in your infrastructure and manage everything according to your needs.

**Test & Observe**

Export metrics to [Anteon Cloud](https://getanteon.com), or install the [Anteon Self-Hosted](https://github.com/getanteon/anteon/tree/master/selfhosted) in your infrastructure and manage everything according to your needs.
Anteon Performance Testing and Alaz can work collaboratively. You can start a load test and monitor your system simultaneously. This will help you spot performance issues instantly. Check out the [Anteon documentation](https://getanteon.com/docs) for more information about Anteon Stack.

**Test & Observe:**
**Alerts for Anomalies**

Anteon Performance Testing and Alaz can work collaboratively. You can start a load test and monitor your system simultaneously. This will help you spot performance issues instantly. Check out the [Anteon GitHub Repository](https://github.com/getanteon/anteon) for more information about Anteon Stack.
If something unusual, like a sudden increase in CPU usage, happens in your Kubernetes (K8s) cluster, Anteon immediately sends alerts to your Slack.

**Alerts for Anomalies:** If something unusual, like a sudden increase in CPU usage, happens in your Kubernetes (K8s) cluster, Anteon immediately sends alerts to your Slack.
**Platform Support**

Works on both Arm64 and x86_64 architectures.
Works on both Arm64 and x86_64 architectures.

## Getting Started
## 🚀 Getting Started

To use Alaz, you need to have an [Anteon Cloud](https://app.getanteon.com/register) account or [Anteon Self-Hosted](https://github.com/getanteon/anteon/tree/master/selfhosted) installed.
To use Alaz, you need to have an [Anteon Cloud](https://app.getanteon.com/register) account or [Anteon Self-Hosted](https://github.com/getanteon/anteon) installed.

### ☁️ For Anteon Cloud

1. Register for an [Anteon Cloud account](https://app.getanteon.com/register).
2. Add a cluster on the [Observability page](https://app.getanteon.com/clusters). You will receive a Monitoring ID and instructions.
3. Run the agent on your Kubernetes cluster using the instructions you received. There are two options for Kubernetes deployment:
3. Run the agent on your Kubernetes cluster using the instructions you received. There are two options for Kubernetes deployment:

#### Using the kubectl

@@ -102,11 +127,11 @@ Then you can view the metrics and Kubernetes Service Map on the [Anteon Observab

### 🏠 For Anteon Self-Hosted

1. Install [Anteon Self-Hosted](https://github.com/getanteon/anteon/tree/master/selfhosted)
1. Install [Anteon Self-Hosted](https://getanteon.com/docs/self-hosted)
2. Add a cluster on the Observability page of your Self-Hosted frontend. You will receive a Monitoring ID and instructions.
3. Run the agent on your Kubernetes cluster using the instructions you received.
3. Run the agent on your Kubernetes cluster using the instructions you received.

Note: After you install Anteon Self-Hosted, you will have an Anteon Self-Hosted endpoint served by an nginx reverse proxy. The base URL of this endpoint forwards traffic to the frontend, and the base URL with the `/api` suffix forwards traffic to the backend. So you need to set the backend host variable to `http://<your-anteon-self-hosted-endpoint>/api`.
Note: After you install Anteon Self-Hosted, you will have an Anteon Self-Hosted endpoint served by an Nginx reverse proxy. The base URL of this endpoint forwards traffic to the frontend, and the base URL with the `/api` suffix forwards traffic to the backend. So you need to set the backend host variable to `http://<your-anteon-self-hosted-endpoint>/api`.

There are two options for Kubernetes deployment:

@@ -139,19 +164,19 @@ helm upgrade --install --namespace anteon alaz anteon/alaz --set monitoringID=$M

Then you can view the metrics and Kubernetes Service Map on the Anteon Self-Hosted Observability dashboard. For more information, see [Anteon Monitoring Docs](https://getanteon.com/docs/kubernetes-monitoring/).

Alaz runs as a DaemonSet on your Kubernetes cluster. It collects metrics and sends them to Anteon Cloud or Anteon Self-Hosted. You can view the metrics on the Anteon Observability dashboard. For the detailed Alaz architecture, see [Alaz Architecture](https://github.com/getanteon/alaz/blob/master/Alaz-Architecture.md).
Alaz runs as a DaemonSet on your Kubernetes cluster. It collects metrics and sends them to Anteon Cloud or Anteon Self-Hosted. You can view the metrics on the Anteon Observability dashboard. For the detailed Alaz architecture, see [Alaz Architecture](https://github.com/getanteon/alaz/blob/master/ARCHITECTURE.md).

## Cleanup
## 🧹 Cleanup

To remove Alaz from your Kubernetes cluster, run the following command:

- For Kubectl
- For Kubectl:

```bash
kubectl delete -f https://raw.githubusercontent.com/getanteon/alaz/master/resources/alaz.yaml
```

- For Helm
- For Helm:

```bash
helm delete alaz --namespace anteon
@@ -172,7 +197,7 @@ Alaz supports the following protocols:
- MySQL
- MongoDB

Other protocols will be supported soon. If you have a specific protocol you would like to see supported, please open an issue.
Other protocols will be supported soon. If you have a specific protocol you would like to see supported, please [open an issue](https://github.com/getanteon/alaz/issues/new).

## Limitations

@@ -182,35 +207,36 @@ In the future, we plan to support Docker containers.
Alaz is an eBPF application that uses [CO-RE](https://github.com/libbpf/libbpf#bpf-co-re-compile-once--run-everywhere).
Most recent Linux distributions support CO-RE. For CO-RE to work, the kernel has to be built with BTF (BPF Type Format) information.

You can check your kernel version with the `uname -r` command,
and check whether BTF is enabled by default for your distribution in the [btfhub](https://github.com/aquasecurity/btfhub/blob/main/docs/supported-distros.md) list of supported distros.

For the time being, we expect BTF information to be readily available on your system. We'll support all kernels in the upcoming weeks leveraging [btfhub](https://github.com/aquasecurity/btfhub).
For the time being, we expect BTF information to be readily available on your system. We will support all kernels in the upcoming weeks leveraging [btfhub](https://github.com/aquasecurity/btfhub).

### Encryption Libraries

These are the libraries that Alaz hooks into for capturing encrypted traffic.

- [crypto/tls](https://pkg.go.dev/crypto/tls):
For Alaz to capture TLS requests in your Go applications, your Go version must be **1.17+** and your executable must include debug info.

- [OpenSSL](https://www.openssl.org/):
OpenSSL shared objects that are dynamically linked into your executable are supported.
Supported versions: **1.0.2**, **1.1.1** and **3.\***
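
To see which Go version one of your binaries was built with, the standard library's `debug/buildinfo` package (available since Go 1.18) can read it straight from the executable. The sketch below covers only the version half of the `crypto/tls` requirement above and does not check for debug info; `supportsGoTLSCapture` is a hypothetical helper name, not part of Alaz.

```go
package main

import (
	"debug/buildinfo"
	"fmt"
	"os"
	"strconv"
	"strings"
)

// supportsGoTLSCapture reports whether a toolchain version string such as
// "go1.21.3" is at least Go 1.17. Unparseable strings yield false.
func supportsGoTLSCapture(goVersion string) bool {
	if !strings.HasPrefix(goVersion, "go") {
		return false
	}
	parts := strings.SplitN(strings.TrimPrefix(goVersion, "go"), ".", 3)
	if len(parts) < 2 {
		return false
	}
	major, err := strconv.Atoi(parts[0])
	if err != nil {
		return false
	}
	// Keep only the leading digits of the minor part ("22rc1" becomes "22").
	digits := parts[1]
	for i, r := range digits {
		if r < '0' || r > '9' {
			digits = digits[:i]
			break
		}
	}
	minor, err := strconv.Atoi(digits)
	if err != nil {
		return false
	}
	return major > 1 || (major == 1 && minor >= 17)
}

func main() {
	if len(os.Args) < 2 {
		fmt.Println("usage: gocheck <path-to-go-binary>")
		return
	}
	info, err := buildinfo.ReadFile(os.Args[1])
	if err != nil {
		fmt.Println("not a readable Go binary:", err)
		return
	}
	fmt.Printf("%s was built with %s, version requirement met: %v\n",
		os.Args[1], info.GoVersion, supportsGoTLSCapture(info.GoVersion))
}
```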

## Contributing

Contributions to Alaz are welcome! To contribute, please follow these steps:

1. Fork the repository
2. Create a new branch: `git checkout -b my-branch`
3. Make your changes and commit them: `git commit -am 'Add some feature'`
3. Make your changes and commit them: `git commit -am "Add some feature"`
4. Push to the branch: `git push origin my-branch`
5. Submit a pull request
5. Submit a pull request.

## Communication

You can join our [Discord Server](https://discord.com/invite/9KdnrSUZQg) for issues, feature requests, feedback, or anything else.

## License

Alaz is licensed under the AGPLv3: https://www.gnu.org/licenses/agpl-3.0.html

Alaz is licensed under the [AGPLv3](LICENSE)