Design

The Honeycomb agent is an observability tool that continuously observes the log files of an application (e.g., nginx, MySQL, etc.), parses them, and sends them to the Honeycomb API.

This document describes the core abstractions of the Honeycomb agent, and how to use it in your Kubernetes cluster. Specifically:

  1. What happens when the Honeycomb agent is deployed on a Kubernetes cluster
  2. How to deploy the Honeycomb agent to a Kubernetes cluster
  3. How to configure the agent to consume logs of different types running in your cluster

You will need a running Kubernetes cluster, kubectl configured to talk to it, and a Honeycomb write key.

Deploying the Honeycomb agent to a Kubernetes cluster (and what happens when you do)

After you have set up a Kubernetes environment (e.g., minikube, or a cluster on AWS), run the following command.

kubectl apply -f honeycomb-agent-ds.json

This deploys the Honeycomb agent to Kubernetes as a DaemonSet, which means that Kubernetes will try to run the Honeycomb agent on every node in the cluster.
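
To check that the agent is running, you can ask Kubernetes for the DaemonSet's status. (A sketch; it assumes the DaemonSet is created in the kube-system namespace, as in the examples later in this document. Adjust the namespace flag if your manifest differs.)

    # List the Honeycomb agent DaemonSet and the pods it scheduled.
    kubectl --namespace kube-system get daemonsets
    kubectl --namespace kube-system get pods -o wide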

When a Honeycomb agent starts, in the general case, it needs to know:

  1. Which logs to tail
  2. Which parsers to use on which logs

The way this works in Kubernetes is: the user labels some set of pods (e.g., {app: nginx}), and then specifies which parsers to use on pods whose labels match a pattern (specified with label selectors).
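
For example, a pod labeled like the following would match the app=nginx selector used later in this document. (A minimal sketch; the pod name and image are placeholders, not part of this project.)

    apiVersion: v1
    kind: Pod
    metadata:
      name: nginx          # placeholder name
      labels:
        app: nginx         # the label that the agent's label selector matches
    spec:
      containers:
      - name: nginx
        image: nginx:1.13  # placeholder image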

This behavior is specified using a YAML config file, an example of which is below. An important note: for the MVP, this config file comes hard-coded into the Honeycomb agent container. This will change to be pluggable as the project matures.

Here is the example YAML config file:

apiHost: https://api.honeycomb.io
writekey: "YOUR_WRITE_KEY_HERE"
watchers:
- dataset: kubernetestest
  parser: json
  sampleRate: 22
  labelSelector: "app=nginx"
- dataset: mysql
  labelSelector: "app=mysql"
  parser: mysql

The watchers field defines label patterns (called "label selectors") that tell the Honeycomb agent which pods to read stdout of, as well as which parser to use when it reads the data.

For example, the dataset "kubernetestest" uses the JSON parser, and we want the Honeycomb agent to parse the stdout of any pod whose labels match app=nginx.

Once the agent boots up, it reads this file to configure itself, and begins parsing the logs and sending data to Honeycomb. As soon as there are events to consume, they will begin appearing in the dashboard.
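
To confirm that an agent picked up its configuration and is shipping events, you can also tail the agent's own output. (A sketch; the pod name below is a placeholder for one reported by kubectl get pods, and the kube-system namespace is assumed as in the examples later in this document.)

    # Replace the pod name with one of the honeycomb-agent pods on your cluster.
    kubectl --namespace kube-system logs honeycomb-agent-v1.1-xxxxx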

Customizing the Honeycomb agent for different applications

When the agents are deployed to the cluster, they need to be configured so that they have the resources required to access the logs of different pods (e.g., volume mounts, and so on).

This is typically done by writing a YAML file specifying (e.g.) a DaemonSet, with all the resources declared, and then applying it to the cluster with kubectl.

But this is a burden on users. Because YAML is purely declarative, the user must either template the YAML somehow, or else write a completely different YAML file for each configuration of the agent.

To reduce this burden, the Honeycomb agent uses ksonnet.

Let's look at an example of how the base Honeycomb agent DaemonSet can be customized to read the container logs for different pods. In honeycomb-agent-ds-app.jsonnet, we see code like the following:

local honeycomb = import "honeycomb-agent-ds-base.libsonnet";
local custom = import "honeycomb-agent-ds-custom.libsonnet";

// Import the Honeycomb agent DaemonSet and append the log volumes to it.
// The output of this is equivalent to `honeycomb-agent-ds-custom.json`.
honeycomb.base("honeycomb-agent-v1.1", "kube-system") +
custom.daemonSet.addHostMountedPodLogs("varlog", "varlibdockercontainers")

One way to read this code is: "take honeycomb.base, the default DaemonSet for the Honeycomb agent, and use addHostMountedPodLogs to mount a pod's container logs inside all the containers in that DaemonSet."

Breaking it down, this code is doing three important things:

  1. It imports honeycomb.base, which defines the most basic Honeycomb agent DaemonSet.
  2. It imports custom.daemonSet.addHostMountedPodLogs, which mounts the path to the pod container logs into a (customizable) subset of the containers in a DaemonSet.
  3. It combines these two things with the Jsonnet + ("mixin") operator.

As we can see, this approach lets users keep the "base" logic that all configurations have in common in one place, and customize it as necessary.
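
Other customizations compose the same way. For instance, the containerSelector parameter (shown in the next section) lets you restrict the mounts to particular containers. (A sketch; the container name below is hypothetical.)

    local honeycomb = import "honeycomb-agent-ds-base.libsonnet";
    local custom = import "honeycomb-agent-ds-custom.libsonnet";

    // Only containers named "honeycomb-agent" (a hypothetical name)
    // receive the log volume mounts; other containers are left untouched.
    honeycomb.base("honeycomb-agent-v1.1", "kube-system") +
    custom.daemonSet.addHostMountedPodLogs(
      "varlog", "varlibdockercontainers",
      containerSelector=function(c) c.name == "honeycomb-agent")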

Now let's look at how addHostMountedPodLogs works.

    // addHostMountedPodLogs takes two volume names and produces a
    // mixin that will mount the Kubernetes pod logs into a set of
    // containers specified by `containerSelector`.
    addHostMountedPodLogs(
      varlogVolName, podLogVolName, containerSelector=function(c) true
    )::
      // Application logs are located on the host at `/var/log`, and
      // pod container logs at `/var/lib/docker/containers`. Define
      // volumes and mounts for these paths, so the Honeytailer can
      // access them.
      local varlogVol = volume.fromHostPath(varlogVolName, "/var/log");
      local varlogMount =
        volumeMount.new(varlogVol.name, varlogVol.hostPath.path);
      local podLogsVol =
        volume.fromHostPath(
          podLogVolName,
          "/var/lib/docker/containers");
      local podLogMount =
        volumeMount.new(podLogsVol.name, podLogsVol.hostPath.path, true);

      // Add volume to DaemonSet, and attach mounts to every
      // container for which `containerSelector` is true.
      ds.mixin.spec.template.spec.volumes([varlogVol, podLogsVol]) +

      // Add the volume mounts to every container for which `containerSelector` is true.
      ds.mapContainers(
        function (c)
          if containerSelector(c)
          then c + container.volumeMounts([varlogMount, podLogMount])
          else c),
  }

This code is a bit more complicated, but it's essentially doing four things:

  1. Creating volumes that expose /var/log and /var/lib/docker/containers from the host. These directories contain logs for applications and pod containers. Because they are exposed from the host, the Honeycomb agent can simply read from those files to get logs for a pod.

    local varlogVol = volume.fromHostPath(varlogVolName, "/var/log");
    local podLogsVol =
      volume.fromHostPath(
        podLogVolName,
        "/var/lib/docker/containers");
  2. Creating volume mounts for both of those volumes, which can be embedded in the Honeycomb agent container.

    local varlogMount =
      volumeMount.new(varlogVol.name, varlogVol.hostPath.path);
    local podLogMount =
      volumeMount.new(podLogsVol.name, podLogsVol.hostPath.path, true);
  3. Adding the volumes to the Honeycomb agent pod.

    ds.mixin.spec.template.spec.volumes([varlogVol, podLogsVol]) +
  4. Adding these volume mounts to every container in the Honeycomb agent pod for which containerSelector returns true (by default, all of them).

    ds.mapContainers(
      function (c)
        if containerSelector(c)
        then c + container.volumeMounts([varlogMount, podLogMount])
        else c),
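
To use the customized DaemonSet, compile the Jsonnet to JSON and apply it with kubectl. (A minimal sketch; it assumes the jsonnet CLI is installed and that the libraries imported above are found via the -J search path -- drop the flag if they sit next to the .jsonnet file. The output file name is arbitrary.)

    # Compile the customized DaemonSet to JSON...
    jsonnet -J lib honeycomb-agent-ds-app.jsonnet > honeycomb-agent-ds-app.json

    # ...and apply it to the cluster.
    kubectl apply -f honeycomb-agent-ds-app.json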

Limitations

  • The Honeycomb agent can currently read container logs only from stdout, but eventually we will expand this to support reading from (e.g.) arbitrary files on a persistent volume.