Commit
feat: wrap images with figure
Signed-off-by: Yue Yang <[email protected]>
g1eny0ung committed Sep 11, 2024
1 parent 23120b6 commit faf29aa
Showing 13 changed files with 19 additions and 64 deletions.
6 changes: 0 additions & 6 deletions blog/2020-01-15-chaos-mesh-your-chaos-engineering-solution.md
@@ -28,8 +28,6 @@ Here is an example of how we use Chaos Mesh to locate a TiDB system bug. In this

![Chaos Mesh discovers downtime recovery exceptions in TiKV](/img/blog/chaos-mesh-discovers-downtime-recovery-exceptions-in-tikv.png)

<div className="caption"> Chaos Mesh discovers downtime recovery exceptions in TiKV</div>

As you can see from the dashboard:

- During the first two downtimes, the QPS returns to normal after about 1 minute.
@@ -126,8 +124,6 @@ With the CRD design settled, let's look at the big picture on how Chaos Mesh wor

![Chaos Mesh workflow](/img/blog/chaos-mesh-workflow.png)

<div className="caption"> Chaos Mesh workflow </div>

Here is how these components streamline a chaos experiment:

1. Using a YAML file or Kubernetes client, the user creates or updates chaos objects to the Kubernetes API server.
@@ -215,8 +211,6 @@ The following chaos experiment simulates the TiKV Pods being frequently killed i

![Chaos experiment running](/img/blog/chaos-experiment-running.gif)

<div className="caption"> Chaos experiment running </div>

We use a sysbench program to monitor the real-time QPS changes in the TiDB cluster. When errors are injected into the cluster, the QPS show a drastic jitter, which means a specific TiKV Pod has been deleted, and Kubernetes then re-creates a new TiKV Pod.

For more YAML file examples, see https://github.com/chaos-mesh/chaos-mesh/tree/master/examples.
6 changes: 0 additions & 6 deletions blog/2020-03-18-run-your-first-chaos-experiment.md
@@ -24,8 +24,6 @@ The following clip shows the process of installing Chaos Mesh, deploying web-sho

![The whole process of the chaos experiment](/img/blog/whole-process-of-chaos-experiment.gif)

<div className="caption"> The whole process of the chaos experiment </div>

Now it's your turn! It's time to get your hands dirty.

## Let's get started!
@@ -147,8 +145,6 @@ To start NetworkChaos, do the following:

![Using Chaos Mesh to insert delays in web-show](/img/blog/using-chaos-mesh-to-insert-delays-in-web-show.png)

<div className="caption"> Using Chaos Mesh to insert delays in web-show </div>

Congratulations! You just stirred up a little bit of chaos. If you are intrigued and want to try out more chaos experiments with Chaos Mesh, check out [examples/web-show](https://github.com/chaos-mesh/chaos-mesh/tree/master/examples/web-show).

### Delete the chaos experiment
@@ -168,8 +164,6 @@ From the line graph, you can see the network latency level is back to normal.

![Network latency level is back to normal](/img/blog/network-latency-level-is-back-to-normal.png)

<div className="caption"> Network latency level is back to normal </div>

### Delete Kubernetes clusters

After you're done with the chaos experiment, execute the following command to delete the Kubernetes clusters:
@@ -85,8 +85,6 @@ We can see that the whole vDSO is like a `.so` file, so we can use an executable

![TimeChaos workflow](/img/blog/timechaos-workflow.jpg)

<div className="caption"> TimeChaos workflow </div>

The chart above is the process of **TimeChaos**, an implementation of clock skew in Chaos Mesh.

1. Use ptrace to attach the specified PID process to stop the current process.
@@ -169,8 +167,6 @@ That's encouraging. But does TimeChaos affect services other than PD? We can che

![Chaos Dashboard](/img/blog/chaos-dashboard.jpg)

<div className="caption"> Chaos Dashboard </div>

It's clear that in the monitor, TimeChaos was injected every 1 millisecond and the whole duration lasted 10 seconds. What's more, TiDB was not affected by that injection. The bank program ran normally, and performance was not affected.

## Try out Chaos Mesh
@@ -47,8 +47,6 @@ The current Chaos Mesh architecture is suited for individual Kubernetes clusters

![Chaos Mesh architecture](/img/blog/chaos-mesh-remake-architecture.jpeg)

<p className="caption">The current Chaos Mesh architecture</p>

During this refactor, **to allow Chaos Dashboard to manage multiple Kubernetes clusters, we separate Chaos Dashboard from the main architecture**. Now, if you deploy Chaos Dashboard outside of the Kubernetes cluster, you can add the cluster to Chaos Dashboard via the web UI. If you deploy Chaos Dashboard inside the cluster, it automatically obtains the cluster information through environment variables.

You can register Chaos Mesh (technically, the Kubernetes configuration) in Chaos Dashboard or ask `chaos-controller-manager` to report to Chaos Dashboard via configuration. Chaos Dashboard and `chaos-controller-manager` interact via CustomResourceDefinitions (CRDs). When `chaos-controller-manager` finds a Chaos Mesh CRD event, it invokes `chaos-daemon` to carry out the related chaos experiment. Therefore, Chaos Dashboard can manage experiments by operating on CRDs.
@@ -59,8 +57,6 @@ chaosd is a toolkit for running chaos experiments on physical machines. Previous

![chaosd, a Chaos Engineering command line tool](/img/blog/chaosd-chaos-engineering-command-line-tool.jpeg)

<p className="caption">Previously, chaosd was a command line tool</p>

During the refactoring, **we enabled chaosd to support the RESTful API and enhanced its services so that it can configure chaos experiments by parsing CRD-format JSON or YAML files**.

Now, chaosd can register itself to Chaos Dashboard via configuration and send regular heartbeats to Chaos Dashboard. With the heartbeat signals, Chaos Dashboard can manage the chaosd node status. You can also add chaosd nodes to Chaos Dashboard via the web UI.
@@ -71,8 +67,6 @@ With new Chaos Dashboard and chaosd, the optimized architecture of Chaos Mesh is

![Chaos Mesh's optimized architecture](/img/blog/chaos-mesh-optimized-architecture.jpeg)

<p className="caption">Chaos Mesh's optimized architecture</p>

### Improve observability

Another improvement is observability, namely how to tell if an experiment is carried out successfully.
@@ -106,8 +100,6 @@ A closed loop of Chaos Engineering includes four steps: exploring chaos, discove

![A closed loop of Chaos Engineering](/img/blog/closed-loop-of-chaos-engineering.jpeg)

<p className="caption">A closed loop of Chaos Engineering</p>

However, **most of the current open source Chaos Engineering tools only focus on exploration and do not provide pragmatic feedback.** Based on the improved observability component, we can monitor chaos experiments in real time and compare and analyze the experiment results.

With these results, we will be able to realize a closed loop by adding another important component: orchestration. The Chaos Mesh community already proposed a [Workflow](https://github.com/chaos-mesh/rfcs/pull/10/files) feature, which enables you to easily orchestrate and call back chaos experiments or conveniently integrate Chaos Mesh with other systems. You can run chaos experiments in the CI/CD phase or after a canary release.
2 changes: 0 additions & 2 deletions blog/2021-07-09-chaos-mesh-q&a.md
@@ -22,8 +22,6 @@ Big thanks to the more than 200 of you who joined us! We received so many great

![Project Architecture](/img/blog/chaos-mesh-linkerd-architecture.png)

<p className="caption">Project Architecture</p>

**Q: Can I use Chaos Mesh on-premises or do I need Amazon Web Services (AWS) or Google Cloud Platform (GCP)?**

**A:** You can do either! You can deploy Chaos Mesh on your Kubernetes cluster, so it does not matter whether you manage it yourself or have it hosted on AWS or GCP. However, if you would like to use it in a Kubernetes environment, you need to [set relevant parameters](https://chaos-mesh-website-archived.netlify.app/docs/1.2.4/user_guides/installation) during installation.
2 changes: 0 additions & 2 deletions blog/2021-08-05-chaos-mesh-celebrates-100-contributors.md
@@ -18,8 +18,6 @@ So far, Chaos Mesh has brought out 35 releases, received 1,500+ commits from 100

![Chaos Mesh contributors](/img/blog/chaos-mesh-all-contributors.jpeg)

<p className="caption">Chaos Mesh contributors (as of 2021.08.02)</p>

Here are a few of our favourite contributions to highlight:

- [@YangKeao](https://github.com/YangKeao) introduced `kubebuilder` to Chaos Mesh, an SDK for building Kubernetes APIs using CRD, which simplified the steps to implement the Controller.
6 changes: 0 additions & 6 deletions blog/2021-08-20-chaos-mesh-apisix.md
@@ -16,8 +16,6 @@ As our community grows, Apache APISIX's features more frequently interact with e

![Apache APISIX architecture](/img/blog/apache-apisix-architecture.jpg)

<p className="caption"> Apache APISIX architecture </p>

In this post, we'll share how we use [Chaos Mesh](https://chaos-mesh.org/) to improve our system stability.

## Our pain points
@@ -55,8 +53,6 @@ We deployed a Chaos Engineering experiment using the following steps:

![High network latency occurs between etcd and Apache APISIX](/img/blog/high-network-latency-between-etcd-and-apache-apisix.jpg)

<p className="caption"> High network latency occurs between etcd and Apache APISIX </p>

### Scenario #2

After we conducted the same experiment as above in the control group, we introduced pod-kill chaos and reproduced the expected error. When we randomly deleted a small number of etcd nodes in the cluster, sometimes APISIX could connect to etcd and sometimes not, and the log printed a large number of connection rejection errors.
@@ -69,8 +65,6 @@ After we fixed this problem, we added a health check to the etcd Lua API to ensu

![Error Reported from etcd Node Interaction](/img/blog/error-reported-from-etcd-node-interaction.jpg)

<p className="caption"> An error is reported from one etcd node's interaction with the Apache APISIX admin API </p>

## Our future plans

### Run a chaos test in E2E simulation scenarios
@@ -42,8 +42,6 @@ IEG officially launched its chaos engineering project over a year ago. We wanted

![A comparison of chaos engineering tools](/img/blog/comparison-of-chaos-engineering-tools.png)

<p className="caption"> A comparison of chaos engineering tools </p>

> Note: This comparison is outdated and is intended simply to compare fault injection features supported by Chaos Mesh with other well-known chaos engineering platforms. It is not intended to favor or position one project over another. Any corrections are welcome.

## Build a chaos testing platform
@@ -52,8 +50,6 @@ Our chaos engineering team embedded Chaos Mesh into our continuous integration a

![Chaos Mesh embedded in IEG's operation platform](/img/blog/chaos-mesh-embedded-in-IEG's-operation-platform.png)

<p className="caption">Chaos Mesh embedded in IEG's operation platform</p>

In IEG, **chaos engineering is generally summarized as a closed loop with several key phases**:

- Improve overall system resilience.
@@ -78,8 +74,6 @@ In IEG, **chaos engineering is generally summarized as a closed loop with severa

![Five phases of chaos engineering in IEG](/img/blog/five-phases-of-chaos-engineering-in-IEG.png)

<p className="caption">Five phases of chaos engineering in IEG</p>

We frequently **test the performance of services under high CPU usage**, for example. We begin by orchestrating and scheduling experiments. Following that, we run experiments and monitor the performance of related services. Multiple monitoring metrics, such as QPS, latency, response success, are immediately visible through the operation platform. The platform then generates reports for us to review, so we can check whether these experiments met our expectations.

## Use cases
Expand All @@ -96,8 +90,6 @@ Understandably, our team members grew bored of regular chaos experiments. After

![The red teaming process in IEG](/img/blog/red-teaming-process-in-IEG.png)

<p className="caption">The red teaming process in IEG</p>

### Dependency analysis

It’s important to manage dependencies for microservices. In our case, non-core services cannot be the bottleneck for core services. Fortunately, with chaos engineering, we can run dependency analysis simply by injecting faults into called services and observing how badly the main service is affected. Based on the results, we can optimize the service calling chain in a specific scenario.
@@ -114,6 +106,4 @@ Gone are the days when performing fault injection requires a handwritten script,

![Chaos engineering with DevOps ensures efficient fault injection](/img/blog/chaos-engineering-with-devops.png)

<p className="caption">Chaos engineering with DevOps ensures efficient fault injection</p>

Thanks to full-featured chaos engineering tools and streamlined DevOps processes, we estimate that the efficiency of fault injection and chaos-based optimization at IEG has been improved at least by 10 times in the last six months. If you were unsure about implementing chaos engineering in your business, I hope our experience can be of some help.
2 changes: 0 additions & 2 deletions blog/2021-09-15-run-chaos-experiments-on-physical-machines.md
@@ -106,8 +106,6 @@ We will continue to enhance its usability and implement more functionalities suc

![Chaos Mesh's optimized architecture](/img/blog/chaos-mesh-optimized-architecture.png)

<p className="caption">Chaos Mesh's optimized architecture</p>

For more, check out [Chaos Mesh's optimized architecture](https://pingcap.com/blog/chaos-mesh-remake-one-step-closer-toward-chaos-as-a-service#developing-chaos-mesh-towards-caas).

### Add more fault injection types
6 changes: 0 additions & 6 deletions blog/2021-12-10-implement-chaos-engineering-in-k8s.md
@@ -552,8 +552,6 @@ As shown in the Chaos Mesh workflow below, we need to implement a server that se

![Chaos Mesh's basic workflow](/img/blog/chaos-mesh-basic-workflow.png)

<p className="caption">Chaos Mesh's basic workflow</p>

Let's take a look at the example on the Chaos Mesh website:

```go
@@ -741,8 +739,6 @@ This example uses the manager. This mode prevents the cache mechanism from repet

![List request](/img/blog/list-request.png)

<p className="caption">List request</p>

### Orchestrate chaos

The container runtime interface (CRI) container runtime provides strong underlying isolation capabilities that can support the stable operation of the container. But for more complex and scalable scenarios, container orchestration is required. Chaos Mesh also provides [`Schedule`](https://chaos-mesh.org/docs/define-scheduling-rules/) and [`Workflow`](https://chaos-mesh.org/docs/create-chaos-mesh-workflow/) features. Based on the set `Cron` time, `Schedule` can trigger faults regularly and at intervals. `Workflow` can schedule multiple fault tests like Argo Workflows.
@@ -755,8 +751,6 @@ The following figure shows Chaos Mesh Dashboard. We need to consider what featur

![Chaos Mesh Dashboard](/img/blog/chaos-mesh-dashboard-k8s.png)

<p className="caption">Chaos Mesh Dashboard</p>

From the Dashboard, we know that the platform may have these features:

- Chaos injection
6 changes: 0 additions & 6 deletions blog/2022-01-11-develop-a-daily-reporting-system.md
@@ -93,8 +93,6 @@ For plotting, I used [gnuplot](http://www.gnuplot.info/), a Linux command-line g

![QPS line graph](/img/blog/qps-line-graph.png)

<p class="caption">QPS line graph</p>

### Generate the report in PDF

Currently, there is no available API for generating Chaos Mesh reports or analyzing results. I decided to generate the report in PDF format so it would be readable on different browsers. In my case, I used [gopdf](https://github.com/signintech/gopdf), a support library that allows users to create PDF files. It also lets me insert images or draw tables, which meets my needs.
@@ -113,14 +111,10 @@ Below is an example of a web application that I developed for daily reporting. T

![Web application for daily reporting](/img/blog/web-app-for-daily-reporting.png)

<p class="caption">Web application for daily reporting</p>

Clicking the red card will open the report, as shown below. I used [pdf.js](https://github.com/mozilla/pdf.js) to view the PDF.

![Daily report in PDF](/img/blog/daily-report-pdf.png)

<p class="caption">Daily report in PDF</p>

## Summary

Chaos Mesh enables you to simulate faults that most cloud-native applications might encounter. In this article, I created a PodChaos experiment and observed that QPS in the TiDB cluster was affected when the Pod became unavailable. After analyzing the logs, I can enhance the robustness and high availability of the system. I built a web application to generate daily reports for troubleshooting and debugging. You can also customize the reports to meet your own requirements.
6 changes: 0 additions & 6 deletions src/styles/custom.css
@@ -70,12 +70,6 @@
  text-decoration-thickness: 2px;
}

.caption {
  color: var(--ifm-color-emphasis-700);
  font-style: italic;
  text-align: center;
}

/* Integrate tailwindcss. */
@tailwind base;
@tailwind components;
19 changes: 19 additions & 0 deletions src/theme/MDXComponents/Img/index.js
@@ -0,0 +1,19 @@
import Img from '@theme-original/MDXComponents/Img'
import React from 'react'

export default function ImgWrapper(props) {
  return (
    <figure>
      <Img {...props} />
      <figcaption
        className="text--italic text--center"
        style={{
          color: 'var(--ifm-color-content-secondary)',
          fontSize: '0.875rem',
        }}
      >
        {props.alt}
      </figcaption>
    </figure>
  )
}
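
Reviewer note: the effect of `ImgWrapper` can be approximated outside React. The sketch below is not part of the commit. `renderFigure` and `escapeHtml` are hypothetical helpers that turn a Markdown image into roughly the markup the component is expected to emit. It assumes the alt text is reused as the caption, as `{props.alt}` suggests, and omits the inline `style` for brevity.

```javascript
// Hypothetical illustration only; neither helper exists in the repository.

// Escape characters that are unsafe in HTML text or attribute values.
function escapeHtml(text) {
  return String(text)
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;')
}

// Convert a single Markdown image (`![alt](src)`) into figure markup.
// Input that is not a bare Markdown image is returned unchanged.
function renderFigure(markdownImage) {
  const match = /^!\[([^\]]*)\]\(([^)\s]+)\)$/.exec(markdownImage.trim())
  if (!match) return markdownImage
  const [, alt, src] = match
  return [
    '<figure>',
    `  <img src="${escapeHtml(src)}" alt="${escapeHtml(alt)}" />`,
    // The caption reuses the alt text, mirroring `{props.alt}` in ImgWrapper.
    `  <figcaption class="text--italic text--center">${escapeHtml(alt)}</figcaption>`,
    '</figure>',
  ].join('\n')
}

console.log(renderFigure('![Chaos Mesh workflow](/img/blog/chaos-mesh-workflow.png)'))
```

Since the caption now comes from the alt text, the hand-written `<div className="caption">` and `<p className="caption">` lines removed in the blog posts above become redundant. This is presumably also why the `.caption` rule is dropped from `custom.css`.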
