Skip to content

UPPAAL Models for measuring the efficacy of Gossip Enabled Distributed Circuit Breaking. GEDCB detects dependency failures 30-50% faster than traditional circuit breakers.

License

Notifications You must be signed in to change notification settings

aashaypalliwar/gedcb-uppaal-models

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

10 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation


UPPAAL Models for measuring the efficacy of GEDCB

Paper: Using Gossip Enabled Distributed Circuit Breaking for Improving Resiliency of Distributed Systems

Table of Contents
  1. About The Project
  2. Getting Started
  3. Model Details
  4. Key Model Parameters
  5. Modelling Availability Scenarios
  6. Obtaining Data Points
  7. License
  8. Contact

About The Project

The circuit breaker resiliency pattern has been popularly used to improve the resiliency of distributed systems. We targetted a scope of improvement in this pattern by bringing in some modifications to the traditional circuit breaker pattern and coupling it with a gossip-based information dissemination protocol. We modelled the relevant systems using UPPAAL and provided data supporting the efficacy of our proposals. This repository holds the three UPPAAL models that were used to claim the efficacy. These proposals have been submitted and accepted for presentation at the 19th IEEE International Conference on Software Architecture (ICSA 2022) as a full technical paper.

Getting Started

Running and experimenting with the models require UPPAAL v4.1.25. The experiments were conducted on a machine using Linux (Ubuntu 20.04) operating system with the 64-bit installation of UPPAAL available on their website. The model .xml files can be launched using the UPPAAL GUI.

Model Details

To support our claims of superiority of Gossip Enabled Distributed Circuit Breaking (GEDCB) over the traditional circuit breaker pattern, we modelled three systems - a system that uses GEDCB, one that uses the traditional circuit breakers (CB) and a baseline system that uses no resiliency pattern. We can get a good idea about the efficacy of our proposals based on the relative performance of the three models.

The system models are composed from the following components:

  • Server
  • Clients (using GEDCB, traditional circuit breaker, no resiliency pattern)
  • Client Shuffler
  • Gossiper
  • Gossip Set Revisor

Systems under study:

  1. Using no resiliency pattern - One Server, Eight Clients (using no resiliency pattern), One Client Shuffler
  2. Using traditional CB - One Server, Eight Clients (using traditional circuit breaker), One Client Shuffler
  3. Using GEDCB - One Server, Eight Clients (using GEDCB), eight (corresponding) Gossipers, One Client Shuffler, One Gossip Set Reviser

Key Model Parameters

Parameter Meaning Value Used
resp_time Response Time 4
time_out Timeout Period 25
UST Unavailability Streak Time 250
window_size Window Size 10
softFT Soft Failure Threshold 2
(H)FT (Hard) Failure Threshold 6
HOFT Half-Open Failure Threshold 1
HOST Half-Open Success Threshold 2
SST Suspicion Success Threshold 2
OD Open Duration 100
s_delay Client Shuffling Period 500
delay Client Gossiper Period 4
revision_time Gossip Set Revision Period 40
MRC Maximum Request Count 500
gossip_count Number of clients gossiped to 2
infection_count Number of clients receiving set revision message 2

Modelling Availability Scenarios

Availability ASRC USC
1 500 0
0.8 167 2
0.6 84 5
0.4 39 12
0.2 16 32

Obtaining Data Points

The variables of interest are TC and exec where the former hold the number of timeouts the system encounters to successfully address MRC number of requests and the latter one holds the total time required to do the same.

We performed the following two example SMC queries on the three models:

E[<= 10500; 500](max : TC)
E[<= 10500; 500](max : exec)

The queries shown above provide the expected maximum value of the variable TC and the clock variable exec respectively within the time 10500 units evaluated over 500 simulations (use 50 simulations for a less accurate but faster result). For every given availability value configuration and the system at hand, we first run a few simulations to find the time bound by which the system halts and use this time period for the above SMC queries For example, we found that 10500 is a time bound by which all our modelled systems halted. Such simulations may be run using a query like:

simulate [<=10500; 1] {TC}

Once the system halts, the two variables of interest do not increase in value with time. Hence the queries to find the expected maximum value of the variables within this experimentally found time-bound provide the correct estimations.

To reproduce the results of the study, use the step by step procedure mentioned here

License

Distributed under the MIT License. See LICENSE.md for more information.

Contact

Aashay Palliwar - [email protected]

Srinivas Pinisetty - [email protected]

About

UPPAAL Models for measuring the efficacy of Gossip Enabled Distributed Circuit Breaking. GEDCB detects dependency failures 30-50% faster than traditional circuit breakers.

Topics

Resources

License

Stars

Watchers

Forks