Ace is a bounded, exhaustive workload generator for POSIX file systems. A workload is simply a sequence of file-system operations. Ace comprises two main components:
- High-level workload generator : This is responsible for exhaustively generating workloads within the defined bounds. The generated workloads are represented in a high-level language which resembles the one below.
mkdir B 0777
open Bfoo O_RDWR|O_CREAT 0777
fsync Bfoo
checkpoint 1
close Bfoo
- Workload Adapter : Workloads represented in the high-level language have to be converted into executables that can be run and verified. We support the following two formats:
- CrashMonkey : The CrashMonkey adapter translates the high-level language into a format that CrashMonkey understands. For example, the above workload is converted to the run method of CrashMonkey as follows.
virtual int run( int checkpoint ) override {
    test_path = mnt_dir_ ;
    B_path = mnt_dir_ + "/B";
    Bfoo_path = mnt_dir_ + "/B/foo";
    int local_checkpoint = 0 ;

    if ( mkdir(B_path.c_str() , 0777) < 0){
        return errno;
    }

    int fd_Bfoo = cm_->CmOpen(Bfoo_path.c_str() , O_RDWR|O_CREAT , 0777);
    if ( fd_Bfoo < 0 ) {
        cm_->CmClose( fd_Bfoo);
        return errno;
    }

    if ( cm_->CmFsync( fd_Bfoo) < 0){
        return errno;
    }

    if ( cm_->CmCheckpoint() < 0){
        return -1;
    }
    local_checkpoint += 1;
    if (local_checkpoint == checkpoint) {
        return 1;
    }

    if ( cm_->CmClose ( fd_Bfoo) < 0){
        return errno;
    }

    return 0;
}
- XFSTest : The XFSTest adapter translates the high-level language into a test file and an expected output file to be run with xfstest. For example, the above workload would be converted into the following code (excluding the xfstest initialization code and helper methods):
mkdir $SCRATCH_MNT/B -p -m 0777
touch $SCRATCH_MNT/B/foo
chmod 0777 $SCRATCH_MNT/B/foo
$XFS_IO_PROG -c "fsync" $SCRATCH_MNT/B/foo
check_consistency $SCRATCH_MNT/B/foo
clean_dir
The definitions of check_consistency and clean_dir, as well as all other helper functions, can be found here. The XFSTest adapter itself can be run with the following required arguments:
--base_file BASE_FILE, -b BASE_FILE Base test file to generate workload
--test_file TEST_FILE, -t TEST_FILE J lang test skeleton to generate workload
--target_path TARGET_PATH, -p TARGET_PATH Directory to save the generated test files
--test_number TEST_NUMBER, -n TEST_NUMBER The test number following xfstest convention.
Will generate <test_number> and <test_number>.out
--filesystem_type FILESYSTEM_TYPE, -f FILESYSTEM_TYPE The filesystem type for the test
(i.e. generic, ext4, btrfs, xfs, f2fs, etc.)
For example, running the following command:
python2 xfstestAdapter.py -b ../ace/base_xfstest.sh -t <J-LANG FILE> -p output/ -n 001 -f generic
would create output/001 and output/001.out from the given J-lang file. Note that the base file at ../ace/base_xfstest.sh should be used for every use of the XFSTest adapter.
For more information on running the adapters directly, refer to the beginning of the following files for CrashMonkey and xfstest respectively.
Ace currently generates workloads of sequence length 1, 2, and 3. We say a workload has sequence length 'x' if it has 'x' core file-system operations in it. Ace currently imposes the following bounds:
- Sequence length : Only sequences of length up to 3 have been tested so far.
- File-system operations : Ace currently supports the following system calls:
creat, mkdir, falloc, write, direct write, mmap write, link, unlink, remove, rename, fsetxattr, removexattr, truncate, fdatasync, fsync, sync, symlink
However, be mindful of the number of workloads generated at each sequence length if you decide to test this whole set of operations. As the sequence length increases, it is advisable to narrow down the operation set or increase the compute available for testing.
- File set : Ace supports only a pre-defined file set. You have the freedom to restrict it to a smaller set. Currently, we support 4 directories and 2 files within each directory.
dir : test (files: foo, bar)
       |
       |__ dir : A (files: foo, bar)
       |         |
       |         |_ dir C (files: foo, bar)
       |
       |__ dir : B (files: foo, bar)
However, by default, we only use directories A and B and the files under them. To include a level of nesting, add the files under the test directory or C to the file set.
- Persistence operations : Ace supports four types of persistence operations to be included in the workload:
sync, fsync, fdatasync, none
Since fdatasync is one of the file-system operations under test by default, it is excluded from the list of persistence operations. However, you could add it to the persistence operation set in the ace script.
- File set for persistence operations : Ideally, the file set for persistence operations should be the same as the file set defined above. However, there is no point persisting a file that was not touched by any of the system calls in the workload. Hence, we reduce the space of workloads further by restricting the file set for persistence operations:
- Range of used files : In this mode, in addition to the files/directories involved in the workload, we allow persisting sibling files and parent directory. This is the default for seq-1 and seq-2.
- Strictly used files : In this mode, only files/directories acted upon by the workload can be persisted. We default to this mode in seq-3, to restrict the workload set.
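The difference between the two modes can be illustrated with a minimal sketch. This is not Ace's implementation: the function names `strictly_used` and `range_of_used`, the `FILE_SET` list, and the choice to write paths relative to the test directory are all assumptions made for illustration.

```python
import posixpath

# Assumption: paths are relative to the test directory, default file set only.
FILE_SET = ["A", "B", "A/foo", "A/bar", "B/foo", "B/bar"]


def strictly_used(used):
    """'Strictly used files': only paths the workload acts on may be persisted."""
    return sorted(set(used))


def range_of_used(used, file_set=FILE_SET):
    """'Range of used files': used paths plus their siblings and parent directories."""
    candidates = set(used)
    for path in used:
        parent = posixpath.dirname(path)
        if parent:
            candidates.add(parent)
        # Siblings are the other entries of the file set with the same parent.
        candidates.update(
            p for p in file_set
            if p != path and posixpath.dirname(p) == parent
        )
    return sorted(candidates)


if __name__ == "__main__":
    print(strictly_used(["A/foo"]))
    print(range_of_used(["A/foo"]))
```

For a workload that only touches A/foo, the strict mode leaves a single candidate to persist, while the range mode also allows A and A/bar, which is why the strict mode yields fewer workloads.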
Generating workloads with Ace is a two-step process.
- Set the bounds that you wish to explore, using the guidelines and advice in the previous section.
- Start workload generation.
cd ace
python ace.py -l <seq_length> -n <True|False> -d <True|False>
For example, to generate seq-2 workloads with no additional nested directory, run :
python ace.py -l 2 -n False -d False
Flags:
- -l : Sequence length of the workload, i.e., the number of core file-system operations in the workload.
- -n : If True, provides an additional level of nesting to the file set. Adds a directory A/C and two files, A/C/foo and A/C/bar, to the set of files.
- -d : Demo workload. If True, restricts the workload space to two file-system operations, link and fallocate, and allows the persistence of used files only. The file set is also restricted to just foo and A/bar.
- -t : The type of test to generate. Should be one of 'crashmonkey' and 'xfstest'. If unspecified, the adapter will default to 'crashmonkey'.
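When generating several sequence lengths in a row, it can help to assemble the command line programmatically. The sketch below only builds the argument list from the flags described above; the helper name `build_ace_command` is hypothetical, and passing the booleans as the strings True/False follows the usage line rather than any knowledge of how ace.py parses them.

```python
def build_ace_command(seq_length, nested=False, demo=False, test_type=None):
    """Build the argument list for one ace.py invocation (hypothetical helper)."""
    cmd = [
        "python", "ace.py",
        "-l", str(seq_length),
        "-n", str(nested),   # rendered as "True" or "False"
        "-d", str(demo),
    ]
    if test_type is not None:
        cmd += ["-t", test_type]  # 'crashmonkey' or 'xfstest'
    return cmd


if __name__ == "__main__":
    for length in (1, 2):
        # To actually run it: subprocess.run(cmd, cwd="ace", check=True)
        print(" ".join(build_ace_command(length)))
```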
You can extend Ace to generate workloads of larger sequences, expand the set of files and directories acted upon, or support new file-system operations. Let's see what changes are required to do so.
There is nothing preventing Ace from generating workloads of length > 3. You could simply pass -l 4 to generate sequence-4 workloads. The one dependency you need to add for sequences of length 5 and above is the insertion of persistence operations. Ace currently handles length 4 as follows:
if int(num_ops) == 4:
    for i in itertools.product(SyncSetCustom, SyncSetCustom, SyncSetCustom, SyncSetNoneCustom):
        syncPermutationsCustom.append(i)
This piece of code ensures that the last persistence op is never None. We need to generalize this in Ace to handle longer sequences.
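One possible generalization, sketched under assumptions: the first `num_ops - 1` positions draw from a set that may contain none, and the last position draws from a set that does not. The function name `sync_permutations` and its parameter names are hypothetical, and the two example sets below are stand-ins for Ace's actual SyncSetCustom and SyncSetNoneCustom, whose contents are not shown here.

```python
import itertools


def sync_permutations(num_ops, sync_set, sync_set_no_none):
    """All persistence-op tuples of length num_ops whose last op is never 'none'."""
    if num_ops < 1:
        raise ValueError("num_ops must be at least 1")
    # One pool per position; only the final pool excludes 'none'.
    pools = [sync_set] * (num_ops - 1) + [sync_set_no_none]
    return list(itertools.product(*pools))


if __name__ == "__main__":
    with_none = ["sync", "fsync", "none"]   # assumed contents
    without_none = ["sync", "fsync"]        # assumed contents
    perms = sync_permutations(5, with_none, without_none)
    print(len(perms), perms[0])
```

Because the pools are built from `num_ops`, the same function covers length 4 and any longer sequence without a separate branch per length.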
Workload generation in Ace is somewhat closely coupled to the set of files and directories available. While we aim to make it general enough to handle any user-defined file, it currently requires quite a few modifications in Ace to support new files. You need to ensure that the methods SiblingOf(file) and Parent(file) return appropriate values for the new file you are adding. The functions that check dependencies are also tied to our pre-defined list of files. For example, we need to generalize checkDirDep and checkParentExistsDep to include the new file. In the future, we aim to generalize these functions to understand the layout of files and directories.
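As a rough idea of what a layout-aware version could look like, the sketch below derives parents and siblings from a directory map instead of hard-coding them per file. It is not Ace's code: `LAYOUT`, `parent_of`, and `siblings_of` are hypothetical names, and treating directories as siblings only of other directories is an assumption.

```python
import posixpath

# Assumed representation: each directory maps to the files directly inside it.
LAYOUT = {
    "test": ["foo", "bar"],
    "test/A": ["foo", "bar"],
    "test/A/C": ["foo", "bar"],
    "test/B": ["foo", "bar"],
}


def all_paths(layout):
    """Every directory and file path described by the layout."""
    paths = set(layout)
    for directory, files in layout.items():
        paths.update(posixpath.join(directory, name) for name in files)
    return paths


def parent_of(path):
    """Parent directory of path, or None for the top-level directory."""
    return posixpath.dirname(path) or None


def siblings_of(path, layout):
    """Other entries of the same kind (file or directory) under the same parent."""
    parent = posixpath.dirname(path)
    is_dir = path in layout
    return sorted(
        p for p in all_paths(layout)
        if p != path and posixpath.dirname(p) == parent and (p in layout) == is_dir
    )


if __name__ == "__main__":
    print(parent_of("test/A/C/foo"))
    print(siblings_of("test/A/foo", LAYOUT))
```

With this shape, adding a new file or directory means adding one entry to the layout map rather than editing each helper.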
Supporting new file-system operations requires additions to both Ace and the adapter. First, add the new system call to the buildTuple function, with the appropriate argument list for this operation. Next, ensure that all the dependencies arising from this system call are addressed in the satisfyDep function. Finally, add the new system call to the buildJlang function, in the format you would like to see in the high-level language test file. For the adapter, add the new system call to the insertFunctions method, and write an appropriate insert<NewSysCall> method that handles the conversion of the high-level language description to equivalent C++ code.
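To give a feel for what such a conversion method does, here is a minimal sketch modelled on the fsync translation shown in the CrashMonkey example earlier. It is not the adapter's actual code: the function names `insert_fsync` and `translate`, the `INSERTERS` table, and their signatures are hypothetical, and a method for a new system call would follow the same pattern with its own arguments.

```python
def insert_fsync(jlang_line):
    """Turn a high-level line such as 'fsync Bfoo' into C++ text (illustrative only)."""
    op, file_name = jlang_line.split()
    if op != "fsync":
        raise ValueError("expected an fsync line, got: " + jlang_line)
    # The descriptor name fd_<file> mirrors the generated run() method shown earlier.
    return (
        "if ( cm_->CmFsync( fd_{0}) < 0){{\n"
        "    return errno;\n"
        "}}\n"
    ).format(file_name)


# Hypothetical dispatch table: one entry per supported operation.
INSERTERS = {"fsync": insert_fsync}


def translate(jlang_line):
    """Pick the conversion function based on the first token of the line."""
    op = jlang_line.split()[0]
    if op not in INSERTERS:
        raise KeyError("no conversion registered for " + op)
    return INSERTERS[op](jlang_line)


if __name__ == "__main__":
    print(translate("fsync Bfoo"))
```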