This is the set of scripts that drives the Ozone data generation tool HDDS-4395 (Ozone Data Generator for Fast Scale Test).
Clone this repo locally at your cluster.
The scripts make the following assumption:
- all hosts are within the same domain
- there is a user that can ssh without password, and this user can sudo.
- one block per per
- All keys are in volume vol1, bucket bucket1, Key name in format /L1-x/L2-y/L3-z/key1234
- Edit conf.sh. Change these variables:
Variable | Description | Example |
---|---|---|
CLUSTER_DOMAIN | cluster domain | .halxg.cloudera.com |
SCM_HOST | SCM host name | vc1504 |
OM_HOSTS | OM host names | (vc1502 vc1503 vc1506) |
DN_HOSTNAME | DN host names | (vb0213 vb0214 vb0215) |
SSH_PASSWORDLESS_USER | Linux user name | systest |
OZONE_TARBALL | Ozone tar ball file name | hadoop-ozone-1.1.0-SNAPSHOT.tar.gz |
JAVA_HOME | path to Java | /usr/java/jdk1.8.0_232-cloudera/ |
DISKS_TOTAL | number of disks per DN | 3 |
DATAGEN_THREADS | number of threads in the DN data gen | 6 |
OM_DIR | OM directory | /var/lib/hadoop-ozone/om |
SCM_DIR | SCM directory | /var/lib/hadoop-ozone/scm |
DN_DIR | DN ID directory | /var/lib/hadoop-ozone/dn |
DN_DATA_DIR_TYPE | location of the cluster | ycloud |
CONTAINER_SIZE | max size of a container | 5368709120 (5GB) |
KEY_SIZE | Key file length. e.g. 300KB | 307200 |
TOTAL_CONTAINERS | total containers in the cluster | 100000000 |
TOTAL_KEYS | number of keys. e.g. 100 million | 1024 |
The tool assumes the DN volumes are mounted at directory /data/1, /data/2, /data3. If the DNs mount volumes at different path, update init_dn.sh.
for disk_id in $(seq $(( $DISKS_PER_DATAGEN * $datagen_id + 1 )) $(( $DISKS_PER_DATAGEN * ($datagen_id + 1) ))); do
paths+=("/data/${disk_id}/hadoop-ozone/datanode/data")
done
- Run ./copy_scripts.sh to copy the scripts to all nodes.
- Copy the Ozone tarball to /tmp, and run ./deploy_ozone_tarball.sh
- Run ./remote_sys_init.sh to update ulimits on all nodes.
- Run ./remote_init.sh to generate OM, SCM and DN data.
- The OM db is generated under /var/lib/hadoop-ozone/fake_om; The SCM db under /var/lib/hadoop-ozone/fake_scm. Make sure to update cluster configuration before start.
That's it!