Add exostellar infrastructure optimizer playbook
Configure XIO.
Resolves #226
Showing 33 changed files with 1,989 additions and 3 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,33 @@ | ||
# Exostellar Infrastructure Optimizer | ||
|
||
[Exostellar Infrastructure Optimizer](https://exostellar.io/infrastructureoptimizer-technical-information/) (XIO) runs applications in virtual machines (VMs) on EC2 instances and dynamically relocates the VMs between instances based on availability and cost. | ||
Long-running, stateful jobs cannot normally be run on spot instances because they can't be restarted after a spot termination. | ||
XIO reduces this risk by predicting spot terminations and dynamically relocating the VM to an on-demand instance. | ||
When spot capacity becomes available again, the VM can be migrated back to a spot instance. | ||
This allows you to save up to 90% over on-demand pricing by running on spot when capacity is available. | ||
|
||
## XIO Configuration | ||
|
||
Refer to [Exostellar's documentation](https://docs.exostellar.io/latest/Latest/HPC-User/getting-started-installation) to make sure you have the latest instructions. | ||
|
||
### Create IAM permissions stack | ||
|
||
[Create the EC2 instance profiles](https://docs.exostellar.io/latest/Latest/HPC-User/getting-ready-prerequisites#GettingReady:Prerequisites-EC2InstanceProfiles). | ||
|
||
* Download the CloudFormation template | ||
* Create a stack using the template | ||
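
The two steps above can also be done from the command line. The following is a minimal sketch, not taken from Exostellar's documentation: it assumes the downloaded template was saved as `xio-iam.yaml` and that the stack is called `xio-iam` (both names are hypothetical), and that the template creates IAM resources, which is why `CAPABILITY_NAMED_IAM` is passed. By default it only prints the command.

```shell
# Sketch only: TEMPLATE_FILE and STACK_NAME are hypothetical placeholders,
# not names taken from Exostellar's documentation.
TEMPLATE_FILE=${TEMPLATE_FILE:-xio-iam.yaml}
STACK_NAME=${STACK_NAME:-xio-iam}

create_iam_stack() {
    # Build the command in the function's positional parameters so that
    # arguments containing spaces stay intact.
    set -- aws cloudformation create-stack \
        --stack-name "$STACK_NAME" \
        --template-body "file://$TEMPLATE_FILE" \
        --capabilities CAPABILITY_NAMED_IAM
    if [ "${DRY_RUN:-1}" = "1" ]; then
        # Default: print the command instead of running it.
        echo "$*"
    else
        "$@"
    fi
}

create_iam_stack
```

Set `DRY_RUN=0` to actually create the stack once the printed command looks right for your account and region.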
|
||
### Install the Management Server | ||
|
||
[Install the management server](https://docs.exostellar.io/latest/Latest/HPC-User/installing-management-server). | ||
|
||
For the shared security group ID, use the value of SlurmLoginNodeSGId so that the management server can reach the Slurm head node. | ||
|
||
### Configure Slurm | ||
|
||
``` | ||
export MGMT_SERVER=10.4.130.5 | ||
export SLURM_CONF_DIR=/opt/slurm/res-eda-pc-3-10-1-rhel8-x86/etc | ||
"I2Nsb3VkLWNvbmZpZwpydW5jbWQ6CiAgLSBbc2gsIC1jLCAibWtkaXIgLXAgL3hjb21wdXRlIl0KICAtIFtzaCwgLWMsICJtb3VudCAxNzIuMzEuMjQuNToveGNvbXB1dGUgL3hjb21wdXRlIl0KICAtIFtzaCwgLWMsICJta2RpciAtcCAvaG9tZS9zbHVybSJdCiAgLSBbc2gsIC1jLCAibW91bnQgMTcyLjMxLjI0LjU6L2hvbWUvc2x1cm0gL2hvbWUvc2x1cm0iXQogIC0gW3NoLCAtYywgInJtIC1yZiAvZXRjL3NsdXJtIl0KICAtIFtzaCwgLWMsICJsbiAtcyAveGNvbXB1dGUvc2x1cm0vIC9ldGMvc2x1cm0iXQogIC0gW3NoLCAtYywgImNwIC94Y29tcHV0ZS9zbHVybS9tdW5nZS5rZXkgL2V0Yy9tdW5nZS9tdW5nZS5rZXkiXQogIC0gW3NoLCAtYywgInN5c3RlbWN0bCByZXN0YXJ0IG11bmdlIl0KICAjIEFMV0FZUyBMQVNUIQogIC0gWwogICAgICBzaCwKICAgICAgLWMsCiAgICAgICJlY2hvIFhTUE9UX05PREVOQU1FID4gL3Zhci9ydW4vbm9kZW5hbWU7IHNjb250cm9sIHVwZGF0ZSBub2RlbmFtZT1YU1BPVF9OT0RFTkFNRSBub2RlYWRkcj1gaG9zdG5hbWUgLUlgIiwKICAgIF0KCg==" | ||
``` |
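
The quoted string in the block above is Base64-encoded. Its first characters decode to `#cloud-config`, so it appears to be cloud-init user data. A minimal sketch for inspecting it, assuming the GNU coreutils `base64` command is available; `USER_DATA_B64` and `decode_user_data` are hypothetical names, and only the first 20 characters of the string are used here to keep the example short.

```shell
# First 20 characters of the encoded string shown above; paste the full
# string to see the whole document. USER_DATA_B64 is a hypothetical name.
USER_DATA_B64="I2Nsb3VkLWNvbmZpZwpy"

decode_user_data() {
    # base64 -d (GNU coreutils) writes the decoded bytes to stdout.
    printf '%s' "$1" | base64 -d
}

# Print only the first decoded line.
decode_user_data "$USER_DATA_B64" | head -n 1
```

Decoding before use is a cheap way to check which mount addresses and paths are baked into the user data, since they are not visible in the encoded form.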
41 changes: 41 additions & 0 deletions
source/resources/parallel-cluster/config/bin/xio-compute-node-ami-configure.sh
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,41 @@ | ||
#!/bin/bash -ex | ||
# Copyright Amazon.com, Inc. or its affiliates. All Rights Reserved. | ||
# SPDX-License-Identifier: MIT-0 | ||
|
||
# This script configures an instance so that it can be used to create an AMI to be used by | ||
# Exostellar Infrastructure Optimizer. | ||
# The instance should be launched using a plain RHEL AMI. | ||
|
||
script=$0 | ||
script_name=$(basename "$script") | ||
|
||
# Jinja2 template variables | ||
assets_bucket={{assets_bucket}} | ||
assets_base_key={{assets_base_key}} | ||
export AWS_DEFAULT_REGION={{Region}} | ||
ClusterName={{ClusterName}} | ||
ErrorSnsTopicArn={{ErrorSnsTopicArn}} | ||
playbooks_s3_url={{playbooks_s3_url}} | ||
|
||
# Redirect all IO to /var/log/messages and then echo to stderr | ||
exec 1> >(logger -s -t xio-compute-node-ami-configure.sh) 2>&1 | ||
|
||
# Install ansible | ||
if ! yum list installed ansible &> /dev/null; then | ||
    yum install -y ansible || amazon-linux-extras install -y ansible2 | ||
fi | ||
ansible-galaxy collection install ansible.posix | ||
|
||
config_dir=/opt/slurm/config | ||
config_bin_dir=$config_dir/bin | ||
ANSIBLE_PATH=$config_dir/ansible | ||
PLAYBOOKS_PATH=$ANSIBLE_PATH/playbooks | ||
PLAYBOOKS_ZIP_PATH=$ANSIBLE_PATH/playbooks.zip | ||
|
||
pushd $PLAYBOOKS_PATH | ||
|
||
ansible-playbook $PLAYBOOKS_PATH/XioComputeNodeAmi.yml \ | ||
    -i inventories/local.yml \ | ||
    -e @$ANSIBLE_PATH/ansible_head_node_vars.yml | ||
|
||
popd |
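
The variables near the top of the script are filled in by Jinja2, so an unrendered copy would pass literal `{{...}}` strings to later commands. The following guard is a minimal sketch of one way to catch that early. It is not part of the commit, and `require_rendered` is a hypothetical helper name.

```shell
# Hypothetical helper: fail early if a value still holds a Jinja2 placeholder.
require_rendered() {
    # $1 = variable name (used in the message), $2 = its value
    case "$2" in
        *'{{'*'}}'*)
            echo "ERROR: $1 was not rendered: $2" >&2
            return 1
            ;;
    esac
    return 0
}

# Example values standing in for a rendered and an unrendered variable.
require_rendered ClusterName "my-cluster" && echo "ClusterName ok"
require_rendered assets_bucket "{{assets_bucket}}" || echo "assets_bucket needs rendering"
```

In a script run with `-e`, calling the helper without the `||` fallback would stop execution at the first unrendered variable.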
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,8 @@ | ||
--- | ||
- name: Configure instance for Exostellar Infrastructure Optimizer AMI creation | ||
  hosts: XioComnputeNodeAmi | ||
  become_user: root | ||
  become: yes | ||
  roles: | ||
    - eda_tools | ||
    - XioComputeNodeAmi |
13 changes: 13 additions & 0 deletions
source/resources/playbooks/roles/XioComputeNodeAmi/README.md
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,13 @@ | ||
XioComputeNodeAmi | ||
========= | ||
|
||
Configure an instance to create an AMI to be used by Exostellar Infrastructure Optimizer (XIO). | ||
The instance should be launched from a base RHEL AMI, not a ParallelCluster AMI. | ||
|
||
* Mount /opt/slurm in /etc/fstab | ||
* Install required packages | ||
* Configure munge | ||
* Configure slurmd | ||
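
The first step in the list above, adding `/opt/slurm` to `/etc/fstab`, could look roughly like the sketch below. The role's actual Ansible tasks are not shown in this diff, so this is only an illustration in shell: the NFS server address `10.0.0.10`, the export path, and the mount options are all hypothetical, as is the `add_fstab_entry` helper.

```shell
# Hypothetical values: replace with the head node's address and export.
NFS_SOURCE=${NFS_SOURCE:-10.0.0.10:/opt/slurm}
MOUNT_POINT=/opt/slurm
FSTAB=${FSTAB:-/etc/fstab}

add_fstab_entry() {
    # Append the entry only if the mount point is not already listed,
    # so the function is safe to run more than once.
    if ! grep -qs "[[:space:]]$MOUNT_POINT[[:space:]]" "$FSTAB"; then
        printf '%s %s nfs defaults,_netdev 0 0\n' "$NFS_SOURCE" "$MOUNT_POINT" >> "$FSTAB"
    fi
}

# Usage (writes to $FSTAB, so point FSTAB at a scratch file to try it out):
#   FSTAB=/tmp/fstab.test; add_fstab_entry
```

The existence check keeps the step idempotent, which matters when the configuration is re-run against the same instance before the AMI is created.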
|
||
Requirements | ||
------------ |