Please review the introduction to image building for general information on building custom images using the Toolkit.
This module uses Packer to create an image within a Cluster Toolkit deployment. Packer operates by provisioning a short-lived VM in Google Cloud on which it executes scripts to customize the boot disk for repeated use. The VM's boot disk is specified from a source image that defaults to the HPC VM Image. This Packer "template" supports customization by the following approaches, following a recommended use pattern:
- startup-script metadata from raw string or file
- Shell scripts uploaded from the Packer execution environment to the VM
- Ansible playbooks uploaded from the Packer execution environment to the VM
They can be specified independently of one another, so that anywhere from 1 to 3 solutions can be used simultaneously. In the case that 0 scripts are supplied, the source boot disk is effectively copied to your project without customization. This can be useful in scenarios where increased control over the image maintenance lifecycle is desired or when policies restrict the use of images to internal projects.
Most customization scripts require access to resources on the public internet. This can be achieved by one of the following 2 approaches:
- Using a public IP address on the VM
  - Set var.omit_external_ip to false
- Configuring a VPC with a Cloud NAT in the region of the VM
  - Use the vpc module, which automates NAT creation
Read order of execution below for a discussion of VM customization solutions and their requirements for inbound SSH access. Environments without SSH access should use the metadata-based startup-script solution.
A simple way to enable inbound SSH access is to use the VPC module with allowed_ssh_ip_ranges set to 0.0.0.0/0.
The user or service account running Packer must have the permission to create VMs in the selected VPC network and, if use_iap is set, must have the "IAP-Secured Tunnel User" role. Recommended roles are:
roles/compute.instanceAdmin.v1
roles/iap.tunnelResourceAccessor
The service account attached to the temporary build VM created by Packer should have the ability to write Cloud Logging entries so that you may inspect and debug build logs. When using the metadata startup-script customization solution, the service account attached to the temporary build VM created by Packer must have the permission to modify its own metadata and to read from Cloud Storage buckets. Recommended roles are:
roles/compute.instanceAdmin.v1
roles/logging.logWriter
roles/monitoring.metricWriter
roles/storage.objectViewer
It is recommended to create this service account as a separate step outside a blueprint because of a known delay in the propagation of IAM bindings.
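As a minimal sketch, a pre-created service account could be attached to the build VM through service_account_email in a Packer variables file. The account name and project ID below are hypothetical placeholders, and the roles named in the comment are the ones recommended above, assumed to have been granted beforehand.

```hcl
# Hypothetical service account, created and granted roles outside the blueprint.
# Assumed roles (from the list above): roles/compute.instanceAdmin.v1,
# roles/logging.logWriter, roles/monitoring.metricWriter,
# roles/storage.objectViewer.
service_account_email = "image-builder@my-project-id.iam.gserviceaccount.com"
```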
A recommended pattern for building images with this module is to use the Terraform-based startup-script module along with this Packer custom-image module. Below you can find links to several examples of this pattern, including usage instructions.
The Image Builder blueprint demonstrates a solution that builds an image using:
- The HPC VM Image as a base upon which to customize
- A VPC network with firewall rules that allow IAP-based SSH tunnels
- A Toolkit runner that installs a custom script
Please review the examples README for usage instructions.
The startup script specified in metadata executes in parallel with the other supported methods. However, the remaining methods execute in a well-defined order relative to one another.
- All shell scripts will execute in the configured order
- After shell scripts complete, all Ansible playbooks will execute in the configured order
NOTE: if both startup_script and startup_script_file are specified, then startup_script_file takes precedence.
Because the metadata startup script executes in parallel with the other solutions, conflicts can arise, especially when package managers (yum or apt) lock their databases during package installation. Therefore, it is recommended to choose one of the following approaches:
- Specify either startup_script or startup_script_file and do not specify shell_scripts or ansible_playbooks.
  - This can be especially useful in environments that restrict SSH access
- Specify any combination of shell_scripts and ansible_playbooks and do not specify startup_script or startup_script_file.
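The second approach might look like the following Packer variables file, shown as a sketch only. The script paths are hypothetical; the scripts run in the order listed, and both startup script inputs are left at their null defaults.

```hcl
# Hypothetical local scripts; uploaded over SSH and executed in list order.
shell_scripts = [
  "scripts/install-packages.sh",
  "scripts/configure-services.sh",
]

# startup_script and startup_script_file are intentionally not set, so no
# metadata startup script competes for package manager locks.
```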
If any of the startup script approaches fail by returning a code other than 0, Packer will determine that the build has failed and refuse to save the image.
The shell scripts and Ansible playbooks customization solutions both require SSH access to the VM from the Packer execution environment. SSH access can be enabled in one of 2 ways:
- The VM is created without a public IP address and SSH tunnels are created using Identity-Aware Proxy (IAP).
  - Allow use_iap to take on its default value of true
- The VM is created with an IP address on the public internet and firewall rules allow SSH access from the Packer execution environment.
  - Set omit_external_ip = false (or omit_external_ip: false in a blueprint)
  - Add firewall rules that open SSH to the VM
The Packer template defaults to the first, IAP-based solution because it is more secure (no exposure to the public internet) and because the Toolkit VPC module automatically sets up all necessary firewall rules for SSH tunneling and outbound-only access to the internet through Cloud NAT.
In either SSH solution, customization scripts should be supplied as files in the shell_scripts and ansible_playbooks settings.
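As an illustration of the public IP option, a variables file could turn off the default omission of the external IP and supply a script file. This is a sketch under assumptions: the script path is a hypothetical placeholder, and a firewall rule opening SSH to the VM is assumed to exist already.

```hcl
# Give the build VM a public IP address (the default, true, omits it).
omit_external_ip = false

# Hypothetical script supplied as a file; requires SSH from the Packer host.
shell_scripts = ["scripts/customize.sh"]
```

For the IAP-based option, no network-related settings are needed in the variables file because use_iap and omit_external_ip both default to true.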
Many network environments disallow SSH access to VMs. In these environments, the metadata-based startup scripts are appropriate because they execute entirely independently of the Packer execution environment.
In this scenario, a single script should be supplied in the form of a string to the startup_script input variable. This solution integrates well with Toolkit runners. Runners operate by using a single startup script whose behavior is extended by downloading and executing a customizable set of runners from Cloud Storage at startup.
NOTE: Packer will attempt to use SSH if either shell_scripts or ansible_playbooks are set to non-empty values. Leave them at their default, empty values to ensure access by SSH is disabled.
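A variables file for an environment without SSH access could, as a rough sketch, look like this. The script path is a hypothetical placeholder; the two empty lists simply restate the documented defaults to make the intent explicit.

```hcl
# Hypothetical local script delivered through startup-script metadata.
startup_script_file = "scripts/customize.sh"

# Documented defaults, restated so that no SSH-based provisioner is configured.
shell_scripts     = []
ansible_playbooks = []
```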
The startup_script parameter accepts scripts formatted as strings. In Packer and Terraform, multi-line strings can be specified using heredoc syntax in an input Packer variables file (*.pkrvars.hcl). For example, the following snippet defines a multi-line bash script followed by an integer representing the size, in GiB, of the resulting image:
startup_script = <<-EOT
#!/bin/bash
yum install -y epel-release
yum install -y jq
EOT
disk_size = 100
In a blueprint, the equivalent syntax is:
...
settings:
  startup_script: |
    #!/bin/bash
    yum install -y epel-release
    yum install -y jq
  disk_size: 100
...
When using startup script customization, Packer will print very limited output to the console. For example:
==> example.googlecompute.toolkit_image: Waiting for any running startup script to finish...
==> example.googlecompute.toolkit_image: Startup script not finished yet. Waiting...
==> example.googlecompute.toolkit_image: Startup script not finished yet. Waiting...
==> example.googlecompute.toolkit_image: Startup script, if any, has finished running.
Using the default value for var.scopes, the output of startup script execution will be stored in Cloud Logging. It can be examined using the Cloud Logging Console or with a gcloud logging read command (substituting <<PROJECT_ID>> with your project ID):
$ gcloud logging --project <<PROJECT_ID>> read \
'logName="projects/<<PROJECT_ID>>/logs/GCEMetadataScripts" AND jsonPayload.message=~"^startup-script: "' \
--format="table[box](timestamp, resource.labels.instance_id, jsonPayload.message)" --freshness 2h
Note that this command will print all startup script entries within the project within the "freshness" window in reverse order. You may need to identify the instance ID of the Packer VM and filter further by that value using gcloud or grep. To print the entries in the order they would have appeared on your console, we recommend piping the output of this command to the standard Linux utility tac.
Copyright 2022 Google LLC
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.
No requirements.
No providers.
No modules.
No resources.
Name | Description | Type | Default | Required |
---|---|---|---|---|
accelerator_count | Number of accelerator cards to attach to the VM; not necessary for families that always include GPUs (A2). | number | null | no |
accelerator_type | Type of accelerator cards to attach to the VM; not necessary for families that always include GPUs (A2). | string | null | no |
ansible_playbooks | A list of Ansible playbook configurations that will be uploaded to customize the VM image | list(object({ | [] | no |
communicator | Communicator to use for provisioners that require access to VM ("ssh" or "winrm") | string | null | no |
deployment_name | Cluster Toolkit deployment name | string | n/a | yes |
disk_size | Size of disk image in GB | number | null | no |
disk_type | Type of persistent disk to provision | string | "pd-balanced" | no |
enable_shielded_vm | Enable the Shielded VM configuration (var.shielded_instance_config). | bool | false | no |
image_family | The family name of the image to be built. Defaults to deployment_name | string | null | no |
image_name | The name of the image to be built. If not supplied, it will be set to image_family-$ISO_TIMESTAMP | string | null | no |
image_storage_locations | Storage location, either regional or multi-regional, where snapshot content is to be stored and only accepts 1 value. See https://developer.hashicorp.com/packer/plugins/builders/googlecompute#image_storage_locations | list(string) | null | no |
labels | Labels to apply to the short-lived VM | map(string) | null | no |
machine_type | VM machine type on which to build new image | string | "n2-standard-4" | no |
manifest_file | File to which to write Packer build manifest | string | "packer-manifest.json" | no |
metadata | Instance metadata for the builder VM (use var.startup_script or var.startup_script_file to set startup-script metadata) | map(string) | {} | no |
network_project_id | Project ID of Shared VPC network | string | null | no |
omit_external_ip | Provision the image building VM without a public IP address | bool | true | no |
on_host_maintenance | Describes maintenance behavior for the instance. If left blank this will default to MIGRATE except the use of GPUs requires it to be TERMINATE | string | null | no |
project_id | Project in which to create VM and image | string | n/a | yes |
scopes | DEPRECATED: use var.service_account_scopes | set(string) | null | no |
service_account_email | The service account email to use. If null or 'default', then the default Compute Engine service account will be used. | string | null | no |
service_account_scopes | Service account scopes to attach to the instance. See https://cloud.google.com/compute/docs/access/service-accounts. | set(string) | [ | no |
shell_scripts | A list of paths to local shell scripts which will be uploaded to customize the VM image | list(string) | [] | no |
shielded_instance_config | Shielded VM configuration for the instance (must set var.enabled_shielded_vm) | object({ | { | no |
source_image | Source OS image to build from | string | null | no |
source_image_family | Alternative to source_image. Specify image family to build from latest image in family | string | "hpc-centos-7" | no |
source_image_project_id | A list of project IDs to search for the source image. Packer will search the first project ID in the list first, and fall back to the next in the list, until it finds the source image. | list(string) | null | no |
ssh_username | Username to use for SSH access to VM | string | "hpc-toolkit-packer" | no |
startup_script | Startup script (as raw string) used to build the custom Linux VM image (overridden by var.startup_script_file if both are set) | string | null | no |
startup_script_file | File path to local shell script that will be used to customize the Linux VM image (overrides var.startup_script) | string | null | no |
state_timeout | The time to wait for instance state changes, including image creation | string | "10m" | no |
subnetwork_name | Name of subnetwork in which to provision image building VM | string | n/a | yes |
tags | Assign network tags to apply firewall rules to VM instance | list(string) | null | no |
use_iap | Use IAP proxy when connecting by SSH | bool | true | no |
use_os_login | Use OS Login when connecting by SSH | bool | false | no |
windows_startup_ps1 | A list of strings containing PowerShell scripts which will customize a Windows VM image (requires WinRM communicator) | list(string) | [] | no |
wrap_startup_script | Wrap startup script with Packer-generated wrapper | bool | true | no |
zone | Cloud zone in which to provision image building VM | string | n/a | yes |
No outputs.