Merge pull request #203 from DataBiosphere/dev
PR for 0.3.9 release
wnojopra authored Jul 6, 2020
2 parents f1a478e + 6bbee1a commit eac45f9
Showing 11 changed files with 210 additions and 271 deletions.
134 changes: 128 additions & 6 deletions docs/providers/README.md
@@ -290,13 +290,135 @@ Logging paths and the `[prefix]` are discussed further in [Logging](../logging.m

#### Resource requirements

-The `google-v2` and `google-cls-v2` providers support resource-related flags
-such as `--machine-type`, `--boot-disk-size`, `--disk-size`, and several other
-Compute Engine VM parameters.
+The `google-v2` and `google-cls-v2` providers support many resource-related
+flags to configure the Compute Engine VMs that tasks run on, such as
+`--machine-type` or `--min-cores` and `--min-ram`, as well as `--boot-disk-size`
+and `--disk-size`. Additional provider-specific parameters are available
+and documented below.

##### Disk allocation

The Docker container launched by the Pipelines API will use the host VM boot
-disk for system paths. All other directories set up by `dsub` will be on the
-data disk, including the `TMPDIR` (as discussed above). Thus you should only
-ever need to change the `--disk-size`.
+disk for the system services needed to orchestrate the set of docker actions
+defined by `dsub`. All other directories set up by `dsub` will be on the
+data disk, including the `TMPDIR` (as discussed above). In general it should
+be unnecessary for end-users to ever change the `--boot-disk-size` and they
+should only need to set the `--disk-size`. One known exception is when very
+large Docker images are used, as such images need to be pulled to the boot disk.
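The disk-sizing guidance above can be sketched as a small helper that assembles the relevant command-line arguments. The flag names come from the provider docs in this diff; the helper itself and its default values are hypothetical, not part of dsub:

```python
# Hypothetical helper assembling dsub disk-related arguments; the flag names
# are from the provider docs, but this function is not part of dsub itself.
def data_disk_args(disk_size_gb=200, boot_disk_size_gb=None):
    args = ["--disk-size", str(disk_size_gb)]
    # --boot-disk-size is normally left at its default (10 GB); override it
    # only when a very large Docker image must be pulled to the boot disk.
    if boot_disk_size_gb is not None:
        args += ["--boot-disk-size", str(boot_disk_size_gb)]
    return args
```
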

#### Provider-specific parameters

The following `dsub` parameters are specific to the `google-v2` and
`google-cls-v2` providers:

* [Location resources](https://cloud.google.com/about/locations)

- `--location` (`google-cls-v2` only):
- Specifies the Google Cloud region to which the pipeline request will be
sent and where operation metadata will be stored. The associated dsub task
may be executed in another region if the `--regions` or `--zones`
arguments are specified. (default: us-central1)

- `--project`:
- Cloud project ID in which to run the job.
- `--regions`:
- List of Google Compute Engine regions. Only one of `--zones` and
`--regions` may be specified.
- `--zones`:
- List of Google Compute Engine zones.

- [Network resources](https://cloud.google.com/vpc/docs/overview)
- `--network`:
- The Compute Engine VPC network name to attach the VM's network interface
to. The value will be prefixed with `global/networks/` unless it contains
a `/`, in which case it is assumed to be a fully specified network
resource URL.
- `--subnetwork`:
- The name of the Compute Engine subnetwork to attach the instance to.
- `--use-private-address`:
- If set to true, do not attach a public IP address to the VM.
(default: False)

- Per-task compute resources
- `--boot-disk-size`:
- Size (in GB) of the boot disk. (default: 10)
- `--cpu-platform`:
- The CPU platform to request. Supported values can be found at
[Specifying a minimum CPU](https://cloud.google.com/compute/docs/instances/specify-min-cpu-platform)
- `--disk-type`:
- The disk type to use for the data disk. Valid values are `pd-standard`,
`pd-ssd` and `local-ssd`. (default: `pd-standard`)
- `--docker-cache-images`:
- The Compute Engine Disk Images to use as a Docker cache. At the moment,
only a single image is supported. The image must be of the form
`projects/{PROJECT_ID}/global/images/{IMAGE_NAME}`.
Instructions for creating a disk image can be found at
[Create private images](https://cloud.google.com/compute/docs/images/create-delete-deprecate-private-images)
- `--machine-type`:
- Provider-specific machine type.
- `--preemptible`:
- If `--preemptible` is given without a number, enables preemptible VMs
for all attempts for all tasks. If a number value N is used, enables
preemptible VMs for up to N attempts for each task. Defaults to not
using preemptible VMs.
- `--timeout`:
- The maximum amount of time to give the task to complete. This includes
the time spent waiting for a worker to be allocated. Time can be listed
using a number followed by a unit. Supported units are s (seconds),
m (minutes), h (hours), d (days), w (weeks). Example: '7d' (7 days).
(default: '7d')

- [Task credentials](https://cloud.google.com/docs/authentication)
- `--credentials-file`:
- Path to a local file with JSON credentials for a service account.
- `--scopes`:
- Space-separated scopes for Google Compute Engine instances. If
unspecified, the provider will use:
- https://www.googleapis.com/auth/bigquery,
- https://www.googleapis.com/auth/compute,
- https://www.googleapis.com/auth/devstorage.full_control,
- https://www.googleapis.com/auth/genomics,
- https://www.googleapis.com/auth/logging.write,
- https://www.googleapis.com/auth/monitoring.write
- `--service-account`:
- Email address of the service account to be authorized on the Compute
Engine VM for each job task. If not specified, the default Compute
Engine service account for the project will be used.

- Monitoring, logging, and debugging
- `--enable-stackdriver-monitoring`:
- If set to true, enables Stackdriver monitoring on the VM.
(default: False)
- `--log-interval`:
- The amount of time to sleep between copies of log files from the task to
the logging path. Time can be listed using a number followed by a unit.
Supported units are s (seconds), m (minutes), h (hours).
Example: '5m' (5 minutes). (default: '1m')
- `--ssh`:
- If set to true, start an ssh container in the background to allow you to
log in using SSH and debug in real time. (default: False)

- GPU resources
- `--accelerator-type`:
- The Compute Engine accelerator type. By specifying this parameter, you
will download and install the following third-party software onto your
job's Compute Engine instances:

- NVIDIA(R) Tesla(R) drivers and NVIDIA(R) CUDA toolkit.

Please see [GPUs](https://cloud.google.com/compute/docs/gpus/) for
supported GPU types and
[pipelines.accelerator](https://cloud.google.com/lifesciences/docs/reference/rest/v2beta/projects.locations.pipelines/run#accelerator)
for more details.
- `--accelerator-count`:
- The number of accelerators of the specified type to attach. By
specifying this parameter, you will download and install the following
third-party software onto your job's Compute Engine instances: NVIDIA(R)
Tesla(R) drivers and NVIDIA(R) CUDA toolkit. (default: 0)
- `--nvidia-driver-version`:
- The NVIDIA driver version to use when attaching an NVIDIA GPU
accelerator. The version specified here must be compatible with the GPU
libraries contained in the container being executed, and must be one of
the drivers hosted in the nvidia-drivers-us-public bucket on Google
Cloud Storage.
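The duration strings accepted by `--timeout` and `--log-interval` follow a simple number-plus-unit format. A minimal parser for that format could look like the sketch below; the unit set (s, m, h, d, w) is as documented above, but this is an illustrative standalone function, not dsub's own implementation:

```python
# Sketch of a parser for duration strings like '7d' or '5m', using the unit
# set documented for --timeout. Not dsub's actual code.
_UNIT_SECONDS = {"s": 1, "m": 60, "h": 3600, "d": 86400, "w": 604800}

def parse_duration(value):
    """Return the number of seconds encoded by a string like '7d' or '30s'."""
    number, unit = value[:-1], value[-1]
    if unit not in _UNIT_SECONDS:
        raise ValueError("unsupported unit: %r" % unit)
    return int(number) * _UNIT_SECONDS[unit]
```
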
2 changes: 1 addition & 1 deletion dsub/_dsub_version.py
@@ -26,4 +26,4 @@
0.1.3.dev0 -> 0.1.3 -> 0.1.4.dev0 -> ...
"""

DSUB_VERSION = '0.3.8'
DSUB_VERSION = '0.3.9'
53 changes: 31 additions & 22 deletions dsub/commands/dsub.py
@@ -402,8 +402,8 @@ def _parse_arguments(prog, argv):
# Shared between the "google-cls-v2" and "google-v2" providers
google_common = parser.add_argument_group(
title='google-common',
description="""Options common to the "google", "google-cls-v2", and
"google-v2" providers""")
description="""Options common to the "google-cls-v2" and "google-v2"
providers""")
google_common.add_argument(
'--project', help='Cloud project ID in which to run the job')
google_common.add_argument(
@@ -453,78 +453,74 @@ def _parse_arguments(prog, argv):
'--credentials-file',
type=str,
help='Path to a local file with JSON credentials for a service account.')

google_v2 = parser.add_argument_group(
title='"google-v2" provider options',
description='See also the "google-common" options listed above')
google_v2.add_argument(
google_common.add_argument(
'--regions',
nargs='+',
help="""List of Google Compute Engine regions.
Only one of --zones and --regions may be specified.""")
google_v2.add_argument(
google_common.add_argument(
'--machine-type', help='Provider-specific machine type (default: None)')
google_v2.add_argument(
google_common.add_argument(
'--cpu-platform',
help="""The CPU platform to request. Supported values can be found at
https://cloud.google.com/compute/docs/instances/specify-min-cpu-platform
(default: None)""")
google_v2.add_argument(
google_common.add_argument(
'--network',
help="""The Compute Engine VPC network name to attach the VM's network
interface to. The value will be prefixed with global/networks/ unless
it contains a /, in which case it is assumed to be a fully specified
network resource URL. (default: None)""")
google_v2.add_argument(
google_common.add_argument(
'--subnetwork',
help="""The name of the Compute Engine subnetwork to attach the instance
to. (default: None)""")
google_v2.add_argument(
google_common.add_argument(
'--use-private-address',
default=False,
action='store_true',
help="""If set to true, do not attach a public IP address to the VM.
(default: False)""")
google_v2.add_argument(
google_common.add_argument(
'--timeout',
help="""The maximum amount of time to give the task to complete.
This includes the time spent waiting for a worker to be allocated.
Time can be listed using a number followed by a unit. Supported units
are s (seconds), m (minutes), h (hours), d (days), w (weeks). The
provider-specific default is 7 days. Example: '7d' (7 days).""")
google_v2.add_argument(
google_common.add_argument(
'--log-interval',
help="""The amount of time to sleep between copies of log files from
the task to the logging path.
Time can be listed using a number followed by a unit. Supported units
are s (seconds), m (minutes), h (hours).
Example: '5m' (5 minutes). Default is '1m'.""")
google_v2.add_argument(
google_common.add_argument(
'--ssh',
default=False,
action='store_true',
help="""If set to true, start an ssh container in the background
to allow you to log in using SSH and debug in real time.
(default: False)""")
google_v2.add_argument(
google_common.add_argument(
'--nvidia-driver-version',
help="""The NVIDIA driver version to use when attaching an NVIDIA GPU
accelerator. The version specified here must be compatible with the
GPU libraries contained in the container being executed, and must be
one of the drivers hosted in the nvidia-drivers-us-public bucket on
Google Cloud Storage. (default: None)""")
google_v2.add_argument(
google_common.add_argument(
'--service-account',
type=str,
help="""Email address of the service account to be authorized on the
Compute Engine VM for each job task. If not specified, the default
Compute Engine service account for the project will be used.""")
google_v2.add_argument(
google_common.add_argument(
'--disk-type',
help="""
The disk type to use for the data disk. Valid values are pd-standard
pd-ssd and local-ssd. The default value is pd-standard.""")
google_v2.add_argument(
google_common.add_argument(
'--enable-stackdriver-monitoring',
default=False,
action='store_true',
@@ -1174,8 +1170,8 @@ def run(provider,
script = job_model.Script(command_name, '#!/usr/bin/env bash\n' + command)
elif script:
# Read the script file
script_file = dsub_util.load_file(script)
script = job_model.Script(os.path.basename(script), script_file.read())
script_file_contents = dsub_util.load_file(script)
script = job_model.Script(os.path.basename(script), script_file_contents)
else:
raise ValueError('One of --command or a script name must be supplied')

@@ -1296,6 +1292,12 @@ def _name_for_command(command):
'command'
>>> _name_for_command('\\\n\\\n# Bad continuations, but ignore.\necho hello.')
'echo'
>>> _name_for_command('(uname -a && pwd) # Command begins with non-letter.')
'uname'
>>> _name_for_command('my-program.sh # Command with hyphens.')
'my-program.sh'
>>> _name_for_command('/home/user/bin/-my-sort # Path with hyphen.')
'my-sort'
Arguments:
command: the user-provided command
@@ -1307,7 +1309,14 @@ def _name_for_command(command):
for line in lines:
line = line.strip()
if line and not line.startswith('#') and line != '\\':
return os.path.basename(re.split(r'\s', line)[0])
# Tokenize on whitespace [ \t\n\r\f\v]
names = re.split(r'\s', line)
for name in names:
# Make sure the first character is a letter, number, or underscore
# Get basename so something like "/usr/bin/sort" becomes just "sort"
name = re.sub(r'^[^a-zA-Z0-9_]*', '', os.path.basename(name))
if name:
return name

return 'command'
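The new tokenizing loop above can be exercised standalone. This sketch reproduces the logic from the diff, with the surrounding pieces (line splitting, imports) filled in as assumed:

```python
import os
import re

def name_for_command(command):
    # Walk the command line by line, skipping blanks, comment lines, and bare
    # line continuations, and return the first plausible program name.
    for line in command.splitlines():
        line = line.strip()
        if line and not line.startswith('#') and line != '\\':
            # Tokenize on whitespace [ \t\n\r\f\v]
            for name in re.split(r'\s', line):
                # Take the basename ("/usr/bin/sort" -> "sort") and strip
                # leading characters that are not letters, digits, or '_'
                name = re.sub(r'^[^a-zA-Z0-9_]*', '', os.path.basename(name))
                if name:
                    return name
    return 'command'
```

This matches the doctest examples added in the diff: a leading `(` or a hyphen-prefixed basename is stripped, while hyphens inside a name are kept.
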

16 changes: 8 additions & 8 deletions dsub/lib/dsub_util.py
@@ -16,8 +16,8 @@

from __future__ import print_function

from contextlib import contextmanager
from datetime import datetime
import contextlib
import datetime
import fnmatch
import io
import os
@@ -54,7 +54,7 @@ def write(self, buf):
self._fileobj.write(buf)


@contextmanager
@contextlib.contextmanager
def replace_print(fileobj=sys.stderr):
"""Sys.out replacer, by default with stderr.
@@ -142,7 +142,7 @@ def _get_storage_service(credentials):

def _retry_storage_check(exception):
"""Return True if we should retry, False otherwise."""
now = datetime.now().strftime('%Y-%m-%d %H:%M:%S.%f')
now = datetime.datetime.now().strftime('%Y-%m-%d %H:%M:%S.%f')
print_error(
'%s: Exception %s: %s' % (now, type(exception).__name__, str(exception)))
return isinstance(exception, google.auth.exceptions.RefreshError)
@@ -184,7 +184,7 @@ def _load_file_from_gcs(gcs_file_path, credentials=None):
filevalue = file_handle.getvalue()
if not isinstance(filevalue, six.string_types):
filevalue = filevalue.decode()
return six.StringIO(filevalue)
return filevalue


def load_file(file_path, credentials=None):
@@ -196,13 +196,13 @@ def load_file(file_path, credentials=None):
credentials: Optional credential to be used to load the file from gcs.
Returns:
A python File object if loading file from local or a StringIO object if
loading from gcs.
The contents of the file as a string.
"""
if file_path.startswith('gs://'):
return _load_file_from_gcs(file_path, credentials)
else:
return open(file_path, 'r')
with open(file_path, 'r') as f:
return f.read()


# Exponential backoff retrying downloads of GCS object chunks.
