diff --git a/docs/_archive/develop-on-a-remote-host.md b/docs/developers-guide/developing-on-a-remote-host.md similarity index 50% rename from docs/_archive/develop-on-a-remote-host.md rename to docs/developers-guide/developing-on-a-remote-host.md index 3cba374277..8a758c9971 100644 --- a/docs/_archive/develop-on-a-remote-host.md +++ b/docs/developers-guide/developing-on-a-remote-host.md @@ -1,3 +1,7 @@ +# Developing on a remote host + +## Introduction + Team members who develop locally may not benefit from the same compute resources. The most notable resources that can impact the productivity of developers are the number and frequency of the CPU cores, the memory available and internet speed. The worse case is when a machine does not have the @@ -5,28 +9,23 @@ resources to run the apps that the team develops, for example when not enough me On other times, the time required to complete a task may be many times slower on a computer with lower CPU resources. -Working remotely means that developers no longer benefit from the same internet speed, either -because of the quality of the internet connection available at their location or because the speed -is shared among the members of a household. As a result, tasks that involve downloading or uploading -artifacts, like pulling or pushing Docker images, may take significantly longer to complete. - -This page describes how to setup a development environment that enables developers to use VS Code -while using the compute resources of a remote host. The developers start by creating identical EC2 -instances before [connecting to them with VS -Code](https://code.visualstudio.com/remote/advancedcontainers/develop-remote-host). This SOP enables -developers to continue working [inside the devcontainer](#devcontainer) provided with this project, -hence further contributing to the standardization of the development envrionment. 
+Moreover, working remotely means that developers no longer benefit from the same internet speed, +either because of the quality of the internet connection available at their location or because the +speed is shared among the members of a household. As a result, tasks that involve downloading or +uploading artifacts, like pulling or pushing Docker images, may take significantly longer to +complete. -> **Note** 2023-01-28: Added documentation to connect to a GitHub Codespace. +This page describes how to set up an environment that enables developers to use VS Code while using +the compute resources of a remote host. -## Use case +## Motivation -This table summarizes the local compute resources available to the developers of the challenge -registry. The same information is displayed for two types of Amazon EC2 instances and one type of -GitHub Codespace instance that were selected as candidate alternative development environments for -the team members. The table also includes the runtimes in seconds of different tasks such as linting -or testing all the projects included in the monorepo (the method used to generate these results is -described in the next section). +To illustrate the benefit of developing on a remote host, this table summarizes the local compute +resources available to the developers of OpenChallenges in 2023. The same information is displayed +for two types of Amazon EC2 instances and one type of GitHub Codespace instance that were selected +as candidate alternative development environments for the team members. The table also includes the +runtimes in seconds of different tasks such as linting or testing all the projects included in the +monorepo (the method used to generate these results is described in the next section). 
| | Shirou | Rin | Sakura | m5.2xlarge | t3a.xlarge | 4-core Codespace | 8-core Codespace | | ------------------------------------------------------ | ------------ | ------------ | ------------ | ------------ | ------------ | ---------------- | ---------------- | @@ -45,11 +44,12 @@ described in the next section). | On-Demand Cost ($/day) | n/a | n/a | n/a | 9.2 | 3.6 | 8.64 (1,2) | 17.28 (1,2) | | On-Demand Cost ($/year) | n/a | n/a | n/a | 3363.8 | 1317.5 | 3153.6 (1,2) | 6307.2 (1,2) | -(1) GitHub codespaces stop automatically after 1h of inactivity. A codespace used by an engineer -with 100 %FTE and 8 working hours per day - without taking into account vacation for the sake of -simplicity - would cost 8 hours/day * 5 days/week * 52 weeks * $0.36/hour (4-core) = $748/year (see -[Codespaces pricing]). Similarly, the cost for an 8-core codespace would become $1496/year. In -addition, GitHub bills $0.07 of GB of storage. +(1) GitHub codespaces stop automatically after 1h of inactivity. A codespace used by a full-time +engineer (8h/day), ignoring vacation for the sake of simplicity, would cost 8 +hours/day * 5 days/week * 52 weeks * $0.36/hour (4-core) = $748/year (see [Codespaces pricing]). +Similarly, the cost for an 8-core codespace would be $1496/year. In addition, GitHub bills $0.07 +per GB of storage per month, regardless of whether the codespace is running or stopped. Pricing as of +2023-12-31. (2) GitHub offers core hours and storage. For example, a Free user can use a 2-core instance for 60 hours per month for free or an 8-core instance for 15 hours. You will be notified by email when you @@ -57,13 +57,15 @@ have used 75%, 90%, and 100% of your included quotas. 
- Free users: 120 core hours/month and 15 GB month of storage - Pro users: 180 core hours/month and 20 GB month of storage -Note that developers have been asked to measure runtimes and internet speeds while keeping open the -applications that are usually running when they develop (e.g. Spotify, several instances of VS Code, -browser with many tabs open). This could be one reason why runtimes reported by a developer are -larger that those reported by another developer who has less compute resources available. +!!! note + + Developers were asked to measure runtimes and internet speeds while keeping open the + applications that are usually running when they develop (e.g. Spotify, several instances of VS Code, + browser with many tabs open). This could be one reason why runtimes reported by a developer are + larger than those reported by another developer who has fewer compute resources available. + +The table below shows how many times faster a task run by a developer is than the slowest +runtime (denoted by "1.0"). | | Shirou | Rin | Sakura | m5.2xlarge | t3a.xlarge | | ------------------------------------------------------ | ------------ | ------------ | ------------ | ------------ | ------------ | @@ -82,139 +84,186 @@ instance. This table illustrates well the diversity in compute resources availab developers, and how relying on remote hosts like EC2 instances can provide a better working environment to developers. -### Data collection - -- Runtimes are obtained from [this - commit](https://github.com/Sage-Bionetworks/sage-monorepo/tree/25f2292388d9e71bf46ba137aa530aefb571deab). -- Identification of the compute resources. - ```console - $ nproc - $ cat /proc/cpuinfo - $ cat /proc/meminfo - ``` -- Runtimes are averaged over 10 runs that follow a warmup run using - [hyperfine](https://github.com/sharkdp/hyperfine). 
- ```console - $ hyperfine --warmup 1 --runs 10 'nx run-many --all --target=lint --skip-nx-cache' - $ hyperfine --warmup 1 --runs 10 'nx run-many --all --target=build --skip-nx-cache' - $ hyperfine --warmup 1 --runs 10 'nx run-many --all --target=test --skip-nx-cache' - $ hyperfine --warmup 1 --runs 10 'nx test api --skip-nx-cache' - $ hyperfine --warmup 1 --runs 10 'nx test web-ui --skip-nx-cache' - ``` -- Internet speeds are measured with [speedtest-cli](https://www.speedtest.net/apps/cli). - ```console - $ speedtest - ``` - -## Preparing the remote host (AWS EC2) - -This section describes how to instantiate an AWS EC2 as the remote host. Steps outlined below will +### Collecting OS info and benchmarking tasks + +Runtimes were measured on [this +commit](https://github.com/Sage-Bionetworks/sage-monorepo/tree/25f2292388d9e71bf46ba137aa530aefb571deab). + +The compute resources are identified with the following commands. + +```console +$ nproc +$ cat /proc/cpuinfo +$ cat /proc/meminfo +``` + +Runtimes are averaged over 10 runs that follow a warmup run using +[hyperfine](https://github.com/sharkdp/hyperfine). + +```console +$ hyperfine --warmup 1 --runs 10 'nx run-many --all --target=lint --skip-nx-cache' +$ hyperfine --warmup 1 --runs 10 'nx run-many --all --target=build --skip-nx-cache' +$ hyperfine --warmup 1 --runs 10 'nx run-many --all --target=test --skip-nx-cache' +$ hyperfine --warmup 1 --runs 10 'nx test api --skip-nx-cache' +$ hyperfine --warmup 1 --runs 10 'nx test web-ui --skip-nx-cache' +``` + +Internet speeds are measured with [speedtest-cli](https://www.speedtest.net/apps/cli). + +```console +$ speedtest +``` + +## Preparing the remote host - AWS EC2 + +This section describes how to create an AWS EC2 instance as the remote host. The steps below assume you have access to the Sage AWS Service Catalog. -### On the Service Catalog Portal +### Creating the EC2 instance -- Log in to the [Service Catalog](https://sc.sageit.org) with your Synapse credentials. 
-- From the list of Products, select **EC2: Linux Docker**. On the Product page, click on **Launch +1. Log in to the [Service Catalog](https://sc.sageit.org) with your Synapse credentials. +2. From the list of Products, select **EC2: Linux Docker**. On the Product page, click on **Launch product** in the upper-right corner. -- On the next page, fill out the wizard as follows: - - **Provisioned product name** - - Name: `-devcontainers` - - **Parameters**: - - EC2 Instance Type: `t3.2xlarge` - - Base Image: `AmazonLinuxDocker` (leave default) - - Disk Size: 80 - - **Manage tags**: - - `Department`: `IBC` or `CNB` (selected from [this - list](https://github.com/Sage-Bionetworks-IT/organizations-infra/blob/master/sceptre/scipool/sc-tag-options/internal/Departments.json)) - - `Project`: `challenge` (selected from [this - list](https://github.com/Sage-Bionetworks-IT/organizations-infra/blob/master/sceptre/scipool/sc-tag-options/internal/Projects.json)) - - `CostCenter`: `NIH-ITCR / 101600` (selected from [these - lists](https://github.com/Sage-Bionetworks/aws-infra/tree/master/templates/tags)) - - **Enable event notifications**: SKIP - DO NOT MODIFY -- Click on **Launch product**. Your instance will take anywhere between 3-5 minutes to deploy. You +3. On the next page, fill out the wizard as follows: + - **Provisioned product name** + - Name: `{GitHub username}-devcontainers-{yyyymmdd}` + - Example: `tschaffter-devcontainers-20240404` + - **Parameters** + - EC2 Instance Type: `t3a.2xlarge` + - Base Image: `AmazonLinuxDocker` (leave default) + - Disk Size: 80 + - **Manage tags** + - `CostCenter`: Select the Cost Center associated with your project + - **Enable event notifications**: SKIP - DO NOT MODIFY +4. Click on **Launch product**. Your instance will take between 3 and 5 minutes to deploy. You can either wait on this page until "EC2Instance" shows up on the list under Resources, or you can leave and come back at a later time. 
-### On your local host - -> #### Note: -> If this is your first time **ever** connecting to an instance from your machine, you will first -> need to set up EC2 access with the AWS Systems Manager (SSM). Follow the instructions below to -> complete the setup: -> - [**Create a Synapse personal access -> token**](https://help.sc.sageit.org/sc/Service-Catalog-Provisioning.938836322.html#ServiceCatalogProvisioning-CreateaSynapsepersonalaccesstoken) -> - [**SSM access to an -> Instance**](https://help.sc.sageit.org/sc/Service-Catalog-Provisioning.938836322.html#ServiceCatalogProvisioning-SSMaccesstoanInstance) -> -> (Don't worry, you will only need to do this once for your local machine!) - -- In your terminal, connect to your instance following the [**Connecting to an Instance - SSM with -SSH**](https://help.sc.sageit.org/sc/Service-Catalog-Provisioning.938836322.html#ServiceCatalogProvisioning-SSMwithSSH) -instructions from the Service Catalog Provisioning doc. -- Once you can successfully login through SSM with SSH, exit the instance. -- Navigate to the Provisioned products page for your instance. Under **Events**, copy the -`EC2InstancePrivateIpAddress` -- In your terminal, add the following into your local `~/.ssh/config`: - ```console - Host devcontainers - HostName - User ec2-user - IdentityFile ~/.ssh/id_rsa - ``` -- Connect to the [Sage - VPN](https://sagebionetworks.jira.com/wiki/spaces/IT/pages/1705246745/AWS+Client+VPN+User+Guide) -- In your terminal, SSH to the instance to ensure `~/.ssh/config` was setup correctly. - ```console - ssh devcontainers - ``` - -### On the EC2 instance - -- Update the system packages. - ```console - sudo yum update -y - ``` -- Docker should already be readily available on the instance. Verify this by running any Docker -command, e.g. - ```console - docker --version - ``` -- Clone your fork into the home directory. -- To easily pull and push changes, we suggest storing your GitHub credentials onto the instance. 
-Follow the [**Storing GitHub credentials on the EC2 -instance**](https://sagebionetworks.jira.com/wiki/spaces/APGD/pages/2590244872/Service+Catalog+Instance+Setup#Storing-GitHub-credentials-on-the-EC2-instance). -instructions to do so. - -### In VS Code - -- Install the extension `Remote - SSH` and `Remote - Containers`. -- `Remote-SSH: Connect to Host...` > Select the host. -- Verify that the bottom-left corner of the VSCode window shows `SSH: ` upon successfully - connecting to the remote instance. - - - -- `Remote-Containers: Open Folder in Container...` -- Select the project folder and click on `OK`. -- Verify that the bottom-left corner of the VSCode window shows `Dev Container: OpenChallenges @ - ssh://`. - - - -Congratulations, you are now ready to develop in the devcontainer that runs on the EC2 instance! 🚀 - -## Preparing the remote host (GitHub Codespace) +### Stopping the EC2 instance + +You do not need to stop the instance now as part of this tutorial. This section serves as a reminder +that AWS charges for every hour the EC2 instance is running. As soon as you know that you will no +longer need the instance for the rest of the day, open the Service Catalog to stop it. + +1. Open the Service Catalog, then select **Provisioned products**. +2. Select the EC2 instance. +3. Click on the button **Actions** > **Service actions** > **Stop**. +4. Confirm the action. + +After a few seconds, the EC2 instance will be stopped. + +!!! note + + AWS still charges us for the storage space that the EC2 instance takes even when it's not running. + Consider destroying the EC2 instance when you decide that you will no longer need it. + +### Connecting to the EC2 instance with AWS Console + +We will now use the AWS Console to open a terminal on the EC2 instance and set up your public SSH +key. + +!!! note + + This section assumes that you already have a public and private SSH key created on the local + machine where you run VS Code. + +1. 
Open the Service Catalog, then select **Provisioned products**. +2. In the section **Resources**, click on the link for "EC2Instance". +3. Select the checkbox of the newly created EC2 instance. +4. Click on the button **Actions** > **Connect**. +5. Click on **Connect**. + +### Configuring the SSH public key on the EC2 instance + +6. Log in as the user `ec2-user` and move to its home directory. + ```console + $ sudo -s + # su ec2-user + $ cd + ``` +7. Create the folder `~/.ssh` (if needed). + ```console + $ mkdir -p ~/.ssh + $ chmod 700 ~/.ssh + ``` +8. Create the file `~/.ssh/authorized_keys` (if needed). + ```console + $ touch ~/.ssh/authorized_keys + $ chmod 644 ~/.ssh/authorized_keys + ``` +9. Copy and paste your public SSH key at the end of `~/.ssh/authorized_keys`. +10. Click on the button **Terminate** to terminate the session and confirm the action. + +### Configuring SSH on the local machine + +This section describes how to create a profile for the EC2 instance in your local `~/.ssh/config` +file. + +!!! note + + This section assumes that you already have a public and private SSH key created on the local + machine where you run VS Code. + +First, you need to identify the private IP address of the EC2 instance. + +1. Open the Service Catalog, then select **Provisioned products**. +2. In the section **Outputs**, the private IP address is the value associated with + "EC2InstancePrivateIpAddress". + +Then, on your local machine: + +1. Create the file `~/.ssh/config` (if needed). + ```console + $ touch ~/.ssh/config + $ chmod 600 ~/.ssh/config + ``` +2. Add the following content to your local `~/.ssh/config`. + ```console + Host {alias} + HostName {private ip} + User ec2-user + IdentityFile {path to your private SSH key, e.g. ~/.ssh/id_rsa} + ``` + where the placeholder values `{...}` should be replaced with the correct values. + +### Connecting to the EC2 instance with VS Code + +1. Open VS Code. +2. 
Install the VS Code extension pack "Remote Development". +3. Open the command palette with `Ctrl+Shift+P`. +4. `Remote-SSH: Connect to Host...` > Select the host. +5. Answer the prompts. + +You are now connected to the EC2 instance! 🚀 + +!!! tip + + Please remember to stop the EC2 instance at the end of your working day to save on costs. + +### Next + +Go to the section XXX for the instructions on how to set up your environment to contribute to Sage +Monorepo. + +## Preparing the remote host - GitHub Codespace + +This section describes how to open your fork of Sage Monorepo in a GitHub Codespaces instance. + +!!! note + + In practice, we prefer to develop in an EC2 instance created from the Service Catalog for + security and budget reasons. Please refer to the instructions given above. Using a GitHub Codespace + has proven occasionally useful for quick tests that require a fresh environment, since + codespaces can be created and destroyed faster than EC2 instances. 1. Open your browser and go to [GitHub Codespaces]. 2. Click on the "New codespace". 3. Enter the information requested: - - `Repository`: Select your fork of the monorepo - - `Branch`: Select the default branch - - `Dev container configuration`: Select the dev container definition - - `Region`: Select your preferred region - - `Machine type`: Select the machine type - > **Note** 4-core is preferred for the OpenChallenges project as a trade-off between - > performance and cost. + - **Repository**: Select your fork of the monorepo + - **Branch**: Select the default branch + - **Dev container configuration**: Select the dev container definition + - **Region**: Select your preferred region + - **Machine type**: Select the machine type 4. Click on "Create codespace". 5. Wait for the codespace to be created. 6. Configure the monorepo and install its dependencies (see README). 
diff --git a/mkdocs.yml b/mkdocs.yml index 4e4ee1255f..a17f481b81 100644 --- a/mkdocs.yml +++ b/mkdocs.yml @@ -39,6 +39,7 @@ nav: - R: - Add an R project: tutorials/r/new-project.md - Developers Guide: + - Developing on a remote host: developers-guide/developing-on-a-remote-host.md - Creating a commit with multiple authors: developers-guide/creating-a-commit-with-multiple-authors.md - Common Issues: developers-guide/faq.md - Reference: