Merge pull request NREL#589 from tdthatcher/jupyter-kestrel
Jupyter kestrel
yandthj authored Mar 6, 2024
2 parents 8d8b915 + 574e9d5 commit 6ef243a
Showing 3 changed files with 150 additions and 145 deletions.
249 changes: 150 additions & 99 deletions docs/Documentation/Development/Jupyter/index.md
@@ -5,14 +5,20 @@

**A web app for interactive Python in a browser**

- "Live coding"
- Instant visualization
- Sharable
- Reproducible
- Customizable
- Now supports other languages besides Python (R, Julia..)
- https://github.com/jupyter/jupyter/wiki/Jupyter-kernels
- these slides were created using Markdown in Jupyter!
Jupyter offers a number of benefits for researchers in many fields, including:

- Live coding: Make changes and see the effects in real-time.
- Instant visualization: Charts and graphics render quickly in a browser window.
- Sharable: Notebooks can be copied and sent to others, or multiple users can edit a single shared notebook.
- Reproducible: Create a shareable environment with pinned Python and scientific library versions.
- Customizable: Many configuration options, extensions, and libraries are available.
- Not just for Python: Supports many other languages, including R and Julia.
- See [https://github.com/jupyter/jupyter/wiki/Jupyter-kernels](https://github.com/jupyter/jupyter/wiki/Jupyter-kernels) for examples.


## Example Notebook Code

With the appropriate libraries installed into the Jupyter environment, the following code can be placed in one cell in a notebook, or split across multiple cells, and executed to produce quick graphs:

```python
import chart_studio.plotly as py
@@ -46,122 +52,172 @@ axes[3].set_title("fill_between");
![png](../../../../assets/images/output_4_0.png)


## Terminology - a Confusion of Words

## Jupyter Terminology


### **Jupyterhub**
* Multi-user "backend" server
* Controls launching the single-user Jupyter server
* NREL's "Europa" runs Jupyterhub

(In general, don't worry about JupyterHub--unless you're a sysadmin)

This is the multi-user "backend" server. The "Hub" allows users to login, then launches the single-user Jupyter server for them. Hubs are usually installed and managed by system administrators, not Jupyter users.

NREL's "Europa" (Eagle-only) runs Jupyterhub. More on Europa later in this document.

### **Jupyter/Jupyter Server/Notebook server**
* The single-user server/web interface
* Create/save/load .ipynb notebook files
* What users generally interact with

The single-user server/web interface. Use to create, save, or load .ipynb notebook files. This is what users generally interact with.

### **Jupyter Notebook**
* An individual .ipynb file
* Contains your Python code and visualizations
* Sharable/downloadable

A Notebook is an individual .ipynb file. It contains your Python code and visualizations, and is sharable/downloadable.

### **Jupyter lab**
* A "nicer" web interface for Jupyter - "notebooks 2.0"
* Preferred by some
* Lacking some features of "classic" notebooks

A redesigned web interface for your Jupyter Server, sometimes described as "Notebooks 2.0". It is preferred by some users and promoted as the next evolution of Notebooks.
Lab offers many new extensions, but many extensions are not compatible between Notebook and Lab. Lab is still under development, so it lacks some features of "classic" notebooks.

### **Kernel**
* The Python environment used by a notebook
* More on kernels later

## Using Europa
Kernels define the Python environments used by your notebooks. The default Python kernel is provided by the ipykernel package, which grew out of IPython, the predecessor project to Jupyter, so you may see Jupyter kernels referred to as "ipykernels". Custom kernels require the "ipykernel" package installed in your Jupyter conda environment.

More on kernels later.

We run a Jupyterhub server that is available.
## Eagle's "Europa" Jupyterhub Server

The NREL HPC team runs a Jupyterhub server called Europa that is available for internal (NREL) Eagle users only.

Europa is connected to Eagle's Lustre storage system for access to /projects data.

A replacement for Europa on Kestrel is in the planning stage.

### Europa's Advantages:
* Fast and easy access
* Use regular Eagle credentials
* Great for light to moderate processing/debugging/testing

### Europa's Disadvantages:
* Limited resource: 8 cores/128GB RAM per user before automatic throttling
* Compete with other users for CPU/RAM on a single machine
* No custom environments (for now)
* Fast and easy access to notebooks with no setup.
* Use regular Eagle credentials to log in.
* Great for simple tasks, including light to moderate data processing, code debugging/testing, and light to moderate visualization using standard/basic scientific and visualization libraries.

### Europa's Disadvantages:

### Simple Instructions:

- Visit Europa at (https://europa.hpc.nrel.gov/)
* Limited resources: Only 48 CPU cores and 190GB RAM total.
* Managed usage: Up to 8 cores/128GB RAM per user before automatic throttling will greatly slow down processing.
* Must compete with other users for CPU and RAM on a single machine.
* Only a limited list of scientific libraries and visualization tools is available, and they may not be the latest versions.
* Custom environments are difficult to configure.
* No access for external (non-NREL) users.
* Not available for Kestrel (yet).

### Simple Instructions to access Europa:

- Log in using your HPC credentials
- Opens a standard "notebooks" interface
- Change url end /tree to /lab for Lab interface
* Visit Europa at (https://europa.hpc.nrel.gov/) in a web browser and log in using your HPC credentials.

Europa opens a standard "notebooks" interface by default. To use the Jupyter Lab interface instead, change the end of the URL in your web browser from "/tree" to "/lab".


## Using a Compute Node
## Using a Compute Node to Run Your Own Jupyter Notebooks

### Advantages:
* Custom environments
* 36 cores and up to ~750GB RAM
* No competing with other users for cores

* Custom conda environments to load preferred libraries.
* Full node usage: Exclusive access to the resources of the node your job is reserved on, including up to 36 CPU cores and up to ~750GB RAM on Eagle bigmem nodes, and up to 104 CPU cores and up to ~2TB RAM on Kestrel bigmem nodes. See the system specifications page for the cluster you are working on.
* No competing with other users for CPU cores and RAM, and no Arbiter2 process throttling.

### Disadvantages:
* Compete with other users for nodes
* Costs AU

* Must compete with other users for a node via the job queue.
* Costs your allocation AU.

## Launching Your Own Jupyter Server on an HPC System

Both Kestrel and Eagle support running your own Jupyter Notebook server. This is highly recommended over Europa for advanced Jupyter use and heavy computational processing.

External (non-NREL) **Kestrel** users may follow the directions below for Kestrel, but please use `kestrel.nrel.gov` instead of `kestrel.hpc.nrel.gov`.

External (non-NREL) **Eagle** users will no longer be able to use Jupyter in this fashion as of February 2024. If you require Jupyter, please consider transitioning to Kestrel as soon as possible.

## Using a Compute Node to run Jupyter Notebooks

Connect to a login node and request an interactive job using the `salloc` command.

The examples below will start a 2-hour job. Replace `<account>` with the name of your allocation, and adjust the time as needed. Since these are interactive jobs, they will get some priority, especially if they're shorter, so only book as much time as you will be actively working on the notebook.

Before you get started, we recommend installing your own Jupyter inside of a conda environment. The default conda/anaconda3 modules contain basic Jupyter Notebook servers, but you will likely want your own Python libraries, notebook extensions, and other features. Basic directions are included later in this document.

### Kestrel:

`[user@laptop:~]$ ssh kestrel.hpc.nrel.gov`

`[user@kl1:~]$ salloc -A <account> -t 02:00:00`

### Eagle:

#### ...Is more than one node possible?
`[user@laptop:~]$ ssh eagle.hpc.nrel.gov`

Yes... please see our advanced Jupyter documentation.
`[user@el1:~]$ salloc -A <account> -t 02:00:00`
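The `-t` value in the commands above is a walltime in `HH:MM:SS` form. As a minimal sketch, the helper below builds the same `salloc` argument list from a job length in minutes. The function name `salloc_command` and the account name `myaccount` are hypothetical and are not part of the NREL tooling.

```python
def salloc_command(account, minutes):
    """Build an salloc argument list for an interactive job of the given length."""
    if minutes <= 0:
        raise ValueError("minutes must be positive")
    hours, mins = divmod(minutes, 60)
    # Slurm accepts walltime as hours:minutes:seconds.
    walltime = f"{hours:02d}:{mins:02d}:00"
    return ["salloc", "-A", account, "-t", walltime]


# "myaccount" is a placeholder; use the name of your own allocation.
print(" ".join(salloc_command("myaccount", 120)))
```

Run on a login node, the printed line matches the 2-hour request shown above and can be pasted into the terminal.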

## Using a Compute Node - Hard Way

### Start on a login node:
## Starting Jupyter Inside the Job

`ssh eagle.hpc.nrel.gov`
Once the job starts and you are allocated a compute node, load the appropriate modules, activate your Jupyter environment, and launch the Jupyter server.

`[user@el1:]$ srun -A <account> -t 02:00:00 --pty /bin/bash`
#### Kestrel:

`[user@x1000c0s0b0n1:~]$ module load anaconda3`

`[user@x1000c0s0b0n1:~]$ source activate myjupenv`

`[user@x1000c0s0b0n1:~]$ jupyter-notebook --no-browser --ip=$(hostname -s)`

Take note of the node name that your job is assigned. (x1000c0s0b0n1 in the above example.)

### When the job starts on the compute node:
Also note the url that Jupyter displays when starting up, e.g. `http://127.0.0.1:8888/?token=<alphabet soup>`.

`[user@r2i7n35]$ module load conda`
The `<alphabet soup>` is a long string of letters and numbers. This is a unique authorization token for your Jupyter session. You will need it, along with the full URL, for a later step.

#### Eagle:

`[user@r2i7n35:~]$ module load conda`

`source activate myjupenv`

`jupyter-notebook --no-browser --ip=$(hostname -s)`

note the node name (r2i7n35 in this example)
Take note of the node name that your job is assigned. (r2i7n35 in this example.)

and the url, e.g. `http://127.0.0.1:8888/?token=<alphabet soup>`
Also note the url that Jupyter displays when starting up, e.g. `http://127.0.0.1:8888/?token=<alphabet soup>`.

### In a terminal on your computer:
The `<alphabet soup>` is a long string of letters and numbers. This is a unique authorization token for your Jupyter session. You will need it, along with the full URL, for a later step.

`[user@laptop]$ ssh -N -L 8888:<nodename>:8888 <username>@eagle.hpc.nrel.gov`

copy full url from jupyter startup into your web browser. e.g.:
### On Your Own Computer

Next, open an SSH tunnel through a login node to the compute node. Log in when prompted using your regular HPC credentials, and put this terminal to the side or minimize it, but leave it open until you are done working with Jupyter for this session.

#### Kestrel:

`[user@laptop:~]$ ssh -N -L 8888:<nodename>:8888 <username>@kestrel.hpc.nrel.gov`


#### Eagle:

`[user@laptop:~]$ ssh -N -L 8888:<nodename>:8888 <username>@eagle.hpc.nrel.gov`
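The tunnel command has the form `ssh -N -L <local port>:<nodename>:<remote port> <user>@<login host>`. The sketch below assembles it from the node name noted earlier. It assumes the same port is used on both ends, as in the examples above. The function name `tunnel_command` and the user name `username` are placeholders for illustration only.

```python
import shlex


def tunnel_command(node, login_host, user, port=8888):
    """Return the ssh argument list that forwards a local port to a compute node."""
    # -N: do not run a remote command; -L: forward local port -> node:port.
    return ["ssh", "-N", "-L", f"{port}:{node}:{port}", f"{user}@{login_host}"]


# Node name taken from the Kestrel example above; "username" is a placeholder.
cmd = tunnel_command("x1000c0s0b0n1", "kestrel.hpc.nrel.gov", "username")
print(shlex.join(cmd))
```

The printed command can be pasted into a terminal on your own computer. Passing the list to `subprocess.run(cmd)` would also work, but it blocks until the tunnel is closed.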


### Open a Web Browser

Copy the full URL, including the token, from the Jupyter startup output into your web browser. For example:

`http://127.0.0.1:8888/?token=<alphabet soup>`
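If port 8888 is already in use on the compute node, Jupyter starts on the next free port, so the port in the startup URL may not be 8888, and the tunnel must forward whichever port Jupyter reports. The following sketch pulls the port and token out of a startup URL and rewrites the host to `127.0.0.1` for use through the tunnel. The function name `localize_jupyter_url` and the token `abc123` are made up for illustration.

```python
from urllib.parse import parse_qs, urlsplit, urlunsplit


def localize_jupyter_url(url, local_port=None):
    """Return (remote_port, token, url_for_local_browser) for a Jupyter startup URL."""
    parts = urlsplit(url)
    remote_port = parts.port or 8888  # fall back to 8888 if the URL has no port
    token = parse_qs(parts.query).get("token", [""])[0]
    port = local_port if local_port is not None else remote_port
    local_url = urlunsplit(
        (parts.scheme, f"127.0.0.1:{port}", parts.path, parts.query, "")
    )
    return remote_port, token, local_url


# Hypothetical startup URL: Jupyter fell back to port 8889 on node r2i7n35.
remote_port, token, local_url = localize_jupyter_url("http://r2i7n35:8889/?token=abc123")
print(remote_port, local_url)
```

Here the tunnel would need `-L 8889:r2i7n35:8889`, and the rewritten URL is the one to open in the browser.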


## Using a Compute Node - Easy Way
## Using a Compute Node - The Easy Way

Automation makes life better!
Scripted assistance with launching a Jupyter session on Eagle or Kestrel is available.

### pyeagle - NREL Users

### Internal NREL Users Only: pyeagle

The [pyeagle](https://github.nrel.gov/MBAP/pyeagle) package is available for internal users to handle launching and monitoring a Jupyter server on a compute node. This package is maintained by an NREL HPC user group, and provides utilities for working on Eagle and Kestrel.

### Auto-launching with an sbatch script - All Users
### Auto-launching on Eagle With an sbatch Script

These scripts are designed for Eagle and may not yet be adapted for Kestrel, but may be downloaded and adapted manually.

Full directions included in the [Jupyter repo](https://github.com/NREL/HPC/tree/master/general/Jupyterhub/jupyter).

@@ -177,75 +233,72 @@ Run [auto_launch_jupyter.sh](https://github.com/NREL/HPC/blob/master/general/Jup

That's it!

## Using a Login Node


Yes, you can run jupyter directly on a login node.
## Reasons to Not Run Jupyter Directly on a Login Node

* Data processing and visualization should be done via Europa or compute nodes.
* Uses a highly shared resource (login nodes): there will be competition for CPU, RAM, and network I/O for storage. Arbiter2 software will automatically throttle moderate to heavy usage on login nodes, greatly slowing down processing.

## Custom Conda Environments and Jupyter Kernels

Should you run jupyter directly on a login node?
On Kestrel, the module 'anaconda3' is available to run the conda command and manage your environments.

Alternatively, the module 'mamba' is available. Mamba is a conda-compatible environment manager with very similar usage, so most conda commands in this documentation also work with mamba.

On Eagle, the module 'conda' contains the conda command. The Eagle conda module also contains mamba installed as a conda package.

### Reasons to Not Run Jupyter Directly on a Login Node
* Heavy lifting should be done via Europa or compute nodes
* Using a highly shared resource (login nodes)
* Competition for cycles
* arbiter2 will throttle moderate to heavy usage
### Creating a Conda Environment

## Custom Conda Environments and Jupyter Kernels
To add your own packages to conda on Kestrel or Eagle:

### Creating a conda environment:
Create an environment and install the base jupyter packages. Then activate the environment and install other libraries that you want to use, e.g. scipy, numpy, and so on.

`conda create -n myjupyter -c conda-forge jupyter ipykernel`

`source activate myjupyter`

`conda install -c conda-forge scipy numpy matplotlib`
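Once the environment is active, a quick check from a notebook cell or a Python prompt can confirm that the libraries you expect are importable. The sketch below uses only the standard library. The helper name `missing_packages` is hypothetical, and the package list simply mirrors the install commands above.

```python
import importlib.util


def missing_packages(names):
    """Return the top-level package names that cannot be imported here."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# Packages installed by the conda commands above.
wanted = ["ipykernel", "scipy", "numpy", "matplotlib"]
print("Missing from this environment:", missing_packages(wanted) or "none")
```

Any name it reports as missing can be added with another `conda install` in the same environment.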

### Add a Custom ipykernel


### Add custom ipykernel
A kernel is what allows Jupyter to use your customized conda environment in a notebook. Use ipykernel to build your kernel. Inside your custom conda environment, run:

`python -m ipykernel install --user --name=myjupyter`

Restart your jupyter server
If you already have a Jupyter server running, restart it to load the new kernel.

The new kernel will appear in the drop-down as an option to open a new notebook.

New kernel will appear in drop-down as an option
You can have multiple kernels, allowing you to load different conda environments for your different projects into Notebooks.
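For reference, a kernel is registered as a small directory containing a `kernel.json` file that tells Jupyter which Python interpreter to launch. On Linux, kernels installed with `--user` typically live under `~/.local/share/jupyter/kernels/<name>/`. The sketch below writes a comparable file into a scratch directory to show the general shape. It illustrates the file format and is not a replacement for `python -m ipykernel install`. The helper name `write_kernel_spec` and the display name are assumptions.

```python
import json
import sys
import tempfile
from pathlib import Path


def write_kernel_spec(kernels_dir, name, display_name, python=sys.executable):
    """Write a minimal kernel.json for a Python kernel and return its path."""
    spec = {
        # Command Jupyter runs to start the kernel; {connection_file} is filled in by Jupyter.
        "argv": [python, "-m", "ipykernel_launcher", "-f", "{connection_file}"],
        "display_name": display_name,
        "language": "python",
    }
    spec_dir = Path(kernels_dir) / name
    spec_dir.mkdir(parents=True, exist_ok=True)
    path = spec_dir / "kernel.json"
    path.write_text(json.dumps(spec, indent=1))
    return path


# Write into a temporary directory so nothing in your real Jupyter setup is touched.
with tempfile.TemporaryDirectory() as scratch:
    path = write_kernel_spec(scratch, "myjupyter", "Python (myjupyter)")
    spec = json.loads(path.read_text())
    print(spec["display_name"], spec["argv"][1:3])
```

Because each kernel points at a specific interpreter, one Jupyter server can offer several conda environments side by side.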

### Remove custom ipykernel
### Jupyter Kernel Management

Use the kernelspec list command inside your Jupyter conda environment to see what ipykernels you have installed:

`jupyter kernelspec list`

`jupyter kernelspec remove myoldjupyter`
To remove an old kernel, use the kernelspec remove command:

`jupyter kernelspec remove myoldjupyter`

## Magic commands

## Magic Commands

Magic commands are "meta commands" that add extra functionality.
Magic commands are "meta commands" that add extra functionality to Jupyter.

Magic commands begin with % or %%.

### A Few Useful Examples
### Example Magic Commands

* %lsmagic - list all magic commands
* %run _file.py_ - run an external python script
* %%time - placed at top of cell, prints execution time
* %who - list all defined variables in notebook




```
%lsmagic
```




Available line magics:
%alias %alias_magic %autoawait %autocall %automagic %autosave %bookmark %cat %cd %clear %colors %conda %config %connect_info %cp %debug %dhist %dirs %doctest_mode %ed %edit %env %gui %hist %history %killbgscripts %ldir %less %lf %lk %ll %load %load_ext %loadpy %logoff %logon %logstart %logstate %logstop %ls %lsmagic %lx %macro %magic %man %matplotlib %mkdir %more %mv %notebook %page %pastebin %pdb %pdef %pdoc %pfile %pinfo %pinfo2 %pip %popd %pprint %precision %prun %psearch %psource %pushd %pwd %pycat %pylab %qtconsole %quickref %recall %rehashx %reload_ext %rep %rerun %reset %reset_selective %rm %rmdir %run %save %sc %set_env %store %sx %system %tb %time %timeit %unalias %unload_ext %who %who_ls %whos %xdel %xmode

@@ -254,8 +307,6 @@ Magic commands begin with % or %%.

Automagic is ON, % prefix IS NOT needed for line magics.



## Shell Commands

You can also run shell commands inside a cell. For example:
@@ -268,14 +319,14 @@ You can also run shell commands inside a cell. For example:
!ls
```

/home/tthatche/jup
/home/username/jup
auto_launch_jupyter.sh Jupyter Presentation.ipynb slurm-6445885.out
geojsondemo.ipynb old sshot1.png
Interesting Graphs.ipynb sbatch_jupyter.sh sshot2.png
jup-logo.png slurm


## Interesting/Useful Notebooks
## Interesting/Useful Notebooks, Extensions, and Learning Resources

[Awesome Jupyter](https://github.com/markusschanta/awesome-jupyter)
