Workspace Maker

Overview

Workspace Maker is a small utility that manages access to scratch storage on Kamiak. To use scratch storage, you first create a “workspace” to hold your data. A workspace is simply a directory that expires and is then removed automatically. The maximum lifetime of a scratch workspace on Kamiak is 2 weeks, so scratch storage is not intended for permanent data; consider project space for that use.

To use workspaces you’ll need to know three commands: mkworkspace, lsworkspace, and rmworkspace. Only the first is strictly required, since a workspace is removed automatically once it expires. Even so, we recommend removing workspaces as soon as they are no longer needed rather than letting them expire, so that the system can promptly release the storage space for others to use.

The help output for each command is shown below.

mkworkspace: Create a workspace on the scratch storage of Kamiak
$ mkworkspace --help
Usage: mkworkspace [options]

Options:
  -h, --help            show this help message and exit
  -v, --verbose         Enable verbose mode
  -n WORKSPACE_NAME, --name=WORKSPACE_NAME
                        The name of the workspace to be created (defaults to
                        user name + random number)
  -t TIMELIMIT, --timelimit=TIMELIMIT
                        Time limit for the workspace to exist before its
                        automatic removal.  Syntax is DAYS-HOURS:MINUTES.
                        Default: 14-00:00, Max: 14-00:00
  -b BACKEND, --backend=BACKEND
                        Storage backend to use for the workspace.  Supported
                        backends: '/scratch' (default), '/local' (compute
                        nodes only; workspace will only be available on the
                        node upon which it was created)
  -q, --quiet           Output only the workspace created (useful in scripts)

lsworkspace: List your workspaces
$ lsworkspace --help
Usage: lsworkspace [options]

Options:
  -h, --help            show this help message and exit
  -v, --verbose         Enable verbose mode
  -b BACKEND, --backend=BACKEND
                        Storage backend to use for the workspace.  Supported
                        backends: '/scratch' (default), '/local' (compute
                        nodes only)

rmworkspace: Remove workspaces
$ rmworkspace --help
Usage: rmworkspace [options]

Options:
  -h, --help            show this help message and exit
  -n WORKSPACE_NAME, --name=WORKSPACE_NAME
                        The name of an individual workspace to remove
  -b BACKEND, --backend=BACKEND
                        Storage backend to use for the workspace.  Supported
                        backends: '/scratch' (default), '/local' (compute
                        nodes only)
  -a, --autoremove      Remove workspace(s) without prompting for confirmation
  -f, --force           Remove a workspace even if it is not expired (only
                        valid with --name)
  -v, --verbose         Enable verbose mode
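
The --timelimit syntax (DAYS-HOURS:MINUTES, at most 14-00:00) is easy to get wrong in scripts. The following is a minimal sketch of checking a value before passing it to mkworkspace. The helper names timelimit_to_minutes and timelimit_is_allowed and the workspace name my_run are hypothetical and are not part of Workspace Maker.

```shell
#!/bin/bash
# Convert a DAYS-HOURS:MINUTES time limit (the syntax shown in
# mkworkspace --help) into a total number of minutes.
# Hypothetical helper, not part of Workspace Maker.
timelimit_to_minutes() {
    local limit="$1"
    local re='^([0-9]+)-([0-9]{1,2}):([0-9]{1,2})$'
    # Reject anything that does not look like DAYS-HOURS:MINUTES.
    if [[ ! "$limit" =~ $re ]]; then
        echo "invalid time limit: $limit" >&2
        return 1
    fi
    # Force base 10 so values such as "08" are not read as octal.
    local days=$((10#${BASH_REMATCH[1]}))
    local hours=$((10#${BASH_REMATCH[2]}))
    local minutes=$((10#${BASH_REMATCH[3]}))
    echo $(( (days * 24 + hours) * 60 + minutes ))
}

# Succeed only if the limit is within the documented 14-00:00 maximum.
timelimit_is_allowed() {
    local minutes max
    minutes="$(timelimit_to_minutes "$1")" || return 1
    max="$(timelimit_to_minutes 14-00:00)"
    (( minutes <= max ))
}

# Example use on Kamiak (not executed here; "my_run" is a made-up name):
#   timelimit_is_allowed 2-12:00 && mkworkspace --name=my_run --timelimit=2-12:00
```

mkworkspace presumably enforces the maximum itself; a check like this only lets a script fail early with its own message.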

Creating and Using Workspaces

Let’s look at an example of creating, using, and removing a workspace interactively:

$ mkworkspace 
Successfully created workspace.  Details:
    Workspace: /scratch/my.NID_616253
    User: my.NID
    Group: its_p_sys_ur_kam-its
    Expiration: 2017-08-20 16:21:30

Note the expiration date. Shortly after that time, the workspace and all of the data within it will be deleted. Now that the workspace has been created, we can use it and then delete it when we’re done.

$ cd /scratch/my.NID_616253

[my.NID_616253]$ touch file.txt

[my.NID_616253]$ cd

$ lsworkspace
Workspace: /scratch/my.NID_616253
    Creation host: login-p1n01
    Creation time: 2017-08-06 16:21:30
    User owner: my.NID
    Group owner: its_p_sys_ur_kam-its
    Expiration time: 2017-08-20 16:21:30

$ rmworkspace -n my.NID_616253 --force
Remove workspace '/scratch/my.NID_616253' (expired 2017-08-20 16:21:30)?  y or n: y
Removing workspace /scratch/my.NID_616253
rmworkspace completed, total removed workspaces: 1
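
For scripting, the lsworkspace listing can be reduced to one line per workspace. The sketch below assumes the output keeps the layout of the sample above; the function name summarize_workspaces is hypothetical, and the sample text is embedded so the snippet runs without access to Kamiak.

```shell
#!/bin/bash
# Print "<workspace path> <expiration date> <expiration time>" for each
# workspace found in lsworkspace-style text read from standard input.
# Assumes the layout shown in the sample output above.
summarize_workspaces() {
    awk '
        $1 == "Workspace:"                  { path = $2 }
        $1 == "Expiration" && $2 == "time:" { print path, $3, $4 }
    '
}

# Sample text abridged from the lsworkspace output above.
sample_output='Workspace: /scratch/my.NID_616253
    Creation host: login-p1n01
    User owner: my.NID
    Group owner: its_p_sys_ur_kam-its
    Expiration time: 2017-08-20 16:21:30'

summarize_workspaces <<< "$sample_output"
# prints: /scratch/my.NID_616253 2017-08-20 16:21:30

# On Kamiak the same function could read the real command (not executed here):
#   lsworkspace | summarize_workspaces
```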

A workspace is a directory and you use it as you would any other. Let’s use a workspace in a batch job script:

#!/bin/bash
#SBATCH --time=0-00:10:00     ### Wall clock time limit in Days-HH:MM:SS
#SBATCH --ntasks-per-node=1   ### Number of tasks to be launched per Node

my_workspace="$(mkworkspace --quiet)"

echo "My workspace is: $my_workspace"

cd "$my_workspace"

echo 'Hello!' > file.txt

This job simply creates a workspace, writes the text “Hello!” to a file in the workspace, and ends. Let’s extend it slightly to use a Local Scratch workspace by passing the --backend option to mkworkspace (see mkworkspace --help for all options):

#!/bin/bash
#SBATCH --time=0-00:10:00     ### Wall clock time limit in Days-HH:MM:SS
#SBATCH --ntasks-per-node=1   ### Number of tasks to be launched per Node

# --quiet makes mkworkspace print only the workspace, so it can be captured
my_workspace="$(mkworkspace --backend=/local --quiet)"

echo "My workspace is: $my_workspace"

cd "$my_workspace"

echo 'Hello!' > file.txt
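
If mkworkspace fails, the captured variable may be empty and the job would carry on in the wrong directory. The following is a minimal sketch of a guard that a job script could use before changing directory. The helper name enter_workspace is hypothetical, and a temporary directory stands in for a real workspace so the snippet can run anywhere.

```shell
#!/bin/bash
# Change into a workspace only if the given path is a non-empty string that
# names an existing directory. Hypothetical helper, not part of Workspace Maker.
enter_workspace() {
    local workspace="$1"
    if [[ -z "$workspace" || ! -d "$workspace" ]]; then
        echo "workspace is missing or not a directory: '$workspace'" >&2
        return 1
    fi
    cd "$workspace"
}

# In a job script on Kamiak this might be used as (not executed here):
#   my_workspace="$(mkworkspace --quiet)"
#   enter_workspace "$my_workspace" || exit 1

# Stand-in demonstration: a temporary directory plays the role of the
# workspace. The subshell keeps the caller's working directory unchanged.
demo_dir="$(mktemp -d)"
( enter_workspace "$demo_dir" && echo 'Hello!' > file.txt )
cat "$demo_dir/file.txt"
rm -rf "$demo_dir"
```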

Workspaces in Compute Jobs

Let’s go through an example workflow of how we can utilize workspaces within compute jobs on Kamiak. Say we want to have a compute job:

  1. Create a workspace
  2. Run a computation using that workspace
  3. Write/copy final results to permanent storage, such as project space
  4. Remove the workspace when the job ends or is canceled (e.g. due to preemption)

We can use features of Workspace Maker and Kamiak’s scheduler, Slurm, to do exactly that.

#!/bin/bash
#SBATCH -n 1 # Number of cores
#SBATCH -t 0-00:10 # Runtime in D-HH:MM

my_workspace="$(mkworkspace --backend=/local --quiet)"

function clean_up {
    # Clean up. Remove temporary workspaces and the like.
    rmworkspace --autoremove --force --name="$my_workspace"

    exit
}

# Call our clean_up function when the script exits. This should still run
# even if Slurm cancels the job.
trap 'clean_up' EXIT

echo "My current workspace is $my_workspace"

# Work happens here ...

In this job script we’re creating a workspace on Local Scratch storage. We’re also using the shell’s “trap” command to remove the workspace when the job ends or is canceled. If that isn’t desired and you need to keep the workspace for further use, simply remove the function definition and the “trap” command.
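
The job script above leaves step 3 of the workflow (copying final results to permanent storage) to the reader. The following is a minimal sketch of all four steps together. It is an illustration under stated assumptions: run_job is a hypothetical name, temporary directories stand in for the workspace and for project space, and rm -rf stands in for rmworkspace so the snippet can run outside Kamiak.

```shell
#!/bin/bash
# Run the workflow with the workspace and the results destination passed in
# as directories. The body runs in a subshell so that the EXIT trap fires as
# soon as the work is finished. run_job is a hypothetical name.
run_job() {
    local workspace="$1" results_dir="$2"
    (
        # Step 4: remove the workspace whenever this subshell exits.
        # On Kamiak this could call rmworkspace as in the job script above.
        trap 'rm -rf "$workspace"' EXIT

        cd "$workspace" || exit 1

        # Step 2: the computation; here it only writes a small result file.
        echo 'Hello!' > result.txt

        # Step 3: copy final results to permanent storage before cleanup.
        cp result.txt "$results_dir/"
    )
}

# Step 1 stand-ins: on Kamiak the first value would come from
# "$(mkworkspace --backend=/local --quiet)" and the second would be a
# directory in your project space.
workspace="$(mktemp -d)"
results_dir="$(mktemp -d)"

run_job "$workspace" "$results_dir"

ls "$results_dir"
[ -d "$workspace" ] || echo "workspace removed"
rm -rf "$results_dir"
```

Copying results inside the job body, rather than in the cleanup function, keeps the cleanup short, which matters when a canceled job has only a brief grace period before it is killed.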

This job script is a simple example of how workspaces can be managed within a compute job. How to use workspaces effectively will depend on your application and your needs. Workspaces can be used as temp directories, as buckets to download working datasets into, or, of course, as fast storage for general computation. Note: Scratch storage is faster than Project storage, and Local Scratch storage is faster than Scratch.
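
One way to use a workspace as a temp directory is to point TMPDIR at it, since mktemp and many other programs place their temporary files under $TMPDIR when it is set. The sketch below uses a temporary directory as a stand-in for the workspace; make_temp_in is a hypothetical helper name.

```shell
#!/bin/bash
# Create a temporary file inside the given directory by setting TMPDIR for
# the mktemp call. Hypothetical helper, not part of Workspace Maker.
make_temp_in() {
    local dir="$1"
    TMPDIR="$dir" mktemp
}

# Stand-in for a workspace; on Kamiak this would be the path captured from
# "$(mkworkspace --quiet)".
scratch_dir="$(mktemp -d)"

temp_file="$(make_temp_in "$scratch_dir")"
echo "temporary file created at: $temp_file"

rm -rf "$scratch_dir"

# In a job script, exporting the variable applies it to every later command:
#   export TMPDIR="$my_workspace"
```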

Have questions or feedback regarding Workspace Maker? Simply submit a service request describing your needs or what you would like to see in future versions.