Using Julia

Julia is an open-source programming language designed for high-performance scientific and numerical computing.

Using Julia on CARC systems

Begin by logging in. You can find instructions for this in the Getting Started with Discovery or Getting Started with Endeavour user guides.

You can use Julia in either interactive or batch modes. In either mode, first load the corresponding software module:

module load julia

This loads the default version, currently 1.6.1, and is equivalent to module load julia/1.6.1. If you require a different version, specify the version of Julia when loading. For example:

module load julia/1.5.2

To see all available versions of Julia, enter:

module spider julia

The Julia modules depend on the gcc/8.3.0 module, which is loaded by default when logging in. This module needs to be loaded first because Julia was built with the GCC 8.3.0 compiler.

If it is not already loaded, load the gcc module before loading a julia module:

module purge
module load gcc/8.3.0
module load julia/1.6.1

Alternatively, enter module load usc and then load a julia module.

Installing a different version of Julia

If you require a different version of Julia that is not currently installed on CARC systems, please submit a help ticket and we will install it for you.

Alternatively, you can install a different version of Julia inside your home or project directory from official binaries. The following steps show how to do this using Julia version 1.6.1 as an example.

Find the binary file for the version of Julia that you want on the official Julia downloads page (one of the "Generic Linux Binaries for x86" 64-bit files) and copy the link to the file. Then download the file into your home directory using wget:

wget https://julialang-s3.julialang.org/bin/linux/x64/1.6/julia-1.6.1-linux-x86_64.tar.gz 

After the file is downloaded, unpack it by entering:

tar -xf julia-1.6.1-linux-x86_64.tar.gz

You can then start using this version of Julia by entering the path to the binary file:

$HOME/julia-1.6.1/bin/julia

Alternatively, you can add the directory to your PATH variable:

export PATH=$HOME/julia-1.6.1/bin:$PATH

and then simply enter julia. You can add this export line to your ~/.bashrc to automatically set it every time you log in.

Running Julia in interactive mode

After loading the module, enter julia to start a new interactive Julia session on the login node. Julia sessions on a login node should be reserved for light tasks like installing packages. In contrast, running Julia interactively on a compute node is useful for more intensive work like exploring data, testing models, and debugging.

A common mistake for new users of HPC clusters is to run heavy workloads directly on a login node (e.g., discovery.usc.edu or endeavour.usc.edu). Unless you are only running a small test, please make sure to run your program as a job interactively on a compute node. Processes left running on login nodes may be terminated without warning. For more information on jobs, see our Running Jobs user guide.

To run Julia interactively on a compute node, first use Slurm's salloc command to reserve job resources on a node:

user@discovery1:~$ salloc --time=1:00:00 --ntasks=1 --cpus-per-task=8 --mem=16GB --account=<project_id>
salloc: Pending job allocation 24737
salloc: job 24737 queued and waiting for resources
salloc: job 24737 has been allocated resources
salloc: Granted job allocation 24737
salloc: Waiting for resource configuration
salloc: Nodes d05-04 are ready for job

Make sure to change the resource requests (the --time=1:00:00 --ntasks=1 --cpus-per-task=8 --mem=16GB --account=<project_id> part after your salloc command) as needed, such as the number of cores and memory required. Also make sure to substitute your project ID, which is of the form <PI_username>_<id>. You can find your project ID in the CARC User Portal.

Once you are granted the resources and logged in to a compute node, load the modules and then enter julia:

user@d05-04:~$ module load gcc/8.3.0 julia/1.6.1
user@d05-04:~$ julia
               _
   _       _ _(_)_     |  Documentation: https://docs.julialang.org
  (_)     | (_) (_)    |
   _ _   _| |_  __ _   |  Type "?" for help, "]?" for Pkg help.
  | | | | | | |/ _` |  |
  | | |_| | | | (_| |  |  Version 1.6.1 (2021-04-23)
 _/ |\__'_|_|_|\__'_|  |
|__/                   |

julia>

Notice that the shell prompt changes from user@discovery1 to user@<nodename> to indicate that you are now on a compute node (e.g., d05-04).

To run Julia scripts from within Julia, use the include() function. Alternatively, to run Julia scripts from the shell, use the julia command.
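
For example, for a script named script.jl in your current working directory, either of the following runs it (from the Julia prompt or from the shell, respectively):

julia> include("script.jl")

user@d05-04:~$ julia script.jl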

To exit the node and relinquish the job resources, enter exit() in Julia and then enter exit in the shell. This will return you to the login node:

julia> exit()
user@d05-04:~$ exit
exit
salloc: Relinquishing job allocation 24737
user@discovery1:~$

Please note that compute nodes do not have access to the internet, so any data downloads or package installations should be completed on the login or transfer nodes, either before the interactive job or concurrently in a separate shell session.

Running Julia in batch mode

In order to submit jobs to the Slurm job scheduler, you will need to use Julia in batch mode. There are a few steps to follow:

  1. Create a Julia script
  2. Create a Slurm job script that runs the Julia script
  3. Submit the job script to the job scheduler using sbatch

Your Julia script should consist of the sequence of Julia commands needed for your analysis. The julia command, available after a Julia module has been loaded, runs Julia scripts, and it can be used in the shell and in Slurm job scripts.
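
As a minimal sketch, a hypothetical script.jl might look like the following (the file name and the computation are placeholders for your own analysis):

# script.jl: hypothetical example of a Julia analysis script
# Report the number of threads this session was started with
println("Running with ", Threads.nthreads(), " thread(s)")

# Placeholder computation; replace with your own analysis
x = randn(1_000_000)
println("Sample mean: ", sum(x) / length(x))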

A Slurm job script is a special type of Bash shell script that the Slurm job scheduler recognizes as a job. For a job running Julia, a Slurm job script should look something like the following:

#!/bin/bash

#SBATCH --nodes=1
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=8
#SBATCH --mem=16GB
#SBATCH --time=1:00:00
#SBATCH --account=<project_id>

module purge
module load gcc/8.3.0
module load julia/1.6.1

julia --threads $SLURM_CPUS_PER_TASK script.jl

Each line is described below:

Command or Slurm argument                        Meaning
#!/bin/bash                                      Use Bash to execute this script
#SBATCH                                          Syntax that allows Slurm to read your requests (ignored by Bash)
--nodes=1                                        Use 1 compute node
--ntasks=1                                       Run 1 task (e.g., running a Julia script)
--cpus-per-task=8                                Reserve 8 CPUs for your exclusive use
--mem=16GB                                       Reserve 16 GB of memory for your exclusive use
--time=1:00:00                                   Reserve resources described for 1 hour
--account=<project_id>                           Charge compute time to <project_id>. You can find your project ID in the CARC User Portal
module purge                                     Clear environment modules
module load gcc/8.3.0                            Load the gcc compiler environment module
module load julia/1.6.1                          Load the julia environment module
julia --threads $SLURM_CPUS_PER_TASK script.jl   Use julia to run script.jl

Make sure to adjust the resources requested based on your needs, but remember that requesting fewer resources typically leads to less queue time for your job. Note that to fully utilize the resources, especially the number of cores, you may need to explicitly change your Julia code to do so (see the section on parallel programming below).

You can develop Julia scripts and job scripts on your local machine and then transfer them to the cluster, or you can use one of the available text editor modules (e.g., micro) to develop them directly on the cluster.

Save the job script as jl.job, for example, and then submit it to the job scheduler with Slurm's sbatch command:

user@discovery1:~$ sbatch jl.job
Submitted batch job 13587

To check the status of your job, enter squeue --me. For example:

user@discovery1:~$ squeue --me
         JOBID PARTITION     NAME     USER     ST    TIME  NODES NODELIST(REASON)
        170552      main   jl.job     user      R    1:01      1 d05-04

If there is no job status listed, then this means the job has completed.

The results of the job will be logged and, by default, saved to a file of the form slurm-<jobid>.out in the same directory where the job script is located. To view the contents of this file, enter less slurm-<jobid>.out, and then enter q to exit the viewer.

For more information on job status and running jobs, see the Running Jobs user guide.

Installing Julia packages

To install Julia packages, start an interactive Julia session as explained above and press the ] key to switch to package mode. You do not need to press Enter; the prompt will change immediately to indicate that you are now in package mode:

(@v1.6) pkg>

By default, packages will be installed to ~/.julia/packages.

To install a registered package, use the add command together with the package name:

(@v1.6) pkg> add DataFrames

To install unregistered or development versions of packages, such as from GitHub or GitLab, use the URL path to the Git repository:

(@v1.6) pkg> add https://github.com/JuliaData/DataFrames.jl

To exit package mode, press the Backspace key on an empty line. The prompt will return to the standard Julia prompt (julia>).

You can also install packages to other locations, such as for use in a shared group or project library. You will need to change the relevant environment variable in the shell before starting Julia:

export JULIA_DEPOT_PATH=</path/to/dir>

This changes the Julia depot location to the specified directory, and then packages will be installed to and loaded from a </path/to/dir>/packages directory. After exporting this variable, you can simply start Julia and install and load packages like normal. Note that this line needs to be added to Slurm job scripts in order to load packages from that location.

To clear this environment variable and return to the default depot location in your home directory, enter unset JULIA_DEPOT_PATH in the shell.

Also consider using Julia environments to create reproducible, project-specific package environments.
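
For example, a minimal sketch of creating and activating a project-specific environment in package mode (the directory path here is hypothetical):

(@v1.6) pkg> activate /path/to/my_project
(my_project) pkg> add DataFrames

To use this environment in a script or job script, start Julia with the --project option (e.g., julia --project=/path/to/my_project script.jl).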

Please note that when installing and using a package for the first time, the package is compiled and this may take some time. After this first time, the package will be quick to load and use.

Parallel programming with Julia

Julia uses only one thread by default, but it also supports both implicit and explicit parallel programming to enable full use of multi-core processors and compute nodes. This also includes the use of shared memory on a single node or distributed memory on multiple nodes. On CARC systems, 1 thread = 1 core = 1 logical CPU (requested with Slurm's --cpus-per-task option).

Parallelizing your code to use multiple cores or nodes can reduce the runtime of your Julia jobs, but the speedup does not necessarily increase in a proportional manner. The speedup depends on the scale and types of computations that are involved. Furthermore, sometimes using a single core is optimal. There is a cost to setting up parallel computation (e.g., modifying code, communications overhead, etc.), and that cost may be greater than the achieved speedup, if any, of the parallelized version of the code. Some experimentation will be needed to optimize your code and resource requests (optimal number of cores and amount of memory). Also keep in mind that your project account will be charged CPU-minutes based on the cores reserved for a job, even if all those cores are not actually used during the job.

Implicit parallelism

Implicit parallelism is based on multi-threading, so that you do not need to explicitly call for parallel computation in your Julia code. To set the number of threads to use, start Julia with the --threads option (e.g., julia --threads 8) or set the environment variable JULIA_NUM_THREADS (e.g., export JULIA_NUM_THREADS=8) before starting Julia. In job scripts, you can also use the SLURM_CPUS_PER_TASK variable to set threads (e.g., julia --threads $SLURM_CPUS_PER_TASK). Multi-threaded Julia packages and functions will automatically detect and use the number of threads that are set when Julia is started.
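
For example, a quick way to confirm how many threads a session has (assuming Julia was started with --threads 8 or with JULIA_NUM_THREADS=8 set):

julia> Threads.nthreads()
8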

Explicit parallelism

Explicit parallelism means explicitly calling for parallel computation in your Julia code, either in relatively simple ways or potentially in more complex ways depending on the tasks to be performed. Many Julia packages exist for explicit parallelism, designed for different types of tasks that can be parallelized.

The main Julia packages for explicit parallelism are summarized in the following table:

Package                 Purpose
Base.Threads            For explicit multi-threading
Distributed             For explicit multi-processing
MPI.jl                  For interfacing to MPI libraries
DistributedArrays.jl    For working with distributed arrays
Elemental.jl            For distributed linear algebra
ClusterManagers.jl      For launching jobs via cluster job schedulers (e.g., Slurm)
Dagger.jl               For asynchronous evaluations and workflows

Please review each package's documentation for examples and more information about how to use these packages and their functions.
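
As a brief illustration (a minimal sketch, not tied to any single package's documentation), the following uses Base.Threads for a multi-threaded loop and Distributed for a simple multi-process map; the function, array sizes, and worker count are placeholders:

using Base.Threads, Distributed

# Multi-threaded loop: iterations are split across the threads Julia was started with
results = zeros(100)
@threads for i in 1:100
    results[i] = sum(randn(10_000))
end

# Multi-process map: add worker processes on the same node,
# then apply a function to each input in parallel
addprocs(4)
@everywhere f(x) = x^2
squares = pmap(f, 1:10)

Make sure that the number of threads or worker processes does not exceed the number of CPUs reserved for your job.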

Additional resources

If you have questions about or need help with Julia, please submit a help ticket and we will assist you.

Julia language
Julia documentation
Julia cheat sheet
Julia performance tips
JuliaHub
JuliaParallel
JuliaStats
JuliaGPU
Flux
JuMP
JuliaGeo

Tutorials:

Julia Learning
JuliaAcademy

Web books:

Think Julia: How to Think Like a Computer Scientist
