# Quantum Espresso

Quantum ESPRESSO is an integrated suite of computer codes for electronic-structure calculations and materials modeling at the nanoscale. It is based on density-functional theory, plane waves, and pseudopotentials. Quantum ESPRESSO has evolved into a distribution of independent and inter-operable codes. The Quantum ESPRESSO distribution consists of a core set of components (e.g. PWscf and CP), and a set of plug-ins that perform more advanced tasks, plus a number of third-party packages designed to be inter-operable with the core components.

The program starts with pseudopotentials and calculates a first estimate of the stationary-state wavefunctions at each k-point and band (i.e. each Kohn-Sham state). This is done by diagonalizing the Hamiltonian matrix using highly parallel linear-algebra packages. Some algorithms (e.g. CP) require the calculated stationary-state wavefunctions to be orthonormalized, which is again done using parallel linear-algebra packages. The program then uses 3D FFTs (fast Fourier transforms) to calculate the electron density, the charge density and a new estimate of the potentials. This process is repeated until self-consistency is achieved. For some simulations, e.g. Nudged Elastic Band (neb.x) and Atomic Displacement (ph.x) calculations, once the self-consistent potential is achieved the program calculates forces and stresses on all atoms. It then moves the atoms around to reduce the forces and stresses to zero. For each new atomic position, the self-consistent field calculation is repeated.

## New to Kogence?

1. Why Kogence?: Watch the Video
2. How Kogence Works?: Watch the Video
3. Step-by-step instructions on how to execute your first simulation model on Kogence: How To Use Kogence

## Quantum Espresso Access on Kogence

On Kogence, Quantum Espresso is pre-configured and pre-optimized, and is available to run on a single cloud machine of your choice or on an autoscaling cloud HPC cluster. On Kogence, Quantum Espresso runs in a Docker container. Quantum Espresso is available under Free Individual Plans.

## Versions Available on Kogence

The following versions are deployed, provisioned and performance-optimized on Kogence.

1. Quantum Espresso 6.1

## Parallel Computing in Quantum Espresso

Quantum Espresso supports shared-memory parallelism through multi-threading (OpenMP), distributed-memory parallelism through multi-processing (MPI), as well as hybrid parallelism where users may choose shared-memory parallelism within each node and distributed-memory parallelism between nodes. In order to effectively use all the CPUs in the Kogence autoscaling cloud HPC cluster, you may need to modify the $PARA_PREFIX and $PARA_POSTFIX parameters, either in the input bash shell script if you are invoking Quantum Espresso through a bash shell script, or on the command prompt if you are invoking Quantum Espresso through a shell terminal (see below for more details).

$PARA_PREFIX tells the system whether OpenMP, MPI or hybrid parallelism is desired. On Kogence, by default, Quantum Espresso is set up to use OpenMP parallelism. If you are running simulations on a single node and OpenMP parallelism is desired, you do not need to modify the $PARA_PREFIX parameter. If, on the other hand, MPI parallelism is desired, then $PARA_PREFIX should be modified. For single-node simulations, $PARA_PREFIX would look like mpirun -np 4096, while for a multi-node cluster simulation it would look like qsub -pe mpi 4096 mpirun -np 4096. Please see the Single Node Invocation Options and Autoscaling Cluster Invocation Options sections below for more details.

$PARA_POSTFIX tells the system how many of these available CPUs (i.e. 4096 in the above example) should be used for each portion of the calculation. To control the number of CPUs assigned to each group of calculations, use the following $PARA_POSTFIX switches in your bash shell script or on the shell terminal, depending upon how you are invoking Quantum Espresso on Kogence clusters:

- -nimage (or the shorthand -ni). Available only with neb.x or ph.x. -ni provides a loosely coupled parallelization mechanism to simulate replicas of the same system in parallel. Different images can be run on different nodes even with a poor network, as the images are loosely coupled and rarely interact.
- -npools (or the shorthand -nk). If the simulation in each image (i.e. -ni) consists of multiple k-points (points in the irreducible Brillouin zone of the reciprocal space), then those can be simulated in parallel in -nk pools of CPUs. For example, if your simulation includes 1000 k-points and -nk is set to 4, then 4 pools of multiple CPUs will be created for parallel calculations, and each pool will work on 250 k-points. The number of CPUs available in each pool depends on the settings of the other switches in $PARA_POSTFIX as well as the total number of CPUs made available through the settings of $PARA_PREFIX. k-point parallelism is also loosely coupled, and pools can be located on different nodes with a poor network. CPUs within each pool, however, are tightly coupled and communication between them can be significant. This means that fast communication hardware is needed if your pool extends over more than a few CPUs on different nodes. If that is the case, you should use the Kogence Network Limited Workload nodes that come with an OS-bypass remote direct memory access network such as the Infiniband network.
- -nband (or the shorthand -nb). Each pool (i.e. -nk) can be subpartitioned into "band groups", each taking care of a group of Kohn-Sham orbitals (also called bands). This type of parallelism is especially useful for calculations with hybrid functionals. Depending upon the problem at hand, bands may or may not interact much. If you are running different bands on different nodes of the cluster and you expect your system to be significantly "hybridized", meaning that you expect the bands in your system to interact significantly, then you should use the Kogence Network Limited Workload nodes that come with an OS-bypass remote direct memory access network such as the Infiniband network.
- -ntg (or the shorthand -nt). Each band group can then be partitioned into task groups. Task groups perform the 3D FFT operations to calculate the electron density and the charge density given a set of stationary-state wavefunctions. Tasks are very tightly coupled. We recommend running them on the same node. If that is not possible, please use the Kogence Network Limited Workload nodes that come with an OS-bypass remote direct memory access network such as the Infiniband network.
- -ndiag (or -northo, or the shorthand -nd). This switch specifies a number of CPUs, in contrast to the other switches, which specify numbers of partitions. It creates a linear-algebra group of -nd CPUs from within the band group as and when needed (it does not conflict with the -nt settings) and parallelizes the matrix diagonalizations and matrix-matrix multiplications needed for the iterative diagonalization (SCF) or orthonormalization (CP). CPUs within the ortho group should be very tightly coupled. We recommend running them on the same node. If that is not possible, please use the Kogence Network Limited Workload nodes that come with an OS-bypass remote direct memory access network such as the Infiniband network.

There are two constraints that should be kept in mind while selecting appropriate values for these switches in $PARA_POSTFIX.

- Please note that the product of -ni, -nk, -nb and -nt cannot be more than the maximum total number of CPUs available in your Kogence autoscaling cloud HPC cluster. The maximum number of CPUs is determined by the maximum number of compute nodes that you allow your cluster to scale up to and the number of CPUs in each compute node; both of these choices are made in the Cluster tab of your Model.
- The diagonalization and orthonormalization operations block-partition the matrices into a square 2D array of smaller matrices. As a consequence, the number of CPUs in the linear-algebra group is given by $n_d = m^2$, where $m$ is an integer such that $m^2$ is less than or equal to the number of CPUs in each band group. The diagonalization is then performed in parallel using ScaLAPACK on Kogence clusters.

There are also some constraints on $PARA_PREFIX.

- $OMP_NUM_THREADS (set in your input bash script, see below) cannot be more than the number of CPUs on each compute node. OpenMP provides shared-memory parallelism within each node.
- The product of $OMP_NUM_THREADS (set in your input bash script, see below) and the number of MPI processes (set in $PARA_PREFIX via the mpirun -np switch) cannot be more than the maximum total number of CPUs available in your Kogence autoscaling cloud HPC cluster. As described above, the maximum number of CPUs is determined by the maximum number of compute nodes that you allow your cluster to scale up to and the number of CPUs in each compute node; both of these choices are made in the Cluster tab of your Model.
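As an illustration of these two constraints, the following bash sketch checks a hypothetical configuration. The NODES, CPUS_PER_NODE, OMP_NUM_THREADS and MPI_PROCS values are placeholders standing in for the choices made in the Cluster tab and in $PARA_PREFIX; it is a sanity-check sketch, not part of the Kogence tooling.

```shell
#!/bin/bash
# Hypothetical sanity check for $PARA_PREFIX settings.
NODES=4               # max compute nodes chosen in the Cluster tab (placeholder)
CPUS_PER_NODE=36      # CPUs per compute node (placeholder)
OMP_NUM_THREADS=2     # threads per MPI process
MPI_PROCS=64          # the value passed to mpirun -np

TOTAL_CPUS=$(( NODES * CPUS_PER_NODE ))

# Constraint 1: threads per process must fit on one node.
if (( OMP_NUM_THREADS > CPUS_PER_NODE )); then
    echo "Error: OMP_NUM_THREADS exceeds CPUs per node" >&2
    exit 1
fi

# Constraint 2: threads x MPI processes must fit in the whole cluster.
if (( OMP_NUM_THREADS * MPI_PROCS > TOTAL_CPUS )); then
    echo "Error: OMP_NUM_THREADS * MPI_PROCS exceeds $TOTAL_CPUS total CPUs" >&2
    exit 1
fi

echo "OK: $MPI_PROCS MPI processes x $OMP_NUM_THREADS threads on $TOTAL_CPUS CPUs"
```

Running such a check at the top of your input script fails fast instead of oversubscribing the cluster.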

As an example of the combined effect of $PARA_PREFIX and $PARA_POSTFIX, consider the following command line:

mpirun -np 4096 neb.x -ni 8 -nk 2 -nt 4 -nd 144 -i my.input

This executes a NEB calculation on 4096 CPUs with 8 images (points in the configuration space, in this case) running at the same time, each of which is distributed across 512 processors ($8 \times 512 = 4096$). k-points are distributed across 2 pools of 256 processors each, the 3D FFT is performed using 4 task groups (64 processors each, so the 3D real-space grid is cut into 64 slices), and the diagonalization of the subspace Hamiltonian is distributed to a square grid of 144 processors (12×12).

Default values are -ni 1, -nk 1 and -nt 1. The default value for the -nd switch is 1 if ScaLAPACK is not compiled in; otherwise it is set to the largest square integer smaller than or equal to half the number of processors in each pool.
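The bookkeeping behind these numbers can be sketched in a few lines of bash. This is arithmetic illustration only; the variable names NP, NI and NK simply mirror the -np, -ni and -nk values of the example command above, and the script derives the per-image and per-pool CPU counts along with the default -nd value.

```shell
#!/bin/bash
# Illustrative arithmetic: derive CPU partition sizes for the example
# "mpirun -np 4096 neb.x -ni 8 -nk 2 ..." and the default -nd value.
NP=4096; NI=8; NK=2

PER_IMAGE=$(( NP / NI ))          # CPUs per image
PER_POOL=$(( PER_IMAGE / NK ))    # CPUs per k-point pool

# Default -nd: largest square integer <= half the pool size.
HALF=$(( PER_POOL / 2 ))
m=1
while (( (m + 1) * (m + 1) <= HALF )); do
    m=$(( m + 1 ))
done
ND_DEFAULT=$(( m * m ))

echo "per image: $PER_IMAGE, per pool: $PER_POOL, default -nd: $ND_DEFAULT"
```

For the 4096-CPU example this gives 512 CPUs per image and 256 per pool; the example overrides the default by passing -nd 144 explicitly.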

## Single Node Invocation Options

To use Quantum Espresso on Kogence, you would first create a new Quantum Espresso model or copy an existing one. You can see a list of ready-to-copy-and-execute Quantum Espresso models on Kogence here: Category:Quantum Espresso. Step-by-step instructions for creating and copying models are here: How To Use Kogence.

1. Using CAD GUI: Quantum Espresso does not come with any GUI. The code requires a single user input file, *.sh, which you can edit using the Kogence built-in browser-based code editor, accessible by double-clicking on the *.sh file under the Files tab of your Quantum Espresso model. Quantum Espresso outputs data in standard formats that can easily be plotted with several common charting and data-analysis utilities such as VESTA, ParaView, Matlab or Octave, all of which are available on Kogence.
2. Using Solvers in Batch Mode
1. Using Kogence Software Stack Builder GUI: This is the easiest way to use Quantum Espresso on Kogence. The Stack Builder GUI also allows you to connect multiple software packages (such as pre-processing and post-processing software) to your model and create complex multi-software workflows. Open your Quantum Espresso model. On the top navigation bar, click on the Stack tab. Click the + button to connect a software package to your model. A pop-up will come up in which you can search/filter/scroll to find Quantum Espresso. Click the + button next to Quantum Espresso. The Quantum Espresso container is now added to your workflow. You will now see a dropdown menu that lets you pick an entrypoint binary. Pick the bash-shell binary. You will then see an empty text box. Type the name of your *.sh input file (this file should be available under the Files tab of your model). Click the Save button to save the software stack and then click the Run button on the top navigation bar. You may have to wait about 2 minutes for your HPC server to boot up. Once it is ready, you will see that the Run button turns into a Stop button and there is a Visualizer button that lets you connect to your HPC server.
1. On Kogence, we use *.sh input files to run Quantum Espresso models in batch mode. If you are used to using *.in files directly as input to solvers such as pw.x, you can do that very easily on Kogence as well. Take a look at any of the existing Quantum Espresso models; the *.sh file has a section where you should copy/paste your *.in file.
2. Results are saved in various *.OUT files. Logs and errors are printed in the ___titusiOutput and ___titusiError files. All of these are available in real time under the Visualize tab while your simulation is running, or under the Files tab after the simulation has ended.
3. When you execute Quantum Espresso through the Kogence Software Stack Builder GUI, Quantum Espresso will use OpenMP parallelism automatically and you don't have to do anything. If you like, you can also use MPI parallelism. In order to effectively use all the CPUs in the Kogence autoscaling cloud HPC cluster, you may need to modify the $PARA_PREFIX and $PARA_POSTFIX parameters in the input bash shell script. $PARA_PREFIX tells the system whether OpenMP, MPI or hybrid parallelism is desired and the maximum number of CPUs potentially available for parallelism in the Kogence autoscaling cloud HPC cluster that has been orchestrated for this simulation, while $PARA_POSTFIX tells the system how many of these available CPUs should be used for each portion of the calculation (please see the section on Parallel Computing in Quantum Espresso above for a detailed description of $PARA_POSTFIX). Here is an example that uses only MPI parallelism (i.e. 36 MPI parallel processes will be started) on a 36-CPU machine:
export OMP_NUM_THREADS=1
PARA_PREFIX=" mpirun --display-allocation --display-map --use-hwthread-cpus -np 36 "
PARA_POSTFIX="-nk 6 -nt 6 -nd 4 "
PW_COMMAND="$PARA_PREFIX $BIN_DIR/pw.x $PARA_POSTFIX"

Please refer to the User Manual for more details. There are some constraints on $PARA_PREFIX.

- $OMP_NUM_THREADS (set in your input bash script, see below) cannot be more than the number of CPUs on each compute node. OpenMP provides shared-memory parallelism within each node.
- The product of $OMP_NUM_THREADS (set in your input bash script, see below) and the number of MPI processes (set in $PARA_PREFIX via the mpirun -np switch) cannot be more than the maximum total number of CPUs available in the cloud HPC server that you requested in the Cluster tab of your Model.
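Putting the pieces together, a minimal single-node input script might look like the sketch below. BIN_DIR and my.input are placeholders (on Kogence, BIN_DIR is normally set for you by the container environment), and the -np/-nk/-nt/-nd values match the 36-CPU example above; the sketch only prints the composed command so the structure is visible.

```shell
#!/bin/bash
# Minimal single-node MPI sketch for a *.sh input script.
# BIN_DIR and my.input are placeholders; adjust them to your model.
export OMP_NUM_THREADS=1                 # pure MPI: one thread per process

BIN_DIR="${BIN_DIR:-/usr/bin}"           # assumed location of pw.x (placeholder)
PARA_PREFIX="mpirun --use-hwthread-cpus -np 36"
PARA_POSTFIX="-nk 6 -nt 6 -nd 4"

PW_COMMAND="$PARA_PREFIX $BIN_DIR/pw.x $PARA_POSTFIX"

# Print the composed command; in a real model you would execute:
#   $PW_COMMAND -i my.input
echo "$PW_COMMAND -i my.input"
```

Note the spaces around the variables when composing PW_COMMAND: a missing space between pw.x and $PARA_POSTFIX would glue the binary name to the first switch.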
2. Shell Terminal Access: Follow the steps described above in the 'Using Kogence Software Stack Builder GUI' section, but instead of selecting bash-shell, select shell-terminal as the entrypoint binary. Kogence offers 2 different shell terminal emulators: xterm and gnome-terminal. In the empty text box, type either xterm or gnome-terminal, depending upon which emulator you prefer. When you run the model, you can go to the Visualizer tab and you will see your shell terminal. Here you can also execute your own bash scripts. Make sure you add/upload your bash script under the Files tab of your model before running the model. For example, you can do:
1. export OMP_NUM_THREADS=1; mpirun --display-allocation --display-map --use-hwthread-cpus -np 36 pw.x -i MyCode.in -nk 6 -nt 6 -nd 4: This will invoke Quantum Espresso on a single node with MPI parallelism instead of OpenMP.
2. YourScript.sh YourScriptArgs: Will run your custom shell script. Make sure you add/upload your bash script under the Files tab of your model before running the model.

## Autoscaling Cluster Invocation Options

On Kogence, you can easily send large jobs or a set of parallel jobs to an HPC cluster that automatically scales up and down depending upon the workload submitted. All you have to do is select the "Run on Autoscaling Cluster" checkbox in the Cluster tab of your model and then, on the Stack tab of your model, select the specific commands that you want to execute on the compute/worker nodes of the cluster versus those that you want to run on the master node of your cluster. Note: skip this second step in the case of Quantum Espresso. Since, on Kogence, we always run Quantum Espresso simulations through a bash script or through a shell terminal, this bash script always needs to be executed on the master node of the cluster. In the script or on the terminal you provide appropriate commands to send the simulations to the worker/compute nodes of the cluster. As described in the section on Parallel Computing in Quantum Espresso above, you will need to modify $PARA_PREFIX to run your simulations on an autoscaling cluster.

1. Using CAD GUI: The master node of the cluster supports interactive jobs, while the compute nodes can only run non-interactive, non-graphical batch jobs. Quantum Espresso does not come with any GUI. The code requires a single user input file, *.sh, which you can edit using the Kogence built-in browser-based code editor, accessible by double-clicking on the *.sh file under the Files tab of your Quantum Espresso model. Quantum Espresso outputs data in standard formats that can easily be plotted with several common charting and data-analysis utilities such as VESTA, ParaView, Matlab or Octave, all of which are available on Kogence.
2. Using Solvers in Batch Mode
1. Using Kogence Software Stack Builder GUI: This is the easiest way to use Quantum Espresso on Kogence. The Stack Builder GUI also allows you to connect multiple software packages (such as pre-processing and post-processing software) to your model and create complex multi-software workflows. Open your Quantum Espresso model. On the top navigation bar, click on the Stack tab. Click the + button to connect a software package to your model. A pop-up will come up in which you can search/filter/scroll to find Quantum Espresso. Click the + button next to Quantum Espresso. The Quantum Espresso container is now added to your workflow. You will now see a dropdown menu that lets you pick an entrypoint binary. Pick the bash-shell binary. You will then see an empty text box. Type the name of your *.sh input file (this file should be available under the Files tab of your model). Click the Save button to save the software stack and then click the Run button on the top navigation bar. You may have to wait 5 to 10 minutes for your autoscaling cloud HPC cluster to boot up, depending upon the number of nodes you requested. Once it is ready, you will see that the Run button turns into a Stop button and there is a Visualizer button that lets you connect to your HPC server.
1. On Kogence, we use *.sh input files to run Quantum Espresso models in batch mode. If you are used to using *.in files directly as input to solvers such as pw.x, you can do that very easily on Kogence as well. Take a look at any of the existing Quantum Espresso models; the *.sh file has a section where you should copy/paste your *.in file.
2. Results are saved in various *.OUT files. Logs and errors are printed in the ___titusiOutput and ___titusiError files. All of these are available in real time under the Visualize tab while your simulation is running, or under the Files tab after the simulation has ended.
3. In order to effectively use all the CPUs in the Kogence autoscaling cloud HPC cluster, you may need to modify the $PARA_PREFIX and $PARA_POSTFIX parameters in the input bash shell script. $PARA_PREFIX tells the system whether OpenMP, MPI or hybrid parallelism is desired and the maximum number of CPUs potentially available for parallelism in the Kogence autoscaling cloud HPC cluster that has been orchestrated for this simulation, while $PARA_POSTFIX tells the system how many of these available CPUs should be used for each portion of the calculation (please see the section on Parallel Computing in Quantum Espresso above for a detailed description of $PARA_POSTFIX). Here is an example that starts 72 MPI parallel processes on the autoscaling cluster:
export OMP_NUM_THREADS=1
PARA_PREFIX=" qsub -pe mpi 72 -b y -N job1 -cwd -o ./___titusiOutput -j y -V mpirun --display-allocation --display-map --use-hwthread-cpus -np 72 "
PARA_POSTFIX="-nk 12 -nt 6 -nd 4 "
PW_COMMAND="$PARA_PREFIX$BIN_DIR/pw.x $PARA_POSTFIX"

Please refer to the User Manual for more details. There are some constraints on $PARA_PREFIX.
- $OMP_NUM_THREADS (set in your input bash script, see below) cannot be more than the number of CPUs on each compute node. OpenMP provides shared-memory parallelism within each node.
- The product of $OMP_NUM_THREADS (set in your input bash script, see below) and the number of MPI processes (set in $PARA_PREFIX via the mpirun -np switch) cannot be more than the maximum total number of CPUs available in your Kogence autoscaling cloud HPC cluster. As described above, the maximum number of CPUs is determined by the maximum number of compute nodes that you allow your cluster to scale up to and the number of CPUs in each compute node; both of these choices are made in the Cluster tab of your Model.
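For reference, here is the cluster case as a self-contained sketch. It only composes and prints the command rather than executing it; the qsub flags mirror the example above (an SGE-style scheduler is assumed), and job1 and my.input are placeholder names.

```shell
#!/bin/bash
# Cluster-mode sketch: wrap mpirun in a qsub submission. The composed
# command is printed rather than executed; flag values are illustrative.
export OMP_NUM_THREADS=1
NP=72                                    # total MPI processes across nodes

PARA_PREFIX="qsub -pe mpi $NP -b y -N job1 -cwd -o ./___titusiOutput -j y -V mpirun -np $NP"
PARA_POSTFIX="-nk 12 -nt 6 -nd 4"

echo "$PARA_PREFIX pw.x $PARA_POSTFIX -i my.input"
```

Keeping NP in one variable ensures the -pe mpi slot count and the mpirun -np count cannot drift apart when you resize the job.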
2. Shell Terminal Access: Follow the steps described above in the 'Using Kogence Software Stack Builder GUI' section, but instead of selecting bash-shell, select shell-terminal as the entrypoint binary. Kogence offers 2 different shell terminal emulators: xterm and gnome-terminal. In the empty text box, type either xterm or gnome-terminal, depending upon which emulator you prefer. When you run the model, you can go to the Visualizer tab and you will see your shell terminal. Here you can also execute your own bash scripts. Make sure you add/upload your bash script under the Files tab of your model before running the model. For example, you can do:
1. export OMP_NUM_THREADS=1; qsub -pe mpi 72 -b y -N job1 -cwd -o ./___titusiOutput -j y -V mpirun --display-allocation --display-map --use-hwthread-cpus -np 72 pw.x -i MyCode.in -nk 12 -nt 6 -nd 4: This will invoke Quantum Espresso on the autoscaling cluster with MPI parallelism instead of OpenMP.
2. YourScript.sh YourScriptArgs: Will run your custom shell script. Make sure you add/upload your bash script under the Files tab of your model before running the model.

## Combining Multiple Software in Workflow

A typical scientific simulation workflow involves three steps. STEP 1: pre-processing using a CAD environment on the master node. STEP 2: sending solver jobs to multiple worker nodes in non-graphical batch mode. STEP 3: post-processing using a CAD environment on the master node. Furthermore, the pre-processing, post-processing and solver may all be different software packages. The Kogence Settings->Software tab allows you to configure all such complex workflows with ease.

First, you select the CAD programs. These will come up on the master node. Next, you add a shell terminal program and directly send your shell script to the cluster-bash interpreter. This shell script sends jobs to the job scheduler managing the autoscaling cluster. Once the shell script exits, your post-processing CAD programs kick in on the master node and the worker nodes automatically scale down/terminate.

You may want to pre-process your model using software such as VESTA, ParaView, Matlab or Octave, run the Quantum Espresso solver in batch mode on a cluster, and then post-process your data in software such as VESTA, ParaView, Matlab or Octave. Kogence allows you to build such workflows in Settings->Software. Please note that on Kogence all software and solvers are deployed in Docker containers. Once you select the containers you want, Kogence automatically composes them so that you can call one program from another just as you would on your on-prem workstation or desktop. Therefore, before you start calling programs in the shell terminal or in your bash script, make sure you have selected all the software and solvers that your bash script needs; otherwise those solvers will not be available to the bash script. You can skip automatically invoking them through Kogence Settings->Software by not providing any inputs after the solver name/binary name. This way you can make them available to your bash script and then invoke them from your bash script.
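As an illustration, the three workflow steps could be organized along the lines of the sketch below. The binary and file names (octave, prepare_input.m, plot_results.m, my.in) are hypothetical placeholders for software you have already selected on the Stack tab; each step is only echoed here so the structure is visible without actually submitting anything.

```shell
#!/bin/bash
# Hypothetical three-step workflow sketch; replace the echoes with the
# real commands once the corresponding containers are selected.
set -e

steps=(
  "octave --no-gui prepare_input.m"                               # STEP 1: pre-process on the master node
  "qsub -pe mpi 72 -b y -cwd -j y -V mpirun -np 72 pw.x -i my.in" # STEP 2: solver jobs on the worker nodes
  "octave --no-gui plot_results.m"                                # STEP 3: post-process on the master node
)

for step in "${steps[@]}"; do
  echo "would run: $step"
done
```

When the script exits, the post-processing CAD programs selected after it in the Stack tab take over on the master node, as described above.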

1. Blocking vs Non-Blocking Executions: By default, all software selected in the Settings->Software tab works in blocking mode, meaning the second software/solver will not start until the first one finishes. If you want them to start simultaneously, check the Run With Previous checkbox next to the command in the Stack tab of your model.

## Output and Error Logs

On Kogence, stdout is automatically saved under the Files tab of your model in a file called ___titusiOutput. Similarly, stderr is automatically saved in another file called ___titusiError. While your model is running, you will see the Visualizer button active in the top-right corner of the NavBar, with errors and outputs being printed live on the screen. Once the simulation ends and the machine/cluster terminates, the Visualizer becomes inactive. The updated ___titusiOutput and ___titusiError files automatically sync back to the app webserver, and you can view these updated files under the Files tab of your model.

## Model Input and Result Files

You can upload and edit your input files under the Files tab of your model. Similarly, model outputs and results are also saved under the Files tab of your model. Please make sure that you upload and edit your model files before launching the simulation. Once the simulation is launched, editing and uploading under the Files tab is locked, and the Kogence web app is connected to a shared NFS mounted on both the master and the worker nodes, so your data is automatically shared live between all your master and worker nodes. Once the simulation has been launched, if you requested a terminal shell program such as xterm or gnome-terminal under Settings->Software, then you can use the shell terminal to check all your model inputs and outputs in your job folder. After the simulation ends, the lock on the Files tab is removed and you can see the updated files under the Files tab again.

## References

1. http://www.quantum-espresso.org/project/manifesto/