GENESIS User Guide
1.7.1
RIKEN
GENESIS website
https://www.r-ccs.riken.jp/labs/cbrt/
Citation Information
• C. Kobayashi, J. Jung, Y. Matsunaga, T. Mori, T. Ando, K. Tamura, M. Kamiya, and Y. Sugita,
"GENESIS 1.1: A hybrid-parallel molecular dynamics simulator with enhanced sampling algorithms
on multiple computational platforms", J. Comput. Chem. 38, 2193-2206 (2017).
• J. Jung, T. Mori, C. Kobayashi, Y. Matsunaga, T. Yoda, M. Feig, and Y. Sugita, "GENESIS: A
hybrid-parallel and multi-scale molecular dynamics simulator with enhanced sampling algorithms
for biomolecular and cellular simulations", WIREs Computational Molecular Science 5, 310-323
(2015).
Copyright Notices
GENESIS is distributed under the GNU Lesser General Public License version 3.
Copyright ©2014-2021 RIKEN.
GENESIS is free software; you can redistribute it and/or modify it under the terms of the
GNU Lesser General Public License as published by the Free Software Foundation; either
version 3 of the License, or (at your option) any later version.
GENESIS is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY;
without even the implied warranty of MERCHANTABILITY or FITNESS FOR
A PARTICULAR PURPOSE. See the GNU Lesser General Public License for more details.
You should have received a copy of the GNU Lesser General Public License along with
GENESIS – see the file COPYING and COPYING.LESSER. If not, see
https://www.gnu.org/licenses/.
Note that this package also contains the following software for convenience. This software is not
covered by the license under which a copy of GENESIS is licensed to you. However, neither
composition nor distribution of any derivative work of GENESIS with this software violates the terms
of either license, provided that it meets every condition of the respective licenses.
You may use, copy, modify this code for any purpose (include commercial use) and without
fee. You may distribute this ORIGINAL package.
CONTENTS
1 Introduction
2 Getting Started
2.1 Installation of GENESIS
2.2 Basic usage of GENESIS
2.3 Control file
3 Available Programs
3.1 Simulators
3.2 Analysis tools
3.3 Parallel I/O tools
4 Input section
4.1 How to prepare input files
4.2 General input files
4.3 Input files for implicit solvent models
4.4 Input files for restraint
4.5 Input files for REMD and RPATH simulations
4.6 Examples
5 Output section
5.1 General output files
5.2 Output files in REMD and RPATH simulations
5.3 Output file in GaMD simulations
5.4 Output file in Vibrational analysis
5.5 Output file in FEP simulations
5.6 Examples
6 Energy section
6.1 Force fields
6.2 Non-bonded interactions
6.3 Particle mesh Ewald method
6.4 Lookup table
6.5 Generalized Born/Solvent-Accessible Surface-Area model
6.6 EEF1, IMM1, and IMIC implicit solvent models
6.7 Examples
7 Dynamics section
7.1 Molecular dynamics simulations
7.2 Simulated annealing and heating
7.3 Targeted MD and Steered MD simulations
7.4 Examples
8 Minimize section
8.1 Energy minimization
8.2 Steepest descent method
8.3 LBFGS method
8.4 Macro/micro-iteration scheme in QM/MM
8.5 Fixing ring penetrations and chirality errors
8.6 Examples
9 Constraints section
9.1 SHAKE/RATTLE algorithms
9.2 SETTLE algorithm
9.3 LINCS algorithm
9.4 Examples
10 Ensemble section
10.1 Thermostat and barostat
10.2 Examples
11 Boundary section
11.1 Boundary condition
11.2 Domain decomposition
11.3 Spherical potential
11.4 Examples
12 Selection section
12.1 Atom selection
12.2 Examples
13 Restraints section
13.1 Restraint potential
13.2 Examples
14 Fitting section
14.1 Structure fitting
14.2 Examples
15 REMD section
15.1 Replica-exchange molecular-dynamics simulation (REMD)
15.2 Replica-exchange umbrella-sampling (REUS)
15.3 Replica-exchange with solute-tempering (gREST)
15.4 Examples
23 Appendix
23.1 Install the requirements in Linux
23.2 Install the requirements in Mac
Bibliography
CHAPTER
ONE
INTRODUCTION
GENESIS (Generalized-Ensemble Simulation System) is a suite of computer programs for carrying out
molecular dynamics (MD) simulations of biomolecular systems. MD simulations of biomolecules such
as proteins, nucleic acids, lipid bilayers, and N-glycans are important research tools in structural and
molecular biology. Many useful MD simulation packages [1] [2] [3] [4] [5] are now available together
with accurate molecular force field parameter sets [6] [7] [8] [9] [10]. Most MD software packages have
been optimized and parallelized for distributed-memory parallel supercomputers or PC-clusters.
Hundreds of CPUs or CPU cores can therefore be used efficiently for a single MD simulation of a relatively
large biomolecular system, typically composed of several hundred thousand atoms. In recent years,
the number of available CPUs or CPU cores has been increasing rapidly. The implementation of highly
efficient parallel schemes is therefore required in modern MD simulation programs. Accelerators such as
GPGPU (General-Purpose computing on Graphics Processing Units) have also become popular, so their
use is also desirable. Indeed, many MD program packages support various accelerators.
Our major motivation is to develop MD simulation software with scalable performance on such modern
supercomputers. For this purpose, we have developed the software from scratch, introducing
hybrid (MPI + OpenMP) parallelism, several new parallel algorithms [11] [12], and GPGPU calculation.
Another motivation is to develop a simple MD program that can be easily understood and modified
for methodological developments. These two goals (high parallel performance and simplicity) usually
conflict with each other in computer software. To avoid the conflict, we have developed two MD programs
in GENESIS, namely SPDYN (Spatial decomposition dynamics) and ATDYN (Atomic decomposition
dynamics).
SPDYN and ATDYN share almost the same data structures, subroutines, and modules, but differ in
their parallelization schemes. In SPDYN, the spatial decomposition scheme is implemented with new
parallel algorithms [11] [12] and GPGPU calculation. In ATDYN, the atomic decomposition scheme is
introduced for simplicity. The performance of ATDYN does not match that of SPDYN because of its simple
parallelization scheme. However, ATDYN is easier to modify for the development of new algorithms or novel
molecular models. We hope that users will first develop new methodologies in ATDYN and eventually
port them to SPDYN for better performance. Because we maintain consistency between the source codes
of ATDYN and SPDYN, switching from ATDYN to SPDYN is not difficult.
Other features in GENESIS are listed below:
• Not only atomistic molecular force fields (CHARMM, AMBER) but also some coarse-grained models
are available in ATDYN.
• For extremely large biomolecular systems (more than 10 million atoms), a parallel input/output (I/O)
scheme is implemented.
• GENESIS is optimized for the K and Fugaku computers (developed by RIKEN and Fujitsu),
but it is also available on Intel-based supercomputers and PC-clusters.
• GENESIS is written in modern Fortran language (90/95/2003) using modules and dynamic memory
allocation. No common blocks are used.
• GENESIS is free software under the GNU Lesser General Public License (LGPL) version 3 or
later. We allow users to use/modify GENESIS and redistribute the modified version under the
same license.
This user manual mainly provides detailed descriptions of the keywords used in the control file. Tutorials for
standard MD simulations, REMD simulations, and some analyses are available online
(https://www.r-ccs.riken.jp/labs/cbrt/). We recommend that new users of GENESIS start from the next
chapter to learn the basic ideas, installation, and workflow of the program.
Compared to other MD software, e.g. AMBER, CHARMM, or NAMD, GENESIS is a very young MD
simulation program. Before releasing the program, the developers and contributors in the GENESIS
development team worked hard to fix all bugs in the program, and performed many test simulations. Still,
there might be defects or bugs in GENESIS. Since we cannot bear any responsibility for the simulation
results produced by GENESIS, we strongly recommend that users check their results carefully.
The GENESIS development team has extensive plans for the future development of methodology and molecular
models. We would like to make GENESIS one of the most powerful and practical MD software packages,
contributing to computational chemistry and biophysics. Computational studies in life science are still
at a very early stage (like ‘GENESIS’) compared to established experimental research. We hope that
GENESIS pushes computational science forward and contributes to bio-tech and medical applications
in the future.
CHAPTER
TWO
GETTING STARTED
2.1.1 Requirements
Compilers
GENESIS works on various systems: laptop PCs, workstations, cluster machines, and supercomputers.
Since the source code of GENESIS is mainly written in Fortran, a Fortran compiler is the first
requirement for installation. In addition, a “preprocessor” is required, because the source code is “processed”
according to the user’s computer environment before compilation. One of the commonly used Fortran
compilers is gfortran, which is freely available as part of the GNU Compiler Collection (GCC). In this
case, cpp is selected as the preprocessor, which is also freely available. Another recommended Fortran
compiler is ifort, provided by Intel Corporation, which enables the program to run much faster on
Intel CPUs. In the Intel compiler package, fpp is provided as the preprocessor. The Fujitsu compiler frtpx,
which also functions as a preprocessor, is suitable for Fujitsu machines such as the FX100.
Both ATDYN and SPDYN work on multiple CPU cores using MPI (Message Passing Interface) and
OpenMP (hybrid MPI+OpenMP), both of which are commonly used for parallel computing.
In general, MPI is employed for communication between different machines, nodes, or processors
that do not share memory (distributed memory). OpenMP, on the other hand, is
employed within a single processor, where memory is shared among the parallel threads.
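As a rough illustration of the hybrid scheme, the number of CPU cores occupied by a run is the number of MPI processes multiplied by the number of OpenMP threads per process. The sketch below works this out for illustrative values (8 processes and 2 threads are assumptions, not recommendations). Here total_cores is a hypothetical helper written for this example, while OMP_NUM_THREADS is the standard OpenMP environment variable.

```shell
# Cores occupied by a hybrid run = MPI processes x OpenMP threads per process.
# total_cores is a hypothetical helper written for this example.
total_cores() {
    echo $(( $1 * $2 ))
}

nprocs=8     # number of MPI processes (illustrative value)
nthreads=2   # OpenMP threads per MPI process (illustrative value)
export OMP_NUM_THREADS=$nthreads

echo "$nprocs MPI processes x $nthreads OpenMP threads = $(total_cores "$nprocs" "$nthreads") cores"
```

The product should not exceed the number of physical cores available to the job.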
OpenMP is natively supported in most modern Fortran compilers. As for MPI, however, the users may
have to install MPI libraries by themselves, especially on laptop PCs and workstations. One
commonly used MPI implementation is OpenMPI (https://www.open-mpi.org/). When installing the
OpenMPI libraries, the users must specify the Fortran and C compilers (e.g., gfortran and
gcc) to be used with MPI. After installing the libraries, the users can use mpif90, mpicc, and mpirun,
which are necessary to compile and run a program that is parallelized with MPI. OpenMPI is freely
available, and an example installation scheme is shown in the Appendix. Intel and Fujitsu also
provide their own MPI libraries for parallel computation.
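Before running the installation, it can help to confirm that the tools named above are visible on the PATH. The following is a minimal sketch for the GCC and OpenMPI combination. The function check_tools is a hypothetical helper (not part of GENESIS), and it only reports whether each command is found, not whether its version is suitable.

```shell
# Report whether each named tool is found on PATH.
# check_tools is a hypothetical helper; it never fails, it only prints a report.
check_tools() {
    for tool in "$@"; do
        if command -v "$tool" >/dev/null 2>&1; then
            echo "$tool: found at $(command -v "$tool")"
        else
            echo "$tool: MISSING"
        fi
    done
}

# Tool names for the GCC + OpenMPI combination described in the text.
check_tools gfortran gcc cpp mpif90 mpicc mpirun
```

For an Intel or Fujitsu environment, the same function can be called with the corresponding compiler names (e.g., ifort and icc, or frtpx and fccpx).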
Mathematical libraries
GPGPU
SPDYN works not only with CPU but also with CPU+GPU. Part of the SPDYN source code is written
in CUDA, which enables the program to run efficiently on NVIDIA GPU cards. If the users want
to run SPDYN with GPGPU calculations, the CUDA toolkit (https://developer.nvidia.com/cuda-toolkit)
must also be installed on the computer. Note that OpenACC is not currently employed in GENESIS.
The recommended compilers, preprocessors, and libraries for GENESIS are listed below. Please make
sure that at least one of them in each section is installed on your system (GPU is optional). If the users
do not use the Intel or Fujitsu compilers, the combination of GCC compiler, GCC preprocessor, and
OpenMPI is recommended.
• Operating systems (see Appendix)
– Linux
– macOS
• Fortran and C compilers
– GCC compiler gfortran, gcc (version 4.4.7 or higher is required)
– Intel compiler ifort, icc
– Fujitsu compiler frtpx, fccpx
• Preprocessors
– GCC preprocessor cpp
– Intel preprocessor fpp
– Fujitsu compiler frtpx
• MPI libraries for parallel computing
– OpenMPI mpirun, mpif90, mpicc
– Intel MPI
– Fujitsu MPI
• Numerical libraries for mathematical algorithms
– LAPACK/BLAS
– Intel Math Kernel Library (MKL)
– Fujitsu Scientific Subroutine Library (SSL II)
• GPU (Optional)
– NVIDIA GPU cards which support Compute Capability (CC) 3.5 or higher
– The following GPU cards and CUDA versions have been tested by the GENESIS developers
∗ NVIDIA K20, K40, P100, TITAN V, GTX 1080, GTX 1080Ti, RTX 2080, RTX 2080Ti
∗ CUDA ver. 8.0, 9.0, 9.1, 9.2, 10.0
Note: If you are using a supercomputer in a university or research institute, there is a high chance that
the system already provides the above requirements, so that you do not need to install them yourself. Please refer
to the users’ guide of the supercomputer, or consult the system administrator.
In general, the latest version of CUDA does not support the latest version of the GCC compiler. If you cannot
compile GENESIS with new CUDA (ver. 10) and a new GCC compiler (ver. 8.0 or higher), please first
try to install CUDA with an older GCC compiler (ver. 7.0 or older), and then install GENESIS
with that combination of CUDA and GCC compilers.
$ mkdir $HOME/genesis
$ cd $HOME/genesis
$ mv ~/Downloads/genesis-1.7.1.tar.bz2 ./
$ tar xvfj genesis-1.7.1.tar.bz2
$ cd genesis-1.7.1
$ ls
AUTHORS Makefile.am aclocal.m4 depcomp src
COPYING Makefile.in compile fortdep.py
ChangeLog NEWS configure install-sh
INSTALL README configure.ac missing
Step2. Configure
In order to compile the source code, the users execute the “configure” script in the directory. This script
automatically detects appropriate compilers, preprocessors, and libraries in the users’ computer, and
creates the “Makefile”.
$ ./configure
If you encounter a failure in the configure command, please check the error message carefully. You may
have to add appropriate options to this command according to your computer environment (see Advanced
installation). The following are possible suggestions to solve frequent problems. Other solutions might
be found on the online page (https://www.r-ccs.riken.jp/labs/cbrt/installation/).
• First of all, please check whether the Fortran and C compilers are installed in your computer. If
you are going to run GENESIS with multiple CPUs, you should additionally install MPI libraries
such as OpenMPI before compiling GENESIS (see Appendix).
• If you see the error message “configure: error: Fortran compiler cannot create executables”, the
path to the installed compilers or MPI libraries might not be correctly set
in “~/.bashrc” or “~/.bash_profile” (see Appendix). This configure script automatically detects
“mpif90”, “mpifrtpx”, or “mpifrt” for Fortran compiler, and “mpicc”, “mpifccpx”, or “mpifcc” for
C compiler. The error message may indicate that this detection failed for some reason.
For example, if you installed OpenMPI in your computer, both “mpif90” and “mpicc” should be
detected. Please check the path to these executables by typing the “which” command (e.g., which
mpif90) in the terminal window. If no path is shown, the path setting in “~/.bashrc” or
“~/.bash_profile” might contain a mistake (see Appendix). You should also check the path for
typing mistakes.
• If the recommended software is not used in compilation, warning messages might be displayed in
the terminal when the configure command is executed. Those messages are just warnings (not
errors), and you may continue the compilation. However, we strongly recommend that you verify
the installation in such cases (see Verify the installation).
• In some supercomputer systems, the “module load [module]” command is required to use compilers,
and it must be run before the configure command. See the user guide of the system.
• Try “autoreconf” or “./bootstrap” before the configure command, if your computer environment is
significantly different from what we assume and/or if you modify “configure.ac” or “Makefile.am”
by yourself.
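The path problem described in the list above can be sketched as follows. This assumes a hypothetical OpenMPI install location ($HOME/Software/mpi); prepend_path is an illustrative helper, not a GENESIS or OpenMPI command. It adds a directory to PATH only once, so the lines can be placed in “~/.bashrc” without the variable growing on every reload.

```shell
# Prepend a directory to PATH only if it is not already listed.
# prepend_path is a hypothetical helper written for this example.
prepend_path() {
    case ":$PATH:" in
        *":$1:"*) ;;                # already present, nothing to do
        *) PATH="$1:$PATH" ;;
    esac
    export PATH
}

# Assumed install location of OpenMPI; replace with your own directory.
prepend_path "$HOME/Software/mpi/bin"

# Same check as "which mpif90": print the path, or a hint if nothing is found.
command -v mpif90 || echo "mpif90 is still not on PATH"
```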
After the “configure” command is successful, type the following command to compile and install
GENESIS. All programs in GENESIS are compiled and installed into the “./bin” directory by default.
$ make install
If you encounter a failure, please check the error message carefully. In many cases, errors are caused
by invalid paths to compilers and libraries. The following are possible suggestions to solve frequent
problems. Other solutions might be found on the online page
(https://www.r-ccs.riken.jp/labs/cbrt/installation/).
• If the error message is like “/usr/bin/ld: cannot find -lblas” or “/usr/bin/ld: cannot find -llapack”,
make sure that the BLAS or LAPACK libraries are installed in the computer (see also Appendix).
The users may also have to set the path to the libraries in the “configure” command with the
“LAPACK_LIBS” or “LAPACK_PATH” option (see Advanced installation).
• If you have installed additional software or libraries to solve a make error, please execute “make
clean”, and try Step2 and “make install” again.
Step4. Confirmation
After the installation has finished successfully, the following binary files are found in the “bin” directory.
There are 42 programs in total. A brief description of each program is given in Available Programs.
$ ls ./bin
atdyn fret_analysis qmmm_generator
avecrd_analysis hb_analysis qval_analysis
comcrd_analysis hbond_analysis rdf_analysis
contact_analysis kmeans_clustering remd_convert
crd_convert lipidthick_analysis rg_analysis
density_analysis mbar_analysis rmsd_analysis
diffusion_analysis meanforce_analysis rpath_generator
distmat_analysis msd_analysis rst_convert
drms_analysis pathcv_analysis rst_upgrade
dssp_interface pcavec_drawer sasa_analysis
eigmat_analysis pcrd_convert spdyn
emmap_generator pmf_analysis tilt_analysis
prjcrd_analysis trj_analysis flccrd_analysis
prst_setup wham_analysis
In the above scheme, GENESIS is installed with default options, and all installed programs run on CPU
with double precision calculation. The users can specify additional options in the configure command
according to the users’ computer environment or desired conditions. The full list of available options
is obtained by “./configure --help”. The representative options are as follows.
--enable-single
Turn on single precision calculation. In this case, only SPDYN is installed.
--enable-gpu
Turn on GPGPU calculation. In this case, only SPDYN is installed.
--with-cuda=PATH
Define path to the CUDA libraries manually.
--disable-mpi
Turn off MPI parallelization. In this case, SPDYN is not installed.
--disable-openmp
Turn off OpenMP parallelization.
--disable-parallel_IO
Do not install the parallel I/O tool (prst_setup)
--enable-debug
Turn on program debugging (see below)
--prefix=PREFIX
Install the programs in the directory designated by PREFIX
--with-msmpi
Turn on use of MSMPI. Compilation and execution must be done on Windows 10.
Although the compilers are set to “mpif90” and “mpicc” by default, the users may specify different
compilers in the configure command. The Fortran compiler is specified with FC and F77, and the C compiler with
CC. For example, in the case of “mpiifort” and “mpiicc”, the following options are added:
$ ./configure CC=mpiicc FC=mpiifort F77=mpiifort
The following is an example command to set the path to LAPACK and BLAS libraries that are installed in
$HOME/Software/lapack-3.8.0/ (see also Appendix). Please be careful about the filename of the installed
libraries. If the BLAS libraries are installed as “librefblas.a”, the option “-lrefblas” must be used. If
“librefblas.a” is renamed to “libblas.a”, the following command can be used. Linking with the reverse
order of “-llapack” and “-lblas” might also cause a failure of installation of GENESIS.
$ ./configure LAPACK_PATH=$HOME/Software/lapack-3.8.0
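Because the required link option depends on the file name of the installed library, it can be useful to list which library files actually exist before running configure. The following is a minimal sketch. The function find_lib is a hypothetical helper that only inspects file names (lib<name>.a or lib<name>.so) in one directory and does not test linking.

```shell
# Print the link option and file for each library name found in a directory.
# find_lib is a hypothetical helper; it checks file names only.
find_lib() {
    dir=$1; shift
    for name in "$@"; do
        for file in "$dir/lib$name.a" "$dir/lib$name.so"; do
            if [ -e "$file" ]; then
                echo "-l$name ($file)"
            fi
        done
    done
}

# Directory taken from the example above; adjust to your own installation.
find_lib "$HOME/Software/lapack-3.8.0" lapack blas refblas
```

If the output shows “-lrefblas” but not “-lblas”, the BLAS library is installed under the name “librefblas.a”, which is the case discussed above.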
The following command is used to turn on single-precision calculation in SPDYN. In this case, force
calculations are carried out with single precision, while integration of the equations of motion as well as
accumulation of the force and energy are still done with double-precision.
$ ./configure --enable-single
Only SPDYN that works on CPU will be installed with this option. If the user additionally needs
analysis tools as well as ATDYN, one must prepare another GENESIS directory, and install without the
“--enable-single” option.
In the following command, the users install SPDYN that works on CPU+GPU with single-precision
calculation. If “--enable-single” is omitted in the command, SPDYN works on CPU+GPU with
double-precision calculation.
$ ./configure --enable-single --enable-gpu
Here, if the users encounter an error message like “nvcc: command not found”, make sure that the
CUDA Toolkit is installed on the computer. In typical Linux workstations or cluster machines, CUDA
is installed in “/usr/local/cuda-x.x/” or “/usr/lib/x86_64-linux-gnu/”, and “nvcc” should be in the “bin”
directory of the install directory. The path to “nvcc” and the CUDA libraries should be set in a startup file
such as “~/.bashrc”. For example, add the following information to “~/.bashrc” in the case of CUDA 9.0,
CUDAROOT=/usr/local/cuda-9.0
export PATH=$CUDAROOT/bin:$PATH
export LD_LIBRARY_PATH=$CUDAROOT/lib64:/lib:$LD_LIBRARY_PATH
then reload “~/.bashrc” and try the configure command again. If problems remain,
explicitly specify the path to the CUDA libraries in the configure command by:
$ ./configure --enable-single --enable-gpu --with-cuda=/usr/local/cuda-9.0
The configuration for supercomputer systems may require non-standard setups. In the online usage page,
we describe recommended configure options for some supercomputers
(https://www.r-ccs.riken.jp/labs/cbrt/usage/).
For example, the following commands are used to compile GENESIS on HOKUSAI GreatWave (FX100)
in RIKEN. Note that the parallel I/O tool (prst_setup) is not compiled in this configuration, because
the Fujitsu compiler has trouble compiling prst_setup (see also Available Programs).
By specifying the “--disable-mpi” option, the users can install GENESIS that can work on one CPU.
The configure script automatically looks for “gfortran”, “ifort”, “frt”, or “frtpx” for Fortran compiler,
and “gcc”, “icc”, “fcc”, or “fccpx” for C compiler. Therefore, in this case MPI libraries are not required
for the installation and execution of GENESIS. ATDYN and analysis tools are installed.
$ ./configure --disable-mpi
If the users encounter memory leak errors during a simulation using GENESIS, the origin of the
error might be tracked by using a program compiled with a debug option. Note that the debug option
makes the calculation much slower. In this case, the runtime check is activated only for CPU codes, even
if the “--enable-gpu” option is added to the command.
$ ./configure --enable-debug=3
In Step3, the -j option is available, which enables quick compilation of the program using multiple CPU
cores. The following command uses 4 CPU cores.
$ make -j4 install
If you encountered an error message like “Fatal Error: Can’t delete temporary module file ‘. . . ’: No such
file or directory”, please try “make install” without the “-j” option.
The users can verify the installation of GENESIS by using test sets which are available on the
GENESIS website (https://www.r-ccs.riken.jp/labs/cbrt/download/). Please uncompress the downloaded file
in an appropriate directory, and move to the “regression_test” directory. Note that the file name of the
tar.bz2 file contains the date (year, month, and day), so please change the following execution commands
accordingly.
$ cd $HOME/genesis
$ mv ~/Downloads/tests-1.7.1_YYMMDD.tar.bz2 ./
$ tar xvfj tests-1.7.1_YYMMDD.tar.bz2
$ cd tests-1.7.1_YYMMDD/regression_test
$ ls
build test_analysis test_gamd_spdyn test_rpath_atdyn
charmm.py test_atdyn test_nonstrict.py test_rpath_spdyn
cleanup.sh test_common test_parallel_IO test_spana
fep.py test_fep test_remd.py test_spdyn
genesis.py test_fep.py test_remd_common test_vib
param test_gamd.py test_remd_spdyn test_vib.py
test.py test_gamd_atdyn test_rpath.py
In the sub-directories of “regression_test”, the users can find many input files (“inp”), in which various
combinations of simulation parameters are specified. In addition, each sub-directory contains output files
(“ref”) obtained by the developers. The users run “test.py”, “test_remd.py”, “test_rpath.py”, and so on,
which enable automatic comparison between the users’ and developers’ results for each MD algorithm.
The following is an example command to verify the two simulators atdyn and spdyn for basic MD and
energy minimization. Here, the programs are executed using 1 CPU core with the “mpirun” command.
The users can increase the number of MPI processors according to the users’ computer environment, but
only 1, 2, 4, or 8 are allowed in these tests. Other MPI launchers such as “mpiexec” are also available in
the command. There are about 50 test sets, and each test should finish in a few seconds.
$ export OMP_NUM_THREADS=1
$ ./test.py "mpirun -np 1 ~/genesis/genesis-1.7.1/bin/atdyn"
$ ./test.py "mpirun -np 1 ~/genesis/genesis-1.7.1/bin/spdyn"
Passed 46 / 46
Failed 0 / 46
Aborted 0 / 46
If all tests were passed, it means that your GENESIS can generate results identical to those of the developer’s
GENESIS. Note that the developer’s GENESIS was compiled with Intel compilers, Intel MKL, the OpenMPI
library, and the double precision option on Intel CPUs. If your computer system is significantly
different from the developer’s one, unexpected numerical errors may happen, which can cause failures
in some tests. If there were any aborted tests, the users should carefully check the log or error files
in the tested sub-directory, and figure out why the error happened. The following are
suggestions to solve typical problems:
• If some tests were aborted due to “memory allocation error”, the cause might be a limitation
of the memory size; that is, the tested systems were too large for your computer. This problem
is usually not serious.
• The available number of MPI slots in your computer might be smaller than the given number
of MPI processes. Try using fewer MPI processes.
• Try to specify the “absolute path” to the program instead of using “relative path”.
• Make sure that the MPI environment is properly set.
• Detailed solutions in specific supercomputer systems might be found in the GENESIS website
(https://www.r-ccs.riken.jp/labs/cbrt/usage/).
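When the tests are run from a script, the summary can be checked automatically. The sketch below assumes that the test script prints the three summary lines (“Passed”, “Failed”, “Aborted”) to standard output in the form shown earlier. The function check_summary is a hypothetical helper, and the sample input is the summary quoted above.

```shell
# Return non-zero if a regression-test summary read from standard input
# reports any failed or aborted test. check_summary is a hypothetical helper.
check_summary() {
    awk '
        ($1 == "Failed" || $1 == "Aborted") && $2 > 0 { bad = 1 }
        END { exit bad + 0 }
    '
}

# Sample input: the summary shown earlier in this section.
check_summary <<'EOF' && echo "all tests passed"
Passed 46 / 46
Failed 0 / 46
Aborted 0 / 46
EOF
```

Under the same assumption, the output of a real run can be piped into the function in place of the sample input.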
In the same way as for the basic tests above, the users can check other functions in atdyn and spdyn, such as GaMD, REMD,
path sampling, vibrational analysis, parallel I/O, and GPGPU calculation. The allowed number of MPI
processes depends on each test (test_gamd: 1, 2, 4, 8; test_remd: 4, 8, 16, 32; test_rpath: 8; test_vib:
8; parallel_io: 8; gpu: 1, 2, 4, 8). For the GPGPU tests, the users must use spdyn that was installed
with the “--enable-gpu” option. The parallel_io tests require both spdyn and prst_setup. Note that
prst_setup is not installed in some cases, depending on the configure options or compilers (see Advanced
installation). To run the analysis tool tests, the users first move to “test_analysis”, and then
execute “./test_analysis.py”. Note that MPI is not used in the analysis tool tests. In a similar way, the
users can test the SPANA (spatial decomposition analysis) tool sets, which are tested with 1,
2, 4, and 8 MPI processes.
$ export OMP_NUM_THREADS=1
$ ./test_gamd.py "mpirun -np 1 ~/genesis/genesis-1.7.1/bin/atdyn"
$ ./test_gamd.py "mpirun -np 1 ~/genesis/genesis-1.7.1/bin/spdyn"
$ ./test_remd.py "mpirun -np 4 ~/genesis/genesis-1.7.1/bin/atdyn"
$ ./test_remd.py "mpirun -np 4 ~/genesis/genesis-1.7.1/bin/spdyn"
$ ./test_fep.py "mpirun -np 8 ~/genesis/genesis-1.7.1/bin/spdyn"
$ ./test_rpath.py "mpirun -np 8 ~/genesis/genesis-1.7.1/bin/atdyn"
$ ./test_rpath.py "mpirun -np 8 ~/genesis/genesis-1.7.1/bin/spdyn"
$ ./test_vib.py "mpirun -np 8 ~/genesis/genesis-1.7.1/bin/atdyn"
$ ./test.py "mpirun -np 8 ~/genesis/genesis-1.7.1/bin/spdyn" parallel_io
$ ./test.py "mpirun -np 8 ~/genesis/genesis-1.7.1/bin/spdyn" gpu
$ cd test_spana
$ ./cleanup.sh
$ export OMP_NUM_THREADS=1
$ ./test_spana.py ~/genesis/genesis-1.7.1/bin/
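The process counts quoted above can be checked before a test is launched. The following is a minimal Python sketch and is not part of GENESIS: the table `ALLOWED_NP` only transcribes the counts given in the text, and the helper name `build_test_command` is hypothetical.

```python
# Allowed MPI process counts per test, transcribed from the text above.
ALLOWED_NP = {
    "test_gamd": (1, 2, 4, 8),
    "test_remd": (4, 8, 16, 32),
    "test_rpath": (8,),
    "test_vib": (8,),
    "parallel_io": (8,),
    "gpu": (1, 2, 4, 8),
}


def build_test_command(test, nproc, binary):
    """Return the shell command for one test (hypothetical helper)."""
    if test not in ALLOWED_NP:
        raise KeyError(f"unknown test: {test}")
    if nproc not in ALLOWED_NP[test]:
        raise ValueError(f"{test} supports only {ALLOWED_NP[test]} MPI processes")
    launcher = f'"mpirun -np {nproc} {binary}"'
    # parallel_io and gpu are run through test.py with the test name as argument.
    if test in ("parallel_io", "gpu"):
        return f"./test.py {launcher} {test}"
    return f"./{test}.py {launcher}"
```

For instance, `build_test_command("test_remd", 4, "~/genesis/genesis-1.7.1/bin/spdyn")` reproduces the REMD command listed above, while requesting 2 processes for the same test raises an error.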
Note: Some tests might be using “abnormal” parameters or conditions in the input files for the sake of
simple tests. Do not use such parameters in your research. “Normal” parameters are mainly introduced
in this user manual or online tutorials.
The following commands are used to fully recompile GENESIS. Note that running “make clean” directly
may not work if the Makefiles were created on another machine. In this case, the
users must run the “./configure” command before “make clean”.
$ make clean
$ make distclean
$ ./configure [option]
$ make install
2.1.6 Uninstall
The user can uninstall GENESIS by just removing the program directory. If the user changed the install
directory by specifying “--prefix=PREFIX” in the configure command, please remove the programs
(atdyn, spdyn, and so on) in the “PREFIX” directory.
$ rm -rf $HOME/genesis/genesis-1.7.1
The GENESIS programs are executed on a command line. The first argument is basically interpreted
as an input file of the program. The input file, which we call control file hereafter, contains parameters
for simulations. The following examples show typical usage of the GENESIS programs. In the case of
serial execution,
$ [program_name] [control_file]
For example, SPDYN is executed in the following way using 8 MPI processors:
The users should specify an OpenMP thread number explicitly before running the program. Appropriate
number of CPU cores must be used according to the user’s computer environment (see also Available
Programs). For example, if the users want to use 32 CPU cores in the calculation, the following command
might be executed.
$ export OMP_NUM_THREADS=4
$ mpirun -np 8 ~/genesis/genesis-1.7.1/bin/spdyn INP
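The total number of CPU cores is the product of the number of MPI processors and the number of OpenMP threads (8 × 4 = 32 in the command above). As a rough illustration, the Python sketch below enumerates all such combinations for a given core count; the function name `hybrid_layouts` is hypothetical and not part of GENESIS.

```python
def hybrid_layouts(total_cores):
    """Return every (mpi_processes, omp_threads) pair whose product is total_cores."""
    return [
        (nproc, total_cores // nproc)
        for nproc in range(1, total_cores + 1)
        if total_cores % nproc == 0
    ]


if __name__ == "__main__":
    # Print one candidate launch line per layout for 32 cores.
    for nproc, nthreads in hybrid_layouts(32):
        print(f"export OMP_NUM_THREADS={nthreads}; mpirun -np {nproc} spdyn INP")
```

Which of these layouts is actually usable, and which performs best, depends on the program and on the target system.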
As for the analysis tools, the usage is almost the same, but mpirun is not used. Note that some analysis tools
(e.g., mbar_analysis, wham_analysis, msd_analysis, and drms_analysis) are parallelized with OpenMP.
# MBAR analysis
$ export OMP_NUM_THREADS=4
$ ~/genesis/genesis-1.7.1/bin/mbar_analysis INP
Basic usage of each program is shown by executing the program with the -h option. In addition, sample
control file of each program can be obtained with the -h ctrl option:
For example, in the case of SPDYN, the following messages are displayed:
$ spdyn -h
# normal usage
% mpirun -np XX ./spdyn INP
(skipped...)
This message tells the users that SPDYN can be executed with mpirun. A template control file for molec-
ular dynamics simulation (md) can be generated by executing SPDYN with the -h ctrl md option. The
same way is applicable for energy minimization (min), replica exchange simulation (remd), and replica
path sampling simulation (rpath). The template control file for energy minimization is shown below. If
the users want to show all available options, please specify ctrl_all instead of ctrl. The users can
edit this template control file to perform the simulation that the users want to do.
$ less INPMIN
[INPUT]
topfile = sample.top # topology file
parfile = sample.par # parameter file
psffile = sample.psf # protein structure file
pdbfile = sample.pdb # PDB file
[ENERGY]
forcefield = CHARMM # [CHARMM,AMBER,GROAMBER,GROMARTINI]
electrostatic = PME # [CUTOFF,PME]
switchdist = 10.0 # switch distance
cutoffdist = 12.0 # cutoff distance
pairlistdist = 13.5 # pair-list distance
[MINIMIZE]
method = SD # [SD]
[BOUNDARY]
type = PBC # [PBC, NOBC]
In the control file, detailed simulation conditions are specified. The control file consists of several sections
(e.g., [INPUT], [OUTPUT], [ENERGY], [ENSEMBLE], and so on), each of which contains closely-
related keywords. For example, in the [ENERGY] section, parameters are specified for the potential
energy calculation such as a force field type and cut-off distance. In the [ENSEMBLE] section, there
are parameters to select the algorithm to control the temperature and pressure in addition to the target
temperature and pressure of the system. Here, we show example control files for the energy minimization
and normal molecular dynamics simulations.
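To make the section/keyword layout concrete, the following Python sketch reads a control file into a nested dictionary. It is only an illustration, not the parser used by GENESIS: it assumes that `#` starts a comment and that every keyword line has the form `keyword = value`, as in the examples of this guide, and the function name `parse_control` is hypothetical.

```python
def parse_control(text):
    """Parse control-file text into {SECTION: {keyword: value}} (illustrative only)."""
    sections = {}
    current = None
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop the trailing comment
        if not line:
            continue
        if line.startswith("[") and line.endswith("]"):
            current = line[1:-1].strip().upper()  # e.g. "[ENERGY]" -> "ENERGY"
            sections.setdefault(current, {})
        elif "=" in line and current is not None:
            key, value = line.split("=", 1)
            sections[current][key.strip()] = value.strip()
    return sections


example = """
[ENERGY]
forcefield = CHARMM # [CHARMM,AMBER,GROAMBER,GROMARTINI]
electrostatic = PME # [CUTOFF,PME]
cutoffdist = 12.0 # cutoff distance
[BOUNDARY]
type = PBC # [PBC, NOBC]
"""
control = parse_control(example)
```

All values are kept as strings here; converting them to numbers is left to the caller.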
The control file for the energy minimization must include a [MINIMIZE] section (see Minimize section).
By using the following control file, the users carry out 2,000-step energy minimization with the steepest
descent algorithm (SD). The CHARMM36m force field is used, and the particle mesh Ewald (PME)
method is employed for the calculation of long-range interaction.
[INPUT]
topfile = top_all36_prot.rtf # topology file
parfile = par_all36m_prot.prm # parameter file
strfile = toppar_water_ions.str # stream file
psffile = build.psf # protein structure file
pdbfile = build.pdb # PDB file
[OUTPUT]
dcdfile = min.dcd # coordinates trajectory file
rstfile = min.rst # restart file
[ENERGY]
forcefield = CHARMM # CHARMM force field
electrostatic = PME # Particle mesh Ewald method
switchdist = 10.0 # switch distance (Ang)
cutoffdist = 12.0 # cutoff distance (Ang)
pairlistdist = 13.5 # pair-list cutoff distance (Ang)
pme_nspline = 4 # order of B-spline in PME
pme_max_spacing = 1.2 # max grid spacing allowed (Ang)
vdw_force_switch = YES # turn on van der Waals force switch
contact_check = YES # turn on clash checker
[MINIMIZE]
method = SD # Steepest descent method
nsteps = 2000 # number of steps
eneout_period = 100 # energy output freq
crdout_period = 100 # coordinates output frequency
rstout_period = 2000 # restart output frequency
nbupdate_period = 10 # pairlist update frequency
[BOUNDARY]
type = PBC # periodic boundary condition
The control file for normal MD simulations must include a [DYNAMICS] section (see Dynamics sec-
tion). By using the following control file, the users carry out a 100-ps MD simulation at 𝑇 = 298.15
K and 𝑃 = 1 atm in the NPT ensemble. The equations of motion are integrated by the RESPA algo-
rithm with a time step of 2.5 fs, and the bonds of light atoms (hydrogen atoms) are constrained using the
SHAKE/RATTLE and SETTLE algorithms. The temperature and pressure are controlled with the Bussi
thermostat and barostat.
[INPUT]
topfile = top_all36_prot.rtf # topology file
parfile = par_all36m_prot.prm # parameter file
strfile = toppar_water_ions.str # stream file
psffile = build.psf # protein structure file
pdbfile = build.pdb # PDB file
rstfile = min.rst # restart file
[OUTPUT]
dcdfile = md.dcd # coordinates trajectory file
rstfile = md.rst # restart file
[ENERGY]
forcefield = CHARMM # CHARMM force field
electrostatic = PME # Particle mesh Ewald method
switchdist = 10.0 # switch distance (Ang)
cutoffdist = 12.0 # cutoff distance (Ang)
pairlistdist = 13.5 # pair-list cutoff distance (Ang)
pme_nspline = 4 # order of B-spline in PME
pme_max_spacing = 1.2 # max grid spacing allowed (Ang)
vdw_force_switch = YES # turn on van der Waals force switch
[DYNAMICS]
integrator = VRES # RESPA integrator
timestep = 0.0025 # timestep (2.5fs)
nsteps = 40000 # number of MD steps (100ps)
eneout_period = 400 # energy output period (1ps)
crdout_period = 400 # coordinates output period (1ps)
rstout_period = 40000 # restart output period
nbupdate_period = 10 # nonbond update period
elec_long_period = 2 # period of reciprocal space calculation
thermostat_period = 10 # period of thermostat update
barostat_period = 10 # period of barostat update
[CONSTRAINTS]
rigid_bond = YES # constraint all bonds involving hydrogen
[ENSEMBLE]
ensemble = NPT # NPT ensemble
tpcontrol = BUSSI # BUSSI thermostat and barostat
temperature = 298.15 # target temperature (K)
[BOUNDARY]
type = PBC # periodic boundary condition
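The step counts in the [DYNAMICS] section follow from the time step, which is given in picoseconds (0.0025 ps = 2.5 fs). The Python sketch below reproduces that arithmetic; the function name `md_schedule` is hypothetical and the code is not part of GENESIS.

```python
def md_schedule(nsteps, timestep_ps, eneout_period, crdout_period):
    """Convert step counts from the [DYNAMICS] section into times and frame counts."""
    return {
        "total_time_ps": nsteps * timestep_ps,
        "energy_interval_ps": eneout_period * timestep_ps,
        "coordinate_interval_ps": crdout_period * timestep_ps,
        "n_frames": nsteps // crdout_period,
    }


# Values taken from the control file above.
schedule = md_schedule(nsteps=40000, timestep_ps=0.0025,
                       eneout_period=400, crdout_period=400)
```

With these values the run covers about 100 ps, with energies and coordinates written about every 1 ps.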
THREE
AVAILABLE PROGRAMS
3.1 Simulators
atdyn
The simulator that is parallelized with the atomic decomposition scheme. In most cases,
atdyn is applied to small systems or coarse-grained systems. The program runs on CPU
with the hybrid MPI+OpenMP protocol, where only double-precision calculation is avail-
able. Since the atomic decomposition is a simple parallelization scheme, the source code is
actually simple compared to that for the domain decomposition. Therefore, this program is
also useful to develop a new function of GENESIS.
spdyn
The simulator that is parallelized with the domain decomposition scheme. The program
is designed to achieve high-performance molecular dynamics simulations, such as micro-
second simulations and cellular-scale simulations. The program runs on not only CPU but
also CPU+GPU with the hybrid MPI+OpenMP protocol. Here, besides double-precision,
mixed-precision calculations are also available. In the mixed-precision model, force calcu-
lations are carried out with single precision, while integration of the equations of motion as
well as accumulation of the force and energy are done with double-precision.
GENESIS User Guide, 1.7.1
In the atomic decomposition MD, which is also called a replicated-data MD algorithm, all MPI proces-
sors have the same coordinates data of all atoms in the system. MPI parallelization is mainly applied to
the “DO loops” of the bonded and non-bonded interaction pair lists for the energy and force calculations.
Fig. 3.1 (a) shows a schematic representation of the atomic decomposition scheme for the non-bonded
interaction calculation in a Lennard-Jones system, where 2 MPI processors are used. In this scheme,
MPI_ALLREDUCE must be used to accumulate all the atomic forces every step, resulting in large com-
munication cost.
In the domain-decomposition MD, which is also called a distributed-data MD algorithm, the whole sys-
tem is decomposed into domains according to the number of MPI processors, and each MPI processor
is assigned to a specific domain. Each MPI processor handles the coordinates data of the atoms in the
assigned domain and in the buffer regions near the domain boundary, and carries out the calculation of
the bonded and non-bonded interactions in the assigned domain, enabling us to reduce computational
cost. In this scheme, communication of the atomic coordinates and forces in the buffer region is essen-
tial. Fig. 3.1 (b) is a schematic representation of the domain decomposition scheme, where the system is
decomposed into two domains to use 2 MPI processors. Note that in the figure the system periodicity is
not considered for simplicity.
Fig. 3.1: Parallelization scheme in the (a) atomic decomposition and (b) domain decomposition.
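The idea of assigning atoms to domains can be sketched in a few lines of Python. This is a toy illustration only, not the SPDYN implementation: it assumes a rectangular box with its origin at one corner and wraps coordinates periodically, and the function name `domain_index` is hypothetical.

```python
def domain_index(position, box, ndomains):
    """Map one atom position to the (ix, iy, iz) index of the domain containing it."""
    index = []
    for x, length, n in zip(position, box, ndomains):
        wrapped = x % length                  # fold the coordinate into [0, length)
        i = int(wrapped / length * n)         # which slab along this axis
        index.append(min(i, n - 1))           # guard against round-off at the upper edge
    return tuple(index)


# A 64 x 64 x 64 box split into 2 x 2 x 2 domains (8 MPI processors).
box = (64.0, 64.0, 64.0)
grid = (2, 2, 2)
atom_domain = domain_index((10.0, 40.0, 63.9), box, grid)
```

In the replicated-data scheme every process would hold all atoms; in the distributed-data scheme each process keeps only the atoms whose index matches its own domain, plus those in the buffer region.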
The users should understand the basic scheme of parallel calculation in SPDYN to get the best performance
in the calculation. As described above, the simulation box is divided into domains according to
the number of MPI processors. Each domain is further divided into smaller cells, each of whose size is
adjusted to be approximately equal to or larger than half of “pairlistdist + 𝛼”. Here, “pairlistdist” is
specified in the control file, and 𝛼 depends on the algorithms used in the simulation (see the next subsection).
Note that all domains or cells have the same size with a rectangular or cubic shape. Each MPI
processor is assigned to each domain, and data transfer or communication about atomic coordinates and
forces is achieved between only neighboring domains. In addition, calculation of bonded and non-bonded
interactions in each domain is parallelized based on the OpenMP protocol. These schemes realize hy-
brid MPI+OpenMP calculation, which is more efficient than flat MPI calculation on recent computers
with multiple CPU cores. Because MPI and OpenMP are designed for distributed-memory and shared-
memory architectures, respectively, MPI is mainly used for parallelization between nodes and OpenMP
is used within one node.
The following figures illustrate how the hybrid MPI+OpenMP calculations are achieved in SPDYN. In
Fig. 3.2 (a) and (b), 8 MPI processors with 4 OpenMP threads (32 CPU cores in total) and 27 MPI
processors with 2 OpenMP threads (54 CPU cores in total) are used, respectively. In these figures, only
the XY dimensions are shown for simplicity.
$ export OMP_NUM_THREADS=4
$ mpirun -np 8 ~/genesis/genesis-1.7.1/bin/spdyn INP > log
$ export OMP_NUM_THREADS=2
$ mpirun -np 27 ~/genesis/genesis-1.7.1/bin/spdyn INP > log
In the log file, the users can check whether the given numbers of MPI processors and OpenMP threads
are actually employed or not. The following information should be found in the log file for instance for
Case (a):
Note: In most cases, the number of domains in each dimension is automatically determined according
to the given number of MPI processors. However, if such automatic determination fails, the users
must specify the number of domains explicitly in the control file (see Boundary section).
Basically, there is no strict limitation on the available number of MPI processors in ATDYN. However,
there are a few limitations in SPDYN. First, the number of domains must be equal to the number of
MPI processors. Second, one domain must be composed of at least 8 cells (= 2×2×2), where the cell
size in one dimension is automatically set to be larger than half of “pairlistdist + 𝛼”. The following
table summarizes the 𝛼 value in each algorithm. According to these rules, the available “maximum”
number of MPI processors (𝑁max) for a certain target system is mainly determined by the simulation
box size and “pairlistdist”. For example, if the box size of your target system is 64×64×64 Å³, and
“pairlistdist=13.5” is specified in the control file, 𝑁max is 4×4×4 = 64 in the case of the NVT ensemble
and “rigid_bond=YES”. If the users want to use many more CPU cores than 𝑁max, the number of
OpenMP threads should be increased instead of the number of MPI processors.
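The rules above can be turned into a rough estimate of 𝑁max. The Python sketch below is an illustration under stated assumptions, not the algorithm implemented in SPDYN: the function name `max_mpi_processes` is hypothetical, and `alpha = 2.0` is only a placeholder value; the actual 𝛼 for a given algorithm should be taken from the table in this guide.

```python
import math


def max_mpi_processes(box, pairlistdist, alpha):
    """Estimate the largest domain grid allowed by the 2x2x2-cells-per-domain rule."""
    min_cell = 0.5 * (pairlistdist + alpha)       # lower bound on the cell size
    per_dim = []
    for length in box:
        ncells = math.floor(length / min_cell)    # cells that fit along this axis
        per_dim.append(ncells // 2)               # at least 2 cells per domain per axis
    return tuple(per_dim), math.prod(per_dim)


# Example from the text: 64 x 64 x 64 box, pairlistdist = 13.5.
# alpha = 2.0 is an assumed placeholder, not a value taken from this section.
grid, n_max = max_mpi_processes((64.0, 64.0, 64.0), pairlistdist=13.5, alpha=2.0)
```

With these inputs the sketch gives a 4×4×4 grid, that is 64 MPI processors, in agreement with the example in the text.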
In the MD simulation with the NPT ensemble, these rules become more important, because the box size
(or cell size) can change during the simulation. In fact, the number of domains in each dimension is
initially fixed, but the number of cells can be changed and adjusted to keep the cell size larger than
half of “pairlistdist + 𝛼”. If the box size decreases during the simulation, and the number of cells in
one dimension of the domain becomes one, the calculation stops immediately because of
the violation of the above rule. The users may often encounter this situation if the number of cells in one
dimension of the domain is just two at the beginning of the MD simulation, and the simulation box
shrinks significantly during the simulation. To avoid such problems, the users may have to use a smaller
number of MPI processors (which makes domains larger) or a shorter pairlistdist (which puts more cells in one
domain), or reconstruct a larger system.
If the users encountered the following error message in the simulation, the problem is probably related
to the above rules, where the specified number of MPI processors might exceed 𝑁max .
In this case, please make sure that one domain can be composed of at least 8 cells. If the domains and
cells are successfully determined, they can be seen in the early part of the log file. The following example
corresponds to the situation in Fig. 3.2 (b).
Fundamental functions in SPDYN and ATDYN are energy minimization (Min), molecular dynamics
method (MD), replica-exchange method (REMD), string method (String), and vibrational analysis (Vib).
As shown in the last part of the previous chapter, the users carry out simulations of these methods by
writing related sections in the control file. The users can extend these fundamental functions by combin-
ing various sections. For example, to run a “restrained MD simulation”, the users add [SELECTION]
and [RESTRAINTS] sections in the control file of the “normal MD simulation”. In fact, there are 17
individual sections in GENESIS version 1.4. The following table summarizes the available sections
in each function. Detailed usage of each section is described in this user guide, and also in the online
tutorials (https://www.r-ccs.riken.jp/labs/cbrt/tutorials2019/).
The following programs are available as the trajectory analysis tools in GENESIS. Basic usage of each
tool is similar to that of spdyn or atdyn. The users can automatically generate a template control file for
each program by using the “[program_name] -h ctrl” command. The control file is mainly composed of
INPUT, OUTPUT, TRAJECTORY, FITTING, SELECTION, and OPTION sections. The trajectory files
to be analyzed are specified in the [TRAJECTORY] section, and the parameters used in the analysis are
specified in the [OPTION] section. Note that the required sections depend on the program. For
example, eigmat_analysis requires only INPUT and OUTPUT sections. Detailed usage of each tool is
described in the online tutorial.
comcrd_analysis
Analyze the coordinates of the center of mass of the selected atoms.
diffusion_analysis
Analyze the diffusion constant.
distmat_analysis
Analyze the matrix of the averaged distance of the selected atoms.
drms_analysis
Analyze the distance RMSD of the selected atoms with respect to the initial structure.
fret_analysis
Analyze the FRET efficiency.
hb_analysis
Analyze the hydrogen bond.
lipidthick_analysis
Analyze the membrane thickness.
msd_analysis
Analyze the mean-square displacement (MSD) of the selected atoms or molecules.
qval_analysis
Analyze the fraction of native contacts (Q-value).
rg_analysis
Analyze the radius of gyration of the selected atoms.
rmsd_analysis
Analyze the root-mean-square deviation (RMSD) of the selected atoms with respect to the
initial structure.
tilt_analysis
Analyze the tilt angle.
trj_analysis
Analyze the distance, angle, dihedral angle, distance of the centers of mass (COM) of the
selected atom groups, angle of the COM of the selected atom groups, and dihedral angle of
the COM of the selected atom groups.
avecrd_analysis
Calculate the average structure of the target molecule.
flccrd_analysis
Calculate the variance-covariance matrix from the trajectories and averaged coordinates.
This tool can be also used to calculate root-mean-square fluctuation (RMSF).
eigmat_analysis
Diagonalize the variance-covariance matrix in PCA.
prjcrd_analysis
Project the trajectories onto PC axes.
pcavec_drawer
Create a script for VMD and PyMol to visualize PC vectors obtained from eigmat_analysis.
crd_convert
Convert trajectories to PDB/DCD formats. This tool can do centering of the target molecule,
fitting of a given atom group to the initial structure, wrapping of molecules into the unit cell,
combining multiple trajectory files into a single file, extraction of coordinates of selected
atoms, and so on.
remd_convert
Convert REMD trajectories to those sorted by parameters. Since the trajectory files are
generated from each replica during the REMD simulations, the obtained “raw” trajectories
are composed of “mixed” data at various conditions (replica parameters). remd_convert
enables the users to sort the REMD trajectories by parameters. This is applicable to not only
dcdfile but also energy log files.
rst_convert
Convert GENESIS restart file (rstfile) to the PDB file.
rst_upgrade
Convert old restart file (version < 1.1.0) to that in the new format (version >= 1.1.0).
wham_analysis
Free energy analysis using the Weighted Histogram Analysis Method (WHAM).
mbar_analysis
Free energy analysis using the Multistate Bennett Acceptance Ratio (MBAR) method.
pmf_analysis
3.2.5 Clustering
kmeans_clustering
Carry out k-means clustering for coordinates trajectories.
dssp_interface
Interface program to analyze the protein secondary structure in the DCD trajectory file using
the DSSP program (https://swift.cmbi.umcn.nl/gv/dssp/).
3.2.7 SPANA
SPANA (SPatial decomposition ANAlysis) is developed to carry out trajectory analyses of large-scale
biological simulations using multiple CPU cores in parallel. SPANA employs a spatial decomposition
of a system to distribute structural and dynamical analyses into the individual CPU core and allows us
to reduce the computational time for the analysis significantly. SPANA is suitable for the analysis of
systems with multiple macromolecules (such as cellular crowding systems) under the periodic boundary
condition.
contact_analysis
Calculate the number of close atomic pairs between given molecules. The close atomic
pairs (or atomic contacts) are defined if the closest atom-atom distance between two macromolecules
is shorter than a given cutoff distance. This program also finds the closest atom
pairs between macromolecule pairs within the cutoff distance.
density_analysis
Calculate 3D density distribution of atoms and output the density in X-PLOR/CCP4/DX
format.
hbond_analysis
Analyze hydrogen bonds.
rdf_analysis
Calculate the radial distribution function (RDF) and proximal distribution function (PDF) of
molecules (as solvent) around the target group (as solute). PDF provides density of solvent
as function of the distance to the surface of macromolecules.
sasa_analysis
Calculate solvent accessible surface area (SASA) of the target molecules. This program
outputs not only the total SASA but also the SASA for each atom in the target molecules.
rpath_generator
Generate inputs for the string method. This tool is usually used after targeted MD simulation
for generating an initial pathway for the subsequent string method.
pathcv_analysis
Calculate tangential and orthogonal coordinates to a pathway from samples.
qmmm_generator
Generate a system for QM/MM calculation from MD data.
emmap_generator
Generate cryo-EM density map from PDB file.
SPDYN can be employed with the parallel I/O protocol to achieve massively parallel computation. Since
SPDYN is parallelized with the domain decomposition scheme, each MPI processor has the coordinates
of atoms in the assigned domain. Therefore, a large amount of communication is needed between MPI
processors to write the coordinates in a single DCD file, which is a waste of time in the case of the
simulations for a huge system like 100,000,000 atoms. To avoid such situations, file I/O in each node
(parallel I/O) is useful. The following tools are used to handle the files generated from parallel I/O
simulations.
prst_setup
This tool divides input files (PDB and PSF) for a huge system into multiple files, where
each file is assigned to each domain. The obtained files can be read as restart files in the
[INPUT] section. Note that prst_setup is not compiled with Fujitsu compilers. Therefore,
if the users are going to perform MD simulations with parallel I/O in Fujitsu supercomputers,
the users must create the files without using Fujitsu compilers elsewhere in advance. Even
if prst_setup and SPDYN are compiled with different compilers, there is no problem to
execute SPDYN with parallel I/O.
pcrd_convert
Convert multiple trajectory files obtained from the parallel I/O simulation to a single DCD
file. This tool has a similar function to crd_convert.
FOUR
INPUT SECTION
In order to run MD simulations, the users have to prepare input files that contain information about
the coordinates of the initial structure as well as topology of the system and force field parameters. The
users first create those input files by using a setup tool, and their filenames are specified in the [INPUT]
section of the control file. GENESIS supports various input file formats such as CHARMM, AMBER,
and GROMACS. Basically, required input files depend on the force field to be used in the simulation.
The following table summarizes the essential input files and setup tools for each force field.
One of the commonly used parameters for biomolecules is the CHARMM force field, which was originally
developed by the Karplus group at Harvard University [7]. The users can obtain the files that
contain the force field parameters from the CHARMM group’s web site. At this moment, the latest version
of the CHARMM force field is C36m [13]. In the download file, there are topology and parameter
files (e.g., top_all36_prot.rtf and par_all36m_prot.prm).
In order to run the MD simulation with the CHARMM force field, the users have to additionally make a
new file that holds the information about the atom connectivity of the “whole” target system. Note that
the topology file (e.g., top_all36_prot.rtf) does not contain such information, because it is designed to
generally define the topology of proteins by dealing with the 20 amino acid residues as “fragments”. In
order to hold the topology information of the target system, the users will create a “PSF” file (protein
structure file). It is commonly used in other MD software, and can be generated from the PDB and
topology files by using VMD/PSFGEN [14], CHARMM-GUI [15], or the CHARMM program [2].
When the PSF file is created, “processed PDB” file is also obtained, where the atom name or residue
name might be changed from those in the original PDB file. The users must use this PDB file as the input
of the MD simulation, because it has a consistency with the information in PSF. Consequently, the users
need four files (processed PDB, parameter, topology, and PSF) as the inputs of GENESIS. These files
are specified in the [INPUT] section of the control file of GENESIS.
The AMBER force field has been also commonly used for the MD simulations of biomolecules, which
was originally developed by the Kollman group at the University of California, San Francisco [16]. GEN-
ESIS can deal with the AMBER force fields. Basic scheme to prepare the input files for GENESIS is
similar to that in the case of CHARMM. The users utilize the LEaP program in AmberTools [1]. LEaP
has a similar function to PSFGEN. After building the target system using LEaP, the users obtain PRM-
TOP, CRD, and PDB files. PRMTOP contains the information about parameter and topology of the target
system, and CRD and PDB include the coordinates of atoms in the initial structure. GENESIS uses these
files as the inputs.
GENESIS can deal with coarse-grained (CG) models such as the Go-model [17] and MARTINI [18].
In this case, the users again use external setup tools to build the system and prepare the parameter and
topology files. For the all-atom Go-model [19], the users use the SMOG server [20] or SMOG2 program
[21], which generates grotop and grocrd files. The grotop file contains the information about parameter
and topology, and the grocrd file includes the coordinates of the initial structure, both of which are the
file formats used in the GROMACS program. For the Karanicolas-Brooks (KB) Go-model [22] [23], the
users use the MMTSB server [24], which generates par, top, pdb, and psf files.
topfile
CHARMM topology file containing information about atom connectivity of residues and
other molecules. For details on the format, see the CHARMM web site [25].
parfile
CHARMM parameter file containing force field parameters, e.g. force constants and equi-
librium geometries.
strfile
CHARMM stream file containing both topology information and parameters.
psffile
CHARMM/X-PLOR ‘psffile’ containing information of the system such as atomic masses,
charges, and atom connectivities.
prmtopfile
AMBER ‘PARM’ or ‘prmtop’ file (AMBER7 or later format) containing information of
the system such as atomic masses, charges, and atom connectivities. For details about this
format, see the AMBER web site [26].
grotopfile
Gromacs ‘top’ file containing information of the system such as atomic masses, charges,
atom connectivities. For details about this format, see the Gromacs web site [27].
pdbfile
Coordinates file in the PDB format. If rstfile is also specified in the [INPUT] section, coor-
dinates in pdbfile are replaced with those in rstfile.
crdfile
Coordinates file in the CHARMM format. If pdbfile is also specified in the [INPUT] section,
coordinates in crdfile are NOT used. However, if pdbfile is not specified, coordinates in
crdfile are used. If rstfile is further specified, coordinates in rstfile are used.
ambcrdfile
Coordinates file in the AMBER format (ascii). If pdbfile is also specified in the [INPUT]
section, coordinates in ambcrdfile are NOT used. However, if pdbfile is not specified, coor-
dinates in ambcrdfile are used. If rstfile is further specified, coordinates in rstfile are used.
grocrdfile
Coordinates file in the GROMACS format (.gro file). If pdbfile is also specified in the [IN-
PUT] section, coordinates in grocrdfile are NOT used. However, if pdbfile is not specified,
coordinates in grocrdfile are used. If rstfile is further specified, coordinates in rstfile are
used. Note that velocities and simulation box size in grocrdfile are NOT used.
rstfile
Restart file in the GENESIS format. This file contains atomic coordinates, velocities, simula-
tion box size, and other variables which are essential to restart the simulation continuously.
If rstfile is specified in the [INPUT] section, coordinates in pdbfile, crdfile, grocrdfile, or
ambcrdfile are replaced with those in rstfile. The box size specified in the [BOUNDARY]
section is also overwritten. Note that pdbfile, crdfile, grocrdfile, or ambcrdfile should be still
specified in the [INPUT] section, even if rstfile is specified.
Note that the file format of rstfile was changed after ver. 1.1.0. The rst_upgrade tool enables
us to change the old format used in ver. 1.0.0 or older to the new one.
reffile
Reference coordinates (PDB file format) for positional restraints and coordinate fitting. This
file should contain the same total number of atoms as pdbfile, crdfile, ambcrdfile, or grocrd-
file.
ambreffile
Reference coordinates (‘amber crd’ file format) for positional restraints and coordinate fit-
ting. This file should contain the same total number of atoms as pdbfile or ambcrdfile.
groreffile
Reference coordinates (‘gro’ file format) for positional restraints and coordinate fitting. This
file should contain the same total number of atoms as pdbfile or grocrdfile.
modefile
Principal modes used with principal component (PC) restraints. This file contains only single
column ascii data. The XYZ values of each atom’s mode vector are stored from the low-
index modes.
localresfile (for SPDYN only)
This file defines restraints to be applied in the system. If you are not an expert of GENESIS,
we strongly recommend you to simply use the [RESTRAINTS] section for restraint instead
of using localresfile.
In localresfile, only bond, angle, and dihedral angle restraints can be defined. In addition, se-
lected atoms in localresfile must exist in the same cell in the domain decomposition scheme.
The restraint energy calculated for the lists in localresfile is NOT explicitly displayed in the
log file. Instead, the local restraint energy is hidden in the conventional bond, angle, and
dihedral angle energy terms of the log file.
The restraint potentials defined in localresfile are given by harmonic potentials:
$U(r) = k\,(r - r_0)^2$ for bonds
$U(\theta) = k\,(\theta - \theta_0)^2$ for bond angles
$U(\varphi) = k\,(\varphi - \varphi_0)^2$ for dihedral angles
Here, 𝑟, 𝜃, and 𝜑 are bond distance, angle, and dihedral angles, respectively; subscript 0
denotes their reference values; and 𝑘 is the force constant.
The syntax in localresfile is as follows:
The users must carefully specify the atom index in this file. The atom indexes in localresfile
must be consistent with those in the other input files such as psffile.
The following is an example of localresfile:
In REMD or RPATH simulations, input files (mainly coordinates and restart files) should be prepared
for each replica. In GENESIS, these multiple files can be specified easily in the [INPUT] section: if the
input filename includes ‘{}’, {} is automatically replaced with the replica index. For example,
in the case of REMD simulations with 4 replicas, we prepare input_1.pdb, input_2.pdb, input_3.pdb,
and input_4.pdb, and specify pdbfile = input_{}.pdb in the [INPUT] section. This rule is also
applicable to the restart filename.
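The ‘{}’ substitution rule can be mimicked with a small Python helper, which is handy for checking that all per-replica files exist before a run. This is a sketch under the assumption that replica indices start at 1, as in the input_1.pdb to input_4.pdb example; the function name is hypothetical and is not part of GENESIS.

```python
def expand_replica_filenames(template, nreplicas):
    """Replace '{}' in template with the 1-based replica index."""
    if "{}" not in template:
        raise ValueError("template must contain '{}'")
    return [template.replace("{}", str(i)) for i in range(1, nreplicas + 1)]


# File names expected for a 4-replica REMD run
for name in expand_replica_filenames("input_{}.pdb", 4):
    print(name)
```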
fitfile (for RPATH only; GENESIS 1.1.5 or later)
Reference coordinates for structure fitting. This file is only used in the string method. For
other cases (MD, MIN, or REMD), reffile, groreffile, or ambreffile is used for reference
coordinates for fitting, and this fitfile is simply ignored, even if it is specified in the [INPUT]
section.
4.6 Examples
[INPUT]
topfile = ../toppar/top_all36_prot.rtf
parfile = ../toppar/par_all36m_prot.prm
strfile = ../toppar/toppar_water_ions.str
psffile = ../build/input.psf
pdbfile = ../build/input.pdb
[INPUT]
topfile = ../toppar/top_all36_prot.rtf
parfile = ../toppar/par_all36m_prot.prm
strfile = ../toppar/toppar_water_ions.str
psffile = ../build/input.psf
pdbfile = ../build/input.pdb
reffile = ../build/input.pdb
[INPUT]
topfile = ../toppar/top_all36_prot.rtf, ../toppar/top_all36_lipid.rtf
parfile = ../toppar/par_all36m_prot.prm, ../toppar/par_all36_lipid.prm
strfile = ../toppar/toppar_water_ions.str
In this case, we specify multiple top and par files for proteins and lipids separated by commas.
If one line becomes very long, a backslash “\” can be used as a line continuation character:
[INPUT]
topfile = ../toppar/top_all36_prot.rtf, \
../toppar/top_all36_na.rtf, \
../toppar/top_all36_lipid.rtf
parfile = ../toppar/par_all36m_prot.prm, \
../toppar/par_all36_na.prm, \
../toppar/par_all36_lipid.prm
strfile = ../toppar/toppar_water_ions.str
psffile = ../build/input.psf
pdbfile = ../build/input.pdb
[INPUT]
prmtopfile = ../build/input.prmtop
ambcrdfile = ../build/input.crd
[INPUT]
grotopfile = ../build/input.top
grocrdfile = ../build/input.gro
In this case, we specify grotop and grocrd files obtained from the SMOG server or SMOG2 software.
MD simulation with the EEF1/IMM1/IMIC implicit solvent models (CHARMM19):
[INPUT]
topfile = ../support/aspara/toph19_eef1.1.inp
parfile = ../support/aspara/param19_eef1.1.inp
eef1file = ../support/aspara/solvpar.inp
psffile = ../build/input.psf
pdbfile = ../build/input.pdb
[INPUT]
topfile = ../support/aspara/top_all36_prot_eef1.1.rtf
parfile = ../toppar/par_all36_prot.prm
eef1file = ../support/aspara/solvpar22.inp
psffile = ../build/input.psf
pdbfile = ../build/input.pdb
[INPUT]
topfile = ../toppar/top_all36_prot.rtf
parfile = ../toppar/par_all36m_prot.prm
strfile = ../toppar/toppar_water_ions.str
psffile = ../build/input.psf
pdbfile = ../build/input.pdb
[INPUT]
topfile = ../toppar/top_all36_prot.rtf
parfile = ../toppar/par_all36m_prot.prm
strfile = ../toppar/toppar_water_ions.str
psffile = ../build/input.psf
pdbfile = ../build/input_rep{}.pdb
[INPUT]
topfile = ../toppar/top_all36_prot.rtf
parfile = ../toppar/par_all36m_prot.prm
strfile = ../toppar/toppar_water_ions.str
psffile = ../build/input.psf
pdbfile = ../build/input.pdb
rstfile = run_rep{}.rst
CHAPTER
FIVE
OUTPUT SECTION
GENESIS yields trajectory data (coordinates and velocities) in the DCD file format regardless of the
force field or MD algorithm. GENESIS can also generate a restart file (rstfile) during or at the end of the
simulation, which can be used to restart and extend the simulation continuously. Output frequency of each
file (e.g., crdout_period and velout_period) is specified in the [DYNAMICS] section in the case of the
MD, REMD, and RPATH simulations, or [MINIMIZE] section in the case of the energy minimization.
dcdfile
Filename for the coordinates trajectory data. Coordinates are written in the DCD format,
which is commonly used in various MD software such as CHARMM and NAMD. The file-
name must be given in the case of crdout_period > 0. However, if crdout_period =
0 is specified in the control file, no dcdfile is generated, even if the filename is specified in
the [OUTPUT] section.
dcdvelfile
Filename for the velocity trajectory data. Velocities are written in the DCD format. The
filename must be given in the case of velout_period > 0. However, if velout_period
= 0 is specified in the control file, no dcdvelfile is generated, even if the filename is specified
in the [OUTPUT] section.
rstfile
Filename for the restart data. The rstfile contains coordinates, velocities, simulation box
size, and so on. This file can be used to extend the simulation continuously. In addition,
it can be used to switch the simulation algorithms (e.g., from minimization to MD, from
MD to REMD, from REMD to minimization, etc.). The filename must be given in the case
of rstout_period > 0. However, if rstout_period = 0 is specified in the control file,
no rstfile is generated, even if the filename is specified in the [OUTPUT] section.
pdbfile (for ATDYN only)
Filename for the restart PDB file. This file is updated every rstout_period steps.
When the user performs REMD or RPATH simulations, the user must include ‘{}’ in the output filename.
This {} is automatically replaced with the replica index.
remfile (only for REMD simulations)
This file contains parameter index data from the REMD simulation, which is written for each
replica every exchange_period steps. This is used as an input file for the remd_convert
tool to sort the coordinates trajectory data by parameters. The filename must contain ‘{}’,
which is automatically replaced with the replica index. Note that the information about the
parameter index as well as replica index in the entire REMD simulation is written in the
standard (single) output file (see online Tutorials).
logfile (only for REMD and RPATH simulations)
This file contains the energy trajectory data from the REMD or RPATH simulations, which
is written for each replica every exchange_period steps. This is used as an input file for
the remd_convert tool to sort the coordinates trajectory data by parameters. The filename
must contain ‘{}’, which is automatically replaced with the replica index.
rpathfile (only for RPATH simulations)
This file contains the trajectory of image coordinates in the string method, which are refer-
ence values used in the restraint functions. Columns correspond to the collective variables,
and rows are time steps. This data is written with the same timing as the dcdfile. For details,
see RPATH section.
gamdfile
This file provides GaMD parameters determined during the GaMD simulation. The filename
must be given in the case of update_period > 0 in [GAMD] section. The GaMD simulation
updates its parameters every update_period steps, and the updated parameters are then written
to gamdfile. The file includes the maximum, minimum, average, and deviation of the total
potential or dihedral potential, which are calculated within the interval update_period.
minfofile
This file provides the coordinates and normal mode vectors of the molecules specified for
vibrational analysis. It is used in SINDO for visualizing the vibrational motion. It is also an
input file to start anharmonic vibrational calculations. See Vibration section for the vibra-
tional analysis.
fepfile
This file provides energy differences between adjacent windows in FEP simulations. The
filename must be given in the case of fepout_period > 0 in [ALCHEMY] section. How-
ever, if fepout_period = 0 is specified in the control file, no fepfile is generated, even if
the filename is specified in the [OUTPUT] section. If FEP/𝜆-REMD simulations are per-
formed (i.e., both [REMD] and [ALCHEMY] sections are specified in the control file), the
filename must contain ‘{}’, which is automatically replaced with the replica index.
5.6 Examples
[OUTPUT]
dcdfile = run.dcd
rstfile = run.rst
[OUTPUT]
logfile = run_rep{}.log
dcdfile = run_rep{}.dcd
remfile = run_rep{}.rem
rstfile = run_rep{}.rst
SIX
ENERGY SECTION
where 𝐾𝑏 , 𝐾𝜃 , 𝐾𝜑 , and 𝐾𝜑𝑖 are the force constants of the bond, angle, dihedral angle, and improper
dihedral angle terms, respectively, and 𝑏0 , 𝜃0 , 𝜑0 , and 𝜑𝑖,0 are the corresponding equilibrium values. 𝛿 is a
phase shift of the dihedral angle potential, 𝜖 is a Lennard-Jones potential well depth, 𝑅𝑚𝑖𝑛,𝑖𝑗 is a distance
of the Lennard-Jones potential minimum, 𝑞𝑖 is an atomic charge, 𝜖1 is an effective dielectric constant,
and 𝑟𝑖𝑗 is a distance between two atoms. The detailed formula and parameters in the potential energy
function depend on the force field and molecular model.
Calculation of the non-bonded interactions is the most time-consuming part of MD simulations. Without
any approximation, the computational time for the non-bonded interaction terms scales as 𝑂(𝑁²).
To reduce the computational cost, a cut-off approximation is introduced, in which the energy and force
calculation is truncated at a given cut-off distance (keyword cutoffdist). Simple truncation at the cut-off
distance leads to discontinuous energy and forces. It is therefore necessary to introduce a polynomial function
(the so-called switching function) that smoothly turns off the interaction starting from another given distance
(the so-called switch cut-off, keyword switchdist); this is generally applied to the van der Waals interactions. There
are two kinds of switching: “potential switch” and “force switch”. In GENESIS, potential switching
is turned on by default. However, in the case of the AMBER force field, potential switching is
turned off, since the original AMBER program package does not use potential switching. To turn on
“force switching”, vdw_force_switch=YES must be specified. Note that the cut-off scheme for the
electrostatic energy term differs from that for the van der Waals energy term: the former uses
a shift function. This shift is turned on when electrostatic=CUTOFF is specified.
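To make the idea of a potential switch concrete, the sketch below evaluates the polynomial switching function commonly associated with CHARMM, which equals 1 up to the switch distance and decays smoothly to 0 at the cut-off. Whether GENESIS uses exactly this polynomial is an assumption here (the text only says that a polynomial function is used), and the function name is hypothetical.

```python
def potential_switch(r, switchdist, cutoffdist):
    """CHARMM-style potential switch S(r), assumed form.

    S = 1 for r <= switchdist, S = 0 for r >= cutoffdist, and in between
    S = (rc^2 - r^2)^2 (rc^2 + 2 r^2 - 3 rs^2) / (rc^2 - rs^2)^3.
    """
    if r <= switchdist:
        return 1.0
    if r >= cutoffdist:
        return 0.0
    r2 = r * r
    on2 = switchdist * switchdist
    off2 = cutoffdist * cutoffdist
    return (off2 - r2) ** 2 * (off2 + 2.0 * r2 - 3.0 * on2) / (off2 - on2) ** 3


# Scaling factor applied to the van der Waals energy between 10 and 12 Angstrom
for r in (9.0, 10.0, 11.0, 12.0):
    print(r, potential_switch(r, 10.0, 12.0))
```

The switched energy would then be the unswitched pair energy multiplied by S(r); a force switch modifies the force instead and is not shown.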
Dielectric constant of the system. Note that the distance dependent dielectric constant is not
available in GENESIS.
vdw_force_switch YES / NO
Default : NO
This parameter determines whether the force switch function for van der Waals interactions is
employed or not. [34] Users must set this parameter carefully when the CHARMM
force field is used. Typically, “vdw_force_switch=YES” should be specified in the case of
CHARMM36.
vdw_shift YES / NO
Default : NO
This parameter determines whether the energy shift for the van der Waals interactions is
employed or not. If it is turned on, potential energy at the cut-off distance is shifted by
a constant value so as to nullify the energy at that distance, instead of the default smooth
quenching function. This parameter is available only when “forcefield = GROAMBER” or
“forcefield = GROMARTINI”.
dispersion_corr NONE / ENERGY / EPRESS
Default : NONE (automatically set to EPRESS in the case of AMBER)
This parameter determines how to deal with the long-range correction for the cut-off of
the van der Waals interactions. Note that the formula used for the correction is different
between the GROMACS and AMBER schemes. In the case of the CHARMM force field,
“dispersion_corr=NONE” is always used.
• NONE: No correction is carried out.
• ENERGY: Only energy correction is carried out.
• EPRESS: Both energy and internal pressure corrections are carried out.
implicit_solvent NONE / GBSA / EEF1 / IMM1 / IMIC (ATDYN only)
Default : NONE
Use implicit solvent or not.
• NONE: Do not use implicit solvent model
• GBSA: Use the GB/SA implicit water model (Only available with the CHARMM
all-atom or AMBER force fields in non-boundary condition (“type=NOBC” in the
[BOUNDARY] section)). [35] [36]
• EEF1: Use the EEF1 implicit water model (Only available with the CHARMM force
fields in NOBC) [37]
• IMM1: Use the IMM1 implicit membrane model (Only available with the CHARMM
force fields in NOBC) [38]
• IMIC: Use the IMIC implicit micelle model (Only available with the CHARMM force
fields in NOBC) [39]
contact_check YES / NO
Default : NO
If this parameter is set to YES, the lengths of all covalent bonds as well as the distances between
non-bonded atom pairs are checked at the beginning of the simulation. If long covalent bonds or
clashing atoms are detected, those atom indexes are displayed in the log file. If contact_check
is turned on, nonb_limiter is also automatically enabled. If the users want to turn on only
“contact_check”, please specify “contact_check = YES” and “nonb_limiter = NO” explicitly.
Note that this contact_check does not work in the parallel-io scheme. If you are using
SPDYN, please see also structure_check.
structure_check NONE / FIRST / DOMAIN (SPDYN only)
Default : NONE
If this parameter is set to FIRST or DOMAIN, the lengths of all covalent bonds as well as the
distances between non-bonded atom pairs are checked at the beginning of or during the simulation.
This option is similar to contact_check, but has an improved capability when the parallel-io
scheme is employed. In SPDYN, we recommend that users use this option instead of
contact_check. Since the structure check requires additional computational time, users should
turn off this option in the production run.
• NONE: Do not check the structure
• FIRST: Check the structure only at the beginning of the simulation
• DOMAIN: Check the structure whenever the pairlist is updated
nonb_limiter YES / NO
Default : NO (automatically set to be equal to contact_check)
If this parameter is set to YES, large force caused by the atomic clash is suppressed during
the simulation. Here, the atomic clash can be defined by minimum_contact (see below).
If “contact_check = YES” is specified, this parameter is automatically set to “YES”. If
the users want to turn on only “contact_check”, please specify “contact_check = YES” and
“nonb_limiter = NO” explicitly. This option is basically useful for the energy minimization
or equilibration of the system. However, we strongly recommend the users to turn off this
option in the production run, because suppression of large forces is an “unphysical” manip-
ulation to avoid unstable simulations.
minimum_contact Real
Default : 0.5 (unit : Å)
This parameter defines the clash distance, when contact_check = YES is specified. If
the distance between the non-bonded atoms is less than this value, energy and force are
computed using this distance instead of the actual distance.
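The effect of minimum_contact can be illustrated as a simple clamp on the pair distance. The sketch below uses a 12-6 Lennard-Jones energy written with a well depth and a minimum-energy distance purely as an example; the function names and the choice of this energy form are assumptions for illustration, not the GENESIS source code.

```python
def lennard_jones(r, epsilon, rmin):
    """12-6 Lennard-Jones energy: epsilon * [(rmin/r)^12 - 2 (rmin/r)^6]."""
    x6 = (rmin / r) ** 6
    return epsilon * (x6 * x6 - 2.0 * x6)


def limited_lennard_jones(r, epsilon, rmin, minimum_contact=0.5):
    """Evaluate the energy at max(r, minimum_contact) instead of r.

    Pairs closer than minimum_contact are treated as if they were exactly
    minimum_contact apart, which caps the energy of a clash.
    """
    r_eff = max(r, minimum_contact)
    return lennard_jones(r_eff, epsilon, rmin)


# A clash at 0.1 Angstrom is evaluated as if the atoms were 0.5 Angstrom apart
print(lennard_jones(0.1, 0.1, 3.5))
print(limited_lennard_jones(0.1, 0.1, 3.5))
```

The same clamped distance would be used for the force, which is why the limiter is described as an unphysical manipulation to be switched off in production runs.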
Here, the cut-off scheme can be used for the first term, because it decreases rapidly as distance between
atoms increases. The third term is so called self-energy, and is calculated only once. The second term
can be rewritten as:
$$\sum_{|\mathbf{G}| \neq 0} \frac{\exp(-|\mathbf{G}|^2 / 4\alpha^2)}{|\mathbf{G}|^2} \, |S(\mathbf{G})|^2$$
We cannot employ fast Fourier transformation (FFT) for the calculation of S(G) since atomic positions
are usually not equally spaced. In the smooth particle mesh Ewald (PME) method [40] [41], this structure
factor is approximated by using cardinal B-spline interpolation as:
$$S(\mathbf{G}) = \sum_i q_i \exp(i \mathbf{G} \cdot \mathbf{r}_i) \approx b_1(G_1) \, b_2(G_2) \, b_3(G_3) \, F(\mathbf{Q})(G_1, G_2, G_3)$$
where 𝑏1 (𝐺1 ), 𝑏2 (𝐺2 ), and 𝑏3 (𝐺3 ) are the coefficients brought by the cardinal B-spline interpolation of
order 𝑛 and Q is a 3D tensor obtained by interpolating atomic charges on the grids. Since this Q has
equally spaced structure, its Fourier transformation, F(Q), can be calculated by using FFT in the PME
method.
pme_ngrid_y Integer
Default : N/A (Optional)
Number of FFT grid points along y dimension. If not specified, program will determine an
appropriate number of grids using pme_max_spacing.
pme_ngrid_z Integer
Default : N/A (Optional)
Number of FFT grid points along z dimension. If not specified, program will determine an
appropriate number of grids using pme_max_spacing.
pme_multiple YES/NO (ATDYN only)
Default : NO
If pme_multiple is set to YES, MPI processes are divided into two groups to compute the
PME real and reciprocal parts individually.
pme_mul_ratio Integer (ATDYN only)
Default : 1
Ratio of the MPI processes for real and reciprocal PME term computations (only used when
“pme_multiple=YES” is specified).
FFT_scheme 1DALLGATHER / 1DALLTOALL / 2DALLTOALL (SPDYN only)
Default : 1DALLTOALL
This is a highly advanced option concerning reciprocal space calculations. Users usually
don’t need to change this option. See ref [42] for details.
Note: Both ATDYN and SPDYN use the OpenMP/MPI hybrid parallel fast Fourier transformation
library FFTE [43]. Due to a restriction of this library, the number of PME grid points must be a product
of powers of 2, 3, and 5 (i.e., it must have no other prime factors). Moreover, in SPDYN, there are several
additional rules for the PME grid numbers, which depend on the number of processes. In SPDYN, we first
define the domain numbers in each dimension such that their product equals the total number of MPI
processes. Let us assume that the domain numbers in each dimension are domain_x, domain_y, and
domain_z. The restrictions on the grid numbers are as follows:
(1) pme_ngrid_x should be multiple of (2* domain_x)
(2) pme_ngrid_y should be multiple of (2* domain_y)
(3) pme_ngrid_z should be multiple of domain_z
If the given number of PME grid points does not meet the above conditions, the program will automat-
ically reassign suitable grid numbers which are larger than those written in the control input. In such
cases, warning message will be shown in the log file.
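The grid rules above can be checked before submitting a job. The following Python sketch searches upward from a requested grid number for the next value that has only 2, 3, and 5 as prime factors and satisfies the SPDYN multiple-of rules (1) to (3). It is an illustration of those rules only: the function names are hypothetical, and the value GENESIS itself reassigns may differ.

```python
def is_235_smooth(n):
    """True if n has no prime factors other than 2, 3, and 5."""
    if n < 1:
        return False
    for p in (2, 3, 5):
        while n % p == 0:
            n //= p
    return n == 1


def next_valid_ngrid(requested, multiple_of):
    """Smallest n >= requested that is 2-3-5 smooth and a multiple of multiple_of."""
    if not is_235_smooth(multiple_of):
        raise ValueError("no valid grid number exists for this domain number")
    n = max(requested, 1)
    while not (is_235_smooth(n) and n % multiple_of == 0):
        n += 1
    return n


def suggest_pme_grids(ngrid, domain):
    """Apply rules (1)-(3): x and y use 2*domain, z uses domain."""
    nx, ny, nz = ngrid
    dx, dy, dz = domain
    return (
        next_valid_ngrid(nx, 2 * dx),
        next_valid_ngrid(ny, 2 * dy),
        next_valid_ngrid(nz, dz),
    )


# 62 grid points per axis on a 2 x 2 x 2 domain decomposition
print(suggest_pme_grids((62, 62, 62), (2, 2, 2)))
```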
The following keywords are relevant if the CHARMM or AMBER force field is used. For a linearly-
interpolating lookup table, table points are assigned at the unit interval of cutoff²/𝑟², and
energy/gradients are evaluated as a function of 𝑟² [11].
where
and
𝐿 = INT(Density × 𝑟2 )
and
𝑡 = Density × 𝑟2 − 𝐿
Implicit solvent model is useful to reduce computational cost for the simulations of biomolecules [45].
The GB/SA (Generalized Born/Solvent accessible surface area) model is one of the popular implicit
solvent models, where the electrostatic contribution to the solvation free energy (∆𝐺elec ) is computed
with the GB theory [46], and the non-polar contribution (∆𝐺np ) is calculated from the solvent accessible
surface area [47]. In the GB theory, solvent molecules surrounding the solute are approximated as a
continuum that has the dielectric constant of ~80. To date, various GB models have been developed. In
GENESIS, the OBC model [35] and LCPO method [36] are available in the calculations of the GB and
SA energy terms, respectively. Note that the GB/SA model is implemented in ATDYN but NOT SPDYN.
The solvation free energy is incorporated into the molecular mechanics potential energy function as an
effective energy term, namely, 𝑈 = 𝑈FF + ∆𝐺elec + ∆𝐺np .
where 𝜀p and 𝜀w are the dielectric constants of solute and solvent, respectively, 𝑞𝑖 and 𝑞𝑗 are the partial
charges on the i-th and j-th atoms, respectively. 𝜅 is the inverse of Debye length. 𝑓𝑖𝑗 is the effective
distance between the i- and j-th atoms, which depends on the degree of burial of the atoms, and is given
by
$$f_{ij} = \sqrt{r_{ij}^2 + R_i R_j \exp\left(\frac{-r_{ij}^2}{4 R_i R_j}\right)}.$$
Here, 𝑟𝑖𝑗 is the actual distance between the i- and j-th atoms, and 𝑅𝑖 is the effective Born radius of the
i-th atom, which is typically estimated in the Coulomb field approximation by
$$\frac{1}{R_i} = \frac{1}{\rho_i} - \frac{1}{4\pi} \int_{\mathrm{solute},\, r > \rho_i} \frac{1}{r^4} \, dV.$$
𝜌𝑖 is the radius of the i-th atom (mostly set to the atom’s van der Waals radius), and the integral is carried
out over the volume inside the solute but outside the i-th atom. In the case of an isolated ion, 𝑅𝑖 is equal
to its van der Waals radius. On the other hand, if the atom is buried inside a solute, 𝑅𝑖 becomes larger,
resulting in larger 𝑓𝑖𝑗 . In the OBC model, the effective Born radius is approximated as
$$\frac{1}{R_i} = \frac{1}{\tilde{\rho}_i} - \frac{1}{\rho_i} \tanh\left(\alpha \Psi_i - \beta \Psi_i^2 + \gamma \Psi_i^3\right),$$
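The two GB ingredients described above, the effective distance between a pair of atoms and the OBC effective Born radius, can be sketched in Python as follows. This is a minimal illustration, not the GENESIS code: the function names are hypothetical, the burial term Ψ is simply taken as an input, and the reduced radius is assumed to be the atomic radius minus the intrinsic offset (0.09 Å, the default of gbsa_vdw_offset). The default α, β, γ are the OBC2 values listed for gbsa_alpha, gbsa_beta, and gbsa_gamma.

```python
import math


def gb_effective_distance(r_ij, born_i, born_j):
    """f_ij = sqrt(r^2 + Ri*Rj*exp(-r^2 / (4*Ri*Rj)))."""
    rr = born_i * born_j
    return math.sqrt(r_ij ** 2 + rr * math.exp(-r_ij ** 2 / (4.0 * rr)))


def obc_born_radius(rho, psi, alpha=1.0, beta=0.8, gamma=4.85, offset=0.09):
    """1/R = 1/rho_tilde - tanh(a*psi - b*psi^2 + c*psi^3) / rho.

    Assumption: rho_tilde = rho - offset. psi measures how buried
    the atom is and must be computed elsewhere.
    """
    rho_tilde = rho - offset
    arg = alpha * psi - beta * psi ** 2 + gamma * psi ** 3
    inv_radius = 1.0 / rho_tilde - math.tanh(arg) / rho
    return 1.0 / inv_radius


# An exposed atom (psi = 0) versus a partly buried one (psi = 0.5)
print(obc_born_radius(1.5, 0.0), obc_born_radius(1.5, 0.5))
# f_ij tends to sqrt(Ri*Rj) at short range and to r_ij at long range
print(gb_effective_distance(0.0, 2.0, 2.0), gb_effective_distance(50.0, 2.0, 2.0))
```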
where 𝛾 is the surface tension coefficient, and 𝐴𝑖 is the surface area of the i-th atom. In the LCPO method,
𝐴𝑖 is calculated from a linear combination of the overlaps between the neighboring atoms, given by
$$A_i = P_{1i} \, 4\pi R_i^2 + P_{2i} \sum_{j=1}^{n} A_{ij} + P_{3i} \sum_{j=1}^{n} \sum_{k=1}^{m} A_{jk} + P_{4i} \sum_{j=1}^{n} \left[ A_{ij} \sum_{k=1}^{m} A_{jk} \right].$$
𝑃1−4 are the empirical parameters determined for each atom type, 𝑅𝑖 is the sum of the radius of the i-th atom
and the probe radius (typically 1.4 Å), and 𝐴𝑖𝑗 is the area of the i-th atom buried inside the j-th atom, given
by
$$A_{ij} = 2\pi R_i \left( R_i - \frac{r_{ij}}{2} - \frac{R_i^2 - R_j^2}{2 r_{ij}} \right)$$
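As a quick check of the pairwise overlap term, the expression A_ij = 2πR_i (R_i − r_ij/2 − (R_i² − R_j²)/(2 r_ij)) can be evaluated directly. The sketch below is illustrative only: the function name is hypothetical, and returning zero for spheres that do not overlap is an assumption added here (the case where one sphere lies entirely inside the other is not handled).

```python
import math


def lcpo_pair_overlap(r_ij, radius_i, radius_j):
    """Area of sphere i buried inside sphere j (radii include the probe radius)."""
    if r_ij >= radius_i + radius_j:
        # Assumed convention: no overlap means no buried area
        return 0.0
    return 2.0 * math.pi * radius_i * (
        radius_i - r_ij / 2.0 - (radius_i ** 2 - radius_j ** 2) / (2.0 * r_ij)
    )


# Two equal spheres of radius 2 whose centres are 2 apart
print(lcpo_pair_overlap(2.0, 2.0, 2.0))
```

For two equal spheres of radius 2 at a separation of 2, the formula gives 4π, which is the area of a spherical cap of height 1 on a sphere of radius 2.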
gbsa_eps_solvent Real
Default : 78.5
Dielectric constant of solvent 𝜀w .
gbsa_eps_solute Real
Default : 1.0
Dielectric constant of solute 𝜀p .
gbsa_alpha Real
Default : 1.0
The empirical parameter 𝛼 in the equation for the effective Born radius calculation.
“gbsa_alpha=0.8” for OBC1, and “gbsa_alpha=1.0” for OBC2.
gbsa_beta Real
Default : 0.8
The empirical parameter 𝛽 in the equation for the effective Born radius calculation.
“gbsa_beta=0.0” for OBC1, and “gbsa_beta=0.8” for OBC2.
gbsa_gamma Real
Default : 4.85
The empirical parameter 𝛾 in the equation for the effective Born radius calculation.
“gbsa_gamma=2.91” for OBC1, and “gbsa_gamma=4.85” for OBC2.
gbsa_salt_cons Real
Default : 0.2 (unit : mol/L)
Concentration of the monovalent salt solution.
gbsa_vdw_offset Real
Default : 0.09 (unit : Å)
Intrinsic offset 𝜌0 for the van der Waals radius.
gbsa_surf_tens Real
Default : 0.005 (unit : kcal/mol/Å²)
Surface tension coefficient 𝛾 in the SA energy term.
Note: Debye length is calculated by $\kappa^{-1} = \sqrt{\varepsilon_0 \varepsilon_w k_B T / 2 N_A e^2 I}$, where T is automatically set to the
target temperature specified in the [DYNAMICS] section. In the case of the energy minimization, T =
298.15 K is used. In the T-REMD simulations with the GB/SA model, each replica has an individual
Debye length depending on the assigned temperature.
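The Debye length formula in the note can be evaluated numerically to see how it depends on salt concentration and temperature. The sketch below assumes that I is the ionic strength in mol/m³ and that, for a monovalent salt, it equals the salt concentration (gbsa_salt_cons converted from mol/L); the function name is hypothetical and SI constants are used throughout.

```python
import math

EPSILON_0 = 8.8541878128e-12   # vacuum permittivity (F/m)
K_B = 1.380649e-23             # Boltzmann constant (J/K)
N_A = 6.02214076e23            # Avogadro constant (1/mol)
E_CHARGE = 1.602176634e-19     # elementary charge (C)


def debye_length_angstrom(salt_conc, eps_solvent=78.5, temperature=298.15):
    """Debye length in Angstrom for a monovalent salt at salt_conc (mol/L)."""
    if salt_conc <= 0.0:
        raise ValueError("salt concentration must be positive")
    ionic_strength = salt_conc * 1000.0  # mol/L -> mol/m^3 (1:1 salt assumed)
    numerator = EPSILON_0 * eps_solvent * K_B * temperature
    denominator = 2.0 * N_A * E_CHARGE ** 2 * ionic_strength
    return math.sqrt(numerator / denominator) * 1.0e10


# Default GB/SA settings: 0.2 mol/L, eps_w = 78.5, T = 298.15 K
print(debye_length_angstrom(0.2))
```

With the default settings this gives a Debye length of roughly 7 Å; raising the temperature, as for the higher replicas in T-REMD, lengthens it slightly.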
In the EEF1 implicit solvent model [37], the effective energy W of a solute molecule is defined as the
sum of the molecular mechanics potential energy 𝐸MM and solvation free energy ∆𝐺solv , given by
𝑊 = 𝐸MM + ∆𝐺solv ,
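where
$$\Delta G_{\mathrm{solv}} = \sum_i \Delta G_i^{\mathrm{ref}} - \sum_i \sum_{j \neq i} g_i(r_{ij}) V_j,$$
$$g_i(r_{ij}) = \frac{\Delta G_i^{\mathrm{free}}}{2\pi \sqrt{\pi} \, \lambda_i r_{ij}^2} \exp\left\{ -\left( \frac{r_{ij} - R_i}{\lambda_i} \right)^2 \right\}.$$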
𝑟𝑖𝑗 is the distance between atoms i and j, and 𝑉𝑗 is the volume of the j-th atom. The function 𝑔𝑖 is
the density of the solvation free energy of the i-th atom, defined with the van der Waals radius 𝑅𝑖 and
thickness of the first hydration shell 𝜆𝑖 . $\Delta G_i^{\mathrm{ref}}$ is the solvation free energy of the atom when it is fully
exposed to solvent. $\Delta G_i^{\mathrm{free}}$ is similar to $\Delta G_i^{\mathrm{ref}}$, but is determined to satisfy the zero solvation energy of
deeply buried atoms.
In the IMM1 implicit membrane [38] and IMIC implicit micelle models [39], $\Delta G_i^{\mathrm{free}}$ as well as $\Delta G_i^{\mathrm{ref}}$ are
defined as a combination of the solvation free energies of the i-th solute atom in water and cyclohexane:
$$\Delta G_i^{\mathrm{ref}} = f_i \, \Delta G_i^{\mathrm{ref,water}} + (1 - f_i) \, \Delta G_i^{\mathrm{ref,cyclohexane}},$$
where f is a function that describes the transition between water and cyclohexane phases.
In the IMM1 model, 𝑓𝑖 is given by the sigmoidal function:
$$f(z_i') = \frac{{z_i'}^{n}}{1 + {z_i'}^{n}},$$
where 𝑧𝑖′ = |𝑧𝑖 |/(𝑇 /2), 𝑧𝑖 is the z-coordinate of the i-th atom, and T is the membrane thickness. n
controls the steepness of the membrane-water interface. In the IMM1 model, the membrane is centered
at 𝑧 = 0.
In the IMIC model, the following function is used for 𝑓𝑖 :
$$f(d_i) = \frac{1}{2} \left\{ \tanh(s d_i) + 1 \right\},$$
where 𝑑𝑖 is the depth of the solute atom i from the micelle surface, and s controls the steepness of the
micelle-water interface. The shape of the micelle is defined using a super-ellipsoid function:
$$\left\{ \left( \frac{|x|}{a} \right)^{2/m_2} + \left( \frac{|y|}{b} \right)^{2/m_2} \right\}^{m_2/m_1} + \left( \frac{|z|}{c} \right)^{2/m_1} = 1.$$
a, b, and c are the semi-axes of the super-ellipsoid, and 𝑚1 and 𝑚2 determine the shape of the cross
section in the super-ellipsoid. In the case of 𝑚1 = 𝑚2 = 1, the equation gives an ordinary ellipsoid.
If 0 < 𝑚1 < 1 and 𝑚2 = 1, the cross section in a plane perpendicular to the XY -plane is expanded,
keeping the semi-axes at the given lengths, and the shape also resembles a bicelle or nanodisc. If 𝑚1 = 1
and 0 < 𝑚2 < 1, the cross section in a plane parallel to the XY -plane is expanded. The shape becomes
close to rectangle as both 𝑚1 and 𝑚2 decrease. Note that 𝑚 < 0 or 𝑚 > 1 is not allowed, because it
produces a non-micelle-like shape resembling an octahedron. In the IMIC model, the micelle is centered
at the origin of the system (𝑥, 𝑦, 𝑧) = (0, 0, 0).
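The super-ellipsoid that defines the micelle shape can be explored with a short Python function that evaluates the left-hand side, {(|x|/a)^(2/m2) + (|y|/b)^(2/m2)}^(m2/m1) + (|z|/c)^(2/m1), at a point. Values below 1 lie inside the surface, 1 on it, and above 1 outside. The function name is hypothetical, and the parameter check (0 < m ≤ 1) follows the restriction stated in the text.

```python
def superellipsoid_value(x, y, z, a, b, c, m1=1.0, m2=1.0):
    """Left-hand side of the super-ellipsoid equation centred at the origin."""
    if not (0.0 < m1 <= 1.0 and 0.0 < m2 <= 1.0):
        raise ValueError("m1 and m2 must satisfy 0 < m <= 1")
    xy_term = (abs(x) / a) ** (2.0 / m2) + (abs(y) / b) ** (2.0 / m2)
    return xy_term ** (m2 / m1) + (abs(z) / c) ** (2.0 / m1)


def is_inside_micelle(x, y, z, a, b, c, m1=1.0, m2=1.0):
    """True if the point lies inside or on the super-ellipsoid surface."""
    return superellipsoid_value(x, y, z, a, b, c, m1, m2) <= 1.0


# With m1 = m2 = 1 the shape is an ordinary ellipsoid
print(superellipsoid_value(1.0, 1.0, 1.0, 2.0, 2.0, 2.0))
# Lowering m2 expands the cross section parallel to the XY-plane
print(is_inside_micelle(1.6, 1.6, 0.0, 2.0, 2.0, 2.0, m1=1.0, m2=1.0))
print(is_inside_micelle(1.6, 1.6, 0.0, 2.0, 2.0, 2.0, m1=1.0, m2=0.5))
```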
In the IMM1 and IMIC models, a distance-dependent dielectric constant is used for the electrostatic
interactions. The dielectric constant depends on the positions of interacting atoms with respect to the
membrane/micelle surface, defined as
$$\epsilon = r^{\, p + (1 - p) \sqrt{f_i f_j}},$$
where r is the distance between the i-th and j-th atoms, and p is an empirical parameter to adjust
strength of the interactions (p = 0.85 for CHARMM19 and 0.91 for CHARMM36). Far from the mem-
brane/micelle surface, the dielectric constant 𝜖 is close to r, corresponding to the EEF1 model, while in
the membrane/micelle center, it provides strengthened interactions. The IMIC model is nearly equivalent
to the IMM1 model when a and 𝑏 → ∞ and c is half membrane thickness.
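The transition functions and the position-dependent dielectric constant described above can be combined in a short Python sketch. It is only an illustration of the formulas f(z') = z'^n / (1 + z'^n), f(d) = {tanh(s d) + 1}/2, and ε = r^(p + (1 − p)√(f_i f_j)): the function names are hypothetical, and the defaults are the documented defaults of imm1_memb_thick, imm1_exponent_n, and imm1_factor_a.

```python
import math


def imm1_transition(z, thickness=27.0, n=10):
    """IMM1: f = z'^n / (1 + z'^n) with z' = |z| / (T/2); membrane centred at z = 0."""
    zp_n = (abs(z) / (thickness / 2.0)) ** n
    return zp_n / (1.0 + zp_n)


def imic_transition(depth, steepness):
    """IMIC: f = (tanh(s*d) + 1) / 2 for depth d from the micelle surface."""
    return 0.5 * (math.tanh(steepness * depth) + 1.0)


def effective_dielectric(r, f_i, f_j, p=0.91):
    """epsilon = r ** (p + (1 - p) * sqrt(f_i * f_j))."""
    return r ** (p + (1.0 - p) * math.sqrt(f_i * f_j))


# f goes from 0 at the membrane centre to about 1 in bulk water
for z in (0.0, 13.5, 30.0):
    print(z, imm1_transition(z))
# Two atoms 5 Angstrom apart: in water (f = 1) and at the membrane centre (f = 0)
print(effective_dielectric(5.0, 1.0, 1.0), effective_dielectric(5.0, 0.0, 0.0))
```

In water the exponent is 1 and ε reduces to r, as in EEF1; at the membrane centre the exponent drops to p, giving a smaller ε and therefore stronger electrostatic interactions.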
In the control file of GENESIS, the following parameters are specified, and the other parameters such as
𝑉, $\Delta G_i^{\mathrm{ref}}$, $\Delta G_i^{\mathrm{free}}$, and 𝜆 in the above equations are read from the eef1file, which is set in the [INPUT]
section.
imm1_memb_thick Real
Default : 27.0 (unit : Å)
Membrane thickness T in IMM1
imm1_exponent_n Real
Default : 10
Steepness parameter n in IMM1
imm1_factor_a Real
Default : 0.91
Adjustable empirical parameter p in IMM1 and IMIC. p = 0.85 and 0.91 are recommended
for CHARMM19 and CHARMM36, respectively.
imm1_make_pore YES / NO
Default : NO
Use IMM1-pore model [49]
imm1_pore_radius Real
Default : 5.0 (unit : Å)
Aqueous pore radius in the IMM1-pore model
imic_axis_a Real
6.7 Examples
Simulation with the CHARMM36 force field in the periodic boundary condition:
[ENERGY]
forcefield = CHARMM # CHARMM force field
electrostatic = PME # use Particle mesh Ewald method
switchdist = 10.0 # switch distance
cutoffdist = 12.0 # cutoff distance
pairlistdist = 13.5 # pair-list distance
vdw_force_switch = YES # force switch option for van der Waals
pme_nspline = 4 # order of B-spline in [PME]
pme_max_spacing = 1.2 # max grid spacing allowed
Simulation with the AMBER force field in the periodic boundary condition:
[ENERGY]
forcefield = AMBER # AMBER force field
electrostatic = PME # use Particle mesh Ewald method
switchdist = 8.0 # switch distance
cutoffdist = 8.0 # cutoff distance
pairlistdist = 9.5 # pair-list distance
Recommended options in the case of energy minimization (see Minimize section) for the initial structure
with the CHARMM36 force field:
[ENERGY]
forcefield = CHARMM # CHARMM force field
electrostatic = PME # use Particle mesh Ewald method
switchdist = 10.0 # switch distance
cutoffdist = 12.0 # cutoff distance
pairlistdist = 13.5 # pair-list distance
vdw_force_switch = YES # force switch option for van der Waals
pme_nspline = 4 # order of B-spline in [PME]
pme_max_spacing = 1.2 # max grid spacing allowed
contact_check = YES # check atomic clash
nonb_limiter = YES # avoid failure due to atomic clash
minimum_contact = 0.5 # definition of atomic clash distance
[ENERGY]
forcefield = CHARMM # CHARMM force field
electrostatic = CUTOFF # use cutoff scheme
switchdist = 23.0 # switch distance
cutoffdist = 25.0 # cutoff distance
pairlistdist = 27.0 # pair-list distance
implicit_solvent = GBSA # Turn on GBSA calculation
gbsa_eps_solvent = 78.5 # solvent dielectric constant in GB
gbsa_eps_solute = 1.0 # solute dielectric constant in GB
gbsa_salt_cons = 0.2 # salt concentration (mol/L) in GB
gbsa_surf_tens = 0.005 # surface tension (kcal/mol/A^2) in SA
[ENERGY]
forcefield = CHARMM # CHARMM force field
electrostatic = CUTOFF # use cutoff scheme
switchdist = 16.0 # switch distance
cutoffdist = 18.0 # cutoff distance
pairlistdist = 20.0 # pair-list distance
implicit_solvent = IMM1 # Turn on IMM1 calculation
imm1_memb_thick = 27.0 # membrane thickness in IMM1
CHAPTER
SEVEN
DYNAMICS SECTION
In MD simulations, Newton’s equation of motion (F = ma) is integrated numerically, where the force F
is obtained as the negative first derivative of the potential energy function with respect to the atomic position.
To date, various integrators have been proposed. In the leap-frog algorithm, velocities and coordinates are updated with
$$\mathbf{v}_i\left(t + \frac{\Delta t}{2}\right) = \mathbf{v}_i\left(t - \frac{\Delta t}{2}\right) + \frac{\Delta t}{m_i} \mathbf{F}_i(t),$$
$$\mathbf{r}_i(t + \Delta t) = \mathbf{r}_i(t) + \Delta t \, \mathbf{v}_i\left(t + \frac{\Delta t}{2}\right).$$
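In the velocity Verlet algorithm, coordinates and velocities are obtained at the same time. The velocities
are updated with
$$\mathbf{v}_i(t) = \mathbf{v}_i(t - \Delta t) + \frac{\Delta t}{m_i} \, \frac{\mathbf{F}_i(t) + \mathbf{F}_i(t - \Delta t)}{2},$$
$$\mathbf{r}_i(t + \Delta t) = \mathbf{r}_i(t) + \Delta t \, \mathbf{v}_i(t) + \frac{\Delta t^2}{2 m_i} \mathbf{F}_i(t).$$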
In ATDYN, both leap-frog and velocity Verlet integrators are available. The multiple time step integrator
(r-RESPA [50]) is also available in SPDYN. The users must pay attention to the [ENSEMBLE] section
as well, because the algorithms that control the temperature and pressure are involved in the integrator.
For details, see Ensemble section.
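The velocity Verlet update can be tried out on a one-dimensional harmonic oscillator. The Python sketch below is a generic textbook implementation in reduced units, not the GENESIS integrator: it omits thermostats, barostats, and constraints, and the function names and the test system are illustrative assumptions.

```python
def velocity_verlet(x, v, force, mass, dt, nsteps):
    """Integrate one particle in 1D with the velocity Verlet scheme.

    Each step moves the coordinate with the current velocity and force,
    then updates the velocity with the average of old and new forces.
    """
    f = force(x)
    for _ in range(nsteps):
        x = x + dt * v + 0.5 * dt * dt * f / mass
        f_new = force(x)
        v = v + 0.5 * dt * (f + f_new) / mass
        f = f_new
    return x, v


def harmonic_force(x, k=1.0):
    """F = -dU/dx for U = k x^2 / 2."""
    return -k * x


# Oscillator with k = m = 1 started at x = 1, v = 0 (total energy 0.5)
x, v = velocity_verlet(1.0, 0.0, harmonic_force, 1.0, 0.01, 1000)
energy = 0.5 * v * v + 0.5 * x * x
print(x, v, energy)
```

After 1000 steps of 0.01 the position should stay close to cos(10) and the total energy close to 0.5, which illustrates the good energy conservation of this integrator for a sufficiently small time step.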
timestep Real
Default : 0.001 (unit : ps)
Time step in the MD run. In general, timestep can be extended to 2 fs or longer, when the
SHAKE, RATTLE, or SETTLE algorithms are employed. (see Constraints section).
nsteps Integer
Default : 100
Total number of steps in one MD run. If “timestep=0.001” and “nsteps=1000000” are spec-
ified, the users can carry out 1-ns MD simulation.
eneout_period Integer
Default : 10
Output frequency for the energy data. The energy data are written in the log file every
eneout_period steps during the simulation. For example, if “timestep=0.001” and
“eneout_period=1000” are specified, the energy is written every 1 ps.
crdout_period Integer
Default : 0
Output frequency for the coordinates data. The trajectories are written in the “dcdfile” spec-
ified in the [OUTPUT] section every crdout_period steps during the simulation.
velout_period Integer
Default : 0
Output frequency for the velocities data. The trajectories are written in the “dcdvelfile”
specified in the [OUTPUT] section every velout_period steps during the simulation.
rstout_period Integer
Default : 0
Output frequency for the restart file. The restart information is written in the “rstfile” spec-
ified in the [OUTPUT] section every rstout_period steps during the simulation.
Note: In the REMD or RPATH simulations, the value must be a multiple of exchange_period (REMD)
or rpath_period (RPATH).
stoptr_period Integer
Default : 10
Frequency of removing translational and rotational motions of the whole system. Note that
the rotational motion is not removed when the periodic boundary condition is employed.
When you use positional restraints or RMSD restraints in the simulation, you may have
to take care about removal of those motions. In some cases, such restraints can generate
translational or rotational momentum in the system. If the momentum is frequently removed,
the dynamics can be significantly disturbed.
nbupdate_period Integer
Default : 10
In the simulated annealing or heating protocol, the following keywords are additionally specified in the
conventional MD simulation. In the protocol used in GENESIS, the target temperature is changed linearly.
Note that the protocol is available only in the LEAP integrator.
annealing YES / NO
Default : NO
Turn on or off the simulated annealing or heating protocol.
anneal_period Integer
Default : 0
The target temperature is changed every anneal_period steps during the simulation.
dtemperature Real
Default : 0.0 (unit : Kelvin)
Magnitude of changes of the target temperature. If dtemperature > 0, the temperature is
increased by dtemperature every anneal_period steps. If dtemperature < 0, the temperature
is decreased.
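The stepwise temperature schedule implied by anneal_period and dtemperature can be previewed with a small Python function. This is a sketch under stated assumptions: the initial target temperature is taken as an input, the change is assumed to be applied each time a full anneal_period has elapsed, and the function name is hypothetical.

```python
def target_temperature(step, initial_temperature, dtemperature, anneal_period):
    """Target temperature at a given MD step.

    The temperature changes by dtemperature after every anneal_period steps;
    anneal_period <= 0 is treated as 'no annealing'.
    """
    if anneal_period <= 0:
        return initial_temperature
    return initial_temperature + dtemperature * (step // anneal_period)


# Heating from 100 K in 2 K increments every 500 steps
for step in (0, 499, 500, 1000, 50000):
    print(step, target_temperature(step, 100.0, 2.0, 500))
```

With these settings the target temperature reaches 300 K after 50000 steps, so nsteps, anneal_period, and dtemperature should be chosen together to hit the intended final temperature.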
In GENESIS, targeted MD (TMD) and steered MD (SMD) methods are available. These methods are
useful to guide a protein structure towards a target. In SMD, restraint forces (or steering forces) are
applied on the selected atoms, where the RMSD with respect to the target is changed during the MD
simulation. The restraint force is calculated from the derivative of the RMSD restraint potential:
$$U = \frac{1}{2} k \left( RMSD(t) - RMSD_0(t) \right)^2$$
where 𝑅𝑀 𝑆𝐷(𝑡) is the instantaneous RMSD of the current coordinates from the target coordinates, and
𝑅𝑀 𝑆𝐷0 is the target RMSD value. The target RMSD value is changed linearly from the initial to target
RMSD values:
𝑡
𝑅𝑀 𝑆𝐷0 (𝑡) = 𝑅𝑀 𝑆𝐷initial + (𝑅𝑀 𝑆𝐷final − 𝑅𝑀 𝑆𝐷initial )
𝑇
where 𝑇 is the total MD simulation time. Targeted MD (TMD), originally suggested by J. Schlitter et
al. [51], differs from SMD in that the force constants change during the MD simulation. In SMD, a
large difference between the instantaneous RMSD and the target RMSD may be observed. In TMD, the
force constants are given by Lagrangian multipliers so that the system overcomes the energy barrier
between the instantaneous and target RMSDs. The resulting trajectories therefore have an RMSD that
is almost identical to the target RMSD at each time. In the [SELECTION] section, the users select
the atoms involved in the RMSD calculation for SMD or TMD. Users should specify either RMSD or
RMSDMASS (mass-weighted RMSD) in the [RESTRAINTS] section to run TMD or SMD. In SMD, the force
constants defined in the [RESTRAINTS] section are used, whereas in TMD the force constants are
determined automatically using Lagrangian multipliers during the simulation.
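The linear schedule of the target RMSD and the harmonic restraint energy used in SMD can be sketched as follows. This is an illustrative Python sketch, not GENESIS code; the function names and the numerical values are assumptions chosen for the example.

```python
def target_rmsd(t, total_time, rmsd_initial, rmsd_final):
    """Target RMSD at time t, changed linearly from rmsd_initial to rmsd_final."""
    return rmsd_initial + (t / total_time) * (rmsd_final - rmsd_initial)


def smd_restraint_energy(k, rmsd_now, rmsd_target):
    """Harmonic RMSD restraint potential U = (1/2) k (RMSD - RMSD0)^2."""
    return 0.5 * k * (rmsd_now - rmsd_target) ** 2


# Hypothetical run: steer the RMSD from 8.0 to 2.0 Angstrom over 100 ps.
halfway = target_rmsd(50.0, 100.0, 8.0, 2.0)
# Energy when the instantaneous RMSD lags 0.5 Angstrom behind the target.
energy = smd_restraint_energy(10.0, halfway + 0.5, halfway)
```

In this sketch the target RMSD is halfway between the initial and final values at t = T/2, and the restraint energy grows quadratically with the lag between the instantaneous and target RMSD.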
target_md YES / NO
Default : NO
Turn on or off the targeted MD simulation.
steered_md YES / NO
Default : NO
Turn on or off the steered MD simulation.
initial_rmsd Real
Note: In the RMSD restraint, the structure fitting scheme is specified in the [FITTING] section (see
Fitting section). Since the default behavior was significantly changed in ver. 1.1.5 (no fitting is
applied in the default setting), users of version 1.1.4 or earlier must pay special attention to the
fitting scheme. In version 1.1.4 and earlier, structure fitting was automatically applied to the atoms
subject to the restraint potential.
7.4 Examples
100-ps MD simulation with the velocity Verlet integrator with the timestep of 2 fs:
[DYNAMICS]
integrator = VVER # velocity Verlet
nsteps = 50000 # number of MD steps (100ps)
timestep = 0.002 # timestep (2fs)
eneout_period = 500 # energy output period (1ps)
crdout_period = 500 # coordinates output period (1ps)
rstout_period = 50000 # restart output period
nbupdate_period = 10 # nonbond pair list update period
100-ps MD simulation with the RESPA integrator with the timestep of 2.5 fs:
[DYNAMICS]
integrator = VRES # RESPA integrator
nsteps = 40000 # number of MD steps (100ps)
timestep = 0.0025 # timestep (2.5fs)
eneout_period = 400 # energy output period (1ps)
crdout_period = 400 # coordinates output period (1ps)
rstout_period = 40000 # restart output period
nbupdate_period = 10 # nonbond pair list update period
elec_long_period = 2 # period of reciprocal space calculation
thermostat_period = 10 # period of thermostat update
barostat_period = 10 # period of barostat update
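The step counts in the two examples above follow from dividing the desired length by the timestep. A minimal Python sketch of that arithmetic is given below; the helper name md_steps is hypothetical and is not part of GENESIS.

```python
def md_steps(length_ps, timestep_ps):
    """Number of MD steps needed to cover length_ps with the given timestep."""
    # round() guards against floating-point noise such as 49999.999999.
    return round(length_ps / timestep_ps)


nsteps_vver = md_steps(100.0, 0.002)    # 100 ps with a 2 fs timestep
nsteps_vres = md_steps(100.0, 0.0025)   # 100 ps with a 2.5 fs timestep
eneout_vver = md_steps(1.0, 0.002)      # 1 ps output interval, 2 fs timestep
eneout_vres = md_steps(1.0, 0.0025)     # 1 ps output interval, 2.5 fs timestep
```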
The following is an example of simulated annealing in the NVT ensemble (see Ensemble section), where
the temperature is decreased from 500 K by 2 K every 250 steps in the 25,000-step MD simulation (1
step = 2 fs). Thus, the temperature eventually reaches 300 K after 50 ps. Note that heating or
annealing is only available with the leap-frog integrator.
[DYNAMICS]
integrator = LEAP # leap-frog integrator
nsteps = 25000 # number of MD steps
timestep = 0.002 # timestep (ps)
nbupdate_period = 10 # nonbond pair list update period
annealing = YES # simulated annealing
dtemperature = -2.0 # delta temperature
anneal_period = 250 # temperature change period
[ENSEMBLE]
ensemble = NVT # [NVT,NPT,NPAT,NPgT]
tpcontrol = LANGEVIN # [BERENDSEN,LANGEVIN]
temperature = 500.0 # initial temperature (K)
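The temperature schedule implied by this control file can be checked with a short Python sketch. It assumes that the target temperature is changed at every multiple of anneal_period, including the last step; that timing is an assumption of this sketch rather than something stated above, and the function name is hypothetical.

```python
def annealing_schedule(t_start, dtemperature, anneal_period, nsteps):
    """Return (step, target temperature) pairs after each temperature change."""
    schedule = []
    temperature = t_start
    for step in range(anneal_period, nsteps + 1, anneal_period):
        temperature += dtemperature
        schedule.append((step, temperature))
    return schedule


# Values taken from the control file above.
schedule = annealing_schedule(500.0, -2.0, 250, 25000)
final_step, final_temperature = schedule[-1]
total_time_ps = 25000 * 0.002
```

With these inputs the schedule contains 100 temperature changes and ends at 300 K after 50 ps.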
CHAPTER
EIGHT
MINIMIZE SECTION
In the [MINIMIZE] section, the user can select methods for energy minimization. Currently, the
steepest descent (SD) algorithm is available in SPDYN and ATDYN, and the limited memory version of
Broyden-Fletcher-Goldfarb-Shanno (LBFGS) is additionally available in ATDYN. Note that constraint
algorithms such as SHAKE are not available in the energy minimization scheme in GENESIS. The
energy minimization can be done with restraints (see Restraints section).
When the energy minimization is carried out for the initial structure, it is strongly recommended to use
the option “contact_check=YES” in the [ENERGY] section (see Energy section). This is because the
initial structure is usually artificial, and sometimes contains atomic clashes, where the distance between
atoms is very short. Such strong interactions can generate huge forces on the atoms, resulting in unstable
calculations, which might cause memory errors.
method SD / LBFGS
Default : LBFGS (for ATDYN), SD (for SPDYN)
Algorithm of minimization.
• SD : Steepest descent method
• LBFGS : Limited memory version of Broyden-Fletcher-Goldfarb-Shanno method
(ATDYN only)
nsteps Integer
Default : 100
Number of minimization steps.
eneout_period Integer
Default : 10
Frequency of energy outputs.
crdout_period Integer
Default : 0
Frequency of coordinates outputs.
rstout_period Integer
Default : 0
Frequency of restart file updates.
nbupdate_period Integer
Default : 10
Frequency of non-bonded pair-list updates
fixatm_select_index Integer (ATDYN only)
Default : N/A (all atoms are minimized)
Index of an atom group to be fixed during minimization. The index must be defined
in [SELECTION] (see Selection section). For example, if the user specifies
fixatm_select_index = 1, the atoms that are members of group1 in [SELECTION]
are fixed.
tol_rmsg Real (ATDYN only)
Default : 0.36 (unit : kcal/mol/Å)
Tolerance of convergence for RMS gradient.
tol_maxg Real (ATDYN only)
Default : 0.54 (unit : kcal/mol/Å)
Tolerance of convergence for maximum gradient.
Note: In ATDYN, a minimization run stops when both RMSG and MAXG become smaller than the
tolerance values.
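The stopping rule in the note above can be illustrated with a small Python sketch. It assumes that RMSG is the root mean square over all Cartesian gradient components and MAXG is the largest absolute component; the exact definitions used inside ATDYN are not given here, so treat these as assumptions. The function names are hypothetical.

```python
import math


def gradient_norms(gradient):
    """Return (RMSG, MAXG) for a flat list of Cartesian gradient components."""
    rmsg = math.sqrt(sum(g * g for g in gradient) / len(gradient))
    maxg = max(abs(g) for g in gradient)
    return rmsg, maxg


def is_converged(gradient, tol_rmsg=0.36, tol_maxg=0.54):
    """True only when both RMSG and MAXG are below their tolerances."""
    rmsg, maxg = gradient_norms(gradient)
    return rmsg < tol_rmsg and maxg < tol_maxg


# RMSG is small enough here, but one component (0.6) exceeds tol_maxg.
rms_only = is_converged([0.1, -0.1, 0.6, 0.0, 0.1, -0.2])
# Both criteria are satisfied here.
both = is_converged([0.1, -0.2, 0.3, 0.0, 0.1, -0.2])
```

The first gradient shows why both criteria are needed: a single large component can hide behind a small RMS value.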
force_scale_init Real
Default : 0.00005
The initial value of the force scaling coefficient in the steepest descent method. This value
is also used as the minimum value of the scaling coefficient.
force_scale_max Real
Default : 0.0001
Maximum value of the force scaling coefficient in the steepest descent method.
ncorrection Integer
Default : 10
Number of corrections to build the inverse Hessian.
lbfgs_bnd YES / NO
Default : YES
Set a boundary to move atoms in each step of minimization.
lbfgs_bnd_qmonly YES / NO
Default : NO
Set the boundary only to QM atoms.
lbfgs_bnd_maxmove Real
Default : 0.1 (unit : Å)
The maximum size of move in each step.
Note: LBFGS often makes a large move of atoms, especially in the first few steps, and creates a distorted
structure. Although this is rarely a problem in MM calculations, it may cause convergence problems in QM
calculations. lbfgs_bnd prevents huge moves and clashes of atoms by setting a maximum size of move.
The size is set by lbfgs_bnd_maxmove.
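The idea of bounding the move can be sketched in Python as follows. The sketch assumes that the bound applies to the length of each atom's displacement vector; whether GENESIS bounds the per-atom length or each Cartesian component is not stated above, so this is only one possible reading. The function name is hypothetical.

```python
import math


def bound_move(displacements, maxmove=0.1):
    """Scale down any per-atom displacement whose length exceeds maxmove."""
    bounded = []
    for dx, dy, dz in displacements:
        length = math.sqrt(dx * dx + dy * dy + dz * dz)
        scale = maxmove / length if length > maxmove else 1.0
        bounded.append((dx * scale, dy * scale, dz * scale))
    return bounded


# The first atom would move 0.5 Angstrom and is scaled back to 0.1 Angstrom;
# the second atom moves less than 0.1 Angstrom and is left unchanged.
step = bound_move([(0.3, 0.4, 0.0), (0.01, 0.0, 0.02)])
```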
In the macro/micro-iteration scheme, the MM region is first minimized while holding the QM region fixed.
This step is called micro-iteration. When the MM region reaches a minimum (or the maximum number of
steps), the whole system including the QM region is updated. This step is called macro-iteration. Then,
the MM region is minimized again with the new QM region. The micro- and macro-iterations are repeated
until convergence is reached.
This scheme requires time-consuming QM calculations only in the macro-iteration. During the micro-
iteration, ESP charges are used to represent the electrostatic interaction between QM and MM region.
Therefore, it is by far more efficient than the usual scheme, and is recommended to use when ESP charges
are available. Currently, this scheme works in combination with Gaussian.
The keywords in this subsection have no effect in MM calculations, of course.
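The loop structure of the macro/micro-iteration scheme can be illustrated with a toy Python sketch. It is not a QM/MM calculation: a two-variable model energy E = (q - 1)^2 + (m - q)^2 stands in for the system, with q playing the role of the QM region and m the MM region, and plain gradient descent replaces the actual minimizer. All names, the model energy, and the step size are assumptions made for illustration.

```python
def grad(q, m):
    """Gradient (dE/dq, dE/dm) of the toy energy E = (q - 1)^2 + (m - q)^2."""
    return 2.0 * (q - 1.0) - 2.0 * (m - q), 2.0 * (m - q)


def minimize_macro_micro(q, m, step=0.2, tol=1e-6, nsteps_micro=100, nsteps=500):
    for _ in range(nsteps):
        # Micro-iteration: relax the "MM" coordinate with the "QM" coordinate fixed.
        for _ in range(nsteps_micro):
            _, gm = grad(q, m)
            if abs(gm) < tol:
                break
            m -= step * gm
        # Macro-iteration: update the whole system, including the "QM" coordinate.
        gq, gm = grad(q, m)
        if max(abs(gq), abs(gm)) < tol:
            break
        q -= step * gq
        m -= step * gm
    return q, m


q_min, m_min = minimize_macro_micro(q=3.0, m=-2.0)
```

In this toy model both coordinates converge to 1.0. The point of the sketch is only the nesting: many cheap inner steps on one subset of coordinates between each update of the full system.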
macro YES / NO
Default : NO
Invoke macro/micro-iteration scheme if YES.
nsteps_micro Integer
Default : 100
Number of minimization steps for micro-iteration.
tol_rmsg_micro Real
Default : 0.27 (unit : kcal/mol/Å)
Tolerance of convergence for RMS gradient in micro-iteration.
tol_maxg_micro Real
Default : 0.41 (unit : kcal/mol/Å)
Tolerance of convergence for maximum gradient in micro-iteration.
macro_select_index Integer
Index of an atom group to be fixed in micro-iteration, and minimized in macro-iteration. The
index must be defined in [SELECTION] (see Selection section). QM atoms are selected by
default.
In the energy minimization of GENESIS, the users can automatically fix ring penetrations or
chirality errors in proteins, DNA, and RNA. A suspicious ring is detected based on the lengths of the
covalent bonds that form the ring. Note that this algorithm is currently available for the CHARMM or
CHARMM19 force fields. If you use this algorithm, please cite the paper (Mori et al., J. Chem.
Inf. Model., 2021 [52]).
Fig. 8.1: Algorithms for fixing ring penetrations and chirality errors.
exclude_ring_grpid Integer
Default : N/A
Space-separated list of the indexes of the detected suspicious ring group, which are neglected
during the automatic error fixing.
fix_chirality_error YES / NO
Default : NO
Invert the position of the hydrogen atom bonded to the suspicious chiral center in the initial
structure to fix the chirality error.
exclude_chiral_grpid Integer
Default : N/A
Space-separated list of indexes of the detected suspicious chirality center, which are ne-
glected during the automatic error fixing.
The basic usage of these options is as follows. First, the users specify “check_structure = YES”,
“fix_ring_error = NO”, and “fix_chirality_error = NO” to just check the errors. If there are no suspi-
cious rings or chiral centers in the final snapshot, the following messages are displayed:
If there are suspicious rings or chiral centers, some warning messages will be shown at the last part of
the log message:
WARNING!
Some suspicious residues were detected. Minimization might be too short,
or "ring penetration" might happen in the above residues.
Check the structure of those residues very carefully. If you found a ring
penetration, try to perform the energy minimization again with the
options "check_structure = YES" and "fix_ring_error = YES" in [MINIMIZE].
The energy minimization should start from the restart file obtained
in "this" run.
Please read the warning message very carefully. In this example, the users should first check the
obtained structure around PRO82, PHE93, PHE206, and PHE230 using a molecular viewer such as VMD.
Because this is just a warning message, there might actually be no errors. However, if the users find
errors, the energy minimization should be performed again, restarting from this run with the options
“check_structure = YES” and “fix_ring_error = YES” or “fix_chirality_error = YES”. If the users want
to fix only PHE93 and PHE230, please add the option “exclude_ring_grpid = 20 49”, where “20” and
“49” are the “suspicious ring group id” of PRO82 and PHE206, respectively, shown in the warning
message. Note that the reduction of the size of the penetrated ring or the inversion of the hydrogen
atom in the bad chiral center is carried out for the initial structure.
8.6 Examples
[MINIMIZE]
method = SD # Steepest descent
nsteps = 2000 # number of minimization steps
eneout_period = 50 # energy output period
crdout_period = 50 # coordinates output period
rstout_period = 2000 # restart output period
nbupdate_period = 10 # nonbond pair list update period
[MINIMIZE]
method = LBFGS
nsteps = 500 # number of steps
eneout_period = 5 # energy output period
crdout_period = 5 # coordinates output period
rstout_period = 5 # restart output period
nbupdate_period = 1 # nonbond pair list update period
lbfgs_bnd = yes # set a boundary to move atoms
lbfgs_bnd_qmonly = no # set the boundary only to QM atoms
lbfgs_bnd_maxmove = 0.1 # the max. size of move
macro = yes # switch macro/micro-iteration scheme
nsteps_micro = 100 # number of steps of micro-iteration
Energy minimization with automatic fixing for ring penetrations and chirality errors:
[MINIMIZE]
method = SD # Steepest descent
nsteps = 2000 # number of minimization steps
eneout_period = 50 # energy output period
crdout_period = 50 # coordinates output period
rstout_period = 2000 # restart output period
nbupdate_period = 10 # nonbond pair list update period
check_structure = YES # check ring penetration and chirality error
fix_ring_error = YES # automatically fix the ring penetrations
fix_chirality_error = YES # automatically fix the chirality errors
CHAPTER
NINE
CONSTRAINTS SECTION
In the [CONSTRAINTS] section, keywords related to bond constraints are specified. In the leapfrog
integrator, the SHAKE algorithm is applied for covalent bonds involving hydrogen [53]. In the velocity
Verlet and multiple time-step integrators, not only SHAKE but also RATTLE are used [54]. Note that
bond constraint between heavy atoms is not available currently.
rigid_bond YES / NO
Default : NO
Turn on or off the SHAKE/RATTLE algorithms for covalent bonds involving hydrogen.
shake_iteration Integer
Default : 500
Maximum number of iterations for SHAKE/RATTLE constraint. If SHAKE/RATTLE does
not converge within the given number of iterations, the program terminates with an error
message.
shake_tolerance Real
Default : 1.0e-10 (unit : Å)
Tolerance of SHAKE/RATTLE convergence.
hydrogen_type NAME / MASS
Default : NAME
This parameter defines how hydrogen atoms are detected. This parameter is ignored when
rigid_bond = NO. Usually, the users do not need to change this parameter.
• MASS : detect hydrogen only based on the atomic mass. If the mass of an atom is
less than hydrogen_mass_upper_bound and greater than 0, that atom is considered as
a hydrogen.
• NAME : detect hydrogen based on the atom name, type, and mass. If the mass of an
atom is less than hydrogen_mass_upper_bound and the name or type begins with ‘h’,
‘H’, ‘d’, or ‘D’, that atom is considered as a hydrogen.
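The two detection rules can be sketched in Python as follows. The function is illustrative only; the default value 2.1 used for mass_upper_bound is a placeholder assumed for this sketch, and the actual default of hydrogen_mass_upper_bound should be taken from its own description.

```python
def is_hydrogen(name, atom_type, mass, hydrogen_type="NAME", mass_upper_bound=2.1):
    """Classify an atom as hydrogen following the MASS or NAME rule described above."""
    if hydrogen_type == "MASS":
        # MASS: only the atomic mass is checked.
        return 0.0 < mass < mass_upper_bound
    # NAME: the mass check is combined with the first letter of the name or type.
    prefixes = ("h", "H", "d", "D")
    return mass < mass_upper_bound and (
        name.startswith(prefixes) or atom_type.startswith(prefixes)
    )


ordinary_h = is_hydrogen("HA", "HB1", 1.008)               # light, name starts with H
mercury = is_hydrogen("HG", "HG", 200.59)                  # name starts with H, too heavy
odd_name_by_name = is_hydrogen("X1", "X", 1.008)           # light, but no H/D prefix
odd_name_by_mass = is_hydrogen("X1", "X", 1.008, "MASS")   # mass alone decides
```

The last two calls show where the rules differ: an atom with a hydrogen-like mass but an unusual name is detected only with MASS.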
fast_water YES / NO
Default : YES
Turn on or off the SETTLE algorithm for the constraints of the water molecules [55]. Al-
though the default is “fast_water=YES”, the users must specify “rigid_bond=YES” to use
the SETTLE algorithm. If “rigid_bond=YES” and “fast_water=NO” are specified, the
SHAKE/RATTLE algorithm is applied to water molecules, which is not computationally
efficient.
water_model expression or NONE
Default : TIP3
Residue name of the water molecule to be rigidified in the SETTLE algorithm. In the case
of the AMBER force field, “water_model = WAT” must be specified.
Note: The TIP4P water model is available in GENESIS 1.2 or later. When the TIP4P water model is used,
the water molecules are regarded as rigid. In molecular dynamics simulations, please set both rigid_bond
and fast_water to YES. The [CONSTRAINTS] section was previously not used in minimization, but
fast_water = YES can now be specified when the TIP4P water model is used, so that TIP4P water
molecules are treated as rigid. However, please keep in mind that other parameters cannot be defined in
minimizations, and constraints are not applied to anything other than water molecules. The TIP4P water
model can be used only in SPDYN.
9.4 Examples
[CONSTRAINTS]
rigid_bond = YES # Turn on SHAKE/RATTLE
fast_water = YES # Turn on SETTLE
[CONSTRAINTS]
rigid_bond = YES # Turn on SHAKE/RATTLE
fast_water = YES # Turn on SETTLE
water_model = WAT # residue name of the rigid water
[CONSTRAINTS]
rigid_bond = NO
fast_water = NO
CHAPTER
TEN
ENSEMBLE SECTION
In the [ENSEMBLE] section, the type of ensemble, temperature and pressure control algorithm, and
parameters used in these algorithms (such as temperature and pressure) can be specified.
In the Langevin thermostat algorithm (“ensemble=NVT” with “tpcontrol=LANGEVIN”), every particle
is coupled with a viscous background and a stochastic heat bath [56]:
where 𝛾 is the thermostat friction parameter (gamma_t keyword) and R(𝑡) is the stochastic force. In
the Langevin thermostat and barostat method (“ensemble=NPT” with “tpcontrol=LANGEVIN”), the
equation of motion is given by [57]:
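\[ \frac{d\mathbf{r}(t)}{dt} = \mathbf{v}(t) + v_\epsilon \mathbf{r}(t) \]
\[ \frac{d\mathbf{v}(t)}{dt} = \frac{\mathbf{F}(t) + \mathbf{R}(t)}{m} - \left[ \gamma_p + \left( 1 + \frac{3}{f} \right) v_\epsilon \right] \mathbf{v}(t) \]
\[ \frac{dv_\epsilon(t)}{dt} = \left[ 3V \left( P(t) - P_0(t) \right) + \frac{3K}{f} - \gamma_p v_\epsilon + R_p \right] / p_{mass} \]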
where 𝐾 is the kinetic energy, 𝛾𝑝 is the barostat friction parameter (gamma_p keyword), and 𝑅𝑝 is the
stochastic pressure variable.
temperature Real
Default : 298.15 (unit : Kelvin)
Initial and target temperature.
pressure Real
Default : 1.0 (unit : atm)
Target pressure in the NPT ensemble. In the case of the NPAT and NPgT ensembles, this is
the pressure along the ‘Z’ axis.
gamma Real
Default : 0.0 (unit : dyn/cm)
Target surface tension in NPgT ensemble.
tpcontrol NO / BERENDSEN / LANGEVIN / BUSSI
Default : NO
Type of thermostat and barostat. The available algorithm depends on the integrator.
• NO: Do not use temperature/pressure control algorithm (for NVE only)
• BERENDSEN: Berendsen thermostat/barostat [59]
• LANGEVIN: Langevin thermostat/barostat [57]
• BUSSI: Bussi’s thermostat/barostat [60] [61]
tau_t Real
Default : 5.0 (unit : ps)
Temperature coupling time in the Berendsen and Bussi thermostats.
tau_p Real
Default : 5.0 (unit : ps)
Pressure coupling time in the Berendsen and Bussi barostats.
compressibility Real
Default : 0.0000463 (unit : atm-1 )
Compressibility parameter in the Berendsen barostat.
gamma_t Real
10.2 Examples
[ENSEMBLE]
ensemble = NVT # Canonical ensemble
tpcontrol = BUSSI # Bussi thermostat
temperature = 300.0 # target temperature (K)
[ENSEMBLE]
ensemble = NPT # Isothermal-isobaric ensemble
tpcontrol = BUSSI # Bussi thermostat and barostat
temperature = 300.0 # target temperature (K)
pressure = 1.0 # target pressure (atm)
NPT ensemble with semi-isotropic pressure coupling, which is usually used for lipid bilayer systems:
[ENSEMBLE]
ensemble = NPT # Isothermal-isobaric ensemble
tpcontrol = BUSSI # Bussi thermostat and barostat
temperature = 300.0 # target temperature (K)
pressure = 1.0 # target pressure (atm)
isotropy = SEMI-ISO # Ratio of X to Y is kept constant
NPAT ensemble:
[ENSEMBLE]
ensemble = NPAT # Constant area ensemble
tpcontrol = BUSSI # Bussi thermostat and barostat
temperature = 300.0 # target temperature (K)
pressure = 1.0 # target normal pressure (atm)
isotropy = XY-FIXED # the system area is kept constant
NP𝛾T ensemble:
[ENSEMBLE]
ensemble = NPgT # Constant surface-tension ensemble
tpcontrol = BUSSI # Bussi thermostat and barostat
temperature = 300.0 # target temperature (K)
pressure = 1.0 # target normal pressure (atm)
gamma = 200.0 # target surface tension (dyn/cm)
isotropy = SEMI-ISO # Ratio of X to Y is kept constant
CHAPTER
ELEVEN
BOUNDARY SECTION
Note: If the simulation system has a periodic boundary condition (PBC), the user must specify the box
size in the control file (at the energy minimization stage in most cases). During the simulations, box size
is saved in a restart file. If the restart file is used as an input of the subsequent simulation, the box size
is overwritten with the restart information. Note that in this case the box size given in the control file is
ignored.
domain_x Integer
Default : N/A (Optional) (SPDYN only)
Number of domains along the x dimension.
domain_y Integer
Default : N/A (Optional) (SPDYN only)
Number of domains along the y dimension.
domain_z Integer
Default : N/A (Optional) (SPDYN only)
Number of domains along the z dimension.
Note: If the numbers of domains (domain_x, domain_y, and domain_z) are not specified in the control file,
they are automatically determined based on the number of MPI processes. When the user specifies the
numbers of domains explicitly, please make sure that the product of the domain numbers in each dimension
(i.e., domain_x * domain_y * domain_z) is equal to the total number of MPI processes.
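The consistency condition in the note above can be checked before submitting a job. The following Python sketch is a hypothetical helper, not a GENESIS tool.

```python
def domains_match_processes(domain_x, domain_y, domain_z, n_mpi_processes):
    """True if domain_x * domain_y * domain_z equals the number of MPI processes."""
    return domain_x * domain_y * domain_z == n_mpi_processes


consistent = domains_match_processes(2, 2, 2, 8)      # 8 domains for 8 processes
inconsistent = domains_match_processes(2, 2, 4, 8)    # 16 domains for 8 processes
```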
In MD simulations with NOBC, molecules may evaporate from a system, and, once such an event hap-
pens, the molecule runs in the vacuum with constant velocity to infinity. Therefore, it is useful to set a
potential which pulls the molecule back to the system.
In ATDYN, the users can set a spherical potential,
where 𝑘, 𝑛 and 𝑟𝑏 are a force constant, an exponent, and a radius of the sphere, respectively, and 𝑟𝑖 is the
distance between the 𝑖-th atom and the center of sphere,
𝑟𝑖 = |x𝑖 − x0 |.
Multiple spheres with different centers and radii can be combined to construct the potential; for example,
two spheres are combined in Fig. 11.1. The atoms that went out of the sphere (thin line) are pulled back
to the nearest center; the red atom to center 1 and the blue atoms to center 2.
The coordinates of the center can be specified in two ways. The first is to set the center to the position of
atoms in the initial structure (pdbfile) using [SELECTION]. The other way is to directly specify the
coordinates of the center in the input. See the description of options and the examples below for details.
The following options are available to set the spherical potential:
Fig. 11.1: An illustration of a combination of two spherical potentials (black thin circles), which pulls
back atoms that are out of the range (gray) towards the center of sphere (1 and 2).
spherical_pot YES / NO
Default : NO
If YES (with type=NOBC), use the spherical boundary potential.
constant Real
Default : 10.0 (unit : kcal mol⁻¹)
The force constant of the potential.
exponent Integer
Default : 2
The exponent of the potential.
nindex Integer
Default : 0
The number of indexes, used with center_select_indexN.
center_select_indexN Integer
Default : N/A
The index of center in [SELECTION]
nfunction Integer
Default : 0
The number of functions, used with centerN.
centerN Real ×3
Default : N/A
The xyz coordinates of the center.
radiusN Real
Default : 0.0 (unit : Å)
The radius of sphere.
fixatom YES / NO
Default : YES
Atoms out of the sphere in the input structure are fixed.
fixlayer Real
Default : 1.0 (unit : Å)
If fixatom = YES, atoms within this distance from the potential in the input structure are also
fixed.
restart YES / NO
Default : YES
Use the information in the restart file.
Note: The information on the spheres and fixed atoms is saved in a restart file. If the information exists
in rstfile, the options for the spherical potential in [BOUNDARY] will be ignored. If you want to reset
the potential, you need to specify restart = NO.
11.4 Examples
[BOUNDARY]
type = NOBC # non-periodic system
• Simulations with the periodic boundary condition, where the box size is set to 64 x 64 x 64. In
this case, the user should not use a restart file as an input, because the box size in the control is
overwritten with that in the restart file.
[BOUNDARY]
type = PBC # periodic boundary condition
box_size_x = 64.0 # Box size in the x dimension (Ang)
box_size_y = 64.0 # Box size in the y dimension (Ang)
box_size_z = 64.0 # Box size in the z dimension (Ang)
• Simulations with two spherical potentials around atom number 1 and 100 with a radius of 22.0
Angs.
[BOUNDARY]
type = NOBC
spherical_pot = yes
constant = 2.0
exponent = 2
nindex = 1
center_select_index1 = 2
radius1 = 22.0
fix_layer = 0.0
fixatom = no
[SELECTION]
...
group2 = ano:1 or ano:100
Note: Be careful not to set too many spheres because it may slow down the performance. If you
want to set the spheres around a protein, instead of specifying all atoms in a protein, select part of
the atoms, for example, by
• Simulations with two spherical potentials. The center coordinates are explicitly set by center1 and
center2. With fixatom = yes and fix_layer = 1.0 Angs, the atoms that are farther than 34 Angs from
the centers are fixed.
[BOUNDARY]
type = NOBC # [PBC,NOBC]
spherical_pot = yes
constant = 10.0
exponent = 2
nfunctions = 2
center1 = 17.0, 0.0, 0.0 # [x,y,z]
radius1 = 35.0
center2 = -17.0, 0.0, 0.0 # [x,y,z]
radius2 = 35.0
fixatom = YES
fix_layer = 1.0
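The 34 Angs threshold in this example is the radius minus fix_layer. The Python sketch below illustrates one reading of the rule, in which an atom is fixed when its distance to the nearest center exceeds radius - fix_layer. That reading, the function names, and the test coordinates are assumptions made for illustration, and the sketch assumes that all spheres share the same radius, as in the example.

```python
import math


def nearest_center_distance(atom, centers):
    """Distance from an atom to the closest sphere center."""
    return min(math.dist(atom, center) for center in centers)


def is_fixed(atom, centers, radius, fix_layer):
    """Assumed rule: fixed if farther than (radius - fix_layer) from the nearest center."""
    return nearest_center_distance(atom, centers) > radius - fix_layer


centers = [(17.0, 0.0, 0.0), (-17.0, 0.0, 0.0)]       # center1 and center2
inner = is_fixed((0.0, 0.0, 0.0), centers, 35.0, 1.0)     # 17.0 from both centers
layer = is_fixed((51.5, 0.0, 0.0), centers, 35.0, 1.0)    # 34.5 from center1
free = is_fixed((50.0, 0.0, 0.0), centers, 35.0, 1.0)     # 33.0 from center1
```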
CHAPTER
TWELVE
SELECTION SECTION
This section is used to select atoms and define them as a group. The user can select atoms according to
their name, index, residue number, segment name, and so on. The selected group index is used in other
sections. For example, a restraint potential can be applied to the group selected in this section, and the
force constant of the potential is specified in the [RESTRAINTS] section. The [SELECTION] section is
also used in the GENESIS analysis tools to specify the atoms to be analyzed.
groupN expression
The user defines selected atoms as “group1”, “group2”, . . . , and “groupN”. Here, N
must be a positive integer (𝑁 ≥ 1). The user selects atoms by using keywords and operators
with a certain syntax (see table below). Note that in the table mname (or moleculename,
molname) is a molecule name defined by mole_name.
mole_nameN molecule starting-residue ending-residue
The user defines a molecule by specifying its segment name, first and last residue numbers,
and residue name. N must be a positive integer (𝑁 ≥ 1). The syntax for the residue selection
is as follows:
[segment name]:[residue number]:[residue name]
For details, see the example below.
Table. Available keywords and operators in group.
Note: ai and atno are slightly different. ai indicates the atom index which is sequentially re-numbered
over all atoms in the system. On the other hand, atno is the index of atoms in the PDB file. Atom index
in PDB file (column 2) does not always start from 1, nor is numbered sequentially. In such cases, atno
is useful to select atoms, although it is a very rare case.
Note: Atoms that are within a given distance of an atom (X) can be selected by around. Note that the
coordinates in reffile are used to judge the distance. If reffile is not present, those in the input files
(pdbfile, crdfile, etc.) are used instead. Coordinates in rstfile are never used.
12.2 Examples
Select atoms based on their atom name, residue name, or residue number:
[SELECTION]
group1 = resno:1-60 and an:CA
group2 = (segid:PROA and not hydrogen) | an:CA
mole_name1 = molA PROA:1:TYR PROA:5:MET
group3 = mname:molA and (an:CA or an:C or an:O or an:N)
Select atoms around an atom X. In the following examples, X = atom number 100.
[SELECTION]
group1 = atno:100 around:10.0
group2 = atno:100 around_res:10.0
group3 = atno:100 around_mol:10.0
group4 = atno:100 around_mol:10.0 or atno:100
In group1, atoms within 10.0 Å of X are selected. Group 2 selects residues within 10.0 Å of X, i.e., if
the distance between X and any one of the atoms in a residue is less than 10.0 Å, all atoms of the residue
are selected. Group 3 is the same as group 2, but for a molecule. Note that these commands do NOT select
X itself. In order to include X in the selection, add “or atno:100”, as in group 4.
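The difference between around and around_res, and the exclusion of X itself, can be mimicked with a small Python sketch. The atom list, the coordinates, and the function names are hypothetical; the sketch only mirrors the behavior described in the paragraph above and is not the GENESIS selection parser.

```python
import math

# Hypothetical atoms: (atom number, residue number, coordinates in Angstrom).
atoms = [
    (100, 1, (0.0, 0.0, 0.0)),    # X
    (101, 1, (1.0, 0.0, 0.0)),
    (200, 2, (9.0, 0.0, 0.0)),
    (201, 2, (12.0, 0.0, 0.0)),
    (300, 3, (20.0, 0.0, 0.0)),
]


def around(atoms, x_atno, cutoff):
    """Atom numbers within cutoff of atom X, excluding X itself."""
    x_pos = next(pos for atno, _, pos in atoms if atno == x_atno)
    return [atno for atno, _, pos in atoms
            if atno != x_atno and math.dist(pos, x_pos) < cutoff]


def around_res(atoms, x_atno, cutoff):
    """All atoms of every residue with at least one atom within cutoff of X."""
    near = set(around(atoms, x_atno, cutoff))
    near_residues = {resno for atno, resno, _ in atoms if atno in near}
    return [atno for atno, resno, _ in atoms
            if resno in near_residues and atno != x_atno]


by_atom = around(atoms, 100, 10.0)
by_residue = around_res(atoms, 100, 10.0)
with_x = sorted(by_atom + [100])    # analogue of adding "or atno:100"
```

Atom 201 lies 12.0 Å from X, so it is picked up only by the residue-based selection, because atom 200 of the same residue is within the cutoff.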
[SELECTION]
group1 = atno:100-101 around:10.0
group2 = (sid:PROT around_res:10.0) and rnam:TIP3
group3 = (rno:1 around:10.0) or rno:1
Group 1 selects atoms around 10.0 Å of atom 100 or 101. Note that it is NOT “100 and 101” nor a
center of 100 and 101. Group 2 is an example to select water molecules around a protein (segname
PROT). Group 3 selects not only the atoms around residue1 but also the atoms of residue1.
CHAPTER
THIRTEEN
RESTRAINTS SECTION
The [RESTRAINTS] section contains keywords to define external restraint functions. The restraint
functions are applied to the atom groups selected in the [SELECTION] section to restrict the motions of
those atoms. The potential energy of a restraint can be written as:
\[ U(x) = k (x - x_0)^n \]
where 𝑥 is a variable (see below), 𝑥0 is a reference value, 𝑘 is a force constant, and 𝑛 is an exponent
factor.
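As an illustration of the restraint function, the following Python sketch evaluates the energy and the corresponding force along 𝑥, obtained by differentiating the expression above. The function names and numerical values are assumptions made for the example.

```python
def restraint_energy(x, x0, k, n=2):
    """Restraint energy U(x) = k (x - x0)^n."""
    return k * (x - x0) ** n


def restraint_force(x, x0, k, n=2):
    """Force along x, i.e. -dU/dx = -n k (x - x0)^(n - 1)."""
    return -n * k * (x - x0) ** (n - 1)


# Hypothetical distance restraint: x = 12.0, x0 = 10.0, k = 2.0, n = 2.
energy = restraint_energy(12.0, 10.0, 2.0)
force = restraint_force(12.0, 10.0, 2.0)
```

Note that the expression contains no factor of 1/2, so for n = 2 the value of k corresponds to half of the force constant in the common (1/2) k (x - x0)^2 convention.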
nfunctions Integer
Default: 0
Number of restraint functions.
functionN POSI / DIST[MASS] / ANGLE[MASS] / DIHED[MASS] / RMSD[MASS] / PC[MASS] / EM
Default: N/A
Type of restraint.
• POSI: positional restraint. The reference coordinates are given by reffile, ambreffile,
or groreffile in [INPUT]. (see Input section)
• DIST[MASS]: distance restraint.
• ANGLE[MASS]: angle restraint.
• DIHED[MASS]: dihedral angle restraint.
• RMSD[MASS]: RMSD restraint. MASS means mass-weighted RMSD. Translational
and rotational fitting to the reference coordinates is done before calculating the RMSD. The
reference coordinates are specified in the same manner as POSI.
Important Notice (1.1.5 or later): The structural fitting method can be defined in the
[FITTING] section (Fitting section) in version 1.1.5 or later. Users of GENESIS 1.1.4 or earlier
should pay special attention to the fitting scheme. In version 1.1.4 and earlier,
translational and rotational fittings were automatically applied to the atoms subject to the
RMSD restraint (same as the current default setting, fitting_method = TR+ROT).
• PC[MASS]: principal component constraint. This option requires modefile in the Input
section.
• EM: cryo-EM flexible fitting (see Experiments section)
DIST, ANGLE, and DIHED impose a restraint on the distance/angle/dihedral defined by the selected
groups. See select_indexN and the examples below for the specification. MASS indicates
that the force is applied to the center of mass of the selected group. When MASS is omitted,
the force is applied to the geometric center of the coordinates. The MASS keyword has no effect
for groups consisting of a single particle.
In SPDYN, POSI and RMSD[MASS] restraints are mutually exclusive; you can use either
one or none of them. Two different POSI restraints might not be applied simultaneously,
either.
Notice: POSI, PC, and RMSD restraints can be influenced by the removal of transla-
tional/rotational momentum. See also the notices in the stoptr_period parameter in the
[DYNAMICS] section.
constantN Real
Default: 0.0 (unit: depend on the restraint type)
Force constant of a restraint function. The unit depends on the type of restraint. Namely,
kcal/mol/Å^n is used in the case of DIST and RMSD, while kcal/mol/rad^n is used in the case of
ANGLE and DIHED, where 𝑛 is exponentN specified in this section.
referenceN Real
Default: 0.0 (unit: depend on the restraint type)
Reference value of a restraint function. For the positional restraint, the value is ignored. The
unit depends on the type of restraint. Namely, Å is used in the case of DIST, while degree (NOT
radian) is used in the case of ANGLE and DIHED.
select_indexN Integer
Default: N/A
Index of an atom group, to which restraint potentials are applied. The index must be defined
in [SELECTION] (see Selection section). For example, if you specify select_index1 =
1, this restraint function is applied for group1 in the [SELECTION].
Number of groups required depends on the type of the restraint function.
• POSI/RMSD[MASS]: 1
• DIST[MASS]: 2𝑚, where 𝑚 = 1, 2, ...
• ANGLE[MASS]: 3
• DIHED[MASS]: 4
• PC[MASS]: ≥ 1
A group can contain more than one atom. Suppose we have the following input.
[SELECTION]
group1 = ai:1-10
group2 = ai:11-20
[RESTRAINTS]
nfunctions = 1
function1 = DIST
constant1 = 3.0
reference1 = 10.0
select_index1 = 1 2
In this case, the distance restraint is applied for the distance between geometric centers of
group1 and group2. The calculated force is then scattered to each atom. If DISTMASS is
given instead of DIST, mass centers (mass-weighted average position) are used instead of
geometric centers (not mass-weighted average position).
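The difference between DIST and DISTMASS can be illustrated with a small Python sketch that computes both kinds of centers and the resulting restraint energy U = k (x - x0)^n with n = 2. The coordinates, masses, and function names are hypothetical; constant = 3.0 and reference = 10.0 are taken from the input above.

```python
import math


def geometric_center(coords):
    """Unweighted average position of a group of atoms."""
    n = len(coords)
    return tuple(sum(c[i] for c in coords) / n for i in range(3))


def mass_center(coords, masses):
    """Mass-weighted average position of a group of atoms."""
    total = sum(masses)
    return tuple(sum(m * c[i] for c, m in zip(coords, masses)) / total
                 for i in range(3))


# Hypothetical groups: a heavy and a light atom in group1, one atom in group2.
group1 = [(0.0, 0.0, 0.0), (2.0, 0.0, 0.0)]
masses1 = [12.0, 1.0]
group2 = [(11.0, 0.0, 0.0)]
masses2 = [12.0]

d_geo = math.dist(geometric_center(group1), geometric_center(group2))
d_mass = math.dist(mass_center(group1, masses1), mass_center(group2, masses2))

energy_dist = 3.0 * (d_geo - 10.0) ** 2        # DIST
energy_distmass = 3.0 * (d_mass - 10.0) ** 2   # DISTMASS
```

For this geometry the geometric centers are exactly 10.0 Å apart, so the DIST energy is zero, while the mass centers are farther apart and DISTMASS gives a non-zero energy.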
In the case of DIST[MASS] restraint with more than 2 groups specified (i.e. 2𝑚 with 𝑚 ≥
2), the sum of 𝑚 distances will be restrained. See exponent_dist and weight_dist
parameters for this distance summation. However, this scheme might not be useful for the
standard cases.
directionN ALL / X / Y / Z
Default : ALL
Direction of the POSI restraint. If X or Y or Z is specified, restraints along the other two
axes are not applied.
exponentN Integer
Default : 2
Exponent factor of the restraint function. The default is the harmonic. This parameter does
not work for POSI and RMSD[MASS] restraints in SPDYN, where the default value, 2, is
always used.
exponent_distN Integer (DIST[MASS] only)
Default : 1
Exponent factor used in the distance sum calculations. The sum of distances is expressed
as: r_{sum} = \sum_m w |r_m|^n, where (1 ≤ 𝑚 ≤ num groups/2), 𝑛 is exponent_distN, and 𝑤 is
weight_distN.
weight_distN Real (DIST[MASS] only)
Default : 1.0
Weight factor used in the distance sum calculations.
modeN Integer
Specifies the mode index which is used for the PC (principal component) restraint. For
example, the 1st PC mode can be restrained by specifying mode1=1.
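The difference between DIST and DISTMASS, and the distance sum controlled by exponent_distN and weight_distN, can be illustrated with a short Python sketch. This is not GENESIS code: the function names, coordinates, and masses are hypothetical, and the weight is taken as a single scalar, as in the formula for the distance sum.

```python
import math

def center(coords, masses=None):
    """Geometric center (masses=None) or mass-weighted center of an atom group."""
    if masses is None:
        masses = [1.0] * len(coords)
    total = sum(masses)
    return tuple(
        sum(m * c[k] for m, c in zip(masses, coords)) / total for k in range(3)
    )

def group_distance(group_a, group_b, masses_a=None, masses_b=None):
    """Distance between the centers of two atom groups."""
    return math.dist(center(group_a, masses_a), center(group_b, masses_b))

def distance_sum(distances, weight=1.0, exponent=1):
    """r_sum = sum over pairs m of w * |r_m|**n."""
    return sum(weight * abs(r) ** exponent for r in distances)

# Hypothetical two-atom groups (coordinates in angstroms, masses in amu).
g1 = [(0.0, 0.0, 0.0), (2.0, 0.0, 0.0)]
g2 = [(10.0, 0.0, 0.0), (12.0, 0.0, 0.0)]
m1 = [1.0, 3.0]
m2 = [1.0, 1.0]

d_geom = group_distance(g1, g2)          # DIST-like: geometric centers
d_mass = group_distance(g1, g2, m1, m2)  # DISTMASS-like: mass centers
```

With unequal masses in a group, the two centers differ, so the restrained distance differs between DIST and DISTMASS even for identical coordinates.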
Basically, the pressure calculated from the restraint potential is included in the internal pressure, which is
kept constant during the simulation in the NPT ensemble. However, the pressure derived from positional
and RMSD restraints is treated as an external pressure by default. The keywords pressure_position and
pressure_rmsd are used to include those pressures in the internal pressure. If a simulation with POSI
or RMSD restraints shows strange behavior (especially when a strong force constant is applied), turn
on these options.
pressure_position YES / NO
Default : NO
The virial terms from positional restraints are included in pressure evaluation.
pressure_rmsd YES / NO
Default : NO
The virial terms from RMSD restraints are included in pressure evaluation.
Restraints can also be defined in an external input file (localresfile). In this case, the number of local restraints
must NOT be included in nfunctions. This option is available in SPDYN only. For details, see Input
section.
If a restraint term is used in REUS runs, nreplica force constants and reference values
must be given as space-separated lists. The above keywords, except for nfunctions, pressure_position,
and pressure_rmsd, must have a serial number, ‘N’, of the function (𝑁 ≥ 1). This serial number is
referred to when selecting restraints in REUS runs. For details, see REMD section.
13.2 Examples
[RESTRAINTS]
nfunctions = 1
function1 = DIST
reference1 = 10.0
constant1 = 2.0
select_index1 = 1 2 # group1 and group2 in [SELECTION]
[RESTRAINTS]
nfunctions = 2
function2 = DIHED
constant2 = 3.0
reference2 = 120.0 # in degrees
select_index2 = 3 4 5 6
CHAPTER
FOURTEEN
FITTING SECTION
(In GENESIS 1.1.5 or later only) Keywords in [FITTING] section define a structure superimposition
scheme, which is often employed in targeted MD, steered MD, or String method (see RPATH section)
with positional restraint. In the String method, the reference coordinate for fitting is given by fitfile
in the [INPUT] section. Otherwise (MD, MIN, REMD), the reference coordinate is given by reffile,
ambreffile, or groreffile in the [INPUT] section. Note that this section is not related to the cryo-EM
flexible fitting (see Experiments section).
Default: NO
This parameter must not be changed for standard MD runs. If the parameter is set to YES
and fitting_method is set to NO, the fitting routine is turned off. Translational and rotational
fittings are usually required to calculate correct RMSD values. So GENESIS simulators
(ATDYN and SPDYN) do not allow fitting_method = NO for simulations involving RMSD
calculation (targeted/steered MD, for example). But such fitting is not desirable when gener-
ating initial structure set for the String method using Cartesian coordinate as CV (see RPATH
section). In fact, fitting_method = NO was implemented just for this specific purpose. If
you really want to turn off fitting in the RMSD calculation when preparing the initial structure
set for the String method, please specify fitting_method = NO and force_no_fitting = YES.
14.2 Examples
[FITTING]
fitting_method = TR+ROT
fitting_atom = 1
mass_weight = NO
CHAPTER
FIFTEEN
REMD SECTION
In the [REMD] section, users can specify keywords for Replica-Exchange Molecular Dynamics
(REMD) simulations. The REMD method is one of the enhanced conformational sampling methods used
for systems with rugged free-energy landscapes. The original temperature-exchange method (T-REMD)
is one of the most widely used methods in biomolecular simulations [63] [64]. Here, replicas (or copies)
of the original system are prepared, and a different temperature is assigned to each replica. Each replica
runs in a canonical (NVT) or isobaric-isothermal (NPT) ensemble, and the temperatures are periodically
exchanged between neighboring replicas during a simulation. Exchanging temperatures enforces a
random walk in temperature space, allowing the system to overcome energy barriers and sample a much
wider conformational space.
In REMD methods, the transition probability of the replica exchange process is given by the usual
Metropolis criterion,
$$w(X \to X') = \min\left(1, \frac{P(X')}{P(X)}\right) = \min(1, \exp(-\Delta)).$$
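In the T-REMD method, $\Delta = (\beta_m - \beta_n)\{E(q^{[j]}) - E(q^{[i]})\}$,
where $E$ is the potential energy, $q$ is the position of atoms, $\beta$ is the inverse temperature defined by
$\beta = 1/k_B T$, $i$ and $j$ are the replica indexes, and $m$ and $n$ are the parameter indexes. After the replica
exchange, atomic momenta are rescaled as follows:
$$p^{[i]\prime} = \sqrt{\frac{T_n}{T_m}}\, p^{[i]}, \qquad p^{[j]\prime} = \sqrt{\frac{T_m}{T_n}}\, p^{[j]}.$$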
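The Metropolis test and the momentum rescaling can be sketched in Python as follows. This is an illustration only, not GENESIS code: it assumes the standard T-REMD form Δ = (β_m − β_n)(E_j − E_i), with replica i at temperature T_m and replica j at T_n, and all function names are hypothetical.

```python
import math
import random

KB = 0.0019872041  # Boltzmann constant in kcal/(mol K)

def exchange_delta(temp_m, temp_n, energy_i, energy_j):
    """Delta = (beta_m - beta_n) * (E_j - E_i); replica i is at T_m, replica j at T_n."""
    beta_m = 1.0 / (KB * temp_m)
    beta_n = 1.0 / (KB * temp_n)
    return (beta_m - beta_n) * (energy_j - energy_i)

def acceptance_probability(delta):
    """w = min(1, exp(-delta)); the branch avoids overflow for very negative delta."""
    return 1.0 if delta <= 0.0 else math.exp(-delta)

def attempt_exchange(delta, rng=random):
    """Accept the exchange with probability w."""
    return rng.random() < acceptance_probability(delta)

def momentum_scale_factors(temp_m, temp_n):
    """Scale factors applied to p[i] (T_m -> T_n) and p[j] (T_n -> T_m)."""
    return math.sqrt(temp_n / temp_m), math.sqrt(temp_m / temp_n)
```

For example, `exchange_delta(300.0, 310.0, -100.0, -90.0)` is positive, so that exchange is accepted only with probability exp(−Δ), whereas a non-positive Δ is always accepted.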
In GENESIS, not only temperature REMD but also pressure REMD [67], surface-tension REMD [68],
REUS (or Hamiltonian REMD) [69] [70], replica exchange with solute tempering (REST) [71] [72],
and their multi-dimensional combinations are available in both ATDYN and SPDYN. Basically, these
methods can be employed in the NVT, NPT, NPAT, and NPgT ensembles, except for surface-tension
REMD, which can only be used in the NPgT ensemble. REMD simulations in GENESIS require an MPI
environment. At least one MPI process must be assigned to each replica. For example, when the user
wants to employ 32 replicas, 32𝑛 MPI processes (𝑛 = 1, 2, ...) are required.
In the following parameters excluding dimension, exchange_period, and iseed, the last character ‘N’ must
be replaced with a positive integer number (i.e. 𝑁 ≥ 1), which defines the index of replica dimension. For
example, type1, nreplica1 are the replica type and number of replicas for the first dimension, respectively.
For details, see the examples below.
dimension Integer
Default: 1
Number of dimensions (i.e. number of parameter types to be exchanged)
typeN TEMPERATURE / PRESSURE / GAMMA / RESTRAINT / REST
Default: TEMPERATURE
Type of parameter to be exchanged in the 𝑁 -th dimension
• TEMPERATURE: Temperature REMD [63]
• PRESSURE: Pressure REMD [67]
• GAMMA: Surface-tension REMD [68]
• RESTRAINT: REUS (or Hamiltonian REMD) [69] [70]
• REST: replica exchange with solute tempering (REST2 or gREST) [71] [72], which is
totally different from the original version of REST [73]. Currently, only AMBER and
CHARMM force fields are supported.
• ALCHEMY: FEP/𝜆-REMD [74]
nreplicaN Integer
Default: 0
Number of replicas (or parameters) in the 𝑁 -th dimension
parametersN Real
Default: N/A
List of parameters for each replica in the 𝑁 -th dimension. Parameters must be given as a
space-separated list, and the total number of parameters must be equal to nreplicaN. In case
of REUS (type = RESTRAINT), parameters must be specified in [RESTRAINTS] section
(see the sample below). In case of gREST (type = REST), these parameters are considered
as temperature of solute region. Note that the order of the parameters in this list must NOT
be changed before and after the restart run, even if the parameters are exchanged during the
REMD simulation.
exchange_period Integer
Default: 100
Frequency of the parameter exchange attempt. If “exchange_period = 0” is specified, the REMD
simulation is carried out without parameter exchange, which is useful to equilibrate the system
under the condition assigned to each replica before performing the production run.
cyclic_paramsN YES / NO
Default: NO
Turn on or off the periodicity of the parameters in the 𝑁 -th dimension. If “cyclic_paramsN
= YES” is specified, the first and last parameters are considered as neighbouring parameters.
This option is applicable to all parameter types. It is mainly useful in the case of
REUS in dihedral angle space, since the dihedral angle is periodic.
iseed Integer
Default: 3141592
Random number seed in the replica exchange scheme. If this is not specified explicitly, iseed
is taken over from the restart file.
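The effect of cyclic_paramsN on which parameters count as neighbours can be sketched in a few lines of Python. The function name is hypothetical, parameters are numbered from 0 here, and the sketch only lists neighbouring pairs; it does not model how GENESIS schedules the exchange attempts.

```python
def neighbour_pairs(nreplica, cyclic=False):
    """Index pairs of neighbouring parameters in one REMD dimension."""
    pairs = [(k, k + 1) for k in range(nreplica - 1)]
    # With periodic parameters, the last and first parameters are also neighbours.
    if cyclic and nreplica > 2:
        pairs.append((nreplica - 1, 0))
    return pairs

open_pairs = neighbour_pairs(4)
cyclic_pairs = neighbour_pairs(4, cyclic=True)
```

For four dihedral windows, the cyclic variant adds the pair (3, 0), which connects the two ends of the periodic coordinate.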
Note: In the [ENSEMBLE] section, there is also a parameter “temperature”. In the T-REMD simula-
tion, this temperature is ignored, even if it is specified explicitly. Similarly, pressure and gamma in the
[ENSEMBLE] section are ignored in the P-REMD and surface-tension REMD simulations, respectively.
Note: When multi-dimensional REMD is carried out, parameters are exchanged alternately. For
example, in TP-REMD (type1 = TEMPERATURE and type2 = PRESSURE), there is a temperature
exchange first, followed by a pressure exchange. This is repeated during the simulation.
Note: Positional restraint is not available for REUS. In SPDYN, PCA restraint is not available for REUS.
The control file format was completely changed after version 1.1.0, since the off-grid REUS scheme was
introduced. Please be careful when using an old control file.
select_indexN Integer
Default: N/A
Index of an atom group. The selected atoms are considered as “solute” in gREST. The index
must be defined in [SELECTION] (see Selection section).
param_typeN ALL / BOND / ANGLE / UREY / DIHEDRAL / IMPROPER / CMAP / CHARGE / LJ
Default: ALL
Solute energy terms for gREST [72] simulations. Energy terms selected by this parame-
ter in the solute atom group (defined by select_indexN) are considered as “solute” (scaled
according to solute temperature) in gREST. Other terms are considered as “solvent” (kept
intact). Solute-solvent terms are automatically determined from the solute selection. You
can specify multiple terms (see examples). The parameter names are case-insensitive as
follows:
• ALL: all the available energy terms.
• BOND: (aliases: B, BONDS): 1-2 bonding terms.
• ANGLE: (aliases: A, ANGLES): 1-2-3 angle terms.
• UREY: (aliases: U, UREYS): Urey-Bradley terms.
• DIHEDRAL: (aliases: D, DIHEDRALS): 1-2-3-4 dihedral terms.
• IMPROPER: (aliases: I, IMPROPERS): improper torsion terms.
• CMAP: (aliases: CM, CMAPS): CMAP terms.
• CHARGE: (aliases: C, CHARGES): coulombic interaction terms.
• LJ: (aliases: L, LJS): Lennard-Jones interaction terms.
Note: Restraint energy terms defined in [RESTRAINTS] cannot be treated as solute terms.
They are never affected by gREST solute temperatures. In SPDYN, water atoms currently cannot be specified as
“solute”. This limitation will be removed in a future version.
Note: When the coulombic interaction terms are considered as the solute, the solute region should have
a net charge of 0 for an adequate PME calculation.
15.4 Examples
Basically, REMD simulations in GENESIS can be carried out by just adding the [REMD] section in the
control file of a normal MD simulation. For details, see the online Tutorial
(https://www.r-ccs.riken.jp/labs/cbrt/tutorials2019/).
15.4.1 T-REMD
If the users want to carry out T-REMD simulations with 4 replicas in the NVT ensemble, where each
replica has the temperature 298.15, 311.79, 321.18, or 330.82 K, and replica exchange is attempted
every 1000 steps, the following section is added to the control file of a normal MD simulation in the
NVT ensemble:
[REMD]
dimension = 1
exchange_period = 1000
type1 = TEMPERATURE
nreplica1 = 4
parameters1 = 298.15 311.79 321.18 330.82
As for the T-REMD simulation in the NPT ensemble, the users add this section to the control file of a
normal MD simulation in the NPT ensemble. The REMD temperature generator
(http://folding.bmc.uu.se/remd/) is a useful tool to set the target temperature of each replica.
The following is an example of two-dimensional REMD, where temperature and restraint are exchanged
alternately. The 1st dimension is T-REMD with 8 parameters, and the 2nd dimension is REUS in distance
space with 4 parameters. In total, 8 x 4 = 32 replicas are used:
[REMD]
dimension = 2
exchange_period = 1000
type1 = TEMPERATURE
nreplica1 = 8
parameters1 = 298.15 311.79 321.18 330.82 340.70 350.83 361.23 371.89
type2 = RESTRAINT
nreplica2 = 4
rest_function2 = 1
[SELECTION]
group1 = ai:25
group2 = ai:392
[RESTRAINTS]
nfunctions = 1
function1 = DIST
constant1 = 2.0 2.0 2.0 2.0
The following is an example of off-grid REUS, where distance and dihedral restraints are merged into a
single reaction coordinate. The first values of the restraints (constant 2.0 and reference 10.0 for the
distance; constant 10 and reference -40 for the dihedral) are used for the first replica, and the fourth values
(2.0 and 11.5 for the distance; 10 and -10 for the dihedral) are used for the fourth replica:
[REMD]
dimension = 1
exchange_period = 1000
type1 = RESTRAINT # REUS
nreplica1 = 4
rest_function1 = 1 2 # off-grid REUS
[SELECTION]
group1 = ai:25
group2 = ai:392
group3 = ai:72
group4 = ai:73
group5 = ai:74
group6 = ai:75
[RESTRAINTS]
nfunctions = 2
function1 = DIST
constant1 = 2.0 2.0 2.0 2.0 # num of values must be nreplica1
reference1 = 10.0 10.5 11.0 11.5
select_index1 = 1 2
function2 = DIHED
constant2 = 10 10 10 10 # num of values must be nreplica1
reference2 = -40 -30 -20 -10
select_index2 = 3 4 5 6
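How the space-separated lists in the off-grid REUS example map onto replicas can be sketched in Python. This is an illustration under the assumption that replica k simply takes the k-th constant and reference of every merged restraint function; the dictionary layout and function name are hypothetical and not part of GENESIS.

```python
def replica_restraints(functions, nreplica):
    """Return, per replica, the (constant, reference) pair of each restraint."""
    for name, f in functions.items():
        if len(f["constant"]) != nreplica or len(f["reference"]) != nreplica:
            raise ValueError(f"{name}: number of values must equal nreplica")
    return [
        {name: (f["constant"][k], f["reference"][k]) for name, f in functions.items()}
        for k in range(nreplica)
    ]

# Values taken from the off-grid REUS control file above.
functions = {
    "DIST": {"constant": [2.0, 2.0, 2.0, 2.0],
             "reference": [10.0, 10.5, 11.0, 11.5]},
    "DIHED": {"constant": [10.0, 10.0, 10.0, 10.0],
              "reference": [-40.0, -30.0, -20.0, -10.0]},
}
table = replica_restraints(functions, nreplica=4)
```

Here `table[0]` holds the first values of both restraints and `table[3]` the fourth, so the two restraints move together along a single replica index.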
15.4.4 gREST
In this example, the dihedral, CMAP, and LJ energy terms in the selected atom groups are treated as
“solute”.
[REMD]
dimension = 1
exchange_period = 1000
type1 = REST
nreplica1 = 4
parameters1 = 300.0 310.0 320.0 330.0 # solute temperatures
param_type1 = D CM L # dihedral, CMAP, and LJ
select_index1 = 1
[SELECTION]
group1 = ai:1-313
T-REMD in the two-dimensional REMD (T-REMD/REUS) may be replaced with gREST (gREST/REUS
[75]) to reduce the required number of replicas.
[REMD]
dimension = 2
exchange_period = 1000
type1 = REST
nreplica1 = 4
parameters1 = 300.0 310.0 320.0 330.0 # solute temperatures
param_type1 = D CM L # dihedral, CMAP, and LJ
select_index1 = 3
type2 = RESTRAINT
nreplica2 = 4
rest_function2 = 1
[SELECTION]
group1 = ai:25
group2 = ai:392
group3 = ai:1-313
[RESTRAINTS]
nfunctions = 1
function1 = DIST
constant1 = 2.0 2.0 2.0 2.0
reference1 = 10.0 10.5 11.0 11.5
select_index1 = 1 2
CHAPTER
SIXTEEN
RPATH SECTION
In the [RPATH] section, users can specify keywords for finding the reaction path. The path search is
carried out in two modes: the minimum energy path (MEP) and the minimum free-energy path (MFEP).
The former searches for the energetically most favorable pathway on the potential energy surface (PES),
while the latter does the same on the free-energy surface (FES). The MEP search is used to find relatively
fast processes such as chemical reactions (very likely along with QM/MM), in which the environment
can be regarded as more or less rigid. On the other hand, the MFEP reveals large-scale conformational
changes of biomolecules by searching the path on an FES, where fast molecular motions are averaged out.
In both cases, the path is represented by a chain of replicas, which evolves on the energy surface so as to
minimize the forces in the transverse direction.
rpathmode MFEP/MEP
Default: MFEP
Specify MFEP or MEP to invoke the MFEP or MEP search.
The MFEP search is invoked by specifying rpathmode = MFEP. The path search is carried out by
the string method, which is a powerful sampling technique to find a path connecting two stable
conformational states. This method is widely used for investigating large-scale conformational changes of
biomolecules where the time scales of the transitions are not reachable in brute-force simulations.
There are three major algorithms in the string method: the mean forces string method [76], the on-the-fly
string method [77], and the string method of swarms of trajectories [78]. Among these algorithms, the
mean forces string method is available in ATDYN and SPDYN [79].
In the mean-forces string method, the pathway is represented by discretized points (called images) in the
collective variable (CV) space. The current GENESIS supports distances, angles, dihedrals, Cartesian
coordinates, and principal components for CVs (note that different types of CVs cannot be mixed in
GENESIS. For example, users cannot mix distance and angle).
In the calculation, each image is assigned to a replica, and each replica samples mean forces and an
average metric tensor around its own image by a short MD simulation (ps to ns length) with restraints.
The restraints are imposed using the image coordinates as their reference values. After the short
simulation, each image is evolved according to the mean force and metric tensor. Then, smoothing and
re-parametrization of the images are performed, and the calculation proceeds to the next cycle.
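One cycle of the image update (evolve, smooth, re-parametrize) can be sketched in Python. This is a simplified illustration rather than the GENESIS implementation: it assumes an identity metric tensor, takes the sampled free-energy gradients as given, uses a simple neighbour-averaging smoother, and re-parametrizes by equal arc length along the piecewise-linear path. All function names are hypothetical, and the path is assumed to have non-zero length.

```python
import math

def evolve(images, gradients, delta):
    """Steepest-descent step z <- z - delta * grad (identity metric assumed)."""
    return [[z - delta * g for z, g in zip(img, grad)]
            for img, grad in zip(images, gradients)]

def smooth(images, s):
    """Mix each interior image with the mean of its two neighbours."""
    out = [list(images[0])]
    for prev, cur, nxt in zip(images, images[1:], images[2:]):
        out.append([(1.0 - s) * c + 0.5 * s * (p + n)
                    for p, c, n in zip(prev, cur, nxt)])
    out.append(list(images[-1]))
    return out

def reparametrize(images):
    """Redistribute images at equal arc length along the piecewise-linear path."""
    seg = [math.dist(a, b) for a, b in zip(images, images[1:])]
    cum = [0.0]
    for length in seg:
        cum.append(cum[-1] + length)
    n = len(images)
    out = [list(images[0])]
    j = 0
    for i in range(1, n - 1):
        target = cum[-1] * i / (n - 1)
        while cum[j + 1] < target:  # find the segment containing the target length
            j += 1
        t = (target - cum[j]) / seg[j]
        out.append([a + t * (b - a) for a, b in zip(images[j], images[j + 1])])
    out.append(list(images[-1]))
    return out

# Three unevenly spaced images in a 2D CV space become evenly spaced.
path = reparametrize([[0.0, 0.0], [1.0, 0.0], [4.0, 0.0]])
```

In this sketch, `delta` and `s` play the roles of the delta and smooth keywords, and the terminal images are left untouched by smoothing and re-parametrization.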
Image coordinates are written to rpath files (rpathfile keyword), which the user can specify in the [OUTPUT]
section. This file provides the trajectory of image coordinates. Columns correspond to CVs and rows
to time steps. These values are written at the same timing as dcdfile (specified by crdout_period in the
[DYNAMICS] section).
For the string method calculation, an initial pathway in the CV space and atomistic coordinates around
the pathway are required. For preparing these, targeted or steered MD methods are recommended.
nreplica Integer
Default: 1
Number of replicas (images) for representing the pathway.
rpath_period Integer
Default: 0
Time-step period during which the mean forces acting on the images are evaluated. After
evaluating the mean forces, the images are updated according to them, and the calculation
proceeds to the next cycle. If rpath_period = 0, images are not updated. This option is used for
equilibration or umbrella sampling around the pathway.
delta Real
Default: 0.0
Step-size for steepest descent update of images.
smooth Real
Default: 0.0
Smoothing parameter which controls the aggressiveness of the smoothing. Values from 0.0
to 0.1 are recommended, where “smooth = 0.0” means no smoothing.
rest_function List of Integers
Default: N/A
List of restraint function indices defined in [RESTRAINTS] section (see Restraints section).
Specified restraints are defined as CVs, and nreplica images (replicas) are created, where a
set of corresponding restraint reference values is assigned to each image. Force constants in
[RESTRAINTS] are also used for evaluation of mean-forces.
fix_terminal YES / NO
Default: NO
If fix_terminal = YES is specified, the two terminal images are always fixed and not updated.
This is useful if the terminal images correspond to crystal structures and users do not want
to move them.
use_restart YES / NO
Default : YES
Restart file generated by the string method calculation includes the last snapshot of images. If
use_restart = YES is specified, the reference values in [RESTRAINTS] will be overwritten
by the values in the restart file. Note that force constants are not overwritten.
Note: The following options are needed in the [FITTING] section when Cartesian coordinates are used
for CVs.
fitting_method TR+ROT / XYTR+ZROT / NO
If this keyword is specified, roto-translational elements are removed from the mean-force
estimation by fitting instantaneous structures to the reference coordinates given by fitfile.
fitting_atom List of Integers
The user can specify the index of an atom group which is fitted to the reference structure.
Usually, the same atoms as CVs are selected.
mepatm_select_index Integer
Index of a group of atoms which is treated as MEP atoms. The index must be defined in
[SELECTION] (see Selection section).
ncycle Integer
Default: 1000
Maximum number of cycles.
nreplica Integer
Default: 1
Number of replicas (images) for representing the pathway.
Note: If MPI processes are larger than nreplica, the MPI processes must be a multiple of nreplica.
For example, if nreplica = 16, MPI processes must be 16, 32, 48, etc. If MPI processes are smaller than
nreplica, the MPI processes must be a divisor of nreplica. For example, the calculation with nreplica
= 16 can be performed using 1, 2, 4, and 8 MPI processes.
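The rule in the note above can be written as a small Python check. This is only a sketch of the stated rule (a multiple of nreplica when there are at least as many processes as replicas, otherwise a divisor of nreplica); the function name is hypothetical.

```python
def mpi_count_is_valid(nproc, nreplica):
    """Check the number of MPI processes against the number of replicas."""
    if nproc < 1 or nreplica < 1:
        return False
    if nproc >= nreplica:
        return nproc % nreplica == 0   # 16, 32, 48, ... for nreplica = 16
    return nreplica % nproc == 0       # 1, 2, 4, 8 for nreplica = 16

valid_counts = [n for n in range(1, 49) if mpi_count_is_valid(n, 16)]
```

For nreplica = 16, `valid_counts` contains the divisors and multiples listed in the note (up to 48 processes).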
eneout_period Integer
Default : 10
Frequency of the output of the energy profile and path length to the standard output.
crdout_period Integer
Default : 0
Frequency of coordinates outputs. Note that coordinate outputs are turned off for the mini-
mization (crdout_period in the [MINIMIZE] section).
rstout_period Integer
Default : 0
Frequency of restart file updates. Note that the updates are turned off for the minimization
(rstout_period in the [MINIMIZE] section).
tol_energy Real
Default : 0.01 (unit : kcal/mol)
Tolerance of convergence for the energy.
tol_path Real
Default : 0.01 (unit : Å)
Tolerance of convergence for the path length.
massweightcoord YES / NO
Default : NO
Use mass-weighted Cartesian coordinates.
method STRING/NEB
Default: STRING
Choose the algorithm of a MEP search.
Default: 10.0 kcal/mol/Å²
Spring constant of the force that connects the images.
ncorrection Integer
Default : 10
Number of corrections to build the inverse Hessian.
lbfgs_bnd YES / NO
Default : YES
Set a boundary to move atoms in each step of image update.
lbfgs_bnd_qmonly YES / NO
Default : NO
Set the boundary only to QM atoms.
lbfgs_bnd_maxmove Real
Default : 0.1 (unit : Å)
The maximum size of move in each step.
16.4 Examples
Example of alanine-tripeptide with 16 replicas (images). Two dihedral angles are specified as the collec-
tive variables.
[RPATH]
nreplica = 16
rpath_period = 1000
delta = 0.02
smooth = 0.0
rest_function = 1 2
[SELECTION]
group1 = atomindex:15
group2 = atomindex:17
group3 = atomindex:19
group4 = atomindex:25
group5 = atomindex:27
[RESTRAINTS]
nfunctions = 2
function1 = DIHED
constant1 = 100.0 100.0 100.0 100.0 100.0 100.0 100.0 100.0 \
100.0 100.0 100.0 100.0 100.0 100.0 100.0 100.0
reference1 = -40.0 -40.0 -40.0 -40.0 -40.0 -40.0 -40.0 -40.0 \
(continues on next page)
function2 = DIHED
constant2 = 100.0 100.0 100.0 100.0 100.0 100.0 100.0 100.0 \
100.0 100.0 100.0 100.0 100.0 100.0 100.0 100.0
reference2 = -45.0 -33.0 -21.0 -9.0 3.0 15.0 27.0 39.0 \
51.0 63.0 75.0 87.0 99.0 111.0 123.0 135.0
select_index2 = 2 3 4 5 # PSI
Here is another example using Cartesian coordinate CVs for the same alanine-tripeptide.
[INPUT]
... skip ...
rstfile = ../eq/{}.rst
reffile = {}.pdb
fitfile = fit.pdb
[RPATH]
nreplica = 16
rpath_period = 1000
delta = 0.001
smooth = 0.00
rest_function = 1
fix_terminal = NO
[FITTING]
fitting_method = TR+ROT
fitting_atom = 1
[SELECTION]
group1 = ai:15 or ai:17 or ai:19 or ai:25 or ai:27
[RESTRAINTS]
nfunctions = 1
function1 = POSI
constant1 = 10.0 10.0 10.0 10.0 \
10.0 10.0 10.0 10.0 \
10.0 10.0 10.0 10.0 \
10.0 10.0 10.0 10.0
select_index1 = 1
[INPUT]
topfile = toppar/top_all36_prot.rtf, ...
parfile = toppar/par_all36_prot.prm, ...
psffile = prot.psf # protein structure file
[OUTPUT]
dcdfile = mep_{}.dcd # coordinates
logfile = mep_{}.log # log files
rstfile = mep_{}.rst # restart file
rpathfile = mep_{}.rpath # rpath file
[ENERGY]
forcefield = CHARMM # CHARMM force field
... skip ...
[MINIMIZE]
method = LBFGS # MIN using L-BFGS
nsteps = 100 # max. number of steps
eneout_period = 5 # energy output period
fixatm_select_index = 2 # fix the outer layer
macro = yes # macro/micro iteration
[RPATH]
rpathmode = MEP # MEP search
method = STRING # String method
delta = 0.0005 # step size
ncycle = 200 # max. number of cycle
nreplica = 16 # number of replica
eneout_period = 1 # frequency of the energy output
crdout_period = 1 # frequency of the coordinate output
rstout_period = 1 # frequency of the restart update
fix_terminal = no # fix the terminal
massWeightCoord = no # mass-weighted Cartesian
mepatm_select_index = 1 # selection of the MEP atoms
[BOUNDARY]
type = NOBC
[QMMM]
qmtyp = gaussian
qmatm_select_index = 1
... skip ...
[SELECTION]
group1 = sid:DHA or (sid:TIMA and (rno:95 or rno:165) and \
not (an:CA | an:C | an:O | an:N | an:HN | an:HA))
group2 = not (sid:DHA or sid:DHA around_res:6.0)
CHAPTER
SEVENTEEN
GAMD SECTION
In the [GAMD] section, users can specify keywords for Gaussian accelerated Molecular Dynamics
(GaMD) simulations. The GaMD method [83, 84] accelerates the conformational sampling of
biomolecules by adding a harmonic boost potential to smooth their potential energy surface. GaMD
has the advantage that reaction coordinates do not need to be predefined, so setting up the system for
the simulation is rather easy. The use of the harmonic boost potential makes it possible to recover the unbiased
free-energy changes through cumulant expansion to the second order, which resolves the practical reweighting
problem in the original accelerated MD method.
GaMD was developed as a potential-biasing method for enhanced sampling. It accelerates the
conformational sampling of a biomolecule by adding a non-negative boost potential to the system potential energy
$U(\vec{x})$:
$$U'(\vec{x}) = U(\vec{x}) + \Delta U^{\mathrm{GaMD}}(U(\vec{x})),$$
where $\vec{x}$ is the configuration of the system, $U'(\vec{x})$ is the modified potential energy, and $\Delta U^{\mathrm{GaMD}}$ is the
boost potential depending only on $U(\vec{x})$.
In conventional accelerated MD [85, 86, 87], the average of the Boltzmann factors of the boost potential
terms appears in the reweighting equation of the probability along the selected reaction coordinates,
causing a large statistical error. In order to reduce the noise, GaMD uses a harmonic boost potential,
which adopts a positive value only when the system potential is lower than an energy threshold $E$:
$$\Delta U^{\mathrm{GaMD}}(U(\vec{x})) = \begin{cases} \frac{1}{2} k \{E - U(\vec{x})\}^2 & (U(\vec{x}) < E) \\ 0 & (U(\vec{x}) \ge E) \end{cases},$$
where $k$ is a harmonic force constant. $U'(\vec{x})$ should satisfy the following relationships [83, 84]:
$U'(\vec{x}_1) < U'(\vec{x}_2)$ and $U'(\vec{x}_2) - U'(\vec{x}_1) < U(\vec{x}_2) - U(\vec{x}_1)$ if $U(\vec{x}_1) < U(\vec{x}_2)$.
To keep the relationships, the threshold energy needs to be set as:
$$U_{\max} \le E \le U_{\min} + \frac{1}{k},$$
where $U_{\max}$ and $U_{\min}$ are the maximum and minimum energies of the system, respectively. To ensure
accurate reweighting, the deviation of the potential must also satisfy the relation:
$$k (E - U_{\mathrm{ave}})\, \sigma_U \le \sigma_0,$$
where $U_{\mathrm{ave}}$ and $\sigma_U$ are the average and standard deviation of $U(\vec{x})$, respectively. $\sigma_0$ is a user-specified
upper limit. $k_0$ is defined as $k_0 \equiv k (U_{\max} - U_{\min})$, so that $0 < k_0 \le 1$.
When $E$ is set to the lower bound $U_{\max}$, $k_0$ is determined by
$$k_0 = \min\left(1, \frac{\sigma_0}{\sigma_U} \cdot \frac{U_{\max} - U_{\min}}{U_{\max} - U_{\mathrm{ave}}}\right).$$
where 𝑈𝑖′′ (⃗𝑥) is the modified potential energy of replica 𝑖, ∆𝑈𝑖REUS is the bias potential of REUS for
replica 𝑖, and 𝜉(⃗𝑥) is the collective variable of REUS. This method is referred to as Gaussian accelerated
replica exchange umbrella sampling (GaREUS) [89]. The parameters in the GaMD boost potential are
used in all replicas of GaREUS simulations. By using this combination, the simulated system in each
replica becomes more flexible, or the energy barrier irrelevant to the collective variable is lowered, en-
hancing the sampling efficiency. When performing GaREUS simulations, the user must specify [REMD]
section to use REUS and define a collective variable in the [SELECTION] and [RESTRAINTS] sections.
Please check the example below.
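The GaMD relations for the lower-bound threshold (E = U_max) can be checked numerically with a short Python sketch. It is not the GENESIS implementation: the function names are hypothetical, the statistics are the pot_* values that appear in the parameter-update example of this chapter, and sigma0 = 6.0 follows sigma0_pot in the same example.

```python
def gamd_k0_lower(u_max, u_min, u_ave, sigma_u, sigma0):
    """k0 = min(1, (sigma0 / sigma_U) * (U_max - U_min) / (U_max - U_ave))."""
    return min(1.0, (sigma0 / sigma_u) * (u_max - u_min) / (u_max - u_ave))

def gamd_boost(u, threshold, k):
    """Harmonic boost: 0.5 * k * (E - U)**2 below the threshold E, zero otherwise."""
    if u >= threshold:
        return 0.0
    return 0.5 * k * (threshold - u) ** 2

# Potential-energy statistics in kcal/mol (pot_max, pot_min, pot_ave, pot_dev).
u_max, u_min, u_ave, sigma_u = -20935.8104, -21452.3778, -21183.9911, 78.1207

k0 = gamd_k0_lower(u_max, u_min, u_ave, sigma_u, sigma0=6.0)
k = k0 / (u_max - u_min)              # from k0 = k * (U_max - U_min)
boost_at_average = gamd_boost(u_ave, threshold=u_max, k=k)
```

With these numbers k0 stays below 1, so the force constant is limited by the user-specified sigma0 rather than by the cap, and the boost vanishes for any configuration with U at or above U_max.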
gamd YES / NO
Default : NO
Enable the GaMD method.
boost YES / NO
Default : YES
Flag to apply the GaMD boost to the system. If boost = NO, the boost is not applied but the GaMD
parameters are updated from the trajectory.
boost_type DUAL / DIHEDRAL / POTENTIAL
Default: DUAL
Type of boost.
• DUAL: Boost is applied on both the dihedral and total potential energies.
dih_min Real
dih_ave Real
Default: 0.0 (unit: kcal/mol)
Average of the dihedral energy of the system, $U_{\mathrm{ave}}^{\mathrm{dih}}$.
dih_dev Real
Default: 0.0 (unit: kcal/mol)
Standard deviation of the dihedral energy of the system, $\sigma_U^{\mathrm{dih}}$.
17.2 Examples
Example of a GaMD simulation to determine initial parameters. To obtain the initial guess of the boost
potential, (pot_max, pot_min, pot_ave, pot_dev, dih_max, dih_min, dih_ave, dih_dev) are calculated
from a short simulation without boosting.
[GAMD]
gamd = yes
boost = no
boost_type = DUAL
thresh_type = LOWER
sigma0_pot = 6.0
sigma0_dih = 6.0
update_period = 50000
Example of a GaMD simulation updating parameters. The boost potential is updated every update_period
during the simulation.
[GAMD]
gamd = yes
boost = yes
boost_type = DUAL
thresh_type = LOWER
sigma0_pot = 6.0
sigma0_dih = 6.0
update_period = 500
pot_max = -20935.8104
pot_min = -21452.3778
pot_ave = -21183.9911
pot_dev = 78.1207
dih_max = 16.4039
dih_min = 8.5882
dih_ave = 11.0343
dih_dev = 1.0699
Example of a GaMD simulation for production. In order to fix the parameters (pot_max, pot_min,
pot_ave, pot_dev, dih_max, dih_min, dih_ave, dih_dev), update_period is set to 0.
[GAMD]
gamd = yes
boost = yes
boost_type = DUAL
thresh_type = LOWER
sigma0_pot = 6.0
sigma0_dih = 6.0
update_period = 0
pot_max = -20669.2404
pot_min = -21452.3778
pot_ave = -20861.5224
pot_dev = 48.9241
dih_max = 23.2783
dih_min = 8.5882
dih_ave = 13.3806
dih_dev = 1.7287
Example of a GaREUS simulation. The same GaMD parameters are applied in each replica of REUS.
After the simulation, the two-step reweighting procedure using the multistate Bennett acceptance ratio
method and the cumulant expansion for the exponential average is required to obtain the unbiased free-
energy landscapes.
[REMD]
dimension = 1
exchange_period = 5000
type1 = RESTRAINT
nreplica1 = 4
rest_function1 = 1
[GAMD]
gamd = yes
boost = yes
boost_type = DUAL
thresh_type = LOWER
sigma0_pot = 6.0
sigma0_dih = 6.0
update_period = 0
pot_max = -26491.7344
pot_min = -27447.4316
pot_ave = -26744.5742
pot_dev = 52.5674
dih_max = 135.8921
dih_min = 91.2309
dih_ave = 116.8572
dih_dev = 3.6465
[SELECTION]
group1 = rno:1 and an:CA
group2 = rno:10 and an:CA
[RESTRAINTS]
nfunctions = 1
function1 = DISTMASS
constant1 = 1.0 1.0 1.0 1.0
reference1 = 5.0 6.0 7.0 8.0
select_index1 = 1 2
CHAPTER
EIGHTEEN
QMMM SECTION
$$V(\mathbf{R}_a, \mathbf{R}_m) = V^{\mathrm{QM}}(\mathbf{R}_a, \mathbf{R}_m) + V_{\mathrm{LJ}}^{\mathrm{QM-MM}}(\mathbf{R}_a, \mathbf{R}_m) + V^{\mathrm{MM}}(\mathbf{R}_m),$$
where $\mathbf{R}_a$ and $\mathbf{R}_m$ denote the positions of atoms in the QM and MM regions, respectively. $V_{\mathrm{LJ}}^{\mathrm{QM-MM}}$ and
$V^{\mathrm{MM}}$ are the Lennard-Jones interaction between QM and MM atoms and the force field for MM atoms,
respectively. The QM energy, $V^{\mathrm{QM}}$, is written in terms of the electronic energy and the Coulomb
interaction between nucleus-nucleus and nucleus-MM atoms,
$$V^{\mathrm{QM}}(\mathbf{R}_a, \mathbf{R}_m) = E_e(\mathbf{R}_a, \mathbf{R}_m) + \sum_{a > a'} \frac{Z_a Z_{a'}}{r_{aa'}} + \sum_{a,m} \frac{Z_a q_m}{r_{am}},$$
where $Z_a$ and $q_m$ are the charges of the nuclei and MM atoms, respectively, and $r_{aa'}$ and $r_{am}$ denote the
nucleus-nucleus and nucleus-MM atom distances, respectively. The electronic energy is given by solving
the Schrödinger equation for electrons,
$$\left[ -\frac{1}{2} \sum_i \nabla_i^2 + \sum_{i>j} \frac{1}{r_{ij}} - \sum_{i,a} \frac{Z_a}{r_{ia}} - \sum_{i,m} \frac{q_m}{r_{im}} \right] |\Psi_e\rangle = E_e |\Psi_e\rangle,$$
where i, a, and m are indices for electrons, nuclei, and MM atoms, respectively, and $r_{XY}$ denotes the
distance between particles X and Y.
GENESIS does not have a function to solve the electronic Schrödinger equation, but relies on external
QM programs, which provide the QM energy, its derivatives, and other information. The interface is
currently available for Gaussian, Q-Chem, TeraChem, DFTB+, and QSimulate.
• Gaussian09/Gaussian16 (http://gaussian.com)
• Q-Chem (http://www.q-chem.com)
• TeraChem (http://www.petachem.com)
• DFTB+ (https://www.dftbplus.org)
• QSimulate (https://qsimulate.com/academic.html)
GENESIS/QSimulate is interfaced via shared libraries and seamlessly uses the MPI parallelization,
thereby facilitating high-performance QM/MM-MD simulations [80]. A ready-to-use Singularity image
is provided by QSimulate Inc. See QSimulate (https://qsimulate.com/academic.html) for further
information.
Other QM programs are invoked via a system call function of Fortran. In this scheme, the input file
of a QM calculation is first generated, followed by executing a script to run a QM program and reading
the information from QM output files. For more information on the method and implementation,
see Ref. [92]. Samples of a QM input file (qmcnt) and a script (qmexe) are available in our GitHub repository
(https://github.com/yagikiyoshi/QMMMscripts).
In order to run QM/MM calculations, users add the [QMMM] section to the control file. Available
options are listed in the following.
Default : N/A
If present, QM files are copied from workdir to this directory. It is typically the case that
QM calculations are carried out within a node, while the whole simulation (such as REMD) runs
across nodes. In that case, it is useful for better performance to set workdir to a local disk
of each node with fast access (e.g., /dev/shm), and copy the QM files to savedir with a
frequency specified by qmsave_period.
qmmaxtrial Integer
Default : 1
The maximum number of trial runs for QM calculations. When a QM calculation fails,
GENESIS repeats the calculation until the number of trials reaches this value. The SCF
threshold is lowered if the SCF threshold option is present in the QM control file.
exclude_charge ATOM / GROUP / AMBER
Default: GROUP
This option specifies how to exclude the MM charges in the vicinity of a QM-MM boundary
to avoid overpolarization of the QM electron density. When the CHARMM force field is used,
ATOM excludes only the charge of the MM link atom, while GROUP excludes the charges of
all MM atoms that belong to the same group as an MM atom at the boundary. When the
AMBER force field is used, AMBER excludes the charge of the MM link atom and distributes
it evenly to the rest of the system.
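The AMBER option described above can be pictured with a small sketch. It is only an illustration of "exclude the link-atom charge and spread it evenly": the function name, the charge list, and the choice of which atoms receive the charge (here, every other atom in the list) are assumptions, not the GENESIS implementation.

```python
def redistribute_link_charge(charges, link_index):
    """Zero the charge at link_index and spread it evenly over the other atoms."""
    n_others = len(charges) - 1
    if n_others < 1:
        raise ValueError("need at least one other atom to receive the charge")
    share = charges[link_index] / n_others
    adjusted = [q + share for q in charges]
    adjusted[link_index] = 0.0
    return adjusted


# Illustrative MM charges in units of e; index 0 plays the role of the MM link atom.
mm_charges = [-0.40, 0.10, 0.25, -0.15]
adjusted_charges = redistribute_link_charge(mm_charges, link_index=0)
```

The total charge of the list is unchanged by the redistribution, which is the point of distributing the excluded charge rather than simply deleting it.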
Note: Since version 1.6.1, QM/MM supports both CHARMM and AMBER force fields for the MM.
Note: The QM/MM calculation must be carried out in non-PBC. A non-PBC system can be created
from MD trajectory (pdb, dcd) using qmmm_generator in the analysis tool. See the tutorial of QM/MM
(https://www.r-ccs.riken.jp/labs/cbrt/tutorials2019/tutorial-16-1) for more details.
18.2 Examples
In the following example, atoms 1 to 14 are selected as QM atoms in the [SELECTION] section.
The QM program is Gaussian. A directory qmmm_min is created, where input and output files for Gaussian
(jobXXXX.inp and jobXXXX.log) are saved every 10 steps.
[SELECTION]
group1 = atomno:1-14
[QMMM]
qmatm_select_index = 1
qmtyp = gaussian
qmcnt = gaussian.com
qmexe = runGau.sh
workdir = qmmm_min
basename = job
qmsave_period = 10
The following example is for DFTB+. Because DFTB calculations typically take less than 1 s per snapshot,
the I/O needed to generate the input and read the output can be non-negligible. It is recommended to set
workdir to a fast disk such as /dev/shm. The input and output files of DFTB+ will be copied to qmmm_min
every 100 steps.
[QMMM]
qmatm_select_index = 1
qmtyp = dftb+
qmcnt = dftb.hsd
qmexe = runDFTB.sh
workdir = /dev/shm/qmmm_min
savedir = qmmm_min
basename = job
qmsave_period = 100
NINETEEN
VIBRATION SECTION
\[
\mathbf{H}\mathbf{C} = \omega \mathbf{C}
\]
\[
H_{ij} = \frac{1}{\sqrt{m_i m_j}} \frac{\partial^2 V}{\partial x_i \partial x_j},
\]
with the mass of the i-th atom, 𝑚𝑖 , and the potential energy, V. The Hessian matrix is calculated by
numerical differentiations of the gradient,
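\[
\frac{\partial^2 V}{\partial x_i \partial x_j} \simeq \frac{1}{2\delta_i} \left( \frac{\partial V(+\delta_i)}{\partial x_j} - \frac{\partial V(-\delta_i)}{\partial x_j} \right)
\]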
This step requires 6N gradient calculations, where N is the number of atoms in the subsystem.
The gradient calculations are parallelized by distributing over MPI processes.
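The central-difference scheme above can be sketched in a few lines. This is a serial illustration under stated assumptions, not the GENESIS code: the function name, the flat coordinate layout (3N entries), and the per-coordinate mass list are hypothetical, and the harmonic test potential is chosen only because its Hessian is known exactly.

```python
import math


def mass_weighted_hessian(gradient, coords, masses, delta=0.01):
    """Central-difference Hessian of V built from its gradient, mass-weighted.

    gradient(coords) returns dV/dx_j for all 3N coordinates.
    masses holds one mass per coordinate (each atomic mass repeated 3 times).
    """
    n = len(coords)
    hessian = [[0.0] * n for _ in range(n)]
    for i in range(n):
        plus = list(coords)
        minus = list(coords)
        plus[i] += delta
        minus[i] -= delta
        g_plus = gradient(plus)    # gradient at x_i + delta
        g_minus = gradient(minus)  # gradient at x_i - delta
        for j in range(n):
            second = (g_plus[j] - g_minus[j]) / (2.0 * delta)
            hessian[i][j] = second / math.sqrt(masses[i] * masses[j])
    return hessian


calls = {"n": 0}


def harmonic_gradient(x, k=2.0):
    """Gradient of V = (k/2) * sum(x_i^2); counts how often it is evaluated."""
    calls["n"] += 1
    return [k * xi for xi in x]


coords = [0.1, -0.2, 0.3, 0.0, 0.5, -0.4]   # two atoms, three coordinates each
masses = [1.0] * 3 + [4.0] * 3
hessian = mass_weighted_hessian(harmonic_gradient, coords, masses)
```

Each of the 3N coordinates is displaced in the positive and the negative direction, so the gradient is evaluated 2 × 3N = 6N times, and the displaced-geometry gradients are independent of one another, which is what makes the step easy to distribute over MPI processes.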
The information is output to a minfo file, which can be visualized by a molecular vibrational program,
SINDO.
Default : 1
The number of MPI processes.
vibatm_select_index Integer
Default: N/A
Indices of a group of atoms to specify a target subsystem for vibrational analyses. The indices
must be defined in [SELECTION] (see Selection section).
output_minfo_atm Integer
Default: N/A
Indices of a group of atoms, which are printed to a minfo file in addition to the target sub-
system. This option is useful when one would like to visualize the atoms surrounding the
target subsystem. The indices must be defined in [SELECTION] (see Selection section).
diff_stepsize Real
Default : 0.01 (unit: Å)
The size of numerical differentiations when generating the Hessian matrix.
minfo_folder Character
Default : minfo.files
The name of a directory where intermediate minfo files are stored. If the directory and
intermediate files are present, the program restarts from where it ended in the last run.
Note: The geometry of the subsystem needs to be optimized prior to the vibrational analysis. RMSG <
0.35 kcal/mol/Å is recommended.
Furthermore, anharmonic vibrational calculations can be carried out by combining SINDO and GENE-
SIS. The following two options are used for generating anharmonic potential energy surfaces. For more
details, visit the website of SINDO (https://tms.riken.jp/en/research/software/sindo).
gridfile Character
Default : makeQFF.xyz (for QFF) and makeGrid.xyz (for GRID)
The name of a file containing the XYZ coordinates of grid points for generating the anhar-
monic PES. The xyz file is generated by MakePES module of SINDO.
datafile Character
Default : makeGrid.dat
The name of a file containing the energy, dipole moment, etc. at grid points specified by
gridfile. The file is used by SINDO for generating GRID potentials.
19.2 Examples
In the following example, the vibrational analysis is performed for a subsystem composed of atoms
5-8 (group1) and residues 12-14 of segment “WAT” (group2). Four MPI processes are used to
calculate the gradients at grid points of numerical differentiations. The output is written to a minfo file,
where the coordinates are given not only for target atoms (group1 and group2) but also for the whole
protein (segid PROA).
[OUTPUT]
minfofile = qmmm_vib.minfo
[VIBRATION]
runmode = HARM
nreplica = 4
vibatm_select_index = 1 2
output_minfo_atm = 3
[SELECTION]
group1 = atomno:5-8
group2 = segid:WAT and (resno:12-14)
group3 = segid:PROA
TWENTY
EXPERIMENTS SECTION
20.1.1 Theory
Cryo-electron microscopy (Cryo-EM) is one of the powerful tools to determine three-dimensional struc-
tures of biomolecules at near atomic resolution. Flexible fitting has been widely utilized to model the
atomic structure from the experimental density map [93]. One of the commonly used methods is the
MD-based flexible fitting [94] [95]. In the method, the total potential energy is defined as the summation
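of a force field 𝑉FF and biasing potential 𝑉EM that guides the protein structure towards the target density:
\[
V_{\mathrm{total}} = V_{\mathrm{FF}} + V_{\mathrm{EM}}
\]
In the c.c.-based approach [93], one of the commonly used formulas for 𝑉EM is
\[
V_{\mathrm{EM}} = k \, (1 - \mathrm{c.c.})
\]
where k is the force constant, and c.c. is the cross-correlation coefficient between the experimental and
simulated EM density maps, calculated as
\[
\mathrm{c.c.} = \frac{\sum_{ijk} \rho^{\mathrm{exp}}(i,j,k) \, \rho^{\mathrm{sim}}(i,j,k)}{\sqrt{\sum_{ijk} \rho^{\mathrm{exp}}(i,j,k)^2 \, \sum_{ijk} \rho^{\mathrm{sim}}(i,j,k)^2}}
\]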
(i, j, k) is a voxel index in the density map, and 𝜌exp and 𝜌sim are the experimental and simulated EM
densities, respectively.
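As a hedged illustration of the c.c.-based bias, the sketch below assumes the usual normalized form, c.c. = Σ ρexp ρsim / sqrt(Σ ρexp² Σ ρsim²), and the energy k(1 − c.c.) quoted in the comment of the control-file example in this chapter. Maps are flattened to plain lists of voxel values; the function names are hypothetical.

```python
import math


def cross_correlation(rho_exp, rho_sim):
    """Normalized cross-correlation between two density maps (flattened voxels)."""
    numerator = sum(e * s for e, s in zip(rho_exp, rho_sim))
    denominator = math.sqrt(
        sum(e * e for e in rho_exp) * sum(s * s for s in rho_sim)
    )
    if denominator == 0.0:
        raise ValueError("both maps must contain non-zero density")
    return numerator / denominator


def em_bias_energy(rho_exp, rho_sim, k):
    """Biasing energy k * (1 - c.c.); zero when the two maps are proportional."""
    return k * (1.0 - cross_correlation(rho_exp, rho_sim))


target_map = [0.0, 1.0, 2.0, 1.0]
scaled_copy = [0.0, 2.0, 4.0, 2.0]      # same shape, different overall scale
perfect_fit = em_bias_energy(target_map, scaled_copy, k=10000.0)
no_overlap = em_bias_energy([1.0, 0.0], [0.0, 1.0], k=10000.0)
```

Because c.c. is normalized, the bias depends only on the shape of the simulated map relative to the target, not on the absolute scale of either density.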
The simulated densities are usually computed using a Gaussian mixture model, where a 3D Gaussian
function is put on the Cartesian coordinates of each target atom (i.e., protein atom), and all contributions
are integrated in each voxel of the map. Here, several schemes have been proposed, in which the Gaussian
function is weighted with an atomic number [96] or mass [97], or it is simply applied to non-hydrogen
atoms [93]. In GENESIS, the last scheme is adopted. The simulated density of each voxel is defined
as:
\[
\rho^{\mathrm{sim}}(i,j,k) = \sum_{n=1}^{N} \iiint_{V_{ijk}} g_n(x,y,z) \, \mathrm{d}x \, \mathrm{d}y \, \mathrm{d}z
\]
where 𝑉𝑖𝑗𝑘 is the volume of the voxel, N is the total number of non-hydrogen atoms in the system, and n
is the index of the atom. The Gaussian function 𝑔𝑛 (𝑥, 𝑦, 𝑧) is given by
\[
g_n(x,y,z) = \exp\left[ -\frac{3}{2\sigma^2} \left\{ (x - x_n)^2 + (y - y_n)^2 + (z - z_n)^2 \right\} \right]
\]
where (𝑥𝑛 , 𝑦𝑛 , 𝑧𝑛 ) are the coordinates of the n-th atom. 𝜎 determines the width of the Gaussian function,
and the generated EM density has the resolution of 2𝜎 in the map.
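The voxel integral of the Gaussian separates into three one-dimensional integrals, each of which has a closed form in terms of the error function. The sketch below uses that property; it is an illustration only (cubic voxels, no truncation of the Gaussian tail, hypothetical function names), not the routine used in GENESIS.

```python
import math


def gaussian_1d_integral(lo, hi, center, sigma):
    """Integral of exp(-3 (x - center)^2 / (2 sigma^2)) over [lo, hi]."""
    c = 3.0 / (2.0 * sigma * sigma)
    root_c = math.sqrt(c)
    return 0.5 * math.sqrt(math.pi / c) * (
        math.erf(root_c * (hi - center)) - math.erf(root_c * (lo - center))
    )


def voxel_density(voxel_min, voxel_size, atoms, sigma):
    """Simulated density of one cubic voxel: sum over atoms of the 3D integral."""
    total = 0.0
    for atom in atoms:
        contribution = 1.0
        for axis in range(3):
            lo = voxel_min[axis]
            contribution *= gaussian_1d_integral(
                lo, lo + voxel_size, atom[axis], sigma
            )
        total += contribution
    return total


heavy_atoms = [(0.0, 0.0, 0.0)]          # one non-hydrogen atom at the origin
center_voxel = voxel_density((-0.5, -0.5, -0.5), 1.0, heavy_atoms, sigma=2.5)
next_voxel = voxel_density((0.5, -0.5, -0.5), 1.0, heavy_atoms, sigma=2.5)
```

The voxel containing the atom receives the largest contribution and neighboring voxels receive progressively less, with the decay controlled by 𝜎.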
In GENESIS, the EM biasing force is treated as a kind of “Restraints” (see Restraints section). The force
constant of the biasing potential is given in the [RESTRAINTS] section in a similar manner as the other
restraint potentials, where “functionN = EM” is specified for the restraint type (see below). The unit
of the force constant is kcal/mol. In the cryo-EM flexible fitting, the users add the [EXPERIMENTS]
section in the control file, and specify the following keywords. Note that the [FITTING] section (see
Fitting section) is not related to the cryo-EM flexible fitting.
The flexible fitting can be combined with various methods such as the replica-exchange umbrella-
sampling scheme (REUSfit) [98], all-atom Go-model (MDfit) [99], and GB/SA implicit solvent model.
The method is parallelized with the hybrid MPI+OpenMP scheme in both ATDYN and SPDYN, and
also accelerated with GPGPU calculation in SPDYN [100].
emfit YES / NO
Default : NO
Turn on or off the cryo-EM flexible fitting.
emfit_target Character
Default : N/A
The file name of the target EM density map. The available file format is MRC/CCP4 (ver.
2000 or later) or SITUS (https://situs.biomachina.org/), which is automatically selected ac-
cording to the file extension. The file extension should be “.map”, “.mrc”, or “.ccp4” for
MRC/CCP4, and “.sit” for SITUS.
emfit_sigma Real
Default : 2.5 (unit : Å)
Resolution parameter of the simulated map. This is usually set to half of the resolution
of the target map. For example, if the target map resolution is 5 Å, “emfit_sigma=2.5” is a
reasonable choice.
emfit_tolerance Real
Default : 0.001
This variable determines the tail length of the Gaussian function. For example, if
“emfit_tolerance=0.001” is specified, the Gaussian function is truncated to zero when it is less
than 0.1% of the maximum value. A smaller value requires a larger computational cost.
emfit_zero_threshold Real
Default : 0.0
This variable determines a threshold to set zero-densities in the target EM map. If the density
in a voxel of the target map is under a given “emfit_zero_threshold”, the density is set to zero.
emfit_period Integer
Default : 1
Update frequency of the EM biasing force. In the case of “emfit_period=1”, the force is
updated every step (slow but accurate).
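To see why a smaller emfit_tolerance costs more, one can estimate the distance at which the Gaussian drops below the tolerance. The sketch assumes the Gaussian form exp[−3r²/(2𝜎²)] from the theory section, with maximum value 1 at r = 0; how GENESIS applies the truncation internally is not described here, so the function is a hypothetical helper rather than part of the program.

```python
import math


def gaussian_cutoff_radius(sigma, tolerance):
    """Distance at which exp(-3 r^2 / (2 sigma^2)) falls to `tolerance`."""
    if not 0.0 < tolerance < 1.0:
        raise ValueError("tolerance must lie strictly between 0 and 1")
    return sigma * math.sqrt(-2.0 * math.log(tolerance) / 3.0)


# Default-like settings: sigma = 2.5 angstrom, tolerance = 0.001
radius_default = gaussian_cutoff_radius(2.5, 0.001)
# A tighter tolerance extends the tail that has to be evaluated
radius_tight = gaussian_cutoff_radius(2.5, 1.0e-6)
```

With these inputs the tail extends to roughly 5.4 Å for a tolerance of 0.001, and further for tighter tolerances; since the number of voxels inside the cutoff grows with the cube of the radius, the cost rises quickly.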
There are some limitations in the cryo-EM flexible fitting with SPDYN. Here, we assume that the users
perform the flexible fitting with explicit solvent under the periodic boundary condition (PBC). In the
PBC, there is a unit cell at the center of the system (red box in Fig. 20.1 left panel), which is surrounded
by 26 image cells. In GENESIS, the center of the unit cell is always at the origin (𝑋, 𝑌, 𝑍) = (0, 0, 0).
Thus, the coordinates of the edges of the unit cell are (𝑋, 𝑌, 𝑍) = 0.5 × (±box_size_x, ± box_size_y, ±
box_size_z). Please keep in mind that the “water box position” of the initial structure does NOT always
correspond to the “unit cell position”. If the user constructed the initial structure without considering
the unit cell position, the center of mass of the system might be largely shifted from the origin, as in Fig.
20.1 right panel, which is generally not a problem in typical MD simulations.
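However, as shown in Fig. 20.1 left panel, in the flexible fitting with SPDYN, all fitting atoms should
satisfy the following condition due to the parallelization algorithms implemented in SPDYN:
\[
\begin{aligned}
-0.5 \times \mathrm{box\_size\_x} + \mathrm{margin} &< x < 0.5 \times \mathrm{box\_size\_x} - \mathrm{margin} \\
-0.5 \times \mathrm{box\_size\_y} + \mathrm{margin} &< y < 0.5 \times \mathrm{box\_size\_y} - \mathrm{margin} \\
-0.5 \times \mathrm{box\_size\_z} + \mathrm{margin} &< z < 0.5 \times \mathrm{box\_size\_z} - \mathrm{margin}
\end{aligned}
\]
where x, y, and z are the coordinates of each fitting atom. Here, the margin size should be larger than
0.5 × pairlistdist. If the fitting atoms are located outside this region, as shown in Fig. 20.1 right panel,
correct flexible fitting calculations cannot be done. In such cases, the users must translate the center of
mass of the target protein and density map to the origin by using other external tools. For the translation
of the density map, the map2map tool in SITUS is useful. This limitation does not exist in ATDYN.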
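Before starting a fitting run with SPDYN, the position requirement can be checked on the input coordinates. The sketch below assumes that the condition is −0.5 × box_size + margin < coordinate < 0.5 × box_size − margin along each axis, with the unit cell centered at the origin; the function name and the example numbers are hypothetical.

```python
def atoms_outside_fitting_region(coords, box_size, margin):
    """Return indices of atoms violating -L/2 + margin < x < L/2 - margin."""
    violations = []
    for index, position in enumerate(coords):
        for value, length in zip(position, box_size):
            half = 0.5 * length
            if not (-half + margin < value < half - margin):
                violations.append(index)
                break   # one violated axis is enough to flag the atom
    return violations


box = (100.0, 100.0, 100.0)      # box_size_x, box_size_y, box_size_z in angstrom
pairlistdist = 13.5
margin = 7.0                     # chosen larger than 0.5 * pairlistdist = 6.75
fitting_atoms = [(0.0, 0.0, 0.0), (45.0, 0.0, 0.0), (-10.0, 42.9, 0.0)]
bad_atoms = atoms_outside_fitting_region(fitting_atoms, box, margin)
```

A non-empty result indicates that the protein and the density map should be translated together toward the origin before the run.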
20.2 Examples
The following is an example of the cryo-EM flexible fitting using k = 10,000 kcal/mol for the 4.1 Å
resolution map. The other sections are common to the conventional MD simulations.
[SELECTION]
group1 = all and not hydrogen
[RESTRAINTS]
nfunctions = 1
function1 = EM # apply EM biasing potential
constant1 = 10000 # force constant in Eem = k*(1 - c.c.)
select_index1 = 1 # apply force on protein heavy atoms
[EXPERIMENTS]
emfit = YES # perform EM flexible fitting
emfit_target = emd_8623.sit # target EM density map
emfit_sigma = 2.05 # half of the map resolution (4.1 A)
emfit_tolerance = 0.001 # Tolerance for error (0.1%)
emfit_period = 1 # emfit force update period
The following is an example of REUSfit using 8 replicas, where the force constants 100–800 kcal/mol
are assigned to each replica, and exchanged during the simulation (see also REMD section).
[REMD]
dimension = 1
exchange_period = 1000
type1 = RESTRAINT
nreplica1 = 8
rest_function1 = 1
[SELECTION]
group1 = all and not hydrogen
[RESTRAINTS]
nfunctions = 1
function1 = EM
constant1 = 100 200 300 400 500 600 700 800
select_index1 = 1
[EXPERIMENTS]
emfit = YES
emfit_target = target.sit
emfit_sigma = 5
emfit_tolerance = 0.001
emfit_period = 1
Here, we show examples of the log message obtained from the flexible fitting in the NPT ensemble.
In the case of ATDYN, the cross-correlation-coefficient (c.c.) between the experimental and simulated
density maps is displayed in the column “RESTR_CVS001”, if the EM biasing potential is specified in
“function1” in the [RESTRAINTS] section:
TWENTYONE
ALCHEMY SECTION
In the [ALCHEMY] section, the users can specify keywords for the free-energy perturbation (FEP) method,
which is one of the alchemical free energy calculations. The FEP method calculates the free-energy
difference between two states by gradually changing a part of the system from one state to another state.
Since the free energy depends only on the initial and final states, any intermediate states can be chosen,
regardless of whether they are physically realizable or not. If the intermediate states are physically
unrealizable but computationally realizable, the calculation and thermodynamic process are referred to as
the alchemical free-energy calculation and alchemical process, respectively. Using alchemical processes,
the FEP method can be applied to a variety of free-energy calculations, such as solvation free energies,
binding free energies, and free-energy changes upon protein mutations. In particular, absolute binding
free energies can be calculated by gradually vanishing a ligand of interest, while relative binding free
energies can be calculated by gradually changing one ligand to another.
GENESIS can perform various alchemical calculations by implementing dual-topology and
hybrid-topology approaches, soft-core potentials, and lambda-exchange calculations based on REMD.
GPGPU acceleration is also available in FEP simulations. The short-range non-bonded interactions (LJ
and PME real part) are calculated on the GPU, while the long-range interactions (PME reciprocal part)
are calculated on the CPU. Currently, SPDYN supports only CHARMM and AMBER force fields for FEP simulations.
The FEP method is not available in ATDYN. In this section, the FEP functions in GENESIS are briefly
summarized and some examples are shown.
The FEP method calculates the free-energy difference between two states, A and B, using the following
equation.
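\[
\begin{aligned}
\Delta F &= F_{\mathrm{B}} - F_{\mathrm{A}} \\
&= -k_{\mathrm{B}}T \ln \frac{\int dx \, \exp[-\beta U_{\mathrm{B}}(x)]}{\int dx \, \exp[-\beta U_{\mathrm{A}}(x)]} \\
&= -k_{\mathrm{B}}T \ln \frac{\int dx \, \exp[-\beta U_{\mathrm{A}}(x) - \beta \Delta U(x)]}{\int dx \, \exp[-\beta U_{\mathrm{A}}(x)]} \\
&= -k_{\mathrm{B}}T \ln \left\langle \exp[-\beta \Delta U(x)] \right\rangle_{\mathrm{A}} ,
\end{aligned}
\]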
where 𝐹 and 𝑈 are the free energy and potential energy of state A or B, ∆𝑈 is the difference between
𝑈A and 𝑈B , and 𝑥 is the configuration of the system. The bracket in the final line represents the
ensemble average at state A. This equation means that ∆𝐹 can be estimated by sampling only equilibrium
configurations of state A. However, if the difference between states A and B is large, there is little
overlap of energy distributions (Left of Fig. 21.1) and the configurations at state B are poorly sampled
by simulations at the state A, leading to large statistical errors. To reduce the errors, 𝑛 − 2 intermediate
states are inserted between states A and B to overlap energy distributions (Right of Fig. 21.1).
The potential energy of the intermediate state 𝑖 is defined by 𝑈 (𝜆𝑖 ) = (1 − 𝜆𝑖 )𝑈A + 𝜆𝑖 𝑈B , where 𝜆𝑖 is
the scaling parameter for connecting the initial and final states. By changing 𝜆𝑖 from A to B, states A
and B can be connected smoothly. ∆𝐹 can be estimated by calculating the summation of free-energy
changes between adjacent states:
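\[
\begin{aligned}
\Delta F &= \sum_{i=0}^{n-1} \Delta F_i \\
&= -k_{\mathrm{B}}T \sum_{i=0}^{n-1} \ln \left\langle \exp[-\beta (U(\lambda_{i+1}) - U(\lambda_i))] \right\rangle_{\lambda_i} ,
\end{aligned}
\]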
where the subscript 𝜆𝑖 represents the ensemble average at state 𝑖. The states of 𝑖 = 0 and 𝑛 correspond
to state A and B, respectively.
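The window-by-window estimate can be written compactly. The following sketch evaluates the exponential average for each pair of adjacent states and sums the results; it is a minimal illustration with made-up energy differences, and the function names and the log-sum-exp rearrangement (used only for numerical stability) are choices of this example, not of GENESIS.

```python
import math


def exp_average_free_energy(delta_u, beta):
    """-(1/beta) * ln <exp(-beta * dU)> over the samples of one window."""
    exponents = [-beta * du for du in delta_u]
    largest = max(exponents)
    log_mean = largest + math.log(
        sum(math.exp(e - largest) for e in exponents) / len(exponents)
    )
    return -log_mean / beta


def total_free_energy(windows, beta):
    """Sum the free-energy changes between adjacent lambda states."""
    return sum(exp_average_free_energy(samples, beta) for samples in windows)


K_B = 0.0019872041                  # Boltzmann constant in kcal/(mol K)
beta = 1.0 / (K_B * 300.0)          # inverse temperature at 300 K

# Made-up samples of U(lambda_{i+1}) - U(lambda_i), in kcal/mol, for two windows
windows = [[0.5, 0.5, 0.5, 0.5], [0.25, 0.25, 0.25, 0.25]]
delta_f = total_free_energy(windows, beta)
```

When the energy difference fluctuates within a window, the exponential average is dominated by the low-energy samples and lies below the plain mean of ∆𝑈, which is why poor overlap between adjacent states leads to large statistical errors.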
Fig. 21.1: Insertion of intermediate states. Some intermediate states are inserted between the reference
state and the target state to overlap energy distributions.
One of the most important applications of the FEP method is the calculation of protein-ligand binding
affinity, which represents how strongly a ligand binds to a protein. In drug discovery, it is required to
find a ligand that best binds to the target protein from a large number of chemical compounds. The
difference between binding affinities of two ligands, called the relative binding affinity, can be calculated
by changing one ligand into another ligand during the simulation. For example, consider the mutation
from benzene to phenol (Fig. 21.2 (a)). Benzene and phenol correspond to states A and B, respectively.
The atoms of benzene except for a hydrogen atom are common to both ligands, which have no need to be
perturbed. On the other hand, the H atom of benzene and the OH atoms of phenol are different in not only
their force field parameters but also their topology. To minimize perturbation and treat the topological
difference, topologies of two ligands are unified such that the atoms with different topologies connect
with the common atoms (Fig. 21.2 (b)). This topology is called the dual topology, which consists of the
common atoms, the atoms included in only state A (dualA in Fig. 21.2 (b)), and the atoms included in
only state B (dualB in Fig. 21.2 (b)) [101, 102, 103]. The perturbation is applied to only the dualA and
dualB parts.
Fig. 21.2: Dual topology approach. (a) Benzene and phenol. Common atoms, H atom of benzene, and
OH atoms of phenol are in black, cyan, and green, respectively. (b) Dual topology of benzene and phenol.
(c) The alchemical transformation from benzene to phenol.
The free-energy change upon the mutation can be calculated by gradually switching the interactions of
the dual-topology part from benzene to phenol (Fig. 21.2 (c)). At state A, only the H atom exists in the
dual-topology part, while the OH atoms do not interact with the other atoms in the system. During the
alchemical transformation, the H atom gradually disappears, whereas the OH atoms gradually appear.
At state B, only the OH atoms exist in the dual-topology part and interact with the other atoms. The
non-bonded potential energy is modified to connect state A smoothly to state B by introducing 𝜆LJ and
𝜆elec :
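where “common”, “dualA”, “dualB”, and “other” in the superscripts respectively represent the common
atoms, the atoms existing only at state A, the atoms existing only at state B, and the other molecules
including solvent molecules, proteins, or other ligands. For example, $U_{\mathrm{LJ}}^{\text{common-dualA}}$ represents the LJ
interaction between a common atom and a dualA atom. The potential energy at $\lambda_{\mathrm{LJ}}^{A} = 1$, $\lambda_{\mathrm{elec}}^{A} = 1$,
$\lambda_{\mathrm{LJ}}^{B} = 0$, and $\lambda_{\mathrm{elec}}^{B} = 0$ corresponds to that of state A, while the energy at $\lambda_{\mathrm{LJ}}^{A} = 0$, $\lambda_{\mathrm{elec}}^{A} = 0$,
$\lambda_{\mathrm{LJ}}^{B} = 1$, and $\lambda_{\mathrm{elec}}^{B} = 1$ corresponds to that of state B. By gradually changing $\lambda_{\mathrm{LJ}}^{A}$, $\lambda_{\mathrm{elec}}^{A}$, $\lambda_{\mathrm{LJ}}^{B}$,
and $\lambda_{\mathrm{elec}}^{B}$, states A and B can be connected smoothly. The values of $\lambda_{\mathrm{LJ}}^{A}$, $\lambda_{\mathrm{LJ}}^{B}$, $\lambda_{\mathrm{elec}}^{A}$,
and $\lambda_{\mathrm{elec}}^{B}$ can be specified by lambljA, lambljB, lambelA, and lambelB, respectively. An example is
shown below.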
[ALCHEMY]
fep_topology = Dual
dualA = 1 # group1 in [SELECTION]
dualB = 2 # group2 in [SELECTION]
lambljA = 1.00 0.75 0.50 0.25 0.00
[SELECTION]
group1 = ai:1 # atoms in dual A
group2 = ai:2-3 # atoms in dual B
In the [SELECTION] section, the H atom of benzene and the OH atoms of phenol are selected by groups 1 and 2,
respectively. The group IDs are specified as dualA = 1 and dualB = 2.
In the dual-topology approach, the force field parameters of common atoms are assumed to be the same
in both states. However, in general, they differ from each other. Fig. 21.3 (a) shows that the charge
distribution of the benzene ring of benzene is different from that of phenol. The parameters of the common
atoms for bond, angle, and dihedral are also different between two molecules. To treat the difference of
the force field parameters, the parts of the molecules with the same topology are superimposed (Fig. 21.3
(b)) [104]. The superimposed part has a single topology, in which the parts corresponding to states A and
B are referred to as “singleA” and “singleB”, respectively. During FEP simulations, the single-topology
part does not change its topology, but its force field parameters (charge, LJ, and internal bond) are grad-
ually changed from state A to state B (Fig. 21.3 (c)). In contrast, the other part has a dual topology, in
which “dualA” and “dualB” correspond to states A and B, respectively. In the dual-topology part, their
topology is changed as well as their parameters (Fig. 21.3 (c)).
Fig. 21.3: Hybrid topology approach. (a) Benzene and phenol. H atom of benzene and OH atoms of
phenol are in cyan and green, respectively. The point charges on common atoms are shown in red and
magenta, which are determined using Amber Tools [1]. (b) Hybrid topology of benzene and phenol. (c)
The alchemical transformation from benzene to phenol.
In the hybrid topology approach, the potential energy is scaled by 𝜆LJ , 𝜆elec , and 𝜆bond :
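\[
\begin{aligned}
U_{\mathrm{nonbond}} ={}& U_{\mathrm{LJ}}^{\text{other-other}} + U_{\mathrm{elec}}^{\text{other-other}} \\
&+ \lambda_{\mathrm{LJ}}^{A} \left( U_{\mathrm{LJ}}^{\text{other-singleA}} + U_{\mathrm{LJ}}^{\text{other-dualA}} + U_{\mathrm{LJ}}^{\text{singleA-singleA}} + U_{\mathrm{LJ}}^{\text{singleA-dualA}} + U_{\mathrm{LJ}}^{\text{dualA-dualA}} \right) \\
&+ \lambda_{\mathrm{LJ}}^{B} \left( U_{\mathrm{LJ}}^{\text{other-singleB}} + U_{\mathrm{LJ}}^{\text{other-dualB}} + U_{\mathrm{LJ}}^{\text{singleB-singleB}} + U_{\mathrm{LJ}}^{\text{singleB-dualB}} + U_{\mathrm{LJ}}^{\text{dualB-dualB}} \right) \\
&+ \lambda_{\mathrm{elec}}^{A} \left( U_{\mathrm{elec}}^{\text{other-singleA}} + U_{\mathrm{elec}}^{\text{other-dualA}} + U_{\mathrm{elec}}^{\text{singleA-singleA}} + U_{\mathrm{elec}}^{\text{singleA-dualA}} + U_{\mathrm{elec}}^{\text{dualA-dualA}} \right) \\
&+ \lambda_{\mathrm{elec}}^{B} \left( U_{\mathrm{elec}}^{\text{other-singleB}} + U_{\mathrm{elec}}^{\text{other-dualB}} + U_{\mathrm{elec}}^{\text{singleB-singleB}} + U_{\mathrm{elec}}^{\text{singleB-dualB}} + U_{\mathrm{elec}}^{\text{dualB-dualB}} \right)
\end{aligned}
\]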
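where “singleA”, “singleB”, “dualA”, and “dualB” in the superscripts represent the atoms
in the “singleA”, “singleB”, “dualA”, and “dualB” parts, respectively, and “other” represents
the other molecules including solvent molecules, proteins, or other ligands. The potential energy at
$\lambda_{\mathrm{LJ}}^{A} = 1$, $\lambda_{\mathrm{elec}}^{A} = 1$, $\lambda_{\mathrm{bond}}^{A} = 1$, $\lambda_{\mathrm{LJ}}^{B} = 0$, $\lambda_{\mathrm{elec}}^{B} = 0$, and $\lambda_{\mathrm{bond}}^{B} = 0$ corresponds to that of state A, while
the energy at $\lambda_{\mathrm{LJ}}^{A} = 0$, $\lambda_{\mathrm{elec}}^{A} = 0$, $\lambda_{\mathrm{bond}}^{A} = 0$, $\lambda_{\mathrm{LJ}}^{B} = 1$, $\lambda_{\mathrm{elec}}^{B} = 1$, and $\lambda_{\mathrm{bond}}^{B} = 1$ corresponds to that of state B.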
[ALCHEMY]
fep_topology = Hybrid
singleA = 1 # group1 in [SELECTION]
singleB = 2 # group2 in [SELECTION]
dualA = 3 # group3 in [SELECTION]
dualB = 4 # group4 in [SELECTION]
lambljA = 1.00 0.75 0.50 0.25 0.00
lambljB = 0.00 0.25 0.50 0.75 1.00
lambelA = 1.00 0.75 0.50 0.25 0.00
lambelB = 0.00 0.25 0.50 0.75 1.00
lambbondA = 1.00 0.75 0.50 0.25 0.00
lambbondB = 0.00 0.25 0.50 0.75 1.00
[SELECTION]
group1 = ai:1-11 # atoms in single A
group2 = ai:13-23 # atoms in single B
group3 = ai:12 # atoms in dual A
group4 = ai:24-25 # atoms in dual B
In this example, five sets of the lambda values are used to connect state A to state B. In the [SELECTION]
section, the benzene ring of benzene, the benzene ring of phenol, the H atom of benzene, and the OH
atoms of phenol are selected by group 1, 2, 3, and 4, respectively. The group IDs are specified as singleA
= 1, singleB = 2, dualA = 3, and dualB = 4.
Close to the end points of alchemical calculations (𝜆LJ = 0 or 1), overlaps between perturbed atoms or
between perturbed and non-perturbed atoms occur, causing large energy changes. The system becomes
unstable due to the overlaps and the simulation might stop, which is called the end point
catastrophe. To avoid the catastrophe, the soft core is introduced to the LJ potential [105]:
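\[
U_{\mathrm{LJ}}(r_{ij}, \lambda_{\mathrm{LJ}}) = 4 \lambda_{\mathrm{LJ}} \epsilon \left[ \left( \frac{\sigma^2}{r_{ij}^2 + \alpha_{\mathrm{sc}} (1 - \lambda_{\mathrm{LJ}})} \right)^{6} - \left( \frac{\sigma^2}{r_{ij}^2 + \alpha_{\mathrm{sc}} (1 - \lambda_{\mathrm{LJ}})} \right)^{3} \right] ,
\]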
where 𝛼sc is the parameter for the soft-core potential. In the potential, $r_{ij}^2$ is shifted by $\alpha_{\mathrm{sc}} (1 - \lambda_{\mathrm{LJ}})$,
which weakens the repulsive part in the LJ potential when 𝜆LJ approaches 0 (Fig. 21.4). Since the
soft-core potential corresponds to the original LJ potential at the end points of 𝜆LJ :
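\[
U_{\mathrm{LJ}}(r_{ij}, \lambda_{\mathrm{LJ}} = 1) = 4 \epsilon \left[ \left( \frac{\sigma}{r_{ij}} \right)^{12} - \left( \frac{\sigma}{r_{ij}} \right)^{6} \right] ,
\]
\[
U_{\mathrm{LJ}}(r_{ij}, \lambda_{\mathrm{LJ}} = 0) = 0 ,
\]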
the soft-core modification in the LJ potential does not affect the free-energy calculation. 𝛼sc can be
specified by a keyword sc_alpha in the GENESIS control file.
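The behavior of the soft-core LJ form at the end points, and at zero separation, can be checked numerically. The sketch below transcribes the formula above directly; the function names and the parameter values (ε, σ, and 𝛼sc = 5.0 as in the examples of this chapter) are used purely for illustration.

```python
import math


def softcore_lj(r, lam, epsilon, sigma, alpha_sc):
    """Soft-core LJ energy with r^2 shifted by alpha_sc * (1 - lambda_LJ)."""
    shifted = r * r + alpha_sc * (1.0 - lam)
    if shifted == 0.0:
        raise ValueError("r = 0 is singular when lambda_LJ = 1")
    ratio = sigma * sigma / shifted
    return 4.0 * lam * epsilon * (ratio ** 6 - ratio ** 3)


def plain_lj(r, epsilon, sigma):
    """Standard 12-6 Lennard-Jones energy."""
    ratio6 = (sigma / r) ** 6
    return 4.0 * epsilon * (ratio6 * ratio6 - ratio6)


epsilon, sigma, alpha_sc = 0.1, 3.4, 5.0
at_state_on = softcore_lj(3.5, 1.0, epsilon, sigma, alpha_sc)    # equals plain LJ
at_state_off = softcore_lj(3.5, 0.0, epsilon, sigma, alpha_sc)   # vanishes
overlap = softcore_lj(0.0, 0.5, epsilon, sigma, alpha_sc)        # finite at r = 0
```

At intermediate 𝜆LJ the energy of two fully overlapping atoms stays finite, which is what prevents the end point catastrophe, while the two end states are unchanged.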
In GENESIS, the soft core is also applied to the electrostatic potential [106]:
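\[
U_{\mathrm{elec}}(r_{ij}, \lambda_{\mathrm{elec}}) = \lambda_{\mathrm{elec}} \frac{q_i q_j \, \mathrm{erfc}\left( \alpha \sqrt{r_{ij}^2 + \beta_{\mathrm{sc}} (1 - \lambda_{\mathrm{elec}})} \right)}{\epsilon \sqrt{r_{ij}^2 + \beta_{\mathrm{sc}} (1 - \lambda_{\mathrm{elec}})}} + \lambda_{\mathrm{elec}} \, (\text{PME reciprocal and self terms}) ,
\]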
where 𝛽sc is the parameter for the electrostatic soft-core potential. In the potential, $r_{ij}^2$ is also shifted by
$\beta_{\mathrm{sc}} (1 - \lambda_{\mathrm{elec}})$, as in the LJ soft-core potential, which softens disruptions due to overlaps of point charges.
This soft-core potential is almost the same as that used in Amber [106]. 𝛽sc can be specified by a keyword
sc_beta in the GENESIS control file.
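The real-space part of the electrostatic soft-core term can be sketched in the same way. Here α is taken to be the Ewald screening parameter of PME (an assumption, since the symbol is not defined next to the formula), the unit-conversion constant is omitted as in the formula, and the reciprocal and self terms are left out; the function name and the numbers are illustrative only.

```python
import math


def softcore_elec_real(r, lam, qi, qj, alpha, beta_sc, dielectric=1.0):
    """Real-space soft-core electrostatic energy (reciprocal/self terms omitted)."""
    shifted_r = math.sqrt(r * r + beta_sc * (1.0 - lam))
    if shifted_r == 0.0:
        raise ValueError("r = 0 is singular when lambda_elec = 1")
    return lam * qi * qj * math.erfc(alpha * shifted_r) / (dielectric * shifted_r)


qi, qj, alpha, beta_sc = 0.4, -0.4, 0.35, 0.5
fully_on = softcore_elec_real(3.0, 1.0, qi, qj, alpha, beta_sc)
unshifted = qi * qj * math.erfc(alpha * 3.0) / 3.0     # ordinary real-space term
on_top = softcore_elec_real(0.0, 0.5, qi, qj, alpha, beta_sc)
```

As with the LJ term, the shift disappears at 𝜆elec = 1, so the end state is the ordinary PME real-space interaction, and overlapping point charges give a finite energy at intermediate 𝜆elec.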
21.1.5 FEP/𝜆-REMD
To enhance the sampling efficiency, the FEP simulations at different 𝜆 values are coupled using the
Hamiltonian replica exchange method [69, 70]. This method is called FEP/𝜆-REMD or 𝜆-exchange FEP
[74]. Replicas run in parallel and exchange their parameters at fixed intervals during the simulation.
The exchanges between adjacent replicas are accepted or rejected according to the Metropolis criterion. In
FEP/𝜆-REMD simulations, the [REMD] section is also required. type1 = alchemy is set to exchange the
lambda values. The following is an example of the control file for the FEP/𝜆-REMD simulation.
[REMD]
dimension = 1
exchange_period = 500
type1 = alchemy
nreplica1 = 5
[ALCHEMY]
fep_topology = Hybrid
singleA = 1 # group1 in [SELECTION]
singleB = 2 # group2 in [SELECTION]
dualA = 3 # group3 in [SELECTION]
dualB = 4 # group4 in [SELECTION]
lambljA = 1.00 0.75 0.50 0.25 0.00
lambljB = 0.00 0.25 0.50 0.75 1.00
lambelA = 1.00 0.75 0.50 0.25 0.00
lambelB = 0.00 0.25 0.50 0.75 1.00
lambbondA = 1.00 0.75 0.50 0.25 0.00
lambbondB = 0.00 0.25 0.50 0.75 1.00
[SELECTION]
group1 = ai:1-11 # atoms in single A
group2 = ai:13-23 # atoms in single B
group3 = ai:12 # atoms in dual A
group4 = ai:24-25 # atoms in dual B
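The acceptance test for a lambda exchange can be sketched as follows. It uses the standard Metropolis form for Hamiltonian replica exchange at a common temperature, which is an assumption about the details; the variable naming (u_ab is the energy of the configuration held by replica b evaluated with the lambda set of replica a) and the function names are choices of this example.

```python
import math
import random


def exchange_probability(u_ii, u_jj, u_ij, u_ji, beta):
    """Metropolis acceptance probability for swapping lambda sets i and j."""
    delta = beta * ((u_ij + u_ji) - (u_ii + u_jj))
    return 1.0 if delta <= 0.0 else math.exp(-delta)


def attempt_exchange(u_ii, u_jj, u_ij, u_ji, beta, rng=random.random):
    """Return True when the exchange is accepted."""
    return rng() < exchange_probability(u_ii, u_jj, u_ij, u_ji, beta)


# Illustrative energies in reduced units (beta = 1)
probability = exchange_probability(-10.0, -12.0, -11.0, -10.0, beta=1.0)
accepted = attempt_exchange(-10.0, -12.0, -11.0, -10.0, beta=1.0, rng=lambda: 0.2)
```

Exchanges that lower the combined energy are always accepted, and the others are accepted with a probability that decays exponentially with the energy penalty, so closely spaced lambda values exchange more often.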
lambljB Real
Default: 1.0
Scaling parameters for Lennard-Jones interactions in state B ($\lambda_{\mathrm{LJ}}^{B}$).
lambelA Real
Default: 1.0
Scaling parameters for electrostatic interactions in state A ($\lambda_{\mathrm{elec}}^{A}$).
lambelB Real
Default: 1.0
Scaling parameters for electrostatic interactions in state B ($\lambda_{\mathrm{elec}}^{B}$).
lambbondA Real
Default: 1.0
Scaling parameters for bonded interactions in state A ($\lambda_{\mathrm{bond}}^{A}$).
lambbondB Real
Default: 1.0
Scaling parameters for bonded interactions in state B ($\lambda_{\mathrm{bond}}^{B}$).
lambrest Real
Default: 1.0
Scaling parameters for restraint interactions ($\lambda_{\mathrm{rest}}$).
fep_md_type Serial / Single / Parallel
Default: Serial
Type of FEP simulation. If fep_md_type = Serial, FEP simulations are performed with
changing lambda values specified in lambljA, lambljB, lambelA, etc. For example, if 0.0,
0.5, and 1.0 are specified in lambljA, GENESIS first performs the FEP simulation with
lambljA = 0.0, subsequently performs the FEP simulation with lambljA = 0.5, and finally
performs the FEP simulation with lambljA = 1.0. If fep_md_type = Single, a FEP sim-
ulation is performed with the lambda window specified in ref_lambid. If fep_md_type =
Parallel, each lambda window is simulated in parallel. In this case, [REMD] section must
be specified.
ref_lambid Integer
Default: 0
Reference window id for a single FEP MD simulation. If fep_md_type = Single,
ref_lambid must be specified.
21.3 Examples
Example of a calculation of the solvation free energy of a ligand. The solvation free energy corresponds to
the free-energy change upon the transfer of the ligand from vacuum to solvent. In state A (= in solvent)
the ligand fully interacts with solvent molecules, whereas in state B (= in vacuum) those interactions
vanish. To perform such a calculation, the dual topology is employed, and dualA is set to the group ID
of the selected ligand, while dualB is set to NONE. dualB = NONE means that there is no ligand in the
system at state B. lambljA, lambljB, lambelA, and lambelB should be zero at state B.
[ALCHEMY]
fep_direction = BothSides
fep_topology = Dual
singleA = NONE
singleB = NONE
dualA = 1
dualB = NONE
fepout_period = 500
equilsteps = 0
sc_alpha = 5.0
sc_beta = 0.5
lambljA = 1.000 1.000 1.000 1.000 1.000 0.750 0.500 0.250 0.000
lambljB = 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000
lambelA = 1.000 0.750 0.500 0.250 0.000 0.000 0.000 0.000 0.000
lambelB = 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000
lambbondA = 1.000 1.000 1.000 1.000 1.000 1.000 1.000 1.000 1.000
lambbondB = 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000
lambrest = 1.000 1.000 1.000 1.000 1.000 1.000 1.000 1.000 1.000
[SELECTION]
group1 = segid:LIG
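In the schedule above, the electrostatic interactions of the ligand are switched off over the first five windows while the LJ interactions stay fully on, and the LJ interactions are switched off afterwards. The sketch below checks that ordering for a pair of lambda lists. The rule it tests (no electrostatics while LJ is scaled below 1) is a property of this example and a common convention, not a requirement stated by GENESIS, and the function name is hypothetical.

```python
def elec_off_before_lj(lamblj, lambel):
    """True if electrostatics are zero in every window where LJ is scaled down."""
    if len(lamblj) != len(lambel):
        raise ValueError("lambda lists must have the same number of windows")
    return all(elec == 0.0 for lj, elec in zip(lamblj, lambel) if lj < 1.0)


# Values of lambljA and lambelA from the solvation free-energy example
lambljA = [1.000, 1.000, 1.000, 1.000, 1.000, 0.750, 0.500, 0.250, 0.000]
lambelA = [1.000, 0.750, 0.500, 0.250, 0.000, 0.000, 0.000, 0.000, 0.000]

schedule_ok = elec_off_before_lj(lambljA, lambelA)
n_windows = len(lambljA)
```

The same check fails for a schedule that scales both interactions simultaneously, which makes it a quick guard against accidentally mixing the two conventions in one input.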
Example of an alchemical transformation between two ligands using serial FEP simulations. When the
[REMD] section is not specified and more than one lambda value is specified, GENESIS performs
serial calculations by changing lambda values. If fep_direction is BothSides, lambljA, lambljB,
lambelA, lambelB, lambbondA, and lambbondB are first set to the leftmost values, which are “1.00”,
“0.00”, “1.00”, “0.00”, “1.00”, and “0.00”, respectively, in the example below. After the FEP simulation
of equilsteps + nsteps steps is performed with this set of lambda values, the lambda values are
changed to the second values from the left. In this way, GENESIS performs FEP simulations while changing
the lambda values. When the FEP simulation with the rightmost values of lambda finishes, GENESIS stops
the calculation.
[ALCHEMY]
fep_direction = BothSides
fep_topology = Hybrid
singleA = 1 # group1 in [SELECTION]
singleB = 2 # group2 in [SELECTION]
dualA = 3 # group3 in [SELECTION]
dualB = 4 # group4 in [SELECTION]
fepout_period = 500
equilsteps = 0
[SELECTION]
group1 = ai:1-11 # atoms in single A
group2 = ai:13-23 # atoms in single B
group3 = ai:12 # atoms in dual A
group4 = ai:24-25 # atoms in dual B
Example of an alchemical transformation between two ligands at a set of lambda values. If the user
wants to perform an FEP simulation with a specified set of lambda values, set fep_md_type to Single and
assign the ID of the set of lambda values to ref_lambid. In the following example, ref_lambid is set to 3,
which selects the third column of the lambda values: lambljA = 0.5, lambljB = 0.5, lambelA = 0.5,
lambelB = 0.5, lambbondA = 0.5, and lambbondB = 0.5. If fep_direction = BothSides, the energy
differences between the “ref_lambid”-th and “ref_lambid -1”-th columns and between the “ref_lambid”-th
and “ref_lambid +1”-th columns are written to the fepout file. By using this function, the user can
independently perform FEP simulations with different lambda values in parallel.
[ALCHEMY]
fep_direction = BothSides
fep_topology = Hybrid
fep_md_type = Single
ref_lambid = 3
singleA = 1 # group1 in [SELECTION]
singleB = 2 # group2 in [SELECTION]
dualA = 3 # group3 in [SELECTION]
dualB = 4 # group4 in [SELECTION]
fepout_period = 500
equilsteps = 0
sc_alpha = 5.0
sc_beta = 5.0
lambljA = 1.00 0.75 0.50 0.25 0.00
lambljB = 0.00 0.25 0.50 0.75 1.00
lambelA = 1.00 0.75 0.50 0.25 0.00
lambelB = 0.00 0.25 0.50 0.75 1.00
lambbondA = 1.00 0.75 0.50 0.25 0.00
lambbondB = 0.00 0.25 0.50 0.75 1.00
[SELECTION]
group1 = ai:1-11 # atoms in single A
group2 = ai:13-23 # atoms in single B
group3 = ai:12 # atoms in dual A
group4 = ai:24-25 # atoms in dual B
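The column selection made by ref_lambid can be illustrated with a small parser. It reads the lamb* lines of an [ALCHEMY] block and returns the values of one column, counting columns from 1 as in the description above. The parser is a hypothetical helper for checking an input by hand; it is not part of GENESIS and ignores every other keyword.

```python
def parse_lambda_lines(text):
    """Collect 'lamb* = v1 v2 ...' lines into {keyword: [floats]}."""
    schedule = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()      # drop trailing comments
        if "=" not in line:
            continue
        key, values = (part.strip() for part in line.split("=", 1))
        if key.startswith("lamb"):
            schedule[key] = [float(v) for v in values.split()]
    return schedule


def select_window(schedule, ref_lambid):
    """Return the lambda set in column ref_lambid (columns counted from 1)."""
    return {key: values[ref_lambid - 1] for key, values in schedule.items()}


control_text = """
lambljA = 1.00 0.75 0.50 0.25 0.00
lambljB = 0.00 0.25 0.50 0.75 1.00
lambelA = 1.00 0.75 0.50 0.25 0.00
lambelB = 0.00 0.25 0.50 0.75 1.00
lambbondA = 1.00 0.75 0.50 0.25 0.00
lambbondB = 0.00 0.25 0.50 0.75 1.00
"""

schedule = parse_lambda_lines(control_text)
reference = select_window(schedule, 3)     # the window simulated
previous = select_window(schedule, 2)      # neighbor used with BothSides
following = select_window(schedule, 4)     # neighbor used with BothSides
```

Running one such job per column, each with its own ref_lambid, gives the independent windows mentioned above.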
Example of an alchemical transformation between two ligands using a parallel FEP simulation. When
the [REMD] section is specified, GENESIS performs the FEP/𝜆-REMD simulation. Each lambda value
in lambljA, lambljB, lambelA, lambelB, lambbondA, and lambbondB is assigned to each replica, and
the FEP simulation in each replica is performed in parallel. The lambda values are exchanged at fixed
intervals specified by exchange_period during the simulation.
[REMD]
dimension = 1
exchange_period = 1000
type1 = alchemy
nreplica1 = 5
[ALCHEMY]
fep_direction = BothSides
fep_topology = Hybrid
singleA = 1 # group1 in [SELECTION]
singleB = 2 # group2 in [SELECTION]
dualA = 3 # group3 in [SELECTION]
dualB = 4 # group4 in [SELECTION]
fepout_period = 500
equilsteps = 0
sc_alpha = 5.0
sc_beta = 5.0
lambljA = 1.00 0.75 0.50 0.25 0.00
lambljB = 0.00 0.25 0.50 0.75 1.00
lambelA = 1.00 0.75 0.50 0.25 0.00
lambelB = 0.00 0.25 0.50 0.75 1.00
lambbondA = 1.00 0.75 0.50 0.25 0.00
lambbondB = 0.00 0.25 0.50 0.75 1.00
[SELECTION]
group1 = ai:1-11 # atoms in single A
group2 = ai:13-23 # atoms in single B
group3 = ai:12 # atoms in dual A
group4 = ai:24-25 # atoms in dual B
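In this example every lambda list has five entries, the same number as nreplica1. The following shell sketch checks that consistency before a run. The function names are hypothetical, and the requirement that the counts match exactly is an assumption drawn from the example rather than a documented rule.

```shell
#!/bin/sh
# Hypothetical pre-run check (not a GENESIS tool).
# count_values LIST: print the number of whitespace-separated entries.
count_values() {
    echo "$1" | awk '{ print NF }'
}

# check_lambda_lists NREPLICA LIST...: succeed only if every list has
# exactly NREPLICA entries (assumed rule: one lambda column per replica).
check_lambda_lists() {
    nrep="$1"
    shift
    for list in "$@"; do
        [ "$(count_values "$list")" -eq "$nrep" ] || return 1
    done
    return 0
}

nreplica1=5
lambljA="1.00 0.75 0.50 0.25 0.00"
lambljB="0.00 0.25 0.50 0.75 1.00"

if check_lambda_lists "$nreplica1" "$lambljA" "$lambljB"; then
    echo "lambda lists match nreplica1 = $nreplica1"
else
    echo "lambda lists do not match nreplica1 = $nreplica1"
fi
```

The same call can be extended with lambelA, lambelB, lambbondA, and lambbondB as additional arguments.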
CHAPTER
TWENTYTWO
TROUBLESHOOTING
The following are representative error messages that users frequently encounter during
simulations. For each message, we describe possible causes and suggest how to solve the
problem.
Compute_Shake> SHAKE algorithm failed to converge
This message indicates that the rigid-bond constraints imposed by the SHAKE algorithm (see the
Constraints section) failed to converge. In most cases, SHAKE errors
originate from insufficient equilibration, a bad initial structure, or bad input parameters. We
recommend that users check the following points:
• Reconsider the equilibration scheme. A gentler equilibration might be needed.
For example, heating the system from 0 K, using a shorter timestep (e.g., 1.0 fs), or
performing a longer energy minimization are possible solutions.
• Check the initial structure very carefully. A frequent mistake in initial-structure
modeling is “ring penetration”, in which a covalent bond passes
through an aromatic ring. Resolve the ring penetration first, and then
try the simulation again.
• Check the force field parameters. Missing or wrong parameters can easily cause unstable
simulations.
Check_Atom_Coord> Some atoms have large clashes
This message indicates that there is an atom pair whose distance is zero or close to zero.
The indices of those atoms and their distance are displayed in a warning message: “WARNING: too
short distance:”. This situation is not allowed, especially in SPDYN, since it can cause a
numerical error in the lookup table method. Check the initial structure first. Even if you
cannot see such atomic clashes, there may be a clash between atoms in the unit cell and atoms in the
image cells under periodic boundary conditions. One automatic solution
is to specify “contact_check = YES” in the control file (see the Energy section). However, this
does not work well if the distance is exactly zero. In such cases, users must solve the problem
themselves, for example by slightly moving the clashing atoms
manually, specifying a larger or smaller box size, or rebuilding the initial structure more carefully.
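A clash with a periodic image is easy to miss in a molecular viewer. As a rough illustration (not the check that GENESIS itself performs), the shell sketch below lists atom pairs that are closer than a threshold under the minimum-image convention. The input format (“index x y z” per line), the cubic box, and the function name find_clashes are all assumptions made for this example.

```shell
#!/bin/sh
# find_clashes BOX TOL: read "index x y z" lines from standard input and
# print "i j distance" for every pair closer than TOL in a cubic box of
# edge length BOX (same length unit), using the minimum-image convention.
find_clashes() {
    awk -v box="$1" -v tol="$2" '
        # Wrap a coordinate difference into the range [-box/2, box/2].
        function wrap(d) {
            return d - box * int(d / box + (d >= 0 ? 0.5 : -0.5))
        }
        { id[NR] = $1; x[NR] = $2; y[NR] = $3; z[NR] = $4 }
        END {
            for (i = 1; i < NR; i++) {
                for (j = i + 1; j <= NR; j++) {
                    dx = wrap(x[i] - x[j])
                    dy = wrap(y[i] - y[j])
                    dz = wrap(z[i] - z[j])
                    r = sqrt(dx * dx + dy * dy + dz * dz)
                    if (r < tol) printf "%s %s %.3f\n", id[i], id[j], r
                }
            }
        }'
}

# Atoms 1 and 2 sit on opposite faces of a box of edge 50, so they clash
# only through the periodic image.
printf '1 0.1 25.0 25.0\n2 49.9 25.0 25.0\n3 25.0 25.0 25.0\n' | find_clashes 50 0.5
```

The double loop is quadratic in the number of atoms, so the sketch is only practical for a small selection of atoms, such as those near the box faces.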
Setup_Processor_Number> Cannot define domains and cells. Smaller MPI processors, or shorter
pairlistdist, or larger boxsize should be used
This message indicates that the total number of MPI processors used in your calculation is not
appropriate for your system. Users should understand the relation between the system
size and the number of MPI processors. In SPDYN, the system is divided into several domains
for parallel computation, where the number of domains must be equal to the number of MPI
processors (see Available Programs). In most cases, this message tells you that the system
could not be divided into the specified number of domains. There are three main
solutions to this problem, of which the first is the most recommended:
• Use a smaller number of MPI processors. If this works, the previous number was too
large for the system.
• Use a shorter pairlistdist. This can make the domain size smaller, allowing
a larger number of MPI processors to be used. However, this is not recommended if you are
already using a recommended parameter set for switchdist, cutoffdist, and pairlistdist
(e.g., 10, 12, and 13.5 Å in the CHARMM force field).
• Build a larger initial structure by adding solvent molecules to the system, which may
allow the system to be divided into the desired number of domains.
Update_Boundary_Pbc> too small boxsize/pairdist. larger boxsize or shorter pairdist should be
used.
This message indicates that your system is too small to be handled under periodic boundary
conditions. In ATDYN, the cell-linked list method is used to build the non-bonded pairlists, where
the cell size is chosen to be close to, and larger than, the pairlist distance given in the
control file. In addition, the number of cells along each of the x, y, and z dimensions must be at least
three. SPDYN has a similar lower limit on the available box size. Therefore,
to solve this problem, users may have to set a shorter pairlistdist or build a larger system
by adding more solvent molecules.
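The lower limit on the box size can be estimated before running. The sketch below assumes, as a simplification of the description above, that the number of cells along one dimension is the integer part of the box length divided by pairlistdist. The exact rule used inside ATDYN or SPDYN may differ, and the function names are hypothetical.

```shell
#!/bin/sh
# cells_per_dim BOX PAIRLISTDIST: number of cells of size >= PAIRLISTDIST
# that fit along a box edge of length BOX (same length unit).
# This is an assumed rule, not taken from the GENESIS source.
cells_per_dim() {
    awk -v box="$1" -v pl="$2" 'BEGIN { print int(box / pl) }'
}

# box_is_large_enough BX BY BZ PAIRLISTDIST: succeed only if every
# dimension holds at least three cells.
box_is_large_enough() {
    pl="$4"
    for len in "$1" "$2" "$3"; do
        [ "$(cells_per_dim "$len" "$pl")" -ge 3 ] || return 1
    done
    return 0
}

for box in 50.0 40.0; do
    if box_is_large_enough "$box" "$box" "$box" 13.5; then
        echo "box $box: OK"
    else
        echo "box $box: too small for pairlistdist 13.5"
    fi
done
```

Under this assumption, a cubic box of 50 Å holds three cells per dimension with pairlistdist = 13.5 Å, whereas a 40 Å box holds only two and would trigger the error.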
Compute_Energy_Experimental_Restraint_Emfit> Gaussian kernel is extending outside the map
box
This message indicates that the simulated densities were generated outside the target density
map. This error frequently occurs when the atoms to be fitted are located near the edge of
the target density map. Possible solutions are:
• Create a larger density map by adding a sufficient margin to the map, which can be
easily accomplished with the “voledit” tool in SITUS (https://situs.biomachina.org/).
• Run a normal MD simulation with the EM biasing potential turned off (emfit = NO).
If the simulation is not stable, the issue lies in the molecular mechanics calculation
rather than in the biasing potential calculation. In such cases, please check the initial
structure carefully. There might be large clashes between some atoms, which can cause the
target molecule to explode and push some atoms out of the density map. The
problems to be solved are almost the same as those for the SHAKE errors (see above).
Compute_Energy_Restraints_Pos> Positional restraint energy is too big
This message indicates that some restrained atoms have deviated significantly from their
reference positions, suggesting that the restraint is not being applied properly to those atoms.
This situation is not allowed in SPDYN.
• Use a larger force constant to keep the atoms near their reference positions.
• Turn off the positional restraint for such atoms if it is not essential.
CHAPTER
TWENTYTHREE
APPENDIX
In the first sub-section, we explain how to install the requirements using the package manager “apt” in
Ubuntu/Debian. If you want to install them from source, or if you use another Linux
distribution such as CentOS, please see the second sub-section.
OpenMPI
Then, we install OpenMPI. Note that the development version (XXX-dev) should be installed.
LAPACK/BLAS libraries
$ ls /usr/lib/x86_64-linux-gnu/liblapack.*
$ ls /usr/lib/x86_64-linux-gnu/libblas.*
Here, we explain how to install the OpenMPI and LAPACK/BLAS libraries from source. We
assume that GNU compilers are already installed. The following procedures are generally applicable
to typical Linux systems, including CentOS and Red Hat.
OpenMPI
The source code of OpenMPI is available at https://www.open-mpi.org/. The following commands install
OpenMPI 3.1.5 in the user’s local directory “$HOME/Software/mpi” as an example. Here, we use the GNU
compilers (gcc, g++, and gfortran).
$ cd $HOME
$ mkdir Software
$ cd Software
$ mkdir build
$ cd build
$ wget https://download.open-mpi.org/release/open-mpi/v3.1/openmpi-3.1.5.tar.gz
$ tar -xvf openmpi-3.1.5.tar.gz
$ cd openmpi-3.1.5
$ ./configure --prefix=$HOME/Software/mpi CC=gcc CXX=g++ FC=gfortran
$ make all
$ make install
Add the following lines to “~/.bash_profile”, and then reload the file:
MPIROOT=$HOME/Software/mpi
export PATH=$MPIROOT/bin:$PATH
export LD_LIBRARY_PATH=$MPIROOT/lib:$LD_LIBRARY_PATH
export MANPATH=$MPIROOT/share/man:$MANPATH
$ source ~/.bash_profile
If you want to uninstall OpenMPI, just remove the directory “mpi” in “Software”.
LAPACK/BLAS libraries
$ cd $HOME/Software
$ wget http://www.netlib.org/lapack/lapack-3.8.0.tar.gz
$ tar -xvf lapack-3.8.0.tar.gz
$ cd lapack-3.8.0
$ cp make.inc.example make.inc
$ make blaslib
$ make lapacklib
$ ls lib*
liblapack.a librefblas.a
$ ln -s librefblas.a ./libblas.a
Add the following line to “~/.bash_profile”, and then reload the file:
export LAPACK_PATH=$HOME/Software/lapack-3.8.0
$ source ~/.bash_profile
If you want to uninstall LAPACK/BLAS, just remove the directory “lapack-3.8.0” in “Software”.
We recommend that Mac users use “Xcode” for the installation of GENESIS, and also install
“OpenMPI” from source to avoid a “clang” problem (see below).
“Xcode” is available free of charge from the Mac App Store (https://developer.apple.com/xcode/).
After the installation of Xcode, all tasks described below are carried out in the “Terminal”. The “Terminal”
app is in the “Utilities” folder in Applications. Please launch the Terminal. It is almost the same
as the terminal in Linux.
We also recommend installing “Homebrew”, which enables easy installation of various tools such
as compilers. If you have already installed “MacPorts”, you do not need to install “Homebrew”; this avoids a
conflict between “Homebrew” and “MacPorts”. On the Homebrew website (https://brew.sh/), you can find
a long command like “/usr/bin/ruby -e "$(curl -fsSL https://...”. To install Homebrew,
execute that command at the Terminal prompt.
First, we install “gcc”, “autoconf”, “automake”, and other tools via Homebrew:
$ which gcc
/usr/bin/gcc
$ gcc --version
...
Apple LLVM version 10.0.1 (clang-1001.0.46.4)
These messages tell us that “gcc” is installed in the “/usr/bin” directory. However, this gcc is not a “real”
GNU compiler; it is linked to another compiler, “clang”. If you use this gcc for the installation of
OpenMPI, it can cause trouble when compiling GENESIS with certain options. Therefore, you have
to use a “real” GNU compiler, which is installed in “/usr/local/bin”. For example, if you have
installed gcc ver. 9, you can find it as “gcc-9” in “/usr/local/bin”.
$ ls /usr/local/bin/gcc*
/usr/local/bin/gcc-9 /usr/local/bin/gcc-ar-9 ...
$ gcc-9 --version
gcc-9 (Homebrew GCC 9.2.0) 9.2.0
OpenMPI
We then install “OpenMPI”. We specify “real” GNU compilers explicitly in the configure command.
The following commands install OpenMPI in the user’s local directory “$HOME/Software/mpi”.
$ cd $HOME
$ mkdir Software
$ cd Software
$ mkdir build
$ cd build
$ wget https://download.open-mpi.org/release/open-mpi/v3.1/openmpi-3.1.5.tar.gz
$ tar -xvf openmpi-3.1.5.tar.gz
$ cd openmpi-3.1.5
$ ./configure --prefix=$HOME/Software/mpi CC=gcc-9 CXX=g++-9 FC=gfortran-9
$ make all
$ make install
Add the following lines to “~/.bash_profile”, and then reload the file:
MPIROOT=$HOME/Software/mpi
export PATH=$MPIROOT/bin:$PATH
export LD_LIBRARY_PATH=$MPIROOT/lib:$LD_LIBRARY_PATH
export MANPATH=$MPIROOT/share/man:$MANPATH
$ source ~/.bash_profile
Make sure that “mpicc” and “mpif90” are linked to “gcc-9” and “gfortran-9”, respectively.
$ mpicc --version
gcc-9 (Homebrew GCC 9.2.0) 9.2.0
$ mpif90 --version
If you want to uninstall OpenMPI, just remove the directory “mpi” in “Software”.
LAPACK/BLAS libraries
Finally, we install the LAPACK and BLAS libraries. Again, “real” GNU compilers are used for the installation.
$ cd $HOME/Software
$ wget http://www.netlib.org/lapack/lapack-3.8.0.tar.gz
$ tar -xvf lapack-3.8.0.tar.gz
$ cd lapack-3.8.0
$ cp make.inc.example make.inc
$ make blaslib
$ make lapacklib
$ ls lib*
liblapack.a librefblas.a
$ ln -s librefblas.a ./libblas.a
Add the following line to “~/.bash_profile”, and then reload the file:
export LAPACK_PATH=$HOME/Software/lapack-3.8.0
$ source ~/.bash_profile
If you want to uninstall LAPACK/BLAS, just remove the directory “lapack-3.8.0” in “Software”.
[1] D.A. Case, I.Y. Ben-Shalom, S.R. Brozell, D.S. Cerutti, T.E. Cheatham III, V.W.D. Cruzeiro, T.A.
Darden, R.E. Duke, D. Ghoreishi, M.K. Gilson, H. Gohlke, A.W. Goetz, D. Greene, R Harris,
N. Homeyer, S. Izadi, A. Kovalenko, T. Kurtzman, T.S. Lee, S. LeGrand, P. Li, C. Lin, J. Liu,
T. Luchko, R. Luo, D.J. Mermelstein, K.M. Merz, Y. Miao, G. Monard, C. Nguyen, H. Nguyen,
I. Omelyan, A. Onufriev, F. Pan, R. Qi, D.R. Roe, A. Roitberg, C. Sagui, S. Schott-Verdugo,
J. Shen, C.L. Simmerling, J. Smith, R. Salomon-Ferrer, J. Swails, R.C. Walker, J. Wang, H. Wei,
R.M. Wolf, X. Wu, L. Xiao, D.M. York, and P.A. Kollman. Amber18. University of California,
San Francisco, 2018.
[2] B. R. Brooks, C. L. Brooks, A. D. Mackerell, L. Nilsson, R. J. Petrella, B. Roux, Y. Won, G. Ar-
chontis, C. Bartels, S. Boresch, A. Caflisch, L. Caves, Q. Cui, A. R. Dinner, M. Feig, S. Fischer,
J. Gao, M. Hodoscek, W. Im, K. Kuczera, T. Lazaridis, J. Ma, V. Ovchinnikov, E. Paci, R. W. Pas-
tor, C. B. Post, J. Z. Pu, M. Schaefer, B. Tidor, R. M. Venable, H. L. Woodcock, X. Wu, W. Yang,
D. M. York, and M. Karplus. CHARMM: The biomolecular simulation program. J. Comput.
Chem., 30:1545–1614, 2009. URL: http://dx.doi.org/10.1002/jcc.21287, doi:10.1002/jcc.21287
(https://doi.org/10.1002/jcc.21287).
[3] S. Pronk, S. Páll, R. Schulz, P. Larsson, P. Bjelkmar, R. Apostolov, M. R. Shirts, J. C. Smith, P. M.
Kasson, D. van der Spoel, B. Hess, and E. Lindahl. GROMACS 4.5: a high-throughput and highly
parallel open source molecular simulation toolkit. Bioinformatics, 29:845–854, 2013.
[4] K. J. Bowers, E. Chow, H. Xu, R. O. Dror, M. P. Eastwood, B. A. Gregersen, J. L. Klepeis,
I. Kolossvary, M. A. Moraes, F. D. Sacerdoti, J. K. Salmon, Y. Shan, and D. E. Shaw. Scalable
algorithms for molecular dynamics simulations on commodity clusters. In SC 2006 Conference,
Proceedings of the ACM/IEEE, 11–17. IEEE, 2006.
[5] J. C. Phillips, R. Braun, W. Wang, J. Gumbart, E. Tajkhorshid, E. Villa, C. Chipot, R. D.
Skeel, L. Kalé, and K. Schulten. Scalable molecular dynamics with NAMD. J. Comput.
Chem., 26:1781–1802, 2005. URL: http://dx.doi.org/10.1002/jcc.20289, doi:10.1002/jcc.20289
(https://doi.org/10.1002/jcc.20289).
[6] D. A. Case, T. E. Cheatham, T. Darden, H. Gohlke, R. Luo, K. M. Merz, A. Onufriev, C. Sim-
merling, B. Wang, and R. J. Woods. The Amber biomolecular simulation programs. J. Comput.
Chem., 26:1668–1688, 2005.
[7] A. D. MacKerell, D. Bashford, M. Bellott, R. L. Dunbrack, J. D. Evanseck, M. J. Field, S. Fischer,
J. Gao, H. Guo, S. Ha, D. Joseph-McCarthy, L. Kuchnir, K. Kuczera, F. T. K. Lau, C. Mattos,
S. Michnick, T. Ngo, D. T. Nguyen, B. Prodhom, W. E. Reiher, B. Roux, M. Schlenkrich, J. C.
Smith, R. Stote, J. Straub, M. Watanabe, J. Wiorkiewicz-Kuczera, D. Yin, and M. Karplus. All-
atom empirical potential for molecular modeling and dynamics studies of proteins. J. Phys. Chem.
B, 102:3586–3616, 1998.
149
GENESIS User Guide, 1.7.1
[8] A. D. MacKerell, M. Feig, and C. L. Brooks. Improved treatment of the protein backbone in
empirical force fields. J. Am. Chem. Soc., 126:698–699, 2004.
[9] W. L. Jorgensen, D. S. Maxwell, and J. Tirado-Rives. Development and Testing of the OPLS All-
Atom Force Field on Conformational Energetics and Properties of Organic Liquids. J. Am. Chem.
Soc., 118:11225–11236, 1996.
[10] C. Oostenbrink, A. Villa, A. E. Mark, and W. F. Van Gunsteren. A biomolecular force field based
on the free enthalpy of hydration and solvation: The GROMOS force-field parameter sets 53A5
and 53A6. J. Comput. Chem., 25:1656–1676, 2004.
[11] J. Jung, T. Mori, and Y. Sugita. Efficient lookup table using a linear function of inverse distance
squared. J. Comput. Chem., 34:2412–2420, 2013.
[12] J. Jung, T. Mori, and Y. Sugita. Midpoint cell method for hybrid (MPI+OpenMP) parallelization
of molecular dynamics simulations. J. Comput. Chem., 35:1064–1072, 2014.
[13] J. Huang, S. Rauscher, G. Nawrocki, T. Ran, M. Feig, B. L. de Groot, H. Grubmüller, and
A. D. MacKerell Jr. CHARMM36m: an improved force field for folded and intrinsically disor-
dered proteins. Nat. Methods, 14:71–73, 2017.
[14] W. Humphrey, A. Dalke, and K. Schulten. VMD: Visual molecular dynamics. J. Mol. Graph.,
14:33–38, 1996.
[15] S. Jo, T. Kim, V. G. Iyer, and W. Im. CHARMM GUI: A web based graphical user interface for
CHARMM. J. Comput. Chem., 29:1859–1865, 2008.
[16] W. D. Cornell, P. Cieplak, C. I. Bayly, I. R. Gould, K. M. Merz, D. M. Ferguson, D. C. Spellmeyer,
T. Fox, J. W. Caldwell, and P. A. Kollman. A Second Generation Force Field for the Simulation
of Proteins, Nucleic Acids, and Organic Molecules. J. Am. Chem. Soc., 117:5179–5197, 1995.
[17] H. Taketomi, Y. Ueda, and N. Go. Studies on protein folding, unfolding and fluctuations by com-
puter simulation. nt. J. Peptide Proteins Res., 7:445–459, 1975.
[18] S.J. Marrink, H.J. Risselada, S. Yefimov, D.P. Tieleman, and A.H. de Vries. The MARTINI force-
field: coarse grained model for biomolecular simulations. J. Phys. Chem. B, 111:7812–7824, 2007.
[19] P. C. Whitford, J. K. Noel, S. Gosavi, A. Schug, K. Y. Sanbonmatsu, and J. N. Onuchic. An all-
atom structure-based potential for proteins: Bridging minimal models with all-atom empirical
forcefields. Proteins: Structure, Function, and Bioinformatics, 75:430–441, 2009.
[20] J. K. Noel, P. C. Whitford, K. Y. Sanbonmatsu, and J. N. Onuchic. SMOG@ctbp: simplified
deployment of structure based models in GROMACS. Nucleic Acids Res., 38:W657–W661, 2010.
[21] J. K. Noel, M. Levi, M. Raghunathan, H. Lammert, R. L. Hayes, J. N. Onuchic, and P. C. Whitford.
SMOG 2: A Versatile Software Package for Generating Structure Based Models. PLoS Comput.
Biol., 12:e1004794, 2016.
[22] J. Karanicolas and C. L. Brooks, III. The origins of asymmetry in the folding transition states of
protein L and protein G. Protein Sci., 11:2351–2361, 2002.
[23] J. Karanicolas and C. L. Brooks III. Improved Go-like models demonstrate the robustness of pro-
tein folding mechanisms towards non-native interactions. J. Mol. Biol., 334:309–325, 2003.
[24] M. Feig, J. Karanicolas, and C. L. III Brooks. MMTSB Tool Set: enhanced sampling and multi-
scale modeling methods for applications in structural biology. J. Mol. Graph, Model., 22:377–395,
2004.
[25] CHARMM. http://www.charmm.org/.
Bibliography 150
GENESIS User Guide, 1.7.1
Bibliography 151
GENESIS User Guide, 1.7.1
[45] T. Mori, N. Miyashita, W. Im, M. Feig, and Y. Sugita. Molecular dynamics simulations of bio-
logical membranes and membrane proteins using enhanced conformational sampling algorithms.
BBA-Biomembranes, 1858:1635–1651, 2016.
[46] W. C. Still, A. Tempczyk, R. C. Hawley, and T. Hendrickson. Semianalytical treatment of solvation
for molecular mechanics and dynamics. J. Am. Chem. Soc., 112:6127–6129, 1990.
[47] D. Eisenberg and A. D. McLachlan. Solvation energy in protein folding and binding. Nature,
319:199–203, 1986.
[48] M. Schaefer and C. Froemmel. A Precise Analytical Method for Calculating the Electrostatic En-
ergy of Macromolecules in Aqueous Solution. J. Mol. Biol., 216:1045–1066, 1990.
[49] T. Lazaridis. Structural Determinants of Transmembrane beta-Barrels. J. Chem. Theory Comput.,
1:716–722, 2005.
[50] M. Tuckerman, B. J. Berne, and Martyna G. J. Reversible multiple time scale molecular dynamics.
J. Chem. Phys., 97:1990–2001, 1992.
[51] J. Schlitter, M. Engels, and P. Kruger. Targeted molecular dynamics: a new approach for searching
pathways of conformational transitions. J. Mol. Graph., 12:84–89, 1994.
[52] T. Mori, G. Terashi, D. Matsuoka, D. Kihara, and Y. Sugita. Efficient Flexible Fitting Refinement
with Automatic Error Fixing for De Novo Structure Modeling from Cryo-EM Density Maps. J.
Chem. Inf. Model., 61:3516–3528, 2021.
[53] J. P. Ryckaert, G. Ciccotti, and H. J. C. Berendsen. Numerical-Integration of Cartesian Equations
of Motion of a System with Constraints - Molecular-Dynamics of N-Alkanes. J. Comput. Chem.,
23:327–341, 1977.
[54] H. C. Andersen. Rattle - a Velocity Version of the Shake Algorithm for Molecular-Dynamics
Calculations. J. Comput. Chem., 52:24–34, 1983.
[55] S. Miyamoto and P. A. Kollman. Settle - an Analytical Version of the Shake and Rattle Algorithm
for Rigid Water Models. J. Comput. Chem., 13:952–962, 1992.
[56] S. A. Adelman and J. D. Doll. Generalized Langevin Equation Approach for Atom-Solid-Surface
Scattering - General Formulation for Classical Scattering Off Harmonic Solids. J. Chem. Phys.,
64:2375–2388, 1976.
[57] D. Quigley and M. I. J. Probert. Langevin dynamics in constant pressure extended systems. J.
Chem. Phys., 120:11432–11441, 2004.
[58] Y. Zhang, S. E. Feller, B. R. Brooks, and Pastor R. W. Computer simulation of liquid/liquid inter-
faces. I. Theory and application to octane/water. J. Chem. Phys., 103:10252–10266, 1995.
[59] H. J. C. Berendsen, J. P. M. Postma, W. F. Vangunsteren, A. Dinola, and J. R. Haak. Molecular-
Dynamics with Coupling to an External Bath. J. Chem. Phys., 81:3684–3690, 1984.
[60] G. Bussi, D. Donadio, and M. Parrinello. Canonical sampling through velocity rescaling. J. Chem.
Phys., 126:014101, 2007.
[61] G. Bussi, T. Zykova-Timan, and M. Parrinello. Isothermal-isobaric molecular dynamics using
stochastic velocity rescaling. J. Chem. Phys., 130:074101, 2009.
[62] C. Kandt, W. L. Ash, and D. P. Tieleman. Setting up and running molecular dynamics simulations
of membrane proteins. Method, 41:475–488, 2007.
[63] Y. Sugita and Y. Okamoto. Replica-exchange molecular dynamics method for protein folding.
Chem. Phys. Lett., 314:141–151, 1999.
Bibliography 152
GENESIS User Guide, 1.7.1
[64] A. Mitsutake, Y. Sugita, and Y. Okamoto. Generalized-ensemble algorithms for molecular simu-
lations of biopolymers. Biopolymers, 60:96–123, 2001.
[65] Y. Mori and Y. Okamoto. Generalized-ensemble algorithms for the isobaric-isothermal ensemble.
J. Phys. Soc. Jpn., 79:074003, 2010.
[66] Y. Mori and Y. Okamoto. Replica-exchange molecular dynamics simulations for various constant
temperature algorithms. J. Phys. Soc. Jpn., 79:074001, 2010.
[67] T. Okabe, M. Kawata, Y. Okamoto, and M. Mikami. Replica-exchange Monte Carlo method for
the isobaric-isothermal ensemble. Chem. Phys. Lett., 335:435–439, 2001.
[68] T. Mori, J. Jung, and Y. Sugita. Surface-tension replica-exchange molecular dynamics method for
enhanced sampling of biological membrane systems. J. Chem. Theory. Comput., 9:5629–5640,
2013.
[69] Y. Sugita, A. Kitao, and Y. Okamoto. Multidimensional replica-exchange method for free-energy
calculations. J. Chem. Phys., 113:6042–6051, 2000.
[70] H. Fukunishi, O. Watanabe, and S. Takada. On the Hamiltonian replica exchange method for ef-
ficient sampling of biomolecular systems: Application to protein structure prediction. J. Chem.
Phys., 116:9058–9067, 2002.
[71] T. Terakawa, T. Kameda, and S. Takada. On Easy Implementation of a Variant of the Replica
Exchange with Solute Tempering in GROMACS. J. Comput. Chem., 32:1228–1234, 2011.
[72] M. Kamiya and Y. Sugita. Flexible selection of the solute region in replica exchange with solute
tempering: Application to protein-folding simulations. J. Chem. Phys., 149:072304, 2018.
[73] P. Liu, B. Kim, R. A. Friesner, and B. J. Berne. Replica exchange with solute tempering: A method
for sampling biological systems in explicit water. Proc. Natl. Acad. Sci. USA, 102:13749–13754,
2005.
[74] W. Jiang, M. Hodoscek, and B. Roux. Computation of Absolute Hydration and Binding Free En-
ergy with Free Energy Perturbation Distributed Replica-Exchange Molecular Dynamics. J. Chem.
Theory Comput., 5:2583–2588, 2009.
[75] S. Re, H. Oshima, K. Kasahara, M. Kamiya, and Y. Sugita. Encounter complexes and hid-
den poses of kinase-inhibitor binding on the free-energy landscape. Proc. Natl. Acad. Sci. USA,
116:18404–18409, 2019.
[76] L Maragliano, A. Fischer, E. Vanden-Eijnden, and G. Ciccotti. String method in collective vari-
ables: minimum free energy paths and isocommittor surfaces. J. Chem. Phys., 125:24106, 2006.
[77] L. Maragliano and E. Vanden-Eijnden. On-the-fly string method for minimum free energy paths
calculation. Chem. Phys. Lett., 446:182–190, 2007.
[78] A. C Pan, D. Sezer, and B. Roux. Finding transition pathways using the string method with swarms
of trajectories. J. Phys. Chem. B, 112:3432–3440, 2008.
[79] Y. Matsunaga, Y. Komuro, C. Kobayashi, J. Jung, T. Mori, and Y. Sugita. Dimensionality of Col-
lective Variables for Describing Conformational Changes of a Multi-Domain Protein. J. Phys.
Chem. Lett., 7:1446–1451, 2016.
[80] K. Yagi, S. Ito, and Y. Sugita. Exploring the Minimum-Energy Pathways and Free-Energy Profiles
of Enzymatic Reactions with QM/MM Calculations. J. Phys. Chem. B, 125:4701–4713, 2021.
[81] W. E, W. Ren, and E. Vanden-Eijnden. Simplified and improved string method for computing the
minimum energy paths in barrier-crossing events. J. Chem. Phys., 126:164103, 2007.
Bibliography 153
GENESIS User Guide, 1.7.1
[82] D. Sheppard, R. Terrell, and G. Henkelman. Optimization methods for finding minimum energy
paths. J. Chem. Phys., 128:134106, 2008.
[83] Y. Miao, V. A. Feher, and J. A. McCammon. Gaussian Accelerated Molecular Dynamics:
Unconstrained Enhanced Sampling and Free Energy Calculation. J. Chem. Theory Comput.,
11:3584–3595, 2015.
[84] Y. T. Pang, Y. Miao, Y. Wang, and J. A. McCammon. Gaussian Accelerated Molecular Dynamics
in NAMD. J. Chem. Theory Comput., 13:9–19, 2017.
[85] D. Hamelberg, J. Mongan, and J. A. McCammon. Accelerated Molecular Dynamics: A Promising
and Efficient Simulation Method for Biomolecules. J. Chem. Phys., 120:11919–11929, 2004.
[86] D. Hamelberg, C. A. F. de Oliveira, and J. A. McCammon. Sampling of Slow Diffusive Confor-
mational Transitions with Accelerated Molecular Dynamics. J. Chem. Phys., 127:155102, 2007.
[87] T. Shen and D. Hamelberg. A Statistical Analysis of the Precision of Reweighting-Based Simula-
tions. J. Chem. Phys., 129:034103, 2008.
[88] Y. Miao, W. Sinko, L. Pierce, D. Bucher, R. C. Walker, and J. A. McCammon. Improved Reweight-
ing of Accelerated Molecular Dynamics Simulations for Free Energy Calculation. J. Chem. Theory
Comput., 10:2677–2689, 2014.
[89] H. Oshima, S. Re, and Y. Sugita. Replica-Exchange Umbrella Sampling Combined with
Gaussian Accelerated Molecular Dynamics for Free-Energy Calculation of Biomolecules.
J. Chem. Theory Comput., 2019. URL: https://pubs.acs.org/doi/10.1021/acs.jctc.9b00761,
doi:10.1021/acs.jctc.9b00761 (https://doi.org/10.1021/acs.jctc.9b00761).
[90] A. Warshel and M. Karplus. Calculation of Ground and Excited State Potential Surfaces of Con-
jugated Molecules. I. Formulation and Parametrization. J. Am. Chem. Soc., 94:5612–5625, 1972.
[91] A. Warshel and M. Levitt. Theoretical studies of enzymic reactions: Dielectric, electrostatic and
steric stabilization of the carbonium ion in the reaction of lysozyme. J. Mol. Biol., 103:227–249,
1976.
[92] K. Yagi, K. Yamada, C. Kobayashi, and Y. Sugita. Anharmonic Vibrational Analysis of
Biomolecules and Solvated Molecules Using Hybrid QM/MM Computations. J. Chem. Theory
Comput., 15:1924–1938, 2019.
[93] F. Tama, O. Miyashita, and C. L. Brooks. Flexible multi scale fitting of atomic structures into
low resolution electron density maps with elastic network normal mode analysis. J. Mol. Biol.,
337:985–999, 2004.
[94] M. Orzechowski and F. Tama. Flexible fitting of high resolution X ray structures into cryoelectron
microscopy maps using biased molecular dynamics simulations. Biophys. J., 95:5692–5705, 2008.
[95] L. G. Trabuco, E. Villa, K. Mitra, J. Frank, and K. Schulten. Flexible fitting of atomic structures
into electron microscopy maps using molecular dynamics. Structure, 16:673–683, 2008.
[96] M. Topf, K. Lasker, B. Webb, H. Wolfson, W. Chiu, and A. Sali. Protein structure fitting and
refinement guided by cryo EM density. Structure, 16:295–307, 2008.
[97] H. Ishida and A. Matsumoto. Free energy landscape of reverse tRNA translocation through the
ribosome analyzed by electron microscopy density maps and molecular dynamics simulations.
PloS one, 9:e101951, 2014.
[98] O. Miyashita, C. Kobayashi, T. Mori, Y. Sugita, and F. Tama. Flexible fitting to cryo-EM density
map using ensemble molecular dynamics simulations. J. Comput. Chem., 38:1447–1461, 2017.
Bibliography 154
GENESIS User Guide, 1.7.1
[99] P. C. Whitford, A. Ahmed, Y. Yu, Hennelly, S. P., F. Tama, Spahn, C. M., J. N. Onuchic, and
K. Y. Sanbonmatsu. Excited states of ribosome translocation revealed through integrative molec-
ular modeling. Proc. Natl. Acad. Sci. U.S.A., 108:18943–18948, 2011.
[100] T. Mori, M. Kulik, O. Miyashita, J. Jung, F. Tama, and Y. Sugita. Acceleration of cryo-EM flexible
fitting for large biomolecular systems by efficient space partitioning. Structure, 27:161–174.e3,
2019.
[101] J. Gao, K. Kuczera, B. Tidor, and M. Karplus. Hidden thermodynamics of mutant proteins: A
molecular dynamics analysis. Science, 244:1069–1072, 1989.
[102] D. A. Pearlman. A comparison of alternative approaches to free energy calculations. J. Phys.
Chem., 98:1487–1493, 1994.
[103] P. H. Axelsen and D. Li. Improved convergence in dual-topology free energy calculations through
use of harmonic restraints. J. Comput. Chem., 19:1278–1283, 1998.
[104] W. Jiang, C. Chipot, and B. Roux. Computing relative binding 791 affinity of ligands to receptor:
An effective hybrid single-dual-topology free-energy perturbation approach in NAMD. J. Chem.
Inf. Model., 59:3794–3802, 2019.
[105] M. Zacharias, T. P. Straatsma, and J. A. McCammon. Separation-shifted scaling, a new scal-
ing method for Lennard-Jones interactions in thermodynamic integration. J. Chem. Phys.,
100:9025–9031, 1994.
[106] T. Steinbrecher, I. Joung, and D. A. Case. Soft-core potentials in thermodynamic integration:
Comparing one- and two-step transformations. J. Comput. Chem., 32:3253–3263, 2011.
Bibliography 155