GriDock

 

 

AutoDock 4 front-end for virtual screening

 

 

 


 


 

1. Introduction

GriDock was designed to dock a large number of ligands stored in a single database (in a format supported by VEGA) in the shortest possible time. It takes full advantage of all local and remote CPUs through the MPICH2 technology, balancing the computational load between processors/grid nodes. The docking procedure consists of the following steps:

  1. Extraction of the molecule from the database (VEGA command line).
  2. Addition of the hydrogens if they are missing (VEGA command line).
  3. Atomic charge calculation, potential attribution (AMBER), search for the flexible torsion angles and conversion to PDBQT format (VEGA command line).
  4. Molecular docking (AutoDock 4).
  5. Selection of the best-scored complex and ranking in the result list.
  6. Storage of the AutoDock output in a compressed Zip archive for later analysis (e.g. with VEGA ZZ 2.3.0).
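
Step 5 of the procedure above can be pictured with a short Python sketch. It is only an illustration and not GriDock's actual code: the rank_ligands function, the input layout (a dictionary mapping each ligand name to the binding energies of its poses, in kcal/mol) and all the values are assumptions made for this example.

```python
def rank_ligands(pose_energies):
    """Return (ligand, best_model, best_energy) tuples, best ligand first.

    pose_energies maps a ligand name to the list of binding energies
    (kcal/mol) of its docked poses; models are numbered from 1.
    """
    best = []
    for ligand, energies in pose_energies.items():
        if not energies:
            continue  # no pose available, e.g. the docking failed
        # The most negative binding energy is taken as the best-scored pose.
        index = min(range(len(energies)), key=energies.__getitem__)
        best.append((ligand, index + 1, energies[index]))
    # Rank the ligands by the energy of their best pose.
    return sorted(best, key=lambda item: item[2])


# Hypothetical data, invented for this example.
example = {
    "ligand_a": [-5.10, -6.02, -4.75],
    "ligand_b": [-7.63, -7.01],
    "ligand_c": [],
}
ranking = rank_ligands(example)
for ligand, model, energy in ranking:
    print(f"{ligand}: best model {model}, {energy:.2f} kcal/mol")
```

Ligands without any pose are simply left out of the ranking in this sketch; how GriDock reports failed dockings is described in the Output files section.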

Main features:


2. Installation

The GriDock package includes executables for Linux 32 bit, Linux 64 bit, Windows 32 bit and Windows 64 bit. Each executable exists in two versions: the standard parallel version for unified memory systems (e.g. a single workstation/PC with one or more CPUs/cores) and the MPI (MPICH2) version for grid arrays.

 

2.1 Linux installation

Before proceeding with the installation, you must have the following packages:

GriDock_X.X.X.tar.gz           Generic multiplatform GriDock package.
Vega_X.X.X_Linux_86-32.tar.gz  VEGA command-line for Linux 32 bit.
Vega_X.X.X_Linux_86-64.tar.gz  VEGA command-line for Linux 64 bit. Use this package instead of the previous one if your system has a 64 bit operating system.

where X.X.X is the package version. You can download all the packages you need from www.vegazz.net or www.ddl.unimi.it. The pre-compiled GriDock executables were built on CentOS 4.3 and require libc version 6.

 

2.1.1 Standard Linux installation

You must choose this installation procedure if you want to run GriDock locally on a single node (e.g. HPC system with shared memory architecture). All installed CPUs can be used by GriDock.

  1. Install VEGA command-line as explained in its manual.
  2. Copy the gridock executable from the GriDock/Linux_86-32 or GriDock/Linux_86-64 folder of the GriDock_X.X.X.tar.gz archive to the VEGA installation directory.
  3. Copy the autodock4 and autogrid4 files from the AutoDock/Linux_86-32 or AutoDock/Linux_86-64 folder of the GriDock_X.X.X.tar.gz archive to the VEGA installation directory.
  4. Copy the AutoDock directory from Vega/Data/AutoDock of the GriDock_X.X.X.tar.gz archive to the Data directory of the VEGA installation.
  5. Copy the gridock.xml file from Vega/Data of the GriDock_X.X.X.tar.gz archive to the Data directory of the VEGA installation.
  6. Check/change the file permissions by typing in the command shell:
chmod 755 $VEGADIR/autodock4
chmod 755 $VEGADIR/autogrid4
chmod 755 $VEGADIR/gridock

To test the installation, type in the command shell:

vega
gridock

If the installation is correct, no error messages are shown.
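
If you prefer to check the copied files with a script before running the programs, the following Python sketch may help. It is an illustration under stated assumptions: the missing_executables function is hypothetical, and the three file names are the ones used in the chmod commands above. The demonstration works on a temporary directory so that it can run anywhere.

```python
import os
import tempfile

# File names taken from the chmod commands of the installation procedure.
REQUIRED = ("autodock4", "autogrid4", "gridock")


def missing_executables(vega_dir, names=REQUIRED):
    """Return the names that are missing or not executable in vega_dir."""
    problems = []
    for name in names:
        path = os.path.join(vega_dir, name)
        if not (os.path.isfile(path) and os.access(path, os.X_OK)):
            problems.append(name)
    return problems


# Self-contained demonstration: only two of the three files are created.
with tempfile.TemporaryDirectory() as demo_dir:
    for name in ("autodock4", "gridock"):
        path = os.path.join(demo_dir, name)
        with open(path, "w") as handle:
            handle.write("#!/bin/sh\n")
        os.chmod(path, 0o755)
    demo_missing = missing_executables(demo_dir)

print(demo_missing)
```

On a real installation, you could call missing_executables(os.environ["VEGADIR"]), assuming that the VEGADIR environment variable points to the VEGA installation directory as in the chmod commands above.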

 

2.1.2 MPI Linux installation with shared directories

This is the installation for x86-based grid array systems. MPICH2 is required to run the MPI version. If you have already installed MPICH2, skip the MPICH2 installation section and go to the following one.

 

2.1.2.1 MPICH2 installation

This is the procedure to install the MPICH2 package on a single node. You must repeat the installation for each node that you intend to use in the grid array. For more information about the installation, you can read the Installer's Guide available at the MPICH2 main site in the documentation section.

  1. Download the latest MPICH2 package from its home site (see downloads section), choosing the current stable version and clicking on UNIX and Windows source.
  2. Unpack the archive, typing in the command shell:
tar -zxvf mpich2-X.X.X.tar.gz

where X.X.X is the MPICH2 version.

  3. Change the current directory to mpich2-X.X.X:
cd mpich2-X.X.X
  4. Build the package, typing:
./configure
make
  5. Install the package:
make install
  6. Create the file .mpd.conf in your home directory, or /etc/mpd.conf if your user is root:
cd $HOME
touch .mpd.conf
chmod 600 .mpd.conf

or

touch /etc/mpd.conf
chmod 600 /etc/mpd.conf
  7. Add the following line to the .mpd.conf or /etc/mpd.conf file with your preferred text editor (e.g. nedit, vi, etc):
secretword=<MY_PASSWORD>

The new MPICH2 releases require a different line:

MPI_SECRETWORD=<MY_PASSWORD>

where <MY_PASSWORD> is the access password, which must be the same on all nodes that you want to use in the grid.

  8. Run the MPI daemon:
mpd&
  9. Check that the daemon is working by typing an MPI command:
mpdtrace

The output must be localhost or any other host name.

  10. Try to run a non-MPI application locally in the MPI environment:
mpiexec -n 1 /bin/hostname

The output must be localhost.localdomain or any other host and domain name.

To stop the mpd daemon, you must use the mpdallexit command. To configure the remote hosts, you must consult the Installer's Guide.
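
If you have to repeat the .mpd.conf step on many nodes, it can be scripted. The following Python sketch shows one possible way; it is not part of MPICH2 or GriDock. The write_mpd_conf function and the example password are assumptions made for this illustration, while the two key names (secretword and MPI_SECRETWORD) and the 600 permission come from the procedure above.

```python
import os
import stat
import tempfile


def write_mpd_conf(path, password, key="secretword"):
    """Create the mpd configuration file, readable by its owner only."""
    # Creating the file with mode 0o600 avoids a moment in which it is
    # readable by other users.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    with os.fdopen(fd, "w") as handle:
        handle.write(f"{key}={password}\n")
    # Enforce the mode even if the file already existed with other permissions.
    os.chmod(path, 0o600)


# Demonstration in a temporary directory instead of the real home directory.
with tempfile.TemporaryDirectory() as demo_dir:
    conf_path = os.path.join(demo_dir, ".mpd.conf")
    write_mpd_conf(conf_path, "example-password")
    conf_mode = stat.S_IMODE(os.stat(conf_path).st_mode)
    with open(conf_path) as handle:
        conf_text = handle.read()

print(oct(conf_mode), conf_text.strip())
```

For the newer MPICH2 releases mentioned above, pass key="MPI_SECRETWORD".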

 

2.1.2.2 VEGA and GriDock installation

  1. On your master node, create two shared directories: VEGA and GRIDOCK_DATA (other directory names can be used). They must be accessible to all nodes of your grid. The GRIDOCK_DATA directory holds the data needed for the calculation (SDF database and receptor files). The results are written to that directory too.
  2. Unpack the content of the Vega directory of the Vega_X.X.X_Linux_86-32.tar.gz or Vega_X.X.X_Linux_86-64.tar.gz archive to the VEGA shared folder.
  3. Copy the gridockmpi executable from the GriDock/Linux_86-32 or GriDock/Linux_86-64 folder of the GriDock_X.X.X.tar.gz archive to the VEGA shared directory.
  4. Copy the autodock4 and autogrid4 files from AutoDock/Linux_86-32 or AutoDock/Linux_86-64 of the GriDock_X.X.X.tar.gz archive to the VEGA shared directory.
  5. Copy the Autodock directory from Vega/Data/Autodock of the GriDock_X.X.X.tar.gz archive to the Data directory of the VEGA shared directory.
  6. Copy the gridock.xml file from Vega/Data of the GriDock_X.X.X.tar.gz archive to the Data directory of the VEGA shared directory.
  7. Check/change the file permissions by typing in the command shell:
chmod 755 VEGA_SHARED_PATH/vega
chmod 755 VEGA_SHARED_PATH/autodock4
chmod 755 VEGA_SHARED_PATH/autogrid4
chmod 755 VEGA_SHARED_PATH/gridockmpi
  8. To test GriDockMPI locally, type in the shell:
mpiexec -n <NUMBER_OF_PROCESSES> gridockmpi

where <NUMBER_OF_PROCESSES> is the total number of processes that you want to create, which is usually the number of CPUs/cores installed in your system.

WARNING:
The mpirun command has a buggy implementation of the MPI arguments, which GriDock can interpret as optional parameters. For this reason, when you use mpirun, you must always specify the receptor file name, the database and the AutoDock template.

 

2.2 Windows installation

GriDock and a special build of AutoDock 4 are included in the VEGA ZZ package, which can be downloaded from www.vegazz.net or www.ddl.unimi.it.


3. Usage

3.1 Main parameters

When the program is run without parameters, it shows the list of the implemented options:

GriDock 1.0.6.3
Copyright 2008-2023, Alessandro Pedretti & Giulio Vistoli

Usage: gridock -f[FIRSTMOL] -l[LASTMOL] -o[OUTDIR] -p[CPUs] -s[STEP]
       -a[MODE] -t[TEMPLATE] -z[BOOL] -qr RECEPTOR DATABASE

 a -> add hydrogens: NONE, GEN, GENBO (default)
 f -> first molecule to dock (1)
 l -> last molecule to dock (the last one)
 o -> output directory (current directory)
 p -> number of CPUs (all available CPUs, SMP only)
 q -> shutdown when the calculation finishes (Windows only)
 r -> restart the screening
 s -> molecule step (1)
 t -> AutoDock template (default.dpf)
 z -> enable/disable the Zip output

The database must be in one of the formats supported by VEGA.
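
If you start GriDock from a script, the argument list can be assembled programmatically. The Python sketch below is only an illustration: the gridock_args function and its keyword names are hypothetical, and it writes each option and its value as two separate arguments, as in the "-a none" example of the Running the screening section. Whether the value can also be attached directly to the option letter is not covered here.

```python
def gridock_args(receptor, database, first=None, last=None, step=None,
                 outdir=None, cpus=None, hydrogens=None, template=None,
                 zip_output=None, restart=False):
    """Build the GriDock argument list; options left as None are omitted."""
    args = ["gridock"]
    options = (("-f", first), ("-l", last), ("-s", step), ("-o", outdir),
               ("-p", cpus), ("-a", hydrogens), ("-t", template))
    for flag, value in options:
        if value is not None:
            args += [flag, str(value)]
    if zip_output is not None:
        args += ["-z", "on" if zip_output else "off"]
    if restart:
        args.append("-r")
    # The receptor and the database are always the last two arguments.
    args += [receptor, database]
    return args


command = gridock_args("receptor.pdbqt", "database.sdf",
                       first=1, last=100, hydrogens="none")
print(" ".join(command))
```

The resulting list can be passed to subprocess.run from the Python standard library on a system where gridock is installed.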

 

3.1.1 DATABASE

This is the file name of the molecule database, which can be in any format supported by VEGA. GriDock can also dock a single molecule; in that case, it requires the ligand in PDBQT format instead of the database.

 

3.1.2 RECEPTOR

This is the receptor file name in PDBQT format with the polar hydrogens only. The grid files generated by AutoGrid4 must be in the same directory as the receptor file. To generate the PDBQT file and the grid maps, you can use MGLTools or VEGA ZZ.

 

3.1.3 -a[MODE]

Add the hydrogens to each molecule in the database:

Mode   Description
NONE   No hydrogens will be added.
GEN    Use the generic algorithm based on the bond geometry and atom hybridization.
GENBO  Use the algorithm based on the bond order (default mode).

 

3.1.4 -f[FIRSTMOL]

Start the screening from the specified molecule number in the selected database. The default starting molecule to dock is the first one.

 

3.1.5 -l[LASTMOL]

Stop the screening when the specified molecule number is reached. The default value is the last molecule in the database.

 

3.1.6 -o[OUTDIR]

Set the directory in which the output files are stored. The default is the current directory.

 

3.1.7 -p[CPUs]

This parameter sets the number of CPUs/cores used to perform the screening. The default value is the maximum number of CPUs installed in your system. This option has no effect if you are using the MPI version of GriDock, because the nodes/CPUs are controlled by the mpiexec command.

 

3.1.8 -q

Power off the system when the calculation finishes. This option is available on Windows only.

 

3.1.9 -s[STEP]

Step increment to extract the molecules from the database (default 1).
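
The combined effect of -f, -l and -s can be pictured with a few lines of Python. This is a sketch under stated assumptions, not GriDock's own logic: the selected_molecules function is hypothetical, molecules are assumed to be numbered from 1, and the last molecule is assumed to be included in the screening.

```python
def selected_molecules(total, first=1, last=None, step=1):
    """Return the molecule numbers picked by first/last/step (1-based)."""
    if first < 1 or step < 1:
        raise ValueError("first and step must be positive")
    # By default, or if the value is too large, stop at the end of the database.
    if last is None or last > total:
        last = total
    return list(range(first, last + 1, step))


# A database of 20 molecules, screened from 5 to 15 in steps of 5.
selection = selected_molecules(20, first=5, last=15, step=5)
print(selection)
```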

 

3.1.10 -r

Restart the screening from the last saved molecule. To restart the screening, the correct .csv file must be present.

 

3.1.11 -t[TEMPLATE]

Optionally, you can specify the template file used to pass the docking parameters to AutoDock4. The default template file is default.dpf (for more details, see the Template files section). The default search path is the current directory; if no file is found there, GriDock searches for the template in the ...\VEGA ZZ\Data\Autodock directory (or the .../vega/Data/Autodock directory for the Linux version).
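
The search order just described (current directory first, then the Data/Autodock directory of the VEGA installation) can be sketched in Python as follows. The find_template function and its parameters are hypothetical names chosen for this illustration, and the demonstration uses temporary directories in place of a real installation.

```python
import os
import tempfile


def find_template(name, vega_dir, cwd="."):
    """Return the path of the template file, or None if it is not found."""
    candidates = (
        os.path.join(cwd, name),                           # current directory
        os.path.join(vega_dir, "Data", "Autodock", name),  # VEGA data directory
    )
    for path in candidates:
        if os.path.isfile(path):
            return path
    return None


with tempfile.TemporaryDirectory() as root:
    work_dir = os.path.join(root, "work")
    vega_dir = os.path.join(root, "vega")
    shared_dir = os.path.join(vega_dir, "Data", "Autodock")
    os.makedirs(work_dir)
    os.makedirs(shared_dir)

    # Only the shared template exists: it is the one that is found.
    shared = os.path.join(shared_dir, "default.dpf")
    open(shared, "w").close()
    found_shared = find_template("default.dpf", vega_dir, cwd=work_dir)

    # A template in the working directory takes precedence.
    local = os.path.join(work_dir, "default.dpf")
    open(local, "w").close()
    found_local = find_template("default.dpf", vega_dir, cwd=work_dir)

    found_none = find_template("missing.dpf", vega_dir, cwd=work_dir)
    results = (found_shared == shared, found_local == local, found_none is None)

print(results)
```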

 

3.1.12 -z[BOOL]

Enable/disable the creation of the Zip archive in which the AutoDock output complexes are stored. The Boolean values can be: 1/0, on/off, true/false and yes/no.
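
If you generate the -z value from a script, the accepted Boolean spellings can be handled as in the following Python sketch. The parse_bool function is a hypothetical helper, and treating the values as case-insensitive is an assumption of this example.

```python
TRUE_VALUES = {"1", "on", "true", "yes"}
FALSE_VALUES = {"0", "off", "false", "no"}


def parse_bool(text):
    """Convert one of the documented Boolean spellings to True or False."""
    value = text.strip().lower()  # case-insensitivity is an assumption
    if value in TRUE_VALUES:
        return True
    if value in FALSE_VALUES:
        return False
    raise ValueError(f"not a Boolean value: {text!r}")


print(parse_bool("on"), parse_bool("0"))
```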

 

3.2 Preparing the input files

This section shows how to prepare the input files required by GriDock with VEGA ZZ. To do it, you need the structure of the target protein (receptor) and a database of molecules.

 

3.2.1 The receptor

To prepare the receptor, you need a 3D model without connectivity errors and complete with all hydrogens. A crystal structure downloaded from the Protein Data Bank (the most common scenario) can't be used "as is"; it must be prepared:

WARNING:
if one or more error messages are shown, the receptor structure has problems (e.g. wrong connectivity, misplaced hydrogens, etc).

 

3.2.2 The database

The database must be in one of the formats supported by VEGA and VEGA ZZ (usually Access, Merck MMD, Mol2, ODBC data source, SQLite, SDF and Zip) and must contain 3D structures with or without hydrogens. If you have a 2D database, you must convert it to 3D with VEGA ZZ (consult its manual).
If you need to dock a single molecule, you can convert it to the MDL Mol format and rename the file from .mol to .sdf.

 

3.3 Running the screening

In the most common cases, it's enough to specify only two parameters in the command shell to perform the screening:

gridock receptor.pdbqt database.sdf

In this way, all molecules (ligands) included in the database.sdf file are docked into the receptor.pdbqt structure. Each ligand is pre-processed automatically, adding the hydrogens by the bond order method.

If you want to run the MPI version on the Windows operating system:

mpiexec.exe -wdir Y:\ -map Y:=<GRIDOCK_DATA> -env VEGADIR <VEGA_ZZ_SHARED_DIR>
            -hosts <LIST_OF_THE_HOSTS>
            -noprompt <VEGA_ZZ_SHARED_DIR>\GriDockMPI.exe -a none Y:\receptor.pdbqt Y:\database.sdf

where:

<GRIDOCK_DATA>         The full UNC path of the shared directory in which the data files are stored (for more details, see the Windows MPI installation procedure).
<LIST_OF_THE_HOSTS>    The list of the hosts to use for the calculation. It must have the following format:

N HOSTNAME_1 NTHREAD_1 HOSTNAME_2 NTHREAD_2 ... HOSTNAME_N NTHREAD_N

where N is the number of hosts, HOSTNAME_N is the host name and NTHREAD_N is the number of threads that will be started on the specified node (usually one for each CPU/core).

Example:

10 ANTARES 2 GALAXY 2 NOVA 2 VILLA 2 ALDEBARAN 1 ANDROMEDA 1 MIRA 1 NADIR 1 PEGASUS 1 ALTAIR

<VEGA_ZZ_SHARED_DIR>   The VEGA ZZ installation directory shared by the master node (for more details, see the Windows MPI installation procedure).


 

3.4 Output files

Three output files are created in the current directory:

16:11:34 ************************
16:11:34 INIT: GriDock 1.0.0.10 started on Linux
16:11:34 INIT: Local time Thu, 11 Dec 2008 17:11:34
16:11:34 INIT: 16 Dual Core AMD Opteron(tm) Processor 875 detected
16:11:34 INIT: Used CPUs: 16
16:11:34 INIT: AutoDock4/VEGA directory: "/hd1/home/warp/Vega"
16:11:34 INIT: Receptor file: "rdrp_q2xp15.pdbqt"
16:11:34 INIT: Database file: "ChemBank.sdf"
16:11:34 INIT: AutoDock output archive: "rdrp_q2xp15-ChemBank.zip"
16:11:34 INIT: Energy output file: "rdrp_q2xp15-ChemBank.csv"
16:11:34 INIT: Temporary file directory: "/tmp"
16:11:34 INIT: First molecule to dock: 1
16:11:34 INIT: Last molecule to dock: last in database
16:11:34 INIT: Molecule step/s: 1
16:11:34 INIT: AutoDock template file: "/hd1/home/warp/Vega/Data/Autodock/default.dpf"
16:11:34 INIT: AMMP time-out: 120 sec.
16:11:34 INIT: AutoDock time-out: 1200 sec.
16:11:34 INIT: VEGA time-out: 120 sec.
16:11:34 INIT: Add hydrogen to the ligand: yes
16:11:34 INFO: Starting AutoDock - Molecule 1 (tyrphostin [bis-tyrphostin] 270-059)
16:11:34 INFO: Starting AutoDock - Molecule 4 (Anandamide (20:3, n-6))
16:11:35 INFO: Starting AutoDock - Molecule 3 (tyrphostin [bis-tyrphostin] B42 270-168)
...

The event times are given in Greenwich Mean Time (GMT), and the local time is shown in the third line. When a single docking finishes, two lines are written to the log:

...
16:11:47 INFO: Molecule 9 - Docking finished (0m 12s)
16:11:47 DOCK: Molecule 9 - Best model 4, Best Binding energy = -7.63 kcal/mol, Ki = 2.54 uM
...

The former reports that the docking finished and how long it took (12 seconds in the example); the latter indicates which solution/pose was selected by GriDock (4) on the basis of the binding energy (-7.63 kcal/mol) and the inhibition constant (Ki = 2.54 uM).
An error record may be present in the log if a problem is found:

...
16:16:03 ERROR: Molecule 110 - Atom/s with unassigned potential
...

This message means that the molecule to dock has one or more atoms with unassigned potential. This can happen when the molecule contains elements or atom types not included in the AutoDock/Amber force field.

...
16:47:40 ERROR: Molecule 1084 - AutoDock time-out - process killed
...

To avoid losing one or more CPUs when AutoDock 4 takes too much computational time or enters an infinite loop (which can happen in a few cases), GriDock kills the AutoDock process when it runs past the time-out. When this happens, the previous line is written to the log file. The same check is applied to VEGA, and the time-out values can be set in the GriDock preference file.
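
To collect the results and the errors from a log file automatically, the DOCK and ERROR records shown above can be parsed with regular expressions. The following Python sketch is based only on the sample lines of this section; the parse_log function is hypothetical, and real log files may contain record layouts that these patterns do not cover.

```python
import re

DOCK_RE = re.compile(
    r"^(?P<time>\d{2}:\d{2}:\d{2}) DOCK: Molecule (?P<mol>\d+) - "
    r"Best model (?P<model>\d+), Best Binding energy = "
    r"(?P<energy>-?\d+(?:\.\d+)?) kcal/mol, Ki = (?P<ki>.+)$"
)
ERROR_RE = re.compile(
    r"^(?P<time>\d{2}:\d{2}:\d{2}) ERROR: Molecule (?P<mol>\d+) - (?P<msg>.+)$"
)


def parse_log(lines):
    """Return (results, errors) dictionaries keyed by molecule number."""
    results, errors = {}, {}
    for line in lines:
        line = line.strip()
        match = DOCK_RE.match(line)
        if match:
            # Ki is kept as text because its unit is part of the record.
            results[int(match["mol"])] = (
                int(match["model"]), float(match["energy"]), match["ki"])
            continue
        match = ERROR_RE.match(line)
        if match:
            errors[int(match["mol"])] = match["msg"]
    return results, errors


# Sample records copied from this section.
sample = [
    "16:11:47 INFO: Molecule 9 - Docking finished (0m 12s)",
    "16:11:47 DOCK: Molecule 9 - Best model 4, Best Binding energy = -7.63 kcal/mol, Ki = 2.54 uM",
    "16:16:03 ERROR: Molecule 110 - Atom/s with unassigned potential",
    "16:47:40 ERROR: Molecule 1084 - AutoDock time-out - process killed",
]
results, errors = parse_log(sample)
print(results)
print(errors)
```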

receptor-database_NNNNNNNN.dlg

where NNNNNNNN is the serial number of the docked ligand in the database.
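
The output file names can be derived from the receptor and database names, as the sample log suggests (rdrp_q2xp15.pdbqt and ChemBank.sdf give rdrp_q2xp15-ChemBank.zip and rdrp_q2xp15-ChemBank.csv). The Python sketch below reproduces this naming. The two functions are hypothetical, and the zero-padded eight-digit serial number is an assumption based on the NNNNNNNN pattern above.

```python
from pathlib import Path


def output_names(receptor, database):
    """Return the (zip, csv) file names for a receptor/database pair."""
    base = f"{Path(receptor).stem}-{Path(database).stem}"
    return base + ".zip", base + ".csv"


def dlg_name(receptor, database, number):
    """Return the AutoDock output file name for the given ligand number."""
    base = f"{Path(receptor).stem}-{Path(database).stem}"
    # Assumption: NNNNNNNN is the serial number padded with zeros to 8 digits.
    return f"{base}_{number:08d}.dlg"


names = output_names("rdrp_q2xp15.pdbqt", "ChemBank.sdf")
ninth = dlg_name("rdrp_q2xp15.pdbqt", "ChemBank.sdf", 9)
print(names, ninth)
```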

 


4. Default settings

To modify the GriDock default settings, you can edit the gridock.xml file in the ...\VEGA ZZ\Config directory with your preferred text editor. The following scheme shows the meaning of each XML tag:

 

Configuration file   Description

<gridock version="1.0">
     

Main tag:
- version -> Version number of the configuration file.


  <maxzipsize>4000000000</maxzipsize>
 

Maximum size in bytes of each Zip file containing the AutoDock output files. It can't be greater than 4 GB (4,294,967,296 bytes). If the file system is FAT32, it's strongly recommended to set this parameter to 2000000000 so as not to exceed 2,147,483,648 bytes, which is the maximum file size supported by this file system.


  <writezip>true</writezip>
 

Enabling this flag, the output files generated by AutoDock 4 are stored in zip archives. The supported arguments are: on / off, yes / no, true / false.


  <ammp timeout="120">
 

AMMP settings:
- timeout is the maximum time in seconds that can be used by AMMP to finish the calculation.


    <to3d>
      use none bond angle torsion hybrid nonbon;
      setf mxdq 1.0;
      gsdg 15 0;
      steep 50 1;
      cngdel 3000 0 0.01;

AMMP commands used to perform the 1D or 2D to 3D conversion. For more details, see the AMMP manual.

    </to3d>

End of 3D conversion section.

  </ammp>

End of AMMP settings.

  <autodock timeout="12000"></autodock>
 

AutoDock 4 settings:
- timeout is the maximum time in seconds that can be used by AutoDock to finish the docking.


  <vega timeout="120">
 

VEGA  settings:
- timeout is the maximum time in seconds that can be used by VEGA to finish the calculation/conversion.


    <charges>gasteiger</charges>
 

Atom charges attribution method. It can be one of the methods supported by VEGA or none if you don't need to assign the atom charges. See the -c option in the VEGA manual.


    <potential>autodock</potential>
 

Atom type template that must be one of the templates supported by VEGA. See the -p option in the VEGA manual.


    <torsions>flex</torsions>
 

Method to find torsions/dihedrals in the ligand structures. See the -j option in the VEGA manual.


  </vega>
 

End of VEGA settings.


  <csvout decsep="auto">
 

Comma Separated Values (CSV) settings:
- decsep can be auto to detect the locale settings for the decimal separator (this capability works with Windows only) or the decimal separator character (e.g. , or .).


    <delay>300</delay>
 

To avoid overloading the file system, the writing of the output CSV file is delayed by the specified number of seconds. If the value is set to zero (0), the file is written immediately.


  </csvout>

End of CSV settings.

</gridock>
 

End of the GriDock configuration file.
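
To read these settings from a script, for example to check them before a long screening, the file can be parsed with the Python standard library. The sketch below is an illustration under stated assumptions: the XML text is assembled from the fragments described above and may differ from the real gridock.xml, and the read_settings function and the names of the returned keys are hypothetical.

```python
import xml.etree.ElementTree as ET

# Configuration assembled from the fragments of this section.
CONFIG = """<gridock version="1.0">
  <maxzipsize>4000000000</maxzipsize>
  <writezip>true</writezip>
  <ammp timeout="120">
    <to3d>
      use none bond angle torsion hybrid nonbon;
      setf mxdq 1.0;
      gsdg 15 0;
      steep 50 1;
      cngdel 3000 0 0.01;
    </to3d>
  </ammp>
  <autodock timeout="12000"></autodock>
  <vega timeout="120">
    <charges>gasteiger</charges>
    <potential>autodock</potential>
    <torsions>flex</torsions>
  </vega>
  <csvout decsep="auto">
    <delay>300</delay>
  </csvout>
</gridock>
"""


def read_settings(xml_text):
    """Extract a few settings and check the documented Zip size limit."""
    root = ET.fromstring(xml_text)
    settings = {
        "maxzipsize": int(root.findtext("maxzipsize")),
        "writezip": root.findtext("writezip").strip().lower()
                    in ("on", "yes", "true"),
        "ammp_timeout": int(root.find("ammp").get("timeout")),
        "autodock_timeout": int(root.find("autodock").get("timeout")),
        "vega_timeout": int(root.find("vega").get("timeout")),
        "charges": root.findtext("vega/charges"),
        "csv_delay": int(root.findtext("csvout/delay")),
    }
    if settings["maxzipsize"] > 2 ** 32:  # 4,294,967,296 bytes
        raise ValueError("maxzipsize exceeds 4,294,967,296 bytes")
    return settings


settings = read_settings(CONFIG)
print(settings)
```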


 


5. Template files

The template files are used by GriDock to prepare the AutoDock 4 input files. They are placed in the ...\VEGA ZZ\Data\Autodock directory (or the .../vega/Data/Autodock directory for the Linux version) and have the .dpf file extension. The template file can be selected with the -t option.
The template files are standard AutoDock 4 files containing special tags, which GriDock replaces with parameters specific to each molecule to screen, as shown in the following table:

Tag name          Description
%ATOMTYPES%       AMBER atom types of the ligand. Each type is separated by a space.
%CENTGEO%         Ligand center (format: X Y Z).
%DESOLVMAPFILE%   Desolvation map (e.g. RECEPTOR.d.map).
%ELECMAPFILE%     Electrostatic map (e.g. RECEPTOR.e.map).
%FLDFILE%         Grid data file (e.g. RECEPTOR.maps.fld).
%LIGAND%          Ligand file name with full path.
%MAPS%            List of the maps with the following format:

                  map RECEPTOR.ATMTYPE_1.map
                  map RECEPTOR.ATMTYPE_2.map
                  ...
                  map RECEPTOR.ATMTYPE_N.map

%TORFLEXNUM%      Number of the flexible torsions.

Each tag can appear more than once in the template file.
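
The tag substitution can be pictured with the following Python sketch. It is not GriDock's implementation: the fill_template and build_maps functions are hypothetical, and the atom types, ligand path and torsion count are invented values used only to show the mechanism.

```python
import re

TAG_RE = re.compile(r"%([A-Z]+)%")


def build_maps(receptor, atom_types):
    """Return the text for %MAPS%: one "map" line per atom type."""
    return "\n".join(f"map {receptor}.{atom}.map" for atom in atom_types)


def fill_template(template, values):
    """Replace every %TAG% in the template with its value."""
    def replace(match):
        name = match.group(1)
        if name not in values:
            raise KeyError(f"no value for tag %{name}%")
        return values[name]
    # Each occurrence is replaced, so a tag may appear several times.
    return TAG_RE.sub(replace, template)


template = (
    "ligand_types %ATOMTYPES%\n"
    "%MAPS%\n"
    "move %LIGAND%\n"
    "ndihe %TORFLEXNUM%\n"
    "torsdof %TORFLEXNUM% 0.274000\n"
)
values = {  # hypothetical values for one ligand
    "ATOMTYPES": "C N",
    "MAPS": build_maps("receptor", ["C", "N"]),
    "LIGAND": "/tmp/ligand.pdbqt",
    "TORFLEXNUM": "5",
}
filled = fill_template(template, values)
print(filled)
```

Raising an error for an unknown tag is a design choice of this sketch: it makes a misspelled tag visible instead of leaving it in the AutoDock input file.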

 

5.1 Template file example

This is a GriDock template example for virtual screening:

#
# *****************************************
# **** GriDock template for AutoDock 4 ****
# *****************************************
#
# Default input file for virtual screening
#
outlev 1                             # diagnostic output level
intelec                              # calculate internal electrostatics
seed pid time                        # seeds for random generator
ligand_types %ATOMTYPES%             # atoms types in ligand
fld %FLDFILE%                        # grid_data_file
%MAPS%
elecmap %ELECMAPFILE%                # electrostatics map
desolvmap %DESOLVMAPFILE%            # desolvation map
move %LIGAND%                        # small molecule
about %CENTGEO%                      # small molecule center
tran0 random                         # initial coordinates/A or random
quat0 random                         # initial quaternion
ndihe %TORFLEXNUM%                   # number of active torsions
dihe0 random                         # initial dihedrals (relative) or random
tstep 2.0                            # translation step/A
qstep 50.0                           # quaternion step/deg
dstep 50.0                           # torsion step/deg
torsdof %TORFLEXNUM% 0.274000        # torsional degrees of freedom and coefficient
rmstol 2.0                           # cluster_tolerance/A
extnrg 1000.0                        # external grid energy
e0max 0.0 10000                      # max initial energy; max number of retries
ga_pop_size 150                      # number of individuals in population
ga_num_evals 100000                  # maximum number of energy evaluations (2500000, 50000)
ga_num_generations 27000             # maximum number of generations
ga_elitism 1                         # number of top individuals to survive to next generation
ga_mutation_rate 0.02                # rate of gene mutation
ga_crossover_rate 0.8                # rate of crossover
ga_window_size 10                    # 
ga_cauchy_alpha 0.0                  # Alpha parameter of Cauchy distribution
ga_cauchy_beta 1.0                   # Beta parameter Cauchy distribution
set_ga                               # set the above parameters for GA or LGA
sw_max_its 300                       # iterations of Solis & Wets local search
sw_max_succ 4                        # consecutive successes before changing rho
sw_max_fail 4                        # consecutive failures before changing rho
sw_rho 1.0                           # size of local search space to sample
sw_lb_rho 0.01                       # lower bound on rho
ls_search_freq 0.06                  # probability of performing local search on individual
set_sw1                              # set the above Solis & Wets parameters
compute_unbound_extended             # compute extended ligand energy
ga_run 10                            # do this many hybrid GA-LS runs (10)
analysis                             # perform a ranked cluster analysis

 

6. History


7. Copyright and disclaimers

All trademarks and software directly or indirectly referred to in this document are copyrighted by their legal owners. GriDock is a freeware program and can be distributed through the Internet, BBS, CD-ROM and other electronic formats. The Authors of these programs accept no responsibility for hardware/software damages resulting from the use of this package.
No warranty is made about the software or its performance.

Use and copying of this software and the preparation of derivative works based on this software are permitted, so long as the following conditions are met:

  

  

GriDock
is a software developed in 2008-2021
by Alessandro Pedretti & Giulio Vistoli
All rights reserved.

Alessandro Pedretti
Dipartimento di Scienze Farmaceutiche
Facoltà di Scienze del Farmaco
Università degli Studi di Milano
Via Mangiagalli, 25
I-20133 Milano - Italy
Tel. +39 02 503 19332
Fax. +39 02 503 19359
E-Mail: info@vegazz.net
WWW: http://www.vegazz.net

 

AutoDock
is a software developed in 1994-2021
by Garrett M. Morris, David S. Goodsell, Ruth Huey and Arthur J. Olson

Molecular Graphics Laboratory
Department of Molecular Biology
The Scripps Research Institute, MB-5
10550 N. Torrey Pines Rd.
La Jolla, CA 92037-1000
U.S.A.
WWW: http://autodock.scripps.edu