GriDock
AutoDock 4 front-end for virtual screening
Printable manual
1. Introduction
GriDock was designed to dock a large number of ligands stored in a single database (in a format supported by VEGA) in the shortest possible time. It takes full advantage of all local and remote CPUs thanks to the MPICH2 technology, balancing the computational load between processors/grid nodes. The docking procedure consists of several steps:
Main features:
2. Installation
The GriDock package includes executables for Linux 32 bit, Linux 64 bit, Windows 32 bit and Windows 64 bit. Each executable exists in two versions: the standard parallel version for unified memory systems (e.g. a single workstation/PC with one or more CPUs/cores) and the MPI (MPICH2) version for grid arrays.
2.1 Linux installation
Before proceeding with the installation, you must have the following packages:
GriDock_X.X.X.tar.gz | Generic multiplatform GriDock package. |
Vega_X.X.X_Linux_86-32.tar.gz | VEGA command-line for Linux 32 bit. |
Vega_X.X.X_Linux_86-64.tar.gz | VEGA command-line for Linux 64 bit. Use this package instead of the previous one if your system has a 64 bit operating system. |
where X.X.X is the package version. You can download all needed packages from www.vegazz.net or www.ddl.unimi.it. The pre-compiled GriDock executables were built on CentOS 4.3 and require libc version 6.
2.1.1 Standard Linux installation
You must choose this installation procedure if you want to run GriDock locally on a single node (e.g. HPC system with shared memory architecture). All installed CPUs can be used by GriDock.
chmod 755 $VEGADIR/autodock4
chmod 755 $VEGADIR/autogrid4
chmod 755 $VEGADIR/gridock
To test the installation, type in the command shell:
vega gridock
If the installation is correct, no error messages will be shown.
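The permission step can also be checked with a short script before running the test command. The following is only a sketch in Python and is not part of GriDock: it assumes that the executables sit directly in the directory named by the VEGADIR environment variable, as in the chmod commands above, and the helper name missing_executables is hypothetical.

```python
import os

# Executable names taken from the chmod commands of this section.
REQUIRED = ("autodock4", "autogrid4", "gridock")


def missing_executables(vega_dir, names=REQUIRED):
    """Return the names that are absent or not executable in vega_dir."""
    missing = []
    for name in names:
        path = os.path.join(vega_dir, name)
        if not (os.path.isfile(path) and os.access(path, os.X_OK)):
            missing.append(name)
    return missing
```

Calling missing_executables(os.environ["VEGADIR"]) returns an empty list when all three files are present and executable.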
2.1.2 MPI Linux installation with shared directories
This is the installation for x86-based grid array systems. MPICH2 is required to run this MPI version. If you have already installed MPICH2, skip this section and go to the next one.
2.1.2.1 MPICH2 installation
This is the procedure to install the MPICH2 package on a single node. You must repeat the installation for each node that you intend to use in the grid array. For more information about the installation, you can read the Installer's Guide available at the MPICH2 main site in the documentation section.
tar -zxvf mpich2-X.X.X.tar.gz
where X.X.X is the MPICH2 version.
cd mpich2-X.X.X
./configure
make
make install
cd $HOME
touch .mpd.conf
chmod 600 .mpd.conf
or
touch /etc/mpd.conf
chmod 600 /etc/mpd.conf
The file must contain the following line:
secretword=<MY_PASSWORD>
Newer MPICH2 releases require a different line:
MPI_SECRETWORD=<MY_PASSWORD>
where <MY_PASSWORD> is the access password, which must be the same for all nodes that you want to use in the grid.
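If you have to prepare many nodes, the creation of the password file can be scripted. This is a minimal sketch in Python, not an MPICH2 tool: the function name write_mpd_conf and its new_style switch are hypothetical, and the two key names are the ones quoted above.

```python
import os


def write_mpd_conf(path, password, new_style=False):
    """Create an mpd configuration file readable only by its owner."""
    key = "MPI_SECRETWORD" if new_style else "secretword"
    # Create the file with mode 600 from the start, so the password
    # is never readable by other users.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    with os.fdopen(fd, "w") as handle:
        handle.write(f"{key}={password}\n")
    # Enforce the mode even if the file already existed.
    os.chmod(path, 0o600)
    return path
```

For example, write_mpd_conf(os.path.expanduser("~/.mpd.conf"), "<MY_PASSWORD>") produces the same result as the touch and chmod commands followed by a manual edit.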
mpd&
mpdtrace
The output must be localhost or any other host name.
mpiexec -n 1 /bin/hostname
The output must be localhost.localdomain or any other host and domain names.
To stop the mpd daemon, you must use the mpdallexit command. To configure the remote hosts, you must consult the Installer's Guide.
2.1.2.2 VEGA and GriDock installation
chmod 755 VEGA_SHARED_PATH/vega
chmod 755 VEGA_SHARED_PATH/autodock4
chmod 755 VEGA_SHARED_PATH/autogrid4
chmod 755 VEGA_SHARED_PATH/gridockmpi
mpiexec -n <NUMBER_OF_PROCESSES> gridockmpi
where <NUMBER_OF_PROCESSES> is the total number of processes that you want to create, which is usually the number of CPUs/cores installed in your system.
WARNING:
The mpirun command has a buggy implementation of the MPI arguments, which GriDock can misinterpret as optional parameters. For this reason, when you use mpirun, you must always specify the receptor file name, the database and the AutoDock template.
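One way to make sure that the three arguments are never omitted is to build the command line in a script. The sketch below is written in Python and is only an illustration: the function name mpiexec_command is hypothetical, and passing the template as "-t" followed by a separate value is an assumption based on the "-a none" form used in the Windows example of this manual.

```python
import os


def mpiexec_command(receptor, database, template="default.dpf",
                    processes=None, launcher="mpiexec"):
    """Build the argument list for an MPI run of gridockmpi."""
    if processes is None:
        # Usual choice: one process for each CPU/core of the system.
        processes = os.cpu_count() or 1
    # Receptor, database and template are always given explicitly.
    return [launcher, "-n", str(processes), "gridockmpi",
            "-t", template, receptor, database]
```

The returned list can be passed as it is to subprocess.run.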
2.2 Windows installation
GriDock and a special build of AutoDock 4 are included in the VEGA ZZ package, which can be downloaded from www.vegazz.net or www.ddl.unimi.it.
3. Usage
3.1 Main parameters
Running the program without parameters shows the list of the implemented options:
GriDock 1.0.6.3
Copyright 2008-2023, Alessandro Pedretti & Giulio Vistoli

Usage: gridock -f[FIRSTMOL] -l[LASTMOL] -o[OUTDIR] -p[CPUs] -s[STEP] -a[MODE] -t[TEMPLATE] -z[BOOL] -qr RECEPTOR DATABASE

a -> add hydrogens: NONE, GEN, GENBO (default)
f -> first molecule to dock (1)
l -> last molecule to dock (the last one)
o -> output directory (current directory)
p -> number of CPUs (all available CPUs, SMP only)
q -> shutdown when the calculation finishes (Windows only)
r -> restart the screening
s -> molecule step (1)
t -> AutoDock template (default.dpf)
z -> enable/disable the Zip output

The database must be in one of formats supported by VEGA.
3.1.1 DATABASE
It's the file name of the molecule database, which must be in a format supported by VEGA. GriDock can also dock a single molecule: in this case, it requires the ligand in PDBQT format instead of the database.
3.1.2 RECEPTOR
It's the receptor file name in PDBQT format with the polar hydrogens only. The grid files generated by AutoGrid4 must be present in the same directory as the receptor file. To generate the PDBQT file and the grid maps, you can use MGLTools or VEGA ZZ.
3.1.3 -a[MODE]
Add the hydrogens to each molecule in the database:
Mode | Description |
NONE | No hydrogens will be added. |
GEN | Use the generic algorithm based on the bond geometry and atom hybridization. |
GENBO | Use the algorithm based on the bond order (default mode). |
3.1.4 -f[FIRSTMOL]
Start the screening from the specified molecule number in the selected database. The default starting molecule to dock is the first one.
3.1.5 -l[LASTMOL]
Stop the screening when the specified molecule number is reached. The default value is the last molecule in the database.
3.1.6 -o[OUTDIR]
Set the directory in which the output files are stored. The default is the current directory.
3.1.7 -p[CPUs]
This parameter sets the number of CPUs/cores used to perform the screening. The default value is the maximum number of CPUs installed in your system. This option has no effect if you are using the MPI version of GriDock, because the nodes/CPUs are controlled by the mpiexec command.
3.1.8 -q
Power off the system when the calculation finishes. This option is available for Windows only.
3.1.9 -s[STEP]
Step increment to extract the molecules from the database (default 1).
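The combined effect of the -f, -l and -s options can be pictured with a few lines of Python. This is a sketch and not GriDock code: it assumes that the last molecule is included in the selection, and the function name selected_molecules is hypothetical.

```python
def selected_molecules(total, first=1, last=None, step=1):
    """Return the molecule numbers that a screening would visit.

    total is the number of molecules in the database; first, last and
    step mirror the -f, -l and -s options (last is assumed inclusive).
    """
    if first < 1 or step < 1:
        raise ValueError("first and step must be 1 or greater")
    if last is None or last > total:
        last = total  # default: up to the last molecule in the database
    return list(range(first, last + 1, step))
```

For a database of 10 molecules, selected_molecules(10, step=3) gives [1, 4, 7, 10].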
3.1.10 -r
Restart the screening from the last saved molecule. Remember that to restart the screening the correct .csv file must be present.
3.1.11 -t[TEMPLATE]
Optionally, you can specify the template file used to pass the docking parameters to AutoDock4. The default template file is default.dpf (for more details, see the Template files section). The default search path is the current directory, but if no file is found, GriDock searches for the template in the ...\VEGA ZZ\Data\Autodock directory (or the .../vega/Data/Autodock directory for the Linux version).
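The search order described above can be sketched as follows. This Python fragment is an illustration only, not the GriDock implementation: the function name find_template is hypothetical and the VEGA installation directory is passed explicitly in the vega_dir argument.

```python
import os


def find_template(name="default.dpf", vega_dir=None, cwd=None):
    """Look for a template in the current directory, then in Data/Autodock."""
    cwd = cwd or os.getcwd()
    candidates = [os.path.join(cwd, name)]
    if vega_dir:
        candidates.append(os.path.join(vega_dir, "Data", "Autodock", name))
    for path in candidates:
        if os.path.isfile(path):
            return path  # the first match wins
    return None  # nothing found in either location
```

A template placed in the working directory therefore takes precedence over the one installed with VEGA.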
3.1.12 -z[BOOL]
Enable/disable the creation of the Zip archive in which the AutoDock output complexes are stored. The Boolean values can be: 1/0, on/off, true/false and yes/no.
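If you generate GriDock command lines from a script, the accepted Boolean spellings can be validated beforehand. The following Python sketch uses the four pairs listed above; treating them as case-insensitive is an assumption, and the function name parse_bool is hypothetical.

```python
TRUE_VALUES = {"1", "on", "true", "yes"}
FALSE_VALUES = {"0", "off", "false", "no"}


def parse_bool(text):
    """Map one of the documented Boolean spellings to True or False."""
    value = text.strip().lower()  # assumption: case does not matter
    if value in TRUE_VALUES:
        return True
    if value in FALSE_VALUES:
        return False
    raise ValueError(f"not a valid Boolean value: {text!r}")
```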
3.2 Preparing the input files
This section shows how to prepare the input files required by GriDock with VEGA ZZ. To do it, you need the structure of the target protein (receptor) and a database of molecules.
3.2.1 The receptor
To prepare the receptor, you need a 3D model without connectivity errors and completed with all hydrogens. A crystal structure downloaded from the Protein Data Bank (the most common scenario) can't be used "as is", but it must be prepared:
WARNING:
If one or more error messages are shown, the receptor structure has problems (e.g. wrong connectivity, misplaced hydrogens, etc).
3.2.2 The database
The database must be in one of the formats supported by VEGA and VEGA ZZ (usually Access, Merck MMD, Mol2, ODBC data source, SQLite, SDF and Zip) and must contain 3D structures with or without hydrogens.
If you have a 2D database, you must convert it to 3D with VEGA ZZ (consult its manual).
If you need to dock a single molecule, you can convert it to the MDL Mol format and rename the file from .mol to .sdf.
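When the database is an SDF file, it can be useful to know how many molecules it contains, for example to choose the -f and -l values. In the SDF format, each record ends with a line containing only $$$$, so the count is easy to obtain. The Python sketch below is not part of GriDock or VEGA; the function name count_sdf_records is hypothetical, and a trailing record without the $$$$ line (as in a renamed .mol file) is counted as one molecule.

```python
def count_sdf_records(text):
    """Count the molecules in the text of an SDF file."""
    records = 0
    pending = False  # True while inside a record that is not closed yet
    for line in text.splitlines():
        if line.strip() == "$$$$":
            records += 1
            pending = False
        elif line.strip():
            pending = True
    # A renamed .mol file has no $$$$ terminator: count it anyway.
    return records + (1 if pending else 0)
```

For a file on disk, read it first, e.g. count_sdf_records(open("database.sdf").read()).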
3.3 Running the screening
In the most common cases, two parameters in the command shell are enough to perform the screening:
gridock receptor.pdbqt database.sdf
In this way, all molecules (ligands) included in the database.sdf file will be docked into the receptor.pdbqt structure. Each ligand is pre-processed automatically, adding the hydrogens with the bond order method.
If you want to run the MPI version on a Windows operating system:
mpiexec.exe -wdir Y:\ -map Y:=<GRIDOCK_DATA> -env VEGADIR <VEGA_ZZ_SHARED_DIR> -hosts <LIST_OF_THE_HOSTS> -noprompt <VEGA_ZZ_SHARED_DIR>\GriDockMPI.exe -a none Y:\receptor.pdbqt Y:\database.sdf
where:
<GRIDOCK_DATA> | The full UNC path of the shared directory in which the data files are stored (for more details, see Windows MPI installation procedure). |
<LIST_OF_THE_HOSTS> | The list of the hosts to use for the calculation. It must have the following format: N HOSTNAME_1 NTHREAD_1 HOSTNAME_2 NTHREAD_2 ... HOSTNAME_N NTHREAD_N where N is the number of the hosts, HOSTNAME_N is the host name and NTHREAD_N is the number of threads that will be started at the specified node (usually one for each CPU/core). Example: 10 ANTARES 2 GALAXY 2 NOVA 2 VILLA 2 ALDEBARAN 1 ANDROMEDA 1 MIRA 1 NADIR 1 PEGASUS 1 ALTAIR |
<VEGA_ZZ_SHARED_DIR> | VEGA ZZ installation directory shared by the master node (for more details, see Windows MPI installation procedure). |
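The host list can be generated from a table of nodes instead of being typed by hand. This Python sketch follows the order used in the example (host name first, then the number of threads); the function name hosts_argument is hypothetical.

```python
def hosts_argument(hosts):
    """Build the value of the -hosts option.

    hosts is a list of (hostname, threads) pairs; the result starts with
    the number of hosts, followed by each host name and its thread count.
    """
    parts = [str(len(hosts))]
    for name, threads in hosts:
        parts += [name, str(threads)]
    return " ".join(parts)
```

For example, hosts_argument([("ANTARES", 2), ("GALAXY", 2)]) returns "2 ANTARES 2 GALAXY 2".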
3.4 Output files
Three output files are created in the current directory (or in the directory set with the -o option):
Column | Description |
MolID | Molecule serial number in the database. |
Pose | Best pose number. AutoDock can generate more than one pose and GriDock selects the best one to rank it in the .csv file. |
Ki | Estimated inhibition constant (nM, nanomolar at 298.15 K). |
Binding | Estimated free energy of binding (kcal/mol). |
Intermolecular | Final intermolecular energy (kcal/mol). |
VdW + Hbond + Desolv | Van der Waals + hydrogen bonds + desolvation energies (kcal/mol). |
Electrostatic | Electrostatic energy (kcal/mol). |
Internal | Final total internal energy (kcal/mol). |
Torsional | Torsional free energy (kcal/mol). |
Unbound | Unbound system's energy (kcal/mol). |
Molecule name | Name of the molecule in the database. |
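The Ki and Binding columns are linked by the usual thermodynamic relation Ki = exp(dG / RT). The Python sketch below shows the arithmetic; it is not taken from the GriDock or AutoDock sources, and the function name and the rounded value of the gas constant are choices made for this example.

```python
import math

R_KCAL = 1.9872e-3  # gas constant in kcal/(mol K)


def ki_from_binding_energy(delta_g, temperature=298.15):
    """Return Ki in mol/L from a binding free energy in kcal/mol."""
    return math.exp(delta_g / (R_KCAL * temperature))
```

With this relation, a binding energy of -7.63 kcal/mol corresponds to about 2.55e-6 mol/L (2.55 uM), and -9.75 kcal/mol to about 7.1e-8 mol/L (71 nM).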
This is an example of .csv output file:
MolID; Ki; Binding; Intermolecular; VdW + Hbond + Desolv; Electrostatic; Internal; Torsional; Unbound; Molecule name
240; 70,79; -9,75; -9,39; -8,99; -0,40; -1,56; 1,10; -0,09; okadaic acid
274; 74,98; -9,72; -9,37; -9,19; -0,18; -1,46; 1,10; -0,01; rapamycin
1761; 127,95; -9,40; -8,99; -9,01; 0,02; -1,56; 1,10; -0,05; Rapamune
235; 156,44; -9,28; -10,45; -8,97; -1,47; -1,47; 1,92; -0,71; Leptomycin
1354; 203,17; -9,13; -9,26; -8,61; -0,65; -1,00; 0,55; -0,58; salinomycin
114; 223,06; -9,07; -10,17; -10,15; -0,02; -1,02; 1,10; -1,02; AM-251
...
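The .csv file can be loaded for further analysis with a few lines of code. The Python sketch below assumes one record per line, the semicolon as field separator and the comma as decimal separator, as in the example above (the decimal separator may differ on your system); the function name read_gridock_csv is hypothetical.

```python
import csv
import io

# Header and first two records of the example above.
SAMPLE_CSV = """MolID; Ki; Binding; Intermolecular; VdW + Hbond + Desolv; Electrostatic; Internal; Torsional; Unbound; Molecule name
240; 70,79; -9,75; -9,39; -8,99; -0,40; -1,56; 1,10; -0,09; okadaic acid
274; 74,98; -9,72; -9,37; -9,19; -0,18; -1,46; 1,10; -0,01; rapamycin
"""


def read_gridock_csv(text):
    """Return the records as a list of dictionaries keyed by column name."""
    reader = csv.reader(io.StringIO(text), delimiter=";", skipinitialspace=True)
    header = [name.strip() for name in next(reader)]
    rows = []
    for fields in reader:
        if len(fields) != len(header):
            continue  # skip blank or incomplete lines
        row = {}
        for name, value in zip(header, fields):
            value = value.strip()
            try:
                # Decimal comma -> decimal point before the conversion.
                row[name] = float(value.replace(",", "."))
            except ValueError:
                row[name] = value  # text field, e.g. the molecule name
        rows.append(row)
    return rows
```

For example, min(read_gridock_csv(SAMPLE_CSV), key=lambda r: r["Binding"]) selects the record with the lowest binding energy.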
This is an example of the log file:
16:11:34 ************************
16:11:34 INIT: GriDock 1.0.0.10 started on Linux
16:11:34 INIT: Local time Thu, 11 Dec 2008 17:11:34
16:11:34 INIT: 16 Dual Core AMD Opteron(tm) Processor 875 detected
16:11:34 INIT: Used CPUs: 16
16:11:34 INIT: AutoDock4/VEGA directory: "/hd1/home/warp/Vega"
16:11:34 INIT: Receptor file: "rdrp_q2xp15.pdbqt"
16:11:34 INIT: Database file: "ChemBank.sdf"
16:11:34 INIT: AutoDock output archive: "rdrp_q2xp15-ChemBank.zip"
16:11:34 INIT: Energy output file: "rdrp_q2xp15-ChemBank.csv"
16:11:34 INIT: Temporary file directory: "/tmp"
16:11:34 INIT: First molecule to dock: 1
16:11:34 INIT: Last molecule to dock: last in database
16:11:34 INIT: Molecule step/s: 1
16:11:34 INIT: AutoDock template file: "/hd1/home/warp/Vega/Data/Autodock/default.dpf"
16:11:34 INIT: AMMP time-out: 120 sec.
16:11:34 INIT: AutoDock time-out: 1200 sec.
16:11:34 INIT: VEGA time-out: 120 sec.
16:11:34 INIT: Add hydrogen to the ligand: yes
16:11:34 INFO: Starting AutoDock - Molecule 1 (tyrphostin [bis-tyrphostin] 270-059)
16:11:34 INFO: Starting AutoDock - Molecule 4 (Anandamide (20:3, n-6))
16:11:35 INFO: Starting AutoDock - Molecule 3 (tyrphostin [bis-tyrphostin] B42 270-168)
...
The time of the events refers to Greenwich Mean Time (GMT) and the local time is shown in the third line. When a single docking finishes, two lines are stored in the log:
...
16:11:47 INFO: Molecule 9 - Docking finished (0m 12s)
16:11:47 DOCK: Molecule 9 - Best model 4, Best Binding energy = -7.63 kcal/mol, Ki = 2.54 uM
...
The former reports that the docking finished in a given time (12 seconds in the example) and the latter indicates which solution/pose was selected by GriDock (4) on the basis of the binding energy (-7.63 kcal/mol) and inhibition constant (Ki = 2.54 uM).
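The DOCK records can be extracted from the log with a regular expression. The Python sketch below is based only on the line shown in the example above, so the pattern may need adjustments for other GriDock versions; the names DOCK_LINE and parse_dock_line are hypothetical.

```python
import re

# Pattern modelled on the DOCK line of the example log.
DOCK_LINE = re.compile(
    r"DOCK: Molecule (?P<mol>\d+) - Best model (?P<model>\d+), "
    r"Best Binding energy = (?P<energy>[-+]?\d+(?:\.\d+)?) kcal/mol, "
    r"Ki = (?P<ki>\d+(?:\.\d+)?) (?P<unit>\w+)"
)


def parse_dock_line(line):
    """Return the fields of a DOCK record, or None for any other line."""
    match = DOCK_LINE.search(line)
    if match is None:
        return None
    return {
        "molecule": int(match.group("mol")),
        "model": int(match.group("model")),
        "energy": float(match.group("energy")),  # kcal/mol
        "ki": float(match.group("ki")),
        "ki_unit": match.group("unit"),
    }
```

Lines of other types (INIT, INFO, ERROR) return None and can simply be skipped while reading the file.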
An error record may be present in the log if a problem is found:
...
16:16:03 ERROR: Molecule 110 - Atom/s with unassigned potential
...
This message means that the molecule to dock has one or more atoms with unassigned potential, which happens when it contains elements or atom types not included in the AutoDock/Amber force field.
...
16:47:40 ERROR: Molecule 1084 - AutoDock time-out - process killed
...
To avoid losing one or more CPUs when AutoDock 4 takes too much computational time or enters an infinite loop (which can happen in a few cases), GriDock kills the AutoDock process when it runs over the time-out. When this happens, the previous line is printed to the log file. The same check is performed for VEGA, and the time-out values can be set in the GriDock preference file.
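The time-out mechanism can be illustrated with the Python standard library. This is a generic sketch of the technique (run a child process and kill it when it exceeds a time limit), not the code used by GriDock; the function name run_with_timeout is hypothetical.

```python
import subprocess


def run_with_timeout(argv, timeout_s):
    """Run argv; return True if it finished in time, False if it was killed."""
    try:
        subprocess.run(argv, timeout=timeout_s, check=False)
    except subprocess.TimeoutExpired:
        # subprocess.run kills the child process and waits for it
        # before raising TimeoutExpired, so the CPU is released.
        return False
    return True
```

A caller would write a record to its log when the function returns False and then continue with the next molecule.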
receptor-database_NNNNNNNN.dlg
where NNNNNNNN is the serial number of the docked ligand in the database.
4. Default settings
To modify the GriDock default settings, you can edit the gridock.xml file in the ...\VEGA ZZ\Config directory with your preferred text editor. The following scheme shows the meaning of each XML tag:
Configuration file | Description |
<gridock version="1.0"> | Main tag. |
<maxzipsize>4000000000</maxzipsize> | Maximum size in bytes of each zip file containing AutoDock output files. It can't be greater than 4 GB (4,294,967,296 bytes). If the file system is FAT32, it's strongly recommended to set this parameter to 2000000000 in order not to exceed 2,147,483,648 bytes, which is the maximum file size supported by this file system. |
<writezip>true</writezip> | If this flag is enabled, the output files generated by AutoDock 4 are stored in zip archives. The supported arguments are: on / off, yes / no, true / false. |
<ammp timeout="120"> | AMMP settings. |
<to3d> use none bond angle torsion hybrid nonbon; setf mxdq 1.0; gsdg 15 0; steep 50 1; cngdel 3000 0 0.01; | AMMP commands used to perform the 1D or 2D to 3D conversion. For more details, see the AMMP manual. |
</to3d> | End of 3D conversion section. |
</ammp> | End of AMMP settings. |
<autodock timeout="12000"></autodock> | AutoDock 4 settings. |
<vega timeout="120"> | VEGA settings. |
<charges>gasteiger</charges> | Atom charges attribution method. It can be one of the methods supported by VEGA or none if you don't need to assign the atom charges. See the -c option in the VEGA manual. |
<potential>autodock</potential> | Atom type template that must be one of the templates supported by VEGA. See the -p option in the VEGA manual. |
<torsions>flex</torsions> | Method to find torsions/dihedrals in the ligand structures. See the -j option in the VEGA manual. |
</vega> | End of VEGA settings. |
<csvout decsep="auto"> | Comma Separated Values (CSV) settings. |
<delay>300</delay> | To avoid overloading the file system, the writing of the output csv file is delayed by the specified number of seconds. If the value is zero (0), the file is written immediately. |
</csvout> | End of CSV settings. |
</gridock> | End of the GriDock configuration file. |
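The settings can be read from a script with a standard XML parser. In the Python sketch below, the sample document is assembled from the tags listed in the scheme above (the to3d section is left out for brevity), so its exact nesting is an assumption; the names SAMPLE_XML and read_settings are hypothetical.

```python
import xml.etree.ElementTree as ET

# Sample assembled from the tags of the scheme; the nesting is assumed.
SAMPLE_XML = """<gridock version="1.0">
  <maxzipsize>4000000000</maxzipsize>
  <writezip>true</writezip>
  <ammp timeout="120"></ammp>
  <autodock timeout="12000"></autodock>
  <vega timeout="120">
    <charges>gasteiger</charges>
    <potential>autodock</potential>
    <torsions>flex</torsions>
  </vega>
  <csvout decsep="auto">
    <delay>300</delay>
  </csvout>
</gridock>"""


def read_settings(xml_text):
    """Extract a few GriDock settings from the configuration text."""
    root = ET.fromstring(xml_text)
    return {
        "maxzipsize": int(root.findtext("maxzipsize")),
        "timeouts": {tag: int(root.find(tag).get("timeout"))
                     for tag in ("ammp", "autodock", "vega")},
        "charges": root.findtext("vega/charges"),
        "csv_delay": int(root.findtext("csvout/delay")),
    }
```

A quick sanity check is read_settings(SAMPLE_XML)["maxzipsize"] < 2**32, which verifies that the zip size stays below the 4,294,967,296 byte limit.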
5. Template files
The template files are used by GriDock to prepare the AutoDock 4 input files. They are placed in the ...\VEGA ZZ\Data\Autodock directory (or the .../vega/Data/Autodock directory for the Linux version) and have the .dpf file extension. The template file can be selected with the -t option.
The template files are standard AutoDock 4 files containing special tags, which GriDock replaces with parameters that are specific to each molecule to screen, as shown in the following table:
Tag name | Description |
%ATOMTYPES% | AMBER atom types of the ligand. Each type is separated by a space. |
%CENTGEO% | Ligand center (format: X Y Z). |
%DESOLVMAPFILE% | Desolvation map (e.g. RECEPTOR.d.map). |
%ELECMAPFILE% | Electrostatic map (e.g. RECEPTOR.e.map). |
%FLDFILE% | Grid data file (e.g. RECEPTOR.maps.fld). |
%LIGAND% | Ligand file name with full path. |
%MAPS% | List of the maps with the following format: map RECEPTOR.ATMTYPE_1.map |
%TORFLEXNUM% | Number of the flexible torsions. |
Each tag can be repeated more than once in the template file.
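The tag substitution can be reproduced in a few lines, for example to preview the input file that a template would produce. The Python sketch below is an illustration and not the GriDock implementation; the names fill_template and maps_block are hypothetical, and the map line format is the one given in the table above.

```python
import re

TAG = re.compile(r"%([A-Z]+)%")


def fill_template(template, values):
    """Replace every %TAG% with values["TAG"]; unknown tags raise KeyError."""
    return TAG.sub(lambda match: values[match.group(1)], template)


def maps_block(receptor, atom_types):
    """Build the %MAPS% value: one map line for each ligand atom type."""
    return "\n".join(f"map {receptor}.{atom}.map" for atom in atom_types)
```

Since all occurrences are replaced, a tag used twice (such as %TORFLEXNUM% in the ndihe and torsdof lines) receives the same value in both places.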
5.1 Template file example
This is a GriDock template example for virtual screening:
#
# *****************************************
# **** GriDock template for AutoDock 4 ****
# *****************************************
#
# Default input file for virtual screening
#

outlev 1 # diagnostic output level
intelec # calculate internal electrostatics
seed pid time # seeds for random generator
ligand_types %ATOMTYPES% # atoms types in ligand
fld %FLDFILE% # grid_data_file
%MAPS%
elecmap %ELECMAPFILE% # electrostatics map
desolvmap %DESOLVMAPFILE% # desolvation map
move %LIGAND% # small molecule
about %CENTGEO% # small molecule center
tran0 random # initial coordinates/A or random
quat0 random # initial quaternion
ndihe %TORFLEXNUM% # number of active torsions
dihe0 random # initial dihedrals (relative) or random
tstep 2.0 # translation step/A
qstep 50.0 # quaternion step/deg
dstep 50.0 # torsion step/deg
torsdof %TORFLEXNUM% 0.274000 # torsional degrees of freedom and coefficient
rmstol 2.0 # cluster_tolerance/A
extnrg 1000.0 # external grid energy
e0max 0.0 10000 # max initial energy; max number of retries
ga_pop_size 150 # number of individuals in population
ga_num_evals 100000 # maximum number of energy evaluations (2500000, 50000)
ga_num_generations 27000 # maximum number of generations
ga_elitism 1 # number of top individuals to survive to next generation
ga_mutation_rate 0.02 # rate of gene mutation
ga_crossover_rate 0.8 # rate of crossover
ga_window_size 10 #
ga_cauchy_alpha 0.0 # Alpha parameter of Cauchy distribution
ga_cauchy_beta 1.0 # Beta parameter Cauchy distribution
set_ga # set the above parameters for GA or LGA
sw_max_its 300 # iterations of Solis & Wets local search
sw_max_succ 4 # consecutive successes before changing rho
sw_max_fail 4 # consecutive failures before changing rho
sw_rho 1.0 # size of local search space to sample
sw_lb_rho 0.01 # lower bound on rho
ls_search_freq 0.06 # probability of performing local search on individual
set_sw1 # set the above Solis & Wets parameters
compute_unbound_extended # compute extended ligand energy
ga_run 10 # do this many hybrid GA-LS runs (10)
analysis # perform a ranked cluster analysis
6. History
7. Copyright and disclaimers
All trademarks and software directly or indirectly referred to in this document are copyrighted by their legal owners. GriDock is a freeware program and can be distributed through the Internet, BBS, CD-ROM and other electronic formats. The Authors of these programs accept no responsibility for hardware/software damages resulting from the use of this package.
No warranty is made about the software or its performance.
Use and copying of this software and the preparation of derivative works based on this software are permitted, so long as the following conditions are met:
The copyright notice and this entire notice are included intact and prominently carried on all copies and supporting documentation.
No fees or compensation are charged for use, copies, or access to this software. You may charge a nominal distribution fee for the physical act of transferring a copy, but you may not charge for the program itself.
If you want to include the GriDock package in a commercial file collection, you must send a written request. The Authors can accept or deny the request at their own discretion.
If you change the source code to improve the GriDock performance, please contact the Authors to add your modifications to the official package.
Any work distributed or published that in whole or in part contains or is a derivative of this software or any part thereof is subject to the terms of this agreement. The aggregation of another unrelated program with this software or its derivative on a volume of storage or distribution medium does not bring the other program under the scope of these terms.
GriDock
is a software developed in 2008-2021
by Alessandro Pedretti & Giulio Vistoli
All rights reserved.
Alessandro Pedretti
Dipartimento di Scienze Farmaceutiche
Facoltà di Scienze del Farmaco
Università degli Studi di Milano
Via Mangiagalli, 25
I-20133 Milano - Italy
Tel. +39 02 503 19332
Fax. +39 02 503 19359
E-Mail: info@vegazz.net
WWW: http://www.vegazz.net
AutoDock
is a software developed in 1994-2021
by Garrett M. Morris, David S. Goodsell, Ruth Huey and Arthur J. Olson
Molecular Graphics Laboratory
Department of Molecular Biology
The Scripps Research Institute, MB-5
10550 N. Torrey Pines Rd.
La Jolla, CA 92037-1000
U.S.A.
WWW: http://autodock.scripps.edu