17. Creating a new template

VEGA and VEGA ZZ use two types of template files: one for atom types and one for atomic charges.

 

17.1 Force field template

With ATDL (Atom Type Description Language), you can extend VEGA by adding new atom types and/or new force field templates. Currently, VEGA supports the following predefined templates:

Force Field Package
AM1BCC AM1BCC.
AMBER Amber.
AUTODOCK AutoDock 4 force field (based on AMBER).
BOND Used by VEGA to calculate the bond types (single, double, partial double and triple).
BROTO Broto and Moreau atom types for logP calculation.
CFF91 Accelrys CFF91.
CHARMM Accelrys Quanta/CHARMm.
CHARMM22_LIG CHARMM 22 for ligands, including CHARMM22_PRO.
CHARMM22_LIPID CHARMM 22 for lipids.
CHARMM22_NA CHARMM 22 for nucleic acids.
CHARMM22_PRO CHARMM 22 for proteins.
CHARMM27 CHARMM 27 for proteins.
CHARMM36_GEN CHARMM 36 for generic use. The use of this template is not recommended for proteins and nucleic acids.
CRIPPEN Ghose and Crippen atom types for logP calculation.
CRIPPEN_MR Ghose and Crippen atom types for molar refractivity calculation.
CVFF Accelrys CVFF.
GRID Grid.
GROUPS Used by VEGA to detect the functional groups.
HBOND H-bond atom types (for internal use).
MENG By Elaine C. Meng and Richard A. Lewis.
MM+ MM+.
MM2 MM2 by N. L. Allinger.
MM3 MM3 by N. L. Allinger.
MMFF MMFF94.
OPLS OPLS.
SP4 Used by VEGA to generate the AMMP input files.
TRIPOS Sybyl by Tripos.
UNIV Used by VEGA to assign the Gasteiger-Marsili atom charges.
VINA AutoDock Vina force field (based on AMBER).

A force field template is a file that stores the atom type descriptions. Its name is the force field name in uppercase, followed by the lowercase .tem extension (e.g. AMBER.tem, CVFF.tem, etc.). All template files are placed in the Data directory. Please remember that the .tem extension is used for all VEGA templates, not only for force field templates.

In all template files the first column can contain special control characters:

Character Description
; Comment marker
# Keyword or command marker

The first line must contain a keyword needed for file type recognition. For a force field template it must be:

#TemplateFF   [TEMPLATE_NAME]   [VERSION]

where TEMPLATE_NAME is the name of the force field template and VERSION is the revision number. Example:

#TemplateFF CVFF 3.0
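
The header check can be pictured with a short Python sketch. It is only an illustration: the function name parse_ff_header is hypothetical and not part of VEGA, and it assumes that the fields are separated by whitespace and that the keyword is matched exactly as written above.

```python
def parse_ff_header(first_line):
    """Split a '#TemplateFF NAME VERSION' line into (name, version).

    Assumption: fields are whitespace separated and the keyword is
    compared exactly as it is written in this manual.
    """
    fields = first_line.split()
    if len(fields) != 3 or fields[0] != "#TemplateFF":
        raise ValueError("not a force field template header: %r" % first_line)
    return fields[1], fields[2]


print(parse_ff_header("#TemplateFF CVFF 3.0"))  # ('CVFF', '3.0')
```

The version is kept as a string because the manual only says that it is a revision number.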

After this keyword, you can place the atom type descriptions. The first column is the atom type name (max 8 characters), the second is the atom description in ATDL and the third contains the description of the bonded atoms (also in ATDL).
In this last column, each group of atoms enclosed in parentheses lists all atoms bonded to the preceding atom:

C-300 (O-100 O-100)

This line describes a carboxylic carbon: an sp2 carbon bonded to two oxygens, each making only one bond. More than one level of parentheses can be used to describe complex atom types:

C-300 (O-100 O-200 (C-900) C-900)

This line describes the carbonyl carbon of an ester group, bonded to a generic carbon. The O-200 is also bonded to a generic carbon.
Please remember that VEGA reads the line from left to right, so the more restrictive atom descriptions must be placed further to the left:

C-400 (C-300 X-900 X-900 X-900)

and not:

C-400 (X-900 X-900 C-300 X-900)

With the second ordering, if VEGA finds a C-300 as the first or second atom bonded to an sp3 carbon, it is recognized as a more generic X-900 atom and can't be reassigned to the later, more specific description.
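
The effect of the left-to-right reading can be reproduced with a small Python sketch. This is not VEGA's actual code: it assumes that each bonded atom takes the leftmost description still free that accepts it, it represents real atoms as hypothetical (element, number of bonds) tuples, and it ignores the ring and aromatic characters of the descriptions.

```python
def atom_matches(pattern, atom):
    """Compare one ATDL-like pattern with an (element, bonds) tuple.

    Simplification: only the element and the third character are checked;
    'X' accepts any element and '9' any number of bonds.
    """
    element, bonds = atom
    if pattern[:2].rstrip("-") not in ("X", element):
        return False
    return pattern[2] in ("9", str(bonds))


def greedy_match(patterns, bonded_atoms):
    """Each bonded atom, in order, takes the leftmost free pattern that
    accepts it. A pattern that has been taken is never given back."""
    free = list(patterns)
    for atom in bonded_atoms:
        for pattern in free:
            if atom_matches(pattern, atom):
                free.remove(pattern)
                break
        else:
            return False  # no free pattern accepts this atom
    return not free


specific_first = "C-300 X-900 X-900 X-900".split()
generic_first = "X-900 X-900 C-300 X-900".split()
bonded = [("C", 3), ("H", 1), ("H", 1), ("H", 1)]  # the sp2 carbon comes first

print(greedy_match(specific_first, bonded))  # True
print(greedy_match(generic_first, bonded))   # False
```

In the second call the sp2 carbon is taken by the first X-900, so nothing is left that can satisfy the C-300 description.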
The descriptions of the atom types must also be ordered from more to less specific, from the upper to the lower lines:

cn	C-400 (N-300 X-900 X-900 X-900) ; more specific
c 	C-400 (X-900 X-900 X-900 X-900) ; less specific

If the order of these two lines is swapped, when VEGA finds a carbon bonded to an sp3 nitrogen, the atom type recognized is the generic c and not cn.
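
The line ordering rule can be sketched in Python as a "first matching line wins" lookup. This is an illustration under simplifying assumptions, not VEGA's implementation: atoms are hypothetical (element, number of bonds) tuples, only the element and the third character of each description are compared, and the function names are invented for the example.

```python
def matches(pattern, atom):
    """'X' accepts any element, '9' any number of bonds (simplified)."""
    element, bonds = atom
    return (pattern[:2].rstrip("-") in ("X", element)
            and pattern[2] in ("9", str(bonds)))


def rule_applies(rule, atom, bonded_atoms):
    """Check the central atom, then give every bonded atom the leftmost
    free description that accepts it."""
    _, centre, bonded_patterns = rule
    if not matches(centre, atom):
        return False
    free = list(bonded_patterns)
    for other in bonded_atoms:
        hit = next((p for p in free if matches(p, other)), None)
        if hit is None:
            return False
        free.remove(hit)
    return not free


def assign_type(rules, atom, bonded_atoms):
    """Return the name of the first rule, from top to bottom, that applies."""
    for rule in rules:
        if rule_applies(rule, atom, bonded_atoms):
            return rule[0]
    return None


RULES = [
    ("cn", "C-400", ["N-300", "X-900", "X-900", "X-900"]),  # more specific
    ("c",  "C-400", ["X-900", "X-900", "X-900", "X-900"]),  # less specific
]
carbon = ("C", 4)
bonded = [("N", 3), ("H", 1), ("H", 1), ("H", 1)]

print(assign_type(RULES, carbon, bonded))        # cn
print(assign_type(RULES[::-1], carbon, bonded))  # c
```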

 

17.1.1 ATDL atom description

 


 

Each atom is defined by a five-character string. The first two characters are the element symbol of the atom. If the element symbol has only one character, the second character must be a dash (-). For a more flexible description, special elements can be used:

 

Special element Description
X Any atom.
# Heavy atom (all atoms excluding hydrogens).
$ Any atom excluding carbons and hydrogens.
@ Halogen (F, Cl, Br and I).

 

The third character is the bond order, that is the number of bonds made by the atom (as in the descriptions above, where C-400 is an sp3 carbon and O-100 is an oxygen making one bond only): use values from 1 to 6 for the real bond order, 0 for a non-bonded atom and 9 for a bonded atom with a non-specified bond order.
The fourth character is the ring indicator: use values from 3 to 7 if the atom is a member of a 3- to 7-membered ring, 0 for a non-ring atom and 9 for a non-specified ring atom.
The fifth character is the aromatic indicator: 0 for a non-aromatic atom and 1 for an aromatic atom.
ATDL allows the use of the AND, OR and NOT operators (&, | and !) inside a logical expression enclosed in square brackets.
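
The five-character layout can be made concrete with a small Python sketch that splits a plain atom description into its fields. The function name parse_atdl_atom and the returned dictionary are hypothetical; the sketch does not handle the logical expressions in square brackets and it returns the three indicators as raw digits without interpreting the value 9.

```python
SPECIAL_ELEMENTS = {
    "X": "any atom",
    "#": "heavy atom",
    "$": "any atom excluding carbons and hydrogens",
    "@": "halogen",
}


def parse_atdl_atom(text):
    """Split a five-character ATDL atom description into its fields."""
    if len(text) != 5:
        raise ValueError("expected five characters, got %r" % text)
    if not all(ch in "0123456789" for ch in text[2:]):
        raise ValueError("characters 3 to 5 must be digits: %r" % text)
    element = text[:2].rstrip("-")  # 'C-' becomes 'C', 'Cl' stays 'Cl'
    return {
        "element": element,
        "special": SPECIAL_ELEMENTS.get(element),  # None for real elements
        "bonds": int(text[2]),
        "ring": int(text[3]),
        "aromatic": int(text[4]),
    }


print(parse_atdl_atom("C-300"))
print(parse_atdl_atom("X-900")["special"])  # any atom
```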

 

Examples:

 

17.2 Atomic charge template

This template file is very different from the force field template, because atom recognition is based on residue names and atom names. The control characters are the same as those of the force field template.

The first line must contain a keyword needed for file type recognition. For an atomic charge template it must be:

#TemplateCharge [TYPE] [TEMPLATE_NAME]

where TYPE is the charge template type (Gasteiger or Fragments) and TEMPLATE_NAME is the name of the template. Please remember that the template name must be the same as the file name without the extension.
Example:

#TemplateCharge   Fragments   CHARMM22_CHAR

After this keyword, an optional template title/description can be given:

#Title   [TEMPLATE_TITLE]

Spaces and special characters are allowed.
Example:

#Title   Gasteiger-Marsili charges
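
As an illustration, the two header keywords could be read with a Python sketch like the following. The function name parse_charge_header is hypothetical, and the sketch makes three assumptions: the fields are separated by whitespace, the #Title keyword (when present) is on the second line, and the template name is compared with the file name exactly, including case. The title text in the usage lines is invented for the example.

```python
import os


def parse_charge_header(lines, path=None):
    """Return (type, name, title) from the first lines of a charge template."""
    fields = lines[0].split()
    if len(fields) != 3 or fields[0] != "#TemplateCharge":
        raise ValueError("not an atomic charge template header")
    kind, name = fields[1], fields[2]
    if kind not in ("Gasteiger", "Fragments"):
        raise ValueError("unknown template type: %s" % kind)
    if path is not None:
        # The template name must equal the file name without the extension.
        stem = os.path.splitext(os.path.basename(path))[0]
        if stem != name:
            raise ValueError("template name %s differs from file name %s"
                             % (name, stem))
    title = None
    if len(lines) > 1 and lines[1].startswith("#Title"):
        title = lines[1][len("#Title"):].strip()  # spaces inside are kept
    return kind, name, title


header = [
    "#TemplateCharge   Fragments   CHARMM22_CHAR",
    "#Title   Example title (hypothetical)",
]
print(parse_charge_header(header, "Data/CHARMM22_CHAR.tem"))
```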

After these two keywords, the file content differs depending on whether the template type is Gasteiger or Fragments.

 

17.2.1 Gasteiger template

The Gasteiger template is very simple: after the header, it is a list of records, one per line. Each record has six fields, as reported in the following table:

Field Description
Type Atom type. See the UNIV.tem file in the Data directory.
a Gasteiger a parameter. See Tetrahedron, 36, 3219, 1980 and Croat.Chem.Acta, 53, 601, 1980.
b Gasteiger b parameter.
c Gasteiger c parameter.
d Gasteiger d parameter (a + b + c).
Charge Formal charge.
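
A record of this kind could be read and checked with the following Python sketch. It assumes that the six fields are separated by whitespace and appear in the order of the table, which this section does not state explicitly. The type name and the numbers in the usage line are placeholders, not real Gasteiger parameters, and the tolerance of the d = a + b + c check is an arbitrary choice.

```python
import math
from collections import namedtuple

GasteigerRecord = namedtuple("GasteigerRecord", "type a b c d charge")


def parse_gasteiger_record(line, tolerance=0.01):
    """Parse 'TYPE a b c d CHARGE' and verify that d equals a + b + c."""
    fields = line.split()
    if len(fields) != 6:
        raise ValueError("expected six fields, got %d" % len(fields))
    a, b, c, d, charge = (float(value) for value in fields[1:])
    if not math.isclose(d, a + b + c, abs_tol=tolerance):
        raise ValueError("d = %g is not a + b + c = %g" % (d, a + b + c))
    return GasteigerRecord(fields[0], a, b, c, d, charge)


# Placeholder type name and values, for illustration only.
record = parse_gasteiger_record("XX  1.00  2.00  3.00  6.00  0.0")
print(record.type, record.d)  # XX 6.0
```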

 

17.2.2 Fragment template

The fragment template is a little more complex because it uses several keywords. To define a new residue, you must use the #ResName tag:

 

#ResName   [NAME1]   [NAME2] ... [NAME16]

e.g.   #ResName   ALA   ALAN

 

In this way, you define a new residue that can have any one of the specified names. The maximum number of names is 16 and the maximum length of each name is 4 characters.

This tag can be followed by other optional keywords:

 

#Id   [ID]

e.g.   #Id   AA_ALA

 

This command defines a unique residue identifier. It can be used by the #Call command (see below) and its maximum length is 31 characters.

 

#Description   [SHORT_DESCRIPTION]

e.g.   #Description   Alanine (protonated N-terminus)

 

This keyword specifies a short description of the residue or of the macro (see below). Its maximum length is 127 characters.

 

#Charge   [CHARGE]

e.g.   #Charge   1.0

 

This optional keyword specifies the total charge of the residue. The value is a positive or negative floating point number.
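
The limits given above for the residue keywords can be collected in a small validation sketch. The Python function check_residue_keyword is hypothetical and only mirrors the limits stated in this section (16 names of at most 4 characters, 31 characters for the identifier, 127 for the description); how VEGA itself reacts to a value that is too long is not described here.

```python
MAX_TEXT_LENGTH = {"#Id": 31, "#Description": 127}


def check_residue_keyword(line):
    """Return (keyword, value) for a residue keyword line, or raise ValueError."""
    parts = line.split(None, 1)
    keyword = parts[0]
    rest = parts[1].strip() if len(parts) > 1 else ""
    if keyword == "#ResName":
        names = rest.split()
        if not 1 <= len(names) <= 16:
            raise ValueError("#ResName needs from 1 to 16 names")
        if any(len(name) > 4 for name in names):
            raise ValueError("residue names are limited to 4 characters")
        return keyword, names
    if keyword in MAX_TEXT_LENGTH:
        if len(rest) > MAX_TEXT_LENGTH[keyword]:
            raise ValueError("%s value is too long" % keyword)
        return keyword, rest
    if keyword == "#Charge":
        return keyword, float(rest)
    raise ValueError("unknown keyword: %s" % keyword)


print(check_residue_keyword("#ResName   ALA   ALAN"))  # ('#ResName', ['ALA', 'ALAN'])
print(check_residue_keyword("#Charge   1.0"))          # ('#Charge', 1.0)
```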

After these optional keywords, which must follow the #ResName tag, the atom section begins. Each atom is defined on one line with the following fields:

 

[CHARGE]   [GROUPID]   [BONDS]   [NAME1]   [NAME2] ... [NAME8]

 

Where:

Field Description
CHARGE Atom partial charge.
GROUPID Group/fragment identification number. It's a positive integer from 1 to 255.
BONDS Number of bonds of the atom. It can be from 0 (non-bonded) to 6. If it's greater than 6, the number of bonds isn't checked.
NAME1 ... NAME8 The atom names. The maximum number of atom names (aliases) is 8 and their maximum length is 4 characters.

e.g.   0.3100   1   1   HN   H   H1
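
An atom record of this kind could be read with the following Python sketch. The function name parse_atom_record and the returned dictionary are hypothetical; the sketch assumes whitespace-separated fields and simply applies the limits listed in the table.

```python
def parse_atom_record(line):
    """Parse '[CHARGE] [GROUPID] [BONDS] [NAME1] ... [NAME8]'."""
    fields = line.split()
    if len(fields) < 4:
        raise ValueError("an atom record needs at least four fields")
    charge = float(fields[0])
    group_id = int(fields[1])
    bonds = int(fields[2])
    names = fields[3:]
    if not 1 <= group_id <= 255:
        raise ValueError("GROUPID must be between 1 and 255")
    if bonds < 0:
        raise ValueError("BONDS cannot be negative")
    if len(names) > 8 or any(len(name) > 4 for name in names):
        raise ValueError("at most 8 names of at most 4 characters each")
    return {
        "charge": charge,
        "group": group_id,
        "bonds": bonds,
        "bonds_checked": bonds <= 6,  # values above 6 switch the check off
        "names": names,
    }


print(parse_atom_record("0.3100   1   1   HN   H   H1"))
```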

 

This is a complete residue template example:

#ResName ALA
#Id ALA
#Description Alanine
#Charge  0.0000
 -0.4700   1 3 N
  0.3100   1 1 HN   H    H1
  0.0700   1 4 CA
  0.0900   1 1 HA
 -0.2700   2 4 CB
  0.0900   2 1 HB1
  0.0900   2 1 HB2
  0.0900   2 1 HB3
  0.5100   3 3 C
 -0.5100   3 1 O

The #Id, #Description and #Charge keywords are optional.
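
When writing a new residue, it can be useful to verify that the atom partial charges add up to the value given by #Charge. The following Python sketch performs this check on the alanine template above. It is a suggestion for a consistency test, not a description of what VEGA does when it loads the file, and the function name and the tolerance are arbitrary.

```python
RESIDUE = """\
#ResName ALA
#Id ALA
#Description Alanine
#Charge  0.0000
 -0.4700   1 3 N
  0.3100   1 1 HN   H    H1
  0.0700   1 4 CA
  0.0900   1 1 HA
 -0.2700   2 4 CB
  0.0900   2 1 HB1
  0.0900   2 1 HB2
  0.0900   2 1 HB3
  0.5100   3 3 C
 -0.5100   3 1 O
"""


def charges_agree(text, tolerance=1e-4):
    """True if the atom charges sum to the #Charge value of the residue."""
    declared = None
    total = 0.0
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith(";"):
            continue  # blank line or comment
        if line.startswith("#"):
            if line.startswith("#Charge"):
                declared = float(line.split()[1])
            continue  # other keywords carry no charge
        total += float(line.split()[0])
    if declared is None:
        raise ValueError("the residue has no #Charge keyword")
    return abs(total - declared) <= tolerance


print(charges_agree(RESIDUE))  # True
```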

To simplify template writing and make the file more compact, it's possible to create macros inside the file. A macro must be defined before it is used. To begin a new macro, you must use the following command:

 

#Define   [MACROID]

e.g.   #Define  AMINO_CT

 

where MACROID is the unique identification name of the macro. It has the same function as the #Id command inside the #ResName section. Inside the macro, you can use the #Description, #Call and #Delete commands and the atom records.

 

#Call   [RESIDUEID_OR_MACROID]

e.g.   #Call   AA_ALA

 

This command calls a residue or a macro, executing its commands. It can be placed inside a macro or a residue section.

 

#Delete   [ATOM_NAME]

e.g.   #Delete   O

 

This keyword deletes an atom previously defined in a residue section.

 

Please remember that an atom record inside a macro can replace a previous one if they have the same first atom name. This is a macro example:

#Define AMINO_CT
#Description C-terminus
  0.3400   9 3 C
 -0.6700   9 1 OT1  OCT1 O1   O
 -0.6700   9 1 OT2  OCT2 O2   OXT
#Delete O

Using this macro and an amino acid residue definition, it's possible to obtain a new definition specific to the C-terminal amino acid:

#ResName ALA ALAC
#Id ALAC
#Description Alanine (negative C-terminal)
#Charge -1.0000
#Call ALA
#Call AMINO_CT

The first call copies the atom definitions from the ALA residue and the second call applies the AMINO_CT macro, which changes the C atom, adds the two carboxyl oxygens and deletes the carbonyl oxygen (O).
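
The expansion of #Call and #Delete can be followed step by step with a Python sketch. It is only a model of the behaviour described above, not VEGA's parser: the function names are hypothetical, atoms are identified by their first name, #Delete is assumed to match the first name only, and recursive calls are not guarded against loops.

```python
TEMPLATE = """\
#Define AMINO_CT
#Description C-terminus
  0.3400   9 3 C
 -0.6700   9 1 OT1  OCT1 O1   O
 -0.6700   9 1 OT2  OCT2 O2   OXT
#Delete O
#ResName ALA
#Id ALA
#Charge  0.0000
 -0.4700   1 3 N
  0.3100   1 1 HN   H    H1
  0.0700   1 4 CA
  0.0900   1 1 HA
 -0.2700   2 4 CB
  0.0900   2 1 HB1
  0.0900   2 1 HB2
  0.0900   2 1 HB3
  0.5100   3 3 C
 -0.5100   3 1 O
#ResName ALA ALAC
#Id ALAC
#Charge -1.0000
#Call ALA
#Call AMINO_CT
"""


def parse_sections(text):
    """Map each macro or residue identifier to its list of operations."""
    sections, ops = {}, None
    for raw in text.splitlines():
        fields = raw.split()
        if not fields or fields[0].startswith(";"):
            continue
        key = fields[0]
        if key == "#Define":
            ops = sections.setdefault(fields[1], [])
        elif key == "#ResName":
            ops = []  # registered when its #Id is read
        elif key == "#Id":
            sections[fields[1]] = ops
        elif key == "#Call":
            ops.append(("call", fields[1]))
        elif key == "#Delete":
            ops.append(("delete", fields[1]))
        elif not key.startswith("#"):  # atom record
            ops.append(("atom", (float(fields[0]), fields[3:])))
    return sections


def expand(sections, ident):
    """Return {first atom name: charge} after executing all operations."""
    atoms = {}

    def run(name):
        for kind, arg in sections[name]:
            if kind == "call":
                run(arg)
            elif kind == "delete":
                atoms.pop(arg, None)
            else:
                charge, names = arg
                atoms[names[0]] = charge  # same first name: replaced

    run(ident)
    return atoms


alac = expand(parse_sections(TEMPLATE), "ALAC")
print(list(alac))
print(alac["C"])  # 0.34
```

With these assumptions, the expanded ALAC residue contains OT1 and OT2 but no O atom, and its partial charges add up to the declared total charge of -1.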