Open source software tools for computer aided drug design

Received on: 21.12.2017 Revised on: 11.02.2018 Accepted on: 12.02.2018

Computer-aided drug design (CADD) has revolutionized drug discovery and reduced the cost of finding novel compounds of pharmaceutical importance. In CADD, scientists use computer software to discover biologically active compounds. Molecular docking and energy minimization tools are essential components of structure based drug design. Molecular docking is a significant tool in structural molecular biology and computer-assisted drug design. It reduces the laboratory workload of the end user and allows researchers to restrict their docking studies to the smallest and most representative set of macromolecules and small molecules possible, which greatly enhances their productivity. Energy minimization is an important criterion for selecting a potential 3D molecule. In modeled structures, the 3D structure is affected by steric clashes. These clashes arise in a protein structure from the overlap of non-bonded atoms, and they can be removed with the assistance of energy minimization. Open software and databases provide a platform for scientists and scholars to carry out their research more effectively. The docking tools discussed in this review cover protein-ligand, protein-peptide and protein-nucleic acid docking. The tools described include AutoDock 4 and Vina, UCSF DOCK, FLIPDock, EADock, HADDOCK 2.2, SwissDock, PatchDock and ClusPro. In addition to the docking tools, energy minimization tools such as the YASARA minimization server, the KoBaMIN server and the 3D refine server are discussed. This mini-review concentrates on open software tools useful for CADD that are free of cost and can easily be downloaded onto a computer.


INTRODUCTION
Computer aided drug design has played a key role in drug discovery for the past thirty years. Molecular docking software is an integral part of any structure based drug design process. It predicts the non-covalent binding between a ligand (usually a small molecule) and a macromolecule (usually a receptor protein) (Trott et al., 2010). In structure based drug design, the ligands are usually the small molecule drug candidates whose non-covalent interactions with a target receptor protein are to be simulated computationally. Several docking tools are available at no cost to the end user, which allows virtual high throughput screening (VHTS) of many ligands at once even in resource constrained laboratories, such as those in several developing countries. Virtual screening has expedited the drug discovery process and reduced its cost (Kumar et al., 2016). Various kinase inhibitors have been discovered with the help of virtual screening. The most commonly cited examples of structure based design are human immunodeficiency virus (HIV) drugs, such as amprenavir (Agenerase) and nelfinavir (Viracept), which were developed using the crystal structure of HIV protease. Virtual screening has become mainstream in pharmaceutical research for lead molecule identification. It is envisaged as an alternative to experimental screening of drug molecules, it has shown an increased success rate in drug discovery, and it allows rapid and inexpensive filtering of active compounds from inactive ones. As a result, virtual screening has become an essential part of the modern computer aided drug discovery (CADD) process. Databases such as AfroDb, iSMART, Traditional Chinese Medicine (TCM), Super Natural II, PubChem and ZINC are useful sources of ligands for virtual screening (McInnes, 2007).
Although docking servers for protein-protein and protein-small molecule docking are widely available, servers for protein-nucleic acid docking have so far been relatively few. Most of the servers that do allow protein-nucleic acid docking were initially developed for protein-protein docking and later modified to accommodate it (Tuszynska et al., 2015). This, however, is likely to change soon because of growing interest in non-coding RNA (ncRNA), owing to its role in disease and development and to the therapeutic potential of exogenous ncRNA such as siRNA. In addition to protein-nucleic acid docking tools, siRNA design tools are also likely to benefit from the increased interest in the role of RNA interference (RNAi) in the regulation of gene expression in higher organisms (Laganà et al., 2015).
Energy minimization tools (also called structure refinement tools) are another important component of a structure based drug design workflow. Software programs are employed to create the 3D structures. After a 3D structure is built, energy minimization is carried out because model building often leaves unfavorable bonded and non-bonded interactions. Steric clashes arise in a protein structure from the overlap of non-bonded atoms, and they can be removed with the assistance of energy minimization (Ramachandran et al., 2011). Energy minimization brings the potential energy of the system to a minimum and eliminates close contacts. This enables the user to find the structure with the minimum potential energy (and hence maximum stability) from a given set of atomic coordinates and interatomic bonds. In this review article, we summarize some of the important protein structure refinement tools that are inexpensive; most of the popular structure refinement tools are web based.
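As an illustration of how energy minimization relieves a steric clash, the following minimal Python sketch applies steepest descent to two non-bonded atoms interacting through a Lennard-Jones potential. It is not taken from any of the tools reviewed here; the choice of potential, the reduced units (epsilon = sigma = 1) and the step settings are assumptions made for the example.

```python
def lennard_jones(r, epsilon=1.0, sigma=1.0):
    """Lennard-Jones energy of two non-bonded atoms separated by distance r."""
    sr6 = (sigma / r) ** 6
    return 4.0 * epsilon * (sr6 ** 2 - sr6)


def lj_gradient(r, epsilon=1.0, sigma=1.0):
    """Derivative dE/dr of the Lennard-Jones energy."""
    sr6 = (sigma / r) ** 6
    return 4.0 * epsilon * (-12.0 * sr6 ** 2 + 6.0 * sr6) / r


def minimize_distance(r0, step=0.01, max_move=0.05, tol=1e-8, max_iter=10000):
    """Steepest descent on the interatomic distance.

    The move per iteration is capped at max_move so that the very large
    gradient of a clashing pair does not throw the atoms far apart.
    """
    r = r0
    for _ in range(max_iter):
        gradient = lj_gradient(r)
        if abs(gradient) < tol:
            break
        move = -step * gradient
        move = max(-max_move, min(max_move, move))
        r += move
    return r


clashing = 0.8  # atoms overlap: separation well below sigma (illustrative value)
relaxed = minimize_distance(clashing)
print("energy before:", lennard_jones(clashing))
print("energy after: ", lennard_jones(relaxed))
print("relaxed distance:", relaxed)
```

Starting from a clashing separation of 0.8 sigma, the distance relaxes toward the Lennard-Jones minimum at 2^(1/6) sigma, where the pair energy is lowest. Real refinement tools minimize a full force field over all atomic coordinates, which this toy example does not attempt.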

Molecular Docking Tools
Molecular docking software predicts the non-covalent interactions between ligands and their corresponding binding sites in the receptors (Kumar et al., 2015; Kumar et al., 2017). Depending on whether the binding site is defined before the docking run, docking can be classified as either blind docking (binding site not defined) or local docking (binding site defined). Most modern molecular docking software carries out local docking, as blind docking is computationally much more expensive. Blind docking is practical mainly when the target protein is small. Docking can also be classified on the basis of whether the ligand or the target is treated as flexible: it is either rigid or flexible (Teague, 2003). Flexible docking can be further classified as flexible ligand-rigid protein, flexible ligand-flexible protein and rigid ligand-flexible protein. Treating the protein as flexible is computationally more expensive than treating it as rigid, but it greatly improves the accuracy of the results (Carlson and McCammon, 2000; Kumar and Ramanathan, 2014). Because docking is computationally intensive, docking servers have gained widespread popularity in the scientific community, as they give the end user access to powerful hardware via an easy to use web interface. The ClusPro server was the first molecular docking server (Comeau et al., 2005).
The structure of the protein on which the docking run is performed is in most cases obtained by X-ray crystallography. Nuclear magnetic resonance (NMR) may also be used in some cases, especially when the protein cannot be crystallized. In most structure based drug design scenarios, the ligands under consideration are the possible drug candidates and the target is a receptor or an enzyme. However, this is only one of several intermolecular interactions amenable to study with molecular docking tools (Meng et al., 2011). Owing to growing interest in RNA based therapeutics in the pharmaceutical industry, there is an increasing need for molecular docking tools that can predict protein-nucleic acid and nucleic acid-nucleic acid interactions. The main aim of molecular docking tools is to determine the docking pose (the conformation of the ligand and target molecules at the time of binding) that has the minimum free energy of binding. A good docking tool must predict ligand-target interactions accurately and must maximize its computational speed for a given set of parameters (Kumar et al., 2016). Docking tools are highly useful for rapid and efficient virtual screening of candidate drug molecules. In virtual screening, several ligands are tested in various conformations in the binding site of their targets and the corresponding free energies of binding are determined. The conformations with the lowest free energy of binding are chosen. This enables the researcher to filter a large set of candidate drug molecules and identify suitable lead compounds to work on. The two most important features of docking software are the scoring function it uses for the ligand-receptor complex and the algorithm it uses to find conformations of the ligand in the binding site of the receptor. Scoring functions allow the docking program to rank the affinity between the ligand and the binding site in the receptor. A good scoring function should strike an adequate balance between efficiency in the use of computational resources and accuracy in ranking affinities (Forli et al., 2016).

AutoDock 4 and AutoDock Vina
AutoDock 4 and AutoDock Vina are popular molecular docking programs developed at The Scripps Research Institute. Both are not only free but also open source, allowing the end user to improve and modify the underlying source code. The key difference between AutoDock 4 and Vina lies in their scoring functions. Although the base installation of AutoDock 4 and Vina provides only a command line interface, a graphical user interface, AutoDockTools (ADT), can be downloaded separately as part of the MGLTools software. Alternatively, the PyMol plugin autodock.py can be used to view docking poses generated by AutoDock 4 and Vina (Chaitanya et al., 2010). This gives the user access to the viewing capabilities of PyMol and the docking capabilities of AutoDock 4 and Vina. After a docking run has been completed, the docking scores of the various poses can be exported in diverse formats. AutoDock utilizes the Lamarckian genetic algorithm (LGA) during the docking run (Seeliger et al., 2010; Kumar et al., 2014).
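The selection step of virtual screening, in which the poses with the lowest predicted free energy of binding are retained, can be sketched in a few lines of Python. The sketch assumes that each ligand has been docked with AutoDock Vina and that the output text contains one `REMARK VINA RESULT:` line per pose, with the predicted affinity (kcal/mol) as the first number. The ligand names and values below are made up for illustration and are not results from this review.

```python
def vina_affinities(pdbqt_text):
    """Return the affinities (kcal/mol) found in 'REMARK VINA RESULT:' lines."""
    scores = []
    for line in pdbqt_text.splitlines():
        if line.startswith("REMARK VINA RESULT:"):
            # Fields: REMARK, VINA, RESULT:, affinity, RMSD l.b., RMSD u.b.
            scores.append(float(line.split()[3]))
    return scores


def rank_ligands(outputs):
    """Sort ligands by their best (most negative) predicted affinity.

    Assumes every output contains at least one result line.
    """
    best = {name: min(vina_affinities(text)) for name, text in outputs.items()}
    return sorted(best.items(), key=lambda item: item[1])


# Made-up output fragments for two hypothetical ligands.
outputs = {
    "ligand_A": "MODEL 1\nREMARK VINA RESULT:      -7.4      0.000      0.000\n"
                "MODEL 2\nREMARK VINA RESULT:      -6.9      1.812      2.305\n",
    "ligand_B": "MODEL 1\nREMARK VINA RESULT:      -9.1      0.000      0.000\n",
}

ranking = rank_ligands(outputs)
for name, score in ranking:
    print(name, score)
```

Ranking on the single best pose per ligand is only one possible design choice; a screening campaign might equally inspect several poses per ligand or rescore the top hits with a second scoring function.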

UCSF DOCK
UCSF DOCK was one of the first molecular docking programs. Initially, it treated both the receptor and the ligand as rigid (Clark and Ajay, 1995). This is the least computationally intensive way of carrying out a docking run, but the results are far from accurate. As computational power increased, flexible ligand docking was therefore incorporated in later versions of DOCK. DOCK works mainly by superimposing the ligand onto a negative image of the binding site of the receptor. It screens large libraries of small molecules that could serve as potential ligands to determine those that fit the binding site best (Kolb et al., 2009). The latest version of DOCK is DOCK 6. A highly useful feature of DOCK 6, especially given the increasing interest in RNA therapeutics, is that it can be used for nucleic acid targets in addition to protein targets (Lang et al., 2009). The versatility of DOCK is demonstrated by the work of Hermann et al., who utilized DOCK for structure based prediction of enzyme function. This involved docking high energy intermediates to the active site of enzymes. However, such applications of DOCK have limitations: enzymes can undergo significant changes during the course of a reaction, and only a limited set of substrates can be considered (Hermann et al., 2007).

HADDOCK 2.2
The HADDOCK 2.2 (High Ambiguity Driven Protein-Protein Docking) web server is meant primarily for protein-protein and protein-peptide docking simulations. It was originally developed for NMR data. It has a large user base in India. As of 2016, the HADDOCK server has had over 6000 users and has completed more than 108,000 runs. More than 120 protein structures calculated using HADDOCK have been deposited in the PDB. The HADDOCK server gives the user access to seven interfaces, which differ in the number of parameters that can be changed. The most basic interface is the 'Easy' interface. The 'Guru' interface is the most advanced, allowing access to all the molecular docking parameters available on the HADDOCK web server. 'Guru' and 'Expert' allow access to advanced features such as the ability to choose which regions of the molecule are to be considered flexible or semi-flexible (Van et al., 2016). Since docking calculations that require advanced features are computationally much more expensive than those that can be carried out with the 'Easy' interface options, access to the advanced interfaces is granted only upon request. Users can send the request to haddocking@gmail.com. The 'Easy' and 'Prediction' interfaces can be used without requesting access. Upon completion of the docking run, the user receives an email containing a link to the results page. The HADDOCK score displayed on the results page takes into account the van der Waals forces, electrostatic forces, desolvation energy, restraint violation energy and buried surface area at the region of interaction between the interacting molecules. Unlike most other docking tools, HADDOCK can also deal with more than two molecules simultaneously in a docking run (Karaca et al., 2010). The multiple molecule docking features are accessed through the 'Multi-body interface' of the HADDOCK web server. Although HADDOCK is primarily used for protein-protein and protein-peptide docking studies, HADDOCK versions 2.0 and onward also support nucleic acid and small molecule docking (Vries et al., 2010).
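A score that combines several energy terms, such as the HADDOCK score described above, is commonly written as a weighted sum. The Python sketch below shows the idea only: the weights are illustrative assumptions and not values confirmed by this review (the actual weights depend on the docking stage and should be taken from the HADDOCK documentation), and the term values are made up.

```python
from dataclasses import dataclass


@dataclass
class EnergyTerms:
    vdw: float        # van der Waals energy
    elec: float       # electrostatic energy
    desolv: float     # desolvation energy
    restraint: float  # restraint violation energy
    bsa: float        # buried surface area


# Illustrative weights (assumption); replace with the documented values.
WEIGHTS = {"vdw": 1.0, "elec": 0.2, "desolv": 1.0, "restraint": 0.1, "bsa": 0.0}


def combined_score(terms, weights=WEIGHTS):
    """Weighted sum of the individual terms; lower scores rank higher."""
    return sum(weight * getattr(terms, name) for name, weight in weights.items())


# Made-up term values for one hypothetical docking cluster.
cluster = EnergyTerms(vdw=-40.0, elec=-200.0, desolv=5.0, restraint=30.0, bsa=1500.0)
score = combined_score(cluster)
print("combined score:", score)
```

Keeping the weights in a single dictionary makes it easy to see how changing the relative importance of one term, for example the electrostatic energy, would reorder the docking solutions.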

PatchDock
The PatchDock web server, which runs the PatchDock algorithm, is useful for protein-small molecule and protein-protein docking (Kumar et al., 2016). It was developed with antibody-antigen and enzyme-inhibitor interactions in mind. The PatchDock algorithm carries out geometry based docking on the basis of shape complementarity and has a relatively short run time: it can complete a docking run between two input proteins (of about 300 amino acids each) in less than 10 minutes on just a 1 GHz processor. The web server serves as an interface for the algorithm. During submission, the user may either upload the files on which the docking run is to be performed in PDB format or enter their PDB codes (Kumar et al., 2015). The results are sent to the user's email account upon completion of the docking run. The top scoring results can also be downloaded as a compressed file via a link on the results page. The FiberDock web server is a useful tool for refining and ranking docking results from PatchDock.

SymmDock
The SymmDock web server uses the SymmDock algorithm to predict the structure of homomultimers with cyclic symmetry. In addition to the PDB file of the molecule of interest, the user has to enter its symmetry order. The user must keep in mind that SymmDock can only predict the quaternary structure of molecules with cyclic symmetry. The appearance of the SymmDock server is similar to that of PatchDock. The results are sent to the user via email (Schneidman et al., 2005).
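The meaning of the symmetry order can be made concrete with a short Python sketch. For cyclic symmetry of order n, the subunits are related by successive rotations of 360/n degrees about a common axis. The code below is not the SymmDock algorithm, which must also find the position of the axis; it only generates the copies for a monomer whose symmetry axis is assumed to be the z axis, using made-up coordinates.

```python
import math


def cyclic_copies(coords, order):
    """Return `order` copies of `coords`, each rotated by 360/order degrees
    about the z axis relative to the previous one (C_n symmetry)."""
    if order < 2:
        raise ValueError("symmetry order must be at least 2")
    assembly = []
    for k in range(order):
        angle = 2.0 * math.pi * k / order
        c, s = math.cos(angle), math.sin(angle)
        assembly.append([(c * x - s * y, s * x + c * y, z) for x, y, z in coords])
    return assembly


# Two made-up atoms of a hypothetical monomer, placed off the symmetry axis.
monomer = [(10.0, 0.0, 0.0), (11.0, 1.0, 2.0)]
tetramer = cyclic_copies(monomer, 4)
for index, subunit in enumerate(tetramer):
    print(index, [tuple(round(v, 3) for v in atom) for atom in subunit])
```

Every copy keeps the same distance from the axis and the same height along it, which is why a single symmetry order is sufficient to describe the whole assembly once one subunit has been placed.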

FLIPDock
FLIPDock (Flexible LIgand Protein Docking) is molecular docking software developed by Yong Zhao. FLIPDock utilizes a Flexibility Tree (FT) data structure in order to reduce the computational cost of using a flexible receptor (Zhao et al., 2005).

SwissDock
The SwissDock server is a web server dedicated to protein-ligand docking simulations. It was developed by the Molecular Modeling group of the Swiss Institute of Bioinformatics and is based on the docking program EADock DSS (Evolutionary Algorithms Dock Dihedral Space Sampling). EADock DSS takes the best features of the highly accurate and flexible EADock 2 while being significantly faster than EADock. The protein and ligand structures to be docked are submitted online. The SwissDock online interface is user friendly, allowing use even by beginners in protein-ligand docking studies. Moreover, since SwissDock is web server based, users do not have to worry about a lack of computational resources, as the docking runs on the SwissDock servers. The results can be viewed at a URL provided upon submission. Alternatively, the user can supply an email address to receive the link to the results upon completion of the docking run. The predicted binding modes for the protein-ligand complex can be viewed online or downloaded as a zip file containing the PDB, DOCK and CHARMM format files. The files uploaded by the user and the results of the docking run are deleted within 4 days (Grosdidier et al., 2011).

ClusPro
The ClusPro web server, as previously mentioned, was the first molecular docking server made available to the scientific community. At the time of submission, the user has to either upload the PDB files of the proteins of interest or enter their PDB codes. The results are sent to the user's email upon completion of the docking run. ClusPro carries out rigid body docking using the Fourier correlation method. ClusPro does not allow the receptor molecule to have more than 11999 atoms and does not permit the ligand to have more than 4700 atoms after energy minimization. To reduce the docking run time, users can use a Perl script, 'block.pl' (available on the ClusPro web server), to restrict binding predictions to residues of interest on the receptor (Katchalski et al., 1992). ClusPro also has symmetry functions which enable prediction of homomultimeric complex structures.
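The idea behind the Fourier correlation method can be illustrated on a toy problem. In the Python sketch below, the receptor and the ligand are placed on a one-dimensional periodic grid, the shape complementarity score is computed for every translation of the ligand, and the same scores are obtained through the correlation theorem using a discrete Fourier transform. This is a deliberately simplified sketch, not the ClusPro implementation: real programs use three-dimensional grids, a fast Fourier transform, rotational sampling and more elaborate scoring, and the grid values here (1 for surface, -15 for interior, 0 for empty space) are assumptions chosen for the example.

```python
import cmath


def dft(values, inverse=False):
    """Naive discrete Fourier transform, O(n^2); real tools use an FFT."""
    n = len(values)
    sign = 1 if inverse else -1
    result = []
    for k in range(n):
        total = sum(values[m] * cmath.exp(sign * 2j * cmath.pi * m * k / n)
                    for m in range(n))
        result.append(total / n if inverse else total)
    return result


def correlation_direct(receptor, ligand):
    """Score of every ligand translation by direct summation."""
    n = len(receptor)
    return [sum(receptor[m] * ligand[(m - shift) % n] for m in range(n))
            for shift in range(n)]


def correlation_fourier(receptor, ligand):
    """Same scores from the correlation theorem: IDFT(DFT(R) * conj(DFT(L)))."""
    r_hat = dft(receptor)
    l_hat = dft(ligand)
    product = [r * l.conjugate() for r, l in zip(r_hat, l_hat)]
    return [value.real for value in dft(product, inverse=True)]


# Toy grids: 1 = surface cell, -15 = interior (overlap penalty), 0 = empty.
receptor = [0, 1, 1, -15, -15, 1, 0, 0]
ligand = [1, 1, 0, 0, 0, 0, 0, 0]  # a two-cell ligand starting at cell 0

direct = correlation_direct(receptor, ligand)
fourier = correlation_fourier(receptor, ligand)
best_shift = max(range(len(fourier)), key=fourier.__getitem__)
print("scores:", direct)
print("best translation:", best_shift)
```

The translation with the highest score places the ligand on receptor surface cells without penetrating the interior. The benefit of the Fourier route appears on large three-dimensional grids, where a fast transform evaluates all translations far more cheaply than direct summation.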

Energy minimization tools
Energy minimization is the optimization of the positions of atoms in a molecule in order to attain the molecular structure with the lowest potential energy. There are several ways of carrying out structural refinement. Although comparative modeling gives a correct backbone, it is inaccurate for side chains and hydrogen bonds (Bhattacharya et al., 2013). Direct protein refinement can be carried out through structural changes at either the global level or the local level. Refinement at the global level is more desirable but significantly more computationally demanding, whereas refinement at the local level does not give satisfactory results for the global structure. Hence, a good refinement tool must achieve a balance between the two (Bhattacharya et al., 2013).

YASARA
YASARA (Yet Another Scientific Artificial Reality Application) is a molecular graphics, modelling and simulation program available on Linux, Windows, OS X and even Android. There are four stages in YASARA: view, model, dynamics and structure. Of these, YASARA View is available free of charge. The remaining three require a license fee, although access to them can also be gained for free by contributing user side developments to the YASARA community. The YASARA minimization server is a web server for energy minimization of protein structures. It is a part of YASARA Structure and performs energy minimization with the YASARA force field (Krieger et al., 2009).
Unlike the three stages of YASARA that require a license fee, the YASARA minimization server is free to use. It takes input in PDB format and emails the results to the user.

KoBaMIN
The KoBaMIN (Knowledge Based MINimization) web server is a freely available protein structure refinement and energy minimization server with a simple, easy to use web interface. It does not require registration. KoBaMIN can also compare the refined structure with a reference structure to determine the accuracy of the refinement. The accuracy of any energy minimization depends on the accuracy of the force field on which the energy minimization tool is based (Rodrigues et al., 2013). Like the YASARA minimization server, KoBaMIN takes input in PDB format. The results are emailed to the user if an email address is supplied during submission. KoBaMIN can also take multiple structures if they are contained in a single zip, tar.gz or tar.bz2 archive. The KoBaMIN workflow involves three main steps: validation of the submitted structure, refinement (energy minimization) of the structure, and retrieval of the server output (Rodrigues et al., 2013).
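A comparison between a refined structure and a reference structure is often summarized by the root mean square deviation (RMSD) of equivalent atoms. The metrics that KoBaMIN itself reports are not detailed here, so the Python sketch below should be read as an assumption about one common choice: it extracts the C-alpha coordinates from PDB-format ATOM records and computes the RMSD, assuming that both structures contain the same atoms in the same order and are already superimposed. The two-residue coordinates are made up, and `pdb_ca_line` is a hypothetical helper used only to build the example input.

```python
import math


def pdb_ca_line(serial, res_name, res_seq, x, y, z):
    """Format one C-alpha ATOM record using the fixed PDB column layout."""
    return ("ATOM  {:5d}  CA  {:>3} A{:4d}    {:8.3f}{:8.3f}{:8.3f}"
            .format(serial, res_name, res_seq, x, y, z))


def ca_coordinates(pdb_text):
    """Extract C-alpha coordinates from the ATOM records of PDB-format text."""
    coords = []
    for line in pdb_text.splitlines():
        if line.startswith("ATOM") and line[12:16].strip() == "CA":
            coords.append((float(line[30:38]), float(line[38:46]), float(line[46:54])))
    return coords


def rmsd(coords_a, coords_b):
    """RMSD between two equally long, pre-superimposed coordinate lists."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("structures must contain the same, non-zero number of atoms")
    total = sum((a - b) ** 2
                for p, q in zip(coords_a, coords_b)
                for a, b in zip(p, q))
    return math.sqrt(total / len(coords_a))


# Made-up two-residue structures; the refined copy is shifted by 1 angstrom in z.
reference_pdb = "\n".join([pdb_ca_line(1, "ALA", 1, 0.0, 0.0, 0.0),
                           pdb_ca_line(2, "GLY", 2, 3.8, 0.0, 0.0)])
refined_pdb = "\n".join([pdb_ca_line(1, "ALA", 1, 0.0, 0.0, 1.0),
                         pdb_ca_line(2, "GLY", 2, 3.8, 0.0, 1.0)])

reference = ca_coordinates(reference_pdb)
refined = ca_coordinates(refined_pdb)
print("C-alpha RMSD:", rmsd(reference, refined))
```

In practice the two structures would first be optimally superimposed, and additional measures of side chain and hydrogen bond quality would complement the backbone RMSD.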

3D refine server
The 3D refine server is another openly accessible energy minimization server. Its web interface is similar to that of KoBaMIN. The 3D refine workflow involves three main steps: validation of the file type of the submitted structure, hydrogen bond optimization, and energy minimization of the optimized protein structure (Bhattacharya and Cheng, 2013). This workflow allows 3D refine to give improved results within a short period of time. 3D refine applies direct refinement to the predicted model.

CONCLUSION
Molecular docking and energy minimization software is an important part of the tool set of a researcher involved in computer aided drug design. These tools are highly useful in predicting the interactions of macromolecules as well as their structures. Over the years, the accuracy, computational efficiency and accessibility of these programs have increased considerably, and several free and open source tools for docking and structure refinement are now available online. They are, however, not without limitations. The main problem with docking tools is that in most cases they perform docking runs in vacuo (in vacuum), leading to results which do not accurately depict in vivo and in vitro conditions. As computational power increases, docking tools have begun to appear which tackle this problem by, for instance, solvated docking. Another issue, the cost of flexible docking, is being tackled by tools such as FLIPDock which simulate ligand and receptor flexibility. Energy minimization servers have allowed even beginners to carry out structural refinement by providing a user friendly interface and automatically setting appropriate values for advanced parameters. Such advances in docking and refinement have led, and will continue to lead, to greater productivity among researchers involved in structure based drug design, and should also enable non-experts to contribute to the advancement of our knowledge of interactomics and drug design. We sincerely hope that this paper will be useful to researchers, particularly those in developing countries, where funding for research is scarce and deadly diseases take the lives of many people. In particular, drug development with the help of computational techniques will be a bold step forward in the drug design process.

Figure 3: Snapshot showing (A) PatchDock web server interface and (B) SymmDock web server interface. Notice the similarity between the interfaces.

Figure 7: Snapshot showing KoBaMIN web server interface

Figure 8: Snapshot showing 3D refine web server interface