Prediction of Protein-Ligand Binding Site



Figure 1 Flowchart of the proposed system.
MATERIALS AND METHODS
     The proposed algorithm is composed of five major steps (see Figure 1), a PDB identifier or a PDB file can be imported into the system. Multiple chains or specific range of a protein structure can be assigned . Each step is briefly introduced as follows.

Grid-based Protein Structure Construction
     The retrieved file should be stored in the PDB format . In this proposed system, the coordinates of atoms and corresponding van der Waals radii are transformed into corresponding volume pixels (voxels) within a grid structure. After discretization processes, the query protein can be represented as a set of discrete voxels which are categorized to inside, outside and surface portions of the query protein respectively.

Solid Angle Computation
     For each surface voxel within a protein, the proposed system computes corresponding solid angle by using the following formula:
(1)
      In this step, the recommended radius of the sphere by Connolly is defined as 6 A for all surface voxels. The proposed system employed CUDA codes to enhance the computational performance.


Surface Anchor Residues and Clustering

     Since we are looking for binding cavities from the query protein, only surface voxels with values of solid angles ranked in the top 20% were clustered into representative groups. Two surface voxels were clustered into the same group when they were neighboring voxels and located within a threshold distance (8 A) and both voxels possess solid angles at similar level. A voxel with the largest solid angle within a clustered group was considered as a feature point and claimed as an anchor of the group.


Figure 3 The clustered surface points of the query protein structure (PDB ID: 1TPA) after clustering based on solid angles.

Geometric Feature Calculation

     After the assignment of clustered groups and representative anchors, the system calculates extra geometric characteristics for each group. The following sections describe the geometric features in details:

Average Depth of Cavity

     To avoid such high variations of neighboring surface residues within a group, an average depth of a potential cavity was calculated and verified. The average depth was heuristically defined and evaluated according to the following formula:


Figure 4 A simplified example of average depth indicator for an anchor cluster with 6 neighboring surface residues.

Volume of a potential cavity

     The volume of selected cavities provides powerful discrimination between binding and non-binding regions. In this study, the volume indicator of a cluster was obtained by taking the anchor surface residue as a center and formulating a virtual sphere with a radius of 10 A. Those voxels located within the virtual sphere but not inside the query protein were then evaluated by taking 7 directional vectors including the edge and diagonal vectors of a cube. If extending both directions of one of the directional vectors could intersect with the query protein simultaneously, then this directional vector was assigned as the interior directional vector. For each voxel under investigation, if it possesses more than or equal to 4 verified interior vectors, this voxel is defined as part of the volume within the cavity. After examining all voxels in the virtual sphere, total interior voxel counts could provide as the volume value for the cluster..


Figure 5 An example of calculating volume indicator:

     When both geometric features of average depth and volume were obtained, a measuring score combining with linear weighting coefficients was performed for ranking all identified potential binding regions. The formula is written as the following equation:
     Where the RV(p) is the ranking value for anchor residue p; CD(p)avg is the average depth value for anchor residue p; CDmax is the maximum depth for the query protein; CV(p) is the volume for anchor residue p; CVmax is the maximum volumn for the query protein; the sum of both weighting coefficients of w1 and w2 is equal to 1.