The binding of small molecules to proteins is in many cases an essential feature of the mechanisms of their function. Drugs that bind proteins allow us to modify or control these functions. Both for identification of the protein function, and for design of drugs that target proteins, it is useful to have computational methods that identify potential ligands to a protein of known structure. This is true despite the fact that ligand-induced conformational changes may degrade the quality of a ligand prediction, perhaps even fatally.
In the work reported here, we bootstrap our way from pairs of ligands to a combined ligand in which the two are linked. In principle, the Gibbs free energy change of association of the dimeric ligand could approximate the sum of the Gibbs free energy changes of association of the individual ligands, implying that the affinity constant for the dimeric ligand would approximate the product of the affinity constants of the individual ligands, and thereby achieve the goal of producing tighter binding.
Keywords: Drug Design; Protein Structure
The goal is to design, computationally, molecules that bind to specific active sites of proteins. We present a novel method and apply it to the extracellular ligand-binding domain of the receptor tyrosine kinase Eph receptor EphA4 (from wwPDB entry 2WO2). The Ephrin-Eph receptor system has many important roles in development, in both the embryo and the adult. These include but are not limited to angiogenesis, the development of new blood vessels. As tumours require formation of new blood vessels, the Ephrin-Eph receptor system is a target of interest for cancer therapy.
Our approach has been to enhance ligand binding by linking separate ligands into a single combined ligand that would combine the affinities of the separate ligands. For the individual ligands, we used oligopeptides designed to bind EphA4 by Obarska-Kosinska and Tramontano. We describe the method we developed for selecting pairs of peptides from the set they provided, and for designing linkers to create a dimeric ligand while maintaining the structures and interactions of the original individual oligopeptides. We have coordinated the use of several publicly available web servers that do the heavy lifting.
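The reasoning behind combining affinities by linking can be written out explicitly. The following is a sketch of the idealized case only; the symbols are introduced here for illustration and are not taken from the original text: $K_A$, $K_B$ and $K_{AB}$ denote the association constants of the two individual ligands and of the linked ligand, $\Delta G^{\circ}$ the corresponding standard Gibbs free energy changes of association, $R$ the gas constant and $T$ the absolute temperature.

```latex
\begin{align}
\Delta G^{\circ}_{A} &= -RT \ln K_{A}, \qquad
\Delta G^{\circ}_{B} = -RT \ln K_{B}, \\
\Delta G^{\circ}_{AB} &\approx \Delta G^{\circ}_{A} + \Delta G^{\circ}_{B}
  = -RT \ln \left( K_{A} K_{B} \right), \\
K_{AB} &= \exp\!\left( -\frac{\Delta G^{\circ}_{AB}}{RT} \right)
  \approx K_{A} K_{B}.
\end{align}
```

The second line is the assumption that the free energies add; how closely a real linked ligand approaches it depends on how well the linker preserves the binding modes of the two original peptides.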
Selection of pairs of peptides suitable for linking
The starting material was a group of 50 oligopeptides, each designed to bind, individually, to the binding site of EphA4. Not every pair chosen from the 50 is suitable for linking: some clash; for others the distance between chain termini is too large.
We observed two classes of possible linkers. Assuming that each peptide has one end inside the protein and the other end accessible to the outside, a linker between two “inner” ends faces the problem that the linker resides inside the active site and therefore risks prohibitive steric contacts with the protein. Linkers between the two “outer” ends are generally free of this difficulty; however, they tend to be much longer. One of us (C.S.) wrote a program to determine the distances between the terminal Cαs to identify pairs that might be feasible to link. Pairs of oligopeptides thus selected were individually tested for overlapping contacts with the protein.
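A distance screen of this kind might look like the following minimal sketch. It is not the program written by C.S.; the function name, the input layout (a dictionary mapping a peptide name to its Cα coordinates ordered from N- to C-terminus) and the cutoff value are all assumptions made for illustration. Because a continuous linked chain joins the C-terminus of one peptide to the N-terminus of the other, both orders are tested for each pair.

```python
from itertools import combinations
from math import dist

# Hypothetical cutoff in Å: the source does not state the value that was used.
MAX_TERMINUS_DISTANCE = 10.0


def linkable_pairs(peptides, max_distance=MAX_TERMINUS_DISTANCE):
    """Return (first, second, distance) for every ordered pair whose
    C-terminal Calpha (first) lies within max_distance of the
    N-terminal Calpha (second), sorted by increasing distance.

    peptides: dict mapping a name to a list of (x, y, z) Calpha
    coordinates ordered from N-terminus to C-terminus.
    """
    hits = []
    for (name_a, ca_a), (name_b, ca_b) in combinations(peptides.items(), 2):
        # Try both chain orders: a followed by b, and b followed by a.
        candidates = (
            (name_a, name_b, ca_a[-1], ca_b[0]),
            (name_b, name_a, ca_b[-1], ca_a[0]),
        )
        for first, second, c_term, n_term in candidates:
            gap = dist(c_term, n_term)
            if gap <= max_distance:
                hits.append((first, second, gap))
    return sorted(hits, key=lambda hit: hit[2])


if __name__ == "__main__":
    # Toy coordinates, not real peptide data.
    toy = {
        "p1": [(0, 0, 0), (4, 0, 0)],
        "p2": [(8, 0, 0), (12, 0, 0)],
        "p3": [(100, 0, 0), (104, 0, 0)],
    }
    for first, second, gap in linkable_pairs(toy):
        print(f"{first} -> {second}: {gap:.1f} Å")
```

The surviving pairs would still need the separate check for overlapping contacts with the protein described above.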
Finding a linker using SUPER 
Once a pair of oligopeptides was considered possible for linking, we utilized the web server SUPER. SUPER searches the Protein Data Bank for proteins that contain continuous regions that match the structures of the two submitted oligopeptides but “fill in” the gap between them. The user may specify the number of residues in the gap. In some cases we tried to link the same oligopeptides with different gap lengths. SUPER reports the r.m.s.d. of the Cαs of the two oligopeptides with the corresponding regions of the continuous peptide.
SUPER identifies a region in a protein structure without reference to the geometry of the complex. Upon finding a usable match, the identified structure was superimposed onto the pair of peptides, within the active site, and inspected for steric clashes with the enzyme; those that showed clashes were discarded. For the remaining structures, the side chains from the original oligopeptides were then transferred to the corresponding Cαs on the new structure, replacing the side chains of the molecule from the PDB identified by SUPER.
The result was then run through PyRobetta, which docked the final structure into the active site and energy-minimized the entire system. PyRobetta returned multiple results, in some of which the Eph receptor was substantially distorted. The result with minimal distortion (overall r.m.s.d. less than a few tenths of an Å) was run through FireDock [5,6] for further checking and adjustment.
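The text describes the clash check as an inspection. A simple distance screen such as the sketch below is one way a first pass could be automated, and is not necessarily what was done in this work. The function names and the 3.0 Å cutoff are hypothetical; atoms are given as plain (x, y, z) tuples.

```python
from math import dist

# Hypothetical cutoff in Å; the source does not give a clash criterion.
CLASH_CUTOFF = 3.0


def find_clashes(ligand_atoms, protein_atoms, cutoff=CLASH_CUTOFF):
    """Return (ligand_index, protein_index, distance) for every
    ligand/protein atom pair closer than cutoff."""
    clashes = []
    for i, ligand_xyz in enumerate(ligand_atoms):
        for j, protein_xyz in enumerate(protein_atoms):
            separation = dist(ligand_xyz, protein_xyz)
            if separation < cutoff:
                clashes.append((i, j, separation))
    return clashes


def has_clash(ligand_atoms, protein_atoms, cutoff=CLASH_CUTOFF):
    """True if at least one atom pair is closer than cutoff."""
    return bool(find_clashes(ligand_atoms, protein_atoms, cutoff))


if __name__ == "__main__":
    # Toy coordinates, not real structure data.
    ligand = [(0.0, 0.0, 0.0), (10.0, 0.0, 0.0)]
    protein = [(1.0, 0.0, 0.0), (20.0, 0.0, 0.0)]
    print(find_clashes(ligand, protein))
```

The all-pairs loop is quadratic in the number of atoms, which is acceptable for a short peptide against a single domain; a spatial grid would be the usual refinement for larger systems.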
Table 1 reports two results, both containing the EphA4 receptor domain from PDB entry 2WO2, together with peptides (see Figure 1). The r.m.s.d. is the root-mean-square deviation between the Cα atoms of the peptides submitted to SUPER and the corresponding Cα atoms of the region returned by SUPER, not including the linker.
Figure 1. (a) Result 1, based on peptides from PDB entries 2B6N and 1OGO. Ephrin receptor EphA4 from PDB entry 2WO2, black; designed ligand, red.
(b) Result 2, based on peptides from PDB entries 1X1I and 1I82, using a different gap length. Colours as in (a).
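For reference, the r.m.s.d. over matched Cα atoms can be computed as in the sketch below. It assumes the two coordinate sets are already superimposed and listed in the same order, and that the linker residues have been left out of both lists; the function name is illustrative and this is not SUPER's own code.

```python
from math import sqrt


def ca_rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two equally long lists of
    (x, y, z) Calpha coordinates that are already superimposed."""
    if len(coords_a) != len(coords_b):
        raise ValueError("coordinate lists must have the same length")
    if not coords_a:
        raise ValueError("coordinate lists must not be empty")
    squared_sum = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        squared_sum += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return sqrt(squared_sum / len(coords_a))


if __name__ == "__main__":
    # Toy coordinates: peptide Calphas versus the matched region,
    # with linker residues already excluded from both lists.
    peptides = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0)]
    matched = [(0.0, 0.5, 0.0), (3.8, -0.5, 0.0)]
    print(f"{ca_rmsd(peptides, matched):.2f} Å")
```

Because the squared deviations are averaged before the square root is taken, a few poorly matched atoms raise the r.m.s.d. more than they would raise a plain average of the distances.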
We selected and wrote a set of tools that performed reliably towards our objective of predicting a combined ligand to bind to a target site in a protein. Although tested on a particular system, the approach is entirely general and could be applied, without change, to other proteins.
We thank A. Obarska-Kosinska and A. Tramontano for providing the individual peptides, and G. Vriend for help with calculations.