Protein structure space is enormous 1,2. Most linear polymers of amino acids intrinsically form dynamic and flexible 3D structures, which poses one of the conundrums of the present day 3,5. Mapping the variation in structure space will provide insight into the organisation and evolvability of protein fold space 1,4,5.
The local arrangements, or secondary structure elements (SSEs), provide a regular and well-defined pattern in the protein chain 9,3. The arrangement of these patterns has been addressed at the architecture level (in CATH) 7 and at the fold level (in SCOP) 8. However, these definitions are influenced by evolutionary and sequence information, which is biased by the current limits of knowledge 10.
Here, with ProLego, we provide a simple and intuitive way to study protein structure space, using the core definition of "topology" as it is applied to protein structures 1,3. Focusing on secondary structures (alpha helices and strands), we have catalogued the variation in protein structure topology across the current structure space.
Definition of Topology 1,3
Here, topology is defined as the arrangement of secondary structure elements, their spatial contacts, and their relative orientation. This definition helps to address the crucial aspects of local and non-local contacts and their relative positions in the context of the 3D structure.
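To make this definition concrete, the three ingredients (SSE arrangement, spatial contacts, relative orientation) can be sketched as a small data structure. This is only an illustration under stated assumptions, not ProLego's actual representation: the class names SSE, Contact and Topology, the one-letter codes, and the rule that a contact is "local" when the two SSEs are adjacent along the chain are all hypothetical choices made for this example, and the toy data does not come from a real structure.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass(frozen=True)
class SSE:
    """One secondary structure element (hypothetical representation)."""
    index: int  # position along the chain, N- to C-terminus
    kind: str   # "H" for alpha helix, "E" for strand


@dataclass(frozen=True)
class Contact:
    """A spatial contact between two SSEs, with their relative orientation."""
    i: int
    j: int
    orientation: str  # "P" for parallel, "A" for antiparallel


@dataclass
class Topology:
    """Arrangement of SSEs plus their contacts and orientations."""
    elements: List[SSE]
    contacts: List[Contact] = field(default_factory=list)

    def sse_string(self) -> str:
        # Arrangement: SSE types read in chain order.
        ordered = sorted(self.elements, key=lambda e: e.index)
        return "".join(e.kind for e in ordered)

    def local_contacts(self) -> List[Contact]:
        # Assumption: "local" means the SSEs are neighbours along the chain.
        return [c for c in self.contacts if abs(c.i - c.j) == 1]

    def non_local_contacts(self) -> List[Contact]:
        # Assumption: "non-local" means at least one SSE lies between them.
        return [c for c in self.contacts if abs(c.i - c.j) > 1]


# Toy strand-helix-strand arrangement (illustrative values only).
toy = Topology(
    elements=[SSE(0, "E"), SSE(1, "H"), SSE(2, "E")],
    contacts=[Contact(0, 1, "A"), Contact(1, 2, "A"), Contact(0, 2, "P")],
)

print(toy.sse_string())
print(toy.local_contacts())
print(toy.non_local_contacts())
```

In this sketch, two proteins with the same SSE string can still differ in topology if their contact lists or orientations differ, which is the distinction the definition above is meant to capture.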