The language of protein polymers
Proteins are heteropolymers of one or more amino acid residues arranged in a molecularly defined fashion. The precise control of amino acid sequence in protein biosynthesis programs the folding of these heteropolymers into diverse three-dimensional structures. The language of proteins, however, as seen in nature, encompasses limitless amino acid "phrases" (heteropolymers) written in peptide "words" (amino acid motifs) that span the entire structural spectrum from tightly folded to unstructured. Because protein sequences do not always have an obvious syntactic unit (word), herein we focus on protein polymers that repeat one or more syntactic motifs, that is, units with a characteristic fold, biological activity or physical property (e.g., elasticity, phase behavior). We review the biosynthesis and sequence-controlled behavior of protein polymers that altogether span the gap between folded proteins and unstructured polymers. Learning to speak the language of protein polymers promises to merge the science of protein design and the materials science of synthetic polymers. Paradoxically, while protein structure is largely foreign to polymer chemists, the study and synthesis of unstructured, polymer-like proteins have, until recently, been similarly foreign to structural biologists. Interesting possibilities in materials science emerge from acquiring the capacity to read, write and speak the language of protein polymers.