*The original version of this article was published by The Conversation-E on December 29th, 2020, and this translation is published with the author's permission. https://theconversation.com/sars-cov-2-su-infectividad-y-ductilidad-son-una-puerta-abierta-a-nuevos-tratamientos-149887
In the SARS-CoV-2 coronavirus, as in all viruses, structural flexibility and plasticity are of special relevance. The way in which SARS-CoV-2 infects our cells is explained by analyzing its molecular structure and, above all, by looking at the ductile regions of the proteins that form it.
The genetic material of viruses, in this case RNA, is packaged with thousands of small nucleoprotein subunits inside a lipid envelope called a capsid which, in the case of coronaviruses, is shaped like a crown. The coronavirus genome is larger than that of other RNA viruses. The SARS-CoV-2 genome encodes fourteen functional proteins.
The S (Spike) protein, the E (Envelope) protein and the M (Membrane) protein, which are located in the capsid, have the structural functions of assembling the virus and recognizing the host cell. The N nucleoprotein, located inside the capsid, interacts with the viral RNA and gives it stability. In addition to these proteins, there are ten other smaller accessory proteins and sixteen more non-structural ones that participate in the replication and transcription of the virus in the host cell.
The set of accessory and non-structural proteins has been called, in a somewhat literary turn of phrase, the 'dark proteome of SARS-CoV-2'. Although less studied, it contains elements that are important for the biology of the virus and its infectivity, as we will see below. Another notable characteristic of SARS-type viruses is that all their proteins tend to bind to genetic material.
The structure of SARS-CoV-2
Since the beginning of the pandemic, the explosion of scientific work on SARS-CoV-2 has rapidly produced numerous results on the structure of its proteins, many of them inspired by what was already known about other related viruses: SARS-CoV (known since 2002) and MERS-CoV (known since 2012).
Since the middle of the 20th century, the structure of proteins has been studied with biophysical techniques based on the interaction of electromagnetic radiation with matter, for example spectroscopic techniques, which cover a range of possibilities as wide as the electromagnetic spectrum. But the success of crystallography and X-ray diffraction studies has put this technique at the top of the ranking for solving the three-dimensional structure of these biomolecules at the atomic level and then inferring their function. Proof of this is that, by 2020, nearly four hundred SARS-CoV-2 structures had been resolved with this technique, according to data from the Protein Data Bank, the repository that records the known three-dimensional structures of proteins and nucleic acids.
Structures resolved by other classical techniques, such as nuclear magnetic resonance (NMR) or neutron diffraction, are in the minority and number fewer than ten. The extraordinary effort of the scientific community in this field during the last year becomes clear when we consider that, in the period 2002-2019, the resolved structures of SARS-type viruses amounted to nearly half as many as those resolved for SARS-CoV-2.
The SARS-CoV-2 S protein is by far the most studied by X-ray crystallography. As many as one hundred and thirty-seven resolved three-dimensional structures have been counted. It has received this extraordinary attention because it is central to infection: it works as the key for the virus's entry into the host cell by attaching to the ACE2 receptor protein, which acts as a lock on the oral mucosa, the main route of entry into the human body. It is therefore decisive in the fusion of the viral membrane with the host cell, which allows the release of the virus genome and causes the infection.
The N nucleoprotein follows, with thirteen structures resolved so far. It participates in the intracellular process of replication and transcription of the virus genome.
Few structures of the 'dark proteome of SARS-CoV-2' have been resolved, partly because of the difficulty of finding homologous proteins in other viruses and of resolving their ductile regions. As an alternative, microscopy techniques are contributing to the knowledge of its role, surely relevant, in the biology of the virus.
Protein ductile regions influence infectivity
Some structural elements of proteins escape the experimental procedures of crystallography and X-ray diffraction. But they are still important, and are often crucial for the protein's function. This lack of structure is a useful feature.
Ductile regions (intrinsically disordered regions, or IDRs) are biologically active and highly dynamic in molecular recognition, in binding to other biomolecules or atoms (DNA, RNA, proteins, sugars, metals) and in the assembly of molecular complexes. They can rapidly adopt interconvertible conformations under different physiological conditions. Thus, the structured elements and the flexible ones complement each other.
Viral proteins contain a large number of ductile regions, and several studies correlate this characteristic with virulence. In SARS-CoV-2, as in all viruses, ductile regions interact with other proteins and with genetic material. For example, the high proportion of flexible regions in the N nucleoprotein allows a close interaction with the viral RNA, with membrane proteins such as glycoprotein M, which is the most abundant in the virus, and with the proteins of the host cell, which makes it multifunctional.
The remaining SARS-CoV-2 proteins, including the S protein, have a moderate or low content of ductile regions, but some of these regions can be crucial in modulating the infection. A recent work published in Nature found up to 332 interactions between SARS-CoV-2 proteins and human proteins. Most of them involve the S protein and the non-structural accessory proteins Nsp7 and Nsp8 of the virus. The flexibility and movement of the S proteins in the capsid envelope (up to about forty copies have been counted) are decisive for recognizing cellular membranes and binding to them. Several works using high-resolution cryo-electron microscopy have shown that the S protein has a continuous and characteristic flexibility, which is what makes this virus different from other coronaviruses.
As suggested above, the ductile regions of the 'dark proteome of SARS-CoV-2' are also relevant. Computational tools provide valuable information on whether or not a protein adopts a well-defined three-dimensional structure, and whether or not a flexible region is involved in molecular recognition. A recent study published in Cellular and Molecular Life Sciences concludes that almost all SARS-CoV-2 proteins have one or more molecular recognition segments. Increased flexibility in specific regions of the proteins correlates with infectivity. In some cases, the predictions associated with these correlations have since been confirmed.
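To give a flavour of how a sequence-based flexibility estimate can work, here is a minimal sketch in Python. It is not the method used in the study cited above, and real predictors are far more sophisticated. It assumes a crude heuristic: counting, in a sliding window, the residues that the literature commonly classes as disorder-promoting (A, R, G, Q, S, P, E, K). The function names, the window size, the threshold and the toy peptide are all hypothetical choices made for illustration.

```python
# Residues commonly classed as disorder-promoting in the literature.
# Used here only as a crude, illustrative heuristic (an assumption,
# not the approach of any specific prediction tool).
DISORDER_PROMOTING = set("ARGQSPEK")


def disorder_fraction_profile(sequence, window=7):
    """Fraction of disorder-promoting residues in each sliding window."""
    sequence = sequence.upper()
    if window < 1 or window > len(sequence):
        raise ValueError("window must be between 1 and the sequence length")
    return [
        sum(res in DISORDER_PROMOTING for res in sequence[i:i + window]) / window
        for i in range(len(sequence) - window + 1)
    ]


def flag_flexible_windows(sequence, window=7, threshold=0.7):
    """0-based start positions of windows whose fraction reaches the threshold."""
    profile = disorder_fraction_profile(sequence, window)
    return [i for i, fraction in enumerate(profile) if fraction >= threshold]


# Hypothetical toy peptide, NOT a SARS-CoV-2 sequence.
toy_peptide = "WFIVLCWFIVSGSRGGSPEKWFIVLC"
candidate_starts = flag_flexible_windows(toy_peptide)
```

Windows flagged in this way are only candidates for flexibility; whether a region is truly ductile, or takes part in molecular recognition, has to be checked with dedicated tools and experiments.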
A recent work published in Nature shows clinical evidence that the D614G mutation in the SARS-CoV-2 S protein, detected in a variant that emerged in Europe in January, increases replication in lung epithelial cells and in primary airway tissues, thereby boosting infectivity. This mutation, a substitution of aspartate by glycine, means a loss of complexity in the sequence and a gain of local flexibility. A strong phenotypic change related to virulence and associated with a glycine mutation has also been described in Mycobacterium tuberculosis.
The relationship between ductility and infectivity is proving useful in the design of new therapies aimed at blocking the entry of the virus into the cell or its replication. Inspired by this knowledge, some vaccine and antiviral drug designs attempt to block specific protein sites to prevent infection. It is an open door to hope.
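One common way to put a number on "sequence complexity" is the Shannon entropy of the residue composition in a short window. The sketch below, in Python, assumes that measure; the article does not say which one the authors had in mind. The six-residue window is a hypothetical toy example, not the real region around position 614 of the S protein, and the helper names are invented for illustration.

```python
import math
from collections import Counter


def shannon_entropy(segment):
    """Shannon entropy (in bits) of the residue composition of a segment."""
    counts = Counter(segment.upper())
    total = sum(counts.values())
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def substitute(sequence, position, new_residue):
    """Return a copy of the sequence with the residue at a 0-based position replaced."""
    return sequence[:position] + new_residue + sequence[position + 1:]


# Hypothetical toy window, NOT the actual S protein sequence.
toy_window = "QGSDGA"

# Replace the aspartate (D) with a glycine (G), as in a D-to-G substitution.
mutated_window = substitute(toy_window, toy_window.index("D"), "G")

entropy_before = shannon_entropy(toy_window)
entropy_after = shannon_entropy(mutated_window)
```

In this toy window the entropy falls after the substitution because glycine is already present, so the composition becomes less diverse. In a real protein the outcome depends on the surrounding residues, and the gain in local flexibility is a separate physical effect that this calculation does not capture.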
Senior Scientist, CSIC Group of Computational and Structural Biology
Estación Experimental de Aula Dei (EEAD-CSIC)
Avda. Montañana 1005, 50059 Zaragoza, Spain