Starting in the early 19th century, it took scientists more than 100 years to identify the 20 amino acids that make up proteins. However, knowing only the primary structure of a protein, that is, its amino acid sequence, is not enough, because a protein's biological function is determined by its three-dimensional structure at the atomic level. Very few probes can locate individual atoms, directly or indirectly: only high-energy radiation such as X-rays, or electrons, have wavelengths comparable to the distances between atoms. For this reason, the mainstream techniques currently used to determine protein structures are X-ray diffraction and cryo-electron microscopy.
Discovery of X-rays and Crystallography
After Röntgen discovered X-rays in 1895, their nature remained unclear. Scientists held two competing views: X-rays were either highly penetrating neutral particles or electromagnetic waves of very short wavelength. In his experiments, Robert Pohl observed that an X-ray beam did widen when passing through a wedge-shaped slit, but the effect was so weak that it was suspected to be an experimental error. Sommerfeld later estimated the wavelength of X-rays to be about 0.5 Å from the patterns in Pohl's photographs. To obtain a clear diffraction image, the spacing between the slits of a grating would have to be on the atomic scale, a precision far beyond the technology of the time.
If the atoms in a crystal are arranged in a regular spatial lattice, as Auguste Bravais's theory described, and if X-rays are electromagnetic waves with a wavelength comparable to the distance between atoms, then a crystal should act as a natural grating that diffracts X-rays. This was the idea the German physicist Max von Laue had in 1912. Soon afterwards, diffraction was observed when X-rays were shone on a copper sulfate crystal in an experiment he designed. In 1913, William Henry Bragg and his son William Lawrence Bragg proposed Bragg's law, which relates the spacing between lattice planes, the X-ray wavelength, and the diffraction angle.
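Bragg's law is usually written as n·λ = 2·d·sin θ, where d is the spacing between lattice planes, λ the wavelength, θ the angle between the incident beam and the planes, and n the diffraction order. The short Python sketch below solves it for θ. The function name and the sample numbers are illustrative assumptions, not values from the experiments described above.

```python
import math

def bragg_angle_deg(d_spacing, wavelength, order=1):
    """Return the angle theta (in degrees) that satisfies n*lambda = 2*d*sin(theta).

    d_spacing and wavelength must use the same length unit (for example, angstroms).
    """
    sin_theta = order * wavelength / (2.0 * d_spacing)
    if not 0.0 < sin_theta <= 1.0:
        # A reflection of this order only exists when n*lambda <= 2*d.
        raise ValueError("no reflection: need 0 < n*lambda <= 2*d")
    return math.degrees(math.asin(sin_theta))

# Illustrative numbers: planes 1 angstrom apart, wavelength 1 angstrom.
# sin(theta) = 0.5, so theta is 30 degrees (up to floating-point rounding).
theta = bragg_angle_deg(d_spacing=1.0, wavelength=1.0)
```

The condition n·λ ≤ 2·d also shows why the wavelength must be comparable to the atomic spacing: if λ is much larger than d, no diffraction angle exists at all.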
X-ray Diffraction for Protein Structure Analysis
The first people to use X-rays to explore protein structure were John Bernal and Dorothy Hodgkin. In 1934, they photographed the diffraction pattern of a pepsin crystal. Because the phases of the diffracted waves could not be determined, the three-dimensional structure could not be computed. Nevertheless, they were the first to show that protein crystals could produce clear X-ray diffraction patterns. This discovery greatly inspired structural biologists and promoted the development of protein crystallography. X-ray diffraction of fibrous proteins also hinted that proteins contain repeating structural units. In 1951, Linus Pauling and Robert Corey proposed the secondary structures of proteins, the α-helix and the β-sheet, based on hydrogen bonding between groups of the peptide backbone.
Direct methods can deduce the missing phases from the diffraction intensities using probability theory, but they are suitable only for small molecules, not for biological macromolecules. To solve this problem for proteins, Max Perutz introduced heavy-atom replacement (isomorphous replacement). The protein crystal is soaked in a heavy-metal solution so that heavy atoms bind to groups carrying lone pairs of electrons, such as -SH, -COOH, -NH₂, and carbonyl groups. Comparing the diffraction pattern of the heavy-atom derivative with that of the original protein then reveals the missing phases, and the electron density is calculated from the phases and the diffraction intensities. Although Perutz developed this method of phase determination, it was his colleague John Kendrew who became the first person to determine the structure of a protein, myoglobin, in 1957. The reason is that myoglobin consists of a single polypeptide chain of 153 amino acids, whereas hemoglobin is composed of four subunits, each a polypeptide of about 150 amino acids. Perutz did not determine the structure of hemoglobin until 1960.
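The idea behind heavy-atom phasing can be sketched for a single reflection. If the derivative crystal is isomorphous with the native one, its structure factor is the vector sum F_PH = F_P + F_H, so |F_PH|² = |F_P|² + |F_H|² + 2·|F_P|·|F_H|·cos(α_P − α_H). The Python sketch below solves this for the native phase α_P. It assumes the heavy-atom contribution F_H is already known as a complex number (in practice it is calculated from the located heavy-atom positions); the function name and the numbers are illustrative, not taken from the original work.

```python
import cmath
import math

def sir_phase_candidates(fp_amp, fph_amp, fh):
    """Return the two possible native phases (radians) for one reflection.

    fp_amp  : measured amplitude |F_P| of the native protein
    fph_amp : measured amplitude |F_PH| of the heavy-atom derivative
    fh      : complex heavy-atom structure factor F_H (assumed known)
    """
    fh_amp = abs(fh)
    alpha_h = cmath.phase(fh)
    cos_delta = (fph_amp**2 - fp_amp**2 - fh_amp**2) / (2.0 * fp_amp * fh_amp)
    # Clamp to [-1, 1] so measurement or rounding noise cannot break acos.
    cos_delta = max(-1.0, min(1.0, cos_delta))
    delta = math.acos(cos_delta)
    return (alpha_h + delta, alpha_h - delta)

# Synthetic check: build a reflection with a known native phase of 0.7 rad,
# then pretend only the amplitudes and F_H are available.
fp_true = cmath.rect(10.0, 0.7)
fh_known = cmath.rect(3.0, 2.0)
fph = fp_true + fh_known
candidates = sir_phase_candidates(abs(fp_true), abs(fph), fh_known)
```

One derivative leaves two candidate phases, symmetric about the heavy-atom phase. This is why a second derivative with heavy atoms at different sites, or additional information such as anomalous scattering, is generally needed to pick the correct one.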
Since then, more and more phasing methods have been developed, and X-ray crystallography has been widely used to determine the structures of biological macromolecules, including lysozyme (1965, David Phillips), the bacterial photosynthetic reaction center (1985, Johann Deisenhofer, Robert Huber, and Hartmut Michel), and ATP synthase (1994, John E. Walker).
Recombinant DNA and High-Throughput Screening for Protein Crystallization
Growing protein crystals usually requires milligram quantities of highly pure sample. Before recombinant DNA technology, scientists favored proteins that are abundant in natural organisms, such as hemoglobin, myoglobin, and collagen. Enzymes and hormones with important physiological functions but very low abundance had to be extracted from massive amounts of slaughterhouse waste, such as hundreds of liters of blood or tons of organs. This process was tedious and still yielded low purity. Recombinant DNA technology greatly simplified the process. The DNA sequence encoding the target protein is inserted into the genetic material of E. coli, and these prokaryotes, which proliferate rapidly under suitable conditions, produce the protein in place of arduous organ grinding. Scientists can thus obtain large amounts of pure protein from which to grow high-quality crystals.
Before the 1990s, there was no systematic approach to screening conditions for growing protein crystals. Researchers tried to obtain crystals by trial and error with various high-salt solutions. Since then, scientists have developed high-throughput screening, in which microplates with hundreds to thousands of wells test random combinations of pH values, precipitants, salt concentrations, and additives. This greatly shortens the time needed to find the optimal crystallization conditions.
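As a rough illustration of how such a screen can be laid out, the Python sketch below enumerates every combination of four factors and randomly assigns a subset to the wells of a 96-well plate. All factor values are hypothetical placeholders chosen for the example and do not represent a recommended or commercial screen.

```python
import itertools
import random

# Hypothetical factor levels, for illustration only.
PH_VALUES = [4.5, 5.5, 6.5, 7.5, 8.5]
PRECIPITANTS = ["PEG 3350", "PEG 8000", "ammonium sulfate", "MPD"]
SALT_MM = [0, 100, 200, 500]
ADDITIVES = ["none", "glycerol", "MgCl2"]

def build_plate(n_wells=96, seed=0):
    """Pick n_wells distinct random conditions and number the wells from 1."""
    all_conditions = list(
        itertools.product(PH_VALUES, PRECIPITANTS, SALT_MM, ADDITIVES)
    )
    rng = random.Random(seed)  # fixed seed makes the layout reproducible
    chosen = rng.sample(all_conditions, n_wells)  # sampling without replacement
    return [
        {"well": i + 1, "ph": ph, "precipitant": prec, "salt_mM": salt, "additive": add}
        for i, (ph, prec, salt, add) in enumerate(chosen)
    ]

plate = build_plate()
```

With these placeholder levels there are 5 × 4 × 4 × 3 = 240 possible conditions, so one 96-well plate covers a random subset of the search space in a single experiment.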
Computers and Software Aid Protein Structure Analysis
Going from 2D X-ray diffraction patterns to a 3D protein conformation requires massive mathematical computation, including iteration, matrix operations, integration, and Fourier transforms. Phase determination, electron density map calculation, and model refinement are especially demanding of computing resources, so doing the calculation by hand is impractical. Fortunately, the Fast Fourier Transform and high-speed computers (multi-core processors and GPUs) have greatly reduced computation time. Modern software automatically processes the diffraction data and amino acid sequence information to carry out phase determination, model building, and refinement. In addition, graphical interfaces present the three-dimensional structure of a protein directly to researchers, helping them better understand its function.
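The role of the Fourier transform can be shown with a one-dimensional toy model. The Python sketch below places two point "atoms" in a unit cell, computes their structure factors F(h), and rebuilds the density as ρ(x) = Σ F(h)·exp(−2πi·h·x). It then repeats the synthesis with the phases discarded to show what is lost. The atom positions, weights, and resolution cutoff are assumptions made for the example, and a plain summation is used instead of a Fast Fourier Transform to keep the code short.

```python
import cmath
import math

def structure_factors(atoms, h_max):
    """F(h) = sum_j f_j * exp(2*pi*i*h*x_j) for h = -h_max .. h_max."""
    return {
        h: sum(f * cmath.exp(2j * math.pi * h * x) for x, f in atoms)
        for h in range(-h_max, h_max + 1)
    }

def density(factors, n_points):
    """rho(x) = sum_h F(h) * exp(-2*pi*i*h*x), sampled at x = k / n_points."""
    rho = []
    for k in range(n_points):
        x = k / n_points
        total = sum(F * cmath.exp(-2j * math.pi * h * x) for h, F in factors.items())
        rho.append(total.real)  # imaginary parts cancel because F(-h) = conj(F(h))
    return rho

def peak_index(rho):
    """Index of the grid point with the highest density."""
    return max(range(len(rho)), key=rho.__getitem__)

# Toy cell: (fractional position, scattering weight) for two atoms.
atoms = [(0.20, 6.0), (0.55, 8.0)]
factors = structure_factors(atoms, h_max=10)

rho_true = density(factors, 100)                       # amplitudes and phases
amplitudes_only = {h: abs(F) for h, F in factors.items()}
rho_no_phase = density(amplitudes_only, 100)           # phases set to zero
```

With the correct phases, the highest peak of `rho_true` falls on the heavier atom at x = 0.55. With the phases discarded, the highest peak of `rho_no_phase` moves to the origin and the atomic positions can no longer be read off the map, which is the phase problem in miniature.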
From its birth until around 2010, X-ray crystallography was the mainstream method for determining protein structures. Today, cryo-electron microscopy has achieved atomic-level resolution, and it has significant advantages in studying dynamic proteins and flexible regions. As a result, cryo-electron microscopy is gradually replacing X-ray crystallography.