Proteomics and mass spectroscopy
Written by Eric Watt   
May 08, 2006 at 01:23 AM
New developments in mass spectroscopy have allowed researchers to characterize proteins in a high-throughput fashion, with accuracy many orders of magnitude greater than previously possible.

Over the past decade, genomics has been one of the most popular scientific disciplines in the eyes of the public. Basically, researchers seek to characterize the DNA of a living creature, fully sequencing the genome in order to identify its genes. What seems to be missing from public awareness is that genomics is not an endpoint of scientific inquiry. Knowing the sequence of a genome does not tell us much in and of itself. It is a tool whose results can then be used for further understanding. For example, genome sequencing can help locate a gene associated with breast cancer, allowing doctors to screen for that gene to assess a patient's risk and possibly intervene before cancer starts.

To get a better grasp of this concept, it is necessary to understand the so-called 'central dogma' of molecular biology (Figure 1). Basically, our DNA is transcribed into mRNA, which is then translated into proteins. Proteins are the biomolecules that directly carry out the processes in a living organism. For many applications, understanding the proteins themselves is much more desirable than understanding the DNA that encodes them.

Figure 1: The 'central dogma' of molecular biology. Genomics research has characterized the DNA step in this sequence, while proteomics seeks to characterize the end products themselves.
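The two steps of the central dogma can be sketched in a few lines of Python. This is only a toy illustration, not a bioinformatics tool. The function names, the example sequence, and the deliberately tiny codon table are all assumptions made for this sketch (the real genetic code has 64 codons). It also takes a shortcut: the mRNA is written by copying the coding strand of the DNA with U in place of T, rather than modeling the cell's machinery.

```python
# A deliberately partial codon table (assumption: just enough entries for
# the example below; the full genetic code has 64 codons).
CODON_TABLE = {
    "AUG": "M",  # methionine, also the usual start codon
    "GGU": "G",  # glycine
    "GCU": "A",  # alanine
    "GUU": "V",  # valine
    "UCU": "S",  # serine
    "UAA": None, "UAG": None, "UGA": None,  # stop codons
}


def transcribe(coding_dna):
    """DNA -> mRNA: the mRNA matches the coding strand with U replacing T."""
    return coding_dna.upper().replace("T", "U")


def translate(mrna):
    """mRNA -> protein: read three bases at a time until a stop codon."""
    protein = []
    for i in range(0, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]  # KeyError if codon not in toy table
        if amino_acid is None:  # stop codon ends the protein
            break
        protein.append(amino_acid)
    return "".join(protein)


# Hypothetical five-codon gene, invented for this example.
mrna = transcribe("ATGGGTGCTGTTTAA")
protein = translate(mrna)
```

Here `mrna` is the transcribed message and `protein` is the string of one-letter amino acid codes that a proteomics experiment would ultimately try to detect.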

Just as genomics refers to the characterization of an organism's genome, proteomics refers to the high-throughput study of an organism's proteome. Proteomics is much more ambitious than genomics. Studying the expression, localization, structure, dynamics, and function of every protein in an organism is a huge endeavor. While the genome is relatively stable and essentially the same in every cell, the proteome can vary from cell to cell and tissue to tissue, and can also change over time as protein expression levels change. In addition, characterizing proteins is much more complicated than sequencing DNA. However, because of the wealth of information such a study would provide, the scientific community has put a great deal of effort into developing methods useful in proteomics.

One of the tools used in this characterization is mass spectroscopy (MS). Conceptually, the instrumentation is relatively straightforward. As shown in Figure 2, the design can be broken into three parts. The ionization source charges a molecule, places it into the gas phase, and accelerates it. The mass analyzer typically separates molecules by their mass (or, more accurately, their mass to charge ratio, m/q). The detector then registers the impact of each molecule.

Figure 2: Schematic of a mass spectrometer. The three main components can usually be interchanged, allowing them to be combined in different ways to perform multiple experiments. This also lets us study them separately in this article.

While there are many different methods for achieving each stage, they can usually be mixed and matched, so we will look at each separately. Historically, ionizing biological macromolecules has been rather difficult. Small molecules can readily be placed into the gas phase and ionized by bombardment with electrons, but such hard ionization techniques tend to destroy biomolecules before they can be analyzed. In 2002, the Nobel Prize in Chemistry was awarded to three individuals, two of whom developed methods of ionizing biomolecules and placing them into the gas phase.

The first method is called MALDI (matrix-assisted laser desorption/ionization). In this case, you mix the biomolecules with a large excess of smaller molecules, the matrix. These smaller molecules absorb a high-powered laser pulse. The absorbed energy blasts the biomolecules into the gas phase and ionizes them at the same time. The second method is called ESI (electrospray ionization). While the actual physics behind it is quite complicated and still heavily debated, the basic steps involve dissolving the biomolecules in solution and then forcing the solution through an electrically charged capillary tube. The solution leaves the tip of the tube as a spray of charged droplets, and as the solvent evaporates from these droplets, what remains is a mist of highly charged macromolecules that is drawn into the vacuum of the instrument.

The mass analyzer methods are quite varied. They range from very complicated to very simple, and from very expensive to very cheap. While most are too complicated to be explained here, one of the most common is simple enough to describe. A TOF (time of flight) mass analyzer is simply a long tube under vacuum through which the ions can pass. During the ionization (usually TOF is combined with MALDI) the biomolecules are accelerated through a high voltage. Every ion gains kinetic energy in proportion to its charge, so light, highly charged molecules end up moving faster than heavy, slightly charged ones. The final speed of a molecule is inversely proportional to the square root of its mass to charge ratio, m/q. The mass analyzer section simply gives the molecules enough time and distance to separate as much as possible. The time it takes an ion to reach the detector gives its velocity, which in turn gives its m/q ratio. The longer the distance allowed for TOF, the better the separation.
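The arithmetic behind TOF can be sketched as follows. An ion of charge q accelerated through a voltage V gains kinetic energy qV = ½mv², so its speed is v = √(2qV/m), and its drift time down a tube of length L is L/v. The code below is a minimal sketch of that idealized picture only. It assumes the ion starts at rest and that the acceleration region is short compared to the tube. The function names, the 20 kV voltage, and the 1 m tube are illustrative assumptions, not the specifications of any real instrument.

```python
import math

ELEMENTARY_CHARGE = 1.602176634e-19  # coulombs
DALTON = 1.66053906660e-27           # kilograms per dalton (atomic mass unit)


def flight_time(mass_da, charge, voltage, tube_length):
    """Seconds for an ion to drift down the tube after acceleration.

    mass_da: ion mass in daltons; charge: number of elementary charges;
    voltage: accelerating voltage in volts; tube_length: drift length in meters.
    """
    mass_kg = mass_da * DALTON
    charge_c = charge * ELEMENTARY_CHARGE
    speed = math.sqrt(2 * charge_c * voltage / mass_kg)  # from qV = (1/2) m v^2
    return tube_length / speed


def mass_to_charge(time_s, voltage, tube_length):
    """Invert the measurement: m/q in daltons per elementary charge."""
    speed = tube_length / time_s
    m_over_q_si = 2 * voltage / speed ** 2  # kilograms per coulomb
    return m_over_q_si * ELEMENTARY_CHARGE / DALTON


# Hypothetical instrument settings, chosen only for illustration.
VOLTAGE = 20_000.0  # volts
TUBE = 1.0          # meters

t_light = flight_time(10_000, 1, VOLTAGE, TUBE)
t_heavy = flight_time(40_000, 1, VOLTAGE, TUBE)
```

In this sketch an ion with four times the m/q takes twice as long to arrive, not four times as long, which is the square-root dependence at work. It also shows why a detector that only records arrival times is enough to recover m/q.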

The detector is the portion of MS seeing the least development, as it is typically very good already. For most types of MS, the detector is simply an electron multiplier tube (EMT). An EMT cannot distinguish between molecules of different sizes, which is why the mass analyzer region is necessary. However, its high sensitivity is ideal for measuring small numbers of ions.

Now that we have covered the basics of how MS works, we can talk about how it is used specifically in proteomics. One of the main goals of proteomics is to determine which proteins are present in a particular cell or tissue, and MS can be used to accomplish this in several ways. Using MS, we can determine the masses of the proteins in a given solution. In favorable cases, MS has enough resolving power to take the proteins from a particular tissue and separate them by mass directly. Alternatively, the proteins can first be separated through other biochemical techniques (gel electrophoresis, HPLC) and then analyzed via MS. In either case, after detection of a protein at a given molecular weight, it is desirable to match the protein to the gene from which it was translated. From genomics, we know the DNA sequence of every gene. This can be used to determine the amino acid sequence of the resulting protein, from which the molecular weight can be calculated. Once we calculate the masses of all proteins in the proteome, we compare a measured mass to this list in order to determine which protein we are analyzing (Figure 3).

As you can imagine, this is not an exact process. There are approximately 25,000 genes in the human genome, so there are bound to be different proteins of similar size. In addition, different proteins could have exactly the same amino acid composition, only in a different order. Another concern is that the final protein may have been modified from the original coded by the genome. Gene transcripts can be spliced in multiple ways, mRNA can be edited, and the protein product can be cleaved, glycosylated, phosphorylated, acetylated, or modified in many other ways.

Figure 3: In the above figure, the numbers correspond to the relative weights of different amino acids. The amino acid sequence then determines the protein in question. If the genomics information listed the above four genes, an experimental measurement of '21' would indicate that the protein being studied was A. A result of '23' would indicate that protein B was being studied. Complications arise when multiple proteins have the same weight. For example, a measurement of '26' could indicate that either protein C or D was present.
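The lookup described in the figure can be sketched in Python. The amino acid weights and the four sequences below are hypothetical. They are not the values drawn in Figure 3, only toy numbers chosen so that the totals come out to the same 21, 23, 26 and 26. Real instruments report masses with some error, which is why the sketch accepts a `tolerance`, also an assumption of this example.

```python
# Hypothetical relative weights for four amino acids (toy values).
TOY_WEIGHTS = {"G": 5, "A": 6, "S": 8, "V": 10}

# Hypothetical proteome, as it might be predicted from genomics data.
PROTEOME = {
    "A": "GAV",   # 5 + 6 + 10 = 21
    "B": "GSV",   # 5 + 8 + 10 = 23
    "C": "GVAG",  # 5 + 10 + 6 + 5 = 26
    "D": "GGAV",  # 5 + 5 + 6 + 10 = 26
}


def protein_mass(sequence, weights=TOY_WEIGHTS):
    """Total weight of a protein: the sum of its amino acid weights."""
    return sum(weights[residue] for residue in sequence)


def candidates(measured, proteome=PROTEOME, tolerance=0):
    """Names of all proteins whose predicted mass matches the measurement."""
    return sorted(
        name
        for name, sequence in proteome.items()
        if abs(protein_mass(sequence) - measured) <= tolerance
    )


unique_hit = candidates(21)  # only protein A weighs 21
ambiguous = candidates(26)   # proteins C and D both weigh 26
```

With these toy values, a measurement of 21 or 23 picks out a single protein, while 26 returns both C and D, which have the same amino acids in a different order. Loosening the tolerance makes the ambiguity worse, since a measurement of 22 with a tolerance of 1 already matches both A and B.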

To solve this problem, researchers have devised a rather clever technique. Collision-induced dissociation (CID) involves passing the ionized proteins through a neutral gas such as neon. Upon collision with the gas, a protein shatters and fragments in predictable ways. The breaks occur along the backbone of the protein, at different points in different copies of the molecule. This is illustrated in Figure 4.

Figure 4: Possible fragmentation results from a CID experiment. Jagged lines indicate fragmentation points along the backbone of the amino acid sequence. Products on the right indicate all the results from breaks at these fragmentation sites.

The fragmentation products can then be used to determine the exact amino acid sequence of the protein. So while proteins C and D in Figure 3 may have the same mass, their fragments will be different. C could have a possible fragment of 15, which is not possible for D. While there is likely to be a lot more overlap among actual proteins, there will also be many more fragments. Using these fragments, the protein can be identified. In addition, the locations of post-translational modifications can be determined. This information is often essential for determining protein function, and cannot be determined from genomics information alone.
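A toy version of this disambiguation is sketched below. It assumes each copy of a protein breaks at exactly one backbone position, so the possible fragments are the two pieces on either side of every break. The weights and the sequences for C and D are hypothetical, picked so that both proteins total 26 and only C can yield a fragment of 15. Real fragment ions also carry small mass offsets that this sketch ignores.

```python
# Hypothetical relative weights for three amino acids (toy values).
TOY_WEIGHTS = {"G": 5, "A": 6, "V": 10}

# Two hypothetical proteins with the same composition and total weight (26).
SAME_MASS = {"C": "GVAG", "D": "GGAV"}


def fragment_masses(sequence, weights=TOY_WEIGHTS):
    """Masses of every piece that a single backbone break can produce."""
    masses = [weights[residue] for residue in sequence]
    total = sum(masses)
    pieces = set()
    running = 0
    for mass in masses[:-1]:          # a break after each residue but the last
        running += mass
        pieces.add(running)           # piece from the start up to the break
        pieces.add(total - running)   # piece from the break to the end
    return pieces


def identify(observed, proteome=SAME_MASS):
    """Proteins whose possible fragments include every observed mass."""
    return sorted(
        name
        for name, sequence in proteome.items()
        if set(observed) <= fragment_masses(sequence)
    )


fragments_c = fragment_masses("GVAG")  # pieces of 5, 11, 15 and 21
fragments_d = fragment_masses("GGAV")  # pieces of 5, 10, 16 and 21
match = identify([15])                 # only C can produce a piece of 15
```

Both toy proteins can produce fragments of 5 and 21, so those measurements alone do not separate them. Observing 15 (or 11) singles out C, while 10 or 16 singles out D.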

Recent work has extended these MS methods further. Multiple mass analyzers can be chained together, so that you can select for a certain type of protein, apply CID, and then measure the fragments. Another possibility is to measure the full protein mass and then the fragment masses in a single experiment by detecting in multiple steps. However, the basic concepts remain the same.

Using these techniques, proteins can be identified, characterized, and localized extremely quickly in a high-throughput fashion. Without these methods, a similar proteome characterization would take months or years while being orders of magnitude less accurate. Next week we will look at another technique being developed to aid in proteomics.

Last Updated ( Sep 23, 2007 at 04:33 PM )