Unifying data on protein abundance
October 1st, 2020, by Heidi Tran
Scientific research is about investigating a new question or idea, either through primary research or by collating and analyzing secondary data. A big obstacle for the latter is that different groups often use different methods or units to study and report similar outcomes, which can make it difficult to collate findings, perform analyses, and draw valid conclusions.
A 2017 study by Ho et al. set out to address this issue by unifying findings from studies that had used different methods and units to measure and report protein abundance. The authors found 19 studies that had quantified protein abundance in yeast, a microorganism widely used in biomedical research, both under normal conditions and under stress. They used statistical analyses and data processing methods to convert these disparate results into a single unit of measure – molecules per cell. Their goal was to improve comparability for future analyses, both their own and others’.
Examining 5702 proteins – 97% of the yeast proteome – the researchers found that protein abundance varied according to function, with the most abundant proteins playing roles in cell growth and development. Although abundance ranged from as few as 5 to as many as 1.3 million molecules per cell, two-thirds of the proteins were present at 1000–5000 molecules per cell, which, according to the researchers, suggests “that it is rare for proteins to be present at very high or very low copy numbers”.
The researchers also compared results obtained using mass spectrometry and fluorescence tagging, two methods that differ in sensitivity. They found that the abundance of most proteins (95%) was not affected by the presence of fluorescent tags, a finding that is sure to please the many researchers who use this common method to examine protein distribution and levels.
The researchers confirmed that protein abundance changed (increased or decreased) between normal and stress conditions, and concluded that the unified data will be a “useful resource for further analysis of the dynamic regulation of the proteome”.
This study is an example of how “novel” research can arise from collating the findings of previously published studies. While stand-alone novel research is always needed, studies that combine data on what is already known about a particular subject also play an important role in progressing scientific research.