In 2000, Tjalsma et al. coined the term ‘secretome’ in their study of the eubacterium B. subtilis. They defined the secretome as all of the secreted proteins and the secretory machinery of the bacterium. Using a database of B. subtilis protein sequences and an algorithm that searched for the cleavage sites and amino-terminal signal peptides characteristic of secreted proteins, they predicted what fraction of the proteome is secreted by the cell. In 2001, the same lab set a standard for secretomics: predictions based on amino acid sequence alone are not enough to define the secretome. They used two-dimensional gel electrophoresis and mass spectrometry to identify 82 proteins secreted by B. subtilis, only 48 of which had been predicted using the genome-based method of their previous paper. This demonstrated the need to verify predicted findings at the protein level.
As the complicated nature of secretory pathways was revealed – namely that there are many non-classical pathways of secretion and there are many non-secreted proteins that are a part of the classical secretory pathway – a more in-depth definition of the secretome became necessary. In 2010, Agrawal et al. suggested defining the secretome as “the global group of secreted proteins into the extracellular space by a cell, tissue, organ, or organism at any given time and conditions through known and unknown secretory mechanisms involving constitutive and regulated secretory organelles.”
In culture, cells are surrounded by contaminants. Bovine serum from cell culture media and cellular debris can contaminate the collection of secreted proteins used for analysis. Bovine contaminants present a particular challenge because the protein sequences of many bovine extracellular proteins, like fibronectin and fibulin-1, are similar to the human protein sequences. To remove these contaminants, cells can be washed with PBS or serum-free medium (SFM) before incubating in SFM and collecting secreted proteins. Care must be taken not to burst cells, releasing intracellular proteins. In addition, incubation time and conditions must be optimized so that the metabolic stress that can be induced by the lack of nutrients in SFM does not affect secretomic analysis.
Some proteins are secreted in low abundance and then diluted further in the cell culture medium or body fluid, making them difficult to detect and analyze. Concentration methods such as trichloroacetic acid (TCA) precipitation can be used, as can highly sensitive detection methods such as antibody microarrays, which can detect even single molecules of a protein.
Many secretomic studies are conducted in vitro with cell culture methods, but it is unclear whether the same proteins are secreted in vivo. More and more studies, especially those looking at the cancer secretome, are using in vivo methods to confirm the relevance of the results obtained in vitro. For example, proximal biological fluids can be collected adjacent to a tumor in order to conduct a secretomic analysis.
Many secreted proteins have an N-terminal peptide sequence that directs the translated protein into the endoplasmic reticulum (ER), where it undergoes the processing that ultimately leads to secretion. The presence of these signal peptides can be used to predict the secretome of a cell. Software such as SignalP can identify signal sequences (and their cleavage sites) to predict which proteins are secreted. Since transmembrane proteins are also processed in the ER but are not secreted, software such as the TMHMM server is used to predict transmembrane domains and thereby eliminate false positives. Some secretory proteins do not have classical signal peptide sequences. These ‘leaderless secretory proteins’ (LSPs) will be missed by SignalP. SecretomeP is software developed to predict these non-classical secretory proteins from their sequences. Genome-wide secretomes have been predicted for a wide range of organisms, including human, mouse, zebrafish, and hundreds of bacteria.
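The filtering logic described above can be sketched as a small classifier. This is a minimal sketch, assuming the per-protein predictions have already been obtained from tools such as SignalP, TMHMM, and SecretomeP; the record fields, the function name, and the 0.5 score cutoff are hypothetical placeholders, not the actual output formats or recommended thresholds of those tools.

```python
from dataclasses import dataclass


@dataclass
class Prediction:
    """Hypothetical per-protein summary of upstream prediction results."""
    protein_id: str
    has_signal_peptide: bool     # e.g. from a signal-peptide predictor
    transmembrane_helices: int   # e.g. from a transmembrane-domain predictor
    nonclassical_score: float    # e.g. from a non-classical secretion predictor


def classify(p: Prediction, nonclassical_cutoff: float = 0.5) -> str:
    """Assign a coarse secretion category; the cutoff is illustrative only."""
    if p.has_signal_peptide:
        # A signal peptide plus membrane-spanning helices suggests the
        # protein enters the ER but is retained in a membrane.
        if p.transmembrane_helices > 0:
            return "membrane"
        return "classical"
    # No signal peptide: fall back on the non-classical secretion score.
    if p.nonclassical_score >= nonclassical_cutoff:
        return "non-classical candidate"
    return "not predicted secreted"


if __name__ == "__main__":
    # Made-up example records, not real predictions.
    examples = [
        Prediction("protein_1", True, 0, 0.2),
        Prediction("protein_2", True, 3, 0.1),
        Prediction("protein_3", False, 0, 0.8),
    ]
    for p in examples:
        print(p.protein_id, classify(p))
```

A real pipeline would likely need more care than this. For instance, a signal peptide may itself be predicted as a transmembrane helix, so that case is often handled separately.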
Genome-wide prediction methods have a variety of problems. There is a high possibility of false positives and false negatives. In addition, gene expression is heavily influenced by environmental conditions, meaning a secretome predicted from the genome or a cDNA library is not likely to match completely with the true secretome. Proteomic approaches are necessary to validate any predicted secreted proteins.
Several genome-wide secretome databases and knowledgebases are available, built from both curation and computational prediction. These include the fungal secretome database (FSD), the fungal secretome knowledgebase (FunSecKB and FunSecKB2), PlantSecKB, and the lactic acid bacterial secretome database. The human and animal protein subcellular location database (MetazSecKB) and the protist subcellular proteome database (ProtSecKB) have also been released. Although the computational predictions contain some inaccuracies, these databases are useful resources for further characterizing protein subcellular locations.
Mass spectrometry analysis is integral to secretomics. Serum or supernatant containing secreted proteins is collected, and the proteins are separated by 2D gel electrophoresis or chromatographic methods and digested with a protease. The peptides from each individual protein are then analyzed by mass spectrometry, and the resulting peptide-mass fingerprint can be searched against a database to identify the protein.
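The idea of matching a peptide-mass fingerprint against a database can be illustrated with a toy example. This is a minimal sketch, assuming trypsin as the protease (cleaving after K or R unless the next residue is P), neutral monoisotopic peptide masses, and a simple count of matched masses as the score; the function names, the 0.5 Da tolerance, and the candidate sequences are hypothetical, and real search engines use considerably more elaborate scoring.

```python
# Monoisotopic residue masses in daltons (standard values).
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.010565  # mass of H2O added to each free peptide


def tryptic_digest(sequence):
    """Split a sequence after K or R, except when followed by P."""
    peptides, start = [], 0
    for i, aa in enumerate(sequence):
        next_aa = sequence[i + 1] if i + 1 < len(sequence) else ""
        if aa in "KR" and next_aa != "P":
            peptides.append(sequence[start:i + 1])
            start = i + 1
    if start < len(sequence):
        peptides.append(sequence[start:])
    return peptides


def peptide_mass(peptide):
    """Neutral monoisotopic mass: sum of residue masses plus one water."""
    return sum(RESIDUE_MASS[aa] for aa in peptide) + WATER


def rank_candidates(observed, candidates, tolerance=0.5):
    """Rank candidate proteins by how many observed masses they explain."""
    scores = {}
    for name, seq in candidates.items():
        theoretical = [peptide_mass(p) for p in tryptic_digest(seq)]
        scores[name] = sum(
            any(abs(obs - theo) <= tolerance for theo in theoretical)
            for obs in observed
        )
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    # Made-up sequences standing in for a protein database.
    database = {"candidate_A": "AKRPGK", "candidate_B": "GGGGK"}
    observed = [peptide_mass("AK"), peptide_mass("RPGK")]
    print(rank_candidates(observed, database))
```

Here the candidate whose theoretical digest explains the most observed masses ranks first. The tolerance is a parameter because mass accuracy varies between instruments.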
Stable isotope labeling by amino acids in cell culture (SILAC) has emerged as an important method in secretomics – it helps to distinguish between secreted proteins and bovine serum contaminants in cell culture. Supernatant from cells grown in normal medium and cells grown in medium with stable-isotope labeled amino acids is mixed in a 1:1 ratio and analyzed by mass spectrometry. Protein contaminants in the serum will only show one peak because they do not have a labeled equivalent. As an example, the SILAC method has been used successfully to distinguish between proteins secreted by human chondrocytes in culture and serum contaminants.
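The single-peak logic can be illustrated with a minimal sketch, assuming light (unlabeled) and heavy (isotope-labeled) peak intensities have already been extracted for each protein. The function name, the 0.1 heavy-fraction threshold, and the example intensities are hypothetical, not taken from any published workflow.

```python
def classify_silac(light, heavy, min_heavy_fraction=0.1):
    """Label a protein by whether it shows a heavy (labeled) peak.

    Proteins made by the cultured cells should appear as a light/heavy
    pair, while serum contaminants have no labeled counterpart.
    The threshold is an illustrative assumption.
    """
    total = light + heavy
    if total <= 0:
        return "not detected"
    heavy_fraction = heavy / total
    if heavy_fraction >= min_heavy_fraction:
        return "cell-derived"
    return "serum contaminant"


if __name__ == "__main__":
    # Made-up intensities: (light, heavy)
    peaks = {
        "protein_1": (1200.0, 1150.0),  # paired peaks
        "protein_2": (900.0, 0.0),      # light peak only
    }
    for name, (light, heavy) in peaks.items():
        print(name, classify_silac(light, heavy))
```

A nonzero threshold is used rather than a strict zero test because measured intensities are rarely exactly zero; a suitable value would have to be chosen for the data at hand.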
An antibody microarray is a highly sensitive and high-throughput method for protein detection that has recently become part of secretomic analysis. Antibodies, or another type of binder molecule, are fixed onto a solid support and a fluorescently labeled protein mixture is added. Signal intensities are used to identify proteins. Antibody microarrays are extremely versatile – they can be used to analyze the amount of protein in a mixture, different protein isoforms, posttranslational modifications, and the biochemical activity of proteins. In addition, these microarrays are highly sensitive – they can detect single molecules of protein. Antibody microarrays are currently being used mostly to analyze human plasma samples but can also be used for cultured cells and body fluid secretomics, presenting a simple way to look for the presence of many proteins at one time.
Besides being important in normal physiological processes, secreted proteins also have an integral role in tumorigenesis through cell growth, migration, invasion, and angiogenesis, making secretomics an excellent method for the discovery of cancer biomarkers. Using a body fluid or full serum proteomic method to identify biomarkers can be extremely difficult – body fluids are complex and highly variable. Secretomic analysis of cancer cell lines or diseased tissue presents a simpler and more specific alternative for biomarker discovery.
The two main biological sources for cancer secretomics are cancer cell line supernatants and proximal biological fluids, the fluids in contact with a tumor. Cancer cell line supernatant is an attractive source of secreted proteins. There are many standardized cell lines available and supernatant is much simpler to analyze than proximal body fluid. But it is unclear whether a cell line secretome is a good representation of an actual tumor in its specific microenvironment and a standardized cell line is not illustrative of the heterogeneity of a real tumor. Analysis of proximal fluids can give a better idea of a human tumor secretome, but this method also has its drawbacks. Procedures for collecting proximal fluids still need to be standardized and non-malignant controls are needed. In addition, environmental and genetic differences between patients can complicate analysis.
Secretomic analysis has discovered potential new biomarkers in many cancer types, including lung cancer, liver cancer, pancreatic cancer, colorectal cancer, prostate cancer, and breast cancer. Prostate-specific antigen (PSA), the current standard biomarker for prostate cancer, has a low diagnostic specificity: PSA levels cannot always discriminate between aggressive and non-aggressive cancer, and so a better biomarker is greatly needed. Using secretomic analysis of prostate cell lines, one study discovered multiple proteins found at higher levels in the serum of cancer patients than in healthy controls.
There is also a great need for biomarkers for the detection of breast cancer – currently biomarkers only exist for monitoring later stages of cancer. Secretomic analysis of breast cancer cell lines led to the discovery of the protein ALCAM as a new biomarker with promising diagnostic potential.
Analyzing the human embryonic secretome could be helpful in finding a non-invasive method for determining the viability of embryos. In IVF, embryos are assessed on morphological criteria in an attempt to find those with high implantation potential. Finding a more quantitative method of assessment could help reduce the number of embryos used in IVF, thereby reducing higher-order pregnancies. For example, one study developed secretome fingerprints for many blastocysts and found 9 proteins that could distinguish between blastocysts with normal and abnormal numbers of chromosomes. This type of analysis could help replace preimplantation genetic screening (PGS), which involves biopsy of embryonic cells and can be harmful to development.