Professor Jyoti Choudhary
Group Leader: Functional Proteomics
OrcID: 0000-0003-0881-5477
Phone: +44(0)20 7153 5253
Email: [email protected]
Also on: @proteomics_ICR
Location: Chelsea
Biography
Professor Jyoti Choudhary's research focuses on understanding how the organisation and dynamics of protein networks underpin cancer progression and resistance. She also investigates the impact of mutations and genetic variation on the proteome and protein attributes using quantitative mass spectrometry. Toward this goal, her research group develops novel experimental and computational proteomics and proteogenomic techniques.
She received her PhD in biological mass spectrometry under the guidance of Professor Howard Morris FRS at Imperial College London. She joined the Bioanalytical Sciences division at GlaxoWellcome as a senior scientist and was subsequently selected as a group leader in the CellMap project, a technology incubator unit founded to pursue the development of cutting-edge proteomics technologies for drug discovery applications.
This highly successful and reputed unit was spun out of GlaxoSmithKline (GSK), and she became a founding member of Cellzome AG. This leading biotech company developed and applied interaction and chemical proteomics technology for target identification and drug development.
She joined the Wellcome Trust Sanger Institute in 2004 to pursue her research in interaction proteomics and proteogenomics. Her group implemented endogenous gene tagging, eTAP, in mouse stem cells and applied it to systematically study native protein complexes in a range of cell types and tissues. They further developed open-access computational tools and data analysis pipelines to integrate proteomics data with genomics. These were then used to refine annotation of the mammalian and malaria genomes, as well as to study the impact of genetic variation on protein networks.
Moving to the ICR in 2017, Professor Choudhary is currently Head of the Proteomics Core Facility and Career Faculty Leader, where she leads a group applying leading-edge proteomics and proteogenomics technologies to further cancer research. Additionally, she is a visiting scientist at Imperial College London and at the Wellcome Trust Sanger Institute.
Professor Choudhary is a member of the Cancer Research UK Convergence Science Centre, which brings together leading researchers in engineering, physical sciences, life sciences and medicine to develop innovative ways to address challenges in cancer.
BSc(Hons) Biochemistry, Imperial College.
PhD Biochemistry, Imperial College.
Editorial Boards
Molecular and Cellular Proteomics, 2006.
Multiuser Grants, Member, Wellcome Trust, 2019-2022.
Types of Publications
Journal articles
Faithful chromosome segregation during mitosis depends on the spindle assembly checkpoint (SAC), which monitors kinetochore attachment to the mitotic spindle. Unattached kinetochores generate mitotic checkpoint protein complexes (MCCs) that bind and inhibit the anaphase-promoting complex, or cyclosome (APC/C). How the SAC proficiently inhibits the APC/C but still allows its rapid activation when the last kinetochore attaches to the spindle is important for the understanding of how cells maintain genomic stability. We show that the APC/C subunit APC15 is required for the turnover of the APC/C co-activator CDC20 and release of MCCs during SAC signalling but not for APC/C activity per se. In the absence of APC15, MCCs and ubiquitylated CDC20 remain 'locked' onto the APC/C, which prevents the ubiquitylation and degradation of cyclin B1 when the SAC is satisfied. We conclude that APC15 mediates the constant turnover of CDC20 and MCCs on the APC/C to allow the SAC to respond to the attachment state of kinetochores.
Cyclin-dependent kinases comprise the conserved machinery that drives progress through the cell cycle, but how they do this in mammalian cells is still unclear. To identify the mechanisms by which cyclin-cdks control the cell cycle, we performed a time-resolved analysis of the in vivo interactors of cyclins E1, A2, and B1 by quantitative mass spectrometry. This global analysis of context-dependent protein interactions reveals the temporal dynamics of cyclin function in which networks of cyclin-cdk interactions vary according to the type of cyclin and cell-cycle stage. Our results explain the temporal specificity of the cell-cycle machinery, thereby providing a biochemical mechanism for the genetic requirement for multiple cyclins in vivo and reveal how the actions of specific cyclins are coordinated to control the cell cycle. Furthermore, we identify key substrates (Wee1 and c15orf42/Sld3) that reveal how cyclin A is able to promote both DNA replication and mitosis.
Genetic screens in simple model organisms have identified many of the key components of the conserved signal transduction pathways that are oncogenic when misregulated. Here, we identify H37N21.1 as a gene that regulates vulval induction in let-60(n1046gf), a strain with a gain-of-function mutation in the Caenorhabditis elegans Ras orthologue, and show that somatic deletion of Nrbp1, the mouse orthologue of this gene, results in an intestinal progenitor cell phenotype that leads to profound changes in the proliferation and differentiation of all intestinal cell lineages. We show that Nrbp1 interacts with key components of the ubiquitination machinery and that loss of Nrbp1 in the intestine results in the accumulation of Sall4, a key mediator of stem cell fate, and of Tsc22d2. We also reveal that somatic loss of Nrbp1 results in tumourigenesis, with haematological and intestinal tumours predominating, and that nuclear receptor binding protein 1 (NRBP1) is downregulated in a range of human tumours, where low expression correlates with a poor prognosis. Thus, NRBP1 is a conserved regulator of cell fate that plays an important role in tumour suppression.
Malaria represents a major global health issue, and the identification of new intervention targets remains an urgent priority. This search is hampered by more than one-third of the genes of malaria-causing Plasmodium parasites being uncharacterized. We report a large-scale protein interaction network in Plasmodium schizonts, generated by combining blue native-polyacrylamide electrophoresis with quantitative mass spectrometry and machine learning. This integrative approach, spanning 3 species, identifies >20,000 putative protein interactions, organized into 600 protein clusters. We validate selected interactions, assigning functions in chromatin regulation to previously unannotated proteins and suggesting a role for an EELM2 domain-containing protein and a putative microrchidia protein as mechanistic links between AP2-domain transcription factors and epigenetic regulation. Our interactome represents a high-confidence map of the native organization of core cellular processes in Plasmodium parasites. The network reveals putative functions for uncharacterized proteins, provides mechanistic and structural insight, and uncovers potential alternative therapeutic targets.
The midbody is an organelle assembled at the intercellular bridge between the two daughter cells at the end of mitosis. It controls the final separation of the daughter cells and has been implicated in cell fate, polarity, tissue organization, and cilium and lumen formation. Here, we report the characterization of the intricate midbody protein-protein interaction network (interactome), which identifies many previously unknown interactions and provides an extremely valuable resource for dissecting the multiple roles of the midbody. Initial analysis of this interactome revealed that PP1β-MYPT1 phosphatase regulates microtubule dynamics in late cytokinesis and de-phosphorylates the kinesin component MKLP1/KIF23 of the centralspindlin complex. This de-phosphorylation antagonizes Aurora B kinase to modify the functions and interactions of centralspindlin in late cytokinesis. Our findings expand the repertoire of PP1 functions during mitosis and indicate that spatiotemporal changes in the distribution of kinases and counteracting phosphatases finely tune the activity of cytokinesis proteins.
In recent years, the remarkable molecular complexity of synapses has been revealed, with over 1,000 proteins identified in the synapse proteome. Although it is known that different receptors and other synaptic proteins are present in different types of neurons, the extent of synapse diversity across the brain is largely unknown. This is mainly due to the limitations of current techniques. Here, we report an efficient method for the purification of synaptic protein complexes, fusing a high-affinity tag to endogenous PSD95 in specific cell types. We also developed a strategy that enables the visualisation of endogenous PSD95 with a fluorescent-protein tag in Cre-recombinase-expressing cells. We demonstrate the feasibility of proteomic analysis of synaptic protein complexes and visualisation of these in specific cell types. We find that the composition of PSD95 complexes purified from specific cell types differs from those extracted from tissues with diverse cellular composition. The results suggest that there might be differential interactions in the PSD95 complexes in different brain regions. We have detected differentially interacting proteins by comparing data sets from the whole hippocampus and the CA3 subfield of the hippocampus. Therefore, these novel conditional PSD95 tagging lines will not only serve as powerful tools for precisely dissecting synapse diversity in specific brain regions and subsets of neuronal cells, but also provide an opportunity to better understand brain region- and cell-type-specific alterations associated with various psychiatric/neurological diseases. These newly developed conditional gene tagging methods can be applied to many different synaptic proteins and will facilitate research on the molecular complexity of synapses.
The identification of bona fide protein-protein interactions and the mapping of proteomes was greatly enhanced by protein tagging for generic affinity purification methods and analysis by mass spectrometry (AP-MS). The high quality of AP-MS data permitted the development of proteomic navigation by sequential tagging of identified interactions. However, AP-MS is laborious and limited to relatively high affinity protein-protein interactions. Proximity labeling, first with the biotin ligase BirA, termed BioID, and then with ascorbate peroxidase, termed APEX, permits a greater reach into the proteome than AP-MS, enabling the identification of both a wider field and weaker protein-protein interactions. This additional reach comes with the need for stringent controls. Proximity labeling also permits experiments in living cells, allowing spatiotemporal investigations of the proteome. Here we discuss proximity labeling with accompanying methodological descriptions for E. coli and mammalian cells.
The discovery of a Salmonella-targeting phage from the waterways of the United Kingdom provided an opportunity to address the mechanism by which Chi-like bacteriophage (phage) engages with bacterial flagellae. The long tail fibre seen on Chi-like phages has been proposed to assist the phage particle in docking to a host cell flagellum, but the identity of the protein that generates this fibre was unknown. We present the results from genome sequencing of this phage, YSD1, confirming its close relationship to the original Chi phage and suggesting candidate proteins to form the tail structure. Immunogold labelling in electron micrographs revealed that YSD1_22 forms the main shaft of the tail tube, while YSD1_25 forms the distal part contributing to the tail spike complex. The long curling tail fibre is formed by the protein YSD1_29, and treatment of phage with the antibodies that bind YSD1_29 inhibits phage infection of Salmonella. The host range for YSD1 across Salmonella serovars is broad, but not comprehensive, being limited by antigenic features of the flagellin subunits that make up the Salmonella flagellum, with which YSD1_29 engages to initiate infection.
Covalent modifications of proteins with ubiquitin and ubiquitin-like molecules are instrumental to many biological processes. However, identifying the E3 ligase responsible for these modifications remains a major bottleneck in ubiquitin research. Here, we present an E2-thioester-driven identification (E2~dID) method for the targeted identification of substrates of specific E2 and E3 enzyme pairs. E2~dID exploits the central position of E2-conjugating enzymes in the ubiquitination cascade and provides in vitro generated biotinylated E2~ubiquitin thioester conjugates as the sole source for ubiquitination in extracts. This enables purification and mass spectrometry-based identification of modified proteins under stringent conditions independently of the biological source of the extract. We demonstrate the sensitivity and specificity of E2~dID by identifying and validating substrates of APC/C in human cells. Finally, we perform E2~dID with SUMO in S. cerevisiae, showing that this approach can be easily adapted to other ubiquitin-like modifiers and experimental models.
BAF complexes are composed of different subunits with varying functional and developmental roles, although many subunits have not been examined in depth. Here we show that the Baf45 subunit Dpf2 maintains pluripotency and ESC differentiation potential. Dpf2 co-occupies enhancers with Oct4, Sox2, p300, and the BAF subunit Brg1, and deleting Dpf2 perturbs ESC self-renewal, induces repression of Tbx3, and impairs mesendodermal differentiation without dramatically altering Brg1 localization. Mesendodermal differentiation can be rescued by restoring Tbx3 expression, whose distal enhancer is positively regulated by Dpf2-dependent H3K27ac maintenance and recruitment of pluripotency TFs and Brg1. In contrast, the PRC2 subunit Eed binds an intragenic Tbx3 enhancer to oppose Dpf2-dependent Tbx3 expression and mesendodermal differentiation. The PRC2 subunit Ezh2 likewise opposes Dpf2-dependent differentiation through a distinct mechanism involving Nanog repression. Together, these findings delineate distinct mechanistic roles for specific BAF and PRC2 subunits during ESC differentiation.
Ubiquitin-conjugating enzymes (E2s) govern key aspects of ubiquitin signaling. Emerging evidence suggests that the activities of E2s are modulated by posttranslational modifications; the structural underpinnings, however, are largely unclear. Here, we unravel the structural basis and mechanistic consequences of a conserved autoubiquitination event near the catalytic center of E2s, using the human anaphase-promoting complex/cyclosome-associated UBE2S as a model system. Crystal structures we determined of the catalytic ubiquitin carrier protein domain combined with MD simulations reveal that the active-site region is malleable, which permits an adjacent ubiquitin acceptor site, Lys+5, to be ubiquitinated intramolecularly. We demonstrate by NMR that the Lys+5-linked ubiquitin inhibits UBE2S by obstructing its reloading with ubiquitin. By immunoprecipitation, quantitative mass spectrometry, and siRNA-and-rescue experiments we show that Lys+5 ubiquitination of UBE2S decreases during mitotic exit but does not influence proteasomal turnover of this E2. These findings suggest that UBE2S activity underlies inherent regulation during the cell cycle.
Oncogene-induced replication stress (RS) promotes cancer development but also impedes tumor growth by activating anti-cancer barriers. To determine how cancer cells adapt to RS, we have monitored the expression of different components of the ATR-CHK1 pathway in primary tumor samples. We show that unlike upstream components of the pathway, the checkpoint mediators Claspin and Timeless are overexpressed in a coordinated manner. Remarkably, reducing the levels of Claspin and Timeless in HCT116 cells to pretumoral levels impeded fork progression without affecting checkpoint signaling. These data indicate that high levels of Claspin and Timeless increase RS tolerance by protecting replication forks in cancer cells. Moreover, we report that primary fibroblasts adapt to oncogene-induced RS by spontaneously overexpressing Claspin and Timeless, independently of ATR signaling. Altogether, these data indicate that enhanced levels of Claspin and Timeless represent a gain of function that protects cancer cells from oncogene-induced RS in a checkpoint-independent manner.
The most widely appreciated role of DNA is to encode protein, yet the exact portion of the human genome that is translated remains to be ascertained. We previously developed PhyloCSF, a widely used tool to identify evolutionary signatures of protein-coding regions using multispecies genome alignments. Here, we present the first whole-genome PhyloCSF prediction tracks for human, mouse, chicken, fly, worm, and mosquito. We develop a workflow that uses machine learning to predict novel conserved protein-coding regions and efficiently guide their manual curation. We analyze more than 1000 high-scoring human PhyloCSF regions and confidently add 144 conserved protein-coding genes to the GENCODE gene set, as well as additional coding regions within 236 previously annotated protein-coding genes, and 169 pseudogenes, most of them disabled after primates diverged. The majority of these represent new discoveries, including 70 previously undetected protein-coding genes. The novel coding genes are additionally supported by single-nucleotide variant evidence indicative of continued purifying selection in the human lineage, coding-exon splicing evidence from new GENCODE transcripts using next-generation transcriptomic data sets, and mass spectrometry evidence of translation for several new genes. Our discoveries required simultaneous comparative annotation of other vertebrate genomes, which we show is essential to remove spurious ORFs and to distinguish coding from pseudogene regions. Our new coding regions help elucidate disease-associated regions by revealing that 118 GWAS variants previously thought to be noncoding are in fact protein altering. Altogether, our PhyloCSF data sets and algorithms will help researchers seeking to interpret these genomes, while our new annotations present exciting loci for further experimental characterization.
The mouse pathogen Citrobacter rodentium is used to model infections with enterohaemorrhagic and enteropathogenic Escherichia coli (EHEC and EPEC). Pathogenesis is commonly modelled in mice developing mild disease (e.g., C57BL/6). However, little is known about host responses in mice exhibiting severe colitis (e.g., C3H/HeN), which arguably provide a more clinically relevant model for human paediatric enteric infection. Infection of C3H/HeN mice with C. rodentium results in rapid colonic colonisation, coinciding with induction of key inflammatory signatures and colonic crypt hyperplasia. Infection also induces dramatic changes to bioenergetics in intestinal epithelial cells, with transition from oxidative phosphorylation (OXPHOS) to aerobic glycolysis and higher abundance of SGLT4, LDHA, and MCT4. Concomitantly, mitochondrial proteins involved in the TCA cycle and OXPHOS were in lower abundance. Similar to observations in C57BL/6 mice, we detected simultaneous activation of cholesterol biogenesis, import, and efflux. Distinctly, however, the pattern recognition receptors NLRP3 and ALPK1 were specifically induced in C3H/HeN. Cell-based assays revealed that C. rodentium activates the ALPK1/TIFA axis, which is dependent on the ADP-heptose biosynthesis pathway but independent of the Type III secretion system. This study reveals for the first time the unfolding intestinal epithelial cells' responses during severe infectious colitis, which resemble EPEC human infections.
Palmitoylation is the post-translational reversible addition of the acyl moiety, palmitate, to cysteine residues of proteins and is involved in regulating protein trafficking, localization, stability and function. The Aspartate-Histidine-Histidine-Cysteine (DHHC) protein family, named for their highly conserved DHHC signature motif, is thought to be responsible for catalysing protein palmitoylation. Palmitoylation is widespread in all eukaryotes, including the malaria parasite, Plasmodium falciparum, where over 400 palmitoylated proteins are present in the asexual intraerythrocytic schizont stage parasites, including proteins involved in key aspects of parasite maturation and development. The P. falciparum genome includes 12 proteins containing the conserved DHHC motif. In this study, we adapted a palmitoyl-transferase activity assay for use with P. falciparum proteins and demonstrated for the first time that P. falciparum DHHC proteins are responsible for the palmitoylation of P. falciparum substrates. This assay also reveals that multiple DHHCs are capable of palmitoylating the same substrate, indicating functional redundancy at least in vitro. To test whether functional redundancy also exists in vivo, we investigated the endogenous localization and essentiality of a subset of schizont-expressed PfDHHC proteins. Individual PfDHHC proteins localized to distinct organelles, including parasite-specific organelles such as the rhoptries and inner membrane complex. Knock-out studies identified individual DHHCs that may be essential for blood-stage growth and others that were functionally redundant in the blood stages but may have functions in other stages of parasite development. Supporting this hypothesis, disruption of PfDHHC9 had no effect on blood-stage growth but reduced the formation of gametocytes, suggesting that this protein could be exploited as a transmission-blocking target. 
The localization and stage-specific expression of the DHHC proteins may be important for regulating their substrate specificity and thus may provide a path for inhibitor development.
Infection with Citrobacter rodentium triggers robust tissue damage repair responses, manifested by secretion of IL-22, in the absence of which mice succumbed to the infection. Among the main hallmarks of C. rodentium infection are colonic crypt hyperplasia (CCH) and dysbiosis. In order to colonize the host and compete with the gut microbiota, C. rodentium employs a type III secretion system (T3SS) that injects effectors into colonic intestinal epithelial cells (IECs). Once injected, the effectors subvert processes involved in innate immune responses, cellular metabolism and oxygenation of the mucosa. Importantly, the identity of the effector/s triggering the tissue repair response is/are unknown. Here we report that the effector EspO, an orthologue of OspE found in Shigella spp., affects proliferation of IECs 8 and 14 days post C. rodentium infection as well as secretion of IL-22 from colonic explants. While we observed no differences in the recruitment of group 3 innate lymphoid cells (ILC3s) and T cells, which are the main sources of IL-22 at the early and late stages of C. rodentium infection respectively, infection with ΔespO was characterized by diminished recruitment of sub-mucosal neutrophils, which coincided with lower abundance of Mmp9 and chemokines (e.g. S100a8/9) in IECs. Moreover, mice infected with ΔespO triggered significantly less nutritional immunity (e.g. calprotectin, Lcn2) and expression of antimicrobial peptides (Reg3β, Reg3γ) compared to mice infected with WT C. rodentium. This overlapped with a decrease in STAT3 phosphorylation in IECs. Importantly, while the reduced CCH and abundance of antimicrobial proteins during ΔespO infection did not affect C. rodentium colonization or the composition of commensal Proteobacteria, they had a subtle consequence on Firmicutes subpopulations.
EspO is the first bacterial virulence factor that affects neutrophil recruitment and secretion of IL-22, as well as expression of antimicrobial and nutritional immunity proteins in IECs.
A GGGGCC hexanucleotide repeat expansion within the C9orf72 gene is the most common genetic cause of both amyotrophic lateral sclerosis and frontotemporal dementia. Sense and antisense repeat-containing transcripts undergo repeat-associated non-AUG-initiated translation to produce five dipeptide proteins (DPRs). The polyGR and polyPR DPRs are extremely toxic when expressed in Drosophila neurons. To determine the mechanism that mediates this toxicity, we purified DPRs from the Drosophila brain and used mass spectrometry to identify the in vivo neuronal DPR interactome. PolyGR and polyPR interact with ribosomal proteins, and inhibit translation in both human iPSC-derived motor neurons, and adult Drosophila neurons. We next performed a screen of 81 translation-associated proteins in GGGGCC repeat-expressing Drosophila to determine whether this translational repression can be overcome and if this impacts neurodegeneration. Expression of the translation initiation factor eIF1A uniquely rescued DPR-induced toxicity in vivo, indicating that restoring translation is a potential therapeutic strategy. These data directly implicate translational repression in C9orf72 repeat-induced neurodegeneration and identify eIF1A as a novel modifier of C9orf72 repeat toxicity.
High Mobility Group B (HMGB) proteins are involved in cancer progression and in cellular responses to platinum compounds used in the chemotherapy of prostate and ovary cancer. Here we use affinity purification coupled to mass spectrometry (MS) and yeast two-hybrid (Y2H) screening to carry out an exhaustive study of HMGB1 and HMGB2 protein interactions in the context of prostate and ovary epithelia. We present a proteomic study of HMGB1 partners based on immunoprecipitation of HMGB1 from a non-cancerous prostate epithelial cell line. In addition, HMGB1 and HMGB2 were used as baits in yeast two-hybrid screening of libraries from prostate and ovary epithelial cell lines as well as from healthy ovary tissue. HMGB1 interacts with many nuclear proteins that control gene expression, but also with proteins that form part of the cytoskeleton, cell-adhesion structures and others involved in intracellular protein translocation, cellular migration, secretion, apoptosis and cell survival. HMGB2 interacts with proteins involved in apoptosis, cell motility and cellular proliferation. High confidence interactors, based on repeated identification in different cell types or in both MS and Y2H approaches, are discussed in relation to cancer. This study represents a useful resource for detailed investigation of the role of HMGB1 in cancer of epithelial origins, as well as potential alternative avenues of therapeutic intervention.
Mutations in NBEAL2, the gene encoding the scaffolding protein Nbeal2, are causal of gray platelet syndrome (GPS), a rare recessive bleeding disorder characterized by platelets lacking α-granules and progressive marrow fibrosis. We present here the interactome of Nbeal2 with additional validation by reverse immunoprecipitation of Dock7, Sec16a, and Vac14 as interactors of Nbeal2. We show that GPS-causing mutations in its BEACH domain have profound and possible effects on the interaction with Dock7 and Vac14, respectively. Proximity ligation assays show that these 2 proteins are physically proximal to Nbeal2 in human megakaryocytes. In addition, we demonstrate that Nbeal2 is primarily localized in the cytoplasm and Dock7 on the membrane of or in α-granules. Interestingly, platelets from GPS cases and Nbeal2-/- mice are almost devoid of Dock7, resulting in a profound dysregulation of its signaling pathway, leading to defective actin polymerization, platelet activation, and shape change. This study shows for the first time proteins interacting with Nbeal2 and points to the dysregulation of the canonical signaling pathway of Dock7 as a possible cause of the aberrant formation of platelets in GPS cases and Nbeal2-deficient mice.
The transcriptional program of early embryonic development is tightly regulated by a set of well-defined transcription factors that suppress premature expression of differentiation genes and sustain the pluripotent identity. It is generally accepted that this program can be perturbed by environmental factors such as chemical pollutants; however, the precise molecular mechanisms remain unknown. The aryl hydrocarbon receptor (AHR) is a widely expressed nuclear receptor that senses environmental stimuli and modulates target gene expression. Here, we have investigated the AHR interactome in embryonic stem cells by mass spectrometry and show that ectopic activation of AHR during early differentiation disrupts the differentiation program via the chromatin remodeling complex NuRD (nucleosome remodeling and deacetylation). The activated AHR/NuRD complex altered the expression of differentiation-specific genes that control the first two developmental decisions without affecting the pluripotency program. These findings identify a mechanism that allows environmental stimuli to disrupt embryonic development through AHR signaling.
Toxoplasma gondii encodes three protein kinase A catalytic (PKAc1-3) and one regulatory (PKAr) subunits to integrate cAMP-dependent signals. Here, we show that inactive PKAc1 is maintained at the parasite pellicle by interacting with acylated PKAr. Either a conditional knockdown of PKAr or the overexpression of PKAc1 blocks parasite division. Conversely, down-regulation of PKAc1 or stabilisation of a dominant-negative PKAr isoform that does not bind cAMP triggers premature parasite egress from infected cells followed by serial invasion attempts leading to host cell lysis. This untimely egress depends on host cell acidification. A phosphoproteome analysis suggested the interplay between cAMP and cGMP signalling as PKAc1 inactivation changes the phosphorylation profile of a putative cGMP-phosphodiesterase. Concordantly, inhibition of the cGMP-dependent protein kinase G (PKG) blocks egress induced by PKAc1 inactivation or environmental acidification, while a cGMP-phosphodiesterase inhibitor circumvents egress repression by PKAc1 or pH neutralisation. This indicates that pH and PKAc1 act as balancing regulators of cGMP metabolism to control egress. These results reveal a crosstalk between PKA and PKG pathways to govern egress in T. gondii.
The intestinal epithelial cells (IECs) that line the gut form a robust line of defense against ingested pathogens. We investigated the impact of infection with the enteric pathogen Citrobacter rodentium on mouse IEC metabolism using global proteomic and targeted metabolomics and lipidomics. The major signatures of the infection were upregulation of the sugar transporter Sglt4, aerobic glycolysis, and production of phosphocreatine, which mobilizes cytosolic energy. In contrast, biogenesis of mitochondrial cardiolipins, essential for ATP production, was inhibited, which coincided with increased levels of mucosal O₂ and a reduction in colon-associated anaerobic commensals. In addition, IECs responded to infection by activating Srebp2 and the cholesterol biosynthetic pathway. Unexpectedly, infected IECs also upregulated the cholesterol efflux proteins AbcA1, AbcG8, and ApoA1, resulting in higher levels of fecal cholesterol and a bloom of Proteobacteria. These results suggest that C. rodentium manipulates host metabolism to evade innate immune responses and establish a favorable gut ecosystem.
The transmission of malaria parasites to mosquitoes relies on the rapid induction of sexual reproduction upon their ingestion into a blood meal. Haploid female and male gametocytes become activated and emerge from their host cells, and the males enter the cell cycle to produce eight microgametes. The synchronized nature of gametogenesis allowed us to investigate phosphorylation signaling during its first minute in Plasmodium berghei via a high-resolution time course of the phosphoproteome. This revealed an unexpectedly broad response, with proteins related to distinct cell cycle events undergoing simultaneous phosphoregulation. We implicate several protein kinases in the process, and we validate our analyses on the plant-like calcium-dependent protein kinase 4 (CDPK4) and a homolog of serine/arginine-rich protein kinases (SRPK1). Mutants in these kinases displayed distinct phosphoproteomic disruptions, consistent with differences in their phenotypes. The results reveal the central role of protein phosphorylation in the atypical cell cycle regulation of a divergent eukaryote.
Current tools for visualization and integration of proteomics with other omics datasets are inadequate for large-scale studies and capture only basic sequence identity information. Furthermore, the frequent reformatting of annotations for reference genomes required by these tools is known to be highly error prone. We developed PoGo for mapping peptides identified through mass spectrometry to overcome these limitations. PoGo reduced runtime and memory usage by 85% and 20%, respectively, and exhibited overall superior performance over other tools on benchmarking with large-scale human tissue and cancer phosphoproteome datasets comprising ∼3 million peptides. In addition, extended functionality enables representation of single-nucleotide variants, post-translational modifications, and quantitative features. PoGo has been integrated in established frameworks such as the PRIDE tool suite and OpenMS, as well as a standalone tool with user-friendly graphical interface. With the rapid increase of quantitative high-resolution datasets capturing proteomes and global modifications to complement orthogonal genomics platforms, PoGo provides a central utility enabling large-scale visualization and interpretation of transomics datasets.
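The core idea of mapping a peptide onto genomic coordinates can be illustrated with a minimal sketch. This is not PoGo's implementation; it assumes a forward-strand transcript, 1-based inclusive exon coordinates covering only coding sequence, and a peptide that occurs once in the protein. The function name `peptide_to_genome` and the example coordinates are hypothetical.

```python
def peptide_to_genome(protein, peptide, cds_exons):
    """Map a peptide to genomic blocks on a forward-strand transcript.

    protein   -- translated protein sequence
    peptide   -- peptide sequence identified by mass spectrometry
    cds_exons -- list of (start, end) coding exon coordinates,
                 1-based inclusive, in transcript order (assumption)
    Returns a list of (start, end) genomic blocks; more than one block
    means the peptide spans an exon junction.
    """
    start_aa = protein.find(peptide)
    if start_aa < 0:
        return []  # peptide not found in this protein
    nt_start = start_aa * 3                 # 0-based offset within the CDS
    nt_end = nt_start + len(peptide) * 3    # exclusive end offset
    blocks, offset = [], 0
    for ex_start, ex_end in cds_exons:
        ex_len = ex_end - ex_start + 1
        lo = max(nt_start, offset)
        hi = min(nt_end, offset + ex_len)
        if lo < hi:  # peptide overlaps this exon
            blocks.append((ex_start + lo - offset, ex_start + hi - offset - 1))
        offset += ex_len
    return blocks


# Hypothetical two-exon gene encoding a six-residue protein.
exons = [(101, 109), (201, 209)]
print(peptide_to_genome("MKTAYL", "TAY", exons))
```

A peptide that crosses the exon boundary is reported as two blocks, which is the kind of information a genome browser track needs to draw a spliced peptide.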
Protein S-acylation (palmitoylation) is a reversible lipid modification that is an important regulator of dynamic membrane-protein interactions. Proteomic approaches have uncovered many putative palmitoylated proteins; however, methods for comprehensive palmitoylation site characterization are lacking. We demonstrate a quantitative site-specific-Acyl-Biotin-Exchange (ssABE) method that allowed the identification of 906 putative palmitoylation sites on 641 proteins from mouse forebrain. 62% of sites map to known palmitoylated proteins and 102 individual palmitoylation sites are known from the literature. 54% of palmitoylation sites map to synaptic proteins including many GPCRs, receptors/ion channels and peripheral membrane proteins. Phosphorylation sites were also identified on a subset of peptides that were palmitoylated, demonstrating for the first time co-identification of these modifications by mass spectrometry. Palmitoylation sites were identified on over half of the family of palmitoyl-acyltransferases (PATs) that mediate protein palmitoylation, including active site thioester-linked palmitoyl intermediates. Distinct palmitoylation motifs and site topology were identified for integral membrane and soluble proteins, indicating potential differences in associated PAT specificity and palmitoylation function. ssABE allows the global identification of palmitoylation sites as well as measurement of the active site modification state of PATs, enabling palmitoylation to be studied at a systems level.
MYST histone acetyltransferases have crucial functions in transcription, replication and DNA repair and are hence implicated in development and cancer. Here we characterise Myst2/Kat7/Hbo1 protein interactions in mouse embryonic stem cells by affinity purification coupled to mass spectrometry. This study confirms that in embryonic stem cells Myst2 is part of H3 and H4 histone acetylation complexes similar to those described in somatic cells. We identify a novel Myst2-associated protein, the tumour suppressor protein Niam (Nuclear Interactor of ARF and Mdm2). Human NIAM is involved in chromosome segregation, p53 regulation and cell proliferation in somatic cells, but its role in embryonic stem cells is unknown. We describe the first Niam embryonic stem cell interactome, which includes proteins with roles in DNA replication and repair, transcription, splicing and ribosome biogenesis. Many of Myst2 and Niam binding partners are required for correct embryonic development, implicating Myst2 and Niam in the cooperative regulation of this process and suggesting a novel role for Niam in embryonic biology. The data provides a useful resource for exploring Myst2 and Niam essential cellular functions and should contribute to deeper understanding of organism early development and survival as well as cancer. Data are available via ProteomeXchange with identifier PXD005987.
Chlamydia trachomatis remains a leading cause of bacterial sexually transmitted infections and preventable blindness worldwide. There are, however, limited in vitro models to study the role of host genetics in the response of macrophages to this obligate human pathogen. Here, we describe an approach using macrophages derived from human induced pluripotent stem cells (iPSdMs) to study macrophage-Chlamydia interactions in vitro. We show that iPSdMs support the full infectious life cycle of C. trachomatis in a manner that mimics the infection of human blood-derived macrophages. Transcriptomic and proteomic profiling of the macrophage response to chlamydial infection highlighted the role of the type I interferon and interleukin 10-mediated responses. Using CRISPR/Cas9 technology, we generated biallelic knockout mutations in host genes encoding IRF5 and IL-10RA in iPSCs, and confirmed their roles in limiting chlamydial infection in macrophages. This model can potentially be extended to other pathogens and tissue systems to advance our understanding of host-pathogen interactions and the role of human genetics in influencing the outcome of infections.
Malaria transmission relies on the production of gametes following ingestion by a mosquito. Here, we show that Ca<sup>2+</sup>-dependent protein kinase 4 controls three processes essential to progress from a single haploid microgametocyte to the release of eight flagellated microgametes in <i>Plasmodium berghei</i>. A myristoylated isoform is activated by Ca<sup>2+</sup> to initiate a first genome replication within twenty seconds of activation. This role is mediated by a protein of the SAPS-domain family involved in S-phase entry. At the same time, CDPK4 is required for the assembly of the subsequent mitotic spindle and to phosphorylate a microtubule-associated protein important for mitotic spindle formation. Finally, a non-myristoylated isoform is essential to complete cytokinesis by activating motility of the male flagellum. This role has been linked to phosphorylation of an uncharacterised flagellar protein. Altogether, this study reveals how a kinase integrates and transduces multiple signals to control key cell-cycle transitions during <i>Plasmodium</i> gametogenesis.
Accurate statistical evaluation of sequence database peptide identifications from tandem mass spectra is essential in mass spectrometry based proteomics experiments. These statistics are dependent on accurately modelling random identifications. The target-decoy approach has risen to become the de facto approach to calculating FDR in proteomic datasets. The main principle of this approach is to search a set of decoy protein sequences that emulate the size and composition of the target protein sequences searched whilst not matching real proteins in the sample. To do this, it is commonplace to reverse or shuffle the proteins and peptides in the target database. However, these approaches have their drawbacks and limitations. A key confounding issue is the peptide redundancy between target and decoy databases leading to inaccurate FDR estimation. This inaccuracy is further amplified at the protein level and when searching large sequence databases such as those used for proteogenomics. Here, we present a unifying hybrid method to quickly and efficiently generate decoy sequences with minimal overlap between target and decoy peptides. We show that applying a reversed decoy approach can produce up to 5% peptide redundancy and many more additional peptides will have the exact same precursor mass as a target peptide. Our hybrid method addresses both these issues by first switching proteolytic cleavage sites with preceding amino acid, reversing the database and then shuffling any redundant sequences. This flexible hybrid method reduces the peptide overlap between target and decoy peptides to about 1% of peptides, making a more robust decoy model suitable for large search spaces. We also demonstrate the anti-conservative effect of redundant peptides on the calculation of q-values in mouse brain tissue data.
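The hybrid decoy procedure described above (switch cleavage sites with the preceding residue, reverse, then shuffle redundant peptides) can be sketched as follows. This is a simplified illustration, not the authors' tool: it assumes trypsin-like cleavage after K or R with no proline rule and no missed cleavages, and the function names and retry limit are hypothetical.

```python
import random

CLEAVAGE = set("KR")  # assumed trypsin-like cleavage residues


def digest(protein, sites=CLEAVAGE):
    """Cleave after every cleavage residue (no missed cleavages)."""
    peptides, start = [], 0
    for i, aa in enumerate(protein):
        if aa in sites:
            peptides.append(protein[start:i + 1])
            start = i + 1
    if start < len(protein):
        peptides.append(protein[start:])
    return peptides


def switch_and_reverse(protein, sites=CLEAVAGE):
    """Swap each cleavage residue with the residue before it, then reverse.

    After reversal the cleavage residues again sit at peptide C-termini.
    """
    seq = list(protein)
    for i in range(1, len(seq)):
        if seq[i] in sites and seq[i - 1] not in sites:
            seq[i - 1], seq[i] = seq[i], seq[i - 1]
    return "".join(reversed(seq))


def shuffle_peptide(peptide, targets, rng, max_tries=20):
    """Shuffle all but the last residue until the peptide leaves the target set."""
    body, last = list(peptide[:-1]), peptide[-1]
    for _ in range(max_tries):
        rng.shuffle(body)
        candidate = "".join(body) + last
        if candidate not in targets:
            return candidate
    return peptide  # give up; the peptide stays redundant


def build_decoy_peptides(proteins, seed=0):
    """Return (target peptide set, list of decoy peptides)."""
    rng = random.Random(seed)
    targets = {pep for prot in proteins for pep in digest(prot)}
    decoys = []
    for prot in proteins:
        for pep in digest(switch_and_reverse(prot)):
            if pep in targets:  # redundant with a target peptide
                pep = shuffle_peptide(pep, targets, rng)
            decoys.append(pep)
    return targets, decoys


# Toy protein chosen so that plain switch-and-reverse recreates the target peptide AGSK.
targets, decoys = build_decoy_peptides(["AGSKGATR"])
print(sorted(targets), decoys)
```

Shuffling only the redundant peptides keeps amino acid composition and precursor mass unchanged for each decoy peptide while removing the sequence overlap with the target set.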
Campylobacter jejuni is the leading cause of bacterial gastroenteritis in the world. A number of factors are believed to contribute to the ability of C. jejuni to cause disease within the human host including the secretion of non-flagellar proteins via the flagellar type III secretion system (FT3SS). Here for the first time we have utilised quantitative proteomics using stable isotope labelling by amino acids in cell culture (SILAC), and label-free liquid chromatography-mass spectrometry (LC/MS), to compare supernatant samples from C. jejuni M1 wild type and flagella-deficient (flgG mutant) strains to identify putative novel proteins secreted via the FT3SS. Genes encoding proteins that were candidates for flagellar secretion, derived from the LC/MS and SILAC datasets, were deleted. Infection of human CACO-2 tissue culture cells using these mutants resulted in the identification of novel genes required for interactions with these cells. This work has shown for the first time that both CJM1_0791 and CJM1_0395 are dependent on the flagellum for their presence in supernatants from C. jejuni strains M1 and 81-176.<h4>Biological significance</h4>This study provides the most complete description of the Campylobacter jejuni secretome to date. SILAC and label-free proteomics comparing mutants with or without flagella have resulted in the identification of two C. jejuni proteins that are dependent on flagella for their export from the bacterial cell.
Outer membrane blebs are naturally shed by Gram-negative bacteria and are candidates of interest for vaccine development. Genetic modification of bacteria to induce hyperblebbing greatly increases the yield of blebs, called Generalized Modules for Membrane Antigens (GMMA). The composition of the GMMA from hyperblebbing mutants of Shigella flexneri 2a and Shigella sonnei was quantitatively analyzed using high-sensitivity mass spectrometry with the label-free iBAQ procedure and compared to the composition of the solubilized cells of the GMMA-producing strains. There were 2306 proteins identified, 659 in GMMA and 2239 in bacteria, of which 290 (GMMA) and 1696 (bacteria) were common to both S. flexneri 2a and S. sonnei. Predicted outer membrane and periplasmic proteins constituted 95.7% and 98.7% of the protein mass of S. flexneri 2a and S. sonnei GMMA, respectively. Among the remaining proteins, small quantities of ribosomal proteins collectively accounted for more than half of the predicted cytoplasmic protein impurities in the GMMA. In GMMA, the outer membrane and periplasmic proteins were enriched 13.3-fold (S. flexneri 2a) and 8.3-fold (S. sonnei) compared to their abundance in the parent bacteria. Both periplasmic and outer membrane proteins were enriched similarly, suggesting that GMMA have a similar surface to volume ratio as the surface to periplasmic volume ratio in these mutant bacteria. Results in S. flexneri 2a and S. sonnei showed high reproducibility, indicating a robust GMMA-producing process, and the low contamination by cytoplasmic proteins supports the use of GMMA for vaccines. Data are available via ProteomeXchange with identifier PXD002517.
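As a rough illustration of how label-free iBAQ values might be turned into the mass percentages and fold enrichments quoted above, the sketch below uses the usual iBAQ definition (summed peptide intensity divided by the number of theoretically observable peptides). Weighting iBAQ by molecular weight to approximate protein mass is an assumption made here, and all numbers and field names are invented for the example.

```python
def ibaq(summed_intensity, n_theoretical_peptides):
    """iBAQ: summed peptide intensity per theoretically observable peptide."""
    return summed_intensity / n_theoretical_peptides


def mass_fraction_by_location(proteins):
    """Approximate the share of protein mass per predicted location.

    proteins -- list of dicts with keys 'ibaq', 'mw_kda' and 'location'
    Mass is approximated as iBAQ x molecular weight (assumption).
    """
    totals = {}
    for p in proteins:
        mass = p["ibaq"] * p["mw_kda"]
        totals[p["location"]] = totals.get(p["location"], 0.0) + mass
    grand_total = sum(totals.values())
    return {loc: mass / grand_total for loc, mass in totals.items()}


def fold_enrichment(fraction_in_gmma, fraction_in_cells):
    """Ratio of a compartment's mass fraction in GMMA to that in whole cells."""
    return fraction_in_gmma / fraction_in_cells


# Invented example values, not data from the study.
sample = [
    {"ibaq": ibaq(900.0, 10), "mw_kda": 10.0, "location": "outer_membrane"},
    {"ibaq": ibaq(50.0, 5), "mw_kda": 10.0, "location": "cytoplasm"},
]
print(mass_fraction_by_location(sample))
```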
The large number of chemical modifications that are found on the histone proteins of eukaryotic cells form multiple complex combinations, which can act as recognition signals for reader proteins. We have used peptide capture in conjunction with super-SILAC quantification to carry out an unbiased high-throughput analysis of the composition of protein complexes that bind to histone H3K9/S10 and H3K27/S28 methyl-phospho modifications. The accurate quantification allowed us to perform weighted correlation network analysis (WGCNA) to obtain a systems-level view of the histone H3 tail interactome. The analysis reveals the underlying modularity of the histone reader network with members of nuclear complexes exhibiting very similar binding signatures, which suggests that many proteins bind to histones as part of pre-organized complexes. Our results identify a novel complex that binds to the double H3K9me3/S10ph modification, which includes Atrx, Daxx and members of the FACT complex. The super-SILAC approach allows comparison of binding to multiple peptides with different combinations of modifications, and the resolution of the WGCNA is enhanced by maximizing the number of combinations that are compared. This makes it a useful approach for assessing the effects of changes in histone modification combinations on the composition and function of bound complexes.
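The first step of a weighted correlation network analysis can be sketched in a few lines: pairwise correlations between binding profiles are raised to a soft-threshold power so that strong co-binding dominates the network. The sketch below assumes an unsigned network with adjacency |r|^beta and an illustrative beta of 6; the protein names and binding ratios are hypothetical, and module detection is not shown.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length numeric profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def soft_adjacency(profiles, beta=6):
    """Unsigned soft-threshold adjacency |r|**beta for every protein pair.

    profiles -- dict mapping protein name to its binding ratios across
                the modified peptides (one value per peptide bait)
    """
    names = sorted(profiles)
    return {
        (a, b): abs(pearson(profiles[a], profiles[b])) ** beta
        for a in names
        for b in names
        if a < b
    }


# Hypothetical binding ratios across four peptide baits.
profiles = {
    "reader_A": [1.0, 2.0, 3.0, 4.0],
    "reader_B": [2.0, 4.0, 6.0, 8.0],    # same signature as reader_A
    "reader_C": [1.0, -1.0, 1.0, -1.0],  # unrelated signature
}
for pair, weight in soft_adjacency(profiles).items():
    print(pair, round(weight, 3))
```

Proteins from the same pre-organized complex would be expected to show near-identical signatures and therefore adjacency close to 1, while weakly correlated pairs are pushed towards 0 by the exponent.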
Arc is an activity-regulated neuronal protein, but little is known about its interactions, assembly into multiprotein complexes, and role in human disease and cognition. We applied an integrated proteomic and genetic strategy by targeting a tandem affinity purification (TAP) tag and Venus fluorescent protein into the endogenous Arc gene in mice. This allowed biochemical and proteomic characterization of native complexes in wild-type and knockout mice. We identified many Arc-interacting proteins, of which PSD95 was the most abundant. PSD95 was essential for Arc assembly into 1.5-MDa complexes and activity-dependent recruitment to excitatory synapses. Integrating human genetic data with proteomic data showed that Arc-PSD95 complexes are enriched in schizophrenia, intellectual disability, autism, and epilepsy mutations and normal variants in intelligence. We propose that Arc-PSD95 postsynaptic complexes potentially affect human cognitive function.
The histone H3 Lys27-specific demethylase UTX (or KDM6A) is targeted by loss-of-function mutations in multiple cancers. Here, we demonstrate that UTX suppresses myeloid leukemogenesis through noncatalytic functions, a property shared with its catalytically inactive Y-chromosome paralog, UTY (or KDM6C). In keeping with this, we demonstrate concomitant loss/mutation of KDM6A (UTX) and UTY in multiple human cancers. Mechanistically, global genomic profiling showed only minor changes in H3K27me3 but significant and bidirectional alterations in H3K27ac and chromatin accessibility; a predominant loss of H3K4me1 modifications; alterations in ETS and GATA-factor binding; and altered gene expression after Utx loss. By integrating proteomic and genomic analyses, we link these changes to UTX regulation of ATP-dependent chromatin remodeling, coordination of the COMPASS complex and enhanced pioneering activity of ETS factors during evolution to AML. Collectively, our findings identify a dual role for UTX in suppressing acute myeloid leukemia via repression of oncogenic ETS and upregulation of tumor-suppressive GATA programs.
Osteoarthritis (OA) is a common disease characterized by cartilage degeneration and joint remodeling. The underlying molecular changes underpinning disease progression are incompletely understood. We investigated genes and pathways that mark OA progression in isolated primary chondrocytes taken from paired intact versus degraded articular cartilage samples across 38 patients undergoing joint replacement surgery (discovery cohort: 12 knee OA, replication cohorts: 17 knee OA, 9 hip OA patients). We combined genome-wide DNA methylation, RNA sequencing, and quantitative proteomics data. We identified 49 genes differentially regulated between intact and degraded cartilage in at least two -omics levels, 16 of which have not previously been implicated in OA progression. Integrated pathway analysis implicated the involvement of extracellular matrix degradation, collagen catabolism and angiogenesis in disease progression. Using independent replication datasets, we showed that the direction of change is consistent for over 90% of differentially expressed genes and differentially methylated CpG probes. AQP1, COL1A1 and CLEC3B were significantly differentially regulated across all three -omics levels, confirming their differential expression in human disease. Through integration of genome-wide methylation, gene and protein expression data in human primary chondrocytes, we identified consistent molecular players in OA progression that replicated across independent datasets and that have translational potential.
Salmonella enterica are a threat to public health. Current vaccines are not fully effective. The ability to grow in infected tissues within phagocytes is required for S. enterica virulence in systemic disease. As the infection progresses the bacteria are exposed to a complex host immune response. Consequently, in order to continue growing in the tissues, S. enterica requires the coordinated regulation of fitness genes. Bacterial gene regulation has so far been investigated largely using exposure to artificial environmental conditions or to in vitro cultured cells, and little information is available on how S. enterica adapts in vivo to sustain cell division and survival. We have studied the transcriptome, proteome and metabolic flux of Salmonella, and the transcriptome of the host during infection of wild type C57BL/6 and immune-deficient gp91-/-phox mice. Our analyses advance the understanding of how S. enterica and the host behave during infection to a more sophisticated level than has previously been reported.
The proteome of human brain synapses is highly complex and is mutated in over 130 diseases. This complexity arose from two whole-genome duplications early in the vertebrate lineage. Zebrafish are used in modelling human diseases; however, their synapse proteome is uncharacterized, and whether the teleost-specific genome duplication (TSGD) influenced complexity is unknown. We report the characterization of the proteomes and ultrastructure of central synapses in zebrafish and analyse the importance of the TSGD. While the TSGD increases overall synapse proteome complexity, the postsynaptic density (PSD) proteome of zebrafish has lower complexity than that of mammals. A highly conserved set of ∼1,000 proteins is shared across vertebrates. PSD ultrastructural features are also conserved. Lineage-specific proteome differences indicate that vertebrate species evolved distinct synapse types and functions. The data sets are a resource for a wide range of studies and have important implications for the use of zebrafish in modelling human synaptic diseases.
A family of apicomplexa-specific proteins containing AP2 DNA-binding domains (ApiAP2s) was identified in malaria parasites. This family includes sequence-specific transcription factors that are key regulators of development. However, functions for the majority of ApiAP2 genes remain unknown. Here, a systematic knockout screen in Plasmodium berghei identified ten ApiAP2 genes that were essential for mosquito transmission: four were critical for the formation of infectious ookinetes, and three were required for sporogony. We describe non-essential functions for AP2-O and AP2-SP proteins in blood stages, and identify AP2-G2 as a repressor active in both asexual and sexual stages. Comparative transcriptomics across mutants and developmental stages revealed clusters of co-regulated genes with shared cis promoter elements, whose expression can be controlled positively or negatively by different ApiAP2 factors. We propose that stage-specific interactions between ApiAP2 proteins on partly overlapping sets of target genes generate the complex transcriptional network that controls the Plasmodium life cycle.
Pluripotency and self-renewal, the defining properties of embryonic stem cells, are brought about by transcriptional programs involving an intricate network of transcription factors and chromatin remodeling complexes. The Nucleosome Remodeling and Deacetylase (NuRD) complex plays a crucial and dynamic role in the regulation of stemness and differentiation. Several NuRD-associated factors have been reported but how they are organized has not been investigated in detail. Here, we have combined affinity purification and blue native polyacrylamide gel electrophoresis followed by protein identification by mass spectrometry and protein correlation profiling to characterize the topology of the NuRD complex. Our data show that in mouse embryonic stem cells the NuRD complex is present as two distinct assemblies of differing topology with different binding partners. Cell cycle regulator Cdk2ap1 and transcription factor Sall4 associate only with the higher mass NuRD assembly. We further establish that only isoform Sall4a, and not Sall4b, associates with NuRD. By contrast, Suz12, a component of the PRC2 Polycomb repressor complex, associates with the lower mass entity. In addition, we identify and validate a novel NuRD-associated protein, Wdr5, a regulatory subunit of the MLL histone methyltransferase complex, which associates with both NuRD entities. Bioinformatic analyses of published target gene sets of these chromatin binding proteins are in agreement with these structural observations. In summary, this study provides an interesting insight into mechanistic aspects of NuRD function in stem cell biology. The relevance of our work has broader implications because of the ubiquitous nature of the NuRD complex. The strategy described here can be more broadly applicable to investigate the topology of the multiple complexes an individual protein can participate in.
Complete annotation of the human genome is indispensable for medical research. The GENCODE consortium strives to provide this, augmenting computational and experimental evidence with manual annotation. The rapidly developing field of proteogenomics provides evidence for the translation of genes into proteins and can be used to discover and refine gene models. However, for both the proteomics and annotation groups, there is a lack of guidelines for integrating these data. Here we report a stringent workflow for the interpretation of proteogenomic data that could be used by the annotation community to interpret novel proteogenomic evidence. Based on reprocessing of three large-scale publicly available human data sets, we show that a conservative approach, using stringent filtering, is required to generate valid identifications. Evidence has been found supporting 16 novel protein-coding genes being added to GENCODE. Despite this, many peptide identifications in pseudogenes cannot be annotated due to the absence of orthogonal supporting evidence.
In Gram-positive pathogens, surface proteins may be covalently anchored to the bacterial peptidoglycan by sortase, a cysteine transpeptidase enzyme. In contrast to other Gram-positive bacteria, only one single sortase enzyme, SrtB, is conserved between strains of Clostridium difficile. Sortase-mediated peptidase activity has been reported in vitro, and seven potential substrates have been identified. Here, we demonstrate the functionality of sortase in C. difficile. We identify two sortase-anchored proteins, the putative adhesins CD2831 and CD3246, and determine the cell wall anchor structure of CD2831. The C-terminal PPKTG sorting motif of CD2831 is cleaved between the threonine and glycine residues, and the carboxyl group of threonine is amide-linked to the side chain amino group of diaminopimelic acid within the peptidoglycan peptide stem. We show that CD2831 protein levels are elevated in the presence of high intracellular cyclic diGMP (c-diGMP) concentrations, in agreement with the control of CD2831 expression by a c-diGMP-dependent type II riboswitch. Low c-diGMP levels induce the release of CD2831 and presumably CD3246 from the surface of cells. This regulation is mediated by proteolytic cleavage of CD2831 and CD3246 by the zinc metalloprotease ZmpI, whose expression is controlled by a type I c-diGMP riboswitch. These data reveal a novel regulatory mechanism for expression of two sortase substrates by the secondary messenger c-diGMP, on which surface anchoring is dependent.
Protein post-translational modifications (PTMs) are commonly used to regulate biological processes. Protein S-acylation is an enzymatically regulated reversible modification that has been shown to modulate protein localization, activity and membrane binding. Proteome-scale discovery on Plasmodium falciparum schizonts has revealed a complement of more than 400 palmitoylated proteins, including those essential for host invasion and drug resistance. The wide regulatory effect on this species is endorsed by the presence of 12 proteins containing the conserved DHHC-CRD (DHHC motif within a cysteine-rich domain) that is associated with palmitoyl-transferase activity. Genetic interrogation of these enzymes in Apicomplexa has revealed essentiality and distinct localization at cellular compartments; these features are species specific and are not observed in yeast. It is clear that palmitoylation has an elaborate role in Plasmodium biology and opens intriguing questions on the functional consequence of this group of acylation modifications and how the protein S-acyl transferases (PATs) orchestrate molecular events.
<h4>Background</h4>Synapses are fundamental components of brain circuits and are disrupted in over 100 neurological and psychiatric diseases. The synapse proteome is physically organized into multiprotein complexes and polygenic mutations converge on postsynaptic complexes in schizophrenia, autism and intellectual disability. Directly characterising human synapses and their multiprotein complexes from post-mortem tissue is essential to understanding disease mechanisms. However, multiprotein complexes have not been directly isolated from human synapses and the feasibility of their isolation from post-mortem tissue is unknown.<h4>Results</h4>Here we establish a screening assay and criteria to identify post-mortem brain samples containing well-preserved synapse proteomes, revealing that neocortex samples are best preserved. We also develop a rapid method for the isolation of synapse proteomes from human brain, allowing large numbers of post-mortem samples to be processed in a short time frame. We perform the first purification and proteomic mass spectrometry analysis of MAGUK Associated Signalling Complexes (MASC) from neurosurgical and post-mortem tissue and find genetic evidence for their involvement in over seventy human brain diseases.<h4>Conclusions</h4>We have demonstrated that synaptic proteome integrity can be rapidly assessed from human post-mortem brain samples prior to its analysis with sophisticated proteomic methods. We have also shown that proteomics of synapse multiprotein complexes from well preserved post-mortem tissue is possible, obtaining structures highly similar to those isolated from biopsy tissue. Finally we have shown that MASC from human synapses are involved with over seventy brain disorders. These findings should have wide application in understanding the synaptic basis of psychiatric and other mental disorders.
We present a workflow using an ETD-optimised version of Mascot Percolator and a modified version of SLoMo (turbo-SLoMo) for analysis of phosphoproteomic data. We have benchmarked this against several database searching algorithms and phosphorylation site localisation tools and show that it offers highly sensitive and confident phosphopeptide identification and site assignment with PSM-level statistics, enabling rigorous comparison of data acquisition methods. We analysed the Plasmodium falciparum schizont phosphoproteome using, for the first time, a data-dependent neutral loss-triggered-ETD (DDNL) strategy and a conventional decision-tree method. At a posterior error probability threshold of 0.01, similar numbers of PSMs were identified using both methods with a 73% overlap in phosphopeptide identifications. The false discovery rate associated with spectral pairs where DDNL CID/ETD identified the same phosphopeptide was <1%. 72% of phosphorylation site assignments using turbo-SLoMo, without any score filtering, were identical and 99.8% of these cases are associated with a false localisation rate of <5%. We show that DDNL acquisition is a useful approach for phosphoproteomics and results in an increased confidence in phosphopeptide identification without compromising sensitivity or duty cycle. Furthermore, the combination of Mascot Percolator and turbo-SLoMo represents a robust workflow for phosphoproteomic data analysis using CID and ETD fragmentation.<h4>Biological significance</h4>Protein phosphorylation is a ubiquitous post-translational modification that regulates protein function. Mass spectrometry-based approaches have revolutionised its analysis on a large scale, but phosphorylation sites are often identified by single phosphopeptides and therefore require more rigorous data analysis to ensure that sites are identified with high confidence for follow-up experiments to investigate their biological significance.
The coverage and confidence of phosphoproteomic experiments can be enhanced by the use of multiple complementary fragmentation methods. Here we have benchmarked a data analysis pipeline for analysis of phosphoproteomic data generated using CID and ETD fragmentation and used it to demonstrate the utility of a data-dependent neutral loss triggered ETD fragmentation strategy for high confidence phosphopeptide identification and phosphorylation site localisation.
Protein identification by MS/MS is an important technique in proteome studies. The Open Mass Spectrometry Search Algorithm (OMSSA) is an open-source search engine that can be used to identify MS/MS spectra acquired in these experiments. Here, we present a software tool, termed OMSSAPercolator, which interfaces OMSSA with Percolator, a post-search machine learning method for rescoring database search results. We demonstrate that it outperforms the standard OMSSA scoring scheme, and provides reliable significant measurements. OMSSAPercolator is programmed using JAVA and can be readily used as a standalone tool or integrated into existing data analysis pipelines. OMSSAPercolator is freely available and can be downloaded at http://sourceforge.net/projects/omssapercolator/.
<h4>Background</h4>Clostridium difficile is an anaerobic, Gram-positive bacterium that can reside as a commensal within the intestinal microbiota of healthy individuals or cause life-threatening antibiotic-associated diarrhea in immunocompromised hosts. C. difficile can also form highly resistant spores that are excreted facilitating host-to-host transmission. The C. difficile spo0A gene encodes a highly conserved transcriptional regulator of sporulation that is required for relapsing disease and transmission in mice.<h4>Results</h4>Here we describe a genome-wide approach using a combined transcriptomic and proteomic analysis to identify Spo0A regulated genes. Our results validate Spo0A as a positive regulator of putative and novel sporulation genes as well as components of the mature spore proteome. We also show that Spo0A regulates a number of virulence-associated factors such as flagella and metabolic pathways including glucose fermentation leading to butyrate production.<h4>Conclusions</h4>The C. difficile spo0A gene is a global transcriptional regulator that controls diverse sporulation, virulence and metabolic phenotypes coordinating pathogen adaptation to a wide range of host interactions. Additionally, the rich breadth of functional data allowed us to significantly update the annotation of the C. difficile 630 reference genome which will facilitate basic and applied research on this emerging pathogen.
Direct comparison of protein components from human and mouse excitatory synapses is important for determining the suitability of mice as models of human brain disease and to understand the evolution of the mammalian brain. The postsynaptic density is a highly complex set of proteins organized into molecular networks that play a central role in behavior and disease. We report the first direct comparison of the proteome of triplicate isolates of mouse and human cortical postsynaptic densities. The mouse postsynaptic density comprised 1556 proteins and the human one 1461. A large compositional overlap was observed; more than 70% of human postsynaptic density proteins were also observed in the mouse postsynaptic density. Quantitative analysis of postsynaptic density components in both species indicates a broadly similar profile of abundance but also shows that there is higher abundance variation between species than within species. Well known components of this synaptic structure are generally more abundant in the mouse postsynaptic density. Significant inter-species abundance differences exist in some families of key postsynaptic density proteins including glutamatergic neurotransmitter receptors and adaptor proteins. Furthermore, we have identified a closely interacting set of molecules enriched in the human postsynaptic density that could be involved in dendrite and spine structural plasticity. Understanding synapse proteome diversity within and between species will be important to further our understanding of brain complexity and disease.
Asexual stage Plasmodium falciparum replicates and undergoes a tightly regulated developmental process in human erythrocytes. One mechanism involved in the regulation of this process is posttranslational modification (PTM) of parasite proteins. Palmitoylation is a PTM in which cysteine residues undergo a reversible lipid modification, which can regulate target proteins in diverse ways. Using complementary palmitoyl protein purification approaches and quantitative mass spectrometry, we examined protein palmitoylation in asexual-stage P. falciparum parasites and identified over 400 palmitoylated proteins, including those involved in cytoadherence, drug resistance, signaling, development, and invasion. Consistent with the prevalence of palmitoylated proteins, palmitoylation is essential for P. falciparum asexual development and influences erythrocyte invasion by directly regulating the stability of components of the actin-myosin invasion motor. Furthermore, P. falciparum uses palmitoylation in diverse ways, stably modifying some proteins while dynamically palmitoylating others. Palmitoylation therefore plays a central role in regulating P. falciparum blood stage development.
The combination of affinity purification with mass spectrometry analysis has become the method of choice for protein complex characterization. With the improved performance of mass spectrometry technology, the sensitivity of the analyses is increasing, probing deeper into molecular interactions and yielding longer lists of proteins. These identify not only core complex subunits but also the more inaccessible proteins that interact weakly or transiently. Alongside them, contaminant proteins, which are often abundant proteins in the cell, tend to be recovered in affinity experiments because they bind nonspecifically and with low affinity to matrix, tag, and/or antibody. The challenge now lies in discriminating nonspecific binders from true interactors, particularly at the low level and in a larger scale. This review aims to summarize the variety of methods that have been used to distinguish contaminants from specific interactions in the past few years, ranging from manual elimination using heuristic rules to more sophisticated probabilistic scoring approaches. We aim to give awareness on the processing that takes place before an interaction list is reported and on the different types of list curation approaches suited to the different experiments.
Peptide identification using tandem mass spectrometry is a core technology in proteomics. Latest generations of mass spectrometry instruments enable the use of electron transfer dissociation (ETD) to complement collision induced dissociation (CID) for peptide fragmentation. However, a critical limitation to the use of ETD has been optimal database search software. Percolator is a post-search algorithm, which uses semi-supervised machine learning to improve the rate of peptide spectrum identifications (PSMs) together with providing reliable significance measures. We have previously interfaced the Mascot search engine with Percolator and demonstrated sensitivity and specificity benefits with CID data. Here, we report recent developments in the Mascot Percolator V2.0 software including an improved feature calculator and support for a wider range of ion series. The updated software is applied to the analysis of several CID and ETD fragmented peptide data sets. This version of Mascot Percolator increases the number of CID PSMs by up to 80% and ETD PSMs by up to 60% at a 0.01 q-value (1% false discovery rate) threshold over a standard Mascot search, notably recovering PSMs from high charge state precursor ions. The greatly increased number of PSMs and peptide coverage afforded by Mascot Percolator has enabled a fuller assessment of CID/ETD complementarity to be performed. Using a data set of CID and ETcaD spectral pairs, we find that at a 1% false discovery rate, the overlap in peptide identifications by CID and ETD is 83%, which is significantly higher than that obtained using either stand-alone Mascot (69%) or OMSSA (39%). We conclude that Mascot Percolator is a highly sensitive and accurate post-search algorithm for peptide identification and allows direct comparison of peptide identifications using multiple alternative fragmentation techniques.
A variety of methods are described in the literature to assign peptide sequences to observed tandem MS data. Typically, the identified peptides are associated only with an arbitrary score that reflects the quality of the peptide-spectrum match but not with a statistically meaningful significance measure. In this chapter, we discuss why statistical significance measures can simplify and unify the interpretation of MS-based proteomic experiments. In addition, we also present available software solutions that convert scores into sound statistical measures.
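To illustrate how an arbitrary peptide-spectrum match score can be turned into a significance measure, the sketch below estimates q-values with a target-decoy strategy. This is one widely used approach, not necessarily the one the chapter covers; the function name `q_values`, the input layout and the toy scores are all assumptions made for this example.

```python
from typing import List, Tuple


def q_values(psms: List[Tuple[float, bool]]) -> List[Tuple[float, bool, float]]:
    """Estimate q-values for (score, is_decoy) pairs; higher score = better match.

    Returns (score, is_decoy, q_value) tuples sorted by descending score.
    """
    ranked = sorted(psms, key=lambda p: p[0], reverse=True)

    # FDR estimate at each score threshold: decoys accepted / targets accepted.
    fdrs = []
    targets = decoys = 0
    for _score, is_decoy in ranked:
        if is_decoy:
            decoys += 1
        else:
            targets += 1
        fdrs.append(decoys / max(targets, 1))

    # q-value: the smallest FDR at which this match would still be accepted,
    # i.e. the running minimum of the FDR taken from the worst score upwards.
    qs = []
    running_min = float("inf")
    for fdr in reversed(fdrs):
        running_min = min(running_min, fdr)
        qs.append(running_min)
    qs.reverse()

    return [(score, is_decoy, q) for (score, is_decoy), q in zip(ranked, qs)]


# Hypothetical toy data: (search engine score, matched a decoy sequence?)
example = [(90.0, False), (80.0, False), (70.0, True), (60.0, False), (50.0, True)]
result = q_values(example)
for score, is_decoy, q in result:
    print(f"score={score:5.1f} decoy={is_decoy!s:5} q={q:.3f}")
```

Filtering the returned list at `q <= 0.01` would correspond to accepting matches at an estimated 1% false discovery rate.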
Assessing the impact of genomic alterations on protein networks is fundamental in identifying the mechanisms that shape cancer heterogeneity. We have used isobaric labeling to characterize the proteomic landscapes of 50 colorectal cancer cell lines and to decipher the functional consequences of somatic genomic variants. The robust quantification of over 9,000 proteins and 11,000 phosphopeptides on average enabled the de novo construction of a functional protein correlation network, which ultimately exposed the collateral effects of mutations on protein complexes. CRISPR-cas9 deletion of key chromatin modifiers confirmed that the consequences of genomic alterations can propagate through protein interactions in a transcript-independent manner. Lastly, we leveraged the quantified proteome to perform unsupervised classification of the cell lines and to build predictive models of drug response in colorectal cancer. Overall, we provide a deep integrative view of the functional network and the molecular structure underlying the heterogeneity of colorectal cancer cells.
High-resolution mass spectrometry (MS) has become an important tool in the life sciences, contributing to the diagnosis and understanding of human diseases, elucidating biomolecular structural information and characterizing cellular signaling networks. However, the rapid growth in the volume and complexity of MS data makes transparent, accurate and reproducible analysis difficult. We present OpenMS 2.0 (http://www.openms.de), a robust, open-source, cross-platform software specifically designed for the flexible and reproducible analysis of high-throughput MS data. The extensible OpenMS software implements common mass spectrometric data processing tasks through a well-defined application programming interface in C++ and Python and through standardized open data formats. OpenMS additionally provides a set of 185 tools and ready-made workflows for common mass spectrometric data processing tasks, which enable users to perform complex quantitative mass spectrometric analyses with ease.
<h4>Unlabelled</h4>Legionella pneumophila, the causative agent of Legionnaires' disease, uses the Dot/Icm type IV secretion system (T4SS) to translocate more than 300 effectors into host cells, where they subvert host cell signaling. The function and host cell targets of most effectors remain unknown. PieE is a 69-kDa Dot/Icm effector containing three coiled-coil (CC) regions and 2 transmembrane (TM) helices followed by a fourth CC region. Here, we report that PieE dimerized by an interaction between CC3 and CC4. We found that ectopically expressed PieE localized to the endoplasmic reticulum (ER) and induced the formation of organized smooth ER, while following infection PieE localized to the Legionella-containing vacuole (LCV). To identify the physiological targets of PieE during infection, we established a new purification method for which we created an A549 cell line stably expressing the Escherichia coli biotin ligase BirA and infected the cells with L. pneumophila expressing PieE fused to a BirA-specific biotinylation site and a hexahistidine tag. Following tandem Ni²⁺ nitrilotriacetic acid (NTA) and streptavidin affinity chromatography, the effector-target complexes were analyzed by mass spectrometry. This revealed interactions of PieE with multiple host cell proteins, including the Rab GTPases 1a, 1b, 2a, 5c, 6a, 7, and 10. Binding of the Rab GTPases, which was validated by yeast two-hybrid binding assays, was mediated by the PieE CC1 and CC2. In summary, using a novel, highly specific strategy to purify effector complexes from infected cells, which is widely applicable to other pathogens, we identified PieE as a multidomain LCV protein with promiscuous Rab GTPase-binding capacity.<h4>Importance</h4>The respiratory pathogen Legionella pneumophila uses the Dot/Icm type IV secretion system to translocate more than 300 effector proteins into host cells. The function of most effectors in infection remains unknown. One of the bottlenecks for their characterization is the identification of target proteins. Frequently used in vitro approaches are not applicable to all effectors and suffer from high rates of false positives or missed interactions, as they are not performed in the context of an infection. Here, we determine key functional domains of the effector PieE and describe a new method to identify host cell targets under physiological infection conditions. Our approach, which is applicable to other pathogens, uncovered the interaction of PieE with several proteins involved in membrane trafficking, in particular Rab GTPases, revealing new details of the Legionella infection strategy and demonstrating the potential of this method to greatly advance our understanding of the molecular basis of infection.
Schizophrenia is a common disease with a complex aetiology, probably involving multiple and heterogeneous genetic factors. Here, by analysing the exome sequences of 2,536 schizophrenia cases and 2,543 controls, we demonstrate a polygenic burden primarily arising from rare (less than 1 in 10,000), disruptive mutations distributed across many genes. Particularly enriched gene sets include the voltage-gated calcium ion channel and the signalling complex formed by the activity-regulated cytoskeleton-associated scaffold protein (ARC) of the postsynaptic density, sets previously implicated by genome-wide association and copy-number variation studies. Similar to reports in autism, targets of the fragile X mental retardation protein (FMRP, product of FMR1) are enriched for case mutations. No individual gene-based test achieves significance after correction for multiple testing and we do not detect any alleles of moderately low frequency (approximately 0.5 to 1 per cent) and moderately large effect. Taken together, these data suggest that population-based exome sequencing can discover risk alleles and complements established gene-mapping paradigms in neuropsychiatric disease.
Calcium-dependent protein kinases (CDPKs) play key regulatory roles in the life cycle of the malaria parasite, but in many cases their precise molecular functions are unknown. Using the rodent malaria parasite Plasmodium berghei, we show that CDPK1, which is known to be essential in the asexual blood stage of the parasite, is expressed in all life stages and is indispensable during the sexual mosquito life-cycle stages. Knockdown of CDPK1 in sexual stages resulted in developmentally arrested parasites and prevented mosquito transmission, and these effects were independent of the previously proposed function for CDPK1 in regulating parasite motility. In-depth translational and transcriptional profiling of arrested parasites revealed that CDPK1 translationally activates mRNA species in the developing zygote that in macrogametes remain repressed via their 3' and 5'UTRs. These findings indicate that CDPK1 is a multifunctional protein that translationally regulates mRNAs to ensure timely and stage-specific protein expression.
Campylobacter jejuni is the most common bacterial cause of foodborne disease in the developed world. Its general physiology and biochemistry, as well as the mechanisms enabling it to colonize and cause disease in various hosts, are not well understood, and new approaches are required to understand its basic biology. High-throughput sequencing technologies provide unprecedented opportunities for functional genomic research. Recent studies have shown that direct Illumina sequencing of cDNA (RNA-seq) is a useful technique for the quantitative and qualitative examination of transcriptomes. In this study we report RNA-seq analyses of the transcriptomes of C. jejuni (NCTC11168) and its rpoN mutant. This has allowed the identification of hitherto unknown transcriptional units, and further defines the regulon that is dependent on rpoN for expression. The analysis of the NCTC11168 transcriptome was supplemented by additional proteomic analysis using liquid chromatography-MS. The transcriptomic and proteomic datasets represent an important resource for the Campylobacter research community.
Prmt5, an arginine methyltransferase, has multiple roles in germ cells, and possibly in pluripotency. Here we show that loss of Prmt5 function is early embryonic-lethal due to the abrogation of pluripotent cells in blastocysts. Prmt5 is also up-regulated in the cytoplasm during the derivation of embryonic stem (ES) cells together with Stat3, where they persist to maintain pluripotency. Prmt5 in association with Mep50 methylates cytosolic histone H2A (H2AR3me2s) to repress differentiation genes in ES cells. Loss of Prmt5 or Mep50 results in derepression of differentiation genes, indicating the significance of the Prmt5/Mep50 complex for pluripotency, which may occur in conjunction with the leukemia inhibitory factor (LIF)/Stat3 pathway.
Recent advances in proteomic mass spectrometry (MS) offer the chance to marry high-throughput peptide sequencing to transcript models, allowing the validation, refinement, and identification of new protein-coding loci. We present a novel pipeline that integrates highly sensitive and statistically robust peptide spectrum matching with genome-wide protein-coding predictions to perform large-scale gene validation and discovery in the mouse genome for the first time. In searching an excess of 10 million spectra, we have been able to validate 32%, 17%, and 7% of all protein-coding genes, exons, and splice boundaries, respectively. Moreover, we present strong evidence for the identification of multiple alternatively spliced translations from 53 genes and have uncovered 10 entirely novel protein-coding genes, which are not covered in any mouse annotation data sources. One such novel protein-coding gene is a fusion protein that spans the Ins2 and Igf2 loci to produce a transcript encoding the insulin II and the insulin-like growth factor 2-derived peptides. We also report nine processed pseudogenes that have unique peptide hits, demonstrating, for the first time, that they are not just transcribed but are translated and are therefore resurrected into new coding loci. This work not only highlights an important utility for MS data in genome annotation but also provides unique insights into the gene structure and propagation in the mouse genome. All these data have been subsequently used to improve the publicly available mouse annotation available in both the Vega and Ensembl genome browsers (http://vega.sanger.ac.uk).
We isolated the postsynaptic density from human neocortex (hPSD) and identified 1,461 proteins. hPSD mutations cause 133 neurological and psychiatric diseases and were enriched in cognitive, affective and motor phenotypes underpinned by sets of genes. Strong protein sequence conservation in mammalian lineages, particularly in hub proteins, indicates conserved function and organization in primate and rodent models. The hPSD is an important structure for nervous system disease and behavior.
The transcription factor Oct4 is key in embryonic stem cell identity and reprogramming. Insight into its partners should illuminate how the pluripotent state is established and regulated. Here, we identify a considerably expanded set of Oct4-binding proteins in mouse embryonic stem cells. We find that Oct4 associates with a varied set of proteins including regulators of gene expression and modulators of Oct4 function. Half of its partners are transcriptionally regulated by Oct4 itself or other stem cell transcription factors, whereas one-third display a significant change in expression upon cell differentiation. The majority of Oct4-associated proteins studied to date show an early lethal phenotype when mutated. A fraction of the human orthologs is associated with inherited developmental disorders or causative of cancer. The Oct4 interactome provides a resource for dissecting mechanisms of Oct4 function, enlightening the basis of pluripotency and development, and identifying potential additional reprogramming factors.
High-density, strand-specific cDNA sequencing (ssRNA-seq) was used to analyze the transcriptome of Salmonella enterica serovar Typhi (S. Typhi). By mapping sequence data to the entire S. Typhi genome, we analyzed the transcriptome in a strand-specific manner and further defined transcribed regions encoded within prophages, pseudogenes, previously unannotated regions, and 3'- or 5'-untranslated regions (UTRs). An additional 40 novel candidate non-coding RNAs were identified beyond those previously annotated. Proteomic analysis was combined with transcriptome data to confirm and refine the annotation of a number of hypothetical genes. ssRNA-seq was also combined with microarray and proteome analysis to further define the S. Typhi OmpR regulon and identify novel OmpR-regulated transcripts. Thus, ssRNA-seq provides a novel and powerful approach to the characterization of the bacterial transcriptome.
Clostridium difficile, a major cause of antibiotic-associated diarrhea, produces highly resistant spores that contaminate hospital environments and facilitate efficient disease transmission. We purified C. difficile spores using a novel method and show that they exhibit significant resistance to harsh physical or chemical treatments and are also highly infectious, with <7 environmental spores per cm² reproducibly establishing a persistent infection in exposed mice. Mass spectrometric analysis identified approximately 336 spore-associated polypeptides, with a significant proportion linked to translation, sporulation/germination, and protein stabilization/degradation. In addition, proteins from several distinct metabolic pathways associated with energy production were identified. Comparison of the C. difficile spore proteome to those of other clostridial species defined 88 proteins as the clostridial spore "core" and 29 proteins as C. difficile spore specific, including proteins that could contribute to spore-host interactions. Thus, our results provide the first molecular definition of C. difficile spores, opening up new opportunities for the development of diagnostic and therapeutic approaches.
Sound scoring methods for sequence database search algorithms such as Mascot and Sequest are essential for sensitive and accurate peptide and protein identifications from proteomic tandem mass spectrometry data. In this paper, we present a software package that interfaces Mascot with Percolator, a well performing machine learning method for rescoring database search results, and demonstrate it to be amenable for both low and high accuracy mass spectrometry data, outperforming all available Mascot scoring schemes as well as providing reliable significance measures. Mascot Percolator can be readily used as a stand alone tool or integrated into existing data analysis pipelines.
The combination of affinity purification and tandem mass spectrometry (MS) has emerged as a powerful approach to delineate biological processes. In particular, the use of epitope tags has allowed this approach to become scaleable and has bypassed difficulties associated with generation of antibodies. Single epitope tags and tandem affinity purification (TAP) tags have been used to systematically map protein complexes generating protein interaction data at a near proteome-wide scale. Recent developments in the design of tags, optimisation of purification conditions, experimental design and data analysis have greatly improved the sensitivity and specificity of this approach. Concomitant developments in MS, including high accuracy and high-throughput instrumentation together with quantitative MS methods, have facilitated large-scale and comprehensive analysis of multiprotein complexes.
It is a major challenge to develop effective sequence database search algorithms to translate molecular weight and fragment mass information obtained from tandem mass spectrometry into high quality peptide and protein assignments. We investigated the peptide identification performance of Mascot and X!Tandem for mass tolerance settings common for low and high accuracy mass spectrometry. We demonstrated that sensitivity and specificity of peptide identification can vary substantially for different mass tolerance settings, but this effect was more significant for Mascot. We present an adjusted Mascot threshold, which allows the user to freely select the best trade-off between sensitivity and specificity. The adjusted Mascot threshold was compared with the default Mascot and X!Tandem scoring thresholds and shown to be more sensitive at the same false discovery rates for both low and high accuracy mass spectrometry data.
We analyzed the mouse forebrain cytosolic phosphoproteome using sequential (protein and peptide) IMAC purifications, enzymatic dephosphorylation, and targeted tandem mass spectrometry analysis strategies. In total, using complementary phosphoenrichment and LC-MS/MS strategies, 512 phosphorylation sites on 540 non-redundant phosphopeptides from 162 cytosolic phosphoproteins were characterized. Analysis of protein domains and amino acid sequence composition of this data set of cytosolic phosphoproteins revealed that it is significantly enriched in intrinsic sequence disorder, and this enrichment is associated with both cellular location and phosphorylation status. The majority of phosphorylation sites found by MS were located outside of structural protein domains (97%) but were mostly located in regions of intrinsic sequence disorder (86%). 368 phosphorylation sites were located in long regions of disorder (over 40 amino acids long), and 94% of proteins contained at least one such long region of disorder. In addition, we found that 58 phosphorylation sites in this data set occur in 14-3-3 binding consensus motifs, linear motifs that are associated with unstructured regions in proteins. These results demonstrate that in this data set protein phosphorylation is significantly depleted in protein domains and significantly enriched in disordered protein sequences and that enrichment of intrinsic sequence disorder may be a common feature of phosphoproteomes. This supports the hypothesis that disordered regions in proteins allow kinases, phosphatases, and phosphorylation-dependent binding proteins to gain access to target sequences to regulate local protein conformation and activity.
Phosphorylation, the most intensively studied and common PTM on proteins, is a complex biological phenomenon. Its complexity manifests itself in the large numbers of proteins that attach it, remove it and recognise it as a protein code. Since the first report of protein phosphorylation on vitellin 100 years ago, a wide variety of biochemical and analytical chemical approaches have been developed to enrich and detect protein phosphorylation. The last 5 years have witnessed a renaissance in methodologies capable of characterising protein phosphorylation on a proteome-scale. These technological advances have allowed identification of hundreds to thousands of phosphorylation sites in a proteome and have resulted in a profound paradigm shift. For the first time, using quantitative MS, the topology and significance of global phosphorylation networks may be investigated, marking a new era of cell signalling research. This review addresses recent technological advances in the purification of phosphorylated proteins and peptides and current MS-based strategies used to qualitatively and quantitatively probe these enriched phosphoproteomes. In addition, we review the application of complementary array-based technologies to derive signalling networks from kinase-substrate interactions and discuss future challenges in the field.
Trichomaglin is a protein isolated from the root tuber of the plant Maganlin (Trichosanthes Lepiniate, Cucurbitaceae). The crystal structure of trichomaglin has been determined by multiple-isomorphous replacement and refined at 2.2 Å resolution. The X-ray sequence was established, based on electron density combined with the experimentally determined N-terminal sequence, and the sequence information derived from mass spectroscopic analysis. X-ray sequence-based homolog search and the three-dimensional structure reveal that trichomaglin is a novel S-like RNase, which was confirmed by biological assay. The trichomaglin molecule contains an additional beta sheet in the HV(b) region, compared with the known plant RNase structures. Fourteen cysteine residues form seven disulfide bridges, more than those in the other known structures of S- and S-like RNases. His43 and His105 are expected to be the catalytic acid and base, respectively. Four hydrosulfate ions are bound in the active site pocket, three of them mimicking the substrate binding sites.
Proteomics is complementary to genomic approaches anchored in DNA and RNA. Global characterization of proteins is providing new insights into general biological structures as well as synapses, receptor complexes and other neuronal and glial features. Current challenges for proteomics of the nervous system include problems relating to sample preparation, brain complexity, limited databases and informatics tools. The combination of proteomics with other global functional genomic approaches at the levels of genome and transcriptome, together with network biology, will provide important bridges between genes, physiology and pathology.
The use of mass spectrometry data to search molecular sequence databases is a well-established method for protein identification. The technique can be extended to searching raw genomic sequences, providing experimental confirmation or correction of predicted coding sequences, and has the potential to identify novel genes and elucidate splicing patterns.
The public availability of a draft assembly of the human genome has enabled us to demonstrate, for the first time, the feasibility of searching a complete, unmasked eukaryotic genome using uninterpreted mass spectrometry data. A complex LC-MS/MS data set, containing peptides from at least 22 human proteins, was searched against a comprehensive, nonidentical protein database, an expressed sequence tag (EST) database, and the International Human Genome Project draft assembly of the human genome. The results from the three searches are compared in detail, and the merits of the different databases for this application are discussed. In the case of the EST database, the UniGene index provided a method of simplifying and summarising the search results. In the case of the genomic DNA, the presence of introns prevented matching of roughly one quarter of the spectra, but the technique can provide primary experimental verification of predicted coding sequences, and has the potential to identify novel coding sequences.
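Searching mass spectrometry data against raw genomic DNA requires the nucleotide sequence to be turned into candidate protein sequences first. One common way to do this is six-frame translation, sketched below with the standard genetic code. The abstract does not say how the genome was translated, so treat this as an illustrative assumption; the function names and the toy sequence are invented for this example.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna: str) -> str:
    """Return the reverse complement of an upper-case DNA string."""
    return dna.translate(COMPLEMENT)[::-1]


def translate(dna: str) -> str:
    """Translate complete codons only; unknown codons (e.g. with N) become 'X'."""
    return "".join(
        CODON_TABLE.get(dna[i:i + 3], "X") for i in range(0, len(dna) - 2, 3)
    )


def six_frame_translation(dna: str) -> dict:
    """Translate both strands in all three reading frames.

    Keys are '+1'..'+3' (forward strand) and '-1'..'-3' (reverse strand);
    '*' marks a stop codon.
    """
    dna = dna.upper()
    frames = {}
    for strand, seq in (("+", dna), ("-", reverse_complement(dna))):
        for offset in range(3):
            frames[f"{strand}{offset + 1}"] = translate(seq[offset:])
    return frames


# Hypothetical toy sequence, not taken from any real genome.
frames = six_frame_translation("ATGGCCTAA")
for name, protein in frames.items():
    print(name, protein)
```

Because each frame is translated straight through, a peptide whose coding sequence is interrupted by an intron is not produced in any frame, which is consistent with the abstract's remark that introns prevented matching of part of the spectra.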
N-methyl-d-aspartate receptors (NMDAR) mediate long-lasting changes in synapse strength via downstream signaling pathways. We report proteomic characterization with mass spectrometry and immunoblotting of NMDAR multiprotein complexes (NRC) isolated from mouse brain. The NRC comprised 77 proteins organized into receptor, adaptor, signaling, cytoskeletal and novel proteins, of which 30 are implicated from binding studies and another 19 participate in NMDAR signaling. NMDAR and metabotropic glutamate receptor subtypes were linked to cadherins and L1 cell-adhesion molecules in complexes lacking AMPA receptors. These neurotransmitter-adhesion receptor complexes were bound to kinases, phosphatases, GTPase-activating proteins and Ras with effectors including MAPK pathway components. Several proteins were encoded by activity-dependent genes. Genetic or pharmacological interference with 15 NRC proteins impairs learning and with 22 proteins alters synaptic plasticity in rodents. Mutations in three human genes (NF1, Rsk-2, L1) are associated with learning impairments, indicating the NRC also participates in human cognition.
Advances in mass spectrometry combined with accelerated progress in genome sequencing projects have facilitated the rapid identification of proteins by enzymatic digestion, mass analysis, and sequence database searching. Applications for this technology range from the surveillance of protein expression in cells, tissues, and whole organisms, to the identification of proteins and posttranslational modifications. Here we consider practical aspects of the application of mass spectrometry in cell biology and illustrate these with examples from our own laboratories.
The phagocyte respiratory burst is crucial for innate immunity. The transfer of electrons to oxygen is mediated by a membrane-bound heterodimer, comprising gp91<i>phox</i> and p22<i>phox</i> subunits. Deficiency of either subunit leads to severe immunodeficiency. We describe Eros (essential for reactive oxygen species), a protein encoded by the previously undefined mouse gene <i>bc017643</i>, and show that it is essential for host defense via the phagocyte NADPH oxidase. Eros is required for expression of the NADPH oxidase components, gp91<i>phox</i> and p22<i>phox</i>. Consequently, <i>Eros</i>-deficient mice quickly succumb to infection. <i>Eros</i> also contributes to the formation of neutrophil extracellular traps (NETs) and impacts on the immune response to melanoma metastases. <i>Eros</i> is an ortholog of the plant protein Ycf4, which is necessary for expression of proteins of the photosynthetic photosystem I complex, itself also an NADPH oxidoreductase. We thus describe the key role of the previously uncharacterized protein Eros in host defense.
Enteric fever, caused by Salmonella enterica serovar Typhi, is an important public health problem in resource-limited settings and, despite decades of research, human responses to the infection are poorly understood. In 41 healthy adults experimentally infected with wild-type S. Typhi, we detected significant cytokine responses within 12 h of bacterial ingestion. These early responses did not correlate with subsequent clinical disease outcomes and likely indicate initial host-pathogen interactions in the gut mucosa. In participants developing enteric fever after oral infection, marked transcriptional and cytokine responses during acute disease reflected dominant type I/II interferon signatures, which were significantly associated with bacteremia. Using a murine and macrophage infection model, we validated the pivotal role of this response in the expression of proteins of the host tryptophan metabolism during Salmonella infection. Corresponding alterations in tryptophan catabolites with immunomodulatory properties in serum of participants with typhoid fever confirmed the activity of this pathway, and implicate a central role of host tryptophan metabolism in the pathogenesis of typhoid fever.
Expression Atlas (http://www.ebi.ac.uk/gxa) provides information about gene and protein expression in animal and plant samples of different cell types, organism parts, developmental stages, diseases and other conditions. It consists of selected microarray and RNA-sequencing studies from ArrayExpress, which have been manually curated, annotated with ontology terms, checked for high quality and processed using standardised analysis methods. Since the last update, Atlas has grown seven-fold (1572 studies as of August 2015), and incorporates baseline expression profiles of tissues from Human Protein Atlas, GTEx and FANTOM5, and of cancer cell lines from ENCODE, CCLE and Genentech projects. Plant studies constitute a quarter of Atlas data. For genes of interest, the user can view baseline expression in tissues, and differential expression for biologically meaningful pairwise comparisons, estimated using consistent methodology across all of Atlas. Our first proteomics study in human tissues is now displayed alongside transcriptomics data in the same tissues. Novel analyses and visualisations include: 'enrichment' in each differential comparison of GO terms, Reactome, Plant Reactome pathways and InterPro domains; hierarchical clustering (by baseline expression) of most variable genes and experimental conditions; and, for a given gene-condition, distribution of baseline expression across biological replicates.
Many critical events in the Plasmodium life cycle rely on the controlled release of Ca²⁺ from intracellular stores to activate stage-specific Ca²⁺-dependent protein kinases. Using the motility of Plasmodium berghei ookinetes as a signalling paradigm, we show that the cyclic guanosine monophosphate (cGMP)-dependent protein kinase, PKG, maintains the elevated level of cytosolic Ca²⁺ required for gliding motility. We find that the same PKG-dependent pathway operates upstream of the Ca²⁺ signals that mediate activation of P. berghei gametocytes in the mosquito and egress of Plasmodium falciparum merozoites from infected human erythrocytes. Perturbations of PKG signalling in gliding ookinetes have a marked impact on the phosphoproteome, with a significant enrichment of in vivo regulated sites in multiple pathways including vesicular trafficking and phosphoinositide metabolism. A global analysis of cellular phospholipids demonstrates that in gliding ookinetes PKG controls phosphoinositide biosynthesis, possibly through the subcellular localisation or activity of lipid kinases. Similarly, phosphoinositide metabolism links PKG to egress of P. falciparum merozoites, where inhibition of PKG blocks hydrolysis of phosphatidylinositol (4,5)-bisphosphate. In the face of an increasing complexity of signalling through multiple Ca²⁺ effectors, PKG emerges as a unifying factor to control multiple cellular Ca²⁺ signals essential for malaria parasite development and transmission.
The accurate identification and description of the genes in the human and mouse genomes is a fundamental requirement for high quality analysis of data informing both genome biology and clinical genomics. Over the last 15 years, the GENCODE consortium has been producing reference quality gene annotations to provide this foundational resource. The GENCODE consortium includes both experimental and computational biology groups who work together to improve and extend the GENCODE gene annotation. Specifically, we generate primary data, create bioinformatics tools and provide analysis to support the work of expert manual gene annotators and automated gene annotation pipelines. In addition, manual and computational annotation workflows use any and all publicly available data and analysis, along with the research literature to identify and characterise gene loci to the highest standard. GENCODE gene annotations are accessible via the Ensembl and UCSC Genome Browsers, the Ensembl FTP site, Ensembl Biomart, Ensembl Perl and REST APIs as well as https://www.gencodegenes.org.
Alternative promoter usage and alternative splicing enable diversification of the transcriptome. Here we demonstrate that the function of Synaptic GTPase-Activating Protein (SynGAP), a key synaptic protein, is determined by the combination of its amino-terminal sequence with its carboxy-terminal sequence. 5' rapid amplification of cDNA ends and primer extension show that different N-terminal protein sequences arise through alternative promoter usage that is regulated by synaptic activity and postnatal age. Heterogeneity in C-terminal protein sequence arises through alternative splicing. Overexpression of SynGAP α1 versus α2 C-termini-containing proteins in hippocampal neurons has opposing effects on synaptic strength, decreasing and increasing miniature excitatory synaptic current amplitude/frequency, respectively. The magnitude of this C-terminal-dependent effect is modulated by the N-terminal peptide sequence. This is the first demonstration that activity-dependent alternative promoter usage can change the function of a synaptic protein at excitatory synapses. Furthermore, the direction and degree of synaptic modulation exerted by different protein isoforms from a single gene locus is dependent on the combination of differential promoter usage and alternative splicing.
A small number of rare, recurrent genomic copy number variants (CNVs) are known to substantially increase susceptibility to schizophrenia. As a consequence of the low fecundity in people with schizophrenia and other neurodevelopmental phenotypes to which these CNVs contribute, CNVs with large effects on risk are likely to be rapidly removed from the population by natural selection. Accordingly, such CNVs must frequently occur as recurrent de novo mutations. In a sample of 662 schizophrenia proband-parent trios, we found that rare de novo CNV mutations were significantly more frequent in cases (5.1% all cases, 5.5% family history negative) compared with 2.2% among 2623 controls, confirming the involvement of de novo CNVs in the pathogenesis of schizophrenia. Eight de novo CNVs occurred at four known schizophrenia loci (3q29, 15q11.2, 15q13.3 and 16p11.2). De novo CNVs of known pathogenic significance in other genomic disorders were also observed, including deletion at the TAR (thrombocytopenia absent radius) region on 1q21.1 and duplication at the WBS (Williams-Beuren syndrome) region at 7q11.23. Multiple de novos spanned genes encoding members of the DLG (discs large) family of membrane-associated guanylate kinases (MAGUKs) that are components of the postsynaptic density (PSD). Two de novos also affected EHMT1, a histone methyl transferase known to directly regulate DLG family members. Using a systems biology approach and merging novel CNV and proteomics data sets, systematic analysis of synaptic protein complexes showed that, compared with control CNVs, case de novos were significantly enriched for the PSD proteome (P=1.72 × 10⁻⁶). This was largely explained by enrichment for members of the N-methyl-D-aspartate receptor (NMDAR) (P=4.24 × 10⁻⁶) and neuronal activity-regulated cytoskeleton-associated protein (ARC) (P=3.78 × 10⁻⁸) postsynaptic signalling complexes. In an analysis of 18 492 subjects (7907 cases and 10 585 controls), case CNVs were enriched for members of the NMDAR complex (P=0.0015) but not ARC (P=0.14). Our data indicate that defects in NMDAR postsynaptic signalling and, possibly, ARC complexes, which are known to be important in synaptic plasticity and cognition, play a significant role in the pathogenesis of schizophrenia.
Citrobacter rodentium is a natural mouse pathogen that causes attaching and effacing (A/E) lesions. It shares a common virulence strategy with the clinically significant human A/E pathogens enteropathogenic E. coli (EPEC) and enterohaemorrhagic E. coli (EHEC) and is widely used to model this route of pathogenesis. We previously reported the complete genome sequence of C. rodentium ICC168, where we found that the genome displayed many characteristics of a newly evolved pathogen. In this study, through PFGE, sequencing of isolates showing variation, whole genome transcriptome analysis and examination of the mobile genetic elements, we found that, consistent with our previous hypothesis, the genome of C. rodentium is unstable as a result of repeat-mediated, large-scale genome recombination and because of active transposition of mobile genetic elements such as the prophages. We sequenced an additional C. rodentium strain, EX-33, to reveal that the reference strain ICC168 is representative of the species and that most of the inactivating mutations were common to both isolates and likely to have occurred early on in the evolution of this pathogen. We draw parallels with the evolution of other bacterial pathogens and conclude that C. rodentium is a recently evolved pathogen that may have emerged alongside the development of inbred mice as a model for human disease.
A number of bacteriophages have been identified that target the Vi capsular antigen of Salmonella enterica serovar Typhi. Here we show that these Vi phages represent a remarkably diverse set of phages belonging to three phage families, including Podoviridae and Myoviridae. Genome analysis facilitated the further classification of these phages and highlighted aspects of their independent evolution. Significantly, a conserved protein domain carrying an acetyl esterase was found to be associated with at least one tail fiber gene for all Vi phages, and the presence of this domain was confirmed in representative phage particles by mass spectrometric analysis. Thus, we provide a simple explanation and paradigm of how a diverse group of phages target a single key virulence antigen associated with this important human-restricted pathogen.
The molecular complexity of mammalian proteomes demands new methods for mapping the organization of multiprotein complexes. Here, we combine mouse genetics and proteomics to characterize synapse protein complexes and interaction networks. New tandem affinity purification (TAP) tags were fused to the carboxyl terminus of PSD-95 using gene targeting in mice. Homozygous mice showed no detectable abnormalities in PSD-95 expression, subcellular localization or synaptic electrophysiological function. Analysis of multiprotein complexes purified under native conditions by mass spectrometry defined known and new interactors: 118 proteins comprising crucial functional components of synapses, including glutamate receptors, K+ channels, scaffolding and signaling proteins, were recovered. Network clustering of protein interactions generated five connected clusters, with two clusters containing all the major ionotropic glutamate receptors and one cluster with voltage-dependent K+ channels. Annotation of clusters with human disease associations revealed that multiple disorders map to the network, with a significant correlation of schizophrenia within the glutamate receptor clusters. This targeted TAP tagging strategy is generally applicable to mammalian proteomics and systems biology approaches to disease.
The mammalian postsynaptic density (PSD) comprises a complex collection of approximately 1100 proteins. Despite extensive knowledge of individual proteins, the overall organization of the PSD is poorly understood. Here, we define maps of molecular circuitry within the PSD based on phosphorylation of postsynaptic proteins. Activation of a single neurotransmitter receptor, the N-methyl-D-aspartate receptor (NMDAR), changed the phosphorylation status of 127 proteins. Stimulation of ionotropic and metabotropic glutamate receptors and dopamine receptors activated overlapping networks with distinct combinatorial phosphorylation signatures. Using peptide array technology, we identified specific phosphorylation motifs and switching mechanisms responsible for the integration of neurotransmitter receptor pathways and their coordination of multiple substrates in these networks. These combinatorial networks confer high information-processing capacity and functional diversity on synapses, and their elucidation may provide new insights into disease mechanisms and new opportunities for drug discovery.
Understanding the origins and evolution of synapses may provide insight into species diversity and the organization of the brain. Using comparative proteomics and genomics, we examined the evolution of the postsynaptic density (PSD) and membrane-associated guanylate kinase (MAGUK)-associated signaling complexes (MASCs) that underlie learning and memory. PSD and MASC orthologs found in yeast carry out basic cellular functions to regulate protein synthesis and structural plasticity. We observed marked changes in signaling complexity at the yeast-metazoan and invertebrate-vertebrate boundaries, with an expansion of key synaptic components, notably receptors, adhesion/cytoskeletal proteins and scaffold proteins. A proteomic comparison of Drosophila and mouse MASCs revealed species-specific adaptation with greater signaling complexity in mouse. Although synaptic components were conserved amongst diverse vertebrate species, mapping mRNA and protein expression in the mouse brain showed that vertebrate-specific components preferentially contributed to differences between brain regions. We propose that the evolution of synapse complexity around a core proto-synapse has contributed to invertebrate-vertebrate differences and to brain specialization.
Some bacteriophages target potentially pathogenic bacteria by exploiting surface-associated virulence factors as receptors. For example, phage have been identified that exhibit specificity for Vi capsule producing Salmonella enterica serovar Typhi. Here we have characterized the Vi-associated E1-typing bacteriophage using a number of molecular approaches. The absolute requirement for Vi capsule expression for infectivity was demonstrated using different Vi-negative S. enterica derivatives. The phage particles were shown to have an icosahedral head and a long noncontractile tail structure. The genome is 45,362 bp in length with defined capsid and tail regions that exhibit significant homology to the S. enterica transducing phage ES18. Mass spectrometry was used to confirm the presence of a number of hypothetical proteins in the Vi phage E1 particle and demonstrate that a number of phage proteins are modified posttranslationally. The genome of the Vi phage E1 is significantly related to other bacteriophages belonging to the same serovar Typhi phage-typing set, and we demonstrate a role for phage DNA modification in determining host specificity.
Characterization of the composition of the postsynaptic proteome (PSP) provides a framework for understanding the overall organization and function of the synapse in normal and pathological conditions. We have identified 698 proteins from the postsynaptic terminal of mouse CNS synapses using a series of purification strategies and analysis by liquid chromatography tandem mass spectrometry and large-scale immunoblotting. Some 620 proteins were found in purified postsynaptic densities (PSDs), nine in AMPA-receptor immuno-purifications, 100 in isolates using an antibody against the NMDA receptor subunit NR1, and 170 by peptide-affinity purification of complexes with the C-terminus of NR2B. Together, the NR1 and NR2B complexes contain 186 proteins, collectively referred to as membrane-associated guanylate kinase-associated signalling complexes. We extracted data from six other synapse proteome experiments and combined these with our data to provide a consensus on the composition of the PSP. In total, 1124 proteins are present in the PSP, of which 466 were validated by their detection in two or more studies, forming what we have designated the Consensus PSD. These synapse proteome data sets offer a basis for future research in synaptic biology and will provide useful information in brain disease and mental disorder studies.
Reversible protein phosphorylation mediated by kinases, phosphatases, and regulatory molecules is an essential mechanism of signal transduction in living cells. Although phosphorylation is the most intensively studied of the several hundred known posttranslational modifications on proteins, until recently the rate of identification of phosphorylation sites has remained low. The use of tandem mass spectrometry has greatly accelerated the identification of phosphorylation sites, although progress was limited by difficulties in phosphoresidue enrichment techniques. We have improved upon existing immobilized metal-affinity chromatography (IMAC) techniques for capturing phosphopeptides, to selectively purify phosphoproteins from complex mixtures. Combinations of phosphoprotein and phosphopeptide enrichment were more effective than current single phosphopeptide purification approaches. We have also implemented iterative mass spectrometry-based scanning techniques to improve detection of phosphorylated peptides in these enriched samples. Here, we provide detailed instructions for implementing and validating these methods together with analysis by tandem mass spectrometry for the study of phosphorylation at the mammalian synapse. This strategy should be widely applicable to the characterization of protein phosphorylation in diverse tissues, organelles, and in cell culture.
In the nervous system, protein phosphorylation is an essential feature of synaptic function. Although protein phosphorylation is known to be important for many synaptic processes and in disease, little is known about global phosphorylation of synaptic proteins. Heterogeneity and low abundance make protein phosphorylation analysis difficult, particularly for mammalian tissue samples. Using a new approach, combining both protein and peptide immobilized metal affinity chromatography and mass spectrometry data acquisition strategies, we have produced the first large scale map of the mouse synapse phosphoproteome. We report over 650 phosphorylation events corresponding to 331 sites (289 have been unambiguously assigned), 92% of which are novel. These represent 79 proteins, half of which are novel phosphoproteins, and include several highly phosphorylated proteins such as MAP1B (33 sites) and Bassoon (30 sites). An additional 149 candidate phosphoproteins were identified by profiling the composition of the protein immobilized metal affinity chromatography enrichment. All major synaptic protein classes were observed, including components of important pre- and postsynaptic complexes as well as low abundance signaling proteins. Bioinformatic and in vitro phosphorylation assays of peptide arrays suggest that a small number of kinases phosphorylate many proteins and that each substrate is phosphorylated by many kinases. These data substantially increase existing knowledge of synapse protein phosphorylation and support a model where the synapse phosphoproteome is functionally organized into a highly interconnected signaling network.
Using mass spectrometry we have identified proteins which co-immunoprecipitate with paxillin, an adaptor protein implicated in the integrin-mediated signaling pathways of cell motility. A major component of paxillin immunoprecipitates was poly(A)-binding protein 1, a 70-kDa mRNA-binding protein. Poly(A)-binding protein 1 associated with both the alpha and beta isoforms of paxillin, and this was unaffected by RNase treatment, consistent with a protein-protein interaction. The NH₂-terminal region of paxillin (residues 54-313) associated directly with poly(A)-binding protein 1 in cell lysates, and with His-poly(A)-binding protein 1 immobilized in microtiter wells. Binding was specific, saturable and of high affinity (Kd of approximately 10 nM). Cell fractionation studies showed that at steady state, the bulk of paxillin and poly(A)-binding protein 1 was present in the "dense" polyribosome-associated endoplasmic reticulum. However, inhibition of nuclear export with leptomycin B caused paxillin and poly(A)-binding protein 1 to accumulate in the nucleus, indicating that they shuttle between the nuclear and cytoplasmic compartments. When cells migrate, poly(A)-binding protein 1 colocalized with paxillin-beta at the tips of lamellipodia. Our results suggest a new mechanism whereby a paxillin·poly(A)-binding protein 1 complex facilitates transport of mRNA from the nucleus to sites of protein synthesis at the endoplasmic reticulum and the leading lamella during cell migration.
A mass spectrometric analysis of proteins partitioning into Triton X-114 from purified hepatic Golgi apparatus (84% purity by morphometry, 122-fold enrichment over the homogenate for the Golgi marker galactosyl transferase) led to the unambiguous identification of 81 proteins including a novel Golgi-associated protein of 34 kDa (GPP34). The membrane protein complement was resolved by SDS-polyacrylamide gel electrophoresis and subjected to a hierarchical approach using delayed extraction matrix-assisted laser desorption ionization mass spectrometry characterization by peptide mass fingerprinting, tandem mass spectrometry to generate sequence tags, and Edman sequencing of proteins. Major membrane proteins corresponded to known Golgi residents, a Golgi lectin, anterograde cargo, and an abundance of trafficking proteins including KDEL receptors, p24 family members, SNAREs, Rabs, a single ARF-guanine nucleotide exchange factor, and two SCAMPs. Analytical fractionation and gold immunolabeling of proteins in the purified Golgi fraction were used to assess the intra-Golgi and total cellular distribution of GPP34, two SNAREs, SCAMPs, and the trafficking proteins GBF1, BAP31, and alpha(2)P24 identified by the proteomics approach as well as the endoplasmic reticulum contaminant calnexin. Although GPP34 has never previously been identified as a protein, the localization of GPP34 to the Golgi complex, the conservation of GPP34 from yeast to humans, and the cytosolically exposed location of GPP34 predict a role for a novel coat protein in Golgi trafficking.
Given the global burden of diarrheal diseases on healthcare, it is surprising how little is known about the drivers of disease severity. Colitis caused by infection and inflammatory bowel disease (IBD) is characterised by neutrophil infiltration into the intestinal mucosa, and yet our understanding of neutrophil responses during colitis is incomplete. Using infectious (Citrobacter rodentium) and chemical (dextran sulphate sodium; DSS) murine colitis models, as well as human IBD samples, we find that faecal neutrophil elastase (NE) activity reflects disease severity. During C. rodentium infection, intestinal epithelial cells secrete the serine protease inhibitor SerpinA3N to inhibit extracellular NE and mitigate the tissue damage it causes. Mice suffering from severe infection produce insufficient SerpinA3N to control excessive NE activity. This activity contributes to colitis severity, as infection of these mice with a recombinant C. rodentium strain producing and secreting SerpinA3N reduces tissue damage. Thus, uncontrolled luminal NE activity is involved in severe colitis. Taken together, our findings suggest that NE activity could be a useful faecal biomarker for assessing disease severity as well as a therapeutic target for both infectious and chronic inflammatory colitis.
Naïve CD4⁺ T cells coordinate the immune response by acquiring an effector phenotype in response to cytokines. However, the cytokine responses in memory T cells remain largely understudied. Here we use quantitative proteomics, bulk RNA-seq, and single-cell RNA-seq of over 40,000 human naïve and memory CD4⁺ T cells to show that responses to cytokines differ substantially between these cell types. Memory T cells are unable to differentiate into the Th2 phenotype, and acquire a Th17-like phenotype in response to iTreg polarization. Single-cell analyses show that T cells constitute a transcriptional continuum that progresses from naïve to central and effector memory T cells, forming an effectorness gradient accompanied by an increase in the expression of chemokines and cytokines. Finally, we show that T cell activation and cytokine responses are influenced by the effectorness gradient. Our results illustrate the heterogeneity of T cell responses, furthering our understanding of inflammation.
Background: Osteoarthritis (OA) is a common disease characterized by cartilage degeneration and joint remodeling. The underlying molecular changes underpinning disease progression are incompletely understood, but can be characterized using recent advances in genomics technologies, as the relevant tissue is readily accessible at joint replacement surgery. Here we investigate genes and pathways that mark OA progression, combining genome-wide DNA methylation, RNA sequencing and quantitative proteomics in isolated primary chondrocytes from matched intact and degraded articular cartilage samples across twelve patients with OA undergoing knee replacement surgery. Results: We identify 49 genes differentially regulated between intact and degraded cartilage at multiple omics levels, 16 of which have not previously been implicated in OA progression. Using independent replication datasets, we replicate statistically significant signals and show that the direction of change is consistent for over 90% of differentially expressed genes and differentially methylated CpG probes. Three genes are differentially regulated across all three omics levels: AQP1, COL1A1 and CLEC3B, and all three have evidence implicating them in OA through animal or cellular model studies. Integrated pathway analysis implicates the involvement of extracellular matrix degradation, collagen catabolism and angiogenesis in disease progression. All data from these experiments are freely available as a resource for the scientific community. Conclusions: This work provides a first integrated view of the molecular landscape of human primary chondrocytes and identifies key molecular players in OA progression that replicate across independent datasets, with evidence for translational potential.
Background: POLG, located on nuclear chromosome 15, encodes the DNA polymerase γ (Pol γ). Pol γ is responsible for the replication and repair of mitochondrial DNA (mtDNA). Pol γ is the only DNA polymerase found in mitochondria for most animal cells. Mutations in POLG are the most common single-gene cause of diseases of mitochondria and have been mapped over the coding region of the POLG ORF. Results: Using PhyloCSF to survey alternative reading frames, we found a conserved coding signature in an alternative frame in exons 2 and 3 of POLG, herein referred to as ORF-Y, that arose de novo in placental mammals. Using the synplot2 program, synonymous site conservation was found among mammals in the region of the POLG ORF that is overlapped by ORF-Y. Ribosome profiling data revealed that ORF-Y is translated and that initiation likely occurs at a CUG codon. Inspection of an alignment of mammalian sequences containing ORF-Y revealed that the CUG codon has a strong initiation context and that a well-conserved predicted RNA stem-loop begins 14 nucleotides downstream. Such features are associated with enhanced initiation at near-cognate non-AUG codons. Reanalysis of the Kim et al. (2014) draft human proteome dataset yielded two unique peptides that map unambiguously to ORF-Y. An additional conserved uORF, herein referred to as ORF-Z, was also found in exon 2 of POLG. Lastly, we surveyed ClinVar variants that are synonymous with respect to the POLG ORF and found that most of these variants cause amino acid changes in ORF-Y or ORF-Z. Conclusions: We provide evidence for a novel coding sequence, ORF-Y, that overlaps the POLG ORF. Ribosome profiling and mass spectrometry data show that ORF-Y is expressed. PhyloCSF and synplot2 analysis show that ORF-Y is subject to strong purifying selection. An abundance of disease-correlated mutations that map to exons 2 and 3 of POLG but also affect ORF-Y provides potential clinical significance to this finding.
How the cell rapidly and completely reorganizes its architecture when it divides is a problem that has fascinated researchers for almost 150 years. We now know that the core regulatory machinery is highly conserved in eukaryotes, but how these multiple protein kinases, protein phosphatases, and ubiquitin ligases are coordinated in space and time to remodel the cell in a matter of minutes remains a major question. Cyclin B1-Cdk1 is the primary kinase that drives mitotic remodeling; here we show that it is targeted to the nuclear pore complex (NPC) by binding an acidic face of the kinetochore checkpoint protein, MAD1, where it coordinates NPC disassembly with kinetochore assembly. Localized cyclin B1-Cdk1 is needed for the proper release of MAD1 from the embrace of TPR at the nuclear pore so that it can be recruited to kinetochores before nuclear envelope breakdown to maintain genomic stability.
Environmentally induced hyperthermia compromises animal production, with drastic economic consequences for global animal agriculture, and jeopardizes animal welfare. Heat stress is a major stressor that results from an imbalance between heat production within the body and its dissipation, and it affects animals at cellular, molecular and ecological levels. The molecular mechanism underlying the physiology of heat stress in cattle remains undefined. The present study sought to evaluate mRNA expression profiles in cattle blood in response to heat stress. We report the genes that were differentially expressed in response to heat stress using global-scale genome expression technology (microarray). Four Sahiwal heifers were exposed to 42°C with 90% humidity for 4 h followed by normothermia. Gene expression changes include activation of heat shock transcription factor 1 (HSF1), increased expression of heat shock proteins (HSP), decreased expression and synthesis of other proteins, and immune system activation via extracellular secretion of HSP. A cDNA microarray analysis found 140 transcripts to be up-regulated and 77 down-regulated in cattle blood after heat treatment (P<0.05). However, a comprehensive explanation for the direction of fold change and the specific genes involved in the response to acute heat stress remains to be explored. These findings may provide insights into the underlying mechanism of the physiology of heat stress in cattle. Understanding the biology and mechanisms of heat stress is critical to developing approaches to ameliorate current production issues and to improve animal performance and agricultural economics.
Introduction: Streptococcus pneumoniae is a major cause of mortality and morbidity in young children and the elderly. In the present study we evaluated antimicrobial susceptibilities, serotypes, and sequence types of pneumococcal isolates recovered in New Delhi, India. Methodology: A total of 126 clinical isolates of Streptococcus pneumoniae were investigated. They were subjected to disk diffusion susceptibility testing, broth microdilution testing, serotyping and multilocus sequence typing. Results: Broth microdilution assay showed that 5%, 20% and 23% of the isolates exhibited resistance to penicillin, erythromycin and ciprofloxacin, respectively. Serotypes 19, 1 and 6 were more frequently isolated. Thirty per cent of the strains comprised serotypes 1, 3, 5, 19A and 7F, which are not included in the seven-valent vaccine. Fifty-nine isolates were typed using multilocus sequence typing. Thirty new sequence types were encountered in this study. Only one clonal complex with 4 isolates was seen; 11 clonal complexes and 96 sequence types (STs) were observed among 115 Indian isolates. Only 18 of the 96 STs were found globally, of which only 4 STs were found in many countries with larger numbers. Conclusions: This study identifies the non-vaccine serotypes of Streptococcus pneumoniae circulating in India. It is important that an appropriate vaccine which covers all serotypes is used in the region.
Using the detergents n-dodecyl beta-D-maltoside and heptyl thioglycopyranoside, a subcore complex of photosystem II (PSII) has been isolated that contains the chlorophyll-binding protein, CP47, and the reaction center components, D1, D2, and cytochrome b559. We have found, by using sucrose density centrifugation, that the resulting preparation consisted of a mixture of dimeric and monomeric forms of the CP47 reaction center (RC) complex, having molecular masses of 410 +/- 30 and 200 +/- 28 kDa, respectively, as estimated by size exclusion chromatography. The level of the dimer in the preparation is significantly higher than that of the monomeric form. Both the monomer and dimer contain the proteins CP47, D1, and D2 and the alpha- and beta-subunits of cytochrome b559. Analyses by mass spectrometry and N-terminal sequencing showed that both forms of the CP47-RC complex contain the products of the psbI, psbTc (chloroplast gene), and psbW genes with molecular masses of 4195.5, 3849.6, and 5927.4 Da, respectively. In contrast to the monomeric form, the CP47-RC dimer contained two extra proteins with low molecular weights, identified as the products of the psbL and psbK genes having molecular masses of 4365.5 and 4292.1 Da, respectively. It was also found that the dimer contained slightly more molecules of chlorophyll a (21 +/- 2.5) than the monomer (18 +/- 1.5), a characteristic also observed in the room temperature absorption spectrum by comparing the ratio of absorption at 416 and 435 nm. Of particular note was the finding that the dimer, but not the monomer, contained plastoquinone-9 (estimated to be 1.5 +/- 0.3 molecules per RC). The results indicate that the CP47-RC monomer is derived from the dimeric form of the complex, and therefore the latter is likely to represent an in vivo conformation. The PsbTc as well as the PsbI and PsbW proteins are identified as being intimately associated with the D1 and D2 proteins, and in the case of the dimer, importance is placed on the PsbL and PsbK proteins in sustaining plastoquinone binding and maintenance of the dimeric organization. Assuming only one copy of the alpha- and beta-subunits of cytochrome b559, the monomeric and dimeric forms of the complex would be expected to contain 21 and 23 x 2 transmembrane helices, respectively.
Mass spectrometry techniques have been applied in a protein mapping strategy to elucidate the majority of the primary structures of the D1 and D2 proteins present in the photosystem II reaction center. Evidence verifying the post-translational processing of the initiating methionine residue and acetylation of the free amino group, similar to that reported for other higher plant species, is presented for the two subunits from pea plants (Pisum sativum L.). Further covalent modifications observed on the D1 protein include the COOH-terminal processing with a loss of nine amino acids and phosphorylation of Thr2. In addition, the studies reported in this paper provide the first definitive characterization of oxidations on specific amino acids of the D1 and D2 proteins. We believe that these oxidations, and to a much lesser extent the phosphorylations, are major contributors to the heterogeneity observed during the electrospray analysis of the intact subunits reported in the accompanying paper (Sharma, J., Panico, M., Barber, J., and Morris, H. R. (1997) J. Biol. Chem. 272, 33153-33157). Significantly, all of the regions that have been identified as those particularly susceptible to oxidation are anticipated (from current models) to be in close proximity to the redox active components of the photosystem II complex.
A sensitive and simple reverse phase HPLC purification scheme was developed for the rapid separation of the small protein subunits from photosystem II reaction center preparations. The precise molecular masses of the alpha- and beta-subunits of cytochrome b559 and the psbI gene product from pea plants, found to be 4394.6 +/- 0.6, 9283.6 +/- 0.7, and 4209.5 +/- 0.5 Da, respectively, were then successfully determined for the first time by electrospray- and fast atom bombardment-mass spectrometry. Discrepancies between the molecular weights assigned and those calculated from the respective DNA sequences were observed for alpha- and beta-subunits of cytochrome b559. Currently, the nucleotide sequence of the psbI gene product from pea plants is not available. Application of novel mapping and sequencing strategies has assured the elucidation of full primary structures of all of the purified subunits. The modifications identified here include the post-translational processing of the initiating methionine on both subunits of cytochrome b559, NH2-terminal acetylation and an mRNA editing site at residue 26 (Ser --> Phe) on the beta-subunit, and retention of the NH2-terminal formyl-Met on the psbI gene product. In addition, specific oxidation of a single amino acid residue was identified on the psbI gene product and the beta-subunit purified from light-treated reaction center preparations. Overall, these studies provide the first detailed primary structural characterization of the small subunits of the reaction center complex and their associated light-induced modifications.
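As a rough illustration of how modifications such as those listed above can account for a gap between a measured mass and a DNA-predicted mass, the sketch below sums the expected mass shifts for the beta-subunit case (initiating methionine processing, NH2-terminal acetylation, and the Ser to Phe edit). It assumes that "processing" means removal of the initiating methionine, and it uses approximate average masses rounded to two decimals; the function and variable names are hypothetical and are not taken from the paper.

```python
# Approximate average masses in Da, rounded to two decimals (assumed values
# for illustration; not taken from the paper).
RESIDUE_MASS = {"Met": 131.19, "Ser": 87.08, "Phe": 147.18}
ACETYL_SHIFT = 42.04  # N-terminal acetylation adds C2H2O


def net_mass_shift(remove_initiator_met, acetylated, substitutions):
    """Return the expected mass change relative to the DNA-predicted mass.

    substitutions is a list of (old_residue, new_residue) pairs.
    """
    shift = 0.0
    if remove_initiator_met:
        # Assumption: processing removes the whole initiating Met residue.
        shift -= RESIDUE_MASS["Met"]
    if acetylated:
        shift += ACETYL_SHIFT
    for old, new in substitutions:
        shift += RESIDUE_MASS[new] - RESIDUE_MASS[old]
    return shift


# Beta-subunit case described in the abstract: Met processing, acetylation,
# and one Ser -> Phe change from mRNA editing.
beta_shift = net_mass_shift(True, True, [("Ser", "Phe")])
print(f"Expected net shift: {beta_shift:+.2f} Da")
```

A measured mass would then be compared against the DNA-predicted mass plus this net shift, within the reported measurement error.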
A reverse phase high pressure liquid chromatography purification system for the rapid separation of photosystem II reaction center proteins free of salts and detergents is described. This procedure results in the isolation of the three small subunits: alpha- and beta-subunits of cytochrome b559 and PsbI protein, with near base-line resolution between each peak, although the D1 and D2 proteins were partially deconvoluted. The molecular masses obtained by electrospray ionization mass spectrometry for the purified beta-subunit of cytochrome b559, alpha-subunit of cytochrome b559, and the PsbI protein, 4,394.8 +/- 0.4, 9,283.7 +/- 0.8, and 4,209.5 +/- 0.4 Da, respectively, are in excellent agreement with values obtained from previous characterization studies (Sharma, J., Panico, M., Barber, J., and Morris, H. R. (1997) J. Biol. Chem. 272, 3935-3943). Direct electrospray analysis of the D1 and D2 proteins suggests that these components exist in heterogeneous forms. The molecular masses ascribed to a predominant form of the D1 protein, 38,040.9 +/- 6.5 Da, and the D2 protein, 39,456.1 +/- 7.7 Da, are also in agreement with those expected for the mature nonphosphorylated states of these subunits.
Intellectual disability (ID) is a heterogeneous clinical entity and includes an excess of males who harbor variants on the X-chromosome (XLID). We report rare FAM50A missense variants in the original Armfield XLID syndrome family localized in Xq28 and four additional unrelated males with overlapping features. Our fam50a knockout (KO) zebrafish model exhibits abnormal neurogenesis and craniofacial patterning, and in vivo complementation assays indicate that the patient-derived variants are hypomorphic. RNA sequencing analysis of fam50a KO zebrafish shows dysregulation of the transcriptome, with augmented spliceosome mRNAs and depletion of transcripts involved in neurodevelopment. Zebrafish RNA-seq datasets show a preponderance of 3' alternative splicing events in fam50a KO, suggesting a role in the spliceosome C complex. These data are supported by transcriptomic signatures from cell lines derived from affected individuals and FAM50A protein-protein interaction data. In sum, Armfield XLID syndrome is a spliceosomopathy associated with aberrant mRNA processing during development.
Lung cancer in East Asia is characterized by a high percentage of never-smokers, early onset and predominant EGFR mutations. To illuminate the molecular phenotype of this demographically distinct disease, we performed a deep comprehensive proteogenomic study on a prospectively collected cohort in Taiwan, representing early stage, predominantly female, non-smoking lung adenocarcinoma. Integrated genomic, proteomic, and phosphoproteomic analysis delineated the demographically distinct molecular attributes and hallmarks of tumor progression. Mutational signature analysis revealed age- and gender-related mutagenesis mechanisms, characterized by high prevalence of APOBEC mutational signature in younger females and over-representation of environmental carcinogen-like mutational signatures in older females. A proteomics-informed classification distinguished the clinical characteristics of early stage patients with EGFR mutations. Furthermore, integrated protein network analysis revealed the cellular remodeling underpinning clinical trajectories and nominated candidate biomarkers for patient stratification and therapeutic intervention. This multi-omic molecular architecture may help develop strategies for management of early stage never-smoker lung adenocarcinoma.
The Encyclopedia of DNA Elements (ENCODE) Project launched in 2003 with the long-term goal of developing a comprehensive map of functional elements in the human genome. These included genes, biochemical regions associated with gene regulation (for example, transcription factor binding sites, open chromatin, and histone marks) and transcript isoforms. The marks serve as sites for candidate cis-regulatory elements (cCREs) that may serve functional roles in regulating gene expression<sup>1</sup>. The project has been extended to model organisms, particularly the mouse. In the third phase of ENCODE, nearly a million and more than 300,000 cCRE annotations have been generated for human and mouse, respectively, and these have provided a valuable resource for the scientific community.
MOTIVATION: Mass spectrometry (MS) based quantitative proteomics experiments typically assay a subset of up to 60% of the ∼20,000 human protein coding genes. Computational methods for imputing the missing values using RNA expression data usually allow only for imputations of proteins measured in at least some of the samples. In silico methods for comprehensively estimating abundances across all proteins are still missing. RESULTS: We propose a novel method using deep learning to extrapolate the observed protein expression values in label-free MS experiments to all proteins, leveraging gene functional annotations and RNA measurements as key predictive attributes. We tested our method on four datasets, including human cell lines and human and mouse tissues. Our method predicts the protein expression values with average R2 scores between 0.46 and 0.54, which is significantly better than predictions based on correlations using the RNA expression data alone. Moreover, we demonstrate that the derived models can be "transferred" across experiments and species. For instance, the model derived from human tissues gave an R2 = 0.51 when applied to mouse tissue data. We conclude that protein abundances generated in label-free MS experiments can be computationally predicted using functional annotated attributes and can be used to highlight aberrant protein abundance values.
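For readers unfamiliar with the R2 score quoted above, the following minimal sketch shows one common way to compute it: the coefficient of determination, 1 minus the ratio of residual to total sum of squares. The abstract does not state which definition the authors used, so this choice is an assumption, and the toy abundance values are invented purely for illustration.

```python
def r2_score(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    if ss_tot == 0:
        raise ValueError("observed values have zero variance")
    return 1.0 - ss_res / ss_tot


# Toy (made-up) log-scale protein abundances and model predictions.
observed = [1.0, 2.0, 3.0, 4.0]
predicted = [1.5, 1.5, 3.5, 3.5]
score = r2_score(observed, predicted)
print(f"R2 = {score:.2f}")
```

Under this definition a score of 1 means perfect prediction, and a score of 0 means the model does no better than predicting the mean observed abundance for every protein.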
We used the mouse attaching and effacing (A/E) pathogen Citrobacter rodentium, which models the human A/E pathogens enteropathogenic Escherichia coli and enterohemorrhagic E. coli (EPEC and EHEC), to temporally resolve intestinal epithelial cell (IEC) responses and changes to the microbiome during in vivo infection. We found the host to be unresponsive during the first 3 days postinfection (DPI), when C. rodentium resides in the caecum. In contrast, at 4 DPI, the day of colonic colonization, despite only sporadic adhesion to the apex of the crypt, we observed robust upregulation of cell cycle and DNA repair processes, which were associated with expansion of the crypt Ki67-positive replicative zone, and downregulation of multiple metabolic processes (including the tricarboxylic acid [TCA] cycle and oxidative phosphorylation). Moreover, we observed dramatic depletion of goblet and deep crypt secretory cells and an atypical regulation of cholesterol homeostasis in IECs during early infection, with simultaneous upregulation of cholesterol biogenesis (e.g., 3-hydroxy-3-methylglutaryl-coenzyme A reductase [Hmgcr]), import (e.g., low-density lipoprotein receptor [Ldlr]), and efflux (e.g., AbcA1). We also detected interleukin 22 (IL-22) responses in IECs (e.g., Reg3γ) on the day of colonic colonization, which occurred concomitantly with a bloom of commensal Enterobacteriaceae on the mucosal surface. These results unravel a new paradigm in host-pathogen-microbiome interactions, showing for the first time that sensing a small number of pathogenic bacteria triggers swift intrinsic changes to the IEC composition and function, in tandem with significant changes to the mucosa-associated microbiome, which parallel innate immune responses. IMPORTANCE: The mouse pathogen C. rodentium is a widely used model for colonic infection and has been a major tool in fundamental discoveries in the fields of bacterial pathogenesis and mucosal immunology.
Despite extensive studies probing acute C. rodentium infection, our understanding of the early stages preceding the infection climax remains relatively undetailed. To this end, we apply a multiomics approach to resolve temporal changes to the host and microbiome during early infection. Unexpectedly, we found immediate and dramatic responses occurring on the day of colonic infection, both in the host intestinal epithelial cells and in the microbiome. Our study suggests changes in cholesterol and carbon metabolism in epithelial cells are instantly induced upon pathogen detection in the colon, corresponding with a shift to primarily facultative anaerobes constituting the microbiome. This study contributes to our knowledge of disease pathogenesis and mechanisms of barrier regulation, which is required for development of novel therapeutics targeting the intestinal epithelium.
Isobaric labeling is a highly precise approach for protein quantification. However, due to the isolation interference problem, isobaric tagging suffers from ratio underestimation at the MS2 level. The use of narrow isolation widths is a rational approach to alleviate the interference problem; however, this approach compromises proteome coverage. We reasoned that although a very narrow isolation window will result in loss of peptide fragment ions, the reporter ion signals will be retained for a significant portion of the spectra. On the basis of this assumption, we have designed a dual isolation width acquisition (DIWA) method, in which each precursor is first fragmented with HCD using a standard isolation width for peptide identification and preliminary quantification, followed by a second MS2 HCD scan using a much narrower isolation width for the acquisition of quantitative spectra with reduced interference. We leverage the quantification obtained by the "narrow" scans to build linear regression models and apply these to decompress the fold-changes measured at the "standard" scans. We evaluate the DIWA approach using a nested two species/gene knockout TMT-6plex experimental design and discuss the perspectives of this approach.
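The regression step described above can be sketched as follows. This is a minimal illustration, assuming a single ordinary least-squares line fitted to paired log2 fold-changes (narrow-scan value as a function of standard-scan value) and then applied to every standard-scan measurement. The abstract does not give the exact model form, so the function names, the use of log2 values, and the toy numbers are all assumptions.

```python
from statistics import mean


def fit_line(x, y):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need at least two paired observations")
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("x values have zero variance")
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, my - slope * mx


def decompress(standard_log2fc, slope, intercept):
    """Map compressed standard-scan fold-changes onto the narrow-scan scale."""
    return [slope * v + intercept for v in standard_log2fc]


# Toy (made-up) paired log2 fold-changes for spectra that retained reporter
# ions in both scans; the standard scans are compressed towards zero.
standard = [-1.0, -0.5, 0.0, 0.5, 1.0]
narrow = [-2.0, -1.0, 0.0, 1.0, 2.0]

slope, intercept = fit_line(standard, narrow)

# Apply the model to standard-scan values that lack a usable narrow scan.
corrected = decompress([0.75, -0.25], slope, intercept)
print(slope, intercept, corrected)
```

Fitting on the subset of spectra with both scans and applying the line to all standard scans keeps the identification depth of the wide window while borrowing the accuracy of the narrow one, which is the trade-off the abstract describes.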
Objectives: To identify molecular differences between chondrocytes from osteophytic and articular cartilage tissue from OA patients. Methods: We investigated genes and pathways by combining genome-wide DNA methylation, RNA sequencing and quantitative proteomics in isolated primary chondrocytes from the cartilaginous layer of osteophytes and matched areas of low- and high-grade articular cartilage across nine patients with OA undergoing hip replacement surgery. Results: Chondrocytes from osteophytic cartilage showed widespread differences to low-grade articular cartilage chondrocytes. These differences were similar to, but more pronounced than, differences between chondrocytes from osteophytic and high-grade articular cartilage, and more pronounced than differences between high- and low-grade articular cartilage. We identified 56 genes with significant differences between osteophytic chondrocytes and low-grade articular cartilage chondrocytes on all three omics levels. Several of these genes have known roles in OA, including ALDH1A2 and cartilage oligomeric matrix protein, which have functional genetic variants associated with OA from genome-wide association studies. An integrative gene ontology enrichment analysis showed that differences between osteophytic and low-grade articular cartilage chondrocytes are associated with extracellular matrix organization, skeletal system development, platelet aggregation and regulation of ERK1 and ERK2 cascade. Conclusion: We present a first comprehensive view of the molecular landscape of chondrocytes from osteophytic cartilage as compared with articular cartilage chondrocytes from the same joints in OA. We found robust changes at genes relevant to chondrocyte function, providing insight into biological processes involved in osteophyte development and thus OA progression.
Mechanically activated, slowly adapting currents in sensory neurons have been linked to noxious mechanosensation. The conotoxin NMB-1 (noxious mechanosensation blocker-1) blocks such currents and inhibits mechanical pain. Using a biotinylated form of NMB-1 in mass spectrometry analysis, we identified 67 binding proteins in sensory neurons and a sensory neuron-derived cell line, of which the top candidate was annexin A6, a membrane-associated calcium-binding protein. Annexin A6-deficient mice showed increased sensitivity to mechanical stimuli. Sensory neurons from these mice showed increased activity of the cation channel Piezo2, which mediates a rapidly adapting mechano-gated current linked to proprioception and touch, and a decrease in mechanically activated, slowly adapting currents. Conversely, overexpression of annexin A6 in sensory neurons inhibited rapidly adapting currents that were partially mediated by Piezo2. Furthermore, overexpression of annexin A6 in sensory neurons attenuated mechanical pain in a mouse model of osteoarthritis, a disease in which mechanically evoked pain is particularly problematic. These data suggest that annexin A6 can be exploited to inhibit chronic mechanical pain.
Tyrosine phosphorylation is key for signal transduction from exogenous stimuli, including the defense against pathogens. Conversely, pathogens can subvert protein phosphorylation to control host immune responses and facilitate invasion and dissemination. The bacterial effectors EspJ and SeoC are injected into host cells through a type III secretion system by enteropathogenic and enterohemorrhagic Escherichia coli (EPEC and EHEC, respectively), Citrobacter rodentium, and Salmonella enterica, where they inhibit Src kinase by coupled amidation and ADP-ribosylation. C. rodentium, which is used to model EPEC and EHEC infections in humans, is a mouse pathogen triggering colonic crypt hyperplasia (CCH) and colitis. Enumeration of bacterial shedding and CCH confirmed that EspJ affects neither tolerance nor resistance to infection. However, comparison of the proteomes of intestinal epithelial cells isolated from mice infected with wild-type C. rodentium or C. rodentium encoding catalytically inactive EspJ revealed that EspJ-induced ADP-ribosylation regulates multiple nonreceptor tyrosine kinases in vivo. Investigation of the substrate repertoire of EspJ revealed that in HeLa and A549 cells, Src and Csk were significantly targeted; in polarized Caco2 cells, EspJ targeted Src and Csk and the Src family kinase (SFK) Yes1, while in differentiated Thp1 cells, EspJ modified Csk, the SFKs Hck and Lyn, the Tec family kinases Tec and Btk, and the adapter tyrosine kinase Syk. Furthermore, Abl (HeLa and Caco2) and Lyn (Caco2) were enriched specifically in the EspJ-containing samples. Biochemical assays revealed that EspJ, the only bacterial ADP-ribosyltransferase that targets mammalian kinases, controls immune responses and the Src/Csk signaling axis. IMPORTANCE: Enteropathogenic and enterohemorrhagic Escherichia coli (EPEC and EHEC, respectively) strains cause significant mortality and morbidity worldwide.
Citrobacter rodentium is a mouse pathogen used to model EPEC and EHEC pathogenesis in vivo. Diarrheal disease is triggered following injection of bacterial effectors, via a type III secretion system (T3SS), into intestinal epithelial cells (IECs). While insights into the role of the effectors were historically obtained from pathological, immunologic, or cell culture phenotypes, subtle roles of individual effectors in vivo are often masked. The aim of this study was to elucidate the role and specificity of the ADP-ribosyltransferase effector EspJ. For the first time, we show that the in vivo processes affected by a T3SS effector can be studied by comparing the proteomes of IECs extracted from mice infected with wild-type C. rodentium or an espJ catalytic mutant. We show that EspJ, the only bacterial ADP-ribosyltransferase that targets mammalian kinases, regulates the host immune response in vivo.
Label-free quantification of shotgun LC-MS/MS data is the prevailing approach in quantitative proteomics but remains computationally nontrivial. The central data analysis step is the detection of peptide-specific signal patterns, called features. Peptide quantification is facilitated by associating signal intensities in features with peptide sequences derived from MS2 spectra; however, missing values due to imperfect feature detection are a common problem. A feature detection approach that directly targets identified peptides (minimizing missing values) but also offers robustness against false-positive features (by assigning meaningful confidence scores) would thus be highly desirable. We developed a new feature detection algorithm within the OpenMS software framework, leveraging ideas and algorithms from the OpenSWATH toolset for DIA/SRM data analysis. Our software, FeatureFinderIdentification ("FFId"), implements a targeted approach to feature detection based on information from identified peptides. This information is encoded in an MS1 assay library, based on which ion chromatogram extraction and detection of feature candidates are carried out. Significantly, when analyzing data from experiments comprising multiple samples, our approach distinguishes between "internal" and "external" (inferred) peptide identifications (IDs) for each sample. On the basis of internal IDs, two sets of positive (true) and negative (decoy) feature candidates are defined. A support vector machine (SVM) classifier is then trained to discriminate between the sets and is subsequently applied to the "uncertain" feature candidates from external IDs, facilitating selection and confidence scoring of the best feature candidate for each peptide. This approach also enables our algorithm to estimate the false discovery rate (FDR) of the feature selection step. 
We validated FFId based on a public benchmark data set, comprising a yeast cell lysate spiked with protein standards that provide a known ground-truth. The algorithm reached almost complete (>99%) quantification coverage for the full set of peptides identified at 1% FDR (PSM level). Compared with other software solutions for label-free quantification, this is an outstanding result, which was achieved at competitive quantification accuracy and reproducibility across replicates. The FDR for the feature selection was estimated at a low 1.5% on average per sample (3% for features inferred from external peptide IDs). The FFId software is open-source and freely available as part of OpenMS (www.openms.org).
Proteogenomics leverages information derived from proteomic data to improve genome annotations. Of particular interest are "novel" peptides that provide direct evidence of protein expression for genomic regions not previously annotated as protein-coding. We present a modular, automated data analysis pipeline aimed at detecting such "novel" peptides in proteomic data sets. This pipeline implements criteria developed by proteomics and genome annotation experts for high-stringency peptide identification and filtering. Our pipeline is based on the OpenMS computational framework; it incorporates multiple database search engines for peptide identification and applies a machine-learning approach (Percolator) to post-process search results. We describe several new and improved software tools that we developed to facilitate proteogenomic analyses that enhance the wealth of tools provided by OpenMS. We demonstrate the application of our pipeline to a human testis tissue data set previously acquired for the Chromosome-Centric Human Proteome Project, which led to the addition of five new gene annotations on the human reference genome.
Red blood cell (RBC) invasion by Plasmodium merozoites requires multiple steps that are regulated by signaling pathways. Exposure of P. falciparum merozoites to the physiological signal of low K+, as found in blood plasma, leads to a rise in cytosolic Ca2+, which mediates microneme secretion, motility, and invasion. We have used global phosphoproteomic analysis of merozoites to identify signaling pathways that are activated during invasion. Using quantitative phosphoproteomics, we found 394 protein phosphorylation site changes in merozoites subjected to different ionic environments (high K+/low K+), 143 of which were Ca2+ dependent. These included a number of signaling proteins such as catalytic and regulatory subunits of protein kinase A (PfPKAc and PfPKAr) and calcium-dependent protein kinase 1 (PfCDPK1). Proteins of the 14-3-3 family interact with phosphorylated target proteins to assemble signaling complexes. Here, using coimmunoprecipitation and gel filtration chromatography, we demonstrate that Pf14-3-3I binds phosphorylated PfPKAr and PfCDPK1 to mediate the assembly of a multiprotein complex in P. falciparum merozoites. A phospho-peptide, P1, based on the Ca2+-dependent phosphosites of PKAr, binds Pf14-3-3I and disrupts assembly of the Pf14-3-3I-mediated multiprotein complex. Disruption of the multiprotein complex with P1 inhibits microneme secretion and RBC invasion. This study thus identifies a novel signaling complex that plays a key role in merozoite invasion of RBCs. Disruption of this signaling complex could serve as a novel approach to inhibit blood-stage growth of malaria parasites. IMPORTANCE: Invasion of red blood cells (RBCs) by Plasmodium falciparum merozoites is a complex process that is regulated by intricate signaling pathways. Here, we used phosphoproteomic profiling to identify the key proteins involved in signaling events during invasion.
We found changes in the phosphorylation of various merozoite proteins, including multiple kinases previously implicated in the process of invasion. We also found that a phosphorylation-dependent multiprotein complex including signaling kinases assembles during the process of invasion. Disruption of this multiprotein complex impairs merozoite invasion of RBCs, providing a novel approach for the development of inhibitors to block the growth of blood-stage malaria parasites.
Clustering of the enteropathogenic Escherichia coli (EPEC) type III secretion system (T3SS) effector translocated intimin receptor (Tir) by intimin leads to actin polymerisation and pyroptotic cell death in macrophages. The effect of Tir clustering on the viability of EPEC-infected intestinal epithelial cells (IECs) is unknown. We show that EPEC induces pyroptosis in IECs in a Tir-dependent but actin polymerisation-independent manner, which was enhanced by priming with interferon gamma (IFNγ). Mechanistically, Tir clustering triggers rapid Ca2+ influx, which induces lipopolysaccharide (LPS) internalisation, followed by activation of caspase-4 and pyroptosis. Knockdown of caspase-4 or gasdermin D (GSDMD), translocation of NleF (which blocks caspase-4), or chelation of extracellular Ca2+ inhibited EPEC-induced cell death. IEC lines with low endogenous abundance of GSDMD were resistant to Tir-induced cell death. Conversely, ATP-induced extracellular Ca2+ influx enhanced cell death, which confirmed the key regulatory role of Ca2+ in EPEC-induced pyroptosis. We reveal a novel mechanism through which infection with an extracellular pathogen leads to pyroptosis in IECs.
Osteoarthritis causes pain and functional disability for over 500 million people worldwide. To develop disease-stratifying tools and modifying therapies, we need a better understanding of the molecular basis of the disease in relevant tissue and cell types. Here, we study primary cartilage and synovium from 115 patients with osteoarthritis to construct a deep molecular signature map of the disease. By integrating genetics with transcriptomics and proteomics, we discover molecular trait loci in each tissue type and omics level, identify likely effector genes for osteoarthritis-associated genetic signals and highlight high-value targets for drug development and repurposing. These findings provide insights into disease aetiopathology, and offer translational opportunities in response to the global clinical challenge of osteoarthritis.
Melanoma represents ~5% of all cutaneous malignancies, yet accounts for the majority of skin cancer deaths due to its propensity to metastasise. To develop new therapies, novel target molecules must be identified, and the accessibility of cell surface proteins makes them attractive targets. Using CRISPR activation technology, we screened a library of guide RNAs targeting membrane protein-encoding genes to identify cell surface molecules whose upregulation enhances the metastatic pulmonary colonisation capabilities of tumour cells in vivo. We show that upregulated expression of the cell surface protein LRRN4CL led to increased pulmonary metastases in mice. Critically, LRRN4CL expression was elevated in melanoma patient samples, with high expression levels correlating with decreased survival. Collectively, our findings uncover an unappreciated role for LRRN4CL in the outcome of melanoma patients and identify a potential therapeutic target and biomarker.
Mutational signatures are imprints of pathophysiological processes arising through tumorigenesis. We generated isogenic CRISPR-Cas9 knockouts (Δ) of 43 genes in human induced pluripotent stem cells, cultured them in the absence of added DNA damage, and performed whole-genome sequencing of 173 subclones. Δ<i>OGG1,</i> Δ<i>UNG,</i> Δ<i>EXO1,</i> Δ<i>RNF168,</i> Δ<i>MLH1,</i> Δ<i>MSH2,</i> Δ<i>MSH6,</i> Δ<i>PMS1,</i> and Δ<i>PMS2</i> produced marked mutational signatures indicative of being critical mitigators of endogenous DNA modifications. Detailed analyses revealed mutational mechanistic insights, including how 8-oxo-dG elimination is sequence-context-specific while uracil clearance is sequence-context-independent. Mismatch repair (MMR) deficiency signatures are engendered by oxidative damage (C>A transversions), differential misincorporation by replicative polymerases (T>C and C>T transitions), and we propose a 'reverse template slippage' model for T>A transversions. Δ<i>MLH1,</i> Δ<i>MSH6,</i> and Δ<i>MSH2</i> signatures were similar to each other but distinct from Δ<i>PMS2</i>. Finally, we developed a classifier, MMRDetect, where application to 7,695 WGS cancers showed enhanced detection of MMR-deficient tumors, with implications for responsiveness to immunotherapies.
Necroptosis is a lytic, inflammatory form of cell death that not only contributes to pathogen clearance but can also lead to disease pathogenesis. Necroptosis is triggered by RIPK3-mediated phosphorylation of MLKL, which is thought to initiate MLKL oligomerisation, membrane translocation and membrane rupture, although the precise mechanism is incompletely understood. Here, we show that K63-linked ubiquitin chains are attached to MLKL during necroptosis and that ubiquitylation of MLKL at K219 significantly contributes to the cytotoxic potential of phosphorylated MLKL. The K219R MLKL mutation protects animals from necroptosis-induced skin damage and renders cells resistant to pathogen-induced necroptosis. Mechanistically, we show that ubiquitylation of MLKL at K219 is required for higher-order assembly of MLKL at membranes, facilitating its rupture and necroptosis. We demonstrate that K219 ubiquitylation licenses MLKL activity to induce lytic cell death, suggesting that necroptotic clearance of pathogens as well as MLKL-dependent pathologies are influenced by the ubiquitin-signalling system.
The enteropathogenic Escherichia coli (EPEC) type III secretion system effector Tir, which mediates intimate bacterial attachment to epithelial cells, also triggers Ca<sup>2+</sup> influx followed by LPS entry and caspase-4-dependent pyroptosis, which could be antagonized by the effector NleF. Here we reveal the mechanism by which EPEC induces Ca<sup>2+</sup> influx. We show that in the intestinal epithelial cell line SNU-C5, Tir activates the mechano/osmosensitive cation channel TRPV2 which triggers extracellular Ca<sup>2+</sup> influx. Tir-induced Ca<sup>2+</sup> influx could be blocked by siRNA silencing of TRPV2, pre-treatment with the TRPV2 inhibitor SET2 or by growing cells in low osmolality medium. Pharmacological activation of TRPV2 in the absence of Tir failed to initiate caspase-4-dependent cell death, confirming the necessity of Tir. Consistent with the model implicating activation on translocation of TRPV2 from the ER to plasma membrane, inhibition of protein trafficking by either brefeldin A or the effector NleA prevented TRPV2 activation and cell death. While infection with EPECΔnleA triggered pyroptotic cell death, this could be prevented by NleF. Taken together this study shows that while integration of Tir into the plasma membrane activates TRPV2, EPEC uses NleA to inhibit TRPV2 trafficking and NleF to inhibit caspase-4 and pyroptosis.
Poly (ADP-ribose) polymerase (PARP) inhibitors elicit antitumour activity in homologous recombination-defective cancers by trapping PARP1 in a chromatin-bound state. How cells process trapped PARP1 remains unclear. Using wild-type and a trapping-deficient PARP1 mutant combined with rapid immunoprecipitation mass spectrometry of endogenous proteins and Apex2 proximity labelling, we delineated mass spectrometry-based interactomes of trapped and non-trapped PARP1. These analyses identified an interaction between trapped PARP1 and the ubiquitin-regulated p97 ATPase/segregase. We found that following trapping, PARP1 is SUMOylated by PIAS4 and subsequently ubiquitylated by the SUMO-targeted E3 ubiquitin ligase RNF4, events that promote recruitment of p97 and removal of trapped PARP1 from chromatin. Small-molecule p97-complex inhibitors, including a metabolite of the clinically used drug disulfiram (CuET), prolonged PARP1 trapping and enhanced PARP inhibitor-induced cytotoxicity in homologous recombination-defective tumour cells and patient-derived tumour organoids. Together, these results suggest that p97 ATPase plays a key role in the processing of trapped PARP1 and the response of tumour cells to PARP inhibitors.
BACKGROUND: Many pathogens secrete effector molecules to subvert host immune responses, to acquire nutrients, and/or to prepare host cells for invasion. One of the ways that effector molecules are secreted is through extracellular vesicles (EVs) such as exosomes. Recently, the malaria parasite P. falciparum has been shown to produce EVs that can mediate transfer of genetic material between parasites and induce sexual commitment. Characterizing the content of these vesicles may improve our understanding of P. falciparum pathogenesis and virulence. METHODS: Previous studies of P. falciparum EVs have been limited to long-term adapted laboratory isolates. In this study, we isolated EVs from a Kenyan P. falciparum clinical isolate adapted to in vitro culture for a short period and characterized their protein content by mass spectrometry (data are available via ProteomeXchange, with identifier PXD006925). RESULTS: We show that P. falciparum extracellular vesicles (PfEVs) are enriched in proteins found within the exomembrane compartments of infected erythrocytes such as Maurer's clefts (MCs), as well as the secretory endomembrane compartments in the apical end of the merozoites, suggesting that these proteins play a role in parasite-host interactions. Comparison of this novel clinically relevant dataset with previously published datasets helps to define a core secretome present in Plasmodium EVs. CONCLUSIONS: P. falciparum extracellular vesicles contain virulence-associated parasite proteins. Therefore, analysis of PfEVs contents from a range of clinical isolates, and their functional validation may improve our understanding of the virulence mechanisms of the parasite, and potentially identify targets for interventions or diagnostics.
Aneuploidy results in decreased cellular fitness in many species and model systems. However, aneuploidy is commonly found in cancer cells and often correlates with aggressive growth, suggesting that the impact of aneuploidy on cellular fitness is context dependent. The BRG1 (SMARCA4) subunit of the SWI/SNF chromatin remodelling complex is frequently lost in cancer. Here, we use a chromosomally stable cell line to test the effect of BRG1 loss on the evolution of aneuploidy. BRG1 deletion leads to an initial loss of fitness in this cell line that improves over time. Notably, we find increased tolerance to aneuploidy immediately upon loss of BRG1, and the fitness recovery over time correlates with chromosome gain. These data show that BRG1 loss creates an environment where karyotype changes can be explored without a fitness penalty. At least in some genetic backgrounds, therefore, BRG1 loss can affect the progression of tumourigenesis through tolerance of aneuploidy.
When used in combination with hormone treatment, Palbociclib prolongs progression-free survival of patients with hormone receptor-positive breast cancer. Mechanistically, Palbociclib inhibits CDK4/6 activity, but the basis for the differing sensitivity of cancers to Palbociclib is poorly understood. A common observation in a subset of Triple Negative Breast Cancers (TNBCs) is that prolonged CDK4/6 inhibition can engage a senescence-like state in which cells exit the cell cycle whilst remaining metabolically active. To better understand the senescence-like cell state that arises after Palbociclib treatment, we used mass spectrometry to quantify the proteome, phosphoproteome, and secretome of Palbociclib-treated MDA-MB-231 TNBC cells. We observed altered levels of cell cycle regulators, immune response, and key senescence markers upon Palbociclib treatment. These datasets provide a starting point for the derivation of biomarkers which could inform the future use of CDK4/6 inhibitors in TNBC subtypes and guide the development of potential combination therapies.
The GENCODE project annotates human and mouse genes and transcripts supported by experimental data with high accuracy, providing a foundational resource that supports genome biology and clinical genomics. GENCODE annotation processes make use of primary data and bioinformatic tools and analysis generated both within the consortium and externally to support the creation of transcript structures and the determination of their function. Here, we present improvements to our annotation infrastructure, bioinformatics tools, and analysis, and the advances they support in the annotation of the human and mouse genomes including: the completion of first pass manual annotation for the mouse reference genome; targeted improvements to the annotation of genes associated with SARS-CoV-2 infection; collaborative projects to achieve convergence across reference annotation databases for the annotation of human and mouse protein-coding genes; and the first GENCODE manually supervised automated annotation of lncRNAs. Our annotation is accessible via Ensembl, the UCSC Genome Browser and https://www.gencodegenes.org.
This study reports the HMGB1 interactomes in prostate and ovary cancer cell lines. Affinity purification coupled to mass spectrometry confirmed that the HMGB1 nuclear interactome is involved in known HMGB1 functions such as maintenance of chromatin stability and regulation of transcription, and also in processes not previously reported, such as mRNA and rRNA processing. We have identified an interaction between HMGB1 and the NuRD complex and validated this by yeast two-hybrid, confirming that the RBBP7 subunit directly interacts with HMGB1. In addition, we describe for the first time an interaction between two HMGB1 interacting complexes, the septin and THOC complexes, as well as an interaction of these two complexes with Rab11. Analysis of Pan-Cancer Atlas public data indicated that several genes encoding HMGB1-interacting proteins identified in this study are dysregulated in tumours from patients diagnosed with ovary and prostate carcinomas. In PC-3 cells, silencing of <i>HMGB1</i> leads to downregulation of the expression of key regulators of ribosome biogenesis and RNA processing, namely <i>BOP1</i>, <i>RSS1</i>, <i>UBF1</i>, <i>KRR1</i> and <i>LYAR</i>. Upregulation of these genes in prostate adenocarcinomas is correlated with worse prognosis, reinforcing their functional significance in cancer progression.
Large scale proteomic profiling of cell lines can reveal molecular signatures attributed to variable genotypes or induced perturbations, enabling proteogenomic associations and elucidation of pharmacological mechanisms of action. Although isobaric labeling has increased the throughput of proteomic analysis, the commonly used sample preparation workflows often require time-consuming steps and costly consumables, limiting their suitability for large scale studies. Here, we present a simplified and cost-effective one-pot reaction workflow in a 96-well plate format (SimPLIT) that minimizes processing steps and demonstrates improved reproducibility compared to alternative approaches. The workflow is based on a sodium deoxycholate lysis buffer and a single detergent cleanup step after peptide labeling, followed by quick off-line fractionation and MS2 analysis. We showcase the applicability of the workflow in a panel of colorectal cancer cell lines and by performing target discovery for a set of molecular glue degraders in different cell lines, in a 96-sample assay. Using this workflow, we report frequently dysregulated proteins in colorectal cancer cells and uncover cell-dependent protein degradation profiles of seven cereblon E3 ligase modulators (CRL4<sup>CRBN</sup>). Overall, SimPLIT is a robust method that can be easily implemented in any proteomics laboratory for medium-to-large scale TMT-based studies for deep profiling of cell lines.
Genes encoding the core cell cycle machinery are transcriptionally regulated by the MuvB family of protein complexes in a cell cycle-specific manner. Complexes of MuvB with the transcription factors B-MYB and FOXM1 activate mitotic genes during cell proliferation. The mechanisms of transcriptional regulation by these complexes are still poorly characterised. Here, we combine biochemical analysis and in vitro reconstitution, with structural analysis by cryo-electron microscopy and cross-linking mass spectrometry, to functionally examine these complexes. We find that the MuvB:B-MYB complex binds and remodels nucleosomes, thereby exposing nucleosomal DNA. This remodelling activity is supported by B-MYB which directly binds the remodelled DNA. Given the remodelling activity on the nucleosome, we propose that the MuvB:B-MYB complex functions as a pioneer transcription factor complex. In this work, we rationalise prior biochemical and cellular studies and provide a molecular framework of interactions on a protein complex that is key for cell cycle regulation.
Gastric cancer represents the third leading cause of global cancer mortality and an area of unmet clinical need. Drugs that target the DNA damage response, including ATR inhibitors (ATRi), have been proposed as novel targeted agents in gastric cancer. Here, we sought to evaluate the efficacy of ATRi in preclinical models of gastric cancer and to understand how ATRi resistance might emerge as a means to identify predictors of ATRi response. A positive selection genome-wide CRISPR-Cas9 screen identified candidate regulators of ATRi resistance in gastric cancer. Loss-of-function mutations in either SMG8 or SMG9 caused ATRi resistance by an SMG1-mediated mechanism. Although ATRi still impaired ATR/CHK1 signaling in SMG8/9-defective cells, other characteristic responses to ATRi exposure were not seen, such as changes in ATM/CHK2, γH2AX, phospho-RPA, or 53BP1 status or changes in the proportions of cells in S- or G2-M-phases of the cell cycle. Transcription/replication conflicts (TRC) elicited by ATRi exposure are a likely cause of ATRi sensitivity, and SMG8/9-defective cells exhibited a reduced level of ATRi-induced TRCs, which could contribute to ATRi resistance. These observations suggest ATRi elicits antitumor efficacy in gastric cancer but that drug resistance could emerge via alterations in the SMG8/9/1 pathway.<h4>Significance</h4>These findings reveal how cancer cells acquire resistance to ATRi and identify pathways that could be targeted to enhance the overall effectiveness of these inhibitors.
Activation of client protein kinases by the HSP90 molecular chaperone system is affected by phosphorylation at multiple sites on HSP90, the kinase-specific co-chaperone CDC37, and the kinase client itself. Removal of regulatory phosphorylation from client kinases and their release from the HSP90-CDC37 system depends on the Ser/Thr phosphatase PP5, which associates with HSP90 via its N-terminal TPR domain. Here, we present the cryoEM structure of the oncogenic protein kinase client BRAF<sup>V600E</sup> bound to HSP90-CDC37, showing how the V600E mutation favours BRAF association with HSP90-CDC37. Structures of HSP90-CDC37-BRAF<sup>V600E</sup> complexes with PP5 in autoinhibited and activated conformations, together with proteomic analysis of its phosphatase activity on BRAF<sup>V600E</sup> and CRAF, reveal how PP5 is activated by recruitment to HSP90 complexes. PP5 comprehensively dephosphorylates client proteins, removing interaction sites for regulatory partners such as 14-3-3 proteins and thus performing a 'factory reset' of the kinase prior to release.
Almost all living cells maintain size uniformity through successive divisions. Proteins that over- or underscale with size can act as rheostats that regulate cell cycle progression. Using a multiomic strategy, we leveraged the heterogeneity of melanoma cell lines to identify peptides, transcripts, and phosphorylation events that differentially scale with cell size. Subscaling proteins were enriched in regulators of the DNA damage response and cell cycle progression, whereas super-scaling proteins included regulators of the cytoskeleton, extracellular matrix, and inflammatory response. Mathematical modeling suggested that decoupling growth and proliferative signaling may facilitate cell cycle entry over senescence in large cells when mitogenic signaling is decreased. Regression analysis revealed that up-regulation of TP53 or CDKN1A/p21CIP1 is characteristic of proliferative cancer cells with senescent-like sizes/proteomes. This study provides one of the first demonstrations of size-scaling phenomena in cancer and of how morphology influences the chemistry of the cell.
Lysine acetylation in histone tails is a key post-translational modification that controls transcription activation. Histone deacetylase complexes remove histone acetylation, thereby repressing transcription and regulating the transcriptional output of each gene. Although these complexes are drug targets and crucial regulators of organismal physiology, their structure and mechanisms of action are largely unclear. Here, we present the structure of a complete human SIN3B histone deacetylase holo-complex with and without a substrate mimic. Remarkably, SIN3B encircles the deacetylase and contacts its allosteric basic patch thereby stimulating catalysis. A SIN3B loop inserts into the catalytic tunnel, rearranges to accommodate the acetyl-lysine moiety, and stabilises the substrate for specific deacetylation, which is guided by a substrate receptor subunit. Our findings provide a model of specificity for a main transcriptional regulator conserved from yeast to human and a resource of protein-protein interactions for future drug designs.
Understanding how genetic variants impact molecular phenotypes is a key goal of functional genomics, currently hindered by reliance on a single haploid reference genome. Here, we present the EN-TEx resource of 1,635 open-access datasets from four donors (∼30 tissues × ∼15 assays). The datasets are mapped to matched, diploid genomes with long-read phasing and structural variants, instantiating a catalog of >1 million allele-specific loci. These loci exhibit coordinated activity along haplotypes and are less conserved than corresponding, non-allele-specific ones. Surprisingly, a deep-learning transformer model can predict the allele-specific activity based only on local nucleotide-sequence context, highlighting the importance of transcription-factor-binding motifs particularly sensitive to variants. Furthermore, combining EN-TEx with existing genome annotations reveals strong associations between allele-specific and GWAS loci. It also enables models for transferring known eQTLs to difficult-to-profile tissues (e.g., from skin to heart). Overall, EN-TEx provides rich data and generalizable models for more accurate personal functional genomics.
Malaria transmission to mosquitoes requires a developmental switch in asexually dividing blood-stage parasites to sexual reproduction. In Plasmodium berghei, the transcription factor AP2-G is required and sufficient for this switch, but how a particular sex is determined in a haploid parasite remains unknown. Using a global screen of barcoded mutants, we here identify genes essential for the formation of either male or female sexual forms and validate their importance for transmission. High-resolution single-cell transcriptomics of ten mutant parasites portrays the developmental bifurcation and reveals a regulatory cascade of putative gene functions in the determination and subsequent differentiation of each sex. A male-determining gene with a LOTUS/OST-HTH domain as well as the protein interactors of a female-determining zinc-finger protein indicate that germ-granule-like ribonucleoprotein complexes complement transcriptional processes in the regulation of both male and female development of a malaria parasite.
EROS (essential for reactive oxygen species) protein is indispensable for expression of gp91<i>phox</i>, the catalytic core of the phagocyte NADPH oxidase. EROS deficiency in humans is a novel cause of the severe immunodeficiency, chronic granulomatous disease, but its mechanism of action was unknown until now. We elucidate the role of EROS, showing it acts at the earliest stages of gp91<i>phox</i> maturation. It binds the immature 58 kDa gp91<i>phox</i> directly, preventing gp91<i>phox</i> degradation and allowing glycosylation via the oligosaccharyltransferase machinery and the incorporation of the heme prosthetic groups essential for catalysis. EROS also regulates the purine receptors P2X7 and P2X1 through direct interactions, and P2X7 is almost absent in EROS-deficient mouse and human primary cells. Accordingly, lack of murine EROS results in markedly abnormal P2X7 signalling, inflammasome activation, and T cell responses. The loss of both ROS and P2X7 signalling leads to resistance to influenza infection in mice. Our work identifies EROS as a highly selective chaperone for key proteins in innate and adaptive immunity and a rheostat for immunity to infection. It has profound implications for our understanding of immune physiology, ROS dysregulation, and possibly gene therapy.
Proteomic profiling of RNA-binding proteins in <i>Leishmania</i> is currently limited to polyadenylated mRNA-binding proteins, leaving proteins that interact with nonadenylated RNAs, including noncoding RNAs and pre-mRNAs, unidentified. Using a combination of unbiased orthogonal organic phase separation methodology and tandem mass tag-labeling-based high-resolution quantitative proteomic mass spectrometry, we robustly identified 2,417 RNA-binding proteins, including 1,289 putative novel non-poly(A)-RNA-binding proteins, across the two main <i>Leishmania</i> life cycle stages. Eight out of 20 <i>Leishmania</i> deubiquitinases, including the recently characterized L. mexicana DUB2 with an elaborate RNA-binding protein interactome, were exclusively identified in the non-poly(A)-RNA-interactome. Additionally, an increased representation of WD40 repeat domains was observed in the <i>Leishmania</i> non-poly(A)-RNA-interactome, thus uncovering potential involvement of this protein domain in RNA-protein interactions in <i>Leishmania</i>. We also characterize the protein-bound RNAs using RNA-sequencing and show that, in addition to protein-coding transcripts, ncRNAs are also enriched in the protein-RNA interactome. Differential gene expression analysis revealed enrichment of 142 out of 195 total L. mexicana protein kinase genes in the protein-RNA-interactome, suggesting an important role of protein-RNA interactions in the regulation of the <i>Leishmania</i> protein kinome. Additionally, we characterize the quantitative changes in RNA-protein interactions in hundreds of <i>Leishmania</i> proteins following inhibition of heat shock protein 90 (Hsp90). Our results show that Hsp90 inhibition in <i>Leishmania</i> causes widespread disruption of RNA-protein interactions in ribosomal proteins, proteasomal proteins and translation factors in both life cycle stages, suggesting a downstream effect of the inhibition on protein synthesis and degradation pathways in <i>Leishmania</i>. 
This study defines the comprehensive RNA interactome of <i>Leishmania</i> and provides in-depth insight into the widespread involvement of RNA-protein interactions in <i>Leishmania</i> biology. <b>IMPORTANCE</b> Advances in proteomics and mass spectrometry have revealed the mRNA-binding proteins in many eukaryotic organisms, including the protozoan parasites <i>Leishmania</i> spp., the causative agents of leishmaniasis, a major infectious disease in over 90 tropical and subtropical countries. However, in addition to mRNAs, which constitute only 2 to 5% of the total transcripts, many types of non-coding RNAs participate in crucial biological processes. In <i>Leishmania</i>, RNA-binding proteins serve as primary gene regulators. Therefore, transcriptome-wide identification of RNA-binding proteins is necessary for deciphering the distinctive posttranscriptional mechanisms of gene regulation in <i>Leishmania</i>. Using a combination of the highly efficient orthogonal organic phase separation method and tandem mass tag-labeling-based quantitative proteomic mass spectrometry, we provide an unprecedentedly comprehensive molecular definition of the total RNA interactome across the two main <i>Leishmania</i> life cycle stages. In addition, we characterize for the first time the quantitative changes in RNA-protein interactions in <i>Leishmania</i> following inhibition of heat shock protein 90, shedding light on the hitherto unknown large-scale downstream molecular effects of this inhibition in the parasite. This work provides insight into the importance of total RNA-protein interactions in <i>Leishmania</i>, thus significantly expanding our knowledge of the emergence of RNA-protein interactions in <i>Leishmania</i> biology.
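As a reading aid, the counts quoted in this abstract can be turned into percentages. The sketch below uses only the numbers stated above (1,289 of 2,417 RNA-binding proteins, 8 of 20 deubiquitinases, 142 of 195 protein kinase genes); the `percent` helper and the dictionary labels are illustrative names, not part of the study.

```python
def percent(part, total):
    """Return part/total as a percentage; total must be positive."""
    if total <= 0:
        raise ValueError("total must be positive")
    return 100.0 * part / total

# Counts copied from the abstract; the labels are paraphrases for display only.
reported = {
    "putative novel non-poly(A) RBPs among all RBPs": (1289, 2417),
    "deubiquitinases seen only in the non-poly(A) interactome": (8, 20),
    "protein kinase genes enriched in the protein-RNA interactome": (142, 195),
}

for label, (part, total) in reported.items():
    print(f"{label}: {part}/{total} = {percent(part, total):.1f}%")
```

Read this way, roughly half of the identified RNA-binding proteins are putative novel non-poly(A) binders, and close to three quarters of the annotated kinase genes are enriched in the interactome.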
Type III secretion system (T3SS) effectors are key virulence factors that underpin the infection strategy of many clinically important Gram-negative pathogens, including Salmonella enterica, Shigella spp., enteropathogenic and enterohemorrhagic Escherichia coli and their murine equivalent, Citrobacter rodentium. The cellular processes or proteins targeted by the effectors can be common to multiple pathogens or pathogen-specific. The main approach to understanding T3SS-mediated pathogenesis has been to determine the contribution of one effector at a time, with the aim of piecing together individual functions and unveiling infection mechanisms. However, in contrast to this prevailing approach, simultaneous deletion of multiple effectors revealed that they function as an interconnected network in vivo, uncovering effector codependency and context-dependent effector essentiality. This paradigm shift in T3SS biology is at the heart of this opinion article.
Many enteric pathogens employ a type III secretion system (T3SS) to translocate effector proteins directly into the host cell cytoplasm, where they subvert signalling pathways of the intestinal epithelium. Here, we report that the anti-apoptotic regulator HS1-associated protein X1 (HAX-1) is an interaction partner of the T3SS effectors EspO of enterohaemorrhagic Escherichia coli (EHEC) and Citrobacter rodentium, OspE of Shigella flexneri and Osp1<sub>STYM</sub> of Salmonella enterica serovar Typhimurium. EspO, OspE and Osp1<sub>STYM</sub> have previously been reported to interact with the focal adhesion protein integrin-linked kinase (ILK). We found that EspO localizes both to the focal adhesions (ILK localisation) and mitochondria (HAX-1 localisation), and that increased expression of HAX-1 leads to enhanced mitochondrial localisation of EspO. Ectopic expression of EspO, OspE and Osp1<sub>STYM</sub> protects cells from apoptosis induced by staurosporine and tunicamycin. Depleting cells of HAX-1 indicates that the anti-apoptotic activity of EspO is HAX-1 dependent. Both HAX-1 and ILK were further confirmed as EspO1-interacting proteins during infection using T3SS-delivered EspO1. Using cell detachment as a proxy for cell death, we confirmed that T3SS-delivered EspO1 could inhibit cell death induced during EPEC infection to a similar extent as the anti-apoptotic effector NleH or treatment with the pan-caspase inhibitor z-VAD. In contrast, in cells lacking HAX-1, EspO1 was no longer able to protect against cell detachment, while NleH1 and z-VAD maintained their protective activity. Therefore, during both infection and ectopic expression, EspO protects cells from cell death by interacting with HAX-1. These results suggest that despite the differences between EHEC, C. rodentium, Shigella and S. typhimurium infections, hijacking HAX-1 anti-apoptotic signalling is a common strategy to maintain the viability of infected cells. 
TAKE AWAY: EspO homologues are found in EHEC, Shigella, S. typhimurium and some EPEC. EspO homologues interact with HAX-1. EspO protects infected cells from apoptosis. EspO joins a growing list of T3SS effectors that manipulate cell death pathways.
Complexome profiling is an emerging 'omics' approach that systematically interrogates the composition of protein complexes (the complexome) of a sample by combining biochemical separation of native protein complexes with mass spectrometry-based quantitative proteomics. The resulting fractionation profiles hold comprehensive information on the abundance and composition of the complexome, and have a high potential for reuse by experimental and computational researchers. However, the lack of a central resource that provides access to these data, reported with adequate descriptions and an analysis tool, has limited their reuse. Therefore, we established the ComplexomE profiling DAta Resource (CEDAR, www3.cmbi.umcn.nl/cedar/), an openly accessible database for depositing and exploring mass spectrometry data from complexome profiling studies. Compatibility and reusability of the data are ensured by a standardized data and reporting format containing the "minimum information required for a complexome profiling experiment" (MIACE). The data can be accessed through a user-friendly web interface, as well as programmatically using the REST API portal. Additionally, all complexome profiles available on CEDAR can be inspected directly on the website with the profile viewer tool, which allows the detection of correlated profiles and inference of potential complexes. In conclusion, CEDAR is a unique, growing and invaluable resource for the study of protein complex composition and dynamics across biological systems.
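The idea of detecting correlated fractionation profiles can be illustrated with a minimal sketch. It assumes each protein is represented by a list of abundances across the same ordered fractions and uses Pearson correlation with an arbitrary threshold; the protein names, toy values, threshold and function names are all hypothetical, and this is not a description of how the CEDAR profile viewer is implemented.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation between two equal-length abundance profiles."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("profiles must have the same length (at least 2)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # flat profile: correlation undefined, treated as uncorrelated
    return cov / sqrt(var_x * var_y)

def correlated_pairs(profiles, threshold=0.9):
    """Return (name_a, name_b, r) for every protein pair with r >= threshold."""
    names = sorted(profiles)
    pairs = []
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            r = pearson(profiles[a], profiles[b])
            if r >= threshold:
                pairs.append((a, b, round(r, 3)))
    return pairs

# Hypothetical toy data: abundance per fraction, not taken from any real dataset.
toy_profiles = {
    "protein_A": [0, 1, 8, 3, 0, 0],
    "protein_B": [0, 2, 16, 6, 0, 0],  # same elution shape as protein_A
    "protein_C": [5, 0, 0, 0, 1, 9],   # elutes in different fractions
}

print(correlated_pairs(toy_profiles))
```

Proteins that co-elute across fractions, such as the toy `protein_A` and `protein_B`, are candidates for membership of the same complex, although co-elution alone does not prove a physical interaction.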
Infections with many Gram-negative pathogens, including <i>Escherichia coli</i>, <i>Salmonella</i>, <i>Shigella</i>, and <i>Yersinia</i>, rely on type III secretion system (T3SS) effectors. We hypothesized that while hijacking processes within mammalian cells, the effectors operate as a robust network that can tolerate substantial contractions. This was tested in vivo using the mouse pathogen <i>Citrobacter rodentium</i> (encoding 31 effectors). Sequential gene deletions showed that effector essentiality for infection was context dependent and that the network could tolerate 60% contraction while maintaining pathogenicity. Despite inducing very different colonic cytokine profiles (e.g., interleukin-22, interleukin-17, interferon-γ, or granulocyte-macrophage colony-stimulating factor), different networks induced protective immunity. Using data from >100 distinct mutant combinations, we built and trained a machine learning model able to predict colonization outcomes, which were confirmed experimentally. Furthermore, reproducing the human-restricted enteropathogenic <i>E. coli</i> effector repertoire in <i>C. rodentium</i> was not sufficient for efficient colonization, which implicates effector networks in host adaptation. These results unveil the extreme robustness of both T3SS effector networks and host responses.
Most studies of infections at mucosal surfaces have focused on the acute phase of the disease. Consequently, little is known about the molecular processes that underpin tissue recovery and the long-term consequences postinfection. Here, we conducted temporal deep quantitative proteomic analysis of colonic intestinal epithelial cells (cIECs) from mice infected with the natural mouse pathogen Citrobacter rodentium over time points corresponding to the late steady-state phase (10 days postinfection [DPI]), the clearance phase (13 to 20 DPI), and 4 weeks after the pathogen has been cleared (48 DPI). <i>C. rodentium</i>, which relies on a type III secretion system to infect, is used to model infections with enteropathogenic and enterohemorrhagic <i>Escherichia coli</i>. We observe a strong upregulation of inflammatory signaling and nutritional immunity responses during the clearance phase of the infection. Despite morphological tissue recovery, chromogranin B (ChgB)-positive endocrine cells remained significantly below baseline levels at 48 DPI. In contrast, we observed an increased abundance of proteins involved in antigen processing and presentation 4 weeks after pathogen clearance. In particular, long-term changes were characterized by a persistent interferon gamma (IFN-γ) response and the expression of major histocompatibility complex class II (MHCII) molecules in 60% of the EpCAM<sup>+</sup> cIECs, which were not seen in <i>Ifn</i>γ<i><sup>-/-</sup></i> mice. Nonetheless, both wild-type and <i>Ifn</i>γ<i><sup>-/-</sup></i> mice mounted similar systemic and colonic IgG responses to C. rodentium and were equally protected from rechallenge, suggesting that cIEC MHCII is not necessary for protective immunity against C. rodentium. <b>IMPORTANCE</b> Mucosal surfaces respond to infection by mounting an array of metabolic, inflammatory, and tissue repair responses. 
While these have been well studied during acute infection, less is known about tissue recovery after pathogen clearance. We employ the mouse pathogen Citrobacter rodentium, which binds colonic intestinal epithelial cells (cIECs), to investigate the long-term effects of bacterial infection on gut physiology. Using global proteomic analysis, we study cIEC temporal responses during and after the clearance phase of infection. While the overall tissue morphology recovered, cIECs showed persistent signs of infection 4 weeks after pathogen clearance. These were characterized by a strong IFN-γ signature, including the upregulation of major histocompatibility complex class II (MHCII) antigen presentation proteins, suggesting that the tissue remains on "high alert" for weeks after the acute insult is resolved. However, we demonstrate that cIEC MHCII expression, which is induced by IFN-γ, is not required for protective IgG-mediated immunity against C. rodentium; instead, it may play a role in mucosal recovery.
Inactivation of <i>Polybromo 1</i> (<i>PBRM1</i>), a specific subunit of the PBAF chromatin remodeling complex, occurs frequently in cancer, including 40% of clear cell renal cell carcinomas (ccRCC). To identify novel therapeutic approaches to targeting PBRM1-defective cancers, we used a series of orthogonal functional genomic screens that identified PARP and ATR inhibitors as being synthetic lethal with <i>PBRM1</i> deficiency. The PBRM1/PARP inhibitor synthetic lethality was recapitulated using several clinical PARP inhibitors in a series of <i>in vitro</i> model systems and <i>in vivo</i> in a xenograft model of ccRCC. In the absence of exogenous DNA damage, PBRM1-defective cells exhibited elevated levels of replication stress, micronuclei, and R-loops. PARP inhibitor exposure exacerbated these phenotypes. Quantitative mass spectrometry revealed that multiple R-loop processing factors were downregulated in PBRM1-defective tumor cells. Exogenous expression of the R-loop resolution enzyme RNase H1 reversed the sensitivity of PBRM1-deficient cells to PARP inhibitors, suggesting that excessive levels of R-loops could be a cause of this synthetic lethality. PARP and ATR inhibitors also induced cyclic GMP-AMP synthase/stimulator of interferon genes (cGAS/STING) innate immune signaling in PBRM1-defective tumor cells. Overall, these findings provide the preclinical basis for using PARP inhibitors in PBRM1-defective cancers. SIGNIFICANCE: This study demonstrates that PARP and ATR inhibitors are synthetic lethal with the loss of PBRM1, a PBAF-specific subunit, thus providing the rationale for assessing these inhibitors in patients with PBRM1-defective cancer. GRAPHICAL ABSTRACT: http://cancerres.aacrjournals.org/content/canres/81/11/2888/F1.large.jpg.
Multi-omics approaches including proteomics analyses are becoming an integral component of precision medicine. As clinical proteomics studies gain momentum and their sensitivity increases, research on identifying individuals based on their proteomics data is here examined for risks and ethics-related issues. A great deal of work has already been done on this topic for DNA/RNA sequencing data, but it has yet to be widely studied in other omics fields. The current state-of-the-art for the identification of individuals based solely on proteomics data is explained. Protein sequence variation analysis approaches are covered in more detail, including the available analysis workflows and their limitations. We also outline some previous forensic and omics proteomics studies that are relevant for the identification of individuals. Following that, we discuss the risks of patient reidentification using other proteomics data types such as protein expression abundance and post-translational modification (PTM) profiles. In light of the potential identification of individuals through proteomics data, possible legal and ethical implications are becoming increasingly important in the field.
GENCODE produces high quality gene and transcript annotation for the human and mouse genomes. All GENCODE annotation is supported by experimental data and serves as a reference for genome biology and clinical genomics. The GENCODE consortium generates targeted experimental data, develops bioinformatic tools and carries out analyses that, along with externally produced data and methods, support the identification and annotation of transcript structures and the determination of their function. Here, we present an update on the annotation of human and mouse genes, including developments in the tools, data, analyses and major collaborations which underpin this progress. For example, we report the creation of a set of non-canonical ORFs identified in GENCODE transcripts, the LRGASP collaboration to assess the use of long transcriptomic data to build transcript models, the progress in collaborations with RefSeq and UniProt to increase convergence in the annotation of human and mouse protein-coding genes, the propagation of GENCODE across the human pan-genome and the development of new tools to support annotation of regulatory features by GENCODE. Our annotation is accessible via Ensembl, the UCSC Genome Browser and https://www.gencodegenes.org.
The PBRM1 subunit of the PBAF (SWI/SNF) chromatin remodeling complex is mutated in ∼40% of clear cell renal cancers. PBRM1 loss has been implicated in responses to immunotherapy in renal cancer, but the mechanism is unclear. DNA damage-induced inflammatory signaling is an important factor determining immunotherapy response. This response is kept in check by the G2/M checkpoint, which prevents progression through mitosis with unrepaired damage. We found that in the absence of PBRM1, p53-dependent p21 up-regulation is delayed after DNA damage, leading to defective transcriptional repression by the DREAM complex and premature entry into mitosis. Consequently, DNA damage-induced inflammatory signaling pathways are activated by cytosolic DNA. Notably, p53 is infrequently mutated in renal cancer, so PBRM1 mutational status is critical to G2/M checkpoint maintenance. Moreover, we found that the ability of PBRM1 deficiency to predict response to immunotherapy correlates with expression of the cytosolic DNA-sensing pathway in clinical samples. These findings have implications for therapeutic responses in renal cancer.
Triple-negative breast cancers (TNBC) are resistant to standard-of-care chemotherapy and lack known targetable driver gene alterations. Identification of novel drivers could aid the discovery of new treatment strategies for this hard-to-treat patient population, yet studies using high-throughput and accurate models to define the functions of driver genes in TNBC to date have been limited. Here, we employed unbiased functional genomics screening of the 200 most frequently mutated genes in breast cancer, using spheroid cultures to model <i>in vivo</i>-like conditions, and identified the histone acetyltransferase CREBBP as a novel tumor suppressor in TNBC. CREBBP protein expression in patient tumor samples was absent in 8% of TNBCs and at a high frequency in other tumors, including squamous lung cancer, where CREBBP-inactivating mutations are common. In TNBC, CREBBP alterations were associated with higher genomic heterogeneity and poorer patient survival and resulted in upregulation and dependency on a FOXM1 proliferative program. Targeting FOXM1-driven proliferation indirectly with clinical CDK4/6 inhibitors (CDK4/6i) selectively impaired growth in spheroids, cell line xenografts, and patient-derived models from multiple tumor types with CREBBP mutations or loss of protein expression. In conclusion, we have identified CREBBP as a novel driver in aggressive TNBC and identified an associated genetic vulnerability in tumor cells with alterations in CREBBP and provide a preclinical rationale for assessing CREBBP alterations as a biomarker of CDK4/6i response in a new patient population. SIGNIFICANCE: This study demonstrates that CREBBP genomic alterations drive aggressive TNBC, lung cancer, and lymphomas and may be selectively treated with clinical CDK4/6 inhibitors.
Condensin complexes compact and disentangle chromosomes in preparation for cell division. Commercially available antibodies raised against condensin subunits have been widely used to characterise their cellular interactome. Here we have assessed the specificity of a polyclonal antibody (Bethyl A302-276A) that is commonly used as a probe for NCAPH2, the kleisin subunit of condensin II, in mammalian cells. We find that, in addition to its intended target, this antibody cross-reacts with one or more components of the SWI/SNF family of chromatin remodelling complexes in an NCAPH2-independent manner. This cross-reactivity, with an abundant chromatin-associated factor, is likely to affect the interpretation of protein and chromatin immunoprecipitation experiments that make use of this antibody probe.
The human and mouse genomes contain instructions that specify RNAs and proteins and govern the timing, magnitude, and cellular context of their production. To better delineate these elements, phase III of the Encyclopedia of DNA Elements (ENCODE) Project has expanded analysis of the cell and tissue repertoires of RNA transcription, chromatin structure and modification, DNA methylation, chromatin looping, and occupancy by transcription factors and RNA-binding proteins. Here we summarize these efforts, which have produced 5,992 new experimental datasets, including systematic determinations across mouse fetal development. All data are available through the ENCODE data portal (https://www.encodeproject.org), including phase II ENCODE<sup>1</sup> and Roadmap Epigenomics<sup>2</sup> data. We have developed a registry of 926,535 human and 339,815 mouse candidate cis-regulatory elements, covering 7.9 and 3.4% of their respective genomes, by integrating selected datatypes associated with gene regulation, and constructed a web-based server (SCREEN; http://screen.encodeproject.org) to provide flexible, user-defined access to this resource. Collectively, the ENCODE data and registry provide an expansive resource for the scientific community to build a better understanding of the organization and function of the human and mouse genomes.
<b>Background:</b> Cross-linking mass spectrometry (XL-MS) is a powerful technology capable of yielding structural insights across the complex cellular protein interaction network. However, to date, most studies utilising XL-MS to characterise the topology of individual protein complexes have been carried out on over-expressed or recombinant proteins, which might not accurately represent native cellular conditions. <b>Methods:</b> We performed XL-MS using the MS-cleavable crosslinker disuccinimidyl sulfoxide (DSSO) after immunoprecipitation of the endogenous BRG/Brahma-associated factors (BAF) complex and co-purifying proteins. Data are available via ProteomeXchange with identifier PXD027611. <b>Results:</b> Although we did not detect the expected enrichment of crosslinks within the BAF complex, we identified numerous crosslinks between three co-purifying proteins, namely Thrap3, Bclaf1 and Erh. Thrap3 and Bclaf1 are mostly disordered proteins for which no 3D structure is available. The XL data allowed us to map interaction surfaces on these proteins, which overlap with the non-disordered portions of both proteins. The identified XLs are in agreement with homology-modelled structures, suggesting that the interaction surfaces are globular. <b>Conclusions:</b> Our data show that the MS-cleavable crosslinker DSSO can be used to characterise in detail the topology and interaction surfaces of endogenous protein complexes without the need for overexpression. We demonstrate that Bclaf1, Erh and Thrap3 interact closely with each other, suggesting they might form a novel complex, hereafter referred to as the BET complex. These data can be exploited for modelling protein-protein docking to characterise the three-dimensional structure of the complex. Endogenous XL-MS might be challenging due to crosslinker accessibility, protein complex abundance or isolation efficiency, and may require further optimisation for some complexes, like the BAF complex, to detect a substantial number of crosslinks.
Soft tissue sarcomas (STS) are rare and diverse mesenchymal cancers with limited treatment options. Here we undertake comprehensive proteomic profiling of tumour specimens from 321 STS patients representing 11 histological subtypes. Within leiomyosarcomas, we identify three proteomic subtypes with distinct myogenesis and immune features, anatomical site distribution and survival outcomes. Characterisation of undifferentiated pleomorphic sarcomas and dedifferentiated liposarcomas with low infiltrating CD3+ T-lymphocyte levels nominates the complement cascade as a candidate immunotherapeutic target. Comparative analysis of proteomic and transcriptomic profiles highlights the proteomic-specific features for optimal risk stratification in angiosarcomas. Finally, we define functional signatures termed Sarcoma Proteomic Modules which transcend histological subtype classification and show that a vesicle transport protein signature is an independent prognostic factor for distant metastasis. Our study highlights the utility of proteomics for identifying molecular subgroups with implications for risk stratification and therapy selection and provides a rich resource for future sarcoma research.
Although PARP inhibitors (PARPi) now form part of the standard-of-care for the treatment of homologous recombination defective cancers, de novo and acquired resistance limits their overall effectiveness. Previously, overexpression of the BRCA1-∆11q splice variant has been shown to cause PARPi resistance. How cancer cells achieve increased BRCA1-∆11q expression has remained unclear. Using isogenic cells with different BRCA1 mutations, we show that reduction in HUWE1 leads to increased levels of BRCA1-∆11q and PARPi resistance. This effect is specific to cells able to express BRCA1-∆11q (e.g. BRCA1 exon 11 mutant cells) and is not seen in BRCA1 mutants that cannot express BRCA1-∆11q, nor in BRCA2 mutant cells. As well as increasing levels of BRCA1-∆11q protein in exon 11 mutant cells, HUWE1 silencing also restores RAD51 nuclear foci and platinum salt resistance. HUWE1 catalytic domain mutations were also seen in a case of PARPi resistant, BRCA1 exon 11 mutant, high grade serous ovarian cancer. These results suggest how elevated levels of BRCA1-∆11q and PARPi resistance can be achieved, identify HUWE1 as a candidate biomarker of PARPi resistance for assessment in future clinical trials and illustrate how some PARPi resistance mechanisms may only operate in patients with particular BRCA1 mutations.
SF3B1 hotspot mutations are associated with a poor prognosis in several tumor types and lead to global disruption of canonical splicing. Through synthetic lethal drug screens, we identify that SF3B1 mutant (SF3B1<sup>MUT</sup>) cells are selectively sensitive to poly (ADP-ribose) polymerase inhibitors (PARPi), independent of hotspot mutation and tumor site. SF3B1<sup>MUT</sup> cells display a defective response to PARPi-induced replication stress that occurs via downregulation of the cyclin-dependent kinase 2 interacting protein (CINP), leading to increased replication fork origin firing and loss of phosphorylated CHK1 (pCHK1; S317) induction. This results in subsequent failure to resolve DNA replication intermediates and G<sub>2</sub>/M cell cycle arrest. These defects are rescued through CINP overexpression, or further targeted by a combination of ataxia-telangiectasia mutated and PARP inhibition. In vivo, PARPi produce profound antitumor effects in multiple SF3B1<sup>MUT</sup> cancer models and eliminate distant metastases. These data provide the rationale for testing the clinical efficacy of PARPi in a biomarker-driven, homologous recombination proficient, patient population.
Cancer-associated fibroblasts (CAFs) are a key component of tumors. We aimed to profile the proteome of cancer cell lines representing three common cancer types (lung, colorectal and pancreatic) and a representative CAF cell line from each tumor type to gain insight into CAF function and novel CAF biomarkers. We used isobaric labeling, liquid chromatography and mass spectrometry to evaluate the proteome of 9 cancer and 3 CAF cell lines. Of the 9460 proteins evaluated, functional enrichment analysis revealed an upregulation of N-glycan biosynthesis and extracellular matrix proteins in CAFs. 85 proteins had 16-fold higher expression in CAFs compared to cancer cells, including previously known CAF markers like fibroblast activation protein (FAP). Novel overexpressed CAF biomarkers included heat shock protein β-6 (HSPB6/HSP20) and cyclooxygenase 1 (PTGS1/COX1). siRNA knockdown of the genes encoding these proteins did not reduce contractility in lung CAFs, suggesting they were not crucial to this function. Immunohistochemical analysis of 30 tumor samples (10 lung, 10 colorectal and 10 pancreatic) showed restricted HSPB6 and PTGS1 expression in the stroma. Therefore, we describe an unbiased differential proteome analysis of CAFs compared to cancer cells, which revealed higher expression of HSPB6 and PTGS1 in CAFs. Data are available via ProteomeXchange (PXD040360). SIGNIFICANCE: Cancer-associated fibroblasts (CAFs) are highly abundant stromal cells present in tumors. CAFs are known to influence tumor progression and drug resistance. Characterizing the proteome of CAFs could give potential insights into new stromal drug targets and biomarkers. Mass spectrometry-based analysis comparing proteomic profiles of CAFs and cancers characterized 9460 proteins of which 85 proteins had 16-fold higher expression in CAFs compared to cancer cells. Further interrogation of this rich resource could provide insight into the function of CAFs and could reveal putative stromal targets.
We describe for the first time that heat shock protein β-6 (HSPB6/HSP20) and cyclooxygenase 1 (PTGS1/COX1) are overexpressed in CAFs compared to cancer cells.
IFNγ alters the immunopeptidome presented on HLA class I (HLA-I), and its activity on cancer cells is known to be important for effective immunotherapy responses. We performed proteomic analyses of untreated and IFNγ-treated colorectal cancer patient-derived organoids and combined this with transcriptomic and HLA-I immunopeptidomics data to dissect mechanisms that lead to remodeling of the immunopeptidome through IFNγ. IFNγ-induced changes in the abundance of source proteins, switching from the constitutive to the immunoproteasome, and differential upregulation of different HLA alleles explained some, but not all, observed peptide abundance changes. By selecting for peptides which increased or decreased the most in abundance, but originated from proteins with limited abundance changes, we discovered that the amino acid composition of presented peptides also influences whether a peptide is upregulated or downregulated on HLA-I through IFNγ. The presence of proline within the peptide core was most strongly associated with peptide downregulation. This was validated in an independent dataset. Proline substitution in relevant core positions did not influence the predicted HLA-I binding affinity or stability, indicating that proline effects on peptide processing may be most relevant. Understanding the multiple factors that influence the abundance of peptides presented on HLA-I in the absence or presence of IFNγ is important to identify the best targets for antigen-specific cancer immunotherapies such as vaccines or T-cell receptor engineered therapeutics.<h4>Significance</h4>IFNγ remodels the HLA-I-presented immunopeptidome. We showed that peptide-specific factors influence whether a peptide is upregulated or downregulated and identified a preferential loss or downregulation of those with proline near the peptide center. This will help select immunotherapy target antigens that are consistently presented by cancer cells.
The survival of children with diffuse intrinsic pontine glioma (DIPG) remains dismal, with new treatments desperately needed. In a prospective biopsy-stratified clinical trial, we combined detailed molecular profiling and drug screening in newly established patient-derived models in vitro and in vivo. We identified in vitro sensitivity to MEK inhibitors in DIPGs harboring MAPK pathway alterations, but treatment of patient-derived xenograft models and a patient at relapse failed to elicit a significant response. We generated trametinib-resistant clones in a BRAF<sup>G469V</sup> model through continuous drug exposure and identified acquired mutations in MEK1/2 with sustained pathway upregulation. These cells showed hallmarks of mesenchymal transition and expression signatures overlapping with inherently trametinib-insensitive patient-derived cells, predicting sensitivity to dasatinib. Combined trametinib and dasatinib showed highly synergistic effects in vitro and on ex vivo brain slices. We highlight the MAPK pathway as a therapeutic target in DIPG and show the importance of parallel resistance modeling and combinatorial treatments for meaningful clinical translation.<h4>Significance</h4>We report alterations in the MAPK pathway in DIPGs to confer initial sensitivity to targeted MEK inhibition. We further identify for the first time the mechanism of resistance to single-agent targeted therapy in these tumors and suggest a novel combinatorial treatment strategy to overcome it in the clinic.
Genomic instability arising from defective responses to DNA damage<sup>1</sup> or mitotic chromosomal imbalances<sup>2</sup> can lead to the sequestration of DNA in aberrant extranuclear structures called micronuclei (MN). Although MN are a hallmark of ageing and diseases associated with genomic instability, the catalogue of genetic players that regulate the generation of MN remains to be determined. Here we analyse 997 mouse mutant lines, revealing 145 genes whose loss significantly increases (n = 71) or decreases (n = 74) MN formation, including many genes whose orthologues are linked to human disease. We found that mice null for Dscc1, which showed the most significant increase in MN, also displayed a range of phenotypes characteristic of patients with cohesinopathy disorders. After validating the DSCC1-associated MN instability phenotype in human cells, we used genome-wide CRISPR-Cas9 screening to define synthetic lethal and synthetic rescue interactors. We found that the loss of SIRT1 can rescue phenotypes associated with DSCC1 loss in a manner paralleling restoration of protein acetylation of SMC3. Our study reveals factors involved in maintaining genomic stability and shows how this information can be used to identify mechanisms that are relevant to human disease biology<sup>1</sup>.
Multidrug resistance-associated protein 2 (MRP2/ABCC2) is a polyspecific efflux transporter of organic anions expressed in hepatocyte canalicular membranes. MRP2 dysfunction, in Dubin-Johnson syndrome or by off-target inhibition, for example by the uricosuric drug probenecid, elevates circulating bilirubin glucuronide and is a cause of jaundice. Here, we determine the cryo-EM structure of rat Mrp2 (rMrp2) in an autoinhibited state and in complex with probenecid. The autoinhibited state exhibits an unusual conformation for this class of transporter in which the regulatory domain is folded within the transmembrane domain cavity. In vitro phosphorylation, mass spectrometry and transport assays show that phosphorylation of the regulatory domain relieves this autoinhibition and enhances rMrp2 transport activity. The in vitro data is confirmed in human hepatocyte-like cells, in which inhibition of endogenous kinases also reduces human MRP2 transport activity. The drug-bound state reveals two probenecid binding sites that suggest a dynamic interplay with autoinhibition. Mapping of the Dubin-Johnson mutations onto the rodent structure indicates that many may interfere with the transition between conformational states.
Neuronal differentiation requires building a complex intracellular architecture, and therefore the coordinated regulation of defined sets of genes. RNA-binding proteins (RBPs) play a key role in this regulation. However, while their action on individual mRNAs has been explored in depth, the mechanisms used to coordinate gene expression programs shaping neuronal morphology are poorly understood. To address this, we studied how the paradigmatic RBP IMP1 (IGF2BP1), an essential developmental factor, selects and regulates its RNA targets during human neuronal differentiation. We perform a combination of system-wide and molecular analyses, revealing that IMP1 developmentally transitions to and directly regulates the expression of mRNAs encoding essential regulators of the microtubule network, a key component of neuronal morphology. Furthermore, we show that m6A methylation drives the selection of specific IMP1 mRNA targets and their protein expression during the developmental transition from neural precursors to neurons, providing a molecular principle for the onset of target selectivity.
Small molecules that induce protein degradation hold the potential to overcome several limitations of the currently available inhibitors. Monovalent or molecular glue degraders, in particular, enable the benefits of protein degradation without the disadvantages of high molecular weight and the resulting challenge in drug development that are associated with bivalent molecules like Proteolysis Targeting Chimeras. One key challenge in designing monovalent degraders is how to build in the degrader activity: how can we convert an inhibitor into a degrader? If degradation activity requires very specific molecular features, it will be difficult to find new degraders and challenging to optimize those degraders toward drugs. Herein, we demonstrate that an unexpectedly wide range of modifications to the degradation-inducing group of the cyclin K degrader CR8 are tolerated, including both aromatic and nonaromatic groups. We used these findings to convert the pan-CDK inhibitors dinaciclib and AT-7519 to cyclin K degraders, leading to a novel dinaciclib-based compound with improved degradation activity compared to CR8, and confirmed the mechanism of degradation. These results suggest that general design principles can be generated for the development and optimization of monovalent degraders.
Resistance is a major obstacle to effective cancer treatment, and although the stroma forms a significant portion of the tumor mass, traditional drug screens involve cancer cells alone. Cancer-associated fibroblasts (CAFs) are a major tumor stroma component and their secreted proteins may influence the function of cancer cells. The majority of secretome studies compare different cancer or CAF cell lines exclusively. Here, we present the direct characterization of the secreted protein profiles between CAFs and <i>KRAS</i> mutant-cancer cell lines from colorectal, lung, and pancreatic tissues using multiplexed mass spectrometry. 2573 secreted proteins were annotated, and differential analysis highlighted understudied CAF-enriched secreted proteins, including Wnt family member 5B (WNT5B), in addition to established CAF markers, such as collagens. The functional role of CAF secreted proteins was explored by assessing their effect on the response to 97 anticancer drugs, since stromal cells may cause a differing cancer drug response that may be missed in routine drug screening using cancer cells alone. CAF secreted proteins caused specific effects on each of the cancer cell lines, which highlights the complexity and challenges of cancer treatment and the importance of considering stromal elements.
The proper control of mitosis depends on the ubiquitin-mediated degradation of the right mitotic regulator at the right time. This is effected by the Anaphase Promoting Complex/Cyclosome (APC/C) ubiquitin ligase that is regulated by the Spindle Assembly Checkpoint (SAC). The SAC prevents the APC/C from recognising Cyclin B1, the essential anaphase and cytokinesis inhibitor, until all chromosomes are attached to the spindle. Once chromosomes are attached, Cyclin B1 is rapidly degraded to enable chromosome segregation and cytokinesis. We have a good understanding of how the SAC inhibits the APC/C, but relatively little is known about how the APC/C recognises Cyclin B1 as soon as the SAC is turned off. Here, by combining live-cell imaging, in vitro reconstitution biochemistry, and structural analysis by cryo-electron microscopy, we provide evidence that the rapid recognition of Cyclin B1 in metaphase requires spatial regulation of the APC/C. Using fluorescence cross-correlation spectroscopy, we find that Cyclin B1 and the APC/C primarily interact at the mitotic apparatus. We show that this is because Cyclin B1, like the APC/C, binds to nucleosomes, and identify an 'arginine-anchor' in the N-terminus as necessary and sufficient for binding to the nucleosome. Mutating the arginine anchor on Cyclin B1 reduces its interaction with the APC/C and delays its degradation: cells with the mutant, non-nucleosome-binding Cyclin B1 become aneuploid, demonstrating the physiological relevance of our findings. Together, our data demonstrate that mitotic chromosomes promote the efficient interaction between Cyclin B1 and the APC/C to ensure the timely degradation of Cyclin B1 and genomic stability.
CDK4/6 inhibition in combination with endocrine therapy is the standard of care for estrogen receptor (ER+) breast cancer, and although cytostasis is frequently observed, new treatment strategies that enhance efficacy are required. Here, we perform two independent genome-wide CRISPR screens to identify genetic determinants of CDK4/6 and endocrine therapy sensitivity. Genes involved in oxidative stress and ferroptosis modulate sensitivity, with GPX4 as the top sensitiser in both screens. Depletion or inhibition of GPX4 increases sensitivity to palbociclib and giredestrant, and their combination, in ER+ breast cancer models, with GPX4 null xenografts being highly sensitive to palbociclib. GPX4 perturbation additionally sensitises triple negative breast cancer (TNBC) models to palbociclib. Palbociclib and giredestrant induced oxidative stress and disordered lipid metabolism, leading to a ferroptosis-sensitive state. Lipid peroxidation is promoted by a peroxisome AGPAT3-dependent pathway in ER+ breast cancer models, rather than the classical ACSL4 pathway. Our data demonstrate that CDK4/6 and ER inhibition creates vulnerability to ferroptosis induction, that could be exploited through combination with GPX4 inhibitors, to enhance sensitivity to the current therapies in breast cancer.
Receptor-interacting serine/threonine-protein kinase 1 (RIPK1) functions as a critical stress sentinel that coordinates cell survival, inflammation, and immunogenic cell death (ICD). Although the catalytic function of RIPK1 is required to trigger cell death, its non-catalytic scaffold function mediates strong pro-survival signaling. Accordingly, cancer cells can hijack RIPK1 to block necroptosis and evade immune detection. We generated a small-molecule proteolysis-targeting chimera (PROTAC) that selectively degraded human and murine RIPK1. PROTAC-mediated depletion of RIPK1 deregulated TNFR1 and TLR3/4 signaling hubs, accentuating the output of NF-κB, MAPK, and IFN signaling. Additionally, RIPK1 degradation simultaneously promoted RIPK3 activation and necroptosis induction. We further demonstrated that RIPK1 degradation enhanced the immunostimulatory effects of radio- and immunotherapy by sensitizing cancer cells to treatment-induced TNF and interferons. This promoted ICD, antitumor immunity, and durable treatment responses. Consequently, targeting RIPK1 by PROTACs emerges as a promising approach to overcome radio- or immunotherapy resistance and enhance anticancer therapies.
The F-box and WD repeat domain containing 7 (FBXW7) tumour suppressor gene encodes a substrate-recognition subunit of Skp, cullin, F-box (SCF)-containing complexes. The tumour-suppressive role of FBXW7 is ascribed to its ability to drive ubiquitination and degradation of oncoproteins. Despite this molecular understanding, therapeutic approaches that target defective FBXW7 have not been identified. Using genome-wide clustered regularly interspaced short palindromic repeats (CRISPR)-Cas9 screens, focussed RNA-interference screens and whole and phospho-proteome mass spectrometry profiling in multiple FBXW7 wild-type and defective isogenic cell lines, we identified a number of FBXW7 synthetic lethal targets, including proteins involved in the response to replication fork stress and proteins involved in replication origin firing, such as cell division cycle 7-related protein kinase (CDC7) and its substrate, DNA replication complex GINS protein SLD5 (GINS4). The CDC7 synthetic lethal effect was confirmed using small-molecule inhibitors. Mechanistically, FBXW7/CDC7 synthetic lethality is dependent upon the replication factor telomere-associated protein RIF1 (RIF1), with RIF1 silencing reversing the FBXW7-selective effects of CDC7 inhibition. The delineation of FBXW7 synthetic lethal effects we describe here could serve as the starting point for subsequent drug discovery and/or development in this area.