Updated on: March 28, 2023

Pan-cancer analysis identifies tumor-specific antigens derived from transposable elements

Cryptic promoters within transposable elements (TEs) can be transcriptionally reactivated in tumors to create new TE-chimeric transcripts, which can produce immunogenic antigens.

Reference: Nakul M. Shah et al., Nature Genetics, 2023

We performed a comprehensive screen for these TE exaptation events in 33 TCGA tumor types, 30 GTEx adult tissues and 675 cancer cell lines, and identified 1,068 TE-exapted candidates with the potential to generate shared tumor-specific TE-chimeric antigens (TS-TEAs). Whole-lysate and HLA-pulldown mass spectrometry data confirmed that TS-TEAs are presented on the surface of cancer cells.

In addition, we highlight tumor-specific membrane proteins transcribed from TE promoters that constitute aberrant epitopes on the extracellular surface of cancer cells. Altogether, we showcase the high pan-cancer prevalence of TS-TEAs and atypical membrane proteins that could potentially be therapeutically exploited and targeted.

Deep human proteome sequencing

An average shotgun proteomics experiment detects approximately 10,000 human proteins from a single sample. However, individual proteins are typically identified by peptide sequences representing a small fraction of their total amino acids.

Reference: Pavel Sinitcyn et al., Nature Biotechnology, 2023

Hence, an average shotgun experiment fails to distinguish different protein variants and isoforms. Deeper proteome sequencing is therefore required for the global discovery of protein isoforms. Using six different human cell lines, six proteases, deep fractionation and three tandem mass spectrometry fragmentation methods, we identify a million unique peptides from 17,717 protein groups, with a median sequence coverage of approximately 80%. Direct comparison with RNA expression data provides evidence for the translation of most nonsynonymous variants.

We also hypothesize that undetected variants likely arise from mutation-induced protein instability. We further observe comparable detection rates for exon–exon junction peptides representing constitutive and alternative splicing events. Our dataset represents a resource for proteoform discovery and provides direct evidence that most frame-preserving alternatively spliced isoforms are translated.
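
Sequence coverage, as quoted above, is the fraction of a protein's residues covered by at least one identified peptide. The following is a minimal sketch of that calculation, assuming exact substring matching; the function name, the toy protein and the toy peptides are illustrative and are not taken from the study's data or software.

```python
def sequence_coverage(protein: str, peptides: list[str]) -> float:
    """Return the fraction of residues in `protein` covered by any peptide."""
    if not protein:
        return 0.0
    covered = [False] * len(protein)
    for pep in peptides:
        if not pep:
            continue  # ignore empty peptide strings
        start = protein.find(pep)
        while start != -1:
            # Mark every residue spanned by this occurrence as covered
            for i in range(start, start + len(pep)):
                covered[i] = True
            start = protein.find(pep, start + 1)
    return sum(covered) / len(protein)


# Hypothetical 22-residue protein and two hypothetical peptides
toy_protein = "MKTAYIAKQRQISFVKSHFSRQ"
toy_peptides = ["MKTAYIAK", "QISFVK"]

coverage = sequence_coverage(toy_protein, toy_peptides)
print(f"coverage = {coverage:.1%}")
```

In this toy case 14 of 22 residues are covered. Combining several proteases, as the study does, tends to raise this fraction because each protease cuts the protein into a different set of peptides.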

Multiomic signatures of body mass index

Multiomic profiling can reveal population heterogeneity for both health and disease states. Obesity drives a myriad of metabolic perturbations and is a risk factor for multiple chronic diseases.

Reference: Kengo Watanabe et al., Nature Medicine, 2023

Here we report an atlas of cross-sectional and longitudinal changes in 1,111 blood analytes associated with variation in body mass index (BMI), as well as multiomic associations with host polygenic risk scores and gut microbiome composition, from a cohort of 1,277 individuals enrolled in a wellness program (Arivale). Machine learning model predictions of BMI from blood multiomics captured heterogeneous phenotypic states of host metabolism and gut microbiome composition better than BMI, which was also validated in an external cohort (TwinsUK).

Our analyses further identified blood analyte–analyte associations that were modified by metabolomics-inferred BMI and partially reversed in individuals with metabolic obesity during the intervention. Taken together, our findings provide a blood atlas of the molecular perturbations associated with changes in obesity status, serving as a resource to quantify metabolic health for predictive and preventive medicine.

The enormous repetitive Antarctic krill genome reveals environmental adaptations and population insights

Antarctic krill (Euphausia superba) is Earth’s most abundant wild animal, and its enormous biomass is vital to the Southern Ocean ecosystem. Here, we report a 48.01-Gb chromosome-level Antarctic krill genome, whose large genome size appears to have resulted from inter-genic transposable element expansions.

Reference: Changwei Shao et al., Cell, 2023

Our assembly reveals the molecular architecture of the Antarctic krill circadian clock and uncovers expanded gene families associated with molting and energy metabolism, providing insights into adaptations to the cold and highly seasonal Antarctic environment. Population-level genome re-sequencing from four geographical sites around the Antarctic continent reveals no clear population structure but highlights natural selection associated with environmental variables.

An apparent drastic reduction in krill population size 10 million years ago and a subsequent rebound 100 thousand years ago coincide with climate change events. Our findings uncover the genomic basis of Antarctic krill adaptations to the Southern Ocean and provide valuable resources for future Antarctic research.

Multiomic analysis of malignant pleural mesothelioma

Malignant pleural mesothelioma (MPM) is an aggressive cancer with rising incidence and challenging clinical management.

Reference: Lise Mangiante et al., Nature Genetics, 2023

Through a large series of whole-genome sequencing data, integrated with transcriptomic and epigenomic data using multiomics factor analysis, we demonstrate that the current World Health Organization classification only accounts for up to 10% of interpatient molecular differences. Instead, the MESOMICS project paves the way for a morphomolecular classification of MPM based on four dimensions: ploidy, tumor cell morphology, adaptive immune response and CpG island methylator profile.

We show that these four dimensions are complementary, capture major interpatient molecular differences and are delimited by extreme phenotypes that, in the case of the interdependent tumor cell morphology and adaptive immune response, reflect tumor specialization. These findings unearth the interplay between MPM functional biology and its genomic history, and provide insights into the variations observed in the clinical behavior of patients with MPM.

Pharmacogenomic profiling reveals molecular features of chemotherapy resistance in glioblastoma

Although temozolomide (TMZ) has been used as a standard adjuvant chemotherapeutic agent for primary glioblastoma (GBM), treating isocitrate dehydrogenase wild-type (IDH-wt) cases remains challenging due to intrinsic and acquired drug resistance. Therefore, elucidation of the molecular mechanisms of TMZ resistance is critical for its precision application.

Reference: Yoonhee Nam et al., Genome Biology, 2023

We stratified 69 primary IDH-wt GBM patients into TMZ-resistant (n = 29) and sensitive (n = 40) groups, using TMZ screening of the corresponding patient-derived glioma stem-like cells (GSCs). Genomic and transcriptomic features were then examined to identify TMZ-associated molecular alterations. Subsequently, we developed a machine learning (ML) model to predict TMZ response from combined signatures.

We identified molecular characteristics associated with TMZ sensitivity, and illustrated the potential clinical value of an ML model trained on pharmacogenomic profiling of patient-derived GSCs against IDH-wt GBMs.

Optimal dietary patterns for prevention of chronic disease

Multiple dietary patterns have been associated with different diseases; however, how they compare in improving overall health has yet to be determined.

Reference: Peilu Wang et al., Nature Medicine, 2023

Here, in 205,852 healthcare professionals from three US cohorts followed for up to 32 years, we prospectively assessed two mechanism-based diets and six diets based on dietary recommendations in relation to major chronic disease, defined as a composite outcome of incident major cardiovascular disease (CVD), type 2 diabetes and cancer.

We demonstrated that adherence to a healthy diet was generally associated with a lower risk of major chronic disease (hazard ratio (HR) comparing the 90th with the 10th percentile of dietary pattern scores = 0.58–0.80). Participants with low insulinemic (HR = 0.58, 95% confidence interval (CI) = 0.57, 0.60), low inflammatory (HR = 0.61, 95% CI = 0.60, 0.63) or diabetes risk-reducing (HR = 0.70, 95% CI = 0.69, 0.72) diet had the largest risk reduction for incident major CVD, type 2 diabetes and cancer as a composite and individually.

Similar findings were observed across gender and diverse ethnic groups. Our results suggest that dietary patterns associated with markers of hyperinsulinemia and inflammation and diabetes development may inform future dietary guidelines for chronic disease prevention.
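
A hazard ratio below 1 can be read as a relative reduction in hazard: an HR of 0.58 corresponds to a hazard about 42% lower at the 90th percentile of the dietary score than at the 10th. The sketch below applies that conversion to the three HRs quoted above. The function name is illustrative, and the output is a relative measure, not an absolute risk difference.

```python
def percent_lower_hazard(hazard_ratio: float) -> float:
    """Convert a hazard ratio into a percent relative reduction in hazard."""
    return (1.0 - hazard_ratio) * 100.0


# Hazard ratios as quoted in the summary (90th vs 10th percentile of score)
reported_hr = {
    "low insulinemic": 0.58,
    "low inflammatory": 0.61,
    "diabetes risk-reducing": 0.70,
}

reductions = {
    diet: round(percent_lower_hazard(hr)) for diet, hr in reported_hr.items()
}

for diet, pct in reductions.items():
    print(f"{diet}: about {pct}% lower hazard")
```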

CeDAR: incorporating cell type hierarchy improves cell type-specific differential analyses in bulk omics data

Bulk high-throughput omics data contain signals from a mixture of cell types. Recent developments of deconvolution methods facilitate cell type-specific inferences from bulk data.

Reference: Luxiao Chen et al., Genome Biology, 2023

Our real data exploration suggests that differential expression or methylation status is often correlated among cell types. Based on this observation, we develop a novel statistical method named CeDAR to incorporate the cell type hierarchy in cell type-specific differential analyses of bulk data. Extensive simulation and real data analyses demonstrate that this approach significantly improves the accuracy and power in detecting cell type-specific differential signals compared with existing methods, especially in low-abundance cell types.

Base editing screens map mutations affecting interferon-γ signaling in cancer

Interferon-γ (IFN-γ) signaling mediates host responses to infection, inflammation and anti-tumor immunity. Mutations in the IFN-γ signaling pathway cause immunological disorders, hematological malignancies, and resistance to immune checkpoint blockade (ICB) in cancer; however, the function of most clinically observed variants remains unknown.

Reference: Matthew A. Coelho et al., Cancer Cell, 2023

Here, we systematically investigate the genetic determinants of IFN-γ response in colorectal cancer cells using CRISPR-Cas9 screens and base editing mutagenesis. Deep mutagenesis of JAK1 with cytidine and adenine base editors, combined with pathway-wide screens, reveals loss-of-function and gain-of-function mutations, including causal variants in hematological malignancies and mutations detected in patients refractory to ICB.

We functionally validate variants of uncertain significance in primary tumor organoids, where engineering missense mutations in JAK1 enhanced or reduced sensitivity to autologous tumor-reactive T cells. We identify more than 300 predicted missense mutations altering IFN-γ pathway activity, generating a valuable resource for interpreting gene variant function.

Genomic autopsy to identify underlying causes of pregnancy loss and perinatal death

Pregnancy loss and perinatal death are devastating events for families. We assessed ‘genomic autopsy’ as an adjunct to standard autopsy for 200 families who had experienced fetal or newborn death, providing a definitive or candidate genetic diagnosis in 105 families.

Reference: Alicia B. Byrne et al., Nature Medicine, 2023

Our cohort provides evidence of severe atypical in utero presentations of known genetic disorders and identifies novel phenotypes and disease genes. Inheritance of 42% of definitive diagnoses was either autosomal recessive (30.8%), X-linked recessive (3.8%) or autosomal dominant (excluding de novo variants, 7.7%), with risk of recurrence in future pregnancies. We report that at least ten families (5%) used their diagnosis for preimplantation (5) or prenatal diagnosis (5) of 12 pregnancies.

We emphasize the clinical importance of genomic investigations of pregnancy loss and perinatal death, with short turnaround times for diagnostic reporting, followed by systematic research follow-up investigations. This approach has the potential to enable accurate counseling for future pregnancies.
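
The proportions quoted in this summary can be reproduced with simple arithmetic. The sketch below is only a consistency check on the reported figures, with variable names of our own choosing. Note that the inheritance percentages are fractions of definitive diagnoses, whereas the other two figures are fractions of all 200 families.

```python
families = 200
families_with_diagnosis = 105   # definitive or candidate diagnosis
families_using_diagnosis = 10   # preimplantation or prenatal diagnosis

# Percent of definitive diagnoses by inheritance mode, as reported
inheritance_pct = {
    "autosomal recessive": 30.8,
    "X-linked recessive": 3.8,
    "autosomal dominant (excluding de novo)": 7.7,
}

# Share of definitive diagnoses carrying a recurrence risk
recurrence_risk_pct = round(sum(inheritance_pct.values()), 1)

# Shares of the whole cohort
diagnostic_yield_pct = 100 * families_with_diagnosis / families
used_diagnosis_pct = 100 * families_using_diagnosis / families

print(recurrence_risk_pct, diagnostic_yield_pct, used_diagnosis_pct)
```

The three inheritance modes sum to 42.3%, which matches the rounded 42% in the text, and ten of 200 families is the stated 5%.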

Impact of the Euro 2020 championship on the spread of COVID-19

Large-scale events like the UEFA Euro 2020 football (soccer) championship offer a unique opportunity to quantify the impact of gatherings on the spread of COVID-19, as the number and dates of matches played by participating countries resemble a randomized study.

Reference: Jonas Dehning et al., Nature Communications, 2023

Using Bayesian modeling and the gender imbalance in COVID-19 data, we attribute 840,000 (95% CI: [0.39M, 1.26M]) COVID-19 cases across 12 countries to the championship. The impact depends non-linearly on the initial incidence, the reproduction number R, and the number of matches played.

The strongest effects are seen in Scotland and England, where as many as 10,000 primary cases per million inhabitants occur from championship-related gatherings. The average match-induced increase in R was 0.46 [0.18, 0.75] on match days, but important matches caused an increase as large as +3. Altogether, our results provide quantitative insights that help judge and mitigate the impact of large-scale events on pandemic spread.

Single-cell transcriptomics reveals a mechanosensitive injury signaling pathway in early diabetic nephropathy

Diabetic nephropathy (DN) is the leading cause of end-stage renal disease, and histopathologic glomerular lesions are among the earliest structural alterations of DN. However, the signaling pathways that initiate these glomerular alterations are incompletely understood.

Reference: Shuya Liu et al., Genome Medicine, 2023

To delineate the cellular and molecular basis for DN initiation, we performed single-cell and bulk RNA sequencing of renal cells from type 2 diabetes mice (BTBR ob/ob) at the early stage of DN.

Analysis of differentially expressed genes revealed glucose-independent responses in glomerular cell types. The gene regulatory network upstream of glomerular cell programs suggested activation of the mechanosensitive transcriptional pathway MRTF-SRF, predominantly in mesangial cells. Importantly, activation of the MRTF-SRF transcriptional pathway was also identified in DN glomeruli in independent patient cohort datasets. Furthermore, ex vivo kidney perfusion suggested that the regulation of MRTF-SRF is a common mechanism in response to glomerular hyperfiltration.

Development of a treatment selection algorithm for SGLT2 and DPP-4 inhibitor therapies in people with type 2 diabetes

Current treatment guidelines do not provide recommendations to support the selection of treatment for most people with type 2 diabetes. We aimed to develop and validate an algorithm to allow selection of optimal treatment based on glycaemic response, weight change, and tolerability outcomes when choosing between SGLT2 inhibitor or DPP-4 inhibitor therapies.

Reference: John M. Dennis et al., The Lancet Digital Health, 2022

In this retrospective cohort study, we identified patients initiating SGLT2 and DPP-4 inhibitor therapies after Jan 1, 2013, from the UK Clinical Practice Research Datalink (CPRD). We excluded those who received SGLT2 or DPP-4 inhibitors as first-line treatment or insulin at the same time, had estimated glomerular filtration rate (eGFR) of less than 45 mL/min per 1·73 m², or did not have a valid baseline glycated haemoglobin (HbA1c) measure (<53 or ≥120 mmol/mol).

Among 10 253 patients initiating SGLT2 inhibitors and 16 624 patients initiating DPP-4 inhibitors in CPRD, baseline HbA1c, age, BMI, eGFR, and alanine aminotransferase were associated with differential HbA1c outcome with SGLT2 inhibitor and DPP-4 inhibitor therapies. The median age of participants was 62·0 years (IQR 55·0–70·0). 10 016 (37·3%) were women and 16 861 (62·7%) were men. An algorithm based on these five features identified a subgroup, representing around four in ten CPRD patients, with a 5 mmol/mol or greater observed benefit with SGLT2 inhibitors in all validation cohorts (CPRD 8·8 mmol/mol [95% CI 7·8–9·8]; CANTATA-D and CANTATA-D2 trials 5·8 mmol/mol [3·9–7·7]; BI1245.20 trial 6·6 mmol/mol [2·2–11·0]). In CPRD, predicted differential HbA1c response with SGLT2 inhibitor and DPP-4 inhibitor therapies was not associated with weight change.
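
The cohort counts reported above can be cross-checked with a few lines of arithmetic. This is only a sanity check on the published figures, not part of the study's treatment selection algorithm, and the variable names are our own.

```python
# Counts as reported in the summary above
sglt2_initiators = 10_253
dpp4_initiators = 16_624
women = 10_016
men = 16_861

# The cohort total should be the same whether split by drug or by sex
total_by_drug = sglt2_initiators + dpp4_initiators
total_by_sex = women + men

pct_women = round(100 * women / total_by_sex, 1)
pct_men = round(100 * men / total_by_sex, 1)

print(total_by_drug, total_by_sex)
print(pct_women, pct_men)
```

Both splits give 26 877 patients, and the sex percentages match the reported 37·3% and 62·7%.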

Spatially aware dimension reduction for spatial transcriptomics

Spatial transcriptomics are a collection of genomic technologies that have enabled transcriptomic profiling on tissues with spatial localization information.

Reference: Lulu Shang et al., Nature Communications, 2022

Here, we develop a spatially-aware dimension reduction method, SpatialPCA, that can extract a low dimensional representation of the spatial transcriptomics data with biological signal and preserved spatial correlation structure, thus unlocking many existing computational tools previously developed in single-cell RNAseq studies for tailored analysis of spatial transcriptomics. We illustrate the benefits of SpatialPCA for spatial domain detection and explore its utility for trajectory inference on the tissue and for high-resolution spatial map construction.

In the real data applications, SpatialPCA identifies key molecular and immunological signatures in a detected tumor surrounding microenvironment, including a tertiary lymphoid structure that shapes the gradual transcriptomic transition during tumorigenesis and metastasis.

Systematic single-cell pathway analysis to characterize early T cell activation

Pathway analysis is a key analytical stage in the interpretation of omics data, providing a powerful method for detecting alterations in cellular processes.

Reference: Jack A. Bibby et al., Cell Reports, 2022

We recently developed a sensitive and distribution-free statistical framework for multisample distribution testing, which we implement here in the open-source R package single-cell pathway analysis (SCPA). We demonstrate the effectiveness of SCPA over commonly used methods, generate a scRNA-seq T cell dataset, and characterize pathway activity over early cellular activation. This reveals regulatory pathways in T cells, including an intrinsic type I interferon system regulating T cell survival and a reliance on arachidonic acid metabolism throughout T cell activation.

A systems-level characterization of pathway activity in T cells across multiple tissues also identifies alpha-defensin expression as a hallmark of bone-marrow-derived T cells. Overall, this work provides a widely applicable tool for single-cell pathway analysis and highlights regulatory mechanisms of T cells.
