Publications
Nucleic acids research, 2019
Publication Abstract
Tissues used in pathology laboratories are typically stored in the form of formalin-fixed, paraffin-embedded (FFPE) samples. One important consideration in repurposing FFPE material for next generation sequencing (NGS) analysis is the sequencing artifacts that can arise from the significant damage to nucleic acids due to treatment with formalin, storage at room temperature and extraction. One such class of artifacts consists of chimeric reads that appear to be derived from non-contiguous portions of the genome. Here, we show that a major proportion of such chimeric reads align to both the 'Watson' and 'Crick' strands of the reference genome. We refer to these as strand-split artifact reads (SSARs). This study provides a conceptual framework for the mechanistic basis of the genesis of SSARs and other chimeric artifacts along with supporting experimental evidence, which have led to approaches to reduce the levels of such artifacts. We demonstrate that one of these approaches, involving S1 nuclease-mediated removal of single-stranded fragments and overhangs, also reduces sequence bias, base error rates, and false positive detection of copy number and single nucleotide variants. Finally, we describe an analytical approach for quantifying SSARs from NGS data.
PloS one, 2019
Publication Abstract
Inflammation contributes to breast cancer development through its effects on cell damage. This damage is usually dealt with by key genes involved in apoptosis and autophagy pathways.
Oncoimmunology, 2019
Publication Abstract
The self-immunopeptidome is the repertoire of all self-peptides that can be presented by the combination of MHC variants carried by an individual, defined by their HLA genotype. Each MHC variant presents a distinct set of self-peptides, and the number of peptides in a set is variable. Subjects carrying MHC variants that present fewer self-peptides should also present fewer mutated peptides, resulting in decreased immune pressure on tumor cells. To explore this, we predicted peptide-MHC binding values using all unique 8-11mer human peptides in the human proteome and all available HLA class I allelic variants, for a total of 134 billion unique peptide-MHC binding predictions. From these predictions, we observe that most peptides can be presented by relatively few (<250) MHC, while some can be presented by upwards of 1,500 different MHC. There is substantial overlap among the repertoires of peptides presented by different MHC and no relationship between the number of peptides presented and HLA population frequency. Nearly 30% of self-peptides are presentable by at least one MHC, leaving 70% of the human peptidome unsurveyed by T cells. We observed similar distributions of predicted self-immunopeptidome sizes in cancer subjects compared to controls, and within the pan-cancer population, predicted self-immunopeptidome size combined with mutational load to predict survival. Self-immunopeptidome analysis revealed evidence for tumor immunoediting and identified specific peptide positions that most influence immunogenicity. Because self-immunopeptidome size is defined by HLA genotypes and approximates neoantigen load, HLA genotyping could offer a rapid predictive biomarker for response to immunotherapy.
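The enumeration of "all unique 8-11mer human peptides" mentioned in the abstract above can be illustrated with a minimal sketch. It is not the authors' pipeline: the function name `unique_peptides` and the toy sequences are hypothetical, and the binding prediction step itself is not shown.

```python
def unique_peptides(proteins, min_len=8, max_len=11):
    """Collect every distinct substring of length min_len..max_len.

    `proteins` is an iterable of amino-acid strings. The name and the
    default lengths (8-11, taken from the abstract) are illustrative.
    """
    peptides = set()
    for seq in proteins:
        for k in range(min_len, max_len + 1):
            # Slide a window of width k along the sequence.
            for start in range(len(seq) - k + 1):
                peptides.add(seq[start:start + k])
    return peptides


# Toy input: two identical 10-residue sequences (hypothetical data).
toy = ["ACDEFGHIKL", "ACDEFGHIKL"]
peps = unique_peptides(toy)
# Each unique peptide would be paired with each HLA allele, so the number
# of predictions is len(peps) * number_of_alleles.
n_predictions = len(peps) * 3  # assuming 3 alleles for illustration
```

Using a set removes peptides shared between proteins, which is what makes the peptide count "unique" before it is multiplied by the number of alleles.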
Nature protocols, 2019
Publication Abstract
A critical step in proteomics analysis is the optimal extraction and processing of protein material to ensure the highest sensitivity in downstream detection. Achieving this requires a sample-handling technology that exhibits unbiased protein manipulation, flexibility in reagent use, and virtually lossless processing. Addressing these needs, the single-pot, solid-phase-enhanced sample-preparation (SP3) technology is a paramagnetic bead-based approach for rapid, robust, and efficient processing of protein samples for proteomic analysis. SP3 uses a hydrophilic interaction mechanism for exchange or removal of components that are commonly used to facilitate cell or tissue lysis, protein solubilization, and enzymatic digestion (e.g., detergents, chaotropes, salts, buffers, acids, and solvents) before downstream proteomic analysis. The SP3 protocol consists of nonselective protein binding and rinsing steps that are enabled through the use of ethanol-driven solvation capture on the surface of hydrophilic beads, and elution of purified material in aqueous conditions. In contrast to alternative approaches, SP3 combines compatibility with a substantial collection of solution additives with virtually lossless and unbiased recovery of proteins independent of input quantity, all in a simplified single-tube protocol. The SP3 protocol is simple and efficient, and can be easily completed by a standard user in ~30 min, including reagent preparation. As a result of these properties, SP3 has successfully been used to facilitate examination of a broad range of sample types spanning simple and complex protein mixtures in large and very small amounts, across numerous organisms. This work describes the steps and extensive considerations involved in performing SP3 in bottom-up proteomics, using a simplified protein cleanup scenario for illustration.
Methods in molecular biology (Clifton, N.J.), 2019
Publication Abstract
Liquid biopsies are rapidly emerging as powerful tools for the early detection of cancer, noninvasive genomic profiling of localized or metastatic tumors, prompt detection of treatment resistance-associated mutations, and monitoring of therapeutic response and minimal residual disease in patients during clinical follow-up. Growing evidence strongly supports the utility of circulating tumor DNA (ctDNA) as a biomarker for the stratification and clinical management of lymphoma patients. However, ctDNA is diluted by variable amounts of cell-free DNA (cfDNA) shed by nonneoplastic cells, causing a background signal of wild-type DNA that limits the sensitivity of methods that rely on DNA sequencing. Here, we describe an error suppression method for single-molecule counting that relies on targeted sequencing of cfDNA libraries constructed with semi-degenerate barcode adapters. Custom pools of biotinylated DNA baits for target enrichment can be designed to specifically track somatic mutations in one patient, to survey mutation hotspots with diagnostic and prognostic value, or to comprise comprehensive gene panels with broad patient coverage in lymphoma. Such methods can track ctDNA levels during longitudinal liquid biopsy testing with high specificity and sensitivity and can characterize, in real time, the genetic profiles of tumors without the need for standard invasive biopsies. Analysis of ultra-deep sequencing data with the bioinformatics pipelines also described in this chapter achieves limits of detection for ctDNA below 0.1%.
Journal of clinical oncology : official journal of the American Society of Clinical Oncology, 2019
Publication Abstract
High-grade B-cell lymphoma with MYC and BCL2 and/or BCL6 rearrangements (HGBL-DH/TH) has a poor outcome after standard chemoimmunotherapy. We sought to understand the biologic underpinnings of HGBL-DH/TH with BCL2 rearrangements (HGBL-DH/TH-BCL2) and diffuse large B-cell lymphoma (DLBCL) morphology through examination of gene expression.
Cold Spring Harbor molecular case studies, 2018
Publication Abstract
Thyroid-like follicular renal cell carcinoma (TLFRCC) is a rare cancer with few reports of metastatic disease. Little is known regarding genomic characteristics and therapeutic targets. We present the clinical, pathologic, genomic, and transcriptomic analyses of a case of a 27-yr-old male with TLFRCC who presented initially with bone metastases of unknown primary. Genomic DNA from peripheral blood and metastatic tumor samples were sequenced. A transcriptome of 280 million sequence reads was generated from the same tumor sample. Tumor somatic expression profiles were analyzed to detect aberrant expression. Genomic and transcriptomic data sets were integrated to reveal dysregulation in pathways and identify potential therapeutic targets. Integrative genomic analysis with The Cancer Genome Atlas (TCGA) data set revealed the following outliers in gene expression profiles: (81st percentile), (99th percentile), (100th percentile), and (99th and 100th percentiles, respectively), and (86th percentile). The patient received first-line sunitinib to target PDGFRA and PDGFRB and had stable disease for >6 mo, followed by nivolumab upon progression. To the authors' knowledge, this is the first reported case of comprehensive somatic genomic analyses in a patient with metastatic TLFRCC. Somatic analyses provided molecular confirmation of the primary site of cancer and potential therapeutic strategies in a rare disease with little evidence of efficacy on systemic therapy.
Molecular therapy oncolytics, 2018
Publication Abstract
Tumor cells frequently evade applied therapies through the accumulation of genomic mutations and rapid evolution. In the case of oncolytic virotherapy, understanding the mechanisms by which cancer cells develop resistance to infection and lysis is critical to the development of more effective viral-based platforms. Here, we identify APOBEC3 as an important factor that restricts the potency of oncolytic vesicular stomatitis virus (VSV). We show that VSV infection of B16 murine melanoma cells upregulated APOBEC3 in an IFN-β-dependent manner, which was responsible for the evolution of virus-resistant cell populations and suggested that APOBEC3 expression promoted the acquisition of a virus-resistant phenotype. Knockdown of APOBEC3 in B16 cells diminished their capacity to develop resistance to VSV infection and enhanced the therapeutic effect of VSV. Similarly, overexpression of human APOBEC3B promoted the acquisition of resistance to oncolytic VSV both in vitro and in vivo. Finally, we demonstrate that APOBEC3B expression had a direct effect on the fitness of VSV, an RNA virus that has not previously been identified as restricted by APOBEC3B. This research identifies APOBEC3 enzymes as key players to target in order to improve the efficacy of viral or broader nucleic acid-based therapeutic platforms.
Cell stem cell, 2018
Publication Abstract
Acute leukemias are aggressive malignancies of developmentally arrested hematopoietic progenitors. We sought here to explore the possibility that changes in hematopoietic stem/progenitor cells during development might alter the biology of leukemias arising from this tissue compartment. Using a mouse model of acute T cell leukemia, we found that leukemias generated from fetal liver (FL) and adult bone marrow (BM) differed dramatically in their leukemia stem cell activity with FL leukemias showing markedly reduced serial transplantability as compared to BM leukemias. We present evidence that this difference is due to NOTCH1-driven autocrine IGF1 signaling, which is active in FL cells but restrained in BM cells by EZH2-dependent H3K27 trimethylation. Further, we confirmed this mechanism is operative in human disease and show that enforced IGF1 signaling effectively limits leukemia stem cell activity. These findings demonstrate that resurrecting dormant fetal programs in adult cells may represent an alternate therapeutic approach in human cancer.
Genes, 2018
Publication Abstract
The grizzly bear (Ursus arctos ssp. horribilis) represents the largest population of brown bears in North America. Its genome was sequenced using a microfluidic partitioning library construction technique, and these data were supplemented with sequencing from a nanopore-based long read platform. The final assembly was 2.33 Gb with a scaffold N50 of 36.7 Mb, and the genome is of comparable size to that of its close relative the polar bear (2.30 Gb). An analysis using 4104 highly conserved mammalian genes indicated that 96.1% were complete within the assembly. An automated annotation of the genome identified 19,848 protein-coding genes. Our study shows that the combination of the two sequencing modalities that we used is sufficient for the construction of highly contiguous, reference-quality mammalian genomes. The assembled genome sequence and the supporting raw sequence reads are available from the NCBI (National Center for Biotechnology Information) under the bioproject identifier PRJNA493656, and the assembly described in this paper is version QXTK01000000.
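The scaffold N50 reported above is the length L such that scaffolds of length L or longer together contain at least half of the assembled bases. A minimal sketch of that calculation follows; the function name `n50` and the toy lengths are hypothetical and are not taken from the paper.

```python
def n50(lengths):
    """Return the N50 of a collection of scaffold (or contig) lengths."""
    if not lengths:
        raise ValueError("need at least one length")
    total = sum(lengths)
    running = 0
    # Walk from the longest scaffold down, stopping once the cumulative
    # length reaches half of the total assembly size.
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length


# Toy scaffold lengths (hypothetical), total = 20.
toy_n50 = n50([2, 3, 4, 5, 6])
```

In the toy case the two longest scaffolds (6 and 5) already cover 11 of 20 bases, so the N50 is 5.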