Short Description

This experiment grew out of our investigation of the role of NFKBIZ in diffuse large B-cell lymphoma (DLBCL). We are using the DLC DLBCL cohort (n = 348) to look for mutation patterns associated with NFKBIZ. One critical feature we are considering is the DLBCL cell-of-origin (COO). Every DLC sample has a COO call from the Lymph2Cx NanoString assay, which measures RNA levels for 20 genes in FFPE tissue and assigns each sample to one of four categories: ABC, GCB, unclassified (UNC) or undefined (NA).

However, Ryan has expressed concern that this assay is undercalling ABC DLBCLs. Fortunately, we have RNA-seq data for 322 of the 348 DLC samples. We also have mutation data for 72 known or candidate lymphoma genes from a targeted sequencing experiment covering the entire cohort (work in progress). Together, these data should be sufficient to refine the COO assignments of the DLC cohort. This report describes our attempt to do so.

Shared Variables

The variables below are shared throughout this analysis. Note that the seed is fixed so that any step relying on a random number generator is reproducible.

# File paths
expr_path <- file.path(PROJHOME, "data", "expr.tsv")
coo_path <- file.path(PROJHOME, "data", "coo.tsv")
nc_path <- file.path(PROJHOME, "data", "normal_content.tsv")
muts_path <- file.path(PROJHOME, "data", "mutations.tsv")
snvs_and_indels_path <- file.path(PROJHOME, "data", "snvs_and_indels.maf")
cnvs_and_svs_path <- file.path(PROJHOME, "data", "cnvs_and_svs.tsv")
wright_genes_path <- file.path(PROJHOME, "reference", "wright_genes.txt")
lymph2cx_genes_path <- file.path(PROJHOME, "reference", "lymph2cx_genes.txt")

# Random number generated by runif(1, 0, 10^8)
set.seed(87510475)

# Number of threads used when possible
doMC::registerDoMC(cores = 8)

# Number of discrete levels when discretizing a numeric vector
n_breaks <- 5

# Minimum fraction of affected samples for a codon to be called a hotspot
min_recur <- 0.05

# Number of most variably expressed genes to display in heatmaps
ntop <- 500

# Method for multiple test correction
p_adjust_method <- "BH"

# Q-value cutoff
qval_cutoff <- 0.1
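
As a rough illustration of how these shared settings might be consumed downstream, the sketch below applies p_adjust_method and qval_cutoff to a set of p-values and uses n_breaks to discretize a numeric vector. The p-values, gene names and scores are made up for the example, and the use of cut() for discretization is an assumption rather than the method used in this analysis.

```r
# Shared settings, repeated here so the sketch runs on its own
p_adjust_method <- "BH"
qval_cutoff <- 0.1
n_breaks <- 5

# Hypothetical p-values for illustration only (not from the DLC cohort)
pvals <- c(geneA = 0.001, geneB = 0.02, geneC = 0.04, geneD = 0.30, geneE = 0.75)

# Correct for multiple testing with the shared method
qvals <- p.adjust(pvals, method = p_adjust_method)

# Keep the tests that pass the shared q-value cutoff
significant <- names(qvals)[qvals < qval_cutoff]

# Hypothetical numeric scores for illustration only
scores <- c(0.1, 0.5, 2.3, 4.8, 9.9)

# Assumption: discretize into n_breaks equal-width intervals with cut()
binned <- cut(scores, breaks = n_breaks)

print(qvals)
print(significant)
print(table(binned))
```

Defining the correction method and cutoff once, and referring to them by name everywhere, keeps every test in the report on the same significance criteria and makes it possible to change them in a single place.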