Overview

In this notebook we test for differentially expressed genes between PRMT7 knockdown and wildtype samples using the limma-voom framework.

Requires: DGE_filtered_normalized.rds produced by 02_quality_control.Rmd


Setup

library(limma)
library(edgeR)
library(tidyverse)
library(ggrepel)
library(pheatmap)
library(RColorBrewer)
library(here)

set.seed(42)

results_dir <- here("demo-analysis", "output", "differential-expression", "results")
figures_dir <- here("demo-analysis", "output", "differential-expression", "figures")

dir.create(results_dir, showWarnings = FALSE, recursive = TRUE)
dir.create(figures_dir, showWarnings = FALSE, recursive = TRUE)

Load Data

Load the filtered and normalized DGEList object saved at the end of Part 1.

DGE <- readRDS(here("demo-analysis", "output", "differential-expression",
                    "DGE_filtered_normalized.rds"))

cat("Genes:", nrow(DGE), "| Samples:", ncol(DGE), "\n")
## Genes: 13783 | Samples: 6
cat("Groups:", paste(levels(DGE$samples$group), collapse = ", "), "\n")
## Groups: PRMT7_WT, PRMT7_KD
DGE$samples
##                group lib.size norm.factors      sample genotype treatment
## SRR12546980 PRMT7_WT 24323440    0.9658872 SRR12546980 PRMT7_WT Untreated
## SRR12546982 PRMT7_KD 29211491    1.0310403 SRR12546982 PRMT7_KD Untreated
## SRR12546984 PRMT7_WT 25741290    0.9680670 SRR12546984 PRMT7_WT Untreated
## SRR12546986 PRMT7_KD 25511001    1.0323176 SRR12546986 PRMT7_KD Untreated
## SRR12546988 PRMT7_WT 25427173    0.9659094 SRR12546988 PRMT7_WT Untreated
## SRR12546990 PRMT7_KD 29730447    1.0402622 SRR12546990 PRMT7_KD Untreated
##                     Run
## SRR12546980 SRR12546980
## SRR12546982 SRR12546982
## SRR12546984 SRR12546984
## SRR12546986 SRR12546986
## SRR12546988 SRR12546988
## SRR12546990 SRR12546990

Design Matrix

The design matrix tells limma which samples belong to which groups and what we want to test. The first factor level (PRMT7_WT) is the reference — fold changes are expressed relative to WT.

  • Intercept: Average expression in PRMT7_WT
  • genotypePRMT7_KD: The log2 fold-change between KD and WT — this is what we are testing
# WT is the reference level (listed first)
DGE$samples$genotype <- factor(DGE$samples$genotype,
                                levels = c("PRMT7_WT", "PRMT7_KD"))

design <- model.matrix(~genotype, data = DGE$samples)
design
##             (Intercept) genotypePRMT7_KD
## SRR12546980           1                0
## SRR12546982           1                1
## SRR12546984           1                0
## SRR12546986           1                1
## SRR12546988           1                0
## SRR12546990           1                1
## attr(,"assign")
## [1] 0 1
## attr(,"contrasts")
## attr(,"contrasts")$genotype
## [1] "contr.treatment"
colnames(design)
## [1] "(Intercept)"      "genotypePRMT7_KD"

voom Transformation

voom converts counts to log2-CPM and calculates a precision weight for each observation based on the mean-variance relationship. The resulting plot should show a smooth downward-sloping red curve — highly expressed genes have more stable variance and receive higher weights in the model.

# normalize.method = "none" because TMM was already applied in Part 1
v <- voom(DGE,
          design           = design,
          normalize.method = "none",
          plot             = TRUE)


Fit Linear Model

lmFit fits a linear model to each gene. eBayes applies empirical Bayes moderation, which borrows information across genes to stabilize variance estimates — particularly important for small sample sizes.

fit <- lmFit(v, design)
fit <- eBayes(fit, robust = TRUE)

# Summary of DE genes at FDR < 0.05
# Down = lower in KD, Up = higher in KD
summary(decideTests(fit, adjust.method = "fdr", p.value = 0.05, lfc = 1))
##        (Intercept) genotypePRMT7_KD
## Down           903              779
## NotSig        1660            11569
## Up           11220             1435

Extract Results

topTable extracts the full results table. coef = 2 refers to the genotypePRMT7_KD coefficient — the KD vs WT comparison.

top_genes_strict <- topTable(fit, 
                             coef = 2, 
                             adjust.method = "BH", 
                             number = 20, 
                             sort.by = "P", 
                             lfc = 1)
cat("\nTop 20 genes (FDR < 0.05, |logFC| > 1):\n")
## 
## Top 20 genes (FDR < 0.05, |logFC| > 1):
print(top_genes_strict)
##                   logFC  AveExpr         t      P.Value    adj.P.Val        B
## Sema6a         2.642542 7.299770  58.82438 2.279533e-18 1.804907e-14 32.44928
## Snhg12         3.575451 5.746345  58.24992 2.619034e-18 1.804907e-14 31.79379
## Lgals3         2.949807 6.069722  53.30723 9.177189e-18 4.216307e-14 30.94777
## Kctd12         2.093207 6.834837  51.66017 1.430064e-17 4.927641e-14 30.69986
## Cdca3         -2.080032 6.749328 -49.52941 2.592996e-17 6.263685e-14 30.13553
## Fabp5         -2.076326 8.060723 -49.35342 2.726700e-17 6.263685e-14 30.11828
## Cip2a         -2.220172 6.408930 -48.69871 3.292701e-17 6.483329e-14 29.88244
## Synpo          2.071817 7.982858  47.69763 4.415029e-17 7.606544e-14 29.64361
## Tmem140        2.957815 4.869355  47.20868 5.106380e-17 7.820138e-14 28.94670
## Samd9l         2.293150 6.601248  46.53788 6.249573e-17 8.303737e-14 29.26317
## 2410006H16Rik  2.657524 7.620666  50.08110 6.674049e-17 8.303737e-14 29.23842
## Ccnb1         -1.898817 6.535109 -46.06006 7.229547e-17 8.303737e-14 29.13535
## Irgm1          2.280746 6.589148  45.09932 9.734097e-17 8.968323e-14 28.83338
## Hmgb2         -1.945442 6.504549 -45.09626 9.743426e-17 8.968323e-14 28.84261
## Tapbp          1.949993 6.206750  45.09076 9.760201e-17 8.968323e-14 28.81601
## Stat1          2.143216 6.895095  44.69852 1.104060e-16 9.510791e-14 28.72430
## Idh2          -1.680399 7.782923 -43.79112 1.474470e-16 1.195449e-13 28.45038
## Phgdh         -1.914652 6.973673 -43.40912 1.668398e-16 1.251901e-13 28.32335
## Ablim1         1.722907 6.713781  43.30520 1.725757e-16 1.251901e-13 28.28466
## Rgs16          2.480363 5.414548  43.13983 1.821425e-16 1.255235e-13 28.09178
cat("Total genes tested:    ", nrow(top_genes_strict), "\n")
## Total genes tested:     20
cat("DE genes (FDR < 0.05):", sum(top_genes_strict$adj.P.Val < 0.05), "\n")
## DE genes (FDR < 0.05): 20
cat("Up in KD:             ", sum(top_genes_strict$adj.P.Val < 0.05 & top_genes_strict$logFC > 0), "\n")
## Up in KD:              13
cat("Down in KD:           ", sum(top_genes_strict$adj.P.Val < 0.05 & top_genes_strict$logFC < 0), "\n\n")
## Down in KD:            7
# Top 20 by p-value
topTable(fit, coef = 2, adjust.method = "BH", number = 20, sort.by = "P", lfc = 1)
##                   logFC  AveExpr         t      P.Value    adj.P.Val        B
## Sema6a         2.642542 7.299770  58.82438 2.279533e-18 1.804907e-14 32.44928
## Snhg12         3.575451 5.746345  58.24992 2.619034e-18 1.804907e-14 31.79379
## Lgals3         2.949807 6.069722  53.30723 9.177189e-18 4.216307e-14 30.94777
## Kctd12         2.093207 6.834837  51.66017 1.430064e-17 4.927641e-14 30.69986
## Cdca3         -2.080032 6.749328 -49.52941 2.592996e-17 6.263685e-14 30.13553
## Fabp5         -2.076326 8.060723 -49.35342 2.726700e-17 6.263685e-14 30.11828
## Cip2a         -2.220172 6.408930 -48.69871 3.292701e-17 6.483329e-14 29.88244
## Synpo          2.071817 7.982858  47.69763 4.415029e-17 7.606544e-14 29.64361
## Tmem140        2.957815 4.869355  47.20868 5.106380e-17 7.820138e-14 28.94670
## Samd9l         2.293150 6.601248  46.53788 6.249573e-17 8.303737e-14 29.26317
## 2410006H16Rik  2.657524 7.620666  50.08110 6.674049e-17 8.303737e-14 29.23842
## Ccnb1         -1.898817 6.535109 -46.06006 7.229547e-17 8.303737e-14 29.13535
## Irgm1          2.280746 6.589148  45.09932 9.734097e-17 8.968323e-14 28.83338
## Hmgb2         -1.945442 6.504549 -45.09626 9.743426e-17 8.968323e-14 28.84261
## Tapbp          1.949993 6.206750  45.09076 9.760201e-17 8.968323e-14 28.81601
## Stat1          2.143216 6.895095  44.69852 1.104060e-16 9.510791e-14 28.72430
## Idh2          -1.680399 7.782923 -43.79112 1.474470e-16 1.195449e-13 28.45038
## Phgdh         -1.914652 6.973673 -43.40912 1.668398e-16 1.251901e-13 28.32335
## Ablim1         1.722907 6.713781  43.30520 1.725757e-16 1.251901e-13 28.28466
## Rgs16          2.480363 5.414548  43.13983 1.821425e-16 1.255235e-13 28.09178

Volcano Plot

A volcano plot shows log-fold change (x-axis) against statistical significance (y-axis). Genes in the upper corners are both large in effect size and highly significant. Dashed lines mark a 2-fold change threshold and p = 0.05.

# All results for visualization
all_results <- topTable(fit,
                        coef = 2,
                        adjust.method = "BH",
                        number = Inf,
                        sort.by = "none")

all_results$category <- case_when(
  all_results$adj.P.Val < 0.05 & all_results$logFC >  1 ~ "Up (>2-fold)",
  all_results$adj.P.Val < 0.05 & all_results$logFC < -1 ~ "Down (>2-fold)",
  all_results$adj.P.Val < 0.05                           ~ "Significant",
  TRUE                                                    ~ "Not Significant"
)

table(all_results$category)
## 
##  Down (>2-fold) Not Significant     Significant    Up (>2-fold) 
##             779            3593            7976            1435
all_results$label <- ifelse(
  rownames(all_results) %in% rownames(topTable(fit, coef = 2, number = 10, lfc = 1)),
  rownames(all_results),
  NA
)

ggplot(all_results, aes(x = logFC, y = -log10(P.Value), color = category)) +
  geom_point(alpha = 0.5, size = 1) +
  scale_color_manual(values = c("Up (>2-fold)" = "red",
                                "Down (>2-fold)" = "blue",
                                "Significant" = "orange",
                                "Not Significant" = "grey80")) +
  geom_vline(xintercept = c(-1, 1), linetype = "dashed", alpha = 0.5) +
  geom_hline(yintercept = -log10(0.05), linetype = "dashed", alpha = 0.5) +
  labs(title = "Volcano Plot: PRMT7 KD vs WT",
       subtitle = "Thresholds: FDR < 0.05, |logFC| > 1 (2-fold change)",
       x = "Log2 Fold Change",
       y = "-Log10 P-value",
       color = "Category") +
  geom_text_repel(aes(label = label), size = 3, max.overlaps = 20, na.rm = TRUE)

  theme_minimal() +
  theme(legend.position = "right")
## <theme> List of 144
##  $ line                            : <ggplot2::element_line>
##   ..@ colour       : chr "black"
##   ..@ linewidth    : num 0.5
##   ..@ linetype     : num 1
##   ..@ lineend      : chr "butt"
##   ..@ linejoin     : chr "round"
##   ..@ arrow        : logi FALSE
##   ..@ arrow.fill   : chr "black"
##   ..@ inherit.blank: logi TRUE
##  $ rect                            : <ggplot2::element_rect>
##   ..@ fill         : chr "white"
##   ..@ colour       : chr "black"
##   ..@ linewidth    : num 0.5
##   ..@ linetype     : num 1
##   ..@ linejoin     : chr "round"
##   ..@ inherit.blank: logi TRUE
##  $ text                            : <ggplot2::element_text>
##   ..@ family       : chr ""
##   ..@ face         : chr "plain"
##   ..@ italic       : chr NA
##   ..@ fontweight   : num NA
##   ..@ fontwidth    : num NA
##   ..@ colour       : chr "black"
##   ..@ size         : num 11
##   ..@ hjust        : num 0.5
##   ..@ vjust        : num 0.5
##   ..@ angle        : num 0
##   ..@ lineheight   : num 0.9
##   ..@ margin       : <ggplot2::margin> num [1:4] 0 0 0 0
##   ..@ debug        : logi FALSE
##   ..@ inherit.blank: logi TRUE
##  $ title                           : <ggplot2::element_text>
##   ..@ family       : NULL
##   ..@ face         : NULL
##   ..@ italic       : chr NA
##   ..@ fontweight   : num NA
##   ..@ fontwidth    : num NA
##   ..@ colour       : NULL
##   ..@ size         : NULL
##   ..@ hjust        : NULL
##   ..@ vjust        : NULL
##   ..@ angle        : NULL
##   ..@ lineheight   : NULL
##   ..@ margin       : NULL
##   ..@ debug        : NULL
##   ..@ inherit.blank: logi TRUE
##  $ point                           : <ggplot2::element_point>
##   ..@ colour       : chr "black"
##   ..@ shape        : num 19
##   ..@ size         : num 1.5
##   ..@ fill         : chr "white"
##   ..@ stroke       : num 0.5
##   ..@ inherit.blank: logi TRUE
##  $ polygon                         : <ggplot2::element_polygon>
##   ..@ fill         : chr "white"
##   ..@ colour       : chr "black"
##   ..@ linewidth    : num 0.5
##   ..@ linetype     : num 1
##   ..@ linejoin     : chr "round"
##   ..@ inherit.blank: logi TRUE
##  $ geom                            : <ggplot2::element_geom>
##   ..@ ink        : chr "black"
##   ..@ paper      : chr "white"
##   ..@ accent     : chr "#3366FF"
##   ..@ linewidth  : num 0.5
##   ..@ borderwidth: num 0.5
##   ..@ linetype   : int 1
##   ..@ bordertype : int 1
##   ..@ family     : chr ""
##   ..@ fontsize   : num 3.87
##   ..@ pointsize  : num 1.5
##   ..@ pointshape : num 19
##   ..@ colour     : NULL
##   ..@ fill       : NULL
##  $ spacing                         : 'simpleUnit' num 5.5points
##   ..- attr(*, "unit")= int 8
##  $ margins                         : <ggplot2::margin> num [1:4] 5.5 5.5 5.5 5.5
##  $ aspect.ratio                    : NULL
##  $ axis.title                      : NULL
##  $ axis.title.x                    : <ggplot2::element_text>
##   ..@ family       : NULL
##   ..@ face         : NULL
##   ..@ italic       : chr NA
##   ..@ fontweight   : num NA
##   ..@ fontwidth    : num NA
##   ..@ colour       : NULL
##   ..@ size         : NULL
##   ..@ hjust        : NULL
##   ..@ vjust        : num 1
##   ..@ angle        : NULL
##   ..@ lineheight   : NULL
##   ..@ margin       : <ggplot2::margin> num [1:4] 2.75 0 0 0
##   ..@ debug        : NULL
##   ..@ inherit.blank: logi TRUE
##  $ axis.title.x.top                : <ggplot2::element_text>
##   ..@ family       : NULL
##   ..@ face         : NULL
##   ..@ italic       : chr NA
##   ..@ fontweight   : num NA
##   ..@ fontwidth    : num NA
##   ..@ colour       : NULL
##   ..@ size         : NULL
##   ..@ hjust        : NULL
##   ..@ vjust        : num 0
##   ..@ angle        : NULL
##   ..@ lineheight   : NULL
##   ..@ margin       : <ggplot2::margin> num [1:4] 0 0 2.75 0
##   ..@ debug        : NULL
##   ..@ inherit.blank: logi TRUE
##  $ axis.title.x.bottom             : NULL
##  $ axis.title.y                    : <ggplot2::element_text>
##   ..@ family       : NULL
##   ..@ face         : NULL
##   ..@ italic       : chr NA
##   ..@ fontweight   : num NA
##   ..@ fontwidth    : num NA
##   ..@ colour       : NULL
##   ..@ size         : NULL
##   ..@ hjust        : NULL
##   ..@ vjust        : num 1
##   ..@ angle        : num 90
##   ..@ lineheight   : NULL
##   ..@ margin       : <ggplot2::margin> num [1:4] 0 2.75 0 0
##   ..@ debug        : NULL
##   ..@ inherit.blank: logi TRUE
##  $ axis.title.y.left               : NULL
##  $ axis.title.y.right              : <ggplot2::element_text>
##   ..@ family       : NULL
##   ..@ face         : NULL
##   ..@ italic       : chr NA
##   ..@ fontweight   : num NA
##   ..@ fontwidth    : num NA
##   ..@ colour       : NULL
##   ..@ size         : NULL
##   ..@ hjust        : NULL
##   ..@ vjust        : num 1
##   ..@ angle        : num -90
##   ..@ lineheight   : NULL
##   ..@ margin       : <ggplot2::margin> num [1:4] 0 0 0 2.75
##   ..@ debug        : NULL
##   ..@ inherit.blank: logi TRUE
##  $ axis.text                       : <ggplot2::element_text>
##   ..@ family       : NULL
##   ..@ face         : NULL
##   ..@ italic       : chr NA
##   ..@ fontweight   : num NA
##   ..@ fontwidth    : num NA
##   ..@ colour       : chr "#4D4D4DFF"
##   ..@ size         : 'rel' num 0.8
##   ..@ hjust        : NULL
##   ..@ vjust        : NULL
##   ..@ angle        : NULL
##   ..@ lineheight   : NULL
##   ..@ margin       : NULL
##   ..@ debug        : NULL
##   ..@ inherit.blank: logi TRUE
##  $ axis.text.x                     : <ggplot2::element_text>
##   ..@ family       : NULL
##   ..@ face         : NULL
##   ..@ italic       : chr NA
##   ..@ fontweight   : num NA
##   ..@ fontwidth    : num NA
##   ..@ colour       : NULL
##   ..@ size         : NULL
##   ..@ hjust        : NULL
##   ..@ vjust        : num 1
##   ..@ angle        : NULL
##   ..@ lineheight   : NULL
##   ..@ margin       : <ggplot2::margin> num [1:4] 2.2 0 0 0
##   ..@ debug        : NULL
##   ..@ inherit.blank: logi TRUE
##  $ axis.text.x.top                 : <ggplot2::element_text>
##   ..@ family       : NULL
##   ..@ face         : NULL
##   ..@ italic       : chr NA
##   ..@ fontweight   : num NA
##   ..@ fontwidth    : num NA
##   ..@ colour       : NULL
##   ..@ size         : NULL
##   ..@ hjust        : NULL
##   ..@ vjust        : NULL
##   ..@ angle        : NULL
##   ..@ lineheight   : NULL
##   ..@ margin       : <ggplot2::margin> num [1:4] 0 0 4.95 0
##   ..@ debug        : NULL
##   ..@ inherit.blank: logi TRUE
##  $ axis.text.x.bottom              : <ggplot2::element_text>
##   ..@ family       : NULL
##   ..@ face         : NULL
##   ..@ italic       : chr NA
##   ..@ fontweight   : num NA
##   ..@ fontwidth    : num NA
##   ..@ colour       : NULL
##   ..@ size         : NULL
##   ..@ hjust        : NULL
##   ..@ vjust        : NULL
##   ..@ angle        : NULL
##   ..@ lineheight   : NULL
##   ..@ margin       : <ggplot2::margin> num [1:4] 4.95 0 0 0
##   ..@ debug        : NULL
##   ..@ inherit.blank: logi TRUE
##  $ axis.text.y                     : <ggplot2::element_text>
##   ..@ family       : NULL
##   ..@ face         : NULL
##   ..@ italic       : chr NA
##   ..@ fontweight   : num NA
##   ..@ fontwidth    : num NA
##   ..@ colour       : NULL
##   ..@ size         : NULL
##   ..@ hjust        : num 1
##   ..@ vjust        : NULL
##   ..@ angle        : NULL
##   ..@ lineheight   : NULL
##   ..@ margin       : <ggplot2::margin> num [1:4] 0 2.2 0 0
##   ..@ debug        : NULL
##   ..@ inherit.blank: logi TRUE
##  $ axis.text.y.left                : <ggplot2::element_text>
##   ..@ family       : NULL
##   ..@ face         : NULL
##   ..@ italic       : chr NA
##   ..@ fontweight   : num NA
##   ..@ fontwidth    : num NA
##   ..@ colour       : NULL
##   ..@ size         : NULL
##   ..@ hjust        : NULL
##   ..@ vjust        : NULL
##   ..@ angle        : NULL
##   ..@ lineheight   : NULL
##   ..@ margin       : <ggplot2::margin> num [1:4] 0 4.95 0 0
##   ..@ debug        : NULL
##   ..@ inherit.blank: logi TRUE
##  $ axis.text.y.right               : <ggplot2::element_text>
##   ..@ family       : NULL
##   ..@ face         : NULL
##   ..@ italic       : chr NA
##   ..@ fontweight   : num NA
##   ..@ fontwidth    : num NA
##   ..@ colour       : NULL
##   ..@ size         : NULL
##   ..@ hjust        : NULL
##   ..@ vjust        : NULL
##   ..@ angle        : NULL
##   ..@ lineheight   : NULL
##   ..@ margin       : <ggplot2::margin> num [1:4] 0 0 0 4.95
##   ..@ debug        : NULL
##   ..@ inherit.blank: logi TRUE
##  $ axis.text.theta                 : NULL
##  $ axis.text.r                     : <ggplot2::element_text>
##   ..@ family       : NULL
##   ..@ face         : NULL
##   ..@ italic       : chr NA
##   ..@ fontweight   : num NA
##   ..@ fontwidth    : num NA
##   ..@ colour       : NULL
##   ..@ size         : NULL
##   ..@ hjust        : num 0.5
##   ..@ vjust        : NULL
##   ..@ angle        : NULL
##   ..@ lineheight   : NULL
##   ..@ margin       : <ggplot2::margin> num [1:4] 0 2.2 0 2.2
##   ..@ debug        : NULL
##   ..@ inherit.blank: logi TRUE
##  $ axis.ticks                      : <ggplot2::element_blank>
##  $ axis.ticks.x                    : NULL
##  $ axis.ticks.x.top                : NULL
##  $ axis.ticks.x.bottom             : NULL
##  $ axis.ticks.y                    : NULL
##  $ axis.ticks.y.left               : NULL
##  $ axis.ticks.y.right              : NULL
##  $ axis.ticks.theta                : NULL
##  $ axis.ticks.r                    : NULL
##  $ axis.minor.ticks.x.top          : NULL
##  $ axis.minor.ticks.x.bottom       : NULL
##  $ axis.minor.ticks.y.left         : NULL
##  $ axis.minor.ticks.y.right        : NULL
##  $ axis.minor.ticks.theta          : NULL
##  $ axis.minor.ticks.r              : NULL
##  $ axis.ticks.length               : 'rel' num 0.5
##  $ axis.ticks.length.x             : NULL
##  $ axis.ticks.length.x.top         : NULL
##  $ axis.ticks.length.x.bottom      : NULL
##  $ axis.ticks.length.y             : NULL
##  $ axis.ticks.length.y.left        : NULL
##  $ axis.ticks.length.y.right       : NULL
##  $ axis.ticks.length.theta         : NULL
##  $ axis.ticks.length.r             : NULL
##  $ axis.minor.ticks.length         : 'rel' num 0.75
##  $ axis.minor.ticks.length.x       : NULL
##  $ axis.minor.ticks.length.x.top   : NULL
##  $ axis.minor.ticks.length.x.bottom: NULL
##  $ axis.minor.ticks.length.y       : NULL
##  $ axis.minor.ticks.length.y.left  : NULL
##  $ axis.minor.ticks.length.y.right : NULL
##  $ axis.minor.ticks.length.theta   : NULL
##  $ axis.minor.ticks.length.r       : NULL
##  $ axis.line                       : <ggplot2::element_blank>
##  $ axis.line.x                     : NULL
##  $ axis.line.x.top                 : NULL
##  $ axis.line.x.bottom              : NULL
##  $ axis.line.y                     : NULL
##  $ axis.line.y.left                : NULL
##  $ axis.line.y.right               : NULL
##  $ axis.line.theta                 : NULL
##  $ axis.line.r                     : NULL
##  $ legend.background               : <ggplot2::element_blank>
##  $ legend.margin                   : NULL
##  $ legend.spacing                  : 'rel' num 2
##  $ legend.spacing.x                : NULL
##  $ legend.spacing.y                : NULL
##  $ legend.key                      : <ggplot2::element_blank>
##  $ legend.key.size                 : 'simpleUnit' num 1.2lines
##   ..- attr(*, "unit")= int 3
##  $ legend.key.height               : NULL
##  $ legend.key.width                : NULL
##  $ legend.key.spacing              : NULL
##  $ legend.key.spacing.x            : NULL
##  $ legend.key.spacing.y            : NULL
##  $ legend.key.justification        : NULL
##  $ legend.frame                    : NULL
##  $ legend.ticks                    : NULL
##  $ legend.ticks.length             : 'rel' num 0.2
##  $ legend.axis.line                : NULL
##  $ legend.text                     : <ggplot2::element_text>
##   ..@ family       : NULL
##   ..@ face         : NULL
##   ..@ italic       : chr NA
##   ..@ fontweight   : num NA
##   ..@ fontwidth    : num NA
##   ..@ colour       : NULL
##   ..@ size         : 'rel' num 0.8
##   ..@ hjust        : NULL
##   ..@ vjust        : NULL
##   ..@ angle        : NULL
##   ..@ lineheight   : NULL
##   ..@ margin       : NULL
##   ..@ debug        : NULL
##   ..@ inherit.blank: logi TRUE
##  $ legend.text.position            : NULL
##  $ legend.title                    : <ggplot2::element_text>
##   ..@ family       : NULL
##   ..@ face         : NULL
##   ..@ italic       : chr NA
##   ..@ fontweight   : num NA
##   ..@ fontwidth    : num NA
##   ..@ colour       : NULL
##   ..@ size         : NULL
##   ..@ hjust        : num 0
##   ..@ vjust        : NULL
##   ..@ angle        : NULL
##   ..@ lineheight   : NULL
##   ..@ margin       : NULL
##   ..@ debug        : NULL
##   ..@ inherit.blank: logi TRUE
##  $ legend.title.position           : NULL
##  $ legend.position                 : chr "right"
##  $ legend.position.inside          : NULL
##  $ legend.direction                : NULL
##  $ legend.byrow                    : NULL
##  $ legend.justification            : chr "center"
##  $ legend.justification.top        : NULL
##  $ legend.justification.bottom     : NULL
##  $ legend.justification.left       : NULL
##  $ legend.justification.right      : NULL
##  $ legend.justification.inside     : NULL
##   [list output truncated]
##  @ complete: logi TRUE
##  @ validate: logi TRUE
ggsave(here(figures_dir, "volcano_plot.png"), width = 8, height = 6, dpi = 300)

MA Plot

An MA plot shows log-fold change (y-axis) against average expression (x-axis). The bulk of genes should sit around logFC = 0, with DE genes scattered above and below. A systematic trend would suggest a normalization problem.

ggplot(all_results, aes(x = AveExpr, y = logFC, color = category)) +
  geom_point(alpha = 0.5, size = 1) +
  scale_color_manual(values = c("Up (>2-fold)" = "red",
                                "Down (>2-fold)" = "blue",
                                "Significant" = "orange",
                                "Not Significant" = "grey80")) +
  geom_hline(yintercept = 0, linetype = "solid", alpha = 0.5) +
  geom_hline(yintercept = c(-1, 1), linetype = "dashed", alpha = 0.3) +
  labs(title = "MA Plot: PRMT7 KD vs WT",
       subtitle = "Thresholds: FDR < 0.05, |logFC| > 1 (2-fold change)",
       x = "Average Expression (log2-CPM)",
       y = "Log2 Fold Change",
       color = "Category") +
  theme_minimal()

ggsave(here(figures_dir, "ma_plot.png"), width = 8, height = 6, dpi = 300)

Heatmap of Top 50 DE Genes

Expression values are z-scored per gene (row) so that expression patterns are comparable across genes regardless of absolute expression level. WT and KD samples should separate clearly if the DE signal is strong.

top50        <- rownames(topTable(fit, coef = 2, number = 50, lfc = 1))
top50_logcpm <- cpm(DGE[top50, ], log = TRUE)
top50_scaled <- t(scale(t(top50_logcpm)))  # Z-score per gene

annotation_col <- data.frame(
  Genotype  = DGE$samples$genotype,
  row.names = colnames(top50_logcpm)
)

# Display in document
pheatmap(top50_scaled,
         annotation_col = annotation_col,
         cluster_rows   = TRUE,
         cluster_cols   = TRUE,
         show_rownames  = TRUE,
         show_colnames  = TRUE,
         main           = "Top 50 DE Genes",
         fontsize_row   = 6,
         fontsize_col   = 8,
         color          = colorRampPalette(c("blue", "white", "red"))(100))

# Save to file
pheatmap(top50_scaled,
         annotation_col = annotation_col,
         cluster_rows   = TRUE,
         cluster_cols   = TRUE,
         show_rownames  = TRUE,
         show_colnames  = TRUE,
         main           = "Top 50 DE Genes",
         fontsize_row   = 6,
         fontsize_col   = 8,
         color          = colorRampPalette(c("blue", "white", "red"))(100),
         filename       = here(figures_dir, "heatmap_top50.png"),
         width          = 8,
         height         = 10)

Export Results

all_results$gene <- rownames(all_results)

write.csv(all_results,
          here(results_dir, "de_results_all.csv"),
          row.names = FALSE)

# Significant genes only
sig_results <- all_results %>% filter(adj.P.Val < 0.05 & abs(logFC) > 1)

write.csv(sig_results,
          here(results_dir, "de_results_significant.csv"),
          row.names = FALSE)

cat("✓ de_results_all.csv:        ", nrow(all_results), "genes\n")
## ✓ de_results_all.csv:         13783 genes
cat("✓ de_results_significant.csv:", nrow(sig_results), "genes\n")
## ✓ de_results_significant.csv: 2214 genes

Session Information

writeLines(capture.output(sessionInfo()),
           here(results_dir, "sessionInfo_limma.txt"))
sessionInfo()
## R version 4.5.3 (2026-03-11)
## Platform: x86_64-pc-linux-gnu
## Running under: Ubuntu 24.04.4 LTS
## 
## Matrix products: default
## BLAS:   /usr/lib/x86_64-linux-gnu/openblas-pthread/libblas.so.3 
## LAPACK: /usr/lib/x86_64-linux-gnu/openblas-pthread/libopenblasp-r0.3.26.so;  LAPACK version 3.12.0
## 
## locale:
##  [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
##  [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8    
##  [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8   
##  [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                 
##  [9] LC_ADDRESS=C               LC_TELEPHONE=C            
## [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       
## 
## time zone: America/New_York
## tzcode source: system (glibc)
## 
## attached base packages:
## [1] stats     graphics  grDevices utils     datasets  methods   base     
## 
## other attached packages:
##  [1] here_1.0.2         RColorBrewer_1.1-3 pheatmap_1.0.13    ggrepel_0.9.8     
##  [5] lubridate_1.9.5    forcats_1.0.1      stringr_1.6.0      dplyr_1.2.0       
##  [9] purrr_1.2.1        readr_2.2.0        tidyr_1.3.2        tibble_3.3.1      
## [13] ggplot2_4.0.2      tidyverse_2.0.0    edgeR_4.8.2        limma_3.66.0      
## 
## loaded via a namespace (and not attached):
##  [1] sass_0.4.10       generics_0.1.4    stringi_1.8.7     lattice_0.22-9   
##  [5] hms_1.1.4         digest_0.6.39     magrittr_2.0.4    evaluate_1.0.5   
##  [9] grid_4.5.3        timechange_0.4.0  fastmap_1.2.0     rprojroot_2.1.1  
## [13] jsonlite_2.0.0    scales_1.4.0      textshaping_1.0.5 jquerylib_0.1.4  
## [17] cli_3.6.5         rlang_1.1.7       withr_3.0.2       cachem_1.1.0     
## [21] yaml_2.3.12       otel_0.2.0        tools_4.5.3       tzdb_0.5.0       
## [25] locfit_1.5-9.12   vctrs_0.7.2       R6_2.6.1          lifecycle_1.0.5  
## [29] ragg_1.5.2        pkgconfig_2.0.3   pillar_1.11.1     bslib_0.10.0     
## [33] gtable_0.3.6      Rcpp_1.1.1        glue_1.8.0        systemfonts_1.3.2
## [37] statmod_1.5.1     xfun_0.57         tidyselect_1.2.1  rstudioapi_0.18.0
## [41] knitr_1.51        dichromat_2.0-0.1 farver_2.1.2      htmltools_0.5.9  
## [45] labeling_0.4.3    rmarkdown_2.30    compiler_4.5.3    S7_0.2.1