您现在的位置是：首页 > 其他

当前栏目

跟着存档教程动手学RNAseq分析（六）：差异分析流程总结

流程教程分析总结差异跟着动手存档

2023-06-13 09:16:26 时间

我们详细介绍了差异表达分析工作流程中的各个步骤，并提供了理论和示例代码。为了给运行DGE分析所需的代码提供更简洁的参考，我们总结了如下分析中的步骤：

使用tximport导入Salmon的基因水平计数数据

# Run tximport
 txi <- tximport(files, type="salmon", tx2gene=t2g, countsFromAbundance = "lengthScaledTPM")
 
 # "files" is a vector wherein each element is the path to the salmon quant.sf file, and each element is named with the name of the sample.
 # "t2g" is a 2 column data frame which contains transcript IDs mapped to geneIDs (in that order)

创建dds对象：

 # Check that the row names of the metadata equal
 # the column names of the **raw counts** data
 all(colnames(txi$counts) == rownames(metadata))
 
 # Create DESeq2Dataset object
 dds <- DESeqDataSetFromTximport(txi, colData = metadata, design = ~ condition)

探索性数据分析(PCA和层次聚类)-识别数据中的异常值和变异来源：

 # Transform counts for data visualization
 rld <- rlog(dds, blind=TRUE)
 
 # Plot PCA 
 plotPCA(rld, intgroup="condition")
 
 # Extract the rlog matrix from the object
 # and compute pairwise correlation values
 rld_mat <- assay(rld)
 rld_cor <- cor(rld_mat)
 
 # Plot heatmap
 pheatmap(rld_cor, annotation = metadata)

运行DESeq2

# **Optional step** - Re-create DESeq2 dataset 
# if the design formula has changed after QC analysis
# in include other sources of variation 
# using "dds <- DESeqDataSetFromTximport(txi, colData = metadata, design = ~ covaraite + condition)"

# Run DESeq2 differential expression analysis
 dds <- DESeq(dds)

# **Optional step** - Output normalized counts 
# to save as a file to access outside RStudio 
# using "normalized_counts <- counts(dds, normalized=TRUE)"

检查离散估计的拟合：

plotDispEsts(dds)

创建对比，对特定条件下的收缩log2折叠变化进行Wald检验：

 # Output results of Wald test for contrast
 contrast <- c("condition", "level_to_compare", "base_level")
 res <- results(dds, contrast = contrast, alpha = 0.05)
 res <- lfcShrink(dds, contrast = contrast, res=res)

输出显著性结果：

 # Set thresholds
 padj.cutoff < - 0.05
 
 # Turn the results object into a tibble for use with tidyverse functions
 res_tbl <- res %>%
               data.frame() %>%
               rownames_to_column(var="gene") %>% 
               as_tibble()
 
 # Subset the significant results
 sig_res <- filter(res_tbl, padj < padj.cutoff)

8. 可视化结果：火山图、热图、top基因的标准化计数图等。

9. 进行分析，提取结果的功能意义：GO或KEGG富集，GSEA等。

猜你喜欢

MySQL数据库介绍与应用：详解MySQL数据库的定义、特点及应用场景（mysql数据库是啥）
MySQL存储二进制数据的技术（mysql二进制数据）
如何解压war包_tomcat自动解压war包
MySQL中unhex函数的用法与实例解析（mysql中unhex）
.NET MVC简单介绍
Nginx 了解一下？
git仓库之gogs安装(docker版/二进制版)
系统分析师案例必备知识点汇总---2023系列文章一
SpringMVC异常处理方式详解编程语言
s in processing big dataNeo4j: Big Data Processing Made Easy（neo4jexcel）
从ASPX到Oracle一个有趣的转变（aspx oracle）
jqueryready(fn)事件使用介绍
什么是路由重分布技术？深入理解重分布配置！
windows中apache301页面跳转实现方法
微搭低代码从入门到精通12-网格布局
科技加持新材料被赋予无限想象
Typecho数据库常用的API，创建/更新/读取/删除

zl程序教程

当前栏目

跟着存档教程动手学RNAseq分析（六）：差异分析流程总结

相关文章