MicrobiomeStatPlots | Kernel density plot tutorial

Yongxin Liu (刘永鑫, Adam)

Published on 2025-06-15 07:03:09

License: CC 4.0 BY-SA

Article link: https://blog.csdn.net/woodcorpse/article/details/148679137

Video walkthrough

Watch online: https://www.bilibili.com/video/BV1XjMBzeEVk/

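Introduction to kernel density plots

A kernel density plot is a graphical tool for estimating the distribution of data: it smooths the data points into a continuous probability density function that shows how the data are distributed. It is smoother than a histogram because it does not depend on a particular choice of bins. Kernel density estimation is built on a kernel function, usually the Gaussian kernel: a smooth curve is placed around each data point, and the curves are summed (and scaled so that the total area is 1) to give the overall density estimate. Kernel density plots show distributional features well, such as multimodality and tail behavior. Advantages: a smooth estimate of the data distribution; no dependence on histogram binning; a clearer view of multimodality. Disadvantages: results can differ considerably depending on the bandwidth parameter; the estimate is sensitive to extreme values.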
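The "one curve per data point, then sum" idea can be sketched in a few lines of base R. This is only an illustration under stated assumptions: the sample is simulated (it is not the tutorial's data), and `kde_manual` and `count_peaks` are hypothetical helpers written for this sketch. `bw.nrd0()` and `density()` are the standard functions from the stats package.

```r
# Simulated bimodal sample (illustrative values only, not the tutorial data)
set.seed(1)
x <- c(rnorm(100, mean = 30, sd = 5), rnorm(100, mean = 60, sd = 8))

# Hypothetical helper: Gaussian KDE as the average of normal curves,
# one centered on each data point, with standard deviation h (the bandwidth)
kde_manual <- function(x, grid, h) {
  vapply(grid, function(g) mean(dnorm(g, mean = x, sd = h)), numeric(1))
}

h <- bw.nrd0(x)  # the default bandwidth rule used by density()
grid <- seq(min(x) - 3 * h, max(x) + 3 * h, length.out = 512)
manual <- kde_manual(x, grid, h)

# Built-in estimate evaluated on the same grid
builtin <- density(x, bw = h, from = min(grid), to = max(grid), n = 512)

# The two curves should agree closely; density() uses a binned FFT
# approximation, so small numerical differences are expected
max(abs(manual - builtin$y))

# Bandwidth sensitivity: the same data with a narrow and a wide bandwidth
narrow <- density(x, adjust = 0.3)
wide <- density(x, adjust = 3)

# Hypothetical helper: count local maxima of a curve
count_peaks <- function(y) sum(diff(sign(diff(y))) == -2)
count_peaks(narrow$y)
count_peaks(wide$y)

plot(builtin, main = "Effect of bandwidth", xlab = "x")
lines(narrow, lty = 2)
lines(wide, lty = 3)
legend("topright", legend = c("default", "adjust = 0.3", "adjust = 3"),
       lty = 1:3)
```

A narrow bandwidth tends to produce spurious bumps, while a wide one can merge genuinely separate peaks, which is why the bandwidth choice is listed among the disadvantages above.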

Tags: #Microbiome data analysis #MicrobiomeStatPlots #Kernel density plot #R visualization #Kernel Density plot

Authors: First draft: Defeng Bai (白德凤); Proofreading: Chuang Ma (马闯) and Jiani Xun (荀佳妮); Text tutorial: Defeng Bai (白德凤)

Citation: Defeng Bai, Chuang Ma, Jiani Xun, Hao Luo, Haifei Yang, Hujie Lyu, et al. 2025. "MicrobiomeStatPlots: Microbiome statistics plotting gallery for meta-omics and bioinformatics." iMeta 4: e70002. https://doi.org/10.1002/imt2.70002

Source code and test data:

https://github.com/YongxinLiu/MicrobiomeStatPlot/ (project directory: 3.Visualization_and_interpretation/KernalDensityPlot)

Alternatively, reply "MicrobiomeStatPlot" to the WeChat official account to receive them.

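Kernel density plot example

This example comes from a 2024 Nature Medicine paper by Dong D. Wang's team at Harvard Medical School, which uses a kernel density plot to show the age distribution of the samples. The paper is titled: Strain-specific gut microbial signatures in type 2 diabetes identified in a cross-cohort analysis of 8,117 metagenomes. https://doi.org/10.1038/s41591-024-03067-7.

Kernel density plots in R: hands-on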

Package installation

# Install the R packages from CRAN if they are missing, then load them
p_list <- c("ggplot2", "ggridges")
for (p in p_list) {
  if (!requireNamespace(p, quietly = TRUE)) {
    install.packages(p)
  }
  library(p, character.only = TRUE, quietly = TRUE, warn.conflicts = FALSE)
}

# Load the packages
suppressWarnings(suppressMessages(library(ggplot2)))
suppressWarnings(suppressMessages(library(ggridges)))

Hands-on

# Read the CSV file
# The data come from the Nature Medicine paper
data <- read.csv("data/data01.csv", header = TRUE, sep = ",")
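If data/data01.csv is not at hand, the remaining steps can still be tried on a stand-in data frame. The sketch below assumes only what the plotting code in this tutorial uses, namely a numeric `age` column and a `study` column. The simulated ages and the per-study sample size are invented for illustration and do not reproduce the published cohorts.

```r
set.seed(42)

# Study labels as they appear in this tutorial's plotting code
study_names <- c("DIRECT-PLUS (ISR)", "Fromentin_2022 (DEU/DNK/FRA)", "HPFS (USA)",
                 "Karlsson_2013 (SWE)", "NHSII (USA)", "Qin_2012 (CHN)",
                 "SOL (USA)", "Wu_2020a (SWE)", "Wu_2020b (SWE)", "Zhong_2019 (CHN)")

n_per_study <- 50  # hypothetical sample size per study

# Simulate ages with a different (made-up) mean for each study
sim <- data.frame(
  study = rep(study_names, each = n_per_study),
  age = unlist(lapply(seq_along(study_names), function(i) {
    rnorm(n_per_study, mean = 40 + 3 * i, sd = 8)
  }))
)

# Clamp ages to a plausible adult range
sim$age <- pmin(pmax(sim$age, 18), 90)

# Inspect the structure and the number of rows per study
str(sim)
table(sim$study)

# To run the plotting code on the stand-in data, use: data <- sim
```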
# Create the color mapping
studycol <- c("#466983", "#339900", "#837B8D",
              "#809900", "#5DB1DD", "#802268",
              "#CC9900", "#CE3D32", "#C75127", "#996600")
study_names <- c("DIRECT-PLUS (ISR)", "Fromentin_2022 (DEU/DNK/FRA)", "HPFS (USA)",
                 "Karlsson_2013 (SWE)", "NHSII (USA)", "Qin_2012 (CHN)",
                 "SOL (USA)", "Wu_2020a (SWE)", "Wu_2020b (SWE)", "Zhong_2019 (CHN)")
names(studycol) <- study_names

# Convert the study column to a factor and specify the order
data$study <- factor(data$study, levels = study_names)

# Optionally sort the data in the specified order
# data <- data[order(data$study, decreasing = FALSE), ]

# Create the plot with the legend on the right (the default position)
# The ridge order follows the factor levels of study, so no extra ordering argument is needed
p <- ggplot(data, aes(x = age, y = study, fill = study)) +
  geom_density_ridges() +
  scale_fill_manual(values = studycol) +
  theme_minimal()
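The appearance of the ridges depends strongly on a few `geom_density_ridges()` settings. The sketch below shows them on a small simulated data frame; the `toy` data, the group labels, and the specific parameter values are assumptions chosen for illustration, not settings taken from the original figure.

```r
library(ggplot2)
library(ggridges)

set.seed(1)

# Hypothetical toy data: three groups with different age distributions
toy <- data.frame(
  group = factor(rep(c("A", "B", "C"), each = 100), levels = c("A", "B", "C")),
  age = c(rnorm(100, 35, 6), rnorm(100, 50, 10), rnorm(100, 65, 5))
)

p_toy <- ggplot(toy, aes(x = age, y = group, fill = group)) +
  geom_density_ridges(
    bandwidth = 3,          # fixed kernel bandwidth in x units (here: years)
    scale = 1.5,            # values above 1 let neighboring ridges overlap
    rel_min_height = 0.01,  # cut off the long, nearly flat tails
    alpha = 0.7             # semi-transparent fill so overlaps stay readable
  ) +
  theme_minimal()

print(p_toy)
```

Without an explicit `bandwidth`, ggridges chooses one automatically and reports it in a message. Fixing it makes the figure reproducible and makes the smoothing choice explicit. As in the tutorial code, the vertical order of the ridges follows the factor levels.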
# Place the legend below the plot
# Reverse the factor levels so that the first study is drawn at the top
data$study <- factor(data$study, levels = rev(study_names))

# Create the plot in the specified order
p1 <- ggplot(data, aes(x = age, y = study, fill = study)) +
  geom_density_ridges() +
  scale_fill_manual(values = studycol) +
  theme_bw() +
  scale_x_continuous(breaks = seq(0, 100, by = 20)) +
  labs(x = "Age (years)", y = "Density") +
  theme(legend.position = "bottom",
        panel.grid = element_blank(),
        axis.text.y = element_blank(),
        axis.ticks.y = element_blank(),
        text = element_text(family = "sans")) +
  # Show the legend in three rows, in the original study order
  guides(fill = guide_legend(nrow = 3, byrow = TRUE, reverse = TRUE))

# Save the figure; create the output directory first if it does not exist
dir.create("results", showWarnings = FALSE)
ggsave(paste("results/Kernal_density_plot01", ".pdf", sep = ""), p1,
       width = 55 * 1.5, height = 70 * 1.5, units = "mm")

If you use this script, please cite:

Defeng Bai, Chuang Ma, Jiani Xun, Hao Luo, Haifei Yang, Hujie Lyu, et al. 2025. "MicrobiomeStatPlots: Microbiome statistics plotting gallery for meta-omics and bioinformatics." iMeta 4: e70002. https://doi.org/10.1002/imt2.70002
