The figure reproduced here comes from the paper
A multi-kingdom collection of 33,804 reference genomes for the human vaginal microbiome
https://www.nature.com/articles/s41564-024-01751-5
Screenshot of part of the example data
Load the required R packages
library(ggplot2)
library(patchwork)
library(readxl)
library(tidyverse)
library(ggpmisc)
library(scatterpie)
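Read the data
# Adjust the path to wherever you saved the supplementary Excel file
fig1b.dat <- read_excel("D:/R_4_1_0_working_directory/env001/2024.data/20240717/41564_2024_1751_MOESM6_ESM.xlsx")
# Fix the order of the quality levels so the manual colors map consistently
fig1b.dat %>%
  mutate(`Genome quality` = factor(`Genome quality`,
                                   levels = c("near-complete", "high-quality", "medium-quality"))) -> fig1b.dat
# Check the column names
fig1b.dat %>% colnames()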
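If the supplementary Excel file is not at hand, the plotting code can still be tried on a simulated stand-in. The sketch below is only an illustration: `toy.dat`, the number of rows, and the cutoffs used to create the three groups are all assumptions made here, not the paper's data or its quality definitions. Only the three column names and the level order are taken from the code above.

```r
library(tibble)

set.seed(1)   # make the simulated values reproducible
n <- 500      # arbitrary number of toy MAGs

# Simulated values only; NOT the paper's data
toy.dat <- tibble(
  `% Completeness`  = runif(n, min = 50, max = 100),
  `% Contamination` = runif(n, min = 0, max = 5)
)

# Arbitrary toy cutoffs, used only to obtain three groups;
# they are not the paper's quality definitions
toy.dat$`Genome quality` <- cut(
  toy.dat$`% Completeness`,
  breaks = c(50, 70, 90, 100),
  labels = c("medium-quality", "high-quality", "near-complete"),
  include.lowest = TRUE
)

# Same level order as in the code above
toy.dat$`Genome quality` <- factor(
  toy.dat$`Genome quality`,
  levels = c("near-complete", "high-quality", "medium-quality")
)

colnames(toy.dat)
table(toy.dat$`Genome quality`)
```

Assigning `toy.dat` to `fig1b.dat` lets the plotting code in the following steps run unchanged, although the axis limits chosen for the real data may then need adjusting.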
Frequency histogram 1 (completeness, top margin)
ggplot(data = fig1b.dat, aes(x = `% Completeness`)) +
  # Stacked histogram, colored by genome quality
  geom_histogram(aes(fill = `Genome quality`),
                 binwidth = 0.5,
                 color = "grey") +
  scale_fill_manual(values = c("#80b1d3", "#fdb461", "#8dd3c7")) +
  # Fix the y axis at 0 to 3,000 with no padding
  scale_y_continuous(limits = c(0, 3000),
                     expand = expansion(mult = c(0, 0)),
                     labels = scales::comma) +
  theme_classic() +
  # Hide the x axis; the scatter plot below provides it
  theme(axis.line.x = element_blank(),
        axis.ticks.x = element_blank(),
        axis.title.x = element_blank(),
        axis.text.x = element_blank()) +
  labs(y = "Number of\nMAGs") +
  theme(legend.position = "none") -> p1
p1
Frequency histogram 2 (contamination, right margin)
# Mapping the variable to y draws the histogram horizontally
ggplot(data = fig1b.dat, aes(y = `% Contamination`)) +
  geom_histogram(aes(fill = `Genome quality`),
                 binwidth = 0.06,
                 color = "grey",
                 linewidth = 0.1) +
  scale_fill_manual(values = c("#80b1d3", "#fdb461", "#8dd3c7")) +
  # The counts are now on the x axis: 0 to 4,000 with no padding
  scale_x_continuous(limits = c(0, 4000),
                     expand = expansion(mult = c(0, 0)),
                     labels = scales::comma) +
  theme_classic() +
  # Hide the y axis; the scatter plot to the left provides it
  theme(axis.line.y = element_blank(),
        axis.ticks.y = element_blank(),
        axis.title.y = element_blank(),
        axis.text.y = element_blank()) +
  labs(x = "Number of MAGs") +
  theme(legend.position = "none") -> p4
p4
Scatter plot
ggplot(data = fig1b.dat, aes(x = `% Completeness`, y = `% Contamination`)) +
  geom_point(aes(color = `Genome quality`)) +
  # Same colors as in the two histograms
  scale_color_manual(values = c("#80b1d3", "#fdb461", "#8dd3c7")) +
  theme_bw() +
  theme(panel.grid = element_blank(),
        legend.position = "none") -> p3
p3
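Combine the three plots
# 2 x 2 grid: histogram 1 top left, empty spacer top right,
# scatter plot bottom left, histogram 2 bottom right
wrap_plots(p1, plot_spacer(), p3, p4) +
  plot_layout(heights = c(1, 4), widths = c(3, 1))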
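One detail that may matter when combining the plots: each plot computes its own axis range, and histogram bins can extend slightly beyond the data, so the marginal bars may not line up exactly with the points. A minimal sketch of one way to force shared ranges with `coord_cartesian()` is given here. It uses simulated stand-in data, and the object names (`toy.dat`, `x.rng`, `y.rng`, `p.top`, `p.right`, `p.scatter`, `p.aligned`) are made up for this illustration.

```r
library(ggplot2)
library(patchwork)

set.seed(1)
# Simulated stand-in values, NOT the paper's data
toy.dat <- data.frame(
  `% Completeness`  = runif(300, min = 50, max = 100),
  `% Contamination` = runif(300, min = 0, max = 5),
  check.names = FALSE   # keep the column names with "%" and spaces
)

# Shared data ranges for the axes that face each other
x.rng <- range(toy.dat$`% Completeness`)
y.rng <- range(toy.dat$`% Contamination`)

# Top histogram: same x range as the scatter plot
p.top <- ggplot(toy.dat, aes(x = `% Completeness`)) +
  geom_histogram(binwidth = 0.5) +
  coord_cartesian(xlim = x.rng) +
  theme_classic()

# Right histogram: same y range as the scatter plot
p.right <- ggplot(toy.dat, aes(y = `% Contamination`)) +
  geom_histogram(binwidth = 0.06) +
  coord_cartesian(ylim = y.rng) +
  theme_classic()

p.scatter <- ggplot(toy.dat, aes(x = `% Completeness`, y = `% Contamination`)) +
  geom_point() +
  coord_cartesian(xlim = x.rng, ylim = y.rng) +
  theme_bw()

p.aligned <- wrap_plots(p.top, plot_spacer(), p.scatter, p.right) +
  plot_layout(heights = c(1, 4), widths = c(3, 1))
p.aligned
```

`coord_cartesian()` zooms without dropping data, so the bin counts are unchanged; only the visible range is shared.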
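For now I don't know how to add the pie chart and the legend on the right with code, so edit them in with other software after exporting the figure.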
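One possible way to do both in code, offered as a sketch and not as how the original figure was made: draw the pie as a separate ggplot, embed it in the scatter panel with `annotation_custom(ggplotGrob(...))`, and keep the scatter plot's legend so that patchwork can collect it on the right with `plot_layout(guides = "collect")`. The data are simulated, the pie position (`xmin`, `xmax`, `ymin`, `ymax`) is a guess to be adjusted by eye, and all object names are invented for this example.

```r
library(ggplot2)
library(patchwork)

set.seed(1)
quality.levels <- c("near-complete", "high-quality", "medium-quality")
quality.cols   <- c("#80b1d3", "#fdb461", "#8dd3c7")

# Simulated stand-in values, NOT the paper's data
toy.dat <- data.frame(
  `% Completeness`  = runif(300, min = 50, max = 100),
  `% Contamination` = runif(300, min = 0, max = 5),
  check.names = FALSE
)
toy.dat$`Genome quality` <- factor(
  sample(quality.levels, 300, replace = TRUE), levels = quality.levels
)

# Pie chart: a single stacked bar bent into a circle
pie.dat <- as.data.frame(table(toy.dat$`Genome quality`))
colnames(pie.dat) <- c("quality", "n")
p.pie <- ggplot(pie.dat, aes(x = "", y = n, fill = quality)) +
  geom_col(width = 1, color = "white") +
  coord_polar(theta = "y") +
  scale_fill_manual(values = quality.cols) +
  theme_void() +
  theme(legend.position = "none")

# Scatter plot: legend kept, pie drawn inside the panel
# (position in data units is an assumption; adjust as needed)
p.scatter <- ggplot(toy.dat, aes(x = `% Completeness`, y = `% Contamination`)) +
  geom_point(aes(color = `Genome quality`)) +
  scale_color_manual(values = quality.cols) +
  annotation_custom(ggplotGrob(p.pie),
                    xmin = 50, xmax = 65, ymin = 3.5, ymax = 5) +
  theme_bw() +
  theme(panel.grid = element_blank(), legend.position = "right")

# Simplified marginal histograms without legends
p.top <- ggplot(toy.dat, aes(x = `% Completeness`)) +
  geom_histogram(aes(fill = `Genome quality`), binwidth = 0.5) +
  scale_fill_manual(values = quality.cols) +
  theme_classic() +
  theme(legend.position = "none")
p.right <- ggplot(toy.dat, aes(y = `% Contamination`)) +
  geom_histogram(aes(fill = `Genome quality`), binwidth = 0.06) +
  scale_fill_manual(values = quality.cols) +
  theme_classic() +
  theme(legend.position = "none")

# guides = "collect" moves the remaining legend to the right of the whole figure
p.all <- wrap_plots(p.top, plot_spacer(), p.scatter, p.right) +
  plot_layout(heights = c(1, 4), widths = c(3, 1), guides = "collect")
p.all
```

Because `annotation_custom()` leaves the scatter plot an ordinary ggplot object, its panel still aligns with the two marginal histograms in the combined layout.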
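Feel free to follow my WeChat official account, 小明的数据分析笔记本 (Xiaoming's Data Analysis Notebook). It mainly shares: 1. simple examples of data analysis and data visualization in R and Python; 2. reading notes on transcriptomics, genomics, and population genetics papers related to horticultural plants; 3. introductory bioinformatics learning materials and my own study notes.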