How GSEA Computes the p-value and FDR

Latest recommended article published 2023-11-01 09:02:11

JasonKQLin

Article link: https://blog.csdn.net/linkequa/article/details/131458225

1. Computing the p-value

Estimating Significance. We assess the significance of an observed ES by comparing it with the set of scores ES_NULL computed with randomly assigned phenotypes.

1. Randomly assign the original phenotype labels to samples, reorder genes, and re-compute ES(S).
2. Repeat step 1 for 1,000 permutations, and create a histogram of the corresponding enrichment scores ES_NULL.
3. Estimate nominal P value for S from ES_NULL by using the positive or negative portion of the distribution corresponding to the sign of the observed ES(S).
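
The last step above can be sketched in a few lines of Python. This is only an illustration, not the GSEA implementation: the function name `nominal_p_value` and the argument names are made up for this example, and it assumes the permutation scores have already been computed. Only the part of the null distribution with the same sign as the observed ES is used.

```python
def nominal_p_value(es_observed, es_null):
    """Nominal p-value of an observed ES against permutation scores.

    es_null is the list of ES values from the label permutations.
    Only the portion of the null with the same sign as es_observed is used.
    """
    if es_observed >= 0:
        same_sign = [es for es in es_null if es >= 0]
        extreme = [es for es in same_sign if es >= es_observed]
    else:
        same_sign = [es for es in es_null if es < 0]
        extreme = [es for es in same_sign if es <= es_observed]
    if not same_sign:
        # No permutation produced a score of this sign; the p-value is undefined.
        return None
    return len(extreme) / len(same_sign)


# Toy null distribution (hypothetical numbers, far fewer than 1,000 permutations).
toy_null = [0.1, 0.2, 0.3, 0.4, -0.1, -0.2, -0.5]
p_pos = nominal_p_value(0.3, toy_null)
p_neg = nominal_p_value(-0.2, toy_null)
```

Because the positive and negative portions are handled separately, a strongly negative ES is never compared against positive permutation scores.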

2. Computing the FDR

Multiple Hypothesis Testing.

1. Determine ES(S) for each gene set in the collection or database.
2. For each S and 1000 fixed permutations π of the phenotype labels, reorder the genes in L and determine ES(S, π).
3. Adjust for variation in gene set size. Normalize the ES(S, π) and the observed ES(S), separately rescaling the positive and negative scores by dividing by the mean of the ES(S, π) to yield the normalized scores NES(S, π) and NES(S) (see Supporting Text).
4. Compute FDR. Control the ratio of false positives to the total number of gene sets attaining a fixed level of significance separately for positive (negative) NES(S) and NES(S, π).

Create a histogram of all NES(S, π) over all S and π. Use this null distribution to compute an FDR q value, for a given NES(S) = NES* ≥ 0. The FDR is the ratio of the percentage of all (S, π) with NES(S, π) ≥ 0, whose NES(S, π) ≥ NES*, divided by the percentage of observed S with NES(S) ≥ 0, whose NES(S) ≥ NES*, and similarly if NES(S) = NES* ≤ 0.
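
The normalization and FDR steps described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the GSEA code: the names `normalize_es` and `fdr_q_value` are hypothetical, and negative scores are divided by the absolute value of the mean negative null score so that the NES keeps its sign (the quoted text only says to rescale positive and negative scores separately).

```python
def normalize_es(es, null_for_set):
    """Rescale one ES by the mean same-sign permutation score of its gene set.

    Assumption: negative scores are divided by the absolute mean of the
    negative null scores, so the sign of the score is preserved.
    """
    if es >= 0:
        same_sign = [x for x in null_for_set if x >= 0]
    else:
        same_sign = [-x for x in null_for_set if x < 0]
    if not same_sign or sum(same_sign) == 0:
        return None  # nothing to normalize against
    return es / (sum(same_sign) / len(same_sign))


def fdr_q_value(nes_star, observed_nes, null_nes):
    """FDR q-value for a given NES*, using same-sign scores only.

    observed_nes: NES(S) for every gene set S.
    null_nes: NES(S, pi) pooled over all gene sets and permutations.
    """
    if nes_star >= 0:
        null_same = [x for x in null_nes if x >= 0]
        obs_same = [x for x in observed_nes if x >= 0]
        null_hits = sum(1 for x in null_same if x >= nes_star)
        obs_hits = sum(1 for x in obs_same if x >= nes_star)
    else:
        null_same = [x for x in null_nes if x < 0]
        obs_same = [x for x in observed_nes if x < 0]
        null_hits = sum(1 for x in null_same if x <= nes_star)
        obs_hits = sum(1 for x in obs_same if x <= nes_star)
    if not null_same or obs_hits == 0:
        return None  # ratio is undefined for this input
    null_fraction = null_hits / len(null_same)
    observed_fraction = obs_hits / len(obs_same)
    return null_fraction / observed_fraction


# Toy inputs (hypothetical numbers).
observed = [2.0, 1.5, 0.5, -1.0]
pooled_null = [0.5, 1.0, 1.5, 2.5, -0.5, -1.5, 0.2, 0.8]
q_pos = fdr_q_value(1.5, observed, pooled_null)
q_neg = fdr_q_value(-1.0, observed, pooled_null)
```

In the toy data, 2 of the 6 non-negative null scores and 2 of the 3 non-negative observed scores reach NES* = 1.5, so the ratio is (2/6) / (2/3). The sketch returns the raw ratio as described in the text and does not cap it at 1.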