Title: | Group Testing Procedures for Signal Detection and Goodness-of-Fit |
---|---|
Description: | It provides the cumulative distribution function (CDF), quantile, p-value, statistical power calculator and random number generator for a collection of group-testing procedures, including the Higher Criticism tests, the one-sided Kolmogorov-Smirnov tests, the one-sided Berk-Jones tests, the one-sided phi-divergence tests, etc. The input is a group of p-values. The null hypothesis is that they are i.i.d. Uniform(0,1). In the context of signal detection, the null hypothesis means no signals. In the context of goodness-of-fit testing, which contrasts a group of i.i.d. random variables against a given continuous distribution, the input p-values can be obtained by the CDF transformation. The null hypothesis then means that these random variables follow the given distribution. For reference, see [1] Hong Zhang, Jiashun Jin and Zheyang Wu. "Distributions and power of optimal signal-detection statistics in finite case", IEEE Transactions on Signal Processing (2020) 68, 1021-1033; [2] Hong Zhang and Zheyang Wu. "The general goodness-of-fit tests for correlated data", Computational Statistics & Data Analysis (2022) 167, 107379. |
Authors: | Hong Zhang and Zheyang Wu |
Maintainer: | Hong Zhang <[email protected]> |
License: | GPL-2 |
Version: | 0.3.0 |
Built: | 2025-01-12 03:03:07 UTC |
Source: | https://github.com/cran/SetTest |
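As the description notes, goodness-of-fit p-values can be obtained by the CDF transformation. A minimal base-R sketch (independent of this package; the Exp(1) null below is just an illustrative choice):

```r
# Goodness-of-fit setup: suppose H0 says the data come from Exp(1).
# The CDF transformation u = F(x) turns the data into values that are
# i.i.d. Uniform(0,1) under H0 (probability integral transform),
# so they can serve as the input p-values of this package's tests.
set.seed(1)
x <- rexp(1000, rate = 1)   # data drawn under H0
u <- pexp(x, rate = 1)      # CDF transformation
# Under H0 the transformed values behave like Uniform(0,1):
mean(u)
```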
CDF of the Berk-Jones statistic under the null hypothesis.
pbj(q, M, k0, k1, onesided = FALSE, method = "ecc", ei = NULL)
q |
- quantile, must be a scalar. |
M |
- correlation matrix of input statistics (of the input p-values). |
k0 |
- search range starts from the k0th smallest p-value. |
k1 |
- search range ends at the k1th smallest p-value. |
onesided |
- TRUE if the input p-values are one-sided. |
method |
- "ecc" (default): the effective correlation coefficient method in reference 2; "ave": the average method in reference 3, which is an earlier version of reference 2. The "ecc" method is more accurate and numerically stable than the "ave" method. |
ei |
- the eigenvalues of M if available. |
The left-tail probability of the null distribution of the B-J statistic at the given quantile.
1. Hong Zhang, Jiashun Jin and Zheyang Wu. "Distributions and power of optimal signal-detection statistics in finite case", IEEE Transactions on Signal Processing (2020) 68, 1021-1033.
2. Hong Zhang and Zheyang Wu. "The general goodness-of-fit tests for correlated data", Computational Statistics & Data Analysis (2022) 167, 107379.
3. Hong Zhang and Zheyang Wu. "Generalized Goodness-Of-Fit Tests for Correlated Data", arXiv:1806.03668.
stat.bj
for the definition of the statistic.
pval <- runif(10)
bjstat <- stat.phi(pval, s=1, k0=1, k1=10)$value
pbj(q=bjstat, M=diag(10), k0=1, k1=10)
CDF of Higher Criticism statistic under the null hypothesis.
phc(q, M, k0, k1, onesided = FALSE, method = "ecc", ei = NULL)
q |
- quantile, must be a scalar. |
M |
- correlation matrix of input statistics (of the input p-values). |
k0 |
- search range starts from the k0th smallest p-value. |
k1 |
- search range ends at the k1th smallest p-value. |
onesided |
- TRUE if the input p-values are one-sided. |
method |
- "ecc" (default): the effective correlation coefficient method in reference 2; "ave": the average method in reference 3, which is an earlier version of reference 2. The "ecc" method is more accurate and numerically stable than the "ave" method. |
ei |
- the eigenvalues of M if available. |
The left-tail probability of the null distribution of the HC statistic at the given quantile.
1. Hong Zhang, Jiashun Jin and Zheyang Wu. "Distributions and power of optimal signal-detection statistics in finite case", IEEE Transactions on Signal Processing (2020) 68, 1021-1033.
2. Hong Zhang and Zheyang Wu. "The general goodness-of-fit tests for correlated data", Computational Statistics & Data Analysis (2022) 167, 107379.
3. Hong Zhang and Zheyang Wu. "Generalized Goodness-Of-Fit Tests for Correlated Data", arXiv:1806.03668.
stat.hc
for the definition of the statistic.
pval <- runif(10)
hcstat <- stat.phi(pval, s=2, k0=1, k1=10)$value
phc(q=hcstat, M=diag(10), k0=1, k1=10)
Statistical power of Berk and Jones test.
power.bj(alpha, n, beta, method = "gaussian-gaussian", eps = 0, mu = 0, df = 1, delta = 0)
alpha |
- type-I error rate. |
n |
- dimension parameter, i.e. the number of input statistics used to construct the B-J statistic. |
beta |
- search range parameter. The search range is (1, beta*n); beta must be between 1/n and 1. |
method |
- the alternative hypothesis model: one of "gaussian-gaussian", "gaussian-t", "t-t", "chisq-chisq", and "exp-chisq". By default, the Gaussian mixture is used. |
eps |
- mixing parameter of the mixture. |
mu |
- mean of the non-standard Gaussian component. |
df |
- degrees of freedom of the t/chi-square distribution; also the parameter of the exponential distribution. |
delta |
- non-centrality of the t/chi-square distribution. |
We consider the following hypothesis test: H_0: X_i ~ F versus H_a: X_i ~ (1-eps)*F + eps*G, where eps is the mixing parameter, and F and G are specified by the "method" argument:
"gaussian-gaussian": F is the standard normal CDF and G is the CDF of the normal distribution with mean defined by mu and unit variance.
"gaussian-t": F is the standard normal CDF and G is the CDF of the t distribution with degrees of freedom defined by df.
"t-t": F is the CDF of the t distribution with degrees of freedom defined by df and G is the CDF of the non-central t distribution with degrees of freedom defined by df and non-centrality defined by delta.
"chisq-chisq": F is the CDF of the chi-square distribution with degrees of freedom defined by df and G is the CDF of the non-central chi-square distribution with degrees of freedom defined by df and non-centrality defined by delta.
"exp-chisq": F is the CDF of the exponential distribution with parameter defined by df and G is the CDF of the non-central chi-square distribution with degrees of freedom defined by df and non-centrality defined by delta.
Power of BJ test.
1. Hong Zhang, Jiashun Jin and Zheyang Wu. "Distributions and power of optimal signal-detection statistics in finite case", IEEE Transactions on Signal Processing (2020) 68, 1021-1033.
2. Donoho, David; Jin, Jiashun. "Higher criticism for detecting sparse heterogeneous mixtures". Annals of Statistics 32 (2004).
3. Jager, Leah; Wellner, Jon A. "Goodness-of-fit tests via phi-divergences". Annals of Statistics 35 (2007).
4. Berk, R.H. and Jones, D.H. "Goodness-of-fit test statistics that dominate the Kolmogorov statistics". Zeitschrift für Wahrscheinlichkeitstheorie und verwandte Gebiete (1979) 47: 47-59.
stat.bj
for the definition of the statistic.
power.bj(0.05, n=10, beta=0.5, eps = 0.1, mu = 1.2)
Statistical power of Higher Criticism test.
power.hc(alpha, n, beta, method = "gaussian-gaussian", eps = 0, mu = 0, df = 1, delta = 0)
alpha |
- type-I error rate. |
n |
- dimension parameter, i.e. the number of input statistics used to construct the Higher Criticism statistic. |
beta |
- search range parameter. The search range is (1, beta*n); beta must be between 1/n and 1. |
method |
- the alternative hypothesis model: one of "gaussian-gaussian", "gaussian-t", "t-t", "chisq-chisq", and "exp-chisq". By default, the Gaussian mixture is used. |
eps |
- mixing parameter of the mixture. |
mu |
- mean of the non-standard Gaussian component. |
df |
- degrees of freedom of the t/chi-square distribution; also the parameter of the exponential distribution. |
delta |
- non-centrality of the t/chi-square distribution. |
We consider the following hypothesis test: H_0: X_i ~ F versus H_a: X_i ~ (1-eps)*F + eps*G, where eps is the mixing parameter, and F and G are specified by the "method" argument:
"gaussian-gaussian": F is the standard normal CDF and G is the CDF of the normal distribution with mean defined by mu and unit variance.
"gaussian-t": F is the standard normal CDF and G is the CDF of the t distribution with degrees of freedom defined by df.
"t-t": F is the CDF of the t distribution with degrees of freedom defined by df and G is the CDF of the non-central t distribution with degrees of freedom defined by df and non-centrality defined by delta.
"chisq-chisq": F is the CDF of the chi-square distribution with degrees of freedom defined by df and G is the CDF of the non-central chi-square distribution with degrees of freedom defined by df and non-centrality defined by delta.
"exp-chisq": F is the CDF of the exponential distribution with parameter defined by df and G is the CDF of the non-central chi-square distribution with degrees of freedom defined by df and non-centrality defined by delta.
Power of HC test.
1. Hong Zhang, Jiashun Jin and Zheyang Wu. "Distributions and power of optimal signal-detection statistics in finite case", IEEE Transactions on Signal Processing (2020) 68, 1021-1033.
2. Donoho, David; Jin, Jiashun. "Higher criticism for detecting sparse heterogeneous mixtures". Annals of Statistics 32 (2004).
stat.hc
for the definition of the statistic.
power.hc(0.05, n=10, beta=0.5, eps = 0.1, mu = 1.2)
Statistical power of phi-divergence test.
power.phi(alpha, n, s, beta, method = "gaussian-gaussian", eps = 0, mu = 0, df = 1, delta = 0)
alpha |
- type-I error rate. |
n |
- dimension parameter, i.e. the number of input statistics used to construct the phi-divergence statistic. |
s |
- phi-divergence parameter. s = 2 gives the Higher Criticism statistic; s = 1 gives the Berk and Jones statistic. |
beta |
- search range parameter. The search range is (1, beta*n); beta must be between 1/n and 1. |
method |
- the alternative hypothesis model: one of "gaussian-gaussian", "gaussian-t", "t-t", "chisq-chisq", and "exp-chisq". By default, the Gaussian mixture is used. |
eps |
- mixing parameter of the mixture. |
mu |
- mean of the non-standard Gaussian component. |
df |
- degrees of freedom of the t/chi-square distribution; also the parameter of the exponential distribution. |
delta |
- non-centrality of the t/chi-square distribution. |
We consider the following hypothesis test: H_0: X_i ~ F versus H_a: X_i ~ (1-eps)*F + eps*G, where eps is the mixing parameter, and F and G are specified by the "method" argument:
"gaussian-gaussian": F is the standard normal CDF and G is the CDF of the normal distribution with mean defined by mu and unit variance.
"gaussian-t": F is the standard normal CDF and G is the CDF of the t distribution with degrees of freedom defined by df.
"t-t": F is the CDF of the t distribution with degrees of freedom defined by df and G is the CDF of the non-central t distribution with degrees of freedom defined by df and non-centrality defined by delta.
"chisq-chisq": F is the CDF of the chi-square distribution with degrees of freedom defined by df and G is the CDF of the non-central chi-square distribution with degrees of freedom defined by df and non-centrality defined by delta.
"exp-chisq": F is the CDF of the exponential distribution with parameter defined by df and G is the CDF of the non-central chi-square distribution with degrees of freedom defined by df and non-centrality defined by delta.
Power of phi-divergence test.
1. Hong Zhang, Jiashun Jin and Zheyang Wu. "Distributions and power of optimal signal-detection statistics in finite case", IEEE Transactions on Signal Processing (2020) 68, 1021-1033.
2. Donoho, David; Jin, Jiashun. "Higher criticism for detecting sparse heterogeneous mixtures". Annals of Statistics 32 (2004).
stat.phi
for the definition of the statistic.
# If the alternative hypothesis is a Gaussian mixture with eps = 0.1 and mu = 1.2:
power.phi(0.05, n=10, s=2, beta=0.5, eps = 0.1, mu = 1.2)
Calculate the left-tail probability of the phi-divergence statistic under a general correlation matrix.
pphi(q, M, k0, k1, s = 2, t = 30, onesided = FALSE, method = "ecc", ei = NULL)
q |
- quantile, must be a scalar. |
M |
- correlation matrix of input statistics (of the input p-values). |
k0 |
- search range starts from the k0th smallest p-value. |
k1 |
- search range ends at the k1th smallest p-value. |
s |
- the phi-divergence test parameter. |
t |
- numerical truncation parameter. |
onesided |
- TRUE if the input p-values are one-sided. |
method |
- "ecc" (default): the effective correlation coefficient method in reference 2; "ave": the average method in reference 3, which is an earlier version of reference 2. The "ecc" method is more accurate and numerically stable than the "ave" method. |
ei |
- the eigenvalues of M if available. |
Left-tail probability of the phi-divergence statistics.
1. Hong Zhang, Jiashun Jin and Zheyang Wu. "Distributions and power of optimal signal-detection statistics in finite case", IEEE Transactions on Signal Processing (2020) 68, 1021-1033.
2. Hong Zhang and Zheyang Wu. "The general goodness-of-fit tests for correlated data", Computational Statistics & Data Analysis (2022) 167, 107379.
3. Hong Zhang and Zheyang Wu. "Generalized Goodness-Of-Fit Tests for Correlated Data", arXiv:1806.03668.
M = toeplitz(1/(1:10)*(-1)^(0:9)) # alternating polynomially decaying correlation matrix
pphi(q=2, M=M, k0=1, k1=5, s=2)
pphi(q=2, M=M, k0=1, k1=5, s=2, method = "ecc")
pphi(q=2, M=M, k0=1, k1=5, s=2, method = "ave")
pphi(q=2, M=diag(10), k0=1, k1=5, s=2)
Calculate the left-tail probability of the omnibus phi-divergence statistic under a general correlation matrix.
pphi.omni(q, M, K0, K1, S, t = 30, onesided = FALSE, method = "ecc", ei = NULL)
q |
- quantile, must be a scalar. |
M |
- correlation matrix of input statistics (of the input p-values). |
K0 |
- vector of search range starts (from the k0th smallest p-value). |
K1 |
- vector of search range ends (at the k1th smallest p-value). |
S |
- vector of the phi-divergence test parameters. |
t |
- numerical truncation parameter. |
onesided |
- TRUE if the input p-values are one-sided. |
method |
- "ecc" (default): the effective correlation coefficient method in reference 2; "ave": the average method in reference 3, which is an earlier version of reference 2. The "ecc" method is more accurate and numerically stable than the "ave" method. |
ei |
- the eigenvalues of M if available. |
Left-tail probability of omnibus phi-divergence statistics.
1. Hong Zhang, Jiashun Jin and Zheyang Wu. "Distributions and power of optimal signal-detection statistics in finite case", IEEE Transactions on Signal Processing (2020) 68, 1021-1033.
2. Hong Zhang and Zheyang Wu. "The general goodness-of-fit tests for correlated data", Computational Statistics & Data Analysis (2022) 167, 107379.
3. Hong Zhang and Zheyang Wu. "Generalized Goodness-Of-Fit Tests for Correlated Data", arXiv:1806.03668.
M = matrix(0.3,10,10) + diag(1-0.3, 10)
pphi.omni(0.05, M=M, K0=rep(1,2), K1=rep(5,2), S=c(1,2))
Quantile of Berk-Jones statistic under the null hypothesis.
qbj(p, M, k0, k1, onesided = FALSE, method = "ecc", ei = NULL, err_thr = 1e-04)
p |
- a scalar left probability that defines the quantile. |
M |
- correlation matrix of input statistics (of the input p-values). |
k0 |
- search range starts from the k0th smallest p-value. |
k1 |
- search range ends at the k1th smallest p-value. |
onesided |
- TRUE if the input p-values are one-sided. |
method |
- "ecc" (default): the effective correlation coefficient method in reference 2; "ave": the average method in reference 3, which is an earlier version of reference 2. The "ecc" method is more accurate and numerically stable than the "ave" method. |
ei |
- the eigenvalues of M if available. |
err_thr |
- the error threshold. The default value is 1e-4. |
Quantile of BJ statistics.
1. Hong Zhang, Jiashun Jin and Zheyang Wu. "Distributions and power of optimal signal-detection statistics in finite case", IEEE Transactions on Signal Processing (2020) 68, 1021-1033.
2. Hong Zhang and Zheyang Wu. "The general goodness-of-fit tests for correlated data", Computational Statistics & Data Analysis (2022) 167, 107379.
3. Hong Zhang and Zheyang Wu. "Generalized Goodness-Of-Fit Tests for Correlated Data", arXiv:1806.03668.
stat.bj
for the definition of the statistic.
## The 0.05 critical value of the BJ statistic when n = 10:
qbj(p=.95, M=diag(10), k0=1, k1=5, onesided=FALSE)
qbj(p=1-1e-5, M=diag(10), k0=1, k1=5, onesided=FALSE, err_thr=1e-8)
Quantile of the Higher Criticism statistic under the null hypothesis.
qhc(p, M, k0, k1, onesided = FALSE, method = "ecc", ei = NULL, err_thr = 1e-04)
p |
- a scalar left probability that defines the quantile. |
M |
- correlation matrix of input statistics (of the input p-values). |
k0 |
- search range starts from the k0th smallest p-value. |
k1 |
- search range ends at the k1th smallest p-value. |
onesided |
- TRUE if the input p-values are one-sided. |
method |
- "ecc" (default): the effective correlation coefficient method in reference 2; "ave": the average method in reference 3, which is an earlier version of reference 2. The "ecc" method is more accurate and numerically stable than the "ave" method. |
ei |
- the eigenvalues of M if available. |
err_thr |
- the error threshold. The default value is 1e-4. |
Quantile of HC statistics.
1. Hong Zhang, Jiashun Jin and Zheyang Wu. "Distributions and power of optimal signal-detection statistics in finite case", IEEE Transactions on Signal Processing (2020) 68, 1021-1033.
2. Hong Zhang and Zheyang Wu. "The general goodness-of-fit tests for correlated data", Computational Statistics & Data Analysis (2022) 167, 107379.
3. Hong Zhang and Zheyang Wu. "Generalized Goodness-Of-Fit Tests for Correlated Data", arXiv:1806.03668.
stat.hc
for the definition of the statistic.
## The 0.05 critical value of the HC statistic when n = 10:
qhc(p=.95, M=diag(10), k0=1, k1=5, onesided=FALSE)
qhc(p=1-1e-5, M=diag(10), k0=1, k1=5, onesided=FALSE, err_thr=1e-8)
Quantile of phi-divergence statistic under the null hypothesis.
qphi(p, M, k0, k1, s = 2, t = 30, onesided = FALSE, method = "ecc", ei = NULL, err_thr = 1e-04)
p |
- a scalar left probability that defines the quantile. |
M |
- correlation matrix of input statistics (of the input p-values). |
k0 |
- search range starts from the k0th smallest p-value. |
k1 |
- search range ends at the k1th smallest p-value. |
s |
- the phi-divergence test parameter. |
t |
- numerical truncation parameter. |
onesided |
- TRUE if the input p-values are one-sided. |
method |
- "ecc" (default): the effective correlation coefficient method in reference 2; "ave": the average method in reference 3, which is an earlier version of reference 2. The "ecc" method is more accurate and numerically stable than the "ave" method. |
ei |
- the eigenvalues of M if available. |
err_thr |
- the error threshold. The default value is 1e-4. |
Quantile of the phi-divergence statistics.
1. Hong Zhang, Jiashun Jin and Zheyang Wu. "Distributions and power of optimal signal-detection statistics in finite case", IEEE Transactions on Signal Processing (2020) 68, 1021-1033.
2. Hong Zhang and Zheyang Wu. "The general goodness-of-fit tests for correlated data", Computational Statistics & Data Analysis (2022) 167, 107379.
3. Hong Zhang and Zheyang Wu. "Generalized Goodness-Of-Fit Tests for Correlated Data", arXiv:1806.03668.
stat.phi
for the definition of the statistic.
qphi(p=.95, M=diag(10), k0=1, k1=5, s=2, onesided=FALSE)
qphi(p=1-1e-3, M=diag(10), k0=1, k1=5, s=2, onesided=FALSE)
qphi(p=1-1e-3, M=diag(10), k0=1, k1=5, s=2, onesided=FALSE, err_thr = 1e-6)
qphi(p=1-1e-5, M=diag(10), k0=1, k1=5, s=2, onesided=FALSE)
qphi(p=1-1e-5, M=diag(10), k0=1, k1=5, s=2, onesided=FALSE, err_thr = 1e-6)
qphi(p=1-1e-5, M=diag(10), k0=1, k1=5, s=2, onesided=FALSE, err_thr = 1e-8)
Construct Berk and Jones (BJ) statistics.
stat.bj(p, k0 = 1, k1 = NA)
p |
- vector of input p-values. |
k0 |
- search range left end parameter. Default k0 = 1. |
k1 |
- search range right end parameter. Default k1 = 0.5*number of input p-values. |
Let p_(1) <= ... <= p_(n) be the sequence of ordered p-values. The Berk and Jones statistic is
BJ = sqrt(2n) * max over k0 <= i <= k1 of (-1)^I(p_(i) > i/n) * sqrt(K(i/n, p_(i))),
where K(x, y) = x*log(x/y) + (1-x)*log((1-x)/(1-y)) when x != y, and K(x, y) = 0 otherwise.
value - BJ statistic constructed from a vector of p-values.
location - the order of the p-values to obtain BJ statistic.
stat - vector of marginal BJ statistics.
1. Hong Zhang, Jiashun Jin and Zheyang Wu. "Distributions and power of optimal signal-detection statistics in finite case", IEEE Transactions on Signal Processing (2020) 68, 1021-1033.
2. Jager, Leah; Wellner, Jon A. "Goodness-of-fit tests via phi-divergences". Annals of Statistics 35 (2007).
3. Berk, R.H. and Jones, D.H. "Goodness-of-fit test statistics that dominate the Kolmogorov statistics". Zeitschrift für Wahrscheinlichkeitstheorie und verwandte Gebiete (1979) 47: 47-59.
stat.bj(runif(10))
# When the input are statistics:
stat.test = rnorm(20)
p.test = 1 - pnorm(stat.test)
stat.bj(p.test, k0 = 2, k1 = 20)
Construct Higher Criticism (HC) statistics.
stat.hc(p, k0 = 1, k1 = NA)
p |
- vector of input p-values. |
k0 |
- search range left end parameter. Default k0 = 1. |
k1 |
- search range right end parameter. Default k1 = 0.5*number of input p-values. |
Let p_(1) <= ... <= p_(n) be the sequence of ordered p-values. The Higher Criticism statistic is
HC = sqrt(n) * max over k0 <= i <= k1 of (i/n - p_(i)) / sqrt(p_(i) * (1 - p_(i))).
value - HC statistic constructed from a vector of p-values.
location - the order of the p-values to obtain HC statistic.
stat - vector of marginal HC statistics.
1. Hong Zhang, Jiashun Jin and Zheyang Wu. "Distributions and power of optimal signal-detection statistics in finite case", IEEE Transactions on Signal Processing (2020) 68, 1021-1033.
2. Donoho, David; Jin, Jiashun. "Higher criticism for detecting sparse heterogeneous mixtures". Annals of Statistics 32 (2004).
stat.hc(runif(10))
# When the input are statistics:
stat.test = rnorm(20)
p.test = 1 - pnorm(stat.test)
stat.hc(p.test, k0 = 1, k1 = 10)
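The HC definition in the Details can also be evaluated directly in base R. The sketch below follows that formula only; the package's stat.hc may handle edge cases (e.g. p-values equal to 0 or 1) differently.

```r
# Higher Criticism from a vector of p-values, per the Details formula:
# HC = sqrt(n) * max over k0 <= i <= k1 of
#      (i/n - p_(i)) / sqrt(p_(i) * (1 - p_(i)))
hc_stat <- function(p, k0 = 1, k1 = floor(length(p) / 2)) {
  n <- length(p)
  ps <- sort(p)            # ordered p-values p_(1) <= ... <= p_(n)
  i <- k0:k1
  vals <- sqrt(n) * (i / n - ps[i]) / sqrt(ps[i] * (1 - ps[i]))
  list(value = max(vals),             # the HC statistic
       location = i[which.max(vals)], # order attaining the maximum
       stat = vals)                   # marginal HC statistics
}
set.seed(7)
hc_stat(runif(10), k0 = 1, k1 = 10)$value
```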
Construct phi-divergence statistics.
stat.phi(p, s, k0 = 1, k1 = NA)
p |
- vector of input p-values. |
s |
- phi-divergence parameter. s = 2 gives the Higher Criticism statistic; s = 1 gives the Berk and Jones statistic. |
k0 |
- search range left end parameter. Default k0 = 1. |
k1 |
- search range right end parameter. Default k1 = 0.5*number of input p-values. |
Let p_(1) <= ... <= p_(n) be the sequence of ordered p-values. The phi-divergence statistic is
PHI = sqrt(2n) * max over k0 <= i <= k1 of (-1)^I(p_(i) > i/n) * sqrt(phi_s(i/n, p_(i))),
and when s != 0, 1, phi_s(x, y) = (1 - x^s * y^(1-s) - (1-x)^s * (1-y)^(1-s)) / (s(1-s)); otherwise phi_s is defined by its limit: phi_1(x, y) = x*log(x/y) + (1-x)*log((1-x)/(1-y)) and phi_0(x, y) = y*log(y/x) + (1-y)*log((1-y)/(1-x)).
value - phi-divergence statistic constructed from a vector of p-values.
location - the order of the p-values to obtain phi-divergence statistic.
stat - vector of marginal phi-divergence statistics.
1. Hong Zhang, Jiashun Jin and Zheyang Wu. "Distributions and power of optimal signal-detection statistics in finite case", IEEE Transactions on Signal Processing (2020) 68, 1021-1033.
2. Jager, Leah; Wellner, Jon A. "Goodness-of-fit tests via phi-divergences". Annals of Statistics 35 (2007).
stat.phi(runif(10), s = 2)
# When the input are statistics:
stat.test = rnorm(20)
p.test = 1 - pnorm(stat.test)
stat.phi(p.test, s = 0.5, k0 = 2, k1 = 5)
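As a check on the definition, the phi-divergence function phi_s can be coded directly; at s = 2, sqrt(2n * phi_2(x, y)) reduces algebraically to the HC form sqrt(n)*(x - y)/sqrt(y*(1 - y)). This is a sketch of the formula only, independent of the package.

```r
# phi_s(x, y) from the Details, with the s = 1 (BJ) and s = 0 limits.
# The full statistic also applies the sign (-1)^I(p_(i) > i/n);
# only the divergence itself is shown here.
phi_s <- function(x, y, s) {
  if (s == 1) return(x * log(x / y) + (1 - x) * log((1 - x) / (1 - y)))
  if (s == 0) return(y * log(y / x) + (1 - y) * log((1 - y) / (1 - x)))
  (1 - x^s * y^(1 - s) - (1 - x)^s * (1 - y)^(1 - s)) / (s * (1 - s))
}
# At s = 2 the scaled divergence matches the HC term:
x <- 0.3; y <- 0.1; n <- 10
sqrt(2 * n * phi_s(x, y, 2))          # phi-divergence form
sqrt(n) * (x - y) / sqrt(y * (1 - y)) # HC form, same value
```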
Calculate the omnibus phi-divergence statistics under a general correlation matrix.
stat.phi.omni(p, M, K0 = rep(1, 2), K1 = rep(length(M[1, ]), 2), S = c(1, 2), t = 30, onesided = FALSE, method = "ecc", ei = NULL)
p |
- vector of input p-values. |
M |
- correlation matrix of input statistics (of the input p-values). |
K0 |
- vector of search range starts (from the k0th smallest p-value). |
K1 |
- vector of search range ends (at the k1th smallest p-value). |
S |
- vector of the phi-divergence test parameters. |
t |
- numerical truncation parameter. |
onesided |
- TRUE if the input p-values are one-sided. |
method |
- "ecc" (default): the effective correlation coefficient method in reference 2; "ave": the average method in reference 3, which is an earlier version of reference 2. The "ecc" method is more accurate and numerically stable than the "ave" method. |
ei |
- the eigenvalues of M if available. |
1. Hong Zhang, Jiashun Jin and Zheyang Wu. "Distributions and power of optimal signal-detection statistics in finite case", IEEE Transactions on Signal Processing (2020) 68, 1021-1033.
2. Hong Zhang and Zheyang Wu. "The general goodness-of-fit tests for correlated data", Computational Statistics & Data Analysis (2022) 167, 107379.
3. Hong Zhang and Zheyang Wu. "Generalized Goodness-Of-Fit Tests for Correlated Data", arXiv:1806.03668.
p.test = runif(10)
M = toeplitz(1/(1:10)*(-1)^(0:9)) # alternating polynomially decaying correlation matrix
stat.phi.omni(p.test, M=M, K0=rep(1,2), K1=rep(5,2), S=c(1,2))
Multiple comparison test using the Berk and Jones (BJ) statistic.
test.bj(prob, M, k0, k1, onesided = FALSE, method = "ecc", ei = NULL)
prob |
- vector of input p-values. |
M |
- correlation matrix of input statistics (of the input p-values). |
k0 |
- search range starts from the k0th smallest p-value. |
k1 |
- search range ends at the k1th smallest p-value. |
onesided |
- TRUE if the input p-values are one-sided. |
method |
- "ecc" (default): the effective correlation coefficient method in reference 2; "ave": the average method in reference 3, which is an earlier version of reference 2. The "ecc" method is more accurate and numerically stable than the "ave" method. |
ei |
- the eigenvalues of M if available. |
pvalue - the p-value of the Berk-Jones test.
bjstat - the Berk-Jones statistic.
location - the order of the input p-values to obtain BJ statistic.
1. Hong Zhang, Jiashun Jin and Zheyang Wu. "Distributions and power of optimal signal-detection statistics in finite case", IEEE Transactions on Signal Processing (2020) 68, 1021-1033.
2. Hong Zhang and Zheyang Wu. "The general goodness-of-fit tests for correlated data", Computational Statistics & Data Analysis (2022) 167, 107379.
3. Hong Zhang and Zheyang Wu. "Generalized Goodness-Of-Fit Tests for Correlated Data", arXiv:1806.03668.
4. Leah Jager and Jon Wellner. "Goodness-of-fit tests via phi-divergences". Annals of Statistics 35 (2007).
stat.bj
for the definition of the statistic.
test.bj(runif(10), M=diag(10), k0=1, k1=10)
# When the input are statistics:
stat.test = rnorm(20)
p.test = 2*(1 - pnorm(abs(stat.test)))
test.bj(p.test, M=diag(20), k0=1, k1=10)
Multiple comparison test using the Higher Criticism (HC) statistic.
test.hc(prob, M, k0, k1, onesided = FALSE, method = "ecc", ei = NULL)
prob |
- vector of input p-values. |
M |
- correlation matrix of input statistics (of the input p-values). |
k0 |
- search range starts from the k0th smallest p-value. |
k1 |
- search range ends at the k1th smallest p-value. |
onesided |
- TRUE if the input p-values are one-sided. |
method |
- "ecc" (default): the effective correlation coefficient method in reference 2; "ave": the average method in reference 3, which is an earlier version of reference 2. The "ecc" method is more accurate and numerically stable than the "ave" method. |
ei |
- the eigenvalues of M if available. |
pvalue - The p-value of the HC test.
hcstat - HC statistic.
location - the order of the input p-values to obtain HC statistic.
1. Hong Zhang, Jiashun Jin and Zheyang Wu. "Distributions and power of optimal signal-detection statistics in finite case", IEEE Transactions on Signal Processing (2020) 68, 1021-1033.
2. Hong Zhang and Zheyang Wu. "The general goodness-of-fit tests for correlated data", Computational Statistics & Data Analysis (2022) 167, 107379.
3. Hong Zhang and Zheyang Wu. "Generalized Goodness-Of-Fit Tests for Correlated Data", arXiv:1806.03668.
4. Donoho, David; Jin, Jiashun. "Higher criticism for detecting sparse heterogeneous mixtures". Annals of Statistics 32 (2004).
stat.hc
for the definition of the statistic.
pval.test = runif(10)
test.hc(pval.test, M=diag(10), k0=1, k1=10)
# When the input are statistics:
stat.test = rnorm(20)
p.test = 2*(1 - pnorm(abs(stat.test)))
test.hc(p.test, M=diag(20), k0=1, k1=10)
Multiple comparison test using phi-divergence statistics.
test.phi(prob, M, k0, k1, s = 2, onesided = FALSE, method = "ecc", ei = NULL)
prob |
- vector of input p-values. |
M |
- correlation matrix of input statistics (of the input p-values). |
k0 |
- search range starts from the k0th smallest p-value. |
k1 |
- search range ends at the k1th smallest p-value. |
s |
- phi-divergence parameter. s = 2 gives the Higher Criticism statistic; s = 1 gives the Berk and Jones statistic. |
onesided |
- TRUE if the input p-values are one-sided. |
method |
- "ecc" (default): the effective correlation coefficient method in reference 2; "ave": the average method in reference 3, which is an earlier version of reference 2. The "ecc" method is more accurate and numerically stable than the "ave" method. |
ei |
- the eigenvalues of M if available. |
pvalue - The p-value of the phi-divergence test.
phistat - the phi-divergence statistic.
location - the order of the input p-values to obtain phi-divergence statistic.
1. Hong Zhang, Jiashun Jin and Zheyang Wu. "Distributions and power of optimal signal-detection statistics in finite case", IEEE Transactions on Signal Processing (2020) 68, 1021-1033.
2. Hong Zhang and Zheyang Wu. "The general goodness-of-fit tests for correlated data", Computational Statistics & Data Analysis (2022) 167, 107379.
3. Hong Zhang and Zheyang Wu. "Generalized Goodness-Of-Fit Tests for Correlated Data", arXiv:1806.03668.
4. Leah Jager and Jon Wellner. "Goodness-of-fit tests via phi-divergences". Annals of Statistics 35 (2007).
stat.phi
for the definition of the statistic.
stat.test = rnorm(20) # Z-scores
p.test = 2*(1 - pnorm(abs(stat.test)))
test.phi(p.test, M=diag(20), s = 0.5, k0=1, k1=10)
test.phi(p.test, M=diag(20), s = 1, k0=1, k1=10)
test.phi(p.test, M=diag(20), s = 2, k0=1, k1=10)
Calculate the right-tail probability of the omnibus phi-divergence statistics under a general correlation matrix.
test.phi.omni(prob, M, K0, K1, S, onesided = FALSE, method = "ecc", ei = NULL)
prob |
- vector of input p-values. |
M |
- correlation matrix of input statistics (of the input p-values). |
K0 |
- vector of search range starts (from the k0th smallest p-value). |
K1 |
- vector of search range ends (at the k1th smallest p-value). |
S |
- vector of the phi-divergence test parameters. |
onesided |
- TRUE if the input p-values are one-sided. |
method |
- "ecc" (default): the effective correlation coefficient method in reference 2; "ave": the average method in reference 3, which is an earlier version of reference 2. The "ecc" method is more accurate and numerically stable than the "ave" method. |
ei |
- the eigenvalues of M if available. |
p-value of the omnibus test.
p-values of the individual phi-divergence tests.
1. Hong Zhang, Jiashun Jin and Zheyang Wu. "Distributions and power of optimal signal-detection statistics in finite case", IEEE Transactions on Signal Processing (2020) 68, 1021-1033.
2. Hong Zhang and Zheyang Wu. "The general goodness-of-fit tests for correlated data", Computational Statistics & Data Analysis (2022) 167, 107379.
3. Hong Zhang and Zheyang Wu. "Generalized Goodness-Of-Fit Tests for Correlated Data", arXiv:1806.03668.
M = matrix(0.3,10,10) + diag(1-0.3, 10)
test.phi.omni(runif(10), M=M, K0=rep(1,2), K1=rep(5,2), S=c(1,2))