Title: | Biclustering via Latent Block Model Adapted to Overdispersed Count Data |
---|---|
Description: | Implementation of a probabilistic method for biclustering adapted to overdispersed count data. It is a Gamma-Poisson Latent Block Model. It also implements two selection criteria in order to select the number of biclusters. |
Authors: | Julie Aubert [aut, cre], INRAE [cph] |
Maintainer: | Julie Aubert <[email protected]> |
License: | GPL-3 |
Version: | 0.1.0 |
Built: | 2024-10-31 16:33:06 UTC |
Source: | https://github.com/julieaubert/cobiclust |
Perform a biclustering adapted to overdispersed count data.
cobiclust( x, K = 2, G = 3, nu_j = NULL, a = NULL, akg = FALSE, cvg_lim = 1e-05, nbiter = 5000, tol = 1e-04 )
x |
the input matrix of observed data. |
K |
an integer specifying the number of groups in rows. |
G |
an integer specifying the number of groups in columns. |
nu_j |
a numeric vector whose length is equal to the number of columns of x. |
a |
a numeric value. |
akg |
a logical variable indicating whether to use a common dispersion parameter (akg = FALSE) or a dispersion parameter per cocluster (akg = TRUE). |
cvg_lim |
a number specifying the threshold used for the convergence criterion (cvg_lim = 1e-05 by default). |
nbiter |
the maximal number of iterations for the global loop of the variational EM algorithm (nbiter = 5000 by default). |
tol |
the level of relative iteration convergence tolerance (tol = 1e-04 by default). |
An object of class cobiclustering.
See also cobiclustering for the cobiclustering class.
npc <- c(50, 40) # nodes per class
KG <- c(2, 3) # classes
nm <- npc * KG # nodes
Z <- diag(KG[1]) %x% matrix(1, npc[1], 1)
W <- diag(KG[2]) %x% matrix(1, npc[2], 1)
L <- 70 * matrix(runif(KG[1] * KG[2]), KG[1], KG[2])
M_in_expectation <- Z %*% L %*% t(W)
size <- 50
M <- matrix(
  rnbinom(
    n = length(as.vector(M_in_expectation)),
    mu = as.vector(M_in_expectation), size = size
  ),
  nm[1], nm[2]
)
rownames(M) <- paste("OTU", 1:nrow(M), sep = "_")
colnames(M) <- paste("S", 1:ncol(M), sep = "_")
res <- cobiclust(M, K = 2, G = 3, nu_j = rep(1, 120), a = 1 / size, cvg_lim = 1e-5)
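The simulated matrix in the example is drawn with rnbinom, whose variance is mu + mu^2 / size, so counts in each block should be more dispersed than a Poisson with the same mean. The following base R sketch re-creates that simulation and compares the empirical mean and variance per block with the negative binomial variance. The seed and the names row_block, col_block and block_stats are assumptions added for illustration; they are not part of the package.

```r
# Re-create the block-structured simulation from the example (base R only).
set.seed(1)                                  # assumption: seed added for reproducibility
npc <- c(50, 40)                             # rows / columns per class
KG <- c(2, 3)                                # number of row / column classes
Z <- diag(KG[1]) %x% matrix(1, npc[1], 1)    # row memberships, 100 x 2
W <- diag(KG[2]) %x% matrix(1, npc[2], 1)    # column memberships, 120 x 3
L <- 70 * matrix(runif(KG[1] * KG[2]), KG[1], KG[2])
mu <- Z %*% L %*% t(W)                       # expected count of every cell
size <- 50
M <- matrix(rnbinom(length(mu), mu = as.vector(mu), size = size),
            nrow(mu), ncol(mu))

# True block labels implied by the Kronecker construction above.
row_block <- rep(seq_len(KG[1]), each = npc[1])
col_block <- rep(seq_len(KG[2]), each = npc[2])

# Empirical mean and variance in each block, next to the
# negative binomial variance mu + mu^2 / size.
block_stats <- do.call(rbind, lapply(seq_len(KG[1]), function(k) {
  do.call(rbind, lapply(seq_len(KG[2]), function(g) {
    v <- as.vector(M[row_block == k, col_block == g])
    data.frame(k = k, g = g,
               mean = mean(v), var = var(v),
               nb_var = L[k, g] + L[k, g]^2 / size)
  }))
}))
print(block_stats)
```

When the var column is clearly larger than the mean column, a Poisson model would understate the variability, which is the situation the Gamma-Poisson model targets.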
Calculate selection criteria.
selection_criteria(x, K, G)
x |
The output of the cobiclust function. |
K |
The number of groups in rows. |
G |
The number of groups in columns. |
A data frame with 7 columns:
vICL
the vICL selection criterion.
BIC
the BIC selection criterion.
penKG
the value of the BIC penalty.
lb
the value of the lower bound of the log-likelihood.
entZW
the value of the entropy of the latent variables Z and W.
K
the number of groups in rows.
G
the number of groups in columns.
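One possible way to use these criteria for choosing the number of biclusters is to fit cobiclust over a small grid of (K, G) values and stack the data frames returned by selection_criteria. The sketch below assumes that a larger vICL indicates a better model; the helper best_by_criterion, the grid bounds and the seed are hypothetical and not part of the package. The fitting step only runs when cobiclust is installed.

```r
# Hypothetical helper (not part of cobiclust): return the row of a
# criteria data frame with the largest value of the chosen criterion.
best_by_criterion <- function(crit, criterion = "vICL") {
  stopifnot(is.data.frame(crit), criterion %in% names(crit))
  crit[which.max(crit[[criterion]]), , drop = FALSE]
}

if (requireNamespace("cobiclust", quietly = TRUE)) {
  # Same block-structured simulation as in the cobiclust example.
  set.seed(1)                                # assumption: seed for reproducibility
  npc <- c(50, 40)
  KG <- c(2, 3)
  Z <- diag(KG[1]) %x% matrix(1, npc[1], 1)
  W <- diag(KG[2]) %x% matrix(1, npc[2], 1)
  L <- 70 * matrix(runif(KG[1] * KG[2]), KG[1], KG[2])
  mu <- Z %*% L %*% t(W)
  size <- 50
  M <- matrix(rnbinom(length(mu), mu = as.vector(mu), size = size),
              nrow(mu), ncol(mu))

  # Assumed candidate grid; widen it for real data.
  grid <- expand.grid(K = 2:3, G = 2:3)

  # One fit per (K, G) pair, then one row of criteria per fit.
  crit <- do.call(rbind, lapply(seq_len(nrow(grid)), function(i) {
    fit <- cobiclust::cobiclust(M, K = grid$K[i], G = grid$G[i],
                                nu_j = rep(1, ncol(M)), a = 1 / size)
    cobiclust::selection_criteria(fit, K = grid$K[i], G = grid$G[i])
  }))

  print(crit)
  print(best_by_criterion(crit))             # assumes larger vICL is better
}
```

Passing criterion = "BIC" to the helper ranks the same fits by the BIC column instead.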