cutnorm.tools package
Submodules
cutnorm.tools.sbm module
-
cutnorm.tools.sbm.erdos_renyi(n, p)
Generates an Erdős–Rényi random graph of size n with edge probability p.
Parameters: - n – int, size of the output matrix
- p – float, edge probability
Returns: Erdős–Rényi random graph matrix, 2d array, shape (n,n)
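As a rough illustration of the model described above, the following NumPy sketch draws each possible edge independently with probability p. The function name `erdos_renyi_sketch` and the `seed` argument are hypothetical, and the choice to return a symmetric matrix with a zero diagonal is an assumption; it is not taken from the cutnorm implementation.

```python
import numpy as np

def erdos_renyi_sketch(n, p, seed=None):
    """Hypothetical stand-in for illustration; not the cutnorm implementation."""
    rng = np.random.default_rng(seed)
    # Independent Bernoulli(p) draw for every entry strictly above the diagonal
    upper = np.triu(rng.random((n, n)) < p, k=1).astype(int)
    # Assumption: undirected graph without self-loops, so mirror the upper triangle
    return upper + upper.T

adj = erdos_renyi_sketch(5, 0.3, seed=0)
print(adj)
```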
-
cutnorm.tools.sbm.make_symmetric_triu(mat)
Makes the matrix symmetric upper triangular
Parameters: mat – 2d array, shape (n,n)
Returns: upper triangular symmetric matrix of the input 2d array, shape (n,n)
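One plausible reading of this description is that the upper triangle of the input is kept and mirrored below the diagonal. The sketch below shows that reading with plain NumPy; `symmetrize_from_triu` is a hypothetical name, and how the library treats the diagonal is an assumption.

```python
import numpy as np

def symmetrize_from_triu(mat):
    """Hypothetical helper; assumes the upper triangle defines the result."""
    mat = np.asarray(mat)
    upper = np.triu(mat)                # diagonal and entries above it
    strictly_upper = np.triu(mat, k=1)  # entries above the diagonal only
    # Mirror the strictly upper part below the diagonal
    return upper + strictly_upper.T

mat = np.array([[1, 2, 3],
                [4, 5, 6],
                [7, 8, 9]])
sym = symmetrize_from_triu(mat)
# sym is [[1, 2, 3], [2, 5, 6], [3, 6, 9]]
```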
-
cutnorm.tools.sbm.sbm(community_sizes, prob_mat)
Generates a stochastic block matrix
community_sizes gives the size of each community, and prob_mat gives the probability that a 1 is generated for each element of the corresponding block.
Parameters: - community_sizes – 1d array, shape (n), sizes of the communities
- prob_mat – 2d array, shape (n,n), probability of edges for each community
Returns: stochastic block matrix, 2d array, shape depending on community sizes
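The relationship between the block probabilities and the sampled matrix can be sketched as follows. This is a minimal NumPy illustration under the assumption that every entry is drawn independently; `sbm_sketch` and its `seed` argument are hypothetical, and any symmetrization the library may apply is not shown.

```python
import numpy as np

def sbm_sketch(community_sizes, prob_mat, seed=None):
    """Hypothetical stand-in for illustration; not the cutnorm implementation."""
    rng = np.random.default_rng(seed)
    sizes = np.asarray(community_sizes)
    probs = np.asarray(prob_mat, dtype=float)
    # Expand the (n, n) block probabilities to one probability per matrix entry
    full = np.repeat(np.repeat(probs, sizes, axis=0), sizes, axis=1)
    # Assumption: each entry is an independent Bernoulli draw
    return (rng.random(full.shape) < full).astype(int)

# Probabilities of 0 and 1 make the output deterministic: two disconnected cliques
blocks = sbm_sketch([2, 3], [[1.0, 0.0], [0.0, 1.0]], seed=0)
print(blocks)
```

The output has shape (sum(community_sizes), sum(community_sizes)), which is what "shape depending on community sizes" amounts to in this sketch.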
-
cutnorm.tools.sbm.sbm_autoregressive(community_sizes, prob_list)
Generates an autoregressive SBM
In an autoregressive SBM, diagonal block i has edge probability prob_list[i], while the off-diagonal block (i, j) has edge probability (prob_list[i] * prob_list[j])**(abs(i - j)).
This idea is similar to autoregressive models.
Parameters: - community_sizes – 1d array, shape (n), sizes of the communities
- prob_list – 1d array, shape (n), where n is the number of diagonal blocks
Returns: An autoregressive SBM, 2d array, shape depending on community sizes
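To make the block-level formula concrete, the sketch below builds the (n, n) matrix of block probabilities from prob_list exactly as the formula is stated above. The function name `autoregressive_block_probs` is hypothetical, and expanding the blocks to full size by community_sizes is left out.

```python
import numpy as np

def autoregressive_block_probs(prob_list):
    """Hypothetical helper: block probabilities from the stated formula."""
    p = np.asarray(prob_list, dtype=float)
    idx = np.arange(len(p))
    # abs(i - j) for every pair of blocks
    dist = np.abs(idx[:, None] - idx[None, :])
    # Off-diagonal blocks: (p[i] * p[j]) ** abs(i - j)
    block = np.outer(p, p) ** dist
    # Diagonal blocks take their probability directly from prob_list
    np.fill_diagonal(block, p)
    return block

block = autoregressive_block_probs([0.5, 0.4, 0.8])
# block[0, 1] = (0.5 * 0.4) ** 1 = 0.2
# block[0, 2] = (0.5 * 0.8) ** 2 = 0.16
# block[1, 2] = (0.4 * 0.8) ** 1 = 0.32
```

Because the exponent grows with abs(i - j) and the base is at most 1, blocks further from the diagonal receive smaller probabilities.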
-
cutnorm.tools.sbm.sbm_autoregressive_prob(community_sizes, prob_list)
Generates the underlying probability matrix that gives rise to the autoregressive SBM
Parameters: - community_sizes – 1d array, shape (n), sizes of the communities
- prob_list – 1d array, shape (n), where n is the number of diagonal blocks
Returns: A probability matrix for an autoregressive SBM, 2d array, shape depending on community sizes
-
cutnorm.tools.sbm.sbm_prob(community_sizes, prob_mat)
Generates a matrix indicating the underlying probability that gives rise to a stochastic block matrix
Parameters: - community_sizes – 1d array, shape (n), sizes of the communities
- prob_mat – 2d array, shape (n,n), probability of edges for each community
Returns: probabilities of a stochastic block matrix, 2d array, shape depending on community sizes
cutnorm.tools.distort module
-
cutnorm.tools.distort.add_gaussian_noise(mat, mean, std)
Adds Gaussian noise to the matrix
Parameters: - mat – 2d array, shape (n,n)
- mean – mean of the Gaussian noise
- std – standard deviation of the Gaussian noise
Returns: Processed matrix
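A minimal sketch of what adding Gaussian noise to a matrix can look like, assuming independent noise is added to every entry. `add_gaussian_noise_sketch` and its `seed` argument are hypothetical, and any post-processing the library may do (clipping, symmetrizing) is not shown.

```python
import numpy as np

def add_gaussian_noise_sketch(mat, mean, std, seed=None):
    """Hypothetical stand-in for illustration; not the cutnorm implementation."""
    rng = np.random.default_rng(seed)
    # Assumption: one independent N(mean, std**2) draw per matrix entry
    noise = rng.normal(loc=mean, scale=std, size=np.shape(mat))
    return np.asarray(mat, dtype=float) + noise

mat = np.zeros((3, 3))
noisy = add_gaussian_noise_sketch(mat, mean=0.0, std=0.1, seed=0)
print(noisy)
```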
-
cutnorm.tools.distort.shift(mat, n_shift)
Shifts the matrix by rolling it along the diagonal
Parameters: - mat – 2d array, shape (n,n)
- n_shift – number to roll
Returns: Shifted matrix
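Rolling along the diagonal can be read as rolling rows and columns by the same amount, so that entry (i, j) moves to (i + n_shift, j + n_shift) with wrap-around. The sketch below uses `np.roll` to show that reading; `shift_sketch` is a hypothetical name and the interpretation itself is an assumption.

```python
import numpy as np

def shift_sketch(mat, n_shift):
    """Hypothetical stand-in: roll rows and columns by the same amount."""
    # The same integer shift is applied to both axes
    return np.roll(mat, n_shift, axis=(0, 1))

mat = np.arange(9).reshape(3, 3)
shifted = shift_sketch(mat, 1)
# shifted is [[8, 6, 7], [2, 0, 1], [5, 3, 4]]
```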
cutnorm.tools.dbf_testing module
Implementation of the DBF statistic
from “Distance-based analysis of variance: approximate inference and an application to genome-wide association studies”, Christopher Minas and Giovanni Montana
-
cutnorm.tools.dbf_testing.dbf_pvalue(mc, vc, gc, fstat, trW, trB)
pval = pearson_three(mc, vc, gc, fstat)
Computes a one-sided p value from the standardized Pearson type III distribution.
Parameters: - mc – mean
- vc – variance
- gc – skewness
- fstat – DBF distance based F statistic, between vs. within group variability
- trW – within group variability for DBF statistic
- trB – between group variability for DBF statistic
Returns: pval = one sided p value
-
cutnorm.tools.dbf_testing.dbf_test(dmatrix, labels)
pval, fstat, Bvar, Wvar = dbf_test(dmatrix, labels)
Runs the DBF test.
Parameters: - dmatrix – N by N array of distances
- labels – N array group membership
Returns: - pval – one sided p value
- fstat – DBF distance based F statistic, between vs. within group variability
- Bvar – between vs overall group variability
- Wvar – within vs overall group variability
-
cutnorm.tools.dbf_testing.distance_variability(dmatrix, Ic)
fstat, trW, trB = distance_variability(dmatrix, Ic)
Computes (i) within group variability, (ii) between group variability, (iii) total variability, and (iv) the distance based F statistic (DBF).
Parameters: - dmatrix – N by N array of distances
- Ic – N by G array group membership
Returns: - fstat – DBF distance based F statistic, between vs. within group variability
- trW – Within group variability
- trB – Between group variability
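The quantities above can be illustrated with a common construction for distance-based statistics: Gower-center the squared distances, project onto the group indicators, and take traces. The sketch below follows that construction; `distance_variability_sketch` is a hypothetical name, and whether cutnorm squares the distances or applies any extra scaling is an assumption not confirmed by this page.

```python
import numpy as np

def distance_variability_sketch(dmatrix, Ic):
    """Hypothetical stand-in for illustration; not the cutnorm implementation."""
    n = dmatrix.shape[0]
    # Gower-centered inner-product matrix built from squared distances
    centering = np.eye(n) - np.ones((n, n)) / n
    gower = -0.5 * centering @ (dmatrix ** 2) @ centering
    # Projection onto the column space of the N by G group indicator matrix
    hat = Ic @ np.linalg.inv(Ic.T @ Ic) @ Ic.T
    trB = np.trace(hat @ gower)                # between group variability
    trW = np.trace((np.eye(n) - hat) @ gower)  # within group variability
    return trB / trW, trW, trB

# Four points on a line forming two tight, well separated groups
x = np.array([0.0, 1.0, 10.0, 11.0])
dmatrix = np.abs(x[:, None] - x[None, :])
labels = np.array([0, 0, 1, 1])
Ic = np.eye(2)[labels]  # N by G indicator matrix built from the labels
fstat, trW, trB = distance_variability_sketch(dmatrix, Ic)
# Here trB is about 100 and trW about 1, so fstat is about 100
```

In this sketch the total variability is trW + trB, the trace of the centered matrix.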
-
cutnorm.tools.dbf_testing.distribution_parameters(dmatrix, Ic)
mc, vc, gc = distribution_parameters(dmatrix, Ic)
Computes the (Pearson type III) null distribution parameters.
Parameters: - dmatrix – N by N array of distances
- Ic – N by G array group membership
Returns: - mc – mean
- vc – variance
- gc – skewness
-
cutnorm.tools.dbf_testing.inv_f_fn(mc, vc, gc, fstat, trT)
inverse fn
Parameters: - mc – mean
- vc – variance
- gc – skewness
- fstat – DBF distance based F statistic, between vs. within group variability
- trT – total group variability for DBF statistic
Returns: inv fn val