cutnorm.tools package

Submodules

cutnorm.tools.sbm module

cutnorm.tools.sbm.erdos_renyi(n, p)

Generates an Erdős-Rényi random graph of size n with edge probability p

Parameters:
  • n – int, size of the output matrix
  • p – float, edge probability
Returns:

Erdős-Rényi random graph matrix, 2d array, shape (n,n)
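
As a rough illustration of the description above, the following NumPy sketch draws each entry independently as 1 with probability p. The name erdos_renyi_sketch and the seed argument are hypothetical. Whether the library symmetrizes the result or clears the diagonal is not stated here, so this is not its implementation.

```python
import numpy as np

def erdos_renyi_sketch(n, p, seed=None):
    """Illustrative only: n x n 0/1 matrix with independent Bernoulli(p) entries."""
    rng = np.random.default_rng(seed)
    # rng.random draws uniform values in [0, 1); an entry is 1 when its draw falls below p.
    return (rng.random((n, n)) < p).astype(int)

adj = erdos_renyi_sketch(5, 0.3, seed=0)
print(adj.shape)  # (5, 5)
```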

cutnorm.tools.sbm.make_symmetric_triu(mat)

Makes the matrix symmetric using its upper triangle

Parameters:
  • mat – 2d array, shape (n,n)
Returns:

upper triangular symmetric matrix of the input 2d array, shape (n,n)

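
One plausible reading of this function is "keep the upper triangle and mirror it below the diagonal". The sketch below shows that reading with NumPy. The name symmetric_from_triu_sketch is hypothetical, and the library's exact behavior is an assumption.

```python
import numpy as np

def symmetric_from_triu_sketch(mat):
    """Illustrative only: mirror the upper triangle of mat onto the lower triangle."""
    mat = np.asarray(mat)
    upper = np.triu(mat)                # upper triangle, diagonal included
    return upper + np.triu(mat, k=1).T  # add the strict upper triangle, transposed

mat = np.array([[1, 2, 3],
                [4, 5, 6],
                [7, 8, 9]])
sym = symmetric_from_triu_sketch(mat)
# sym is [[1, 2, 3], [2, 5, 6], [3, 6, 9]]
```
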
cutnorm.tools.sbm.sbm(community_sizes, prob_mat)

Generates a stochastic block matrix

community_sizes gives the size of each community, and prob_mat gives the probability that a 1 is generated for each element in the corresponding block.

Parameters:
  • community_sizes – 1d array, shape (n), size of each community
  • prob_mat – 2d array, shape (n,n), edge probability for each pair of communities
Returns:

stochastic block matrix, 2d array, shape depending on community sizes
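
The sampling described above can be sketched in NumPy as follows: label every node with its community, look up the block probability for each pair of nodes, then draw each entry independently. The name sbm_sketch and the seed argument are hypothetical, and the library may differ in details such as symmetry.

```python
import numpy as np

def sbm_sketch(community_sizes, prob_mat, seed=None):
    """Illustrative only: sample a 0/1 matrix from block-wise edge probabilities."""
    rng = np.random.default_rng(seed)
    sizes = np.asarray(community_sizes)
    probs = np.asarray(prob_mat, dtype=float)
    # Community index of every node, e.g. sizes [2, 3] -> [0, 0, 1, 1, 1].
    labels = np.repeat(np.arange(len(sizes)), sizes)
    # Probability for every pair of nodes, looked up from their communities.
    node_probs = probs[np.ix_(labels, labels)]
    return (rng.random(node_probs.shape) < node_probs).astype(int)

sample = sbm_sketch([2, 3], [[0.8, 0.1], [0.1, 0.6]], seed=0)
print(sample.shape)  # (5, 5)
```

The output side length is the sum of community_sizes, which is why the shape depends on the community sizes.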

cutnorm.tools.sbm.sbm_autoregressive(community_sizes, prob_list)

Generates an autoregressive SBM

An autoregressive SBM has edge probability prob_list[i] on the i-th diagonal block and (prob_list[i] * prob_list[j])**(abs(i - j)) on the off-diagonal block (i, j).

This idea is similar to autoregressive models.

Parameters:
  • community_sizes – 1d array, shape (n), size of each community
  • prob_list – 1d array, shape (n), where n is the number of diagonal blocks
Returns:

An autoregressive SBM, 2d array, shape depending on community sizes

cutnorm.tools.sbm.sbm_autoregressive_prob(community_sizes, prob_list)

Generates the underlying probability matrix that gives rise to the autoregressive SBM

Parameters:
  • community_sizes – 1d array, shape (n), size of each community
  • prob_list – 1d array, shape (n), where n is the number of diagonal blocks
Returns:

A probability matrix for an autoregressive SBM, 2d array, shape depending on community sizes
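
The rule stated for sbm_autoregressive (prob_list[i] on diagonal blocks, (prob_list[i] * prob_list[j])**(abs(i - j)) elsewhere) can be sketched as below. The name autoregressive_prob_sketch is hypothetical. This follows the documented formula and is not taken from the library source.

```python
import numpy as np

def autoregressive_prob_sketch(community_sizes, prob_list):
    """Illustrative only: node-level probability matrix for the documented rule."""
    p = np.asarray(prob_list, dtype=float)
    idx = np.arange(len(p))
    lag = np.abs(idx[:, None] - idx[None, :])  # abs(i - j) for every block pair
    block = np.outer(p, p) ** lag              # (p_i * p_j) ** abs(i - j)
    np.fill_diagonal(block, p)                 # diagonal blocks use prob_list[i] directly
    # Expand each block entry to the size of its communities.
    expanded = np.repeat(block, community_sizes, axis=0)
    return np.repeat(expanded, community_sizes, axis=1)

prob = autoregressive_prob_sketch([1, 1, 1], [0.5, 0.4, 0.2])
# prob[0, 1] = (0.5 * 0.4) ** 1 = 0.2 and prob[0, 2] = (0.5 * 0.2) ** 2 = 0.01
```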

cutnorm.tools.sbm.sbm_prob(community_sizes, prob_mat)

Generates a matrix indicating the underlying probability that gives rise to a stochastic block matrix

Parameters:
  • community_sizes – 1d array, shape (n), size of each community
  • prob_mat – 2d array, shape (n,n), edge probability for each pair of communities
Returns:

probabilities of a stochastic block matrix, 2d array, shape depending on community sizes
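
A minimal sketch of how a block-level prob_mat can be expanded to a node-level probability matrix, assuming each block entry is simply repeated according to the community sizes. The name sbm_prob_sketch is hypothetical.

```python
import numpy as np

def sbm_prob_sketch(community_sizes, prob_mat):
    """Illustrative only: repeat block probabilities to node resolution."""
    probs = np.asarray(prob_mat, dtype=float)
    rows = np.repeat(probs, community_sizes, axis=0)  # repeat each block row
    return np.repeat(rows, community_sizes, axis=1)   # then each block column

node_probs = sbm_prob_sketch([1, 2], [[0.9, 0.1], [0.1, 0.8]])
# node_probs is
# [[0.9, 0.1, 0.1],
#  [0.1, 0.8, 0.8],
#  [0.1, 0.8, 0.8]]
```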

cutnorm.tools.distort module

cutnorm.tools.distort.add_gaussian_noise(mat, mean, std)

Adds Gaussian noise to the matrix

Parameters:
  • mat – 2d array, shape (n,n)
  • mean – mean of the Gaussian noise
  • std – standard deviation of the Gaussian noise
Returns:

Processed matrix
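
A minimal sketch of additive Gaussian noise, assuming independent noise is added to every entry. The name add_gaussian_noise_sketch and the seed argument are hypothetical. Whether the library clips or symmetrizes the result is not stated here.

```python
import numpy as np

def add_gaussian_noise_sketch(mat, mean, std, seed=None):
    """Illustrative only: add i.i.d. N(mean, std**2) noise to each entry."""
    rng = np.random.default_rng(seed)
    mat = np.asarray(mat, dtype=float)
    return mat + rng.normal(loc=mean, scale=std, size=mat.shape)

noisy = add_gaussian_noise_sketch(np.zeros((4, 4)), mean=0.0, std=0.1, seed=0)
print(noisy.shape)  # (4, 4)
```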

cutnorm.tools.distort.shift(mat, n_shift)

Shifts the matrix by rolling it along the diagonal

Parameters:
  • mat – 2d array, shape (n,n)
  • n_shift – number of positions to roll
Returns:

Shifted matrix
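
One way to read "rolling along the diagonal" is a circular shift of rows and columns by the same amount, so diagonal entries stay on the diagonal. The sketch below shows that reading with np.roll. The name shift_sketch is hypothetical, and the library's exact behavior is an assumption.

```python
import numpy as np

def shift_sketch(mat, n_shift):
    """Illustrative only: circularly shift rows and columns by n_shift positions."""
    return np.roll(np.asarray(mat), shift=(n_shift, n_shift), axis=(0, 1))

mat = np.arange(9).reshape(3, 3)
shifted = shift_sketch(mat, 1)
# shifted is
# [[8, 6, 7],
#  [2, 0, 1],
#  [5, 3, 4]]
```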

cutnorm.tools.dbf_testing module

Implementation of DBF statistic

from “Distance-based analysis of variance: approximate inference and an application to genome-wide association studies”, Christopher Minas and Giovanni Montana

cutnorm.tools.dbf_testing.dbf_pvalue(mc, vc, gc, fstat, trW, trB)

pval = pearson_three(mc, vc, gc, fstat)

Computes the one-sided p-value from the standardized Pearson type III distribution

Parameters:
  • mc – mean
  • vc – variance
  • gc – skewness
  • fstat – DBF distance based F statistic, between vs. within group variability
  • trW – within group variability for DBF statistic
  • trB – between group variability for DBF statistic
Returns:

pval = one sided p value

cutnorm.tools.dbf_testing.dbf_test(dmatrix, labels)

pval, fstat, Bvar, Wvar = dbf_test(dmatrix, labels)

Runs the DBF test

Parameters:
  • dmatrix – N by N array of distances
  • labels – N array group membership
Returns:
  • pval – one sided p value
  • fstat – DBF distance based F statistic, between vs. within group variability
  • Bvar – between vs overall group variability
  • Wvar – within vs overall group variability
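
dbf_test takes a length-N labels array, while the functions below take an N by G membership array Ic. The sketch here shows one common way to turn labels into a one-hot indicator matrix. The name labels_to_indicator_sketch is hypothetical, and whether dbf_test builds Ic exactly this way is an assumption.

```python
import numpy as np

def labels_to_indicator_sketch(labels):
    """Illustrative only: N labels -> N x G one-hot group membership matrix."""
    labels = np.asarray(labels).ravel()
    # inverse[k] is the column (group index) of sample k.
    groups, inverse = np.unique(labels, return_inverse=True)
    indicator = np.zeros((labels.size, groups.size), dtype=int)
    indicator[np.arange(labels.size), inverse] = 1
    return indicator

Ic = labels_to_indicator_sketch([0, 0, 1, 2, 1])
print(Ic.shape)  # (5, 3)
```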

cutnorm.tools.dbf_testing.distance_variability(dmatrix, Ic)

fstat, trW, trB = distance_variability(dmatrix, Ic)

Computes (i) within group variability, (ii) between group variability, (iii) total variability, and (iv) the distance based F statistic (DBF)

Parameters:
  • dmatrix – N by N array of distances
  • Ic – N by G array group membership
Returns:
  • fstat – DBF distance based F statistic, between vs. within group variability
  • trW – within group variability
  • trB – between group variability
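
For orientation, a between/within decomposition of this kind can be sketched from a distance matrix as below, using a Gower-centered matrix of squared distances and the projection onto the group indicators. The name dbf_statistic_sketch is hypothetical. Squaring the distances and using an unscaled ratio (no degrees-of-freedom factors) are assumptions, so this may differ from the library's computation.

```python
import numpy as np

def dbf_statistic_sketch(dmatrix, Ic):
    """Illustrative only: between/within variability from pairwise distances."""
    d = np.asarray(dmatrix, dtype=float)
    ic = np.asarray(Ic, dtype=float)
    n = d.shape[0]
    centering = np.eye(n) - np.ones((n, n)) / n
    gower = -0.5 * centering @ (d ** 2) @ centering  # centered inner-product matrix
    hat = ic @ np.linalg.inv(ic.T @ ic) @ ic.T       # projection onto group indicators
    tr_between = np.trace(hat @ gower)
    tr_within = np.trace((np.eye(n) - hat) @ gower)
    return tr_between / tr_within, tr_within, tr_between

# Two tight groups on a line: points 0, 1 and 10, 11.
x = np.array([0.0, 1.0, 10.0, 11.0])
dmatrix = np.abs(x[:, None] - x[None, :])
Ic = np.array([[1, 0], [1, 0], [0, 1], [0, 1]])
fstat, trW, trB = dbf_statistic_sketch(dmatrix, Ic)
```

In this toy case the within group sum of squares is 1 and the between group sum of squares is 100, so well separated groups give a large ratio.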

cutnorm.tools.dbf_testing.distribution_parameters(dmatrix, Ic)

mc, vc, gc = distribution_parameters(dmatrix, Ic)

Computes the (Pearson type III) null distribution parameters

Parameters:
  • dmatrix – N by N array of distances
  • Ic – N by G array group membership
Returns:
  • mc – mean
  • vc – variance
  • gc – skewness

cutnorm.tools.dbf_testing.inv_f_fn(mc, vc, gc, fstat, trT)

Inverse function

Parameters:
  • mc – mean
  • vc – variance
  • gc – skewness
  • fstat – DBF distance based F statistic, between vs. within group variability
  • trT – total group variability for DBF statistic
Returns:

value of the inverse function

Module contents