
Module: neurospin.eda.dimension_reduction

Inheritance diagram for nipy.neurospin.eda.dimension_reduction (diagram not reproduced here).

This module contains several classes to perform non-linear dimension reduction. Each class has two methods, 'train' and 'test':

- 'train' computes the low-dimensional embedding of the training data, together with the information needed to generalize to new data
- 'test' computes the embedding for new data samples

This is done for:

- multi-dimensional scaling (MDS)
- Isomap (knn or eps-neighborhood implementation)
- locality preserving projections (LPP)
- Laplacian embedding (train only)

Future developments will include some supervised cases, e.g. LDA and LDE, and the estimation of the latent dimension, at least in simple cases.

Classes

MDS

class nipy.neurospin.eda.dimension_reduction.MDS(X=None, rdim=1, fdim=1)

Bases: nipy.neurospin.eda.dimension_reduction.NLDR

This class performs linear dimension reduction using multi-dimensional scaling. Besides the fields of NLDR, it contains the following ones:

- trained: trained==1 means that the system has been trained and can generalize
- embedding: array of shape (nbitems, rdim), the representation of the training data
- offset: array of shape (nbitems), affine part of the embedding
- projector: array of shape (fdim, rdim), linear part of the embedding

__init__(X=None, rdim=1, fdim=1)
test(X)
chart = MDS.test(X, verbose=0)
INPUT:
- X: array of shape (nbitems, fdim), new data points to be embedded
- verbose=0: verbosity mode
OUTPUT:
- chart: resulting rdim-dimensional representation
train(verbose=0)
chart = MDS.train(verbose=0)
INPUT:
- verbose=0: verbosity mode
OUTPUT:
- chart: resulting rdim-dimensional representation
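
A minimal usage sketch based on the signatures documented above; the random data and the dimension choices are illustrative assumptions:

>>> import numpy as np
>>> from nipy.neurospin.eda.dimension_reduction import MDS
>>> X = np.random.randn(100, 10)                # 100 items, 10 features
>>> m = MDS(X, rdim=2, fdim=10)                 # fdim matches the feature dimension
>>> chart = m.train()                           # embedding of the training data
>>> new_chart = m.test(np.random.randn(5, 10))  # generalize to new samples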

NLDR

class nipy.neurospin.eda.dimension_reduction.NLDR(X=None, rdim=1, fdim=1)

This is a generic class for dimension reduction techniques. The main fields are:

- train_data: the input dataset from which the dimension reduction is performed
- fdim=1: dimension of the input (feature) space
- rdim=1: dimension of the reduced representation

__init__(X=None, rdim=1, fdim=1)
check_data(X)
set_train_data(X)
test(X)
train()

eps_Isomap

class nipy.neurospin.eda.dimension_reduction.eps_Isomap(X=None, rdim=1, fdim=1)

Bases: nipy.neurospin.eda.dimension_reduction.NLDR

This class performs non-linear dimension reduction using eps-ball neighborhood modelling and isomapping. Besides the fields of NLDR, it contains the following ones:

- eps: radius of the eps-ball model used in the neighborhood graph building
- G: resulting graph based on the training data
- trained: trained==1 means that the system has been trained and can generalize
- embedding: array of shape (nbitems, rdim), the representation of the training data
- offset: array of shape (nbitems), affine part of the embedding
- projector: array of shape (fdim, rdim), linear part of the embedding

__init__(X=None, rdim=1, fdim=1)
test(X)
chart = eps_Isomap.test(X, verbose=0)
INPUT:
- X: array of shape (nbitems, fdim), new data points to be embedded
- verbose=0: verbosity mode
OUTPUT:
- chart: resulting rdim-dimensional representation
train(eps=1.0, p=300, verbose=0)
chart = eps_Isomap.train(eps=1.0, p=300, verbose=0)
INPUT:
- eps=1.0: sets self.eps, the ball radius used in the graph building
- p=300: number of points used in the low-dimensional approximation
- verbose=0: verbosity mode
OUTPUT:
- chart = eps_Isomap.embedding

knn_Isomap

class nipy.neurospin.eda.dimension_reduction.knn_Isomap(X=None, rdim=1, fdim=1)

Bases: nipy.neurospin.eda.dimension_reduction.NLDR

This class performs non-linear dimension reduction using k nearest neighbor modelling and isomapping. Besides the fields of NLDR, it contains the following ones:

- k: number of neighbors in the knn graph building
- G: resulting graph based on the training data
- trained: trained==1 means that the system has been trained and can generalize
- embedding: array of shape (nbitems, rdim), the representation of the training data
- offset: array of shape (nbitems), affine part of the embedding
- projector: array of shape (fdim, rdim), linear part of the embedding

__init__(X=None, rdim=1, fdim=1)
test(X)
chart = knn_Isomap.test(X, verbose=0)
INPUT:
- X: array of shape (nbitems, fdim), new data points to be embedded
- verbose=0: verbosity mode
OUTPUT:
- chart: resulting rdim-dimensional representation
train(k=1, p=300, verbose=0)
chart = knn_Isomap.train(k=1, p=300, verbose=0)
INPUT:
- k=1: number of neighbors in the knn graph building
- p=300: number of points used in the low-dimensional approximation
- verbose=0: verbosity mode
OUTPUT:
- chart = knn_Isomap.embedding
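
A minimal usage sketch based on the signatures above; the data, k, and p values are illustrative assumptions. eps_Isomap is used the same way, with train(eps=...) instead of train(k=...):

>>> import numpy as np
>>> from nipy.neurospin.eda.dimension_reduction import knn_Isomap
>>> X = np.random.randn(300, 3)                 # e.g. points near a 2D manifold
>>> im = knn_Isomap(X, rdim=2, fdim=3)
>>> chart = im.train(k=6, p=100)                # knn graph, then isomapping
>>> new_chart = im.test(np.random.randn(5, 3))  # out-of-sample embedding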

knn_LE

class nipy.neurospin.eda.dimension_reduction.knn_LE(X=None, rdim=1, fdim=1)

Bases: nipy.neurospin.eda.dimension_reduction.NLDR

This class performs non-linear dimension reduction using k nearest neighbor modelling and Laplacian embedding. Besides the fields of NLDR, it contains the following ones:

- k: number of neighbors in the knn graph building
- G: resulting graph based on the training data
- trained: trained==1 means that the system has been trained and can generalize
- embedding: array of shape (nbitems, rdim), the representation of the training data

NB: to date, only the training part (embedding computation) is implemented.

__init__(X=None, rdim=1, fdim=1)
train(k=1, verbose=0)
chart = knn_LE.train(k=1, verbose=0)
INPUT:
- k=1: number of neighbors in the knn graph building
- verbose=0: verbosity mode
OUTPUT:
- chart = knn_LE.embedding

knn_LPP

class nipy.neurospin.eda.dimension_reduction.knn_LPP(X=None, rdim=1, fdim=1)

Bases: nipy.neurospin.eda.dimension_reduction.NLDR

This class performs linear dimension reduction using k nearest neighbor modelling and locality preserving projection (LPP). Besides the fields of NLDR, it contains the following ones:

- k: number of neighbors in the knn graph building
- G: resulting graph based on the training data
- trained: trained==1 means that the system has been trained and can generalize
- embedding: array of shape (nbitems, rdim), the representation of the training data
- projector: array of shape (fdim, rdim), linear part of the embedding

__init__(X=None, rdim=1, fdim=1)
test(X)
chart = knn_LPP.test(X, verbose=0)
INPUT:
- X: array of shape (nbitems, fdim), new data points to be embedded
- verbose=0: verbosity mode
OUTPUT:
- chart: resulting rdim-dimensional representation
train(k=1, verbose=0)
chart = knn_LPP.train(k=1, verbose=0)
INPUT:
- k=1: number of neighbors in the knn graph building
- verbose=0: verbosity mode
OUTPUT:
- chart = knn_LPP.embedding
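
A minimal usage sketch based on the signatures above (the data and k are illustrative assumptions); since LPP is linear, test() amounts to applying the learned projector to new samples:

>>> import numpy as np
>>> from nipy.neurospin.eda.dimension_reduction import knn_LPP
>>> X = np.random.randn(150, 4)
>>> lpp = knn_LPP(X, rdim=2, fdim=4)
>>> chart = lpp.train(k=8)
>>> new_chart = lpp.test(np.random.randn(5, 4))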

Functions

nipy.neurospin.eda.dimension_reduction.CCA(X, Y, eps=1e-12)
Canonical correlation analysis of two matrices.
INPUT:
- X and Y: (nbitem, p) and (nbitem, q) arrays that are analysed
- eps=1.e-12: a small regularizing constant that grants invertibility of the covariance matrices
OUTPUT:
- ccs: the canonical correlations
NOTE:
- It is expected that nbitem >> max(p, q).
- In general it makes more sense if p = q.
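
For reference, a textbook NumPy reconstruction of the canonical correlations; this is a sketch under the stated definition, not necessarily identical to the library's implementation (e.g. regarding centering):

    import numpy as np

    def cca_sketch(X, Y, eps=1e-12):
        # center both blocks and form (regularized) covariance matrices
        X = X - X.mean(0)
        Y = Y - Y.mean(0)
        n = X.shape[0]
        Sxx = X.T @ X / n + eps * np.eye(X.shape[1])
        Syy = Y.T @ Y / n + eps * np.eye(Y.shape[1])
        Sxy = X.T @ Y / n
        # whiten each block via its Cholesky factor; the singular values
        # of the whitened cross-covariance are the canonical correlations
        Lx = np.linalg.cholesky(Sxx)
        Ly = np.linalg.cholesky(Syy)
        M = np.linalg.inv(Lx) @ Sxy @ np.linalg.inv(Ly).T
        return np.linalg.svd(M, compute_uv=False)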
nipy.neurospin.eda.dimension_reduction.Euclidian_distance(X, Y=None)
Considering the rows of X (and Y=X by default) as vectors, compute the distance matrix between each pair of vectors.
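
The same matrix can be reproduced with NumPy alone (a sketch, not the library's code):

    import numpy as np

    def euclidean_distance(X, Y=None):
        # pairwise Euclidean distances between the rows of X and the rows of Y
        if Y is None:
            Y = X
        sq = (X ** 2).sum(1)[:, None] + (Y ** 2).sum(1)[None, :] - 2 * X @ Y.T
        return np.sqrt(np.maximum(sq, 0))  # clip tiny negatives due to round-off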
nipy.neurospin.eda.dimension_reduction.Euclidian_mds(X, dim, verbose=0)
Returns a dim-dimensional MDS representation of the rows of X, using a Euclidean metric.
nipy.neurospin.eda.dimension_reduction.LE(G, dim, verbose=0, maxiter=1000)
Laplacian embedding of the data: returns the dim-dimensional LE of the graph G.
chart = LE(G, dim, verbose=0, maxiter=1000)
INPUT:
- G: weighted graph that represents the data
- dim=1: number of dimensions
- verbose=0: verbosity level
- maxiter=1000: maximum number of iterations of the algorithm
OUTPUT:
- chart: array of shape (G.V, dim)
NOTE: in fact the current implementation returns what is now referred to as a diffusion map at time t=1.
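
A self-contained NumPy sketch of that diffusion-map computation; the graph object G of the actual API is replaced here by a plain symmetric weight matrix W, which is an assumption made for illustration:

    import numpy as np

    def diffusion_map(W, dim=2):
        # W: symmetric (n, n) nonnegative weight matrix of a connected graph
        d = W.sum(1)
        dis = 1.0 / np.sqrt(d)
        # symmetrically normalized affinity, same spectrum as D^-1 W
        S = dis[:, None] * W * dis[None, :]
        vals, vecs = np.linalg.eigh(S)
        order = np.argsort(-vals)              # largest eigenvalues first
        vals, vecs = vals[order], vecs[:, order]
        # right eigenvectors of D^-1 W; drop the constant one, and
        # weight by the eigenvalues to get the diffusion map at t=1
        psi = dis[:, None] * vecs
        return psi[:, 1:dim + 1] * vals[1:dim + 1]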
nipy.neurospin.eda.dimension_reduction.LE_dev(G, dim, verbose=0, maxiter=1000)
Laplacian embedding of the data: returns the dim-dimensional LE of the graph G.
INPUT:
- G: weighted graph that represents the data
- dim=1: number of dimensions
- verbose=0: verbosity level
- maxiter=1000: maximum number of iterations of the algorithm
OUTPUT:
- chart: array of shape (G.V, dim)
nipy.neurospin.eda.dimension_reduction.LPP(G, X, dim, verbose=0, maxiter=1000)
Locality preserving projection of the data.
INPUT:
- G: weighted graph that represents the data
- X: related input dataset
- dim=1: number of dimensions
- verbose=0: verbosity level
- maxiter=1000: maximum number of iterations of the algorithm
OUTPUT:
- proj: array of shape (X.shape[1], dim)
nipy.neurospin.eda.dimension_reduction.Orthonormalize(M)
Orthonormalize the columns of M (Gram-Schmidt procedure).
nipy.neurospin.eda.dimension_reduction.check_isometry(G, chart, nseeds=100, verbose=0)
A simple check of the isometry: looks at whether the output distances match the input distances for nseeds points.
OUTPUT:
- a proportion factor to optimize the metric
nipy.neurospin.eda.dimension_reduction.infer_latent_dim(X, verbose=0, maxr=-1)
r = infer_latent_dim(X, verbose=0)
Infer the latent dimension of an array, assuming a data + Gaussian noise mixture.
INPUT:
- X: an array
- verbose=0: verbosity level
- maxr=-1: maximum dimension that can be achieved; if maxr == -1, this is equal to rank(X)
OUTPUT:
- r: the inferred dimension
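
A usage sketch on synthetic data (the construction and the expected outcome are illustrative assumptions, not guaranteed output):

>>> import numpy as np
>>> from nipy.neurospin.eda.dimension_reduction import infer_latent_dim
>>> X = np.random.randn(100, 3) @ np.random.randn(3, 20)  # rank-3 signal
>>> X += 0.01 * np.random.randn(100, 20)                  # plus Gaussian noise
>>> r = infer_latent_dim(X)                               # should be close to 3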
nipy.neurospin.eda.dimension_reduction.isomap(G, dim=1, p=300, verbose=0)
Isomapping of the data: returns the dim-dimensional ISOMAP chart that best represents the graph G.
INPUT:
- G: weighted graph that represents the data
- dim=1: number of dimensions
- p=300: Nyström reduction of the problem
- verbose=0: verbosity level
OUTPUT:
- chart: array of shape (G.V, dim)
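
For intuition, a sketch of the textbook Isomap pipeline (knn graph, geodesic distances, classical MDS) built directly from a data array; it ignores the Nyström reduction controlled by p, and all names here are illustrative:

    import numpy as np
    from scipy.sparse.csgraph import shortest_path
    from scipy.spatial.distance import cdist

    def isomap_sketch(X, k=6, dim=2):
        # knn graph on the rows of X (assumed to yield a connected graph)
        D = cdist(X, X)
        n = len(X)
        W = np.full((n, n), np.inf)
        idx = np.argsort(D, 1)[:, 1:k + 1]
        rows = np.repeat(np.arange(n), k)
        W[rows, idx.ravel()] = D[rows, idx.ravel()]
        W = np.minimum(W, W.T)              # symmetrize the graph
        geo = shortest_path(W, method='D')  # geodesic distances (Dijkstra)
        # classical MDS on the geodesic distance matrix (see mds below)
        J = np.eye(n) - np.ones((n, n)) / n
        B = -0.5 * J @ (geo ** 2) @ J
        vals, vecs = np.linalg.eigh(B)
        order = np.argsort(-vals)[:dim]
        return vecs[:, order] * np.sqrt(np.maximum(vals[order], 0))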
nipy.neurospin.eda.dimension_reduction.isomap_dev(G, dim=1, p=300, verbose=0)
chart, proj, offset = isomap_dev(G, dim=1, p=300, verbose=0)
Isomapping of the data: returns the dim-dimensional ISOMAP chart that best represents the graph G.
INPUT:
- G: weighted graph that represents the data
- dim=1: number of dimensions
- p=300: Nyström reduction of the problem
- verbose=0: verbosity level
OUTPUT:
- chart: array of shape (G.V, dim)
NOTE: this 'dev' version is expected to yield more accurate results than the other approximation, because of a better out-of-samples generalization procedure.
nipy.neurospin.eda.dimension_reduction.local_correction_for_embedding(G, chart, sigma=1.0)
WIP: an unfinished function that aims at improving isomap's problems; the idea is to optimize the representation of local distances.
INPUT:
- G: the graph to be isomapped
- chart: the input chart
- sigma: a scale parameter
OUTPUT:
- chart: the corrected chart
nipy.neurospin.eda.dimension_reduction.local_sym_normalize(G)
Graph symmetric normalization; moreover, the normalizing vector is returned.
NB: this is now in the C graph lib.
nipy.neurospin.eda.dimension_reduction.mds(dg, dim=1, verbose=0)
Multi-dimensional scaling, i.e. derivation of low-dimensional representations from distance matrices.
INPUT:
- dg: a (nbitem, nbitem) distance matrix
- dim=1: the dimension of the desired representation
- verbose=0: verbosity level
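
The classical MDS computation behind this function can be sketched in a few lines of NumPy (double-centering of the squared distances, then an eigendecomposition); this is the standard algorithm, not necessarily the library's exact code:

    import numpy as np

    def mds_sketch(dg, dim=1):
        # dg: (nbitem, nbitem) distance matrix
        n = dg.shape[0]
        J = np.eye(n) - np.ones((n, n)) / n  # centering matrix
        B = -0.5 * J @ (dg ** 2) @ J         # double-centered Gram matrix
        vals, vecs = np.linalg.eigh(B)
        order = np.argsort(-vals)[:dim]      # keep the top eigenvalues
        return vecs[:, order] * np.sqrt(np.maximum(vals[order], 0))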
nipy.neurospin.eda.dimension_reduction.partial_floyd_graph(G, k)
Create a graph of the knn in the geodesic sense, given an input graph G.
nipy.neurospin.eda.dimension_reduction.sparse_local_correction_for_embedding(G, chart, sigma=1.0, niter=100)
WIP: an unfinished function that aims at improving isomap's problems; the idea is to optimize the representation of local distances.
INPUT:
- G: the graph to be isomapped
- chart: the input chart
- sigma: a scale parameter
OUTPUT:
- chart: the corrected chart
NOTE: the graph G is reordered