Title: | Diversity Measures on Multipartite Graphs |
---|---|
Description: | Computing diversity measures on multipartite graphs. This package first implements a parametrized family of such diversity measures which apply on probability distributions. Sometimes called "True Diversity", this family contains famous measures such as the richness, the Shannon entropy, the Herfindahl-Hirschman index, and the Berger-Parker index. Second, the package allows to apply these measures on probability distributions resulting from random walks between the parts of multipartite graphs. By defining an initial distribution on a given part of the graph and a path to follow between the different parts, the probability of the walker's position within the final part is then computed, thus providing a particular instance of diversity to measure. |
Authors: | Robin Lamarche-Perrin [aut, cre] |
Maintainer: | Robin Lamarche-Perrin <[email protected]> |
License: | GPL-3 | file LICENSE |
Version: | 2.0 |
Built: | 2024-11-05 02:55:54 UTC |
Source: | https://github.com/lamarche-perrin/triversity |
get_all_diversities
builds on get_diversity_from_path
to compute
the diversity values associated to all possible paths of a given length
between
the different parts of the input multipartite graph
.
get_all_diversities(graph, length = 2, cycles = TRUE, type = c("mean", "collective"), order = NULL, measure = NULL)
get_all_diversities(graph, length = 2, cycles = TRUE, type = c("mean", "collective"), order = NULL, measure = NULL)
graph |
A multipartite graph obtained by calling the |
length |
A positive integer giving the maximal length of the computed paths. |
cycles |
A boolean value indicating if the diversity associated to paths containing cycles should be computed. |
type |
A vector specifying the types of diversity to compute, among |
order |
A vector of positive floats (possibly including |
measure |
A vector of strings giving the names of the diversity measures to compute.
Possible values are |
A dataframe summarising all the computed diversities.
data (tripartite_example) graph <- get_multipartite (data=tripartite_example) get_all_diversities (graph)
data (tripartite_example) graph <- get_multipartite (data=tripartite_example) get_all_diversities (graph)
get_distribution_from_path
computes the probability distribution of a random walk
following a given path
between the different parts of the input multipartite
graph
. It starts at a given part with an initial probability distribution, then
randomly follows the links of the graph between the different parts according to the
input path
, then stops at the last specified part.
get_distribution_from_path(graph, path, initial_distribution = NULL, initial_node = NULL)
get_distribution_from_path(graph, path, initial_distribution = NULL, initial_node = NULL)
graph |
A multipartite graph obtained by calling the |
path |
A vector of character strings giving the path that the random walk should
follow between the different parts of the input multipartite |
initial_distribution |
(optional) A vector of floats in [ |
initial_node |
(optional) A character string giving the label of a node in the first
part of the input |
A vector of floats in [0
,1
] and summing to 1
giving the
probability distribution of the random walk when arriving at the last part, after having
followed the input path
within the different parts of the graph.
data (tripartite_example) graph <- get_multipartite (data=tripartite_example) path <- c(2,1,2,3) as.matrix (get_distribution_from_path (graph, path)) as.matrix (get_distribution_from_path (graph, path, initial_distribution=c(1/3,0,0,2/3))) as.matrix (get_distribution_from_path (graph, path, initial_node='i2'))
data (tripartite_example) graph <- get_multipartite (data=tripartite_example) path <- c(2,1,2,3) as.matrix (get_distribution_from_path (graph, path)) as.matrix (get_distribution_from_path (graph, path, initial_distribution=c(1/3,0,0,2/3))) as.matrix (get_distribution_from_path (graph, path, initial_node='i2'))
get_diversity_from_distribution
computes diversity values associated to an input
probability distribution
. The implemented diversity measures all belong to the
parametrized family of "True Diversity" measures. They can either be specified by their
diversity order
in [0
,Inf
[ or by their measure
name when it
corresponds to classical instances such as the richness, the Shannon entropy, the
Herfindahl-Hirschman index, or the Berger-Parker index.
get_diversity_from_distribution(distribution, order = NULL, measure = NULL)
get_diversity_from_distribution(distribution, order = NULL, measure = NULL)
distribution |
A vector of floats in [ |
order |
A vector of positive floats (possibly including |
measure |
A vector of strings giving the names of the diversity measures to compute.
Possible values are |
A vector of positive floats giving the diversity values of the input probability distribution.
distribution <- c (1/4, 1/2, 1/12, 2/12) get_diversity_from_distribution (distribution) get_diversity_from_distribution (distribution, order=c(0,Inf), measure='entropy')
distribution <- c (1/4, 1/2, 1/12, 2/12) get_diversity_from_distribution (distribution) get_diversity_from_distribution (distribution, order=c(0,Inf), measure='entropy')
get_diversity_from_path
computes diversity values of the probability distribution
of a random walk following a path
between the different parts of the input
multipartite graph
. It starts at a given part with an initial probability
distribution, then randomly follows the links of the graph between the different parts
according to the input path
, then stops at the last specified part. The
implemented diversity measures all belong to the parametrized family of "True Diversity"
measures. They can either be specified by their diversity order
in
[0
,Inf
[ or by their measure
name when it corresponds to classical
instances such as the richness, the Shannon entropy, the Herfindahl-Hirschman index,
or the Berger-Parker index.
get_diversity_from_path(graph, path, type = "individual", mean_distribution = NULL, initial_distribution = NULL, initial_node = NULL, order = NULL, measure = NULL)
get_diversity_from_path(graph, path, type = "individual", mean_distribution = NULL, initial_distribution = NULL, initial_node = NULL, order = NULL, measure = NULL)
graph |
A multipartite graph obtained by calling the |
path |
A vector of character strings giving the path that the random walk should
follow between the different parts of the input multipartite |
type |
Either |
mean_distribution |
(optional, only when |
initial_distribution |
(optional, only when |
initial_node |
(optional, only when |
order |
A vector of positive floats (possibly including |
measure |
A vector of strings giving the names of the diversity measures to compute.
Possible values are |
A matrix (or a vector) of positive floats giving the individual diversity values
of the random walks following the input path
(or their mean, or the collective
diversity).
data (tripartite_example) graph <- get_multipartite (data=tripartite_example) path <- c(1,2,3) get_diversity_from_path (graph, path, 'individual', measure=c('entropy','herfindahl')) get_diversity_from_path (graph, path, 'individual', order=c(0,1,Inf)) # Mean of individual diversities get_diversity_from_path (graph, path, 'mean') get_diversity_from_path (graph, path, 'mean', mean_distribution=c(1/3,2/3)) # Collective diversities get_diversity_from_path (graph, path, 'collective') get_diversity_from_path (graph, path, 'collective', initial_distribution=c(0.75,0.25))
data (tripartite_example) graph <- get_multipartite (data=tripartite_example) path <- c(1,2,3) get_diversity_from_path (graph, path, 'individual', measure=c('entropy','herfindahl')) get_diversity_from_path (graph, path, 'individual', order=c(0,1,Inf)) # Mean of individual diversities get_diversity_from_path (graph, path, 'mean') get_diversity_from_path (graph, path, 'mean', mean_distribution=c(1/3,2/3)) # Collective diversities get_diversity_from_path (graph, path, 'collective') get_diversity_from_path (graph, path, 'collective', initial_distribution=c(0.75,0.25))
get_diversity_from_transition
either computes the individual diversity values
associated to the rows of the input transition
matrix, or the geometric means of
these values, optionally weighting them according to the input mean_distribution
.
This hence allows to compute conditional diversity values associated to a transition
matrix.
get_diversity_from_transition(transition, type = "individual", mean_distribution = NULL, order = NULL, measure = NULL)
get_diversity_from_transition(transition, type = "individual", mean_distribution = NULL, order = NULL, measure = NULL)
transition |
A matrix of floats in [ |
type |
Either |
mean_distribution |
(optional, only when |
order |
A vector of positive floats (possibly including |
measure |
A vector of strings giving the names of the diversity measures to compute.
Possible values are |
A matrix (or a vector) of positive floats giving the individual diversity values
(or their geometric mean) of the row of the input transition
matrix.
transition <- matrix (c (1/3, 1/3, 1/3, 0.9, 0.1, 0), nrow=2, ncol=3, byrow=TRUE) get_diversity_from_transition (transition, type='individual', order=c(0,Inf), measure='entropy') get_diversity_from_transition (transition, type='mean') get_diversity_from_transition (transition, type='mean', mean_distribution=c(1/4,3/4))
transition <- matrix (c (1/3, 1/3, 1/3, 0.9, 0.1, 0), nrow=2, ncol=3, byrow=TRUE) get_diversity_from_transition (transition, type='individual', order=c(0,Inf), measure='entropy') get_diversity_from_transition (transition, type='mean') get_diversity_from_transition (transition, type='mean', mean_distribution=c(1/4,3/4))
get_multipartite
builds a properly-structured multipartite graph
from a file or from a data.frame
containing raw data. This object can then
be used by the other functions of this package. The structure
of the input data and of the resulting structure is detailed below.
get_multipartite(filename = NULL, data = NULL)
get_multipartite(filename = NULL, data = NULL)
filename |
A character string giving the path to the file containing the raw data. The input file should contain at least four columns, separated by spaces. Each row gives a link between two nodes belonging to two different parts of the multipartite graph. The first column gives the label of the part of the first node and the second column gives the label of this node (any character string). Similarly, the third and fourth columns give the labels of the second part and of the second node. The file can eventually contain a fifth column giving the weights of the links (any positive integer or float value). |
data |
A This |
A properly-structured multipartite graph that can be used by the other functions
of the triversity
package.
The resulting object encodes the labels of the parts and of the nodes in the multipartite
graph, as well as the transition matrices of random walks following different paths
between parts (encoded as sparse matrices of floats in [0
,1
], with all
rows summing to 1
). These transition matrices
are then used by functions such as get_distribution_from_path
to
compute the probability distributions of such random walks, or such as
get_diversity_from_path
to compute the diversity of these distributions.
Assuming the object returned by get_multipartite
is stored in the
graph
variable, then:
graph$parts
is a vector of character strings giving the labels of the
different parts in the multipartite graph;
graph$nodes
is a list of vectors of character strings giving the labels
of the nodes constituting the different parts of the multipartite graph;
graph$edges
is a data.tree
whose nodes each contains the
transition matrix of the corresponding random walk.
Package slam
for the sparse encoding of transition matrices and
package data.tree
for the tree structure storing these matrices.
data (tripartite_example) graph <- get_multipartite (data=tripartite_example) graph$parts part1 <- graph$parts[1] graph$nodes[[part1]] part2 <- graph$parts[2] as.matrix (graph$edges$Climb(part1,part2)$matrix)
data (tripartite_example) graph <- get_multipartite (data=tripartite_example) graph$parts part1 <- graph$parts[1] graph$nodes[[part1]] part2 <- graph$parts[2] as.matrix (graph$edges$Climb(part1,part2)$matrix)
get_transition_from_path
computes the transition matrix of a random walk following
a path
between the different parts of the input multipartite graph
.
get_transition_from_path(graph, path)
get_transition_from_path(graph, path)
graph |
A multipartite graph obtained by calling the |
path |
A vector of character strings giving the path that the random walk should
follow between the different parts of the input multipartite |
Note that the multipartite graph structure implemented in this package
stores in memory any computed transition matrix to avoid redundant computation
in the future. Hence, the latter execution of get_transition_from_path
, or of
any other function that builds on it, can be much faster than the first call.
The transition matrices are stored within a data.tree
in the input
graph
variable (see graph$edges
).
A matrix of floats in [0
,1
], with all lines summing to 1
,
giving the transition matrix of the random walk following the input path
.
data (tripartite_example) graph <- get_multipartite (data=tripartite_example) as.matrix (get_transition_from_path (graph, path=c(2,1,2,3)))
data (tripartite_example) graph <- get_multipartite (data=tripartite_example) as.matrix (get_transition_from_path (graph, path=c(2,1,2,3)))
tripartite_example
is a data.frame
containing raw data that encodes a small
tripartite graph. It has the proper format to be loaded by get_multipartite
.
It contains four columns. Each row gives a link between two nodes belonging to two
different parts of the tripartite graph. The first column gives the label if the part of
the first node and the second column gives the label of this node (any character string).
Similarly, the third and fourth columns give the labels of the second part and of the
second node. A fifth column could eventually be added to give the weights of the links
(any positive integer or float value).
data(tripartite_example)
data(tripartite_example)
An object of class data.frame
with 11 rows and 4 columns.
data (tripartite_example) head (tripartite_example) graph <- get_multipartite (data=tripartite_example) graph
data (tripartite_example) head (tripartite_example) graph <- get_multipartite (data=tripartite_example) graph
triversity
is an R package for the computation of diversity measures
on multipartite graphs. First, it implements a parametrized family of such diversity
measures which apply on probability distributions. Sometimes called "True Diversity",
this family contains famous measures such as the richness, the Shannon entropy, the
Herfindahl-Hirschman index, and the Berger-Parker index. Second, the package allows
to apply these measures on probability distributions resulting from random walks between
the parts of multipartite graphs. By defining an initial distribution on a given part
of the graph and a path to follow between the different parts, the probability of the
walker's position within the final part is then computed, thus providing a particular
instance of diversity to measure.
This package has been developed by researchers of the Complex Networks team, located within the Computer Science Laboratory of Paris 6 (LIP6), for the AlgoDiv project founded by the French National Agency of Research (ANR) under grant ANR-15-CE38-0001.
Links:
AlgoDiv project: http://algodiv.huma-num.fr/
Complex Networks team: http://www.complexnetworks.fr/
LIP6: https://www.lip6.fr/
Contact:
Robin Lamarche-Perrin: [email protected]
See also my webpage: https://www-complexnetworks.lip6.fr/~lamarche/
List of main collaborators:
Robin Lamarche-Perrin (CNRS, ISC-PIF, LIP6)
Lionel Tabourier (UPMC, LIP6)
Fabien Tarissan (CNRS, ISP, LIP6)
Raphaël Fournier S'niehotta (CNAM, CÉDRIC)
Rémy Cazabet (UdL, LIRIS)
Copyright 2017 © Robin Lamarche-Perrin
triversity
is free software: you can redistribute it and/or modify it under the
terms of the GNU General Public License as published by the Free Software Foundation,
either version 3 of the License, or (at your option) any later version. It is distributed
in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied
warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General
Public License for more details. You should have received a copy of the GNU General
Public License along with this program. If not, see http://www.gnu.org/licenses/.