graphdot.kernel package¶
class graphdot.kernel.Tang2019MolecularKernel(stopping_probability=0.01, starting_probability=1.0, element_prior=0.2, edge_length_scale=0.05, **kwargs)[source]¶
Bases: object
A marginalized graph kernel for 3D molecular structures, as described in: Tang, Y. H., & de Jong, W. A. (2019). Prediction of atomization energy using graph kernel and active learning. The Journal of Chemical Physics, 150(4), 044107. The kernel can be used directly together with Graph.from_ase() to operate on molecular structures.
Parameters: - stopping_probability (float in (0, 1)) – The probability for the random walk to stop during each step.
- starting_probability (float) – The probability for the random walk to start from any node. See the p kwarg of graphdot.kernel.marginalized.MarginalizedGraphKernel.
- element_prior (float in (0, 1)) – The baseline similarity between distinct elements; an element always has a similarity of 1 to itself.
- edge_length_scale (float in (0, inf)) – Length scale of the Gaussian kernel on edge length. A rule of thumb is that the similarity decays smoothly from 1 to nearly 0 at around three times the length scale.
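The rule of thumb above can be checked numerically. The sketch below is plain NumPy, not graphdot's implementation, and assumes the standard unit-amplitude squared-exponential form exp(-d**2 / (2 * l**2)) for the edge kernel:

```python
import numpy as np

def gaussian_edge_kernel(d, length_scale):
    """Unit-amplitude squared-exponential kernel on edge length d."""
    return np.exp(-d**2 / (2 * length_scale**2))

ell = 0.05  # the default edge_length_scale
for multiple in (1, 2, 3):
    k = gaussian_edge_kernel(multiple * ell, ell)
    print(f"d = {multiple} * length_scale -> similarity {k:.4f}")
# d = 1 * length_scale -> similarity 0.6065
# d = 2 * length_scale -> similarity 0.1353
# d = 3 * length_scale -> similarity 0.0111
```

At three length scales the similarity has dropped to about 0.01, which is the "nearly 0" in the rule of thumb.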
- __call__(X, Y=None, **kwargs)[source]¶ Same call signature as graphdot.kernel.marginalized.MarginalizedGraphKernel.__call__().
- bounds¶
- diag(X, **kwargs)[source]¶ Same call signature as graphdot.kernel.marginalized.MarginalizedGraphKernel.diag().
- hyperparameter_bounds¶
- hyperparameters¶
- theta¶
class graphdot.kernel.KernelOverMetric(distance, expr, x, **hyperparameters)[source]¶
Bases: object
- bounds¶
- hyperparameters¶
- theta¶
class graphdot.kernel.MarginalizedGraphKernel(node_kernel, edge_kernel, p=1.0, q=0.01, q_bounds=(0.0001, 0.9999), eps=0.01, ftol=1e-08, gtol=1e-06, dtype=<class 'float'>, backend='auto')[source]¶
Bases: object
Implements the random-walk-based graph similarity kernel proposed in: Kashima, H., Tsuda, K., & Inokuchi, A. (2003). Marginalized kernels between labeled graphs. In Proceedings of the 20th International Conference on Machine Learning (ICML-03) (pp. 321-328).
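As a concrete illustration of the quantity this kernel computes, the sketch below evaluates a deliberately simplified case in plain NumPy (not graphdot's GPU implementation): all node and edge kernels equal 1, the walk starts at a node pair with uniform probability (graphdot's default p=1.0 instead weights every node equally without normalizing), stops with probability q at each step, and otherwise moves to a uniformly random neighbor in each graph. Marginalizing over all walk lengths then reduces to one linear solve on the product graph:

```python
import numpy as np

def simple_marginalized_kernel(A1, A2, q=0.05):
    """Marginalized graph kernel (Kashima et al., 2003), simplified to
    unlabeled graphs: all node/edge kernels are 1, uniform start."""
    # Row-stochastic transition matrices: move to a random neighbor.
    T1 = A1 / A1.sum(axis=1, keepdims=True)
    T2 = A2 / A2.sum(axis=1, keepdims=True)
    W = np.kron(T1, T2)  # simultaneous random walk on the product graph
    n = W.shape[0]
    # R(u, u') = q^2 + (1 - q)^2 * sum_{v, v'} T1(v|u) T2(v'|u') R(v, v'),
    # i.e. the linear system (I - (1 - q)^2 W) R = q^2 * 1.
    R = np.linalg.solve(np.eye(n) - (1 - q)**2 * W, np.full(n, q**2))
    p = np.full(n, 1.0 / n)  # uniform starting distribution over pairs
    return p @ R

# Two triangle graphs (3-cycles): by symmetry R is constant, so the
# kernel has the closed form q**2 / (1 - (1 - q)**2) = q / (2 - q).
C3 = np.array([[0, 1, 1], [1, 0, 1], [1, 1, 0]], dtype=float)
k = simple_marginalized_kernel(C3, C3, q=0.05)
print(k)  # 0.05 / 1.95 ≈ 0.025641
```

In graphdot, the node_kernel and edge_kernel microkernels reweight the entries of W before this solve, which is the generalized Laplacian equation the backend parameter refers to.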
Parameters: - node_kernel (microkernel) – A microkernel that computes the similarity between individual nodes.
- edge_kernel (microkernel) – A microkernel that computes the similarity between individual edges.
- p (positive number (default=1.0) or StartingProbability) – The starting probability of the random walk on each node. Must be either a positive number or a concrete subclass instance of StartingProbability.
- q (float in (0, 1)) – The probability for the random walk to stop during each step.
- q_bounds (pair of floats) – The lower and upper bounds within which the stopping probability can vary during hyperparameter optimization.
- eps (float) – The step size used for finite difference approximation of the gradient. Only used for nodal matrices (nodal=True).
- dtype (numpy dtype) – The data type of the kernel matrix to be returned.
- backend ('auto', 'cuda', or an instance of graphdot.kernel.marginalized.Backend) – The computing engine that solves the marginalized graph kernel’s generalized Laplacian equation.
- __call__(X, Y=None, eval_gradient=False, nodal=False, lmin=0, timing=False)[source]¶ Compute the pairwise similarity matrix between graphs.
Parameters: - X (list of N graphs) – The graphs must all have the same node and edge attributes.
- Y (None or list of M graphs) – The graphs must all have the same node and edge attributes.
- eval_gradient (Boolean) – If True, computes the gradient of the kernel matrix with respect to the hyperparameters and returns it alongside the kernel matrix.
- nodal (bool) – If True, returns node-wise similarities; otherwise, returns graph-wise similarities.
- lmin (0 or 1) – Number of steps to skip in each random walk path before the similarity is computed. (lmin + 1) corresponds to the starting value of l in the summation of Eq. 1 in Tang & de Jong, 2019 https://doi.org/10.1063/1.5078640 (or the first unnumbered equation in Section 3.3 of Kashima, Tsuda, and Inokuchi, 2003).
Returns: - kernel_matrix (ndarray) – If Y is None, returns a square matrix containing the pairwise similarities between the graphs in X; otherwise, returns a matrix containing the similarities across the graphs in X and Y.
- gradient (ndarray) – The gradient of the kernel matrix with respect to the kernel hyperparameters. Only returned if eval_gradient is True.
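The gradient returned by eval_gradient=True can be sanity-checked against the central-difference scheme that the eps constructor parameter controls for nodal matrices. The sketch below uses a scalar stand-in kernel (not an actual graphdot call) to show the check:

```python
import numpy as np

def kernel_value(q):
    # Stand-in for a kernel evaluation: closed-form marginalized kernel
    # between two identical regular graphs with uniform labels.
    return q / (2.0 - q)

def kernel_grad(q):
    # Analytic derivative of the stand-in kernel with respect to q.
    return 2.0 / (2.0 - q)**2

q, eps = 0.05, 0.01  # eps: the finite-difference step, as in __init__
fd = (kernel_value(q + eps) - kernel_value(q - eps)) / (2 * eps)
print(abs(fd - kernel_grad(q)))  # small: agreement to O(eps**2)
```

A central difference like this has O(eps**2) truncation error, which is why eps trades off truncation against round-off when approximating gradients.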
- active_theta_mask¶
- bounds¶ The logarithms of a reshaped X-by-2 array of kernel hyperparameter bounds, excluding those declared as ‘fixed’ or those with equal lower and upper bounds.
- diag(X, eval_gradient=False, nodal=False, lmin=0, active_theta_only=True, timing=False)[source]¶ Compute the self-similarities for a list of graphs.
Parameters: - X (list of N graphs) – The graphs must all have the same node attributes and edge attributes.
- eval_gradient (Boolean) – If True, computes the gradient of the kernel matrix with respect to the hyperparameters and returns it alongside the kernel matrix.
- nodal (bool) – If True, returns a vector containing the nodal self-similarities; if False, returns a vector containing the graphs’ overall self-similarities; if ‘block’, returns a list of square matrices which form a block-diagonal matrix, where each diagonal block represents the pairwise nodal similarities within a graph.
- lmin (0 or 1) – Number of steps to skip in each random walk path before the similarity is computed. (lmin + 1) corresponds to the starting value of l in the summation of Eq. 1 in Tang & de Jong, 2019 https://doi.org/10.1063/1.5078640 (or the first unnumbered equation in Section 3.3 of Kashima, Tsuda, and Inokuchi, 2003).
- active_theta_only (bool) – Whether or not to return only the gradients with respect to the non-fixed hyperparameters.
Returns: - diagonal (numpy.array or list of np.array(s)) – If nodal=True, a vector containing the nodal self-similarities; if nodal=False, a vector containing the graphs’ overall self-similarities; if nodal=‘block’, a list of square matrices, each being a pairwise nodal similarity matrix within a graph.
- gradient – The gradient of the kernel matrix with respect to the kernel hyperparameters. Only returned if eval_gradient is True.
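A common use of the self-similarity vector from diag (standard kernel-method practice, not specific to graphdot) is cosine-style normalization of the kernel matrix so that every graph has self-similarity 1:

```python
import numpy as np

def normalize_kernel(K, d):
    """Normalize a kernel matrix K given its self-similarity vector d
    (as returned by diag), so that the diagonal becomes all ones."""
    return K / np.sqrt(np.outer(d, d))

# Toy positive-definite similarity matrix standing in for a kernel output.
K = np.array([[4.0, 1.0],
              [1.0, 9.0]])
d = np.diag(K)
Kn = normalize_kernel(K, d)
print(Kn)  # diagonal is 1; off-diagonal is 1 / sqrt(4 * 9) = 1/6
```

This keeps relative similarities comparable across graphs of very different sizes, since larger graphs otherwise accumulate larger raw self-similarities.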
- flat_hyperparameters¶
- hyperparameter_bounds¶
- hyperparameters¶ A hierarchical representation of all the kernel hyperparameters.
- n_dims¶ Number of hyperparameters, including both optimizable and fixed ones.
- requires_vector_input¶
- theta¶ The logarithms of a flattened array of kernel hyperparameters, excluding those declared as ‘fixed’ or those with equal lower and upper bounds.
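Storing theta in log space follows the convention of scikit-learn's kernel API: the optimizer takes unconstrained steps on the log-transformed values, so positive hyperparameters stay positive and multiplicative scales become additive. A minimal sketch with hypothetical parameter values (plain NumPy, not graphdot's internals):

```python
import numpy as np

# Hypothetical positive hyperparameters, e.g. a stopping probability
# and a length scale; theta stores their logarithms.
params = np.array([0.05, 1.5])
theta = np.log(params)

# An optimizer can take arbitrary unconstrained steps in log space ...
theta_new = theta + np.array([0.1, -0.2])

# ... and exponentiating always recovers strictly positive values.
params_new = np.exp(theta_new)
print(params_new > 0)  # both True, regardless of the step taken
```

This is also why bounds above is stated in logarithms: the box constraints live in the same transformed space as theta.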
- trait_t¶ alias of Traits