FAB/HME Engine Specifications

This document describes the specifications of the FAB/HME engine.

Overview

The FAB engine consists of two parts: learners and models.

A learner outputs a model as the result of a learning process on a set of learning data (feature data and target data). A specific learner class is defined for each combination of learning type (regression or classification), gating function type, and component type.

A model predicts target data from input feature data. Since FAB models are represented by linear or other simple equations and their parameters, users can understand how a model predicts target values by interpreting the parameter values.

Learner

Learner Classes

There exist ten learners for FAB/HME learning:

learning type   target  gate type  component type  learner class
--------------  ------  ---------  --------------  ------------------------------------------
Regression      Single  Bernoulli  Linear          HMEBernGateLeastSquaresRgLearner (Bern/Rg)
Regression      Single  Bernoulli  Non-linear      HMEBernGateBSplineRgLearner (Bern/NLRg)
Regression      Single  Logistic   Linear          HMELogitGateLeastSquaresRgLearner (Logit/Rg)
Regression      Single  Logistic   Non-linear      HMELogitGateBSplineRgLearner (Logit/NLRg)
Classification  Single  Bernoulli  Linear          HMEBernGateLogisticRgLearner (Bern/Cl)
Classification  Single  Bernoulli  Non-linear      HMEBernGateBSplineClLearner (Bern/NLCl)
Classification  Single  Logistic   Linear          HMELogitGateLogisticRgLearner (Logit/Cl)
Classification  Single  Logistic   Non-linear      HMELogitGateBSplineClLearner (Logit/NLCl)
Classification  Multi   Bernoulli  Linear          HMEBernGateSoftmaxClLearner (Bern/MCl)
Classification  Multi   Logistic   Linear          HMELogitGateSoftmaxClLearner (Logit/MCl)

Learn Methods

To execute a learning process, call the learn(X, Y) method of a learner instance:

parameter  type           size                         description
---------  -------------  ---------------------------  ------------------------------
X          numpy.ndarray  (num_samples, num_features)  Feature data.
Y          numpy.ndarray  (num_samples,)               Target data for single target.
           numpy.ndarray  (num_samples, num_targets)   Target data for multi targets.

Note

For single target classification problems, each value in Y must be either -1.0 or 1.0.

Note

For multi target classification problems, only one element for each sample in Y must be 1.0, and others must be 0.0.
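
The two target encodings described in the notes above can be prepared with plain NumPy. The following is a minimal sketch; the helper names encode_binary_labels and encode_one_hot are hypothetical and are not part of the FAB-engine.

```python
import numpy as np


def encode_binary_labels(labels, positive_label):
    """Map binary labels to the -1.0 / 1.0 encoding for single target classification."""
    labels = np.asarray(labels)
    return np.where(labels == positive_label, 1.0, -1.0)


def encode_one_hot(class_indices, num_targets):
    """Build a (num_samples, num_targets) matrix with exactly one 1.0 per row."""
    class_indices = np.asarray(class_indices, dtype=int)
    Y = np.zeros((class_indices.size, num_targets))
    Y[np.arange(class_indices.size), class_indices] = 1.0
    return Y


# Hypothetical example labels (not taken from any real data set).
Y_single = encode_binary_labels([1, 0, 1], positive_label=1)
Y_multi = encode_one_hot([0, 2, 1], num_targets=3)
```

Y_single then has shape (num_samples,) and Y_multi has shape (num_samples, num_targets), matching the sizes listed for Y.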

The learn() method returns a tuple of (model, vposterior, context). The objects in the tuple are as follows:

object      type                     description
----------  -----------------------  ---------------------------------------------------------
model       HMESupervisedModel       A model object as a result of the learning.
vposterior  HMEBinaryTreeVPosterior  Variational posterior used in the learning.
context     HMELearningContext       A context object such as histories of FIC and the number
                                     of components in the learning.
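
Since the context records the history of FIC values, a caller may want to check afterwards whether the last FAB-iteration met the convergence criterion documented for fab_stop_threshold. The sketch below works on a plain list of FIC values; how the history is exposed on HMELearningContext is not specified here, so the list and the function name has_converged are assumptions for illustration only.

```python
def has_converged(fic_history, threshold):
    """Return True if the last FIC increase is below the threshold.

    threshold is either a float (absolute increase) or a percentage string
    such as '1.0%' (increase relative to the absolute previous FIC value).
    """
    if len(fic_history) < 2:
        return False  # at least two values are needed to measure an increase
    previous, current = fic_history[-2], fic_history[-1]
    increase = current - previous
    if isinstance(threshold, str):
        ratio = float(threshold.rstrip("%")) / 100.0
        # Assumes previous != 0; the relative check is undefined otherwise.
        return increase / abs(previous) < ratio
    return increase < threshold
```

For example, has_converged([-1000.0, -999.0], '0.5%') compares the relative increase 0.001 with 0.005.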

Initialization

Initialization Types

Each learner class has three class-methods to create its instance. The default initialization method, __init__(), is not available for FAB-engine users.

init_random()

It creates a learner for random start.

init_with_posterior()

It creates a learner for posterior hot-start.

init_with_model_dict()

It creates a learner for model hot-start.

List of Argument Parameters

The following table indicates the existence of parameters for each learner and initialization method. The meanings or other information of each parameter are described in the following sections.

parameter

Regression

Classification

Single target

Multi target

Bern

Logit

Bern

Logit

Bern

Logit

Rg

NLRg

Rg

NLRg

Cl

NLCl

Cl

NLCl

MCl

MCl

max_fab_iterations

R / P / M

R / P / M

R / P / M

R / P / M

R / P / M

R / P / M

R / P / M

R / P / M

R / P / M

R / P / M

start_from_mstep

R

R

R

R

R

R

R

R

R

R

num_acceleration_steps

R / P / M

R / P / M

R / P / M

R / P / M

R / P / M

R / P / M

R / P / M

R / P / M

R / P / M

R / P / M

repeat_until_convergence

R / P / M

R / P / M

R / P / M

R / P / M

R / P / M

R / P / M

R / P / M

R / P / M

R / P / M

R / P / M

projection_estep

R / P / M

R / P / M

R / P / M

R / P / M

R / P / M

R / P / M

R / P / M

R / P / M

R / P / M

R / P / M

shrink_threshold

R / P / M

R / P / M

R / P / M

R / P / M

R / P / M

R / P / M

R / P / M

R / P / M

R / P / M

R / P / M

fab_stop_threshold

R / P / M

R / P / M

R / P / M

R / P / M

R / P / M

R / P / M

R / P / M

R / P / M

R / P / M

R / P / M

hard_gate

R / P / M

R / P / M

R / P / M

R / P / M

R / P / M

R / P / M

R / P / M

R / P / M

R / P / M

R / P / M

gate_feature_ids

R / P / M

R / P / M

R / P / M

R / P / M

R / P / M

R / P / M

R / P / M

R / P / M

R / P / M

R / P / M

comp_feature_ids

R / P / M

R / P / M

R / P / M

R / P / M

R / P / M

R / P / M

R / P / M

R / P / M

R / P / M

R / P / M

comp_mandatory_feature_ids

R / P / M

R / P / M

R / P / M

R / P / M

R / P / M

R / P / M

R / P / M

R / P / M

R / P / M

R / P / M

comp_positive_feature_ids

R / P / M

R / P / M

R / P / M

R / P / M

comp_negative_feature_ids

R / P / M

R / P / M

R / P / M

R / P / M

tree_depth

R

R

R

R

R

R

R

R

R

R

comp_bspline_degree

R / P

R / P

R / P

R / P

comp_bspline_basis_dim

R / P

R / P

R / P

R / P

comp_weights_min_scale

R

R

R

R

R

R

R

R

R

R

comp_weights_max_scale

R

R

R

R

R

R

R

R

R

R

comp_bias_min_scale

R

R

R

R

R

R

R

R

R

R

comp_bias_max_scale

R

R

R

R

R

R

R

R

R

R

comp_variance_min_scale

R

R

R

R

comp_variance_max_scale

R

R

R

R

gate_opt_mode

M

M

M

M

M

M

M

M

M

M

gate_max_bins

R / P / M

R / P / M

R / P / M

R / P / M

R / P / M

gate_opt_type

R / P / M

R / P / M

R / P / M

R / P / M

R / P / M

gate_l2_regularize

R / P / M

R / P / M

R / P / M

R / P / M

R / P / M

with_gate_scaled_l0_regularize

R / P / M

R / P / M

R / P / M

R / P / M

R / P / M

max_gate_relevant_features

R / P / M

R / P / M

R / P / M

R / P / M

R / P / M

gate_svd_threshold

R / P / M

R / P / M

R / P / M

R / P / M

R / P / M

comp_foba_skip

R / P / M

R / P / M

R / P / M

R / P / M

R / P / M

R / P / M

comp_foba_skip_max_interval

R / P / M

R / P / M

R / P / M

R / P / M

R / P / M

R / P / M

comp_opt_mode

M

M

M

M

M

M

M

M

M

M

comp_two_stage_opt

R / P / M

R / P / M

comp_backward_step

R / P / M

R / P / M

R / P / M

R / P / M

R / P / M

R / P / M

comp_opt_type

R / P / M

R / P / M

R / P / M

R / P / M

post_comp_opt_type

R / P / M

R / P / M

R / P / M

R / P / M

R / P / M

R / P / M

comp_l2_regularize

R / P / M

R / P / M

R / P / M

R / P / M

R / P / M

R / P / M

R / P / M

R / P / M

R / P / M

R / P / M

comp_pspline

R / P / M

R / P / M

R / P / M

R / P / M

with_comp_scaled_l0_regularize

R / P / M

R / P / M

R / P / M

R / P / M

R / P / M

R / P / M

R / P / M

R / P / M

R / P / M

R / P / M

max_comp_relevant_features

R / P / M

R / P / M

R / P / M

R / P / M

R / P / M

R / P / M

R / P / M

R / P / M

R / P / M

R / P / M

max_comp_foba_iterations

R / P / M

R / P / M

R / P / M

R / P / M

R / P / M

R / P / M

comp_svd_threshold

R / P / M

R / P / M

R / P / M

R / P / M

R / P / M

R / P / M

R / P / M

R / P / M

R / P / M

R / P / M

num_threads_gates

R / P / M

R / P / M

R / P / M

R / P / M

R / P / M

R / P / M

R / P / M

R / P / M

R / P / M

R / P / M

num_threads_gate_features

R / P / M

R / P / M

R / P / M

R / P / M

R / P / M

num_threads_comps

R / P / M

R / P / M

R / P / M

R / P / M

R / P / M

R / P / M

R / P / M

R / P / M

R / P / M

R / P / M

posterior_prob

P

P

P

P

P

P

P

P

P

P

comp_ids

P

P

P

P

P

P

P

P

P

P

model_dict

M

M

M

M

M

M

M

M

M

M

  • R : random start by init_random()

  • P : posterior hot-start by init_with_posterior()

  • M : model hot-start by init_with_model_dict()

Parameter Descriptions

Meanings of input parameters are as follows:

Argument parameters for initialization methods of the learners.

parameter

type

domain

default

description

max_fab_iterations

int

[1, inf)

100

Maximum number of FAB-iterations.

start_from_mstep

bool

True / False

False

If True, the first iteration starts with M-step; otherwise, E-step.

num_acceleration_steps

int

[0, inf)

0

The number of steps of acceleration algorithm for each FAB-iteration. If 0, the acceleration algorithm is disabled.

repeat_until_convergence

bool

True / False

False

If False, the FAB-iterations and the post-processing are executed only once, even if the FAB-iterations stopped because max_fab_iterations was reached rather than because the convergence condition was met.

projection_estep

bool

True / False

False

Whether the projection E-step algorithm is enabled.

shrink_threshold

float or str

[1, inf) or (0%, 100%)

1.0

Threshold value for shrinkage. If a percentage value (e.g. '1.0%') is specified, shrinkage is executed according to relative value, \(N_{\rm scaled\_sample} \times t_{\rm shrink}\) where \(t_{\rm shrink}\) is the threshold value and \(N_{\rm scaled\_sample}\) is the number of scaled expected samples.

fab_stop_threshold

float or str

(0, inf) or (0%, inf%)

0.001

Threshold value for stopping the FAB-iterations: if the increase in the FIC value is less than the threshold, the FAB-iterations are considered to have converged. If a percentage value (e.g. '1.0%') is specified, the convergence check uses the relative value \((FIC^{(t)} - FIC^{(t-1)}) / | FIC^{(t-1)} |\).

hard_gate

bool

True / False

True

If True, hard-gate post-processing is enabled.

gate_feature_ids

None or list[int]

Length: [1, inf); Element: [0, inf)

None

List of feature IDs which are applied to the parameter optimizations. If None, all features are used.

comp_feature_ids

None or list[int]

Length: [0, inf); Element: [0, inf)

None

List of feature IDs which are applied to the parameter optimizations. If None, all features are used. If empty list, model is learned as a decision tree.

comp_mandatory_feature_ids

None or list[int]

Length: [1, inf); Element: [0, inf)

None

List of feature IDs that are exempt from L0-regularization, which means that the specified features are always relevant for all components. If None, no features are exempt, so all relevant features are selected by the FoBa algorithm.

comp_positive_feature_ids

None or list[int]

Length: [1, inf); Element: [0, inf)

None

List of feature IDs whose weight values for all components are constrained to positive values. If None, all features are optimized with no constraints.

comp_negative_feature_ids

None or list[int]

Length: [1, inf); Element: [0, inf)

None

List of feature IDs whose weight values for all components are constrained to negative values. If None, all features are optimized with no constraints.

tree_depth

int

[0, inf)

5

Initial depth of the gate-tree structure of latent variable prior. The initial number of components is \(2^d\) where \(d\) is tree depth. If 0, the optimization with only one component will be executed.

comp_bspline_degree

int

[0, inf)

3

Degree of B-spline function.

comp_bspline_basis_dim

int

[4, inf)

10

The number of B-spline basis functions to be generated for each feature.

comp_weights_min_scale

float

(-inf, inf)

-0.5

Scale value for the initialization of weight values of components.

comp_weights_max_scale

float

(-inf, inf)

0.5

Scale value for the initialization of weight values of components.

comp_bias_min_scale

float

(-inf, inf)

0.25

Scale value for the initialization of bias values of components.

comp_bias_max_scale

float

(-inf, inf)

0.75

Scale value for the initialization of bias values of components.

comp_variance_min_scale

float

(0, inf)

0.1

Scale value for the initialization of variance values of components.

comp_variance_max_scale

float

(0, inf)

0.25

Scale value for the initialization of variance values of components.

gate_opt_mode

str

{‘opt’, ‘refit’, ‘keep’}

‘opt’

Mode of the parameter optimization. If ‘opt’, the parameters are optimized with all features; if ‘refit’, the parameters are fit with the relevant features; if ‘keep’, the parameters are kept.

gate_max_bins

None or int

[1, inf)

None

Maximum number of binning for each feature, which is used for the parameter optimization. If None, all unique samples for each feature are used; otherwise, the equal-width binning algorithm is adopted.

gate_opt_type

str

See description

See description

Algorithm of the parameter optimization. The domain and default value depend on the learner type and are described in the following section.

gate_l2_regularize

float

[0, inf)

0.0

L2-regularization hyper-parameter for the parameter optimization. The larger the specified value, the stronger the regularization effect is. If 0.0, L2-regularization is disabled.

with_gate_scaled_l0_regularize

bool

True / False

True

Whether to use scaled L0-regularization, which uses a tighter lower bound of FIC for the parameter optimization; the approximation of det(F) is refined, where F is the Fisher matrix.

max_gate_relevant_features

int

[1, inf)

3

Maximum number of the relevant features for each gate.

gate_svd_threshold

float

[0, inf)

0.00001

Threshold value for singular value decomposition (SVD) in the parameter optimization.

comp_foba_skip

str

{‘power_of_two’, ‘quarter_square’, ‘none’}

‘power_of_two’

The judging function type for the FoBa algorithm skipping. If ‘none’, FoBa is executed for all FAB-iteration steps. FoBa is skipped at \({\rm log}_{2}t \ne {\rm ceil}({\rm log}_{2}t)\) if ‘power_of_two’, or \(t \bmod {\rm ceil}(\sqrt{t}) \ne 0\) if ‘quarter_square’. \(t\) is FAB-iteration step index number starting from 1.

comp_foba_skip_max_interval

int

[2, inf)

25

The maximum interval for the FoBa algorithm skipping. If comp_foba_skip is ‘none’, this value is ignored.

comp_opt_mode

str

{‘opt’, ‘refit’}

‘opt’

Mode of the parameter optimization. If ‘opt’, the parameters are optimized with all features; if ‘refit’, the parameters are fit with the relevant features.

comp_two_stage_opt

bool

True / False

False

Whether the two-stage optimization is enabled. If True, the first stage performs the parameter optimization on user-specified mandatory features (comp_mandatory_feature_ids), and the second stage carries out the parameter optimization to the residual of the first stage for only the relevant non-mandatory features.

comp_backward_step

bool

True / False

False

Whether the backward-steps of FoBa algorithm are enabled. In the post-process, backward-steps are carried out regardless of this argument value.

comp_opt_type

str

See description

See description

Algorithm of the parameter optimization. The domain and default value depend on the learner type and are described in the following section.

post_comp_opt_type

str

See description

See description

Algorithm of the parameter optimization in the post-processing. The domain and default value depend on the learner type and are described in the following section.

comp_l2_regularize

float

[0, inf)

0.0

L2-regularization hyper-parameter for the parameter optimization. The larger the specified value, the stronger the regularization effect is. If 0.0, L2-regularization is disabled.

comp_pspline

float

[0, inf)

1.0

L2-regularization coefficient value for penalized B-spline function (P-spline).

with_comp_scaled_l0_regularize

bool

True / False

True

Whether to use scaled L0-regularization, which uses a tighter lower bound of FIC for the parameter optimization; the approximation of det(F) is refined, where F is the Fisher matrix.

max_comp_relevant_features

int

[1, inf)

100

Maximum number of the relevant features for each component.

max_comp_foba_iterations

int

[1, inf)

100

Maximum number of the FoBa-iterations for each component.

comp_svd_threshold

float

[0, inf)

0.00001

Threshold value for singular value decomposition (SVD) in the parameter optimization.

num_threads_gates

int

[1, inf)

1

Maximum number of OpenMP threads for the parameter optimization, among which the tasks for all gates are divided.

num_threads_gate_features

int

[1, inf)

1

Maximum number of OpenMP threads for the parameter optimization, among which the tasks for all features are divided.

num_threads_comps

int

[1, inf)

1

Maximum number of OpenMP threads of the parameter optimization.

posterior_prob

numpy.ndarray

Initial posterior distribution for posterior hot-start. Size of the posterior matrix = (num_samples, num_comps). The number of samples (rows) must be consistent with that for input data given at learn().

comp_ids

list[int]

Length: [1, inf); Element: [0, inf)

List of component ID numbers for posterior hot-start. Each ID is that of the component at the corresponding location in a complete binary tree whose components are numbered from left to right (0 to \(2^d - 1\)), where \(d\) is the tree depth. The initial tree structure is defined from this parameter. Note that the length of comp_ids must equal the number of columns of posterior_prob.

model_dict

dict

Information on FAB/HME supervised model. For the format of model_dict, refer to an example for to_dict() method which is defined in HMESupervisedModel.
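
The skipping rules given for comp_foba_skip can be made concrete with a small sketch. The function name foba_is_executed is hypothetical, and the effect of comp_foba_skip_max_interval is deliberately not modelled because its exact interaction with these rules is not specified in this document.

```python
import math


def foba_is_executed(t, skip_type="power_of_two"):
    """Return True if FoBa runs at FAB-iteration step t (t starts from 1)."""
    if t < 1:
        raise ValueError("t must be 1 or larger")
    if skip_type == "none":
        return True  # FoBa is executed at every step
    if skip_type == "power_of_two":
        # log2(t) is an integer exactly when t is a power of two.
        return (t & (t - 1)) == 0
    if skip_type == "quarter_square":
        # ceil(sqrt(t)) computed with integer arithmetic to avoid rounding errors.
        ceil_sqrt = math.isqrt(t - 1) + 1
        return t % ceil_sqrt == 0
    raise ValueError("unknown skip_type: %r" % (skip_type,))


power_steps = [t for t in range(1, 17) if foba_is_executed(t, "power_of_two")]
quarter_steps = [t for t in range(1, 17) if foba_is_executed(t, "quarter_square")]
```

Under these rules FoBa runs at steps 1, 2, 4, 8, 16 for ‘power_of_two’ and at steps 1, 2, 4, 6, 9, 12, 16 for ‘quarter_square’ within the first 16 FAB-iterations.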

Learner Specific Parameters

The domain and default values of learner type specific parameters are as follows:

gate type

parameter

value

description

Logistic

gate_opt_type

‘quadratic’

using quadratic upper bound approximation with matrix inversion lemma. [default]

‘quadratic_svd’

using quadratic upper bound approximation with singular value decomposition.


component type

parameter

value

description

Linear regression

comp_opt_type

‘svd’

using singular value decomposition.

‘mil’

using matrix inversion lemma for efficient evaluation of inversion matrices. [default]

Single target linear classification

comp_opt_type

‘standard’

using liblinear-weights.

‘quadratic’

using quadratic upper bound approximation with matrix inversion lemma. [default]

‘quadratic_svd’

using quadratic upper bound approximation with singular value decomposition.

post_comp_opt_type

‘standard’

using liblinear-weights. [default]

‘quadratic’

using quadratic upper bound approximation.

Non-linear classification

post_comp_opt_type

‘standard’

repeating optimization 10 times by the same algorithm as ‘quadratic’. [default]

‘quadratic’

using quadratic upper bound approximation.

Multi target linear classification

post_comp_opt_type

‘standard’

repeating optimization 100 times by an algorithm similar to ‘quadratic’. [default]

‘quadratic’

using quadratic upper bound approximation.

Models

HMESupervisedModel is the common model class for supervised learning with FAB/HME.

Attributes of HMESupervisedModel.

attribute         type                       description
----------------  -------------------------  ----------------------------------------------------------
components        list[SupervisedComponent]  Component objects.
lvprior           HMELVPrior                 Latent variable prior object.
num_features      int                        The number of features.
num_targets       int                        The number of targets.
gate_feature_ids  list[int]                  Feature ID numbers applied to the parameter optimizations.
comp_feature_ids  list[int]                  Feature ID numbers applied to the parameter optimizations.

Components

HMESupervisedModel contains one or more components. The component type is decided by the learning type: least-squares regression, logistic regression, B-spline regression, or B-spline classification.

All types of components predict target values using the decision function \(Z = X W + b\). The target value is determined as \(Y = Z\) for regression, or \(Y = \{ 1 + \exp(-Z) \} ^{-1}\) for classification.
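
As an illustration of the decision function, the following sketch evaluates a single linear component with NumPy. The function predict_component and the example values are hypothetical; the sketch does not use the FAB-engine classes and ignores gating.

```python
import numpy as np


def predict_component(X, W, b, classification=False):
    """Evaluate Z = X W + b and map it to the target value Y."""
    Z = X @ W + b
    if classification:
        return 1.0 / (1.0 + np.exp(-Z))  # Y = {1 + exp(-Z)}^-1
    return Z  # Y = Z for regression


# Hypothetical feature data and parameters.
X = np.array([[1.0, 2.0], [0.0, -1.0]])
W = np.array([0.5, -0.25])

Y_regression = predict_component(X, W, b=1.0)
Y_classification = predict_component(X, W, b=0.0, classification=True)
```

Because the parameters enter the prediction only through \(W\) and \(b\), reading their values directly shows how each feature moves the predicted target.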

Common attributes

Common attributes of all component classes.

attribute

symbol

type

domain

description

comp_id

int

Component ID number.

feature_ids

list[int]

Feature ID numbers for data applied to the parameter optimizations.

weights

\(W\)

numpy.ndarray

(-inf, inf) or [-inf, inf]

Weight values. The size of \(W\) is equal to the number of features applied to the parameter optimization. \(W[i]\) is zero for irrelevant features. The size of \(W\) is (num_features) for linear prediction components, (num_features, basis_dim) for B-spline prediction components, or (num_features, num_targets) for softmax prediction components. The domain of each element of \(W\) is (-inf, inf) for linear and softmax prediction components, or [-inf, inf] for B-spline prediction components (the value can be nan).

bias

\(b\)

float or numpy.ndarray

(-inf, inf) or [-inf, inf]

Bias value. The type of \(b\) is float for linear and B-spline prediction components, or numpy.ndarray for softmax prediction components. The domain of \(b\) is (-inf, inf) for linear and B-spline prediction components, or [-inf, inf] for softmax prediction components.

Regression component specific attributes

Specific attributes for regression components

attribute

symbol

type

domain

description

variance

\(\sigma^2\)

float

[0, inf]

Variance value.

B-spline prediction component specific attributes

Specific attributes for B-spline prediction components

attribute

symbol

type

domain

description

degree

int

[0, inf)

Degree of the B-spline function.

knot_vecs

numpy.ndarray

[-inf, inf]

Knot vectors for all features. The size of knot_vecs is (num_features, num_samples, num_knots). Each element of knot_vecs can be nan.

Basis functions of B-spline prediction components for each feature are generated by the following algorithm. Let \(M\) be the number of basis functions for each feature, whose value is given by an argument parameter comp_bspline_basis_dim.

  • For \(p < M - 1\):
    \[g_p^{(k)}(x) = \frac{x-x_p}{x_{p+k}-x_p} g_p^{(k-1)}(x) + \frac{x_{p+k+1}-x}{x_{p+k+1}-x_{p+1}} g_{p+1}^{(k-1)}(x),\]

    where \(g_p^{(0)}(x) = 1\) when \(x_p \leq x < x_{p+1}\), otherwise \(g_p^{(0)}(x) = 0\). If the two knot points are the same (\(x_p = x_{p+k}\)), the term \((x-x_p) / (x_{p+k}-x_p)\) is defined as zero.

  • For \(p = M - 1\):
    \[g_p^{(k)}(x) = x.\]

where \(x\) is a feature value (an element of a column of the input feature data \(X\); e.g. X[:, j] for the \(j\)-th feature), \(x_p\) is a knot point with \(p = 0, 1, ..., M - 1\) (e.g. \(x_p\) is knot_vecs[p]), and \(k\) is the degree of the B-spline functions.
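
The recursion above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the knot vector is a plain list with at least \(M + k\) entries so that every index used by the recursion exists, and the second term is also treated as zero when its denominator vanishes (the text states this convention only for the first term). The function names are hypothetical.

```python
def bspline_basis(p, k, knots, x):
    """Value of g_p^(k)(x) computed with the recursion for p < M - 1."""
    if k == 0:
        return 1.0 if knots[p] <= x < knots[p + 1] else 0.0
    left_den = knots[p + k] - knots[p]
    right_den = knots[p + k + 1] - knots[p + 1]
    left = 0.0
    if left_den != 0:  # coinciding knots: the term is defined as zero
        left = (x - knots[p]) / left_den * bspline_basis(p, k - 1, knots, x)
    right = 0.0
    if right_den != 0:  # assumption: same convention for the second term
        right = (knots[p + k + 1] - x) / right_den * bspline_basis(p + 1, k - 1, knots, x)
    return left + right


def basis_row(x, k, knots, basis_dim):
    """All basis_dim basis values for one feature value x."""
    row = [bspline_basis(p, k, knots, x) for p in range(basis_dim - 1)]
    row.append(x)  # the last basis function (p = M - 1) is g(x) = x
    return row


# Hypothetical uniform knots; degree 3 and M = 5 need at least 8 knots.
knots = [float(i) for i in range(8)]
row = basis_row(3.5, 3, knots, 5)
```

With uniform knots, the first four entries of row are the familiar cubic B-spline weights and sum to one, while the last entry is the raw feature value.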

Latent Variable Prior

Prior classes

Two prior classes are defined in the FAB-engine, both of which are sub-classes of HMEBinaryTreeLVPrior:

  • HMEBernGateLVPrior

  • HMELogitGateLVPrior

These classes differ in the type of gating function, as described later.

Attributes of HMEBinaryTreeLVPrior

attribute

type

description

root_node

HMEBinaryTreeNode

Root node object of gating-tree in the prior.

num_gates

int

The number of gating-nodes in the prior.

num_comps

int

The number of component-nodes in the prior.

Note

The method lvprior.traverse_depth_first(gates_only=True) yields all node objects while traversing the tree structure, where lvprior is an instance of a latent variable prior class. The argument gates_only specifies whether only gate nodes (not component nodes) are traversed; its default value is False.

Nodes of Gating-Tree

A prior is composed of gating-nodes and component-nodes, defined as BinaryTreeGateNode and BinaryTreeComponentNode respectively. Both classes are sub-classes of HMEBinaryTreeNode. A model contains no BinaryTreeGateNode only when it has just one component.

BinaryTreeGateNode is a class for gating-nodes.

Attributes of BinaryTreeGateNode

attribute

type

description

gate_index

int

Gate index number.

gate_func

BernGateFunction / LogitGateFunction

Gating function object.

parent_node

HMEBinaryTreeNode

Parent node object of the node.

left_node

HMEBinaryTreeNode

Left-child node object of the node.

right_node

HMEBinaryTreeNode

Right-child node object of the node.

BinaryTreeComponentNode is a class for component-nodes.

Attributes of BinaryTreeComponentNode

attribute

type

description

comp_index

int

Component index number.

parent_node

HMEBinaryTreeNode

Parent node object of the node.

Note

BinaryTreeComponentNode does not hold the parameters of the component (weights, bias, etc.); the component object described above holds that information. BinaryTreeComponentNode maps the component list (comps) in the model to the corresponding positions of component-nodes in the gating-tree.
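
To illustrate how a gating-tree of gate nodes and component nodes can be walked depth-first with a gates_only switch, the sketch below uses two small stand-in classes. GateNode and ComponentNode are hypothetical substitutes, not the FAB-engine classes, and the visiting order (node first, then left child, then right child) is an assumption.

```python
class ComponentNode:
    """Stand-in for a leaf node that refers to a component by index."""

    def __init__(self, comp_index):
        self.comp_index = comp_index


class GateNode:
    """Stand-in for an inner node with a left and a right child."""

    def __init__(self, gate_index, left_node, right_node):
        self.gate_index = gate_index
        self.left_node = left_node
        self.right_node = right_node


def traverse_depth_first(node, gates_only=False):
    """Yield nodes in depth-first order, optionally skipping component nodes."""
    if isinstance(node, ComponentNode):
        if not gates_only:
            yield node
        return
    yield node
    yield from traverse_depth_first(node.left_node, gates_only)
    yield from traverse_depth_first(node.right_node, gates_only)


# A tree of depth 2: three gates and 2**2 = 4 components.
root = GateNode(
    0,
    GateNode(1, ComponentNode(0), ComponentNode(1)),
    GateNode(2, ComponentNode(2), ComponentNode(3)),
)
all_nodes = list(traverse_depth_first(root))
gate_nodes = list(traverse_depth_first(root, gates_only=True))
```

Here all_nodes contains seven nodes and gate_nodes contains the three gates, which matches the relation between tree depth and the number of components.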

Bernoulli-Gating Function

BernGateFunction is a class for Bernoulli-gating function.

Attributes of BernGateFunction

attribute

symbol

type

domain

description

feature_ids

list[int]

Feature ID numbers of the data for optimizing the gate function.

feature_id

\(\gamma\)

int

[0, num_features)

Feature ID applied to the gate function. The ID number corresponds to the column index of user-specified X at learn().

threshold

\(t\)

float

(-inf, inf)

Threshold value for the Bernoulli-gating function.

prob_left

\(g\)

float

[0, 1]

Probability of left-down when \(x[\gamma] < t\), where \(x\) is a sample in feature data.

Note

For a Bernoulli-gating function, probability of left-down is equal to \((1 - g)\) when \(x[\gamma] \ge t\).

Note

A variable internal_feature_index is defined for internal use. It is the feature index within the feature data applied to the gate optimization (not within all features).
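
The behaviour of a Bernoulli-gating function for one sample can be written down directly from the attributes above. The function bern_prob_left is a hypothetical helper for illustration, not an engine method.

```python
def bern_prob_left(x, feature_id, threshold, prob_left):
    """Probability that sample x goes left-down at a Bernoulli gate.

    x is one sample (a sequence of feature values), feature_id is gamma,
    threshold is t and prob_left is g.
    """
    if x[feature_id] < threshold:
        return prob_left
    return 1.0 - prob_left


# Hypothetical gate: feature 1, threshold 3.0, g = 0.75.
p_below = bern_prob_left([0.2, 1.0], feature_id=1, threshold=3.0, prob_left=0.75)
p_above = bern_prob_left([0.2, 5.0], feature_id=1, threshold=3.0, prob_left=0.75)
```

A gate with \(g\) equal to 1.0 or 0.0 behaves like an ordinary decision-tree split on the single feature \(\gamma\).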

Logistic-Gating Function

LogitGateFunction is a class for logistic-gating function.

Attributes of LogitGateFunction

attribute

symbol

type

domain

description

feature_ids

list[int]

Feature ID numbers of the data for optimizing the gate function.

weights

\(W\)

numpy.ndarray

(-inf, inf)

Weight values. The size of \(W\) is equal to the number of features applied to the parameter optimization. \(W[i]\) is zero for irrelevant features.

bias

\(b\)

float

(-inf, inf)

Bias value.

hard_gate

bool

True / False

Whether the gate is hard-gate (True) or soft-gate (False).

Note

In the case of a soft-gate, the probability of left-down is defined as \(p = 1 / \{ 1+\exp(-Z) \}\) where \(Z = XW + b\). This means that a sample is more likely to go left-down if the decision function is positive (\(Z > 0\)). In the case of a hard-gate, the probability of left-down is 1.0 if the decision function is zero or positive (\(Z \geq 0\)).
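
The soft-gate and hard-gate rules can be sketched for a single sample as follows. The function logit_prob_left is a hypothetical helper, and returning 0.0 for a hard-gate when \(Z < 0\) is an assumption inferred from the rule for \(Z \geq 0\).

```python
import math


def logit_prob_left(x, weights, bias, hard_gate=False):
    """Probability that sample x goes left-down at a logistic gate."""
    z = sum(w * v for w, v in zip(weights, x)) + bias  # decision function Z
    if hard_gate:
        # Assumption: probability 0.0 when Z < 0 (complement of the stated rule).
        return 1.0 if z >= 0 else 0.0
    return 1.0 / (1.0 + math.exp(-z))


# Hypothetical gate parameters.
weights = [0.5, -0.25]
soft_at_zero = logit_prob_left([1.0, 2.0], weights, bias=0.0)
hard_at_zero = logit_prob_left([1.0, 2.0], weights, bias=0.0, hard_gate=True)
hard_negative = logit_prob_left([0.0, 1.0], weights, bias=0.0, hard_gate=True)
```

At \(Z = 0\) the soft-gate is undecided (0.5) while the hard-gate sends the sample left-down.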

Others

Logging

All log messages are output through the Python logging library, using loggers named fab or a sub-namespace such as fab.hme. Four log levels are used in the FAB-engine: ERROR, WARN, INFO and DEBUG.

Errors that occur in the FAB-engine are raised as standard Python exception objects with error messages. Applications using the FAB-engine should handle these exceptions properly so that the error information can be displayed to users. Errors in the C++ modules of the FAB-engine are converted into standard Python exception objects inside the engine.
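
An application can configure the fab logger namespace with the standard logging module and wrap engine calls in ordinary exception handling. The sketch below uses only the Python standard library; the helper run_safely and the message format are illustrative choices, not part of the FAB-engine.

```python
import logging

# Attach a handler to the "fab" logger; loggers in sub-namespaces such as
# "fab.hme" propagate their records to it.
handler = logging.StreamHandler()
handler.setFormatter(
    logging.Formatter("%(asctime)s %(name)s %(levelname)s %(message)s")
)
fab_logger = logging.getLogger("fab")
fab_logger.addHandler(handler)
fab_logger.setLevel(logging.INFO)  # use logging.DEBUG for more detail


def run_safely(task):
    """Run a callable and return (result, error_message).

    Hypothetical helper: 'task' would wrap a call into the engine.
    """
    try:
        return task(), None
    except Exception as exc:  # the engine raises standard Python exceptions
        return None, "%s: %s" % (type(exc).__name__, exc)


result, error = run_safely(lambda: 1 / 0)  # stand-in for a failing call
```

The error string can then be shown to the user, while the log records remain available through the configured handler.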

Random Seeds

FAB learning processes depend on the initial state (the variational posterior distribution and model parameters such as weights and biases). Since the FAB-engine uses the numpy.random library to generate random values for them when a learning process is initialized, users can fix the random seed with numpy.random.seed(SEED_VALUE).
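
The effect of fixing the seed can be checked with NumPy alone. The sketch below only demonstrates that numpy.random.seed makes subsequent draws repeatable; the seed value is arbitrary, and the draws stand in for the random initialization performed by a learner.

```python
import numpy as np

SEED_VALUE = 12345  # arbitrary example value

np.random.seed(SEED_VALUE)
first_draw = np.random.rand(3)  # stand-in for a random initialization

np.random.seed(SEED_VALUE)
second_draw = np.random.rand(3)  # same seed, so the same values are drawn
```

Calling numpy.random.seed with the same value immediately before creating each learner is therefore one way to start repeated runs from the same random values.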