FABHMELogitGateBSplineCl Component Specification¶
Overview¶
The FABHMELogitGateBSplineCl component is a nonlinear binary classification component that uses B-spline prediction formulas and the FAB/HME algorithm. It learns a tree-structured model in which each sample is assigned to a component by logistic gating functions.
Note
The FAB engine uses the word ‘component’ with a different meaning from SAMPO's. Each component in FAB/HME is a prediction formula, and each sample is assigned to one specific component for prediction.
Example:
SPD:
# fabhmecl.spd
dl1 -> fab1
---
components:
    dl1:
        component: DataLoader
    fab1:
        component: FABHMELogitGateBSplineClComponent
        features: name != 'class'
        tree_depth: 3
        target: name == 'class'
        positive_label: 'Iris-setosa'

global_settings:
    keep_attributes:
        - class
    feature_exclude:
        - class
Input of the component:
_sid | sepal_length_in_cm | sepal_width_in_cm | class |
---|---|---|---|
0 | 4.9 | 2.5 | Iris-versicolor |
1 | 6.2 | 2.8 | Iris-versicolor |
2 | 7.2 | 3.6 | Iris-versicolor |
… | … | … | … |
28 | 6.2 | 2.9 | Iris-setosa |
29 | 6.7 | 3.1 | Iris-setosa |
Output of the component:
_sid | fab1_actual | fab1_predict | fab1_score | fab1_assigned_comp_id |
---|---|---|---|---|
0 | -1 | 1 | 2.657069e+00 | 2 |
1 | -1 | 1 | 6.524541e-01 | 2 |
2 | -1 | -1 | -1.600153e+00 | 0 |
… | … | … | … | … |
28 | 1 | 1 | 6.524541e-01 | 2 |
29 | 1 | -1 | -1.080094e+00 | 0 |
_sid | fab1_predict_c0 | fab1_score_c0 | fab1_predict_c1 | fab1_score_c1 | fab1_predict_c2 | fab1_score_c2 |
---|---|---|---|---|---|---|
0 | 1 | 7.921206e-01 | -1 | -1.028756e+00 | 1 | 2.657069e+00 |
1 | -1 | -5.600341e-01 | -1 | -2.346818e+01 | 1 | 6.524541e-01 |
2 | -1 | -1.600153e+00 | -1 | -1.082974e+01 | -1 | -8.895575e-01 |
… | … | … | … | … | … | … |
28 | -1 | -5.600341e-01 | -1 | -1.821556e+01 | 1 | 6.524541e-01 |
29 | -1 | -1.080094e+00 | -1 | -2.240158e+01 | -1 | -1.185517e-01 |
This component has component-specific external formats for model and prediction result evaluation.
See also
Component-common external format files in convert_process
Parameters¶
This component has the following component-specific parameters.
SPD¶
The following parameters are for the “components” section of SPD.
Parameter Name |
Type |
Domain |
Default Value |
Description |
---|---|---|---|---|
positive_label 1 |
str |
See Description |
– |
The value of the target attribute to be treated as the positive label. The domain of this parameter is the same as that of the target attribute. |
max_fab_iterations |
int |
[1, inf) |
100 |
Maximum number of FAB-iterations. |
bool |
True / False |
False |
If True, the first iteration starts with M-step; otherwise, E-step. |
|
num_acceleration_steps |
int |
[0, inf) |
0 |
The number of steps of acceleration algorithm for each FAB-iteration. If 0, the acceleration algorithm is disabled. |
repeat_until_convergence |
bool |
True / False |
False |
If False, FAB-iterations and the post-processing are executed only once
even if the FAB-iterations are stopped not by convergence condition but
by |
projection_estep |
bool |
True / False |
False |
Whether the projection E-step algorithm is enabled. |
shrink_threshold |
float or str |
[1, inf) or (0%, 100%) |
1.0 |
Threshold value for shrinkage. If a percentage value (e.g. |
fab_stop_threshold |
float or str |
(0, inf) or (0%, inf%) |
0.001 |
Threshold value for FAB-iterations: if the increase in the FIC value
is less than the threshold, the FAB-iterations are considered to
have converged. If a percentage value (e.g. |
gate_features |
str |
Query format |
all() |
Features which are applied to gate parameter optimizations. If not specified, all features are used. |
comp_features |
str |
Query format |
all() |
Features which are applied to component parameter optimizations. If not specified, all features are used. If empty, the model is learned as a decision tree. |
comp_mandatory_features |
str |
Query format |
See Description |
Features to which the non-L0-regularization constraint is applied; that is, the specified features are always relevant in every component. If not specified, no feature is exempt from L0-regularization, so all relevant features are selected by the FoBa algorithm. |
int |
[0, inf) |
5 |
Initial depth of the gate-tree structure of latent variable prior. The initial number of components is \(2^d\) where \(d\) is tree depth. If 0, the optimization with only one component will be executed. |
|
comp_bspline_degree 3 |
int |
[0, inf) |
3 |
Degree of B-spline function. |
comp_bspline_basis_dim 3 |
int |
[4, inf) |
10 |
The number of B-spline basis functions to be generated for each feature. |
float |
(-inf, inf) |
-0.5 |
Scale value for the initialization of weight values of components. |
|
float |
(-inf, inf) |
0.5 |
Scale value for the initialization of weight values of components. |
|
float |
(-inf, inf) |
0.25 |
Scale value for the initialization of bias values of components. |
|
float |
(-inf, inf) |
0.75 |
Scale value for the initialization of bias values of components. |
|
gate_l2_regularize |
float |
[0, inf) |
0.0 |
L2-regularization hyper-parameter for gate-parameter optimization. The larger the specified value, the stronger the regularization effect is. If 0.0, L2-regularization is disabled. |
with_gate_scaled_l0_regularize |
bool |
True / False |
True |
Whether to use scaled L0-regularization, which relies on a tighter lower bound of FIC, for gate parameter optimization; the approximation of det(F) is refined, where F is a Fisher matrix. |
max_gate_relevant_features |
int |
[1, inf) |
3 |
Maximum number of the relevant features for each gate. |
comp_l2_regularize |
float |
[0, inf) |
0.0 |
L2-regularization hyper-parameter for component parameter optimization. The larger the specified value, the stronger the regularization effect is. If 0.0, L2-regularization is disabled. |
comp_pspline |
float |
[0, inf) |
1.0 |
L2-regularization coefficient value for penalized B-spline function (P-spline). |
with_comp_scaled_l0_regularize |
bool |
True / False |
True |
Whether to use scaled L0-regularization, which relies on a tighter lower bound of FIC, for component parameter optimization; the approximation of det(F) is refined, where F is a Fisher matrix. |
max_comp_relevant_features |
int |
[1, inf) |
100 |
Maximum number of the relevant features for each component. |
num_threads_gates |
int |
[1, inf) |
1 |
Maximum number of OpenMP threads for gate parameter optimization; the tasks for all gates are divided among these threads. |
num_threads_comps |
int |
[1, inf) |
1 |
Maximum number of OpenMP threads for component parameter optimization. |
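The domains in the table above can be checked before a process is run. The following is a minimal sketch, not part of SAMPO: `check_params` and `initial_num_components` are hypothetical helpers, only a handful of the numeric parameters are covered, and `tree_depth` is taken from the SPD example above on the assumption that it is the parameter described as the initial depth of the gate tree.

```python
# Hypothetical helpers (not SAMPO API): check a few numeric SPD parameters
# against the domains listed in the table, and compute 2**d for the tree depth.
NUMERIC_DOMAINS = {
    # name: (expected type, inclusive lower bound)
    "max_fab_iterations": (int, 1),
    "num_acceleration_steps": (int, 0),
    "comp_bspline_degree": (int, 0),
    "comp_bspline_basis_dim": (int, 4),
    "max_gate_relevant_features": (int, 1),
    "max_comp_relevant_features": (int, 1),
    "gate_l2_regularize": (float, 0.0),
    "comp_l2_regularize": (float, 0.0),
    "comp_pspline": (float, 0.0),
}


def check_params(params):
    """Return a list of domain violations (empty if none were found)."""
    errors = []
    for name, value in params.items():
        if name not in NUMERIC_DOMAINS:
            continue  # parameters outside this sketch are not checked
        expected_type, lower = NUMERIC_DOMAINS[name]
        # bool is a subclass of int in Python, so exclude it explicitly.
        is_number = isinstance(value, (int, float)) and not isinstance(value, bool)
        if not is_number or (expected_type is int and not isinstance(value, int)):
            errors.append(f"{name}: expected {expected_type.__name__}")
        elif value < lower:
            errors.append(f"{name}: {value} is below {lower}")
    return errors


def initial_num_components(tree_depth):
    """Initial number of components, 2**d for a gate tree of depth d."""
    return 2 ** tree_depth


print(check_params({"max_fab_iterations": 0, "comp_bspline_basis_dim": 10}))
print(initial_num_components(3))
```

With `tree_depth: 3`, as in the SPD example, the learning starts from 8 components; a depth of 0 gives a single component.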
SRC¶
The following parameter is for the “hotstart” section of SRC.
Parameter Name |
Type |
Domain |
Default Value |
Description |
---|---|---|---|---|
type |
str |
{‘posterior’, ‘mh_refit_comp’, ‘mh_opt_comp’, ‘mh_refit_gate_and_refit_comp’, ‘mh_refit_gate_and_opt_comp’, ‘mh_opt_gate_and_opt_comp’} |
The hot-start type. If ‘posterior’, FAB learns with a posterior hot-start, which uses an initial model whose tree structure is generated from the base model and the data; the parameters of each gate and component are initialized randomly. ‘mh_XXX’ means that FAB learns with a model hot-start, which uses the base model as the initial model. ‘refit_{gate, comp}’ means refitting the gate functions or prediction formulas to the current data. ‘opt_{gate, comp}’ means optimizing (feature selection and fitting) the gate functions or prediction formulas on the current data. |
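The naming scheme of the hot-start types can be decoded mechanically. The sketch below is illustrative only: `decode_hotstart_type` is a hypothetical helper, not a SAMPO function, and a value of `None` only means that the type name does not mention that part of the model (this specification does not say what happens to it in that case).

```python
# Hypothetical helper: split a hot-start type name into its per-part actions.
HOTSTART_TYPES = {
    "posterior",
    "mh_refit_comp",
    "mh_opt_comp",
    "mh_refit_gate_and_refit_comp",
    "mh_refit_gate_and_opt_comp",
    "mh_opt_gate_and_opt_comp",
}


def decode_hotstart_type(type_name):
    """Return the hot-start mode and the action named for gates and components."""
    if type_name not in HOTSTART_TYPES:
        raise ValueError(f"unknown hot-start type: {type_name!r}")
    if type_name == "posterior":
        return {"mode": "posterior", "gate": None, "comp": None}
    actions = {"mode": "model", "gate": None, "comp": None}
    # Drop the "mh_" prefix, then split e.g. "refit_gate_and_opt_comp"
    # into "refit_gate" and "opt_comp".
    for part in type_name[len("mh_"):].split("_and_"):
        action, target = part.split("_")
        actions[target] = action
    return actions


print(decode_hotstart_type("mh_refit_gate_and_opt_comp"))
```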
Utilizable Sample Metadata¶
Warning
_fabhme_assigned_comp_id is deprecated. Use hotstart section of SRC instead of _fabhme_assigned_comp_id data column.
This component can utilize the _fabhme_assigned_comp_id attribute of the sample metadata to hot-start with a posterior. When the _fabhme_assigned_comp_id attribute is specified in the input data, this component starts the FAB/HME algorithm with the values of that attribute as its initial posterior.
To create the attribute _fabhme_assigned_comp_id, see the specification of the command sampo_ps_fabhme export_assigned_comp_id.
Output Attributes¶
This component generates the following attributes.
Attribute Name |
Scale |
Description |
---|---|---|
<component_id>_actual |
INTEGER |
Values of target attribute. |
<component_id>_predict |
INTEGER |
Predicted values. |
<component_id>_score |
REAL |
A prediction result is obtained by classifying this value according to a boundary. |
<component_id>_assigned_comp_id |
INTEGER |
IDs of the components (prediction formulas) assigned by the gating functions. |
<component_id>_predict_c<hme_comp_id> |
INTEGER |
Predicted values for the prediction formula of component id, <hme_comp_id>. |
<component_id>_score_c<hme_comp_id> |
REAL |
Score values for the prediction formula of component id, <hme_comp_id>. |
<component_id>_basisfunc_<feature_attr_name>:<basis_func_index> |
REAL |
Basis function values. |
These attributes are in the component output data and can be loaded through the SAMPO API.
See also
Obtaining process results via ProcessResultLoader.
When convert_process is executed, the component output data will be saved in two separate files:
All non-basis function value attributes will be saved as <component_id>_predict_result.csv.
This file describes the prediction result of the component.
_sid,fab1_actual,fab1_predict,fab1_score,fab1_assigned_comp_id,fab1_predict_c0,fab1_score_c0,fab1_predict_c1,fab1_score_c1,fab1_predict_c2,fab1_score_c2
0,-1,1,2.657069e+00,2,1,7.921206e-01,-1,-1.028756e+00,1,2.657069e+00
1,-1,1,6.524541e-01,2,-1,-5.600341e-01,-1,-2.346818e+01,1,6.524541e-01
2,-1,-1,-1.600153e+00,0,-1,-1.600153e+00,-1,-1.082974e+01,-1,-8.895575e-01
...
28,1,1,6.524541e-01,2,-1,-5.600341e-01,-1,-1.821556e+01,1,6.524541e-01
29,1,-1,-1.080094e+00,0,-1,-1.080094e+00,-1,-2.240158e+01,-1,-1.185517e-01
Basis function value attributes will be saved as basis_func_values.csv.
This file describes the basis function values of B-spline functions.
_sid,fab1_basisfunc_std1_CRIM:0,fab1_basisfunc_std1_CRIM:1,fab1_basisfunc_std1_CRIM:2,fab1_basisfunc_std1_CRIM:3,fab1_basisfunc_std1_CRIM:4,fab1_basisfunc_std1_CRIM:5,fab1_basisfunc_std1_CRIM:6,fab1_basisfunc_std1_CRIM:7,fab1_basisfunc_std1_CRIM:8,fab1_basisfunc_std1_CRIM:9,fab1_basisfunc_std1_ZN:0,fab1_basisfunc_std1_ZN:1,fab1_basisfunc_std1_ZN:2,fab1_basisfunc_std1_ZN:3,fab1_basisfunc_std1_ZN:4,fab1_basisfunc_std1_ZN:5,fab1_basisfunc_std1_ZN:6,fab1_basisfunc_std1_ZN:7,fab1_basisfunc_std1_ZN:8,fab1_basisfunc_std1_ZN:9,fab1_basisfunc_std1_NOX:0,fab1_basisfunc_std1_NOX:1,fab1_basisfunc_std1_NOX:2,fab1_basisfunc_std1_NOX:3,fab1_basisfunc_std1_NOX:4,fab1_basisfunc_std1_NOX:5,fab1_basisfunc_std1_NOX:6,fab1_basisfunc_std1_NOX:7,fab1_basisfunc_std1_NOX:8,fab1_basisfunc_std1_NOX:9,fab1_basisfunc_bin1(0)_CHAS:0,fab1_basisfunc_bin1(0)_CHAS:1,fab1_basisfunc_bin1(0)_CHAS:2,fab1_basisfunc_bin1(0)_CHAS:3,fab1_basisfunc_bin1(0)_CHAS:4,fab1_basisfunc_bin1(0)_CHAS:5,fab1_basisfunc_bin1(0)_CHAS:6,fab1_basisfunc_bin1(0)_CHAS:7,fab1_basisfunc_bin1(0)_CHAS:8,fab1_basisfunc_bin1(0)_CHAS:9,fab1_basisfunc_bin1(1)_RAD:0,fab1_basisfunc_bin1(1)_RAD:1,fab1_basisfunc_bin1(1)_RAD:2,fab1_basisfunc_bin1(1)_RAD:3,fab1_basisfunc_bin1(1)_RAD:4,fab1_basisfunc_bin1(1)_RAD:5,fab1_basisfunc_bin1(1)_RAD:6,fab1_basisfunc_bin1(1)_RAD:7,fab1_basisfunc_bin1(1)_RAD:8,fab1_basisfunc_bin1(1)_RAD:9,fab1_basisfunc_std1_LSTAT:0,fab1_basisfunc_std1_LSTAT:1,fab1_basisfunc_std1_LSTAT:2,fab1_basisfunc_std1_LSTAT:3,fab1_basisfunc_std1_LSTAT:4,fab1_basisfunc_std1_LSTAT:5,fab1_basisfunc_std1_LSTAT:6,fab1_basisfunc_std1_LSTAT:7,fab1_basisfunc_std1_LSTAT:8,fab1_basisfunc_std1_LSTAT:9,fab1_basisfunc_std1_TAX:0,fab1_basisfunc_std1_TAX:1,fab1_basisfunc_std1_TAX:2,fab1_basisfunc_std1_TAX:3,fab1_basisfunc_std1_TAX:4,fab1_basisfunc_std1_TAX:5,fab1_basisfunc_std1_TAX:6,fab1_basisfunc_std1_TAX:7,fab1_basisfunc_std1_TAX:8,fab1_basisfunc_std1_TAX:9,fab1_basisfunc_bin1(3)_RAD:0,fab1_basisfunc_bin1(3)_RAD:1,fab1_basisfunc_bin1(3)_RAD:
2,fab1_basisfunc_bin1(3)_RAD:3,fab1_basisfunc_bin1(3)_RAD:4,fab1_basisfunc_bin1(3)_RAD:5,fab1_basisfunc_bin1(3)_RAD:6,fab1_basisfunc_bin1(3)_RAD:7,fab1_basisfunc_bin1(3)_RAD:8,fab1_basisfunc_bin1(3)_RAD:9,fab1_basisfunc_std1_DIS:0,fab1_basisfunc_std1_DIS:1,fab1_basisfunc_std1_DIS:2,fab1_basisfunc_std1_DIS:3,fab1_basisfunc_std1_DIS:4,fab1_basisfunc_std1_DIS:5,fab1_basisfunc_std1_DIS:6,fab1_basisfunc_std1_DIS:7,fab1_basisfunc_std1_DIS:8,fab1_basisfunc_std1_DIS:9,fab1_basisfunc_std1_PTRATIO:0,fab1_basisfunc_std1_PTRATIO:1,fab1_basisfunc_std1_PTRATIO:2,fab1_basisfunc_std1_PTRATIO:3,fab1_basisfunc_std1_PTRATIO:4,fab1_basisfunc_std1_PTRATIO:5,fab1_basisfunc_std1_PTRATIO:6,fab1_basisfunc_std1_PTRATIO:7,fab1_basisfunc_std1_PTRATIO:8,fab1_basisfunc_std1_PTRATIO:9,fab1_basisfunc_std1_B:0,fab1_basisfunc_std1_B:1,fab1_basisfunc_std1_B:2,fab1_basisfunc_std1_B:3,fab1_basisfunc_std1_B:4,fab1_basisfunc_std1_B:5,fab1_basisfunc_std1_B:6,fab1_basisfunc_std1_B:7,fab1_basisfunc_std1_B:8,fab1_basisfunc_std1_B:9,fab1_basisfunc_std1_INDUS:0,fab1_basisfunc_std1_INDUS:1,fab1_basisfunc_std1_INDUS:2,fab1_basisfunc_std1_INDUS:3,fab1_basisfunc_std1_INDUS:4,fab1_basisfunc_std1_INDUS:5,fab1_basisfunc_std1_INDUS:6,fab1_basisfunc_std1_INDUS:7,fab1_basisfunc_std1_INDUS:8,fab1_basisfunc_std1_INDUS:9,fab1_basisfunc_std1_RM:0,fab1_basisfunc_std1_RM:1,fab1_basisfunc_std1_RM:2,fab1_basisfunc_std1_RM:3,fab1_basisfunc_std1_RM:4,fab1_basisfunc_std1_RM:5,fab1_basisfunc_std1_RM:6,fab1_basisfunc_std1_RM:7,fab1_basisfunc_std1_RM:8,fab1_basisfunc_std1_RM:9,fab1_basisfunc_std1_AGE:0,fab1_basisfunc_std1_AGE:1,fab1_basisfunc_std1_AGE:2,fab1_basisfunc_std1_AGE:3,fab1_basisfunc_std1_AGE:4,fab1_basisfunc_std1_AGE:5,fab1_basisfunc_std1_AGE:6,fab1_basisfunc_std1_AGE:7,fab1_basisfunc_std1_AGE:8,fab1_basisfunc_std1_AGE:9 
0,6.666667e-01,3.333333e-01,0.000000e+00,0.000000e+00,0.000000e+00,0.000000e+00,0.000000e+00,0.000000e+00,0.000000e+00,-4.197819e-01,0.000000e+00,4.056000e-01,5.621333e-01,3.226667e-02,0.000000e+00,0.000000e+00,0.000000e+00,0.000000e+00,0.000000e+00,2.848299e-01,3.540404e-01,5.969046e-01,4.905502e-02,0.000000e+00,0.000000e+00,0.000000e+00,0.000000e+00,0.000000e+00,0.000000e+00,-1.287909e+00,0.000000e+00,0.000000e+00,3.657979e-01,5.893919e-01,4.481024e-02,0.000000e+00,0.000000e+00,0.000000e+00,0.000000e+00,-1.442174e-01,0.000000e+00,0.000000e+00,0.000000e+00,0.000000e+00,3.173806e-01,6.185440e-01,6.407538e-02,0.000000e+00,0.000000e+00,4.136719e-01,0.000000e+00,0.000000e+00,0.000000e+00,0.000000e+00,0.000000e+00,5.810398e-01,4.160185e-01,2.941641e-03,0.000000e+00,-1.200134e-01,0.000000e+00,0.000000e+00,5.681834e-01,4.278832e-01,3.933446e-03,0.000000e+00,0.000000e+00,0.000000e+00,0.000000e+00,1.402136e-01,0.000000e+00,2.974283e-01,6.290620e-01,7.350970e-02,0.000000e+00,0.000000e+00,0.000000e+00,0.000000e+00,0.000000e+00,-6.666082e-01,0.000000e+00,0.000000e+00,4.828731e-01,5.023389e-01,1.478799e-02,0.000000e+00,0.000000e+00,0.000000e+00,0.000000e+00,-1.459000e+00,0.000000e+00,0.000000e+00,0.000000e+00,0.000000e+00,0.000000e+00,0.000000e+00,0.000000e+00,0.000000e+00,5.000000e-01,4.410519e-01,2.741603e-01,6.400531e-01,8.578652e-02,0.000000e+00,0.000000e+00,0.000000e+00,0.000000e+00,0.000000e+00,0.000000e+00,-1.075562e+00,0.000000e+00,0.000000e+00,0.000000e+00,0.000000e+00,0.000000e+00,0.000000e+00,0.000000e+00,0.000000e+00,5.000000e-01,1.000000e+00,6.666667e-01,3.333333e-01,0.000000e+00,0.000000e+00,0.000000e+00,0.000000e+00,0.000000e+00,0.000000e+00,0.000000e+00,0.000000e+00,0.000000e+00,0.000000e+00,0.000000e+00,0.000000e+00,0.000000e+00,0.000000e+00,0.000000e+00,0.000000e+00,5.000000e-01,1.000000e+00 
1,6.654090e-01,3.345904e-01,5.937007e-07,0.000000e+00,0.000000e+00,0.000000e+00,0.000000e+00,0.000000e+00,0.000000e+00,-4.173393e-01,6.666667e-01,3.333333e-01,0.000000e+00,0.000000e+00,0.000000e+00,0.000000e+00,0.000000e+00,0.000000e+00,0.000000e+00,-4.877224e-01,0.000000e+00,1.878266e-01,6.654025e-01,1.467709e-01,0.000000e+00,0.000000e+00,0.000000e+00,0.000000e+00,0.000000e+00,-5.933810e-01,0.000000e+00,4.359346e-01,5.396535e-01,2.441193e-02,0.000000e+00,0.000000e+00,0.000000e+00,0.000000e+00,0.000000e+00,-7.402622e-01,0.000000e+00,0.000000e+00,0.000000e+00,0.000000e+00,4.352526e-01,5.401738e-01,2.457365e-02,0.000000e+00,0.000000e+00,1.942745e-01,0.000000e+00,0.000000e+00,0.000000e+00,0.000000e+00,0.000000e+00,0.000000e+00,5.036805e-01,4.849149e-01,1.140454e-02,3.671664e-01,0.000000e+00,0.000000e+00,2.433332e-01,6.522031e-01,1.044637e-01,0.000000e+00,0.000000e+00,0.000000e+00,0.000000e+00,5.571599e-01,2.243847e-01,6.581007e-01,1.175145e-01,0.000000e+00,0.000000e+00,0.000000e+00,0.000000e+00,0.000000e+00,0.000000e+00,-9.873295e-01,0.000000e+00,0.000000e+00,0.000000e+00,0.000000e+00,4.131583e-01,5.566621e-01,3.017957e-02,0.000000e+00,0.000000e+00,-3.030941e-01,0.000000e+00,0.000000e+00,0.000000e+00,0.000000e+00,0.000000e+00,0.000000e+00,0.000000e+00,0.000000e+00,5.000000e-01,4.410519e-01,0.000000e+00,3.101911e-01,6.224435e-01,6.736547e-02,0.000000e+00,0.000000e+00,0.000000e+00,0.000000e+00,0.000000e+00,-4.924394e-01,0.000000e+00,0.000000e+00,0.000000e+00,0.000000e+00,0.000000e+00,0.000000e+00,0.000000e+00,0.000000e+00,5.000000e-01,1.000000e+00,6.666667e-01,3.333333e-01,0.000000e+00,0.000000e+00,0.000000e+00,0.000000e+00,0.000000e+00,0.000000e+00,0.000000e+00,0.000000e+00,6.666667e-01,3.333333e-01,0.000000e+00,0.000000e+00,0.000000e+00,0.000000e+00,0.000000e+00,0.000000e+00,0.000000e+00,0.000000e+00 
2,6.654102e-01,3.345892e-01,5.925699e-07,0.000000e+00,0.000000e+00,0.000000e+00,0.000000e+00,0.000000e+00,0.000000e+00,-4.173416e-01,6.666667e-01,3.333333e-01,0.000000e+00,0.000000e+00,0.000000e+00,0.000000e+00,0.000000e+00,0.000000e+00,0.000000e+00,-4.877224e-01,0.000000e+00,1.878266e-01,6.654025e-01,1.467709e-01,0.000000e+00,0.000000e+00,0.000000e+00,0.000000e+00,0.000000e+00,-5.933810e-01,0.000000e+00,4.359346e-01,5.396535e-01,2.441193e-02,0.000000e+00,0.000000e+00,0.000000e+00,0.000000e+00,0.000000e+00,-7.402622e-01,0.000000e+00,0.000000e+00,0.000000e+00,0.000000e+00,0.000000e+00,3.479622e-01,6.006842e-01,5.135363e-02,0.000000e+00,1.282714e+00,0.000000e+00,0.000000e+00,0.000000e+00,0.000000e+00,2.419814e-01,6.526661e-01,1.053525e-01,0.000000e+00,0.000000e+00,-2.658118e-01,0.000000e+00,0.000000e+00,2.433332e-01,6.522031e-01,1.044637e-01,0.000000e+00,0.000000e+00,0.000000e+00,0.000000e+00,5.571599e-01,2.243847e-01,6.581007e-01,1.175145e-01,0.000000e+00,0.000000e+00,0.000000e+00,0.000000e+00,0.000000e+00,0.000000e+00,-9.873295e-01,0.000000e+00,0.000000e+00,0.000000e+00,0.000000e+00,4.131583e-01,5.566621e-01,3.017957e-02,0.000000e+00,0.000000e+00,-3.030941e-01,0.000000e+00,0.000000e+00,0.000000e+00,0.000000e+00,0.000000e+00,0.000000e+00,0.000000e+00,1.951574e-01,5.942084e-01,3.964270e-01,3.711468e-01,5.858889e-01,4.296433e-02,0.000000e+00,0.000000e+00,0.000000e+00,0.000000e+00,0.000000e+00,0.000000e+00,-1.208727e+00,0.000000e+00,0.000000e+00,0.000000e+00,0.000000e+00,0.000000e+00,0.000000e+00,0.000000e+00,0.000000e+00,5.000000e-01,1.000000e+00,6.666667e-01,3.333333e-01,0.000000e+00,0.000000e+00,0.000000e+00,0.000000e+00,0.000000e+00,0.000000e+00,0.000000e+00,0.000000e+00,6.666667e-01,3.333333e-01,0.000000e+00,0.000000e+00,0.000000e+00,0.000000e+00,0.000000e+00,0.000000e+00,0.000000e+00,0.000000e+00 
3,6.651060e-01,3.348931e-01,9.144462e-07,0.000000e+00,0.000000e+00,0.000000e+00,0.000000e+00,0.000000e+00,0.000000e+00,-4.167504e-01,6.666667e-01,3.333333e-01,0.000000e+00,0.000000e+00,0.000000e+00,0.000000e+00,0.000000e+00,0.000000e+00,0.000000e+00,-4.877224e-01,3.728038e-01,5.847932e-01,4.240303e-02,0.000000e+00,0.000000e+00,0.000000e+00,0.000000e+00,0.000000e+00,0.000000e+00,-1.306878e+00,0.000000e+00,5.390128e-01,4.542103e-01,6.776858e-03,0.000000e+00,0.000000e+00,0.000000e+00,0.000000e+00,0.000000e+00,-8.352838e-01,0.000000e+00,0.000000e+00,0.000000e+00,0.000000e+00,0.000000e+00,4.997154e-01,4.882744e-01,1.201021e-02,0.000000e+00,1.016303e+00,0.000000e+00,0.000000e+00,0.000000e+00,3.579481e-01,5.944367e-01,4.761513e-02,0.000000e+00,0.000000e+00,0.000000e+00,-8.098885e-01,0.000000e+00,0.000000e+00,0.000000e+00,3.321228e-01,6.101833e-01,5.769394e-02,0.000000e+00,0.000000e+00,0.000000e+00,1.077737e+00,3.580211e-01,5.943904e-01,4.758852e-02,0.000000e+00,0.000000e+00,0.000000e+00,0.000000e+00,0.000000e+00,0.000000e+00,-1.106115e+00,0.000000e+00,0.000000e+00,0.000000e+00,0.000000e+00,0.000000e+00,5.451185e-01,4.487702e-01,6.111363e-03,0.000000e+00,1.130321e-01,0.000000e+00,0.000000e+00,0.000000e+00,0.000000e+00,0.000000e+00,0.000000e+00,0.000000e+00,1.822800e-01,5.900916e-01,4.161628e-01,5.004857e-01,4.876232e-01,1.189113e-02,0.000000e+00,0.000000e+00,0.000000e+00,0.000000e+00,0.000000e+00,0.000000e+00,-1.361517e+00,0.000000e+00,0.000000e+00,0.000000e+00,0.000000e+00,0.000000e+00,0.000000e+00,0.000000e+00,0.000000e+00,5.000000e-01,1.000000e+00,0.000000e+00,0.000000e+00,0.000000e+00,0.000000e+00,0.000000e+00,0.000000e+00,0.000000e+00,0.000000e+00,5.000000e-01,1.000000e+00,6.666667e-01,3.333333e-01,0.000000e+00,0.000000e+00,0.000000e+00,0.000000e+00,0.000000e+00,0.000000e+00,0.000000e+00,0.000000e+00 ... 
504,6.604905e-01,3.394952e-01,1.437113e-05,0.000000e+00,0.000000e+00,0.000000e+00,0.000000e+00,0.000000e+00,0.000000e+00,-4.077641e-01,6.666667e-01,3.333333e-01,0.000000e+00,0.000000e+00,0.000000e+00,0.000000e+00,0.000000e+00,0.000000e+00,0.000000e+00,-4.877224e-01,0.000000e+00,0.000000e+00,0.000000e+00,4.462810e-01,5.316804e-01,2.203857e-02,0.000000e+00,0.000000e+00,0.000000e+00,1.157384e-01,0.000000e+00,0.000000e+00,0.000000e+00,6.050596e-01,3.934473e-01,1.493110e-03,0.000000e+00,0.000000e+00,0.000000e+00,1.581241e-01,0.000000e+00,0.000000e+00,0.000000e+00,0.000000e+00,1.817470e-01,6.660136e-01,1.522394e-01,0.000000e+00,0.000000e+00,7.256721e-01,0.000000e+00,0.000000e+00,0.000000e+00,0.000000e+00,0.000000e+00,0.000000e+00,0.000000e+00,5.900481e-01,4.064453e-01,7.369964e-01,1.958019e-01,6.643210e-01,1.398771e-01,0.000000e+00,0.000000e+00,0.000000e+00,0.000000e+00,0.000000e+00,0.000000e+00,-6.684368e-01,0.000000e+00,4.743410e-01,5.093332e-01,1.632578e-02,0.000000e+00,0.000000e+00,0.000000e+00,0.000000e+00,0.000000e+00,-8.032117e-01,0.000000e+00,0.000000e+00,0.000000e+00,0.000000e+00,0.000000e+00,0.000000e+00,0.000000e+00,5.710729e-01,4.233816e-01,1.176466e+00,0.000000e+00,0.000000e+00,0.000000e+00,0.000000e+00,0.000000e+00,0.000000e+00,0.000000e+00,1.906723e-01,5.929144e-01,4.032249e-01,0.000000e+00,6.346830e-01,3.649239e-01,3.930952e-04,0.000000e+00,0.000000e+00,0.000000e+00,0.000000e+00,0.000000e+00,-8.653016e-01,0.000000e+00,0.000000e+00,0.000000e+00,0.000000e+00,0.000000e+00,0.000000e+00,0.000000e+00,0.000000e+00,5.000000e-01,1.000000e+00,6.666667e-01,3.333333e-01,0.000000e+00,0.000000e+00,0.000000e+00,0.000000e+00,0.000000e+00,0.000000e+00,0.000000e+00,0.000000e+00,0.000000e+00,0.000000e+00,0.000000e+00,0.000000e+00,0.000000e+00,0.000000e+00,0.000000e+00,0.000000e+00,5.000000e-01,1.000000e+00 
505,6.642058e-01,3.357919e-01,2.275176e-06,0.000000e+00,0.000000e+00,0.000000e+00,0.000000e+00,0.000000e+00,0.000000e+00,-4.150002e-01,6.666667e-01,3.333333e-01,0.000000e+00,0.000000e+00,0.000000e+00,0.000000e+00,0.000000e+00,0.000000e+00,0.000000e+00,-4.877224e-01,0.000000e+00,0.000000e+00,0.000000e+00,4.462810e-01,5.316804e-01,2.203857e-02,0.000000e+00,0.000000e+00,0.000000e+00,1.157384e-01,0.000000e+00,0.000000e+00,0.000000e+00,6.050596e-01,3.934473e-01,1.493110e-03,0.000000e+00,0.000000e+00,0.000000e+00,1.581241e-01,0.000000e+00,0.000000e+00,0.000000e+00,2.461861e-01,6.512057e-01,1.026082e-01,0.000000e+00,0.000000e+00,0.000000e+00,-3.627671e-01,0.000000e+00,0.000000e+00,0.000000e+00,0.000000e+00,0.000000e+00,0.000000e+00,4.170544e-01,5.538074e-01,2.913818e-02,4.347315e-01,0.000000e+00,6.662848e-01,3.337151e-01,5.470025e-08,0.000000e+00,0.000000e+00,0.000000e+00,0.000000e+00,0.000000e+00,-6.132465e-01,0.000000e+00,4.743410e-01,5.093332e-01,1.632578e-02,0.000000e+00,0.000000e+00,0.000000e+00,0.000000e+00,0.000000e+00,-8.032117e-01,0.000000e+00,0.000000e+00,0.000000e+00,0.000000e+00,0.000000e+00,0.000000e+00,0.000000e+00,5.710729e-01,4.233816e-01,1.176466e+00,0.000000e+00,0.000000e+00,0.000000e+00,0.000000e+00,0.000000e+00,0.000000e+00,0.000000e+00,0.000000e+00,5.000000e-01,4.410519e-01,0.000000e+00,4.495709e-01,5.291142e-01,2.131485e-02,0.000000e+00,0.000000e+00,0.000000e+00,0.000000e+00,0.000000e+00,-6.690583e-01,0.000000e+00,0.000000e+00,0.000000e+00,0.000000e+00,0.000000e+00,0.000000e+00,0.000000e+00,0.000000e+00,5.000000e-01,1.000000e+00,6.666667e-01,3.333333e-01,0.000000e+00,0.000000e+00,0.000000e+00,0.000000e+00,0.000000e+00,0.000000e+00,0.000000e+00,0.000000e+00,0.000000e+00,0.000000e+00,0.000000e+00,0.000000e+00,0.000000e+00,0.000000e+00,0.000000e+00,0.000000e+00,5.000000e-01,1.000000e+00
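In the rows of the <component_id>_predict_result.csv example above, the overall score and prediction of each sample coincide with the per-component values of the component the sample was assigned to. The following sketch checks this relationship on those rows using only the standard library; the helper names are hypothetical, and the relationship is an observation from the example rows rather than a documented guarantee.

```python
import csv
import io

# The rows shown in the predict result example (the elided rows are left out).
PREDICT_RESULT_CSV = """\
_sid,fab1_actual,fab1_predict,fab1_score,fab1_assigned_comp_id,fab1_predict_c0,fab1_score_c0,fab1_predict_c1,fab1_score_c1,fab1_predict_c2,fab1_score_c2
0,-1,1,2.657069e+00,2,1,7.921206e-01,-1,-1.028756e+00,1,2.657069e+00
1,-1,1,6.524541e-01,2,-1,-5.600341e-01,-1,-2.346818e+01,1,6.524541e-01
2,-1,-1,-1.600153e+00,0,-1,-1.600153e+00,-1,-1.082974e+01,-1,-8.895575e-01
28,1,1,6.524541e-01,2,-1,-5.600341e-01,-1,-1.821556e+01,1,6.524541e-01
29,1,-1,-1.080094e+00,0,-1,-1.080094e+00,-1,-2.240158e+01,-1,-1.185517e-01
"""


def load_predict_result(text):
    """Parse the CSV text into a list of dictionaries keyed by column name."""
    return list(csv.DictReader(io.StringIO(text)))


def matches_assigned_component(row, cid="fab1"):
    """True if score and prediction equal those of the assigned component."""
    comp = row[f"{cid}_assigned_comp_id"]
    same_score = float(row[f"{cid}_score"]) == float(row[f"{cid}_score_c{comp}"])
    same_label = int(row[f"{cid}_predict"]) == int(row[f"{cid}_predict_c{comp}"])
    return same_score and same_label


rows = load_predict_result(PREDICT_RESULT_CSV)
all_match = all(matches_assigned_component(row) for row in rows)
print(all_match)
```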
Attribute Metadata¶
The metadata of the output attributes is created with the following rules.
Context Rule¶
Attribute Name |
Context Name |
Description |
---|---|---|
All the output attributes of this component |
field_path |
List of the superordinate concepts of each output attribute based on the following hierarchical structure of the output attributes: root
├── fabhmecl
│ ├── assigned_comp_id
│ └── component
│ ├── 0
│ │ ├── predict
│ │ └── score
│ ├── 1
│ │ ├── predict
│ │ └── score
│ .
│ .
│ .
│
└── binary_classification
├── actual
├── predict
└── score
|
<component_id>_actual, <component_id>_predict, <component_id>_predict_c<hme_comp_id> |
positive_map |
Mapping between a positive value and a positive label. |
<component_id>_actual, <component_id>_predict, <component_id>_predict_c<hme_comp_id> |
negative_map |
Mapping between a negative value and a negative label. |
<component_id>_assigned_comp_id |
active_comp_ids |
List of component IDs corresponding to each prediction formula. |
Derivation Rule¶
Attribute Name |
Derived From |
---|---|
<component_id>_actual |
Derived from the target attribute. |
<component_id>_predict |
Derived from the attributes which have non-zero coefficients in any prediction formula. |
<component_id>_score |
Derived from the attributes which have non-zero coefficients in any prediction formula. |
<component_id>_assigned_comp_id |
Derived from the attributes used in the gating functions. |
<component_id>_predict_c<hme_comp_id> |
Derived from the attributes which have non-zero coefficients in the prediction formula of component id, <hme_comp_id>. |
<component_id>_score_c<hme_comp_id> |
Derived from the attributes which have non-zero coefficients in the prediction formula of component id, <hme_comp_id>. |
<component_id>_basisfunc_<feature_attr_name>:<basis_function_index> |
Derived from the attribute of the name of <feature_attr_name>. |
Example¶
{
"nodes": [
{
"aid": "fab1[15]",
"name": "fab1_basisfunc_sepal_width_in_cm:9",
"scale": "real",
"is_excluded": false,
"cid": "fab1",
"cindex": 15,
"values": null,
"is_kept": false,
"context": null
},
{
"aid": "_sid",
"name": "_sid",
"scale": "integer",
"is_excluded": false,
"cid": null,
"cindex": 0,
"values": null,
"is_kept": false,
"context": null
},
...
],
"links": [
[
"dl1[1]",
"fab1[14]"
],
[
"dl1[1]",
"fab1[5]"
],
...
]
}
See also
Attribute metadata file format in Attribute Metadata File Specification
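The `links` list of the example can be turned into a lookup table of derivations. This is a minimal sketch under one assumption: each link is read as a pair of [source attribute ID, derived attribute ID], which matches the derivation rules above but is not stated explicitly in this section. The JSON below contains only the entries shown in the example, and `derived_from` is a hypothetical helper.

```python
import json
from collections import defaultdict

# Only the nodes and links shown in the example; the elided entries are omitted.
METADATA_JSON = """
{
  "nodes": [
    {"aid": "fab1[15]", "name": "fab1_basisfunc_sepal_width_in_cm:9",
     "scale": "real", "is_excluded": false, "cid": "fab1", "cindex": 15,
     "values": null, "is_kept": false, "context": null},
    {"aid": "_sid", "name": "_sid",
     "scale": "integer", "is_excluded": false, "cid": null, "cindex": 0,
     "values": null, "is_kept": false, "context": null}
  ],
  "links": [
    ["dl1[1]", "fab1[14]"],
    ["dl1[1]", "fab1[5]"]
  ]
}
"""


def derived_from(metadata):
    """Map each derived attribute ID to the list of its source attribute IDs.

    ASSUMPTION: a link is [source aid, derived aid].
    """
    sources = defaultdict(list)
    for source_aid, derived_aid in metadata["links"]:
        sources[derived_aid].append(source_aid)
    return dict(sources)


metadata = json.loads(METADATA_JSON)
names_by_aid = {node["aid"]: node["name"] for node in metadata["nodes"]}
print(derived_from(metadata))
```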
Model¶
The model of this component can be described by the following parameters.
Model Parameter |
Type |
Domain |
Description |
---|---|---|---|
fic |
float |
(-inf, inf) |
Factorized Information Criterion. The asymptotic approximation value used by FAB/HME. |
num_initial_comps |
int |
[0, inf) |
The initial number of components before iterations. |
num_active_comps |
int |
[0, inf) |
The terminal number of active components after iterations. |
gate_tree |
dict |
See Description |
Dictionary form of the gating tree structure. |
prediction_formulas |
pandas.DataFrame |
See Description |
Component weights and bias for each prediction formula. |
bspline_params |
pandas.DataFrame |
See Description |
Degree and basis dimensionality of the B-spline function. |
bspline_knot_vecs |
pandas.DataFrame |
See Description |
Knot vectors of the B-spline functions, one knot vector per feature. |
The gate_tree dictionary keys are described below:
Gate Tree Dictionary Key |
Type |
Domain |
Description |
---|---|---|---|
gate_type |
str |
‘logit’ |
The type of gate. |
hard_gate |
bool |
true / false |
Whether the gate is a hard gate. |
nodes |
list of dict |
See Description |
List of node dictionaries. |
edges |
list of dict |
See Description |
List of edge dictionaries. |
The keys of each node dictionary in nodes are described below:
Node Dictionary Key |
Type |
Domain |
Description |
---|---|---|---|
node_id |
int |
[0, inf) |
The node ID. |
node_type |
str |
{‘gate’, ‘component’} |
The node type. |
gate_func |
dict |
See Description |
The |
comp_id |
int |
[0, inf) |
The component ID. Specifiable if |
The keys of each edge dictionary in edges are described below:
Edge Dictionary Key |
Type |
Domain |
Description |
---|---|---|---|
source |
int |
[0, inf) |
The |
target |
int |
[0, inf) |
The |
is_left |
bool |
true / false |
Whether the target node is the left-child of the source. |
The keys of the gate_func dictionary are described below:
Gate Function Dictionary Key |
Type |
Domain |
Description |
---|---|---|---|
bias |
float |
(-inf, inf) |
The gate function bias. |
weights |
list of dict |
See Description |
List of weight dictionaries, each mapping an attribute to its corresponding weight. |
When the model is loaded in the SAMPO API, the model parameters will be output as a single dictionary.
See also
Obtaining process results via ProcessResultLoader
{'fic': -23.832958802449035,
'num_initial_comps': 32,
'num_active_comps': 2,
'gate_tree':
{'gate_type': 'logit',
'hard_gate': True,
'nodes': [
{'comp_id': 20, 'node_type': 'component', 'node_id': 1},
{'node_type': 'gate',
'node_id': 0,
'gate_func':
{'bias': -14.594158450398055,
'weights': [
{'aid': 'dl[0]', 'attr_name': 'sepal_length_in_cm', 'weight': 10.426327199487217},
{'aid': 'dl[1]', 'attr_name': 'petal_length_in_cm', 'weight': -13.460106074504926}]}},
{'comp_id': 30, 'node_type': 'component', 'node_id': 2}],
'edges': [
{'source': 0, 'target': 1, 'is_left': True},
{'source': 0, 'target': 2, 'is_left': False}]},
'prediction_formulas':
prediction_formula_20 prediction_formula_30
attr_name basis_function_index
sepal_length_in_cm 0 0 0
1 0 0
2 0 0
3 0 0
4 0 0
5 0 0
6 0 0
7 0 0
8 0 0
9 0 0
petal_length_in_cm 0 0 0
1 0 0
2 0 0
3 0 0
4 0 0
5 0 0
6 0 0
7 0 0
8 0 0
9 0 0
bias -1 1,
'bspline_params': degree basis_dim
0 3 10,
'bspline_knot_vecs':
knot_value_0 knot_value_1 knot_value_2 knot_value_3 knot_value_4 knot_value_5 knot_value_6 knot_value_7 knot_value_8 knot_value_9 knot_value_10 knot_value_11 knot_value_12
attr_name
sepal_length_in_cm 3.9625 3.9625 4.3 4.6375 4.975 5.3125 5.65 5.9875 6.325 6.6625 7.0 7.3375 7.3375
petal_length_in_cm 1.7000 1.7000 2.0 2.3000 2.600 2.9000 3.20 3.5000 3.800 4.1000 4.4 4.7000 4.7000
}
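The gate_tree dictionary contains enough information to route a sample to a component. The sketch below walks the tree from the example above. It rests on two assumptions that this specification does not confirm: the gate value is the logistic function of bias plus the weighted sum of the attributes, and a sample goes to the left child when that value is at least 0.5. `assign_component` and the other helper names are hypothetical.

```python
import math

# The gate tree from the model example above.
GATE_TREE = {
    "gate_type": "logit",
    "hard_gate": True,
    "nodes": [
        {"comp_id": 20, "node_type": "component", "node_id": 1},
        {"node_type": "gate", "node_id": 0,
         "gate_func": {
             "bias": -14.594158450398055,
             "weights": [
                 {"aid": "dl[0]", "attr_name": "sepal_length_in_cm",
                  "weight": 10.426327199487217},
                 {"aid": "dl[1]", "attr_name": "petal_length_in_cm",
                  "weight": -13.460106074504926}]}},
        {"comp_id": 30, "node_type": "component", "node_id": 2},
    ],
    "edges": [
        {"source": 0, "target": 1, "is_left": True},
        {"source": 0, "target": 2, "is_left": False},
    ],
}


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def gate_value(gate_func, sample):
    """ASSUMPTION: logistic function of bias + sum(weight * attribute value)."""
    z = gate_func["bias"] + sum(
        w["weight"] * sample[w["attr_name"]] for w in gate_func["weights"])
    return sigmoid(z)


def assign_component(gate_tree, sample):
    """Follow the gates from the root down to a component node."""
    nodes = {node["node_id"]: node for node in gate_tree["nodes"]}
    children = {}
    for edge in gate_tree["edges"]:
        children.setdefault(edge["source"], {})[edge["is_left"]] = edge["target"]
    # The root is the only node that is never the target of an edge.
    targets = {edge["target"] for edge in gate_tree["edges"]}
    (root_id,) = [node_id for node_id in nodes if node_id not in targets]
    node = nodes[root_id]
    while node["node_type"] == "gate":
        # ASSUMPTION: a gate value of at least 0.5 selects the left child.
        go_left = gate_value(node["gate_func"], sample) >= 0.5
        node = nodes[children[node["node_id"]][go_left]]
    return node["comp_id"]


sample = {"sepal_length_in_cm": 5.0, "petal_length_in_cm": 1.5}
print(assign_component(GATE_TREE, sample))
```

If the opposite branch convention applies, only the `go_left` line needs to change; the traversal itself depends only on the nodes and edges structure described above.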
External Format¶
When convert_process is executed, the model parameters are saved into different files and are grouped as: general information, gating function, prediction formula, B-spline parameters, and B-spline knot vectors.
General Information¶
This file describes the \(FIC\) value after learning the model, the initial number of components, and the final number of active components.
fic,num_initial_comps,num_active_comps
-1.294308e+02,8,3
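A short sketch of reading this file with the standard library follows. The CSV text is the example above; `read_general_info` is a hypothetical helper, and the depth calculation relies on the statement in the parameter table that the initial number of components is \(2^d\).

```python
import csv
import io

GENERAL_INFO_CSV = """\
fic,num_initial_comps,num_active_comps
-1.294308e+02,8,3
"""


def read_general_info(text):
    """Parse the single data row into typed values."""
    row = next(csv.DictReader(io.StringIO(text)))
    return {
        "fic": float(row["fic"]),
        "num_initial_comps": int(row["num_initial_comps"]),
        "num_active_comps": int(row["num_active_comps"]),
    }


info = read_general_info(GENERAL_INFO_CSV)
# The initial number of components is 2**d, so d is recovered exactly
# from the bit length of the integer.
initial_depth = info["num_initial_comps"].bit_length() - 1
# Components that were present initially but are no longer active.
num_removed = info["num_initial_comps"] - info["num_active_comps"]
print(initial_depth, num_removed)
```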
Gate Tree¶
This file describes the structure and parameters of the gate-tree of the model.
{
"gate_tree": {
"gate_type": "logit",
"hard_gate": true,
"nodes": [
{
"node_id": 11,
"node_type": "gate",
"gate_func": {
"weights": [
{
"aid": "std1[4]",
"attr_name": "std1_RM",
"weight": -3.6682658685673992e+00
},
{
"aid": "std1[7]",
"attr_name": "std1_TAX",
"weight": -5.8122016705226542e+00
},
{
"aid": "std1[10]",
"attr_name": "std1_LSTAT",
"weight": 1.0537643144910271e+01
}
],
"bias": 1.2740926133353371e+01
}
},
{
"node_id": 10,
"node_type": "gate",
"gate_func": {
"weights": [
{
"aid": "std1[2]",
"attr_name": "std1_INDUS",
"weight": 7.6493213521271874e-01
},
{
"aid": "std1[4]",
"attr_name": "std1_RM",
"weight": 2.5021103534594329e+00
},
{
"aid": "std1[10]",
"attr_name": "std1_LSTAT",
"weight": -2.8529074313420657e+00
}
],
"bias": -2.3621841358789547e-01
}
},
...
{
"node_id": 13,
"node_type": "component",
"comp_id": 13
},
{
"node_id": 12,
"node_type": "component",
"comp_id": 12
},
...
],
"edges": [
{
"source": 11,
"target": 13,
"is_left": false
},
{
"source": 11,
"target": 12,
"is_left": true
},
...
]
}
}
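As a minimal sketch of how this file could be consumed, the following Python code indexes the nodes and edges of a trimmed copy of the example above and walks from the root gate to a leaf component. The helper names (`index_tree`, `gate_logit`, `assign_component`) are hypothetical and not part of SAMPO. The branch convention is an assumption as well: this specification does not state which logit sign corresponds to the `is_left` edge, so the sketch takes the left edge when the logit is non-negative.

```python
import json

# Trimmed copy of the gate-tree example: one gate node and its two leaf components.
GATE_TREE_JSON = """
{
  "gate_tree": {
    "gate_type": "logit",
    "hard_gate": true,
    "nodes": [
      {"node_id": 11, "node_type": "gate",
       "gate_func": {
         "weights": [
           {"aid": "std1[4]", "attr_name": "std1_RM", "weight": -3.6682658685673992e+00},
           {"aid": "std1[7]", "attr_name": "std1_TAX", "weight": -5.8122016705226542e+00},
           {"aid": "std1[10]", "attr_name": "std1_LSTAT", "weight": 1.0537643144910271e+01}
         ],
         "bias": 1.2740926133353371e+01}},
      {"node_id": 13, "node_type": "component", "comp_id": 13},
      {"node_id": 12, "node_type": "component", "comp_id": 12}
    ],
    "edges": [
      {"source": 11, "target": 13, "is_left": false},
      {"source": 11, "target": 12, "is_left": true}
    ]
  }
}
"""


def index_tree(doc):
    """Return (nodes by id, children by gate id, root id) for a gate-tree document."""
    tree = doc["gate_tree"]
    nodes = {node["node_id"]: node for node in tree["nodes"]}
    children = {}
    for edge in tree["edges"]:
        side = "left" if edge["is_left"] else "right"
        children.setdefault(edge["source"], {})[side] = edge["target"]
    # The root is the only edge source that never appears as an edge target.
    targets = {edge["target"] for edge in tree["edges"]}
    roots = [node_id for node_id in children if node_id not in targets]
    if len(roots) != 1:
        raise ValueError("expected exactly one root gate, found %d" % len(roots))
    return nodes, children, roots[0]


def gate_logit(node, sample):
    """Linear score of a gate: bias plus the weighted sum of the listed attributes."""
    func = node["gate_func"]
    return func["bias"] + sum(
        w["weight"] * sample[w["attr_name"]] for w in func["weights"]
    )


def assign_component(doc, sample):
    """Follow hard gating decisions from the root down to a component node."""
    nodes, children, node_id = index_tree(doc)
    while nodes[node_id]["node_type"] == "gate":
        # ASSUMPTION: logit >= 0 (sigmoid >= 0.5) selects the is_left edge.
        side = "left" if gate_logit(nodes[node_id], sample) >= 0.0 else "right"
        node_id = children[node_id][side]
    return nodes[node_id]["comp_id"]


doc = json.loads(GATE_TREE_JSON)
sample = {"std1_RM": 0.0, "std1_TAX": 0.0, "std1_LSTAT": 0.0}
print(assign_component(doc, sample))
```

Comparing the logit with zero is equivalent to comparing the logistic output with 0.5 and avoids overflow for large scores. If the actual convention is the opposite, only the comparison in `assign_component` needs to change.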
Prediction Formulas¶
This file describes parameters of prediction formulas: weights and bias values.
aid,attr_name,basis_function_index,prediction_formula_0
dl1[0],sepal_length_in_cm,0,2.3589409814056678e-01
dl1[0],sepal_length_in_cm,1,3.2958604508257561e-01
dl1[0],sepal_length_in_cm,2,6.3181400239767593e-02
...
,bias,,5.4161895819219632e+00
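The layout above can be read with the Python standard library. The sketch below is illustrative only: `load_formula` is a hypothetical helper, and the embedded text contains just the rows visible in the example (the elided rows are left out). It groups the weights by attribute name and basis function index, and treats the row with an empty `aid` and empty `basis_function_index` as the bias term.

```python
import csv
import io

# Only the rows shown in the example above; the elided rows are not reproduced.
SAMPLE = """aid,attr_name,basis_function_index,prediction_formula_0
dl1[0],sepal_length_in_cm,0,2.3589409814056678e-01
dl1[0],sepal_length_in_cm,1,3.2958604508257561e-01
dl1[0],sepal_length_in_cm,2,6.3181400239767593e-02
,bias,,5.4161895819219632e+00
"""


def load_formula(text, column="prediction_formula_0"):
    """Return ({attr_name: {basis_index: weight}}, bias) for one formula column."""
    weights = {}
    bias = None
    for row in csv.DictReader(io.StringIO(text)):
        value = float(row[column])
        if row["attr_name"] == "bias" and row["basis_function_index"] == "":
            # The bias row has no aid and no basis function index.
            bias = value
        else:
            index = int(row["basis_function_index"])
            weights.setdefault(row["attr_name"], {})[index] = value
    return weights, bias


weights, bias = load_formula(SAMPLE)
print(sorted(weights["sepal_length_in_cm"]), bias)
```

Keeping the weights keyed by basis function index makes it straightforward to pair each weight with the corresponding B-spline basis function once the knot vectors are loaded.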
B-spline Parameters¶
This file describes the parameters of the B-spline prediction formulas: the degree and the number of basis functions for each feature.
degree,basis_dim
3,10
B-spline Knot Vectors¶
This file describes the knot vectors of the B-spline prediction formulas, one row per feature.
aid,attr_name,knot_value_0,knot_value_1,knot_value_2,knot_value_3,knot_value_4,knot_value_5,knot_value_6,knot_value_7,knot_value_8,knot_value_9,knot_value_10,knot_value_11,knot_value_12
std1[0],std1_CRIM,-1.2341486543554931e+00,-1.2341486543554931e+00,-8.2810505758699771e-01,-4.2206146081850215e-01,-1.6017864050006603e-02,3.9002573271848906e-01,7.9606932948698450e-01,1.2021129262554799e+00,1.6081565230239758e+00,2.0142001197924713e+00,2.4202437165609667e+00,2.8262873133294621e+00,2.8262873133294621e+00
std1[1],std1_ZN,-1.3831265020520607e+00,-1.3831265020520607e+00,-1.0478983552351340e+00,-7.1267020841820727e-01,-3.7744206160128058e-01,-4.2213914784353879e-02,2.9301423203257282e-01,6.2824237884949952e-01,9.6347052566642621e-01,1.2986986724833531e+00,1.6339268193002798e+00,1.9691549661172065e+00,1.9691549661172065e+00
std1[2],std1_INDUS,-1.9600846596365655e+00,-1.9600846596365655e+00,-1.6593508630904183e+00,-1.3586170665442712e+00,-1.0578832699981242e+00,-7.5714947345197714e-01,-4.5641567690583007e-01,-1.5568188035968311e-01,1.4505191618646407e-01,4.4578571273261125e-01,7.4651950927875799e-01,1.0472533058249049e+00,1.0472533058249049e+00
...
Prediction Result Evaluation¶
The indices used in evaluating prediction results of this component are described below.
Evaluation Index | Type | Description |
---|---|---|
true_positive | int | Number of samples determined as positive correctly (TP). |
false_positive | int | Number of samples determined as positive incorrectly (FP). |
true_negative | int | Number of samples determined as negative correctly (TN). |
false_negative | int | Number of samples determined as negative incorrectly (FN). |
accuracy | float | Proportion of true results in the population as shown below: \(\frac{\mbox{TP} + \mbox{TN}}{\mbox{TP} + \mbox{FP} + \mbox{TN} + \mbox{FN}}\) |
classification_error | float | Proportion of false results in the population as shown below: \(\frac{\mbox{FP} + \mbox{FN}}{\mbox{TP} + \mbox{FP} + \mbox{TN} + \mbox{FN}} = 1 - \mbox{accuracy}\) |
precision | float | Proportion of the true_positive against all samples determined as positive as shown below: \(\frac{\mbox{TP}}{\mbox{TP} + \mbox{FP}}\) |
recall | float | Proportion of the true_positive against all the actual positive samples as shown below: \(\frac{\mbox{TP}}{\mbox{TP} + \mbox{FN}}\) |
specificity | float | Proportion of the true_negative against all the actual negative samples as shown below: \(\frac{\mbox{TN}}{\mbox{TN} + \mbox{FP}}\) |
false_positive_rate | float | Proportion of the false_positive against all the actual negative samples as shown below: \(\frac{\mbox{FP}}{\mbox{TN} + \mbox{FP}} = 1 - \mbox{specificity}\) |
false_negative_rate | float | Proportion of the false_negative against all the actual positive samples as shown below: \(\frac{\mbox{FN}}{\mbox{TP} + \mbox{FN}} = 1 - \mbox{recall}\) |
f_measure | float | Harmonic mean of precision and recall as shown below: \(\frac{2 \times \mbox{precision} \times \mbox{recall}}{\mbox{precision} + \mbox{recall}}\) |
auc | float | Area under ROC (Receiver Operating Characteristic) curve. |
area_under_precision_recall | float | Area under PR (Precision-Recall) curve. |
When these evaluation results are obtained through the SAMPO API, they are loaded as a pandas.DataFrame whose columns are the evaluation indices.
See also
Obtaining process results via ProcessResultLoader
External Format¶
When convert_process is executed, the evaluation results are saved as a CSV file with the evaluation indices as the header of the CSV.
This file describes the evaluation of the prediction result of the component.
true_positive,false_positive,true_negative,false_negative,accuracy,classification_error,precision,recall,specificity,false_positive_rate,false_negative_rate,f_measure,auc,area_under_precision_recall
6,3,14,7,6.666667e-01,3.333333e-01,6.666667e-01,4.615385e-01,8.235294e-01,1.764706e-01,5.384615e-01,5.454545e-01,6.696833e-01,5.715832e-01
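The count-based indices in the sample row above follow directly from the four confusion-matrix counts. The sketch below recomputes them in plain Python from the formulas in the evaluation index table; `evaluation_indices` is a hypothetical helper, not a SAMPO function. The `auc` and `area_under_precision_recall` indices are omitted because they depend on the per-sample scores rather than on the counts.

```python
def evaluation_indices(tp, fp, tn, fn):
    """Recompute the count-based evaluation indices from TP, FP, TN and FN.

    No zero-division guard is included: the sketch assumes at least one
    predicted positive, one actual positive and one actual negative sample.
    """
    total = tp + fp + tn + fn
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return {
        "accuracy": (tp + tn) / total,
        "classification_error": (fp + fn) / total,
        "precision": precision,
        "recall": recall,
        "specificity": specificity,
        "false_positive_rate": fp / (tn + fp),
        "false_negative_rate": fn / (tp + fn),
        "f_measure": 2 * precision * recall / (precision + recall),
    }


# Counts taken from the sample CSV row: TP=6, FP=3, TN=14, FN=7.
indices = evaluation_indices(tp=6, fp=3, tn=14, fn=7)
for name, value in indices.items():
    print("%s=%.6e" % (name, value))
```

With the counts from the sample row, the recomputed values agree with the CSV to the printed precision, for example accuracy 20/30 and f_measure 24/44.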