FABHMEBernGateLinearMultiCl Component Specification¶
Overview¶
The FABHMEBernGateLinearMultiCl component is a linear multiclass classification component that uses the FAB/HME algorithm. It learns a tree-structured model in which each sample is assigned to a component according to Bernoulli gating functions.
Note
The FAB engine uses the word ‘component’ with a different meaning from that of SAMPO. Each component in FAB/HME is a prediction formula, and each sample is assigned to a specific component for prediction.
Example:
SPD:
# fabhmemcl.spd
dl1 -> fab1
---
components:
    dl1:
        component: DataLoader
    fab1:
        component: FABHMEBernGateLinearMultiClComponent
        features: name != 'class'
        tree_depth: 3
        target: name == 'class'
global_settings:
    keep_attributes:
        - class
    feature_exclude:
        - class
Input of the component:
_sid | sepal_length_in_cm | sepal_width_in_cm | class |
---|---|---|---|
0 | 5.1 | 3.5 | Iris-setosa |
1 | 4.9 | 3.0 | Iris-setosa |
2 | 4.7 | 3.2 | Iris-versicolor |
… | … | … | … |
28 | 6.7 | 2.5 | Iris-virginica |
29 | 7.2 | 3.6 | Iris-virginica |
Output of the component:
_sid | fab1_actual | fab1_predict |
---|---|---|
0 | Iris-setosa | Iris-setosa |
1 | Iris-setosa | Iris-setosa |
2 | Iris-versicolor | Iris-versicolor |
… | … | … |
28 | Iris-virginica | Iris-virginica |
29 | Iris-virginica | Iris-virginica |
_sid | fab1_score_Iris-setosa | fab1_score_Iris-versicolor | fab1_score_Iris-virginica | fab1_assigned_comp_id |
---|---|---|---|---|
0 | 3.066959e+00 | -9.652613e-01 | -1.528725e+01 | 0 |
1 | 2.729733e+00 | -5.539330e-01 | -1.273995e+01 | 0 |
2 | 2.838539e+00 | 6.186438e+00 | 4.268897e+00 | 0 |
… | … | … | … | … |
28 | 2.357728e+00 | 9.662180e+00 | 1.384438e+01 | 0 |
29 | 3.097255e+00 | 8.260996e+00 | 9.879197e+00 | 0 |
_sid | fab1_predict_c0 | fab1_score_c0_Iris-setosa | fab1_score_c0_Iris-versicolor | fab1_score_c0_Iris-virginica |
---|---|---|---|---|
0 | Iris-setosa | 3.066959e+00 | -9.652613e-01 | -1.528725e+01 |
1 | Iris-setosa | 2.729733e+00 | -5.539330e-01 | -1.273995e+01 |
2 | Iris-versicolor | 2.838539e+00 | 6.186438e+00 | 4.268897e+00 |
… | … | … | … | … |
28 | Iris-virginica | 2.357728e+00 | 9.662180e+00 | 1.384438e+01 |
29 | Iris-virginica | 3.097255e+00 | 8.260996e+00 | 9.879197e+00 |
This component has component-specific external formats for model and prediction result evaluation.
See also
Component-common external format files in convert_process
Parameters¶
This component has the following component-specific parameters.
SPD¶
The following parameters are for “components” section of SPD.
Parameter Name | Type | Domain | Default Value | Description |
---|---|---|---|---|
max_fab_iterations | int | [1, inf) | 100 | Maximum number of FAB-iterations. |
| | bool | True / False | False | If True, the first iteration starts with the M-step; otherwise, with the E-step. |
num_acceleration_steps | int | [0, inf) | 0 | Number of steps of the acceleration algorithm in each FAB-iteration. If 0, the acceleration algorithm is disabled. |
repeat_until_convergence | bool | True / False | False | If False, FAB-iterations and the post-processing are executed only once even if the FAB-iterations are stopped not by convergence condition but by |
projection_estep | bool | True / False | False | Whether the projection E-step algorithm is enabled. |
shrink_threshold | float or str | [1, inf) or (0%, 100%) | 1.0 | Threshold value for shrinkage. If a percentage value (e.g. |
fab_stop_threshold | float or str | (0, inf) or (0%, inf%) | 0.001 | Threshold value for FAB-iterations: if the increase of the FIC value is less than the threshold, the FAB-iterations are considered to have converged. If a percentage value (e.g. |
gate_features | str | Query format | all() | Features used for gate parameter optimization. If not specified, all features are used. |
comp_features | str | Query format | all() | Features used for component parameter optimization. If not specified, all features are used. If empty, the model is learned as a decision tree. |
comp_mandatory_features | str | Query format | See Description | Features to which L0-regularization is not applied, so the specified features are always relevant in all components. If not specified, no feature is exempt from L0-regularization, which means that all relevant features are selected by the FoBa algorithm. |
| | int | [0, inf) | 5 | Initial depth of the gate-tree structure of the latent variable prior. The initial number of components is \(2^d\), where \(d\) is the tree depth. If 0, the optimization is executed with only one component. |
| | float | (-inf, inf) | -0.5 | Scale value for the initialization of weight values of components. |
| | float | (-inf, inf) | 0.5 | Scale value for the initialization of weight values of components. |
| | float | (-inf, inf) | 0.25 | Scale value for the initialization of bias values of components. |
| | float | (-inf, inf) | 0.75 | Scale value for the initialization of bias values of components. |
gate_max_bins | int | [1, inf) | See Description | Maximum number of bins for each feature, used for gate parameter optimization. If not specified, all unique sample values of each feature are used; otherwise, the equal-width binning algorithm is adopted. |
comp_foba_skip | str | {‘power_of_two’, ‘quarter_square’, ‘none’} | ‘power_of_two’ | Type of the judging function for skipping the FoBa algorithm. If ‘none’, FoBa is executed at every FAB-iteration step. FoBa is skipped when \({\rm log}_{2}t \ne {\rm ceil}({\rm log}_{2}t)\) if ‘power_of_two’, or when \(t \bmod {\rm ceil}(\sqrt{t}) \ne 0\) if ‘quarter_square’, where \(t\) is the FAB-iteration step index (\(t\) starts from 1). |
comp_foba_skip_max_interval | int | [2, inf) | 25 | The maximum interval for the FoBa algorithm skipping. If comp_foba_skip is ‘none’, this value is ignored. |
comp_backward_step | bool | True / False | False | Whether the backward-steps of the FoBa algorithm are enabled. In the post-process, backward-steps are carried out regardless of this argument value. |
comp_l2_regularize | float | [0, inf) | 0.0 | L2-regularization hyper-parameter for component parameter optimization. The larger the specified value, the stronger the regularization effect. If 0.0, L2-regularization is disabled. |
with_comp_scaled_l0_regularize | bool | True / False | True | Whether to use scaled L0-regularization, which uses a tighter lower bound of FIC, for component parameter optimization; the approximation of det(F) is refined, where F is a Fisher matrix. |
max_comp_relevant_features | int | [1, inf) | 100 | Maximum number of the relevant features for each component. |
max_comp_foba_iterations | int | [1, inf) | 100 | Maximum number of the FoBa-iterations for each component. |
num_threads_gates | int | [1, inf) | 1 | Maximum number of OpenMP threads for gate parameter optimization, among which the tasks for all gates are divided. |
num_threads_gate_features | int | [1, inf) | 1 | Maximum number of OpenMP threads for gate parameter optimization, among which the tasks for all features are divided. |
num_threads_comps | int | [1, inf) | 1 | Maximum number of OpenMP threads for component parameter optimization. |
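The two skipping conditions of comp_foba_skip can be sketched in Python as follows. This is only an illustration of the conditions stated in the table: the function name foba_is_skipped is hypothetical and not part of SAMPO, and the effect of comp_foba_skip_max_interval is deliberately not modeled because its exact behavior is not described here.

```python
import math


def foba_is_skipped(t: int, skip_type: str = "power_of_two") -> bool:
    """Return True if FoBa would be skipped at FAB-iteration step t (t >= 1).

    Hypothetical helper mirroring only the conditions in the parameter table;
    comp_foba_skip_max_interval is not taken into account.
    """
    if t < 1:
        raise ValueError("the FAB-iteration step index starts from 1")
    if skip_type == "none":
        return False
    if skip_type == "power_of_two":
        # log2(t) == ceil(log2(t)) holds exactly when t is a power of two.
        return (t & (t - 1)) != 0
    if skip_type == "quarter_square":
        ceil_sqrt = math.isqrt(t - 1) + 1  # integer ceil(sqrt(t)) for t >= 1
        return t % ceil_sqrt != 0
    raise ValueError(f"unknown skip type: {skip_type}")


# Steps among the first 16 at which FoBa is executed under each rule.
executed_p2 = [t for t in range(1, 17) if not foba_is_skipped(t, "power_of_two")]
executed_qs = [t for t in range(1, 17) if not foba_is_skipped(t, "quarter_square")]
print(executed_p2)
print(executed_qs)
```

Under both rules the gaps between FoBa executions grow with the step index, which is presumably why a maximum interval can be configured separately.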
SRC¶
The following parameter is for “hotstart” section of SRC.
Parameter Name | Type | Domain | Default Value | Description |
---|---|---|---|---|
type | str | {‘posterior’, ‘mh_refit_comp’, ‘mh_opt_comp’, ‘mh_refit_gate_and_refit_comp’, ‘mh_refit_gate_and_opt_comp’, ‘mh_opt_gate_and_opt_comp’} | | The hot-start type. If ‘posterior’, FAB learns with posterior hot-start, which uses an initial model whose tree structure is generated from the base model and the data; each gate and component parameter is initialized randomly. ‘mh_XXX’ means that FAB learns with model hot-start, which uses the base model as the initial model. ‘refit_{gate, comp}’ means refitting the gate functions or prediction formulas with the current data. ‘opt_{gate, comp}’ means optimizing (feature selection and fitting) the gate functions or prediction formulas with the current data. |
Utilizable Sample Metadata¶
Warning
_fabhme_assigned_comp_id is deprecated. Use hotstart section of SRC instead of _fabhme_assigned_comp_id data column.
This component can utilize the _fabhme_assigned_comp_id attribute of the sample metadata to hot-start with a posterior. When the _fabhme_assigned_comp_id attribute is specified in the input data, this component starts the FAB/HME algorithm with the _fabhme_assigned_comp_id attribute values as its initial posterior.
To create the attribute _fabhme_assigned_comp_id, see the specification of the command sampo_ps_fabhme export_assigned_comp_id.
Output Attributes¶
This component generates the following attributes.
Attribute Name | Scale | Description |
---|---|---|
<component_id>_actual | NOMINAL | Values of the target attribute. |
<component_id>_predict | NOMINAL | Predicted values. |
<component_id>_score_<target_class> | REAL | Score values per target class. The prediction result is the target class with the maximum score among all target classes. |
<component_id>_assigned_comp_id | INTEGER | Component IDs assigned by gating functions. |
<component_id>_predict_c<hme_comp_id> | NOMINAL | Predicted values for the prediction formula of component ID, <hme_comp_id>. |
<component_id>_score_c<hme_comp_id>_<target_class> | REAL | Score values per target class for the prediction formula of component ID, <hme_comp_id>. |
These attributes are in the component output data. These can be loaded in SAMPO API.
See also
Obtaining process results via ProcessResultLoader.
When convert_process is executed, the component output data will be saved in <component_id>_predict_result.csv.
This file describes the prediction result of the component.
_sid,fab1_actual,fab1_predict,fab1_score_Iris-setosa,fab1_score_Iris-versicolor,fab1_score_Iris-virginica,fab1_assigned_comp_id,fab1_predict_c2,fab1_score_c2_Iris-setosa,fab1_score_c2_Iris-versicolor,fab1_score_c2_Iris-virginica
0,Iris-setosa,Iris-setosa,1.000000e+00,-1.932341e+00,-1.101594e+01,2,Iris-setosa,1.000000e+00,-1.932341e+00,-1.101594e+01
1,Iris-setosa,Iris-setosa,1.000000e+00,-1.932341e+00,-1.101594e+01,2,Iris-setosa,1.000000e+00,-1.932341e+00,-1.101594e+01
2,Iris-setosa,Iris-setosa,1.000000e+00,-2.135714e+00,-1.147150e+01,2,Iris-setosa,1.000000e+00,-2.135714e+00,-1.147150e+01
...
28,Iris-virginica,Iris-virginica,1.000000e+00,7.016043e+00,9.028858e+00,2,Iris-virginica,1.000000e+00,7.016043e+00,9.028858e+00
29,Iris-virginica,Iris-virginica,1.000000e+00,7.626160e+00,1.039555e+01,2,Iris-virginica,1.000000e+00,7.626160e+00,1.039555e+01
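As a small illustration of how the predicted class relates to the score columns, the following sketch parses two rows copied from the example above and recomputes the prediction as the class with the maximum score. It uses only the Python standard library rather than the SAMPO API; the per-component columns (fab1_predict_c2, fab1_score_c2_*) are omitted for brevity, and the helper name argmax_class is hypothetical.

```python
import csv
import io

# Two rows copied from the example file; per-component columns are omitted.
CSV_TEXT = """_sid,fab1_actual,fab1_predict,fab1_score_Iris-setosa,fab1_score_Iris-versicolor,fab1_score_Iris-virginica
0,Iris-setosa,Iris-setosa,1.000000e+00,-1.932341e+00,-1.101594e+01
28,Iris-virginica,Iris-virginica,1.000000e+00,7.016043e+00,9.028858e+00
"""

SCORE_PREFIX = "fab1_score_"


def argmax_class(row):
    """Return the target class whose score column holds the largest value."""
    scores = {
        name[len(SCORE_PREFIX):]: float(value)
        for name, value in row.items()
        if name.startswith(SCORE_PREFIX)
    }
    return max(scores, key=scores.get)


rows = list(csv.DictReader(io.StringIO(CSV_TEXT)))
recomputed = [argmax_class(row) for row in rows]
print(recomputed)
```

For these rows the recomputed classes agree with the fab1_predict column.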
Attribute Metadata¶
The metadata of the output attributes is created with the following rules.
Context Rule¶
Attribute Name |
Context Name |
Description |
---|---|---|
All the output attributes of this component |
field_path |
List of the superordinate concepts of each output attribute based on the following hierarchical structure of the output attributes:
root
├── fabhmemcl
│ ├── assigned_comp_id
│ └── component
│ ├── 0
│ │ ├── predict
│ │ └── score
│ │ ├── *<target_class_0>*
│ │ ├── *<target_class_1>*
│ │ .
│ │ .
│ │ .
│ │
│ ├── 1
│ │ ├── predict
│ │ └── score
│ │ ├── *<target_class_0>*
│ │ ├── *<target_class_1>*
│ │ .
│ │ .
│ │ .
│ .
│ .
│ .
│
└── multiclass_classification
├── actual
├── predict
└── score
├── *<target_class_0>*
├── *<target_class_1>*
.
.
.
|
<component_id>_assigned_comp_id |
active_comp_ids |
List of component IDs corresponding to each prediction formula. |
Derivation Rule¶
Attribute Name | Derived From |
---|---|
<component_id>_actual | Derived from the target attribute. |
<component_id>_predict | Derived from the attributes which have non-zero coefficients in any prediction formula. |
<component_id>_score_<target_class> | Derived from the attributes which have non-zero coefficients in any prediction formula. |
<component_id>_assigned_comp_id | Derived from the attributes used in the gating functions. |
<component_id>_predict_c<hme_comp_id> | Derived from the attributes which have non-zero coefficients in the prediction formula of component ID, <hme_comp_id>. |
<component_id>_score_c<hme_comp_id>_<target_class> | Derived from the attributes which have non-zero coefficients in the prediction formula of component ID, <hme_comp_id>. |
Example¶
{
"nodes": [
{"aid": "_sid", "name": "_sid", ... },
{"aid": "dl1[0]", "name": "sepal_length_in_cm", ... },
{"aid": "dl1[1]", "name": "sepal_width_in_cm", ... },
{"aid": "dl1[2]", "name": "petal_length_in_cm", ... },
{"aid": "dl1[3]", "name": "petal_width_in_cm", ... },
{"aid": "dl1[4]", "name": "class", ... },
{"aid": "fab1[0]", "name": "fab1_actual", "scale": "nominal", "is_excluded": false,
"cid": "fab1", "cindex": 0, "is_kept": false,
"values": ["Iris-setosa", "Iris-versicolor", "Iris-virginica"],
"context": {
"field_path": ["multiclass_classification", "actual"]
}
},
{"aid": "fab1[1]", "name": "fab1_predict", "scale": "nominal", "is_excluded": false,
"cid": "fab1", "cindex": 1, "is_kept": false,
"values": ["Iris-setosa", "Iris-versicolor", "Iris-virginica"],
"context": {
"field_path": ["multiclass_classification", "predict"]
}
},
{"aid": "fab1[2]", "name": "fab1_score_Iris-setosa", "scale": "real", "is_excluded": false,
"cid": "fab1", "cindex": 2, "values": null, "is_kept": false,
"context": {
"field_path": ["multiclass_classification", "score", "Iris-setosa"]
}
},
{"aid": "fab1[3]", "name": "fab1_score_Iris-versicolor", "scale": "real", "is_excluded": false,
"cid": "fab1", "cindex": 3, "values": null, "is_kept": false,
"context": {
"field_path": ["multiclass_classification", "score", "Iris-versicolor"]
}
},
{"aid": "fab1[4]", "name": "fab1_score_Iris-virginica", "scale": "real", "is_excluded": false,
"cid": "fab1", "cindex": 4, "values": null, "is_kept": false,
"context": {
"field_path": ["multiclass_classification", "score", "Iris-virginica"]
}
},
{"aid": "fab1[5]", "name": "fab1_assigned_comp_id", "scale": "integer",
"is_excluded": false, "cid": "fab1", "cindex": 5, "values": null, "is_kept": false,
"context": {
"active_comp_ids": [7, 13, 17, 19, 22], "field_path": ["fabhmemcl", "assigned_comp_id"]
}
},
{"aid": "fab1[6]", "name": "fab1_predict_c7", "scale": "nominal", "is_excluded": false,
"cid": "fab1", "cindex": 6, "is_kept": false,
"values": ["Iris-setosa", "Iris-versicolor", "Iris-virginica"],
"context": {
"field_path": ["fabhmemcl", "component", 7, "predict"]
}
},
{"aid": "fab1[7]", "name": "fab1_score_c7_Iris-setosa", "scale": "real", "is_excluded": false,
"cid": "fab1", "cindex": 7, "values": null, "is_kept": false,
"context": {
"field_path": ["fabhmemcl", "component", 7, "score", "Iris-setosa"]
}
},
{"aid": "fab1[8]", "name": "fab1_score_c7_Iris-versicolor", "scale": "real", "is_excluded": false,
"cid": "fab1", "cindex": 8, "values": null, "is_kept": false,
"context": {
"field_path": ["fabhmemcl", "component", 7, "score", "Iris-versicolor"]
}
},
{"aid": "fab1[9]", "name": "fab1_score_c7_Iris-virginica", "scale": "real", "is_excluded": false,
"cid": "fab1", "cindex": 9, "values": null, "is_kept": false,
"context": {
"field_path": ["fabhmemcl", "component", 7, "score", "Iris-virginica"]
}
},
...
],
"links": [
{"source": "dl1[1]", "target": "fab1[2]"},
{"source": "dl1[1]", "target": "fab1[1]"},
{"source": "dl1[1]", "target": "fab1[3]"},
{"source": "dl1[0]", "target": "fab1[5]"},
{"source": "dl1[0]", "target": "fab1[12]"},
{"source": "dl1[0]", "target": "fab1[4]"},
{"source": "dl1[0]", "target": "fab1[8]"},
{"source": "dl1[0]", "target": "fab1[2]"},
{"source": "dl1[0]", "target": "fab1[9]"},
{"source": "dl1[0]", "target": "fab1[3]"},
{"source": "dl1[0]", "target": "fab1[6]"},
{"source": "dl1[0]", "target": "fab1[13]"},
{"source": "dl1[0]", "target": "fab1[7]"},
{"source": "dl1[0]", "target": "fab1[10]"},
{"source": "dl1[0]", "target": "fab1[1]"},
{"source": "dl1[0]", "target": "fab1[11]"},
{"source": "dl1[2]", "target": "fab1[2]"},
{"source": "dl1[2]", "target": "fab1[1]"},
{"source": "dl1[2]", "target": "fab1[3]"},
{"source": "dl1[3]", "target": "fab1[2]"},
{"source": "dl1[3]", "target": "fab1[1]"},
{"source": "dl1[3]", "target": "fab1[3]"},
{"source": "dl1[4]", "target": "fab1[0]"}
]
}
See also
Attribute metadata file format in Attribute Metadata File Specification
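The links of the attribute metadata express the derivation rule as source/target pairs of attribute IDs. The following sketch groups the source attribute names by output attribute. The metadata dictionary is a trimmed copy of the example (only a few nodes and links are kept, and most node fields are dropped), and the variable names are chosen for illustration only.

```python
from collections import defaultdict

# Trimmed copy of the example metadata: only "aid" and "name" are kept per node.
metadata = {
    "nodes": [
        {"aid": "dl1[0]", "name": "sepal_length_in_cm"},
        {"aid": "dl1[4]", "name": "class"},
        {"aid": "fab1[0]", "name": "fab1_actual"},
        {"aid": "fab1[5]", "name": "fab1_assigned_comp_id"},
    ],
    "links": [
        {"source": "dl1[0]", "target": "fab1[5]"},
        {"source": "dl1[4]", "target": "fab1[0]"},
    ],
}

# Map attribute IDs to names, then collect the sources of every target.
name_of = {node["aid"]: node["name"] for node in metadata["nodes"]}
derived_from = defaultdict(list)
for link in metadata["links"]:
    derived_from[name_of[link["target"]]].append(name_of[link["source"]])

for target, sources in derived_from.items():
    print(target, "<-", sources)
```

In this trimmed example, fab1_actual is derived from the target attribute class, which matches the derivation rule table.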
Model¶
The model of this component can be described by the following parameters.
Model Parameter | Type | Domain | Description |
---|---|---|---|
fic | float | (-inf, inf) | Factorized Information Criterion. The asymptotic approximation value used by FAB/HME. |
num_initial_comps | int | [0, inf) | The initial number of components before iterations. |
num_active_comps | int | [0, inf) | The terminal number of active components after iterations. |
gate_tree | dict | See Description | Dictionary form of the gating tree structure. |
prediction_formulas | pandas.DataFrame | See Description | Component weights and bias for each prediction formula. |
The gate_tree dictionary keys are described below:
Gate Tree Dictionary Key | Type | Domain | Description |
---|---|---|---|
gate_type | str | ‘bern’ | The type of gate. |
hard_gate | bool | true / false | Whether the gate is hard_gate or not. |
nodes | list of dict | See Description | List of node dictionaries. |
edges | list of dict | See Description | List of edge dictionaries. |
The keys of each node dictionary in nodes are described below:
Node Dictionary Key | Type | Domain | Description |
---|---|---|---|
node_id | int | [0, inf) | The node ID. |
node_type | str | {‘gate’, ‘component’} | The node type. |
gate_func | dict | See Description | The |
comp_id | int | [0, inf) | The component ID. Specifiable if |
The keys of each edge dictionary in edges are described below:
Edge Dictionary Key | Type | Domain | Description |
---|---|---|---|
source | int | [0, inf) | The |
target | int | [0, inf) | The |
is_left | bool | true / false | Whether the target node is the left-child of the source. |
The keys of the gate_func dictionary are described below:
Gate Function Dictionary Key | Type | Domain | Description |
---|---|---|---|
attr_name | str | See Description | The attribute name. |
aid | str | See Description | The attribute ID. |
threshold | float | (-inf, inf) | Threshold value of the Bernoulli-gating function. |
prob_left_smaller_than_threshold | float | [0.0, 1.0] | Probability that the value of left-child node is smaller than the |
When the model is loaded in the SAMPO API, the model parameters will be output as a single dictionary.
See also
Obtaining process results via ProcessResultLoader
{'fic': -64.281909689671252,
'num_initial_comps': 32,
'num_active_comps': 2,
'gate_tree':
{'gate_type': 'bern',
'hard_gate': True,
'nodes': [
{'node_type': 'gate',
'node_id': 0,
'gate_func':
{'threshold': 3.3499999999999996,
'aid': 'dl[1]',
'attr_name': 'petal_length_in_cm',
'prob_left_smaller_than_threshold': 0.0}},
{'comp_id': 17, 'node_type': 'component', 'node_id': 1},
{'comp_id': 19, 'node_type': 'component', 'node_id': 2}],
'edges': [
{'source': 0, 'target': 1, 'is_left': True},
{'source': 0, 'target': 2, 'is_left': False}]},
'prediction_formulas':
prediction_formula_17_Iris-setosa prediction_formula_17_Iris-versicolor prediction_formula_17_Iris-virginica prediction_formula_19_Iris-setosa prediction_formula_19_Iris-versicolor prediction_formula_19_Iris-virginica
attr_name
sepal_length_in_cm 0.120060 0.000000 2.648575 -2.698730e-12 2.543482 3.747440
petal_length_in_cm 0.065109 0.000000 -2.231421 0.000000e+00 0.000000 0.000000
bias 0.390583 -inf -7.419183 3.617190e-01 -12.697070 -20.274437}
The columns of ‘prediction_formulas’ are named according to the rule prediction_formula_<component_id>_<nominal_index>.
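The gate_tree dictionary can be walked with plain Python to find which branch of which gate leads to each component. The sketch below is purely structural: it follows the nodes and edges only and does not interpret threshold or prob_left_smaller_than_threshold, so it makes no assumption about how samples are routed. The model dictionary is a trimmed copy of the example above, and component_paths is a hypothetical helper name.

```python
# Trimmed copy of the example model dictionary (only the keys used here).
model = {
    "num_active_comps": 2,
    "gate_tree": {
        "gate_type": "bern",
        "hard_gate": True,
        "nodes": [
            {"node_type": "gate", "node_id": 0,
             "gate_func": {"threshold": 3.3499999999999996, "aid": "dl[1]",
                           "attr_name": "petal_length_in_cm",
                           "prob_left_smaller_than_threshold": 0.0}},
            {"comp_id": 17, "node_type": "component", "node_id": 1},
            {"comp_id": 19, "node_type": "component", "node_id": 2},
        ],
        "edges": [
            {"source": 0, "target": 1, "is_left": True},
            {"source": 0, "target": 2, "is_left": False},
        ],
    },
}


def component_paths(gate_tree):
    """Map each comp_id to the list of (gate attribute, side) pairs leading to it."""
    nodes = {node["node_id"]: node for node in gate_tree["nodes"]}
    children = {}
    targets = set()
    for edge in gate_tree["edges"]:
        children.setdefault(edge["source"], []).append((edge["target"], edge["is_left"]))
        targets.add(edge["target"])
    # A root is a node that never appears as an edge target.
    stack = [(node_id, []) for node_id in nodes if node_id not in targets]
    paths = {}
    while stack:
        node_id, path = stack.pop()
        node = nodes[node_id]
        if node["node_type"] == "component":
            paths[node["comp_id"]] = path
            continue
        attr_name = node["gate_func"]["attr_name"]
        for child_id, is_left in children.get(node_id, []):
            side = "left" if is_left else "right"
            stack.append((child_id, path + [(attr_name, side)]))
    return paths


paths = component_paths(model["gate_tree"])
print(paths)
```

The number of components found this way can be compared with num_active_comps as a quick consistency check.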
External Format¶
When convert_process is executed, the model parameters are saved into different files and are grouped as: general information, gating function, and prediction formula.
General Information¶
This file describes the \(FIC\) value after learning the model, the initial number of components, and the terminal number of active components.
fic,num_initial_comps,num_active_comps
-1.294308e+02,8,3
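According to the parameter description, the initial number of components is \(2^d\) for an initial tree depth \(d\). A minimal sketch of that relation follows; the function name is hypothetical, and the argument name tree_depth is taken from the SPD example.

```python
def initial_num_comps(tree_depth: int) -> int:
    """Initial number of components for a gate tree of the given initial depth."""
    if tree_depth < 0:
        raise ValueError("tree depth must be non-negative")
    return 2 ** tree_depth


for depth in (0, 3, 5):
    print(depth, initial_num_comps(depth))
```

A depth of 3 gives 8 components and the default depth of 5 gives 32, which are the num_initial_comps values appearing in the examples of this document; a depth of 0 gives a single component.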
Gate Tree¶
This file describes the structure and parameters of the gate-tree of the model.
{
"gate_tree": {
"gate_type": "bern",
"hard_gate": true,
"nodes": [
{
"node_id": 1,
"node_type": "gate",
"gate_func": {
"aid": "dl1[1]",
"attr_name": "sepal_width_in_cm",
"threshold": 2.5499999999999998e+00,
"prob_left_smaller_than_threshold": 1.0000000000000000e+00
}
},
{
"node_id": 0,
"node_type": "gate",
"gate_func": {
"aid": "dl1[1]",
"attr_name": "sepal_width_in_cm",
"threshold": 3.7500000000000000e+00,
"prob_left_smaller_than_threshold": 1.0000000000000000e+00
}
},
...
{
"node_id": 2,
"node_type": "component",
"comp_id": 2
},
{
"node_id": 5,
"node_type": "component",
"comp_id": 12
},
...
],
"edges": [
{
"source": 1,
"target": 3,
"is_left": false
},
{
"source": 1,
"target": 2,
"is_left": true
},
...
]
}
}
Prediction Formulas¶
This file describes parameters of prediction formulas: weights and bias values.
aid,attr_name,prediction_formula_2_Iris-setosa,prediction_formula_2_Iris-versicolor,prediction_formula_2_Iris-virginica
dl1[0],sepal_length_in_cm,0.0000000000000000e+00,0.0000000000000000e+00,0.0000000000000000e+00
dl1[1],sepal_width_in_cm,0.0000000000000000e+00,0.0000000000000000e+00,0.0000000000000000e+00
dl1[2],petal_length_in_cm,6.7879035725582071e-13,2.0337236935969112e+00,4.5556358511732196e+00
dl1[3],petal_width_in_cm,0.0000000000000000e+00,0.0000000000000000e+00,0.0000000000000000e+00
,bias,9.9999999999719602e-01,-4.7795545358542579e+00,-1.7393829518577114e+01
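The following sketch evaluates the prediction formulas of component 2 for one sample, assuming that each score is the weighted sum of the attribute values plus the bias. That linear form is an assumption, although it reproduces the scores shown in the prediction result example of this document. The weights and biases are copied from the file above (zero weights are omitted), while the input value petal_length_in_cm = 1.4 is an assumed sample value that does not appear in this document.

```python
# Non-zero weights and biases copied from the prediction formula file (component 2).
WEIGHTS = {
    "Iris-setosa": {"petal_length_in_cm": 6.7879035725582071e-13},
    "Iris-versicolor": {"petal_length_in_cm": 2.0337236935969112e+00},
    "Iris-virginica": {"petal_length_in_cm": 4.5556358511732196e+00},
}
BIAS = {
    "Iris-setosa": 9.9999999999719602e-01,
    "Iris-versicolor": -4.7795545358542579e+00,
    "Iris-virginica": -1.7393829518577114e+01,
}


def scores(sample):
    """Score per target class, assumed to be sum(weight * value) + bias."""
    return {
        target_class: BIAS[target_class]
        + sum(weight * sample[attr] for attr, weight in WEIGHTS[target_class].items())
        for target_class in BIAS
    }


sample = {"petal_length_in_cm": 1.4}  # assumed input value, not from this document
result = scores(sample)
predicted = max(result, key=result.get)  # class with the maximum score
print(result, predicted)
```

With this assumed input, the scores are close to 1.0, -1.932341 and -11.01594, and the class with the maximum score is Iris-setosa.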
Prediction Result Evaluation¶
The following is a classwise evaluation index list for each target class, \(i\). Weighted averages of evaluation indices are subsequently computed wherein the weight of a target class is the proportion of the occurrences of the class in the actual population.
Evaluation Index | Type | Description |
---|---|---|
true_positive_<i> (**) | int | Number of samples correctly determined as positive for each target class, \(i\) (TP). |
false_positive_<i> (**) | int | Number of samples incorrectly determined as positive for each target class, \(i\) (FP). |
true_negative_<i> (**) | int | Number of samples correctly determined as negative for each target class, \(i\) (TN). |
false_negative_<i> (**) | int | Number of samples incorrectly determined as negative for each target class, \(i\) (FN). |
accuracy_<i> | float | Proportion of true results for each target class, \(i\), in the population as shown below: \(\frac{\mbox{TP}_{i} + \mbox{TN}_{i}}{\mbox{TP}_{i} + \mbox{FP}_{i} + \mbox{TN}_{i} + \mbox{FN}_{i}}\) |
classification_error_<i> | float | Proportion of false results for each target class, \(i\), in the population as shown below: \(\frac{\mbox{FP}_{i} + \mbox{FN}_{i}}{\mbox{TP}_{i} + \mbox{FP}_{i} + \mbox{TN}_{i} + \mbox{FN}_{i}} = 1 - \mbox{accuracy}_{i}\) |
precision_<i> | float | Proportion of the true_positive of each target class, \(i\), against all samples determined as positive as shown below: \(\frac{\mbox{TP}_{i}}{\mbox{TP}_{i} + \mbox{FP}_{i}}\) |
recall_<i> | float | Proportion of the true_positive of each target class, \(i\), against all the actual positive samples as shown below: \(\frac{\mbox{TP}_{i}}{\mbox{TP}_{i} + \mbox{FN}_{i}}\) |
specificity_<i> | float | Proportion of the true_negative of each target class, \(i\), against all the actual negative samples as shown below: \(\frac{\mbox{TN}_{i}}{\mbox{TN}_{i} + \mbox{FP}_{i}}\) |
false_positive_rate_<i> | float | Proportion of the false_positive of each target class, \(i\), against all the actual negative samples as shown below: \(\frac{\mbox{FP}_{i}}{\mbox{TN}_{i} + \mbox{FP}_{i}} = 1 - \mbox{specificity}_{i}\) |
false_negative_rate_<i> | float | Proportion of the false_negative of each target class, \(i\), against all the actual positive samples as shown below: \(\frac{\mbox{FN}_{i}}{\mbox{TP}_{i} + \mbox{FN}_{i}} = 1 - \mbox{recall}_{i}\) |
f_measure_<i> | float | Harmonic mean of precision and recall of each target class, \(i\), as shown below: \(\frac{2 \times \mbox{precision}_{i} \times \mbox{recall}_{i}}{\mbox{precision}_{i} + \mbox{recall}_{i}}\) |
cf_<i>_<j> (**) | int | Confusion matrix values that show the number of samples of actual class \(i\) predicted as class \(j\). There are \(\mbox{num_target_classes}^{2}\) cf index values for every evaluation. |
(**) Weighted average is not computed for this index.
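The classwise indices and their weighted averages can be recomputed from the confusion matrix alone. The sketch below does so for a few of the indices, using the cf values from the example evaluation file in this document; the function names are hypothetical, and no guard against empty classes (division by zero) is included.

```python
CLASSES = ["Iris-setosa", "Iris-versicolor", "Iris-virginica"]
# CF[i][j]: number of samples of actual class i predicted as class j
# (cf_<i>_<j> values of the example evaluation file).
CF = [
    [5, 0, 0],
    [0, 5, 0],
    [3, 1, 1],
]


def classwise_metrics(cf, i):
    """Compute a subset of the evaluation indices for class index i."""
    total = sum(sum(row) for row in cf)
    tp = cf[i][i]
    fn = sum(cf[i]) - tp                 # actual i, predicted as another class
    fp = sum(row[i] for row in cf) - tp  # predicted i, actually another class
    tn = total - tp - fn - fp
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return {
        "accuracy": (tp + tn) / total,
        "classification_error": (fp + fn) / total,
        "precision": precision,
        "recall": recall,
        "specificity": tn / (tn + fp),
        "f_measure": 2 * precision * recall / (precision + recall),
    }


def weighted_average(cf, key):
    """Average an index over classes, weighted by the actual class proportions."""
    total = sum(sum(row) for row in cf)
    return sum(
        sum(cf[i]) / total * classwise_metrics(cf, i)[key] for i in range(len(cf))
    )


versicolor = classwise_metrics(CF, CLASSES.index("Iris-versicolor"))
print(versicolor)
print(weighted_average(CF, "accuracy"), weighted_average(CF, "f_measure"))
```

For Iris-versicolor this yields an accuracy of about 0.9333 and a classification error of about 0.0667, which sum to 1 as stated by the formula for classification_error.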
When obtaining these evaluation results in SAMPO API, a pandas.DataFrame is loaded with the evaluation indices as the columns of the DataFrame.
See also
Obtaining process results via ProcessResultLoader
External Format¶
When convert_process is executed, the evaluation results are saved as a CSV file with the evaluation indices as the header of the CSV.
This file describes the evaluation of the prediction result of the component.
accuracy_weighted_average,classification_error_weighted_average,precision_weighted_average,recall_weighted_average,specificity_weighted_average,false_positive_rate_weighted_average,false_negative_rate_weighted_average,f_measure_weighted_average,true_positive_Iris-setosa,false_positive_Iris-setosa,true_negative_Iris-setosa,false_negative_Iris-setosa,accuracy_Iris-setosa,classification_error_Iris-setosa,precision_Iris-setosa,recall_Iris-setosa,specificity_Iris-setosa,false_positive_rate_Iris-setosa,false_negative_rate_Iris-setosa,f_measure_Iris-setosa,true_positive_Iris-versicolor,false_positive_Iris-versicolor,true_negative_Iris-versicolor,false_negative_Iris-versicolor,accuracy_Iris-versicolor,classification_error_Iris-versicolor,precision_Iris-versicolor,recall_Iris-versicolor,specificity_Iris-versicolor,false_positive_rate_Iris-versicolor,false_negative_rate_Iris-versicolor,f_measure_Iris-versicolor,true_positive_Iris-virginica,false_positive_Iris-virginica,true_negative_Iris-virginica,false_negative_Iris-virginica,accuracy_Iris-virginica,classification_error_Iris-virginica,precision_Iris-virginica,recall_Iris-virginica,specificity_Iris-virginica,false_positive_rate_Iris-virginica,false_negative_rate_Iris-virginica,f_measure_Iris-virginica,cf_Iris-setosa_Iris-setosa,cf_Iris-setosa_Iris-versicolor,cf_Iris-setosa_Iris-virginica,cf_Iris-versicolor_Iris-setosa,cf_Iris-versicolor_Iris-versicolor,cf_Iris-versicolor_Iris-virginica,cf_Iris-virginica_Iris-setosa,cf_Iris-virginica_Iris-versicolor,cf_Iris-virginica_Iris-virginica
8.222222e-01,1.777778e-01,8.194444e-01,7.333333e-01,8.666667e-01,1.333333e-01,2.666667e-01,6.705517e-01,5,3,7,0,8.000000e-01,2.000000e-01,6.250000e-01,1.000000e+00,7.000000e-01,3.000000e-01,0.000000e+00,7.692308e-01,5,1,9,0,9.333333e-01,6.666667e-02,8.333333e-01,1.000000e+00,9.000000e-01,1.000000e-01,0.000000e+00,9.090909e-01,1,0,10,4,7.333333e-01,2.666667e-01,1.000000e+00,2.000000e-01,1.000000e+00,0.000000e+00,8.000000e-01,3.333333e-01,5,0,0,0,5,0,3,1,1