FABHMELogitGateLinearRg Component Specification¶
Overview¶
The FABHMELogitGateLinearRg component is a linear regression component that uses the FAB/HME algorithm. It learns a tree-structured model in which each sample is assigned to a component by logistic gating functions.
Note
The FAB engine uses the word ‘component’ with a different meaning from SAMPO. Each component in FAB/HME is a prediction formula, and each data sample is assigned to one specific component for prediction.
Example:
SPD:
# fabhmerg.spd
dl1 -> std1 -> fab1
---
components:
    dl1:
        component: DataLoader
    std1:
        component: StandardizeFDComponent
        features: scale == 'real' or scale == 'integer'
    fab1:
        component: FABHMELogitGateLinearRgComponent
        features: name != 'Concrete_compressive_strength_MPa'
        standardize_target: True
        target: name == 'Concrete_compressive_strength_MPa'
        tree_depth: 5
        shrink_threshold: 1.0%
global_settings:
    keep_attributes:
        - Concrete_compressive_strength_MPa
    feature_exclude:
        - Concrete_compressive_strength_MPa
Input of the component:
_sid | std1_Superplasticizer_kg_in_a_m3_mixture | std1_Coarse_Aggregate_kg_in_a_m3_mixture | Concrete_compressive_strength_MPa |
---|---|---|---|
0 | 0.729738484 | 0.705292074 | 9 |
1 | 1.413868312 | 1.434904563 | 9 |
2 | -1.387806224 | -1.321409287 | 4 |
… | … | … | … |
8 | -0.019546567 | 0.016213611 | 8 |
9 | -0.736254006 | -0.835000961 | 6 |
Output of the component:
_sid | fab1_actual | fab1_std_actual | fab1_predict | fab1_std_predict | fab1_assigned_comp_id |
---|---|---|---|---|---|
0 | 9 | -4.686873e-01 | 1.193986e+01 | 3.613565e-01 | 2 |
1 | 9 | -4.686873e-01 | 1.487464e+01 | 1.189968e+00 | 2 |
2 | 4 | -1.880396e+00 | 5.744445e+00 | -1.387866e+00 | 0 |
… | … | … | … | … | … |
8 | 8 | -7.510290e-01 | 9.614527e+00 | -2.951807e-01 | 0 |
9 | 6 | -1.315712e+00 | 7.587341e+00 | -8.675398e-01 | 0 |

_sid | fab1_predict_c0 | fab1_std_predict_c0 | fab1_predict_c1 | fab1_std_predict_c1 | fab1_predict_c2 | fab1_std_predict_c2 |
---|---|---|---|---|---|---|
0 | 1.173386e+01 | 3.031946e-01 | 1.285257e+01 | 6.190547e-01 | 1.193986e+01 | 3.613565e-01 |
1 | 1.366890e+01 | 8.495373e-01 | 1.503635e+01 | 1.235628e+00 | 1.487464e+01 | 1.189968e+00 |
2 | 5.744445e+00 | -1.387866e+00 | 6.786511e+00 | -1.093648e+00 | 3.787681e+00 | -1.940342e+00 |
… | … | … | … | … | … | … |
8 | 9.614527e+00 | -2.951807e-01 | 1.079011e+01 | 3.673590e-02 | 9.168116e+00 | -4.212211e-01 |
9 | 7.587341e+00 | -8.675398e-01 | 8.242365e+00 | -6.825991e-01 | 5.744203e+00 | -1.387935e+00 |
This component has component-specific external formats for model and prediction result evaluation.
See also
Component-common external format files in convert_process
Parameters¶
This component has the following component-specific parameters.
SPD¶
The following parameters are for the “components” section of SPD.
Parameter Name |
Type |
Domain |
Default Value |
Description |
---|---|---|---|---|
standardize_target |
bool |
True / False |
False |
If this parameter is True, the target attribute is standardized. |
max_fab_iterations |
int |
[1, inf) |
100 |
Maximum number of FAB-iterations. |
bool |
True / False |
False |
If True, the first iteration starts with M-step; otherwise, E-step. |
|
num_acceleration_steps |
int |
[0, inf) |
0 |
Number of steps of the acceleration algorithm in each FAB-iteration. If 0, the acceleration algorithm is disabled. |
repeat_until_convergence |
bool |
True / False |
False |
If False, FAB-iterations and the post-processing are executed only once
even if the FAB-iterations are stopped not by convergence condition but
by |
projection_estep |
bool |
True / False |
False |
Whether the projection E-step algorithm is enabled. |
shrink_threshold |
float or str |
[1, inf) or (0%, 100%) |
1.0 |
Threshold value for shrinkage. If a percentage value (e.g. |
fab_stop_threshold |
float or str |
(0, inf) or (0%, inf%) |
0.001 |
Threshold value for FAB-iterations: if the increase of the FIC value
is less than the threshold, the FAB-iterations are considered to
have converged. If a percentage value (e.g. |
gate_features |
str |
Query format |
all() |
Features which are applied to gate parameter optimizations. If not specified, all features are used. |
comp_features |
str |
Query format |
all() |
Features which are applied to component parameter optimizations. If not specified, all features are used. If empty, the model is learned as a decision tree. |
comp_mandatory_features |
str |
Query format |
See Description |
Features to which the non-L0-regularization constraint is applied, meaning that the specified features are always relevant for all components. If not specified, no features are exempt from L0-regularization, so all relevant features are selected by the FoBa algorithm. |
comp_positive_features |
str |
Query format |
See Description |
Features whose weight values for all components are constrained to positive values. If not specified, all features are optimized with no constraints. |
comp_negative_features |
str |
Query format |
See Description |
Features whose weight values for all components are constrained to negative values. If not specified, all features are optimized with no constraints. |
int |
[0, inf) |
5 |
Initial depth of the gate-tree structure of the latent variable prior. The initial number of components is \(2^d\), where \(d\) is the tree depth. If 0, the optimization is executed with only one component. |
|
float |
(-inf, inf) |
-0.5 |
Scale value for the initialization of weight values of components. |
|
float |
(-inf, inf) |
0.5 |
Scale value for the initialization of weight values of components. |
|
float |
(-inf, inf) |
0.25 |
Scale value for the initialization of bias values of components. |
|
float |
(-inf, inf) |
0.75 |
Scale value for the initialization of bias values of components. |
|
float |
(0, inf) |
0.1 |
Scale value for the initialization of variance values of components. |
|
float |
(0, inf) |
0.25 |
Scale value for the initialization of variance values of components. |
|
gate_l2_regularize |
float |
[0, inf) |
0.0 |
L2-regularization hyper-parameter for gate-parameter optimization. The larger the specified value, the stronger the regularization effect is. If 0.0, L2-regularization is disabled. |
with_gate_scaled_l0_regularize |
bool |
True / False |
True |
Whether to use scaled L0-regularization, which uses a tighter lower bound of FIC for gate parameter optimization; the approximation of det(F) is refined, where F is a Fisher matrix. |
max_gate_relevant_features |
int |
[1, inf) |
3 |
Maximum number of the relevant features for each gate. |
comp_foba_skip |
str |
{‘power_of_two’, ‘quarter_square’, ‘none’} |
‘power_of_two’ |
Type of the function that decides at which FAB-iteration steps the FoBa algorithm is skipped. If ‘none’, FoBa is executed at every FAB-iteration step. If ‘power_of_two’, FoBa is skipped when \({\rm log}_{2}t \ne {\rm ceil}({\rm log}_{2}t)\), that is, when \(t\) is not a power of two; if ‘quarter_square’, FoBa is skipped when \(t \bmod {\rm ceil}(\sqrt{t}) \ne 0\). Here \(t\) is the FAB-iteration step index (\(t\) starts from 1). |
comp_foba_skip_max_interval |
int |
[2, inf) |
25 |
The maximum interval for the FoBa algorithm skipping. If comp_foba_skip is ‘none’, this value is ignored. |
comp_two_stage_opt |
bool |
True / False |
False |
Whether the two-stage optimization is enabled.
If True, the first stage performs the parameter optimization on
user-specified mandatory features ( |
comp_backward_step |
bool |
True / False |
False |
Whether the backward-steps of the FoBa algorithm are enabled. In the post-process, backward-steps are carried out regardless of this parameter's value. |
comp_l2_regularize |
float |
[0, inf) |
0.0 |
L2-regularization hyper-parameter for component parameter optimization. The larger the specified value, the stronger the regularization effect is. If 0.0, L2-regularization is disabled. |
with_comp_scaled_l0_regularize |
bool |
True / False |
True |
Whether to use scaled L0-regularization, which uses a tighter lower bound of FIC for component parameter optimization; the approximation of det(F) is refined, where F is a Fisher matrix. |
max_comp_relevant_features |
int |
[1, inf) |
100 |
Maximum number of the relevant features for each component. |
max_comp_foba_iterations |
int |
[1, inf) |
100 |
Maximum number of the FoBa-iterations for each component. |
num_threads_gates |
int |
[1, inf) |
1 |
Maximum number of OpenMP threads for gate parameter optimization; the tasks for all gates are divided among these threads. |
num_threads_comps |
int |
[1, inf) |
1 |
Maximum number of OpenMP threads for component parameter optimization. |
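The skipping conditions given for comp_foba_skip can be read as the following Python sketch. It only mirrors the two conditions stated in the table and is not the FAB engine's implementation: the names `foba_is_skipped` and `executed_steps` are hypothetical, and the effect of comp_foba_skip_max_interval is deliberately not modeled because the table does not spell out how the cap is applied.

```python
import math


def foba_is_skipped(t: int, mode: str = "power_of_two") -> bool:
    """Return True if FoBa would be skipped at FAB-iteration step t (t starts from 1).

    Hypothetical helper: it only mirrors the conditions in the parameter
    table and ignores comp_foba_skip_max_interval.
    """
    if t < 1:
        raise ValueError("t starts from 1")
    if mode == "none":
        return False  # FoBa is executed at every step
    if mode == "power_of_two":
        # log2(t) != ceil(log2(t)) holds exactly when t is not a power of two;
        # the bit test avoids floating-point rounding in log2.
        return (t & (t - 1)) != 0
    if mode == "quarter_square":
        return t % math.ceil(math.sqrt(t)) != 0
    raise ValueError(f"unknown comp_foba_skip value: {mode!r}")


def executed_steps(mode: str, max_step: int) -> list:
    """List the step indices in 1..max_step at which FoBa is executed."""
    return [t for t in range(1, max_step + 1) if not foba_is_skipped(t, mode)]


print(executed_steps("power_of_two", 16))    # [1, 2, 4, 8, 16]
print(executed_steps("quarter_square", 16))  # [1, 2, 4, 6, 9, 12, 16]
```

Under both rules the gaps between executed steps grow with the step index, which is presumably why a separate maximum-interval parameter exists.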
SRC¶
The following parameter is for the “hotstart” section of SRC.
Parameter Name |
Type |
Domain |
Default Value |
Description |
---|---|---|---|---|
type |
str |
{‘posterior’, ‘mh_refit_comp’, ‘mh_opt_comp’, ‘mh_refit_gate_and_refit_comp’, ‘mh_refit_gate_and_opt_comp’, ‘mh_opt_gate_and_opt_comp’} |
The hot-start type. If ‘posterior’, FAB learns with a posterior hot-start, which uses an initial model whose tree structure is generated from the base model and the data; the gate and component parameters are initialized randomly. ‘mh_XXX’ means FAB learns with a model hot-start, which uses the base model as the initial model. ‘refit_{gate, comp}’ means refitting the gate functions or prediction formulas with the current data. ‘opt_{gate, comp}’ means optimizing (feature selection and fitting) the gate functions or prediction formulas with the current data. |
Utilizable Sample Metadata¶
Warning
_fabhme_assigned_comp_id is deprecated. Use the hotstart section of SRC instead of the _fabhme_assigned_comp_id data column.
This component can utilize the _fabhme_assigned_comp_id attribute of the sample metadata to hot-start with a posterior. When the _fabhme_assigned_comp_id attribute is specified in the input data, this component starts the FAB/HME algorithm with the _fabhme_assigned_comp_id attribute values as its initial posterior.
To create the attribute _fabhme_assigned_comp_id, see the specification of the command sampo_ps_fabhme export_assigned_comp_id.
Output Attributes¶
This component generates the following attributes.
Attribute Name |
Scale |
Description |
---|---|---|
<component_id>_actual |
INTEGER/REAL (depends on the target attribute) |
Values of target attribute. |
<component_id>_std_actual |
REAL |
Standardized values of |
<component_id>_predict |
REAL |
Predicted values. |
<component_id>_std_predict |
REAL |
Standardized values of |
<component_id>_assigned_comp_id |
INTEGER |
Component IDs assigned by gating functions. |
<component_id>_predict_c<hme_comp_id> |
REAL |
Predicted values from the prediction formula of the component with ID <hme_comp_id>. |
<component_id>_std_predict_c<hme_comp_id> |
REAL |
Standardized predicted values from the prediction formula of the component with ID <hme_comp_id>. |
These attributes are included in the component output data and can be loaded via the SAMPO API.
See also
Obtaining process results via ProcessResultLoader.
When convert_process is executed, the component output data will be saved in <component_id>_predict_result.csv.
This file describes the prediction result of the component.
_sid,fab1_actual,fab1_std_actual,fab1_predict,fab1_std_predict,fab1_assigned_comp_id,fab1_predict_c0,fab1_std_predict_c0,fab1_predict_c1,fab1_std_predict_c1,fab1_predict_c2,fab1_std_predict_c2
0,9,-4.686873e-01,1.193986e+01,3.613565e-01,2,1.173386e+01,3.031946e-01,1.285257e+01,6.190547e-01,1.193986e+01,3.613565e-01
1,9,-4.686873e-01,1.487464e+01,1.189968e+00,2,1.366890e+01,8.495373e-01,1.503635e+01,1.235628e+00,1.487464e+01,1.189968e+00
2,4,-1.880396e+00,5.744445e+00,-1.387866e+00,0,5.744445e+00,-1.387866e+00,6.786511e+00,-1.093648e+00,3.787681e+00,-1.940342e+00
...
8,8,-7.510290e-01,9.614527e+00,-2.951807e-01,0,9.614527e+00,-2.951807e-01,1.079011e+01,3.673590e-02,9.168116e+00,-4.212211e-01
9,6,-1.315712e+00,7.587341e+00,-8.675398e-01,0,7.587341e+00,-8.675398e-01,8.242365e+00,-6.825991e-01,5.744203e+00,-1.387935e+00
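In the sample rows above, fab1_predict coincides with the fab1_predict_c<hme_comp_id> column of the assigned component. The following minimal sketch checks that relationship using only the Python standard library. The rows are copied from the example (the elided `...` rows are left out), and the helper name `formula_prediction` is hypothetical, not part of SAMPO.

```python
import csv
import io

# Rows copied from the example <component_id>_predict_result.csv above.
CSV_TEXT = """\
_sid,fab1_actual,fab1_std_actual,fab1_predict,fab1_std_predict,fab1_assigned_comp_id,fab1_predict_c0,fab1_std_predict_c0,fab1_predict_c1,fab1_std_predict_c1,fab1_predict_c2,fab1_std_predict_c2
0,9,-4.686873e-01,1.193986e+01,3.613565e-01,2,1.173386e+01,3.031946e-01,1.285257e+01,6.190547e-01,1.193986e+01,3.613565e-01
1,9,-4.686873e-01,1.487464e+01,1.189968e+00,2,1.366890e+01,8.495373e-01,1.503635e+01,1.235628e+00,1.487464e+01,1.189968e+00
2,4,-1.880396e+00,5.744445e+00,-1.387866e+00,0,5.744445e+00,-1.387866e+00,6.786511e+00,-1.093648e+00,3.787681e+00,-1.940342e+00
8,8,-7.510290e-01,9.614527e+00,-2.951807e-01,0,9.614527e+00,-2.951807e-01,1.079011e+01,3.673590e-02,9.168116e+00,-4.212211e-01
9,6,-1.315712e+00,7.587341e+00,-8.675398e-01,0,7.587341e+00,-8.675398e-01,8.242365e+00,-6.825991e-01,5.744203e+00,-1.387935e+00
"""


def formula_prediction(row: dict, component_id: str = "fab1") -> float:
    """Pick the per-formula prediction that belongs to the assigned component."""
    comp_id = row[f"{component_id}_assigned_comp_id"]
    return float(row[f"{component_id}_predict_c{comp_id}"])


rows = list(csv.DictReader(io.StringIO(CSV_TEXT)))

# Collect the sample IDs whose final prediction differs from the
# prediction of the formula they were assigned to.
mismatched_sids = [
    row["_sid"]
    for row in rows
    if float(row["fab1_predict"]) != formula_prediction(row)
]
print(mismatched_sids)  # []
```

The other fab1_predict_c<hme_comp_id> columns show what each remaining formula would have predicted for the same sample.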
Attribute Metadata¶
The metadata of the output attributes is created with the following rules.
Context Rule¶
Attribute Name |
Context Name |
Description |
---|---|---|
All the output attributes of this component |
field_path |
List of the superordinate concepts of each output attribute based on the following hierarchical structure of the output attributes: root
├── fabhmerg
│ ├── assigned_comp_id
│ └── component
│ ├── 0
│ │ ├── predict
│ │ └── std_predict
│ ├── 1
│ │ ├── predict
│ │ └── std_predict
│ .
│ .
│ .
│
└── regression
├── actual
├── std_actual
├── predict
└── std_predict
|
<component_id>_std_actual, <component_id>_std_predict, <component_id>_std_predict_c<hme_comp_id> |
mean |
Mean of the target values for learning. |
<component_id>_std_actual, <component_id>_std_predict, <component_id>_std_predict_c<hme_comp_id> |
std |
Standard deviation of the target values for learning. |
<component_id>_assigned_comp_id |
active_comp_ids |
List of component IDs corresponding to each prediction formula. |
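The mean and std contexts make it possible to convert between the standardized and the original target scale. The sketch below assumes the usual z-score relation, standardized = (value - mean) / std; this relation is an assumption, since the specification does not state the formula, although the example output earlier in this document is consistent with it. The function names are hypothetical.

```python
def standardize(value: float, mean: float, std: float) -> float:
    """Assumed z-score standardization."""
    return (value - mean) / std


def destandardize(std_value: float, mean: float, std: float) -> float:
    """Inverse of the assumed z-score standardization."""
    return std_value * std + mean


# Two (fab1_actual, fab1_std_actual) pairs from the example output.
actual_1, std_actual_1 = 9.0, -4.686873e-01
actual_2, std_actual_2 = 4.0, -1.880396e+00

# Under the z-score assumption, two pairs are enough to recover mean and std.
std = (actual_1 - actual_2) / (std_actual_1 - std_actual_2)
mean = actual_1 - std_actual_1 * std
print(round(mean, 2), round(std, 2))  # 10.66 3.54

# fab1_std_predict of sample 0 maps back to roughly fab1_predict = 1.193986e+01.
restored = destandardize(3.613565e-01, mean, std)
print(restored)
```

In practice the two values would be read from the mean and std contexts rather than recovered from the data.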
Derivation Rule¶
Attribute Name |
Derived From |
---|---|
<component_id>_actual, <component_id>_std_actual |
Derived from the target attribute. |
<component_id>_predict, <component_id>_std_predict |
Derived from the attributes which have non-zero coefficients in any prediction formula. |
<component_id>_assigned_comp_id |
Derived from the attributes used in the gating functions. |
<component_id>_predict_c<hme_comp_id>, <component_id>_std_predict_c<hme_comp_id> |
Derived from the attributes which have non-zero coefficients in the prediction formula of component id, <hme_comp_id>. |
Example¶
{
"nodes": [
{"aid": "_sid", "name": "_sid", ... },
...
{"aid": "dl1[0]", "name": "Superplasticizer_kg_in_a_m3_mixture", ... },
{"aid": "dl1[1]", "name": "Coarse_Aggregate_kg_in_a_m3_mixture", ... },
...
{"aid": "std1[0]", "name": "std1_Superplasticizer_kg_in_a_m3_mixture", "scale": "real",
"is_excluded": false, "cid": "std1", "cindex": 0, "values": null, "is_kept": false,
"context": {
"std": 4.8989794855663543e-01,
"mean": 1.0000000000000001e-01
}
},
{"aid": "std1[1]", "name": "std1_Coarse_Aggregate_kg_in_a_m3_mixture", "scale": "real",
"is_excluded": false, "cid": "std1", "cindex": 1, "values": null, "is_kept": false,
"context": {
"std": 4.0463422692599771e+01,
"mean": 9.5183199999999999e+02
}
},
...
{"aid": "fab1[0]", "name": "fab1_actual", "scale": "real", "is_excluded": false,
"cid": "fab1", "cindex": 0, "values": null, "is_kept": false,
"context": {"field_path": ["regression", "actual"]}
},
{"aid": "fab1[1]", "name": "fab1_std_actual", "scale": "real", "is_excluded": false,
"cid": "fab1", "cindex": 1, "values": null, "is_kept": false,
"context": {"std": null, "field_path": ["regression", "std_actual"], "mean": null}
},
{"aid": "fab1[2]", "name": "fab1_predict", "scale": "real", "is_excluded": false,
"cid": "fab1", "cindex": 2, "values": null, "is_kept": false,
"context": {"field_path": ["regression", "predict"]}
},
{"aid": "fab1[3]", "name": "fab1_std_predict", "scale": "real", "is_excluded": false,
"cid": "fab1", "cindex": 3, "values": null, "is_kept": false,
"context": {"std": null, "field_path": ["regression", "std_predict"], "mean": null}
},
{"aid": "fab1[4]", "name": "fab1_assigned_comp_id", "scale": "integer", "is_excluded": false,
"cid": "fab1", "cindex": 4, "values": null, "is_kept": false,
"context": {"active_comp_ids": [0, 5, 9, 10, 13, 20, 24, 27], "field_path": ["fabhmerg", "assigned_comp_id"]}
},
{"aid": "fab1[5]", "name": "fab1_predict_c0", "scale": "real", "is_excluded": false,
"cid": "fab1", "cindex": 5, "values": null, "is_kept": false,
"context": {"field_path": ["fabhmerg", "component", 0, "predict"]}
},
{"aid": "fab1[6]", "name": "fab1_std_predict_c0", "scale": "real", "is_excluded": false,
"cid": "fab1", "cindex": 6, "values": null, "is_kept": false,
"context": {"std": null, "field_path": ["fabhmerg", "component", 0, "std_predict"], "mean": null}
},
...
],
"links": [
{"source": "std1[0]", "target": "fab1[3]"},
{"source": "std1[0]", "target": "fab1[2]"},
{"source": "std1[0]", "target": "fab1[19]"},
{"source": "std1[0]", "target": "fab1[20]"},
{"source": "dl1[4]", "target": "fab1[1]"},
{"source": "dl1[4]", "target": "fab1[0]"},
{"source": "dl1[1]", "target": "std1[1]"},
{"source": "std1[2]", "target": "fab1[11]"},
{"source": "std1[2]", "target": "fab1[16]"},
{"source": "std1[2]", "target": "fab1[4]"},
{"source": "std1[2]", "target": "fab1[6]"},
{"source": "std1[2]", "target": "fab1[2]"},
{"source": "std1[2]", "target": "fab1[3]"},
{"source": "std1[2]", "target": "fab1[5]"},
{"source": "std1[2]", "target": "fab1[12]"},
{"source": "std1[2]", "target": "fab1[15]"},
{"source": "std1[1]", "target": "fab1[2]"},
{"source": "std1[1]", "target": "fab1[4]"},
{"source": "std1[1]", "target": "fab1[3]"},
{"source": "std1[3]", "target": "fab1[10]"},
{"source": "std1[3]", "target": "fab1[9]"},
{"source": "std1[3]", "target": "fab1[16]"},
{"source": "std1[3]", "target": "fab1[8]"},
{"source": "std1[3]", "target": "fab1[18]"},
{"source": "std1[3]", "target": "fab1[4]"},
{"source": "std1[3]", "target": "fab1[6]"},
{"source": "std1[3]", "target": "fab1[13]"},
{"source": "std1[3]", "target": "fab1[7]"},
{"source": "std1[3]", "target": "fab1[2]"},
{"source": "std1[3]", "target": "fab1[3]"},
{"source": "std1[3]", "target": "fab1[5]"},
{"source": "std1[3]", "target": "fab1[17]"},
{"source": "std1[3]", "target": "fab1[14]"},
{"source": "std1[3]", "target": "fab1[15]"},
{"source": "dl1[3]", "target": "std1[3]"},
{"source": "dl1[0]", "target": "std1[0]"},
{"source": "dl1[2]", "target": "std1[2]"}
]
}
See also
Attribute metadata file format in Attribute Metadata File Specification
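The links array can be walked backwards to find which attributes an output attribute was derived from. The sketch below is illustrative only: it uses a handful of links copied from the example, assumes the link graph has no cycles, and the helper names are hypothetical.

```python
from collections import defaultdict

# A subset of the "links" entries from the example above.
links = [
    {"source": "std1[2]", "target": "fab1[4]"},
    {"source": "std1[1]", "target": "fab1[4]"},
    {"source": "std1[3]", "target": "fab1[4]"},
    {"source": "dl1[1]", "target": "std1[1]"},
    {"source": "dl1[2]", "target": "std1[2]"},
    {"source": "dl1[3]", "target": "std1[3]"},
]


def direct_sources(link_list):
    """Map each target attribute ID to the set of its direct source IDs."""
    sources = defaultdict(set)
    for link in link_list:
        sources[link["target"]].add(link["source"])
    return sources


def root_sources(aid, sources):
    """Follow links backwards until attributes without a recorded source are reached."""
    parents = sources.get(aid)
    if not parents:
        return {aid}
    roots = set()
    for parent in parents:
        roots |= root_sources(parent, sources)
    return roots


sources = direct_sources(links)
print(sorted(sources["fab1[4]"]))               # ['std1[1]', 'std1[2]', 'std1[3]']
print(sorted(root_sources("fab1[4]", sources)))  # ['dl1[1]', 'dl1[2]', 'dl1[3]']
```

Here fab1[4] is fab1_assigned_comp_id, so the direct sources are the standardized attributes used in the gating functions.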
Model¶
The model of this component can be described by the following parameters.
Model Parameter |
Type |
Domain |
Description |
---|---|---|---|
fic |
float |
(-inf, inf) |
Factorized Information Criterion. The asymptotic approximation value used by FAB/HME. |
num_initial_comps |
int |
[0, inf) |
The initial number of components before iterations. |
num_active_comps |
int |
[0, inf) |
The terminal number of active components after iterations. |
standardize_mean |
float |
(-inf, inf) |
Mean value used for standardizing the target attribute during learning. |
standardize_std |
float |
(-inf, inf) |
Standard deviation value used for standardizing the target attribute during learning. |
gate_tree |
dict |
See Description |
Dictionary form of the gating tree structure. |
prediction_formulas |
pandas.DataFrame |
See Description |
Component weights and bias for each prediction formula. |
The gate_tree dictionary keys are described below:
Gate Tree Dictionary Key |
Type |
Domain |
Description |
---|---|---|---|
gate_type |
str |
‘logit’ |
The type of gate. |
hard_gate |
bool |
true / false |
Whether the gate is a hard gate. |
nodes |
list of dict |
See Description |
List of node dictionaries. |
edges |
list of dict |
See Description |
List of edge dictionaries. |
The keys of each node dictionary in nodes are described below:
Node Dictionary Key |
Type |
Domain |
Description |
---|---|---|---|
node_id |
int |
[0, inf) |
The node ID. |
node_type |
str |
{‘gate’, ‘component’} |
The node type. |
gate_func |
dict |
See Description |
The |
comp_id |
int |
[0, inf) |
The component ID. Specifiable if |
The keys of each edge dictionary in edges are described below:
Edge Dictionary Key |
Type |
Domain |
Description |
---|---|---|---|
source |
int |
[0, inf) |
The |
target |
int |
[0, inf) |
The |
is_left |
bool |
true / false |
Whether the target node is the left-child of the source. |
The keys of the gate_func dictionary are described below:
Gate Function Dictionary Key |
Type |
Domain |
Description |
---|---|---|---|
bias |
float |
(-inf, inf) |
The gate function bias. |
weights |
list of dict |
See Description |
List of weight dictionaries, each mapping an attribute to its corresponding weight. |
When the model is loaded in the SAMPO API, the model parameters will be output as a single dictionary.
See also
Obtaining process results via ProcessResultLoader
{'fic': -51.39944300304459,
'num_active_comps': 3,
'num_initial_comps': 4,
'standardize_mean': 1.1303215277777777e+04,
'standardize_std': 5.7343353765366674e+03,
'gate_tree':
{'gate_type': 'logit',
'hard_gate': True,
'nodes': [
{'node_type': 'gate',
'node_id': 1,
'gate_func':
{'bias': 5.111062768521956,
'weights': [
{'aid': 'std1[4]', 'attr_name': 'std1_Superplasticizer_kg_in_a_m3_mixture', 'weight': -4.383147346299667},
{'aid': 'std1[6]', 'attr_name': 'std1_Fine_Aggregate_kg_in_a_m3_mixture', 'weight': 10.213844035316507}]}},
{'node_type': 'gate',
'node_id': 0,
'gate_func':
{'bias': -2.5521447932552697,
'weights': [
{'aid': 'std1[0]', 'attr_name': 'std1_Cement_kg_in_a_m3_mixture', 'weight': -11.13428640672036},
{'aid': 'std1[1]', 'attr_name': 'std1_Blast_Furnace_Slag_kg_in_a_m3_mixture', 'weight': -8.404401460418903}]}},
{'comp_id': 1, 'node_type': 'component', 'node_id': 3},
{'comp_id': 0, 'node_type': 'component', 'node_id': 2},
{'comp_id': 3, 'node_type': 'component', 'node_id': 4}],
'edges': [
{'source': 1, 'target': 3, 'is_left': False},
{'source': 1, 'target': 2, 'is_left': True},
{'source': 0, 'target': 1, 'is_left': True},
{'source': 0, 'target': 4, 'is_left': False}]},
'prediction_formulas':
prediction_formula_0 prediction_formula_1 prediction_formula_3
attr_name
std1_Cement_kg_in_a_m3_mixture 0.000000 0.000000 0.000000
std1_Blast_Furnace_Slag_kg_in_a_m3_mixture -0.255175 0.000000 -0.373861
std1_Fly_Ash_kg_in_a_m3_mixture 0.000000 0.000000 0.000000
std1_Water_kg_in_a_m3_mixture 0.000000 0.000000 0.000000
std1_Superplasticizer_kg_in_a_m3_mixture 0.000000 0.000000 0.000000
std1_Coarse_Aggregate_kg_in_a_m3_mixture 0.000000 0.000000 0.000000
std1_Fine_Aggregate_kg_in_a_m3_mixture 0.236025 0.000000 0.000000
bias 0.594811 -1.915292 -0.149098
variance 0.011392 0.614541 0.492770}
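To illustrate how the gate_tree structure can be traversed, the sketch below routes a sample from the root gate down to a component node, using the nodes and edges of the example model. One point is an assumption and not stated in this specification: the sample is sent to the left child when bias + Σ weight × value ≥ 0 (equivalently, when the logistic function of that score is at least 0.5). The function name `assign_component` is hypothetical.

```python
# gate_tree copied from the example model above (aid fields omitted for brevity).
gate_tree = {
    "gate_type": "logit",
    "hard_gate": True,
    "nodes": [
        {"node_type": "gate", "node_id": 1,
         "gate_func": {"bias": 5.111062768521956, "weights": [
             {"attr_name": "std1_Superplasticizer_kg_in_a_m3_mixture",
              "weight": -4.383147346299667},
             {"attr_name": "std1_Fine_Aggregate_kg_in_a_m3_mixture",
              "weight": 10.213844035316507}]}},
        {"node_type": "gate", "node_id": 0,
         "gate_func": {"bias": -2.5521447932552697, "weights": [
             {"attr_name": "std1_Cement_kg_in_a_m3_mixture",
              "weight": -11.13428640672036},
             {"attr_name": "std1_Blast_Furnace_Slag_kg_in_a_m3_mixture",
              "weight": -8.404401460418903}]}},
        {"comp_id": 1, "node_type": "component", "node_id": 3},
        {"comp_id": 0, "node_type": "component", "node_id": 2},
        {"comp_id": 3, "node_type": "component", "node_id": 4},
    ],
    "edges": [
        {"source": 1, "target": 3, "is_left": False},
        {"source": 1, "target": 2, "is_left": True},
        {"source": 0, "target": 1, "is_left": True},
        {"source": 0, "target": 4, "is_left": False},
    ],
}


def assign_component(tree: dict, sample: dict) -> int:
    """Walk the gate tree and return the comp_id the sample ends up in."""
    nodes = {node["node_id"]: node for node in tree["nodes"]}
    children = {(e["source"], e["is_left"]): e["target"] for e in tree["edges"]}
    # The root is the only node that never appears as an edge target.
    (root_id,) = set(nodes) - {e["target"] for e in tree["edges"]}
    node = nodes[root_id]
    while node["node_type"] == "gate":
        func = node["gate_func"]
        # Features missing from the sample are treated as 0.0 in this sketch.
        score = func["bias"] + sum(
            w["weight"] * sample.get(w["attr_name"], 0.0) for w in func["weights"]
        )
        go_left = score >= 0.0  # ASSUMPTION: direction convention, see lead-in
        node = nodes[children[(node["node_id"], go_left)]]
    return node["comp_id"]


print(assign_component(gate_tree, {}))
print(assign_component(gate_tree, {"std1_Cement_kg_in_a_m3_mixture": -1.0}))
```

If the engine uses the opposite direction convention, only the `go_left` line changes; the traversal over nodes and edges stays the same.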
External Format¶
When convert_process is executed, the model parameters are saved into separate files, grouped as general information, gate tree, and prediction formulas.
General Information¶
This file describes the \(FIC\) value after learning the model, the initial number of components, and the final number of active components.
fic,num_initial_comps,num_active_comps
-1.294308e+02,8,3
Gate Tree¶
This file describes the structure and parameters of the gate-tree of the model.
{
"gate_tree": {
"gate_type": "logit",
"hard_gate": true,
"nodes": [
{
"node_id": 11,
"node_type": "gate",
"gate_func": {
"weights": [
{
"aid": "std1[4]",
"attr_name": "std1_RM",
"weight": -3.6682658685673992e+00
},
{
"aid": "std1[7]",
"attr_name": "std1_TAX",
"weight": -5.8122016705226542e+00
},
{
"aid": "std1[10]",
"attr_name": "std1_LSTAT",
"weight": 1.0537643144910271e+01
}
],
"bias": 1.2740926133353371e+01
}
},
{
"node_id": 10,
"node_type": "gate",
"gate_func": {
"weights": [
{
"aid": "std1[2]",
"attr_name": "std1_INDUS",
"weight": 7.6493213521271874e-01
},
{
"aid": "std1[4]",
"attr_name": "std1_RM",
"weight": 2.5021103534594329e+00
},
{
"aid": "std1[10]",
"attr_name": "std1_LSTAT",
"weight": -2.8529074313420657e+00
}
],
"bias": -2.3621841358789547e-01
}
},
...
{
"node_id": 13,
"node_type": "component",
"comp_id": 13
},
{
"node_id": 12,
"node_type": "component",
"comp_id": 12
},
...
],
"edges": [
{
"source": 11,
"target": 13,
"is_left": false
},
{
"source": 11,
"target": 12,
"is_left": true
},
...
]
}
}
Prediction Formulas¶
This file describes parameters of prediction formulas: weights, bias and variance values.
aid,attr_name,prediction_formula_0,prediction_formula_1,prediction_formula_2,prediction_formula_3,prediction_formula_4,prediction_formula_5,prediction_formula_7
std1[0],std1_Cement_kg_in_a_m3_mixture,8.2142735978929815e-01,0.0000000000000000e+00,0.0000000000000000e+00,0.0000000000000000e+00,0.0000000000000000e+00,0.0000000000000000e+00,0.0000000000000000e+00
std1[1],std1_Blast_Furnace_Slag_kg_in_a_m3_mixture,0.0000000000000000e+00,0.0000000000000000e+00,0.0000000000000000e+00,0.0000000000000000e+00,0.0000000000000000e+00,0.0000000000000000e+00,0.0000000000000000e+00
...
,bias,-6.7186992548478641e-01,-2.2356491265481820e-02,-1.3838061097770997e+00,5.3112994502587174e-01,-1.2190946387798474e+00,1.0330395448550976e-01,1.3254015778746325e-02
,variance,1.9539664915904803e-01,4.8649969215506857e-02,4.3107489306578556e-01,8.4528051141512206e-01,7.0848813094870444e-01,1.8867837210436236e-01,2.5513538042657519e-01
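A prediction formula file can be read back and applied to a sample as sketched below. The sketch assumes a plain linear form, prediction = bias + Σ weight × attribute value, in the scale the model was learned in; the form is an assumption based on this being a linear regression component. Only the first two formula columns and the non-elided rows of the example are reproduced, and the helper names are hypothetical.

```python
import csv
import io

# Excerpt of the example file: first two formula columns, elided rows left out.
CSV_TEXT = """\
aid,attr_name,prediction_formula_0,prediction_formula_1
std1[0],std1_Cement_kg_in_a_m3_mixture,8.2142735978929815e-01,0.0000000000000000e+00
std1[1],std1_Blast_Furnace_Slag_kg_in_a_m3_mixture,0.0000000000000000e+00,0.0000000000000000e+00
,bias,-6.7186992548478641e-01,-2.2356491265481820e-02
,variance,1.9539664915904803e-01,4.8649969215506857e-02
"""


def load_formula(csv_text: str, column: str):
    """Split one formula column into feature weights, bias, and variance."""
    weights, bias, variance = {}, 0.0, None
    for row in csv.DictReader(io.StringIO(csv_text)):
        value = float(row[column])
        if row["aid"]:  # feature rows carry an attribute ID
            weights[row["attr_name"]] = value
        elif row["attr_name"] == "bias":
            bias = value
        elif row["attr_name"] == "variance":
            variance = value
    return weights, bias, variance


def predict(weights: dict, bias: float, sample: dict) -> float:
    """Assumed linear form; attributes missing from the sample count as 0.0."""
    return bias + sum(w * sample.get(name, 0.0) for name, w in weights.items())


weights, bias, variance = load_formula(CSV_TEXT, "prediction_formula_0")
relevant = [name for name, w in weights.items() if w != 0.0]
print(relevant)  # ['std1_Cement_kg_in_a_m3_mixture']
print(predict(weights, bias, {"std1_Cement_kg_in_a_m3_mixture": 1.0}))
```

Attributes with a zero weight are the ones not selected as relevant for that formula.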
Prediction Result Evaluation¶
The indices used in evaluating prediction results of this component are described below.
Formula |
Description |
---|---|
\(X_{\mbox{p}}\) |
Array of predicted values. |
\(X_{\mbox{a}}\) |
Array of actual values. |
\(\mbox{mean}(X)\) |
Mean of \(X\). |
\(\mbox{median}(X)\) |
Median value in \(X\). |
\(\mbox{max}(X)\) |
Maximum value in \(X\). |
\([\cdot]_+\) |
A function which returns the argument directly if it is greater than \(0\), otherwise returns \(0\). |
Evaluation Index |
Type |
Description |
---|---|---|
root_mean_squared_error |
float |
RMSE (Root Mean Square Error), which is the square root of the mean squared error as shown below:
\(\sqrt{\mbox{mean}((X_{\mbox{p}} - X_{\mbox{a}})^2)}\)
|
root_median_squared_error |
float |
RMdSE (Root Median Square Error), which is the square root of the median squared error as shown below:
\(\sqrt{\mbox{median}((X_{\mbox{p}} - X_{\mbox{a}})^2)}\)
|
mean_abs_error |
float |
Mean of absolute error as shown below:
\(\mbox{mean}(|X_{\mbox{p}} - X_{\mbox{a}}|)\)
|
median_abs_error |
float |
Median of absolute error as shown below:
\(\mbox{median}(|X_{\mbox{p}} - X_{\mbox{a}}|)\)
|
max_abs_error |
float |
Maximum value of absolute error.
\(\mbox{max}(|X_{\mbox{p}} - X_{\mbox{a}}|)\)
|
relative_root_mean_squared_error |
float |
The square root of the mean squared relative error as shown below:
\(\sqrt{\mbox{mean}((\frac{{\large X}_{\mbox{p}} {\large - X}_{\mbox{a}}}{ {\large X}_{\mbox{a}}})^2)}\)
|
relative_root_median_squared_error |
float |
The square root of the median squared relative error as shown below:
\(\sqrt{\mbox{median}((\frac{{\large X}_{\mbox{p}} {\large - X}_{\mbox{a}}}{ {\large X}_{\mbox{a}}})^2)}\)
|
relative_mean_abs_error |
float |
The mean absolute relative error as shown below:
\(\mbox{mean}(|\frac{{\large X}_{\mbox{p}} {\large - X}_{\mbox{a}}}{ {\large X}_{\mbox{a}}}|)\)
|
relative_median_abs_error |
float |
The median absolute relative error as shown below:
\(\mbox{median}(|\frac{{\large X}_{\mbox{p}} {\large - X}_{\mbox{a}}}{ {\large X}_{\mbox{a}}}|)\)
|
relative_max_abs_error |
float |
The maximum absolute relative error as shown below:
\(\mbox{max}(|\frac{{\large X}_{\mbox{p}} {\large - X}_{\mbox{a}}}{ {\large X}_{\mbox{a}}}|)\)
|
positive_side_root_mean_squared_error |
float |
|
positive_side_root_median_squared_error |
float |
|
positive_side_mean_abs_error |
float |
|
positive_side_median_abs_error |
float |
|
positive_side_max_abs_error |
float |
|
negative_side_root_mean_squared_error |
float |
|
negative_side_root_median_squared_error |
float |
|
negative_side_mean_abs_error |
float |
|
negative_side_median_abs_error |
float |
|
negative_side_max_abs_error |
float |
|
max_upside_err_mean_obs |
float |
Ratio of the maximum error among samples satisfying \(X_{\mbox{p}} > X_{\mbox{a}}\) to the mean of the actual values, as shown below:
\(\frac{\mbox{max}({\large X}_{\mbox{p}} {\large - X}_{\mbox{a}})}{\mbox{mean}({\large X}_{\mbox{a}})}\)
|
mean_upside_err_mean_obs |
float |
Ratio of the mean error, where a sample's error counts only if \(X_{\mbox{p}} > X_{\mbox{a}}\) (otherwise it counts as \(0\)), to the mean of the actual values, as shown below:
\(\frac{\mbox{mean}([{\large X}_{\mbox{p}} {\large - X}_{\mbox{a}}]_+)}{\mbox{mean}({\large X}_{\mbox{a}})}\)
|
max_downside_err_mean_obs |
float |
Ratio of the maximum error among samples satisfying \(X_{\mbox{a}} \geq X_{\mbox{p}}\) to the mean of the actual values, as shown below:
\(\frac{\mbox{max}({\large X}_{\mbox{a}} {\large - X}_{\mbox{p}})}{\mbox{mean}({\large X}_{\mbox{a}})}\)
|
mean_downside_err_mean_obs |
float |
Ratio of the mean error, where a sample's error counts only if \(X_{\mbox{a}} \geq X_{\mbox{p}}\) (otherwise it counts as \(0\)), to the mean of the actual values, as shown below:
\(\frac{\mbox{mean}([{\large X}_{\mbox{a}} - {\large X}_{\mbox{p}}]_+)}{\mbox{mean}({\large X}_{\mbox{a}})}\)
|
negative_pred_num |
int |
The number of the samples that satisfy the condition, \(X_{\mbox{p}} < 0\). |
std_root_mean_squared_error |
float |
|
std_root_median_squared_error |
float |
|
std_mean_abs_error |
float |
|
std_median_abs_error |
float |
|
std_max_abs_error |
float |
|
std_relative_root_mean_squared_error |
float |
|
std_relative_root_median_squared_error |
float |
|
std_relative_mean_abs_error |
float |
|
std_relative_median_abs_error |
float |
|
std_relative_max_abs_error |
float |
|
std_positive_side_root_mean_squared_error |
float |
|
std_positive_side_root_median_squared_error |
float |
|
std_positive_side_mean_abs_error |
float |
|
std_positive_side_median_abs_error |
float |
|
std_positive_side_max_abs_error |
float |
|
std_negative_side_root_mean_squared_error |
float |
|
std_negative_side_root_median_squared_error |
float |
|
std_negative_side_mean_abs_error |
float |
|
std_negative_side_median_abs_error |
float |
|
std_negative_side_max_abs_error |
float |
|
std_max_upside_err_mean_obs |
float |
|
std_mean_upside_err_mean_obs |
float |
|
std_max_downside_err_mean_obs |
float |
|
std_mean_downside_err_mean_obs |
float |
|
std_negative_pred_num |
int |
|
When these evaluation results are obtained via the SAMPO API, they are loaded as a pandas.DataFrame whose columns are the evaluation indices.
See also
Obtaining process results via ProcessResultLoader
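The evaluation indices whose formulas are given in the table above can be computed directly from the predicted and actual arrays. The following sketch implements those formulas as written, using only the Python standard library. It is not SAMPO's implementation: the function name `evaluate` is hypothetical, the positive_side_*, negative_side_*, and std_* indices are omitted because the table gives no formula for them, and the relative indices require non-zero actual values.

```python
import math
import statistics


def evaluate(predicted, actual):
    """Compute the indices with documented formulas for X_p and X_a."""
    diff = [p - a for p, a in zip(predicted, actual)]    # X_p - X_a
    abs_err = [abs(d) for d in diff]
    sq_err = [d * d for d in diff]
    rel_err = [d / a for d, a in zip(diff, actual)]      # needs non-zero actuals
    abs_rel = [abs(r) for r in rel_err]
    sq_rel = [r * r for r in rel_err]
    upside = [max(d, 0.0) for d in diff]                 # [X_p - X_a]_+
    downside = [max(-d, 0.0) for d in diff]              # [X_a - X_p]_+
    mean_actual = statistics.mean(actual)
    return {
        "root_mean_squared_error": math.sqrt(statistics.mean(sq_err)),
        "root_median_squared_error": math.sqrt(statistics.median(sq_err)),
        "mean_abs_error": statistics.mean(abs_err),
        "median_abs_error": statistics.median(abs_err),
        "max_abs_error": max(abs_err),
        "relative_root_mean_squared_error": math.sqrt(statistics.mean(sq_rel)),
        "relative_root_median_squared_error": math.sqrt(statistics.median(sq_rel)),
        "relative_mean_abs_error": statistics.mean(abs_rel),
        "relative_median_abs_error": statistics.median(abs_rel),
        "relative_max_abs_error": max(abs_rel),
        # The max_* ratios follow the formula as written: max over all samples.
        "max_upside_err_mean_obs": max(diff) / mean_actual,
        "mean_upside_err_mean_obs": statistics.mean(upside) / mean_actual,
        "max_downside_err_mean_obs": max(-d for d in diff) / mean_actual,
        "mean_downside_err_mean_obs": statistics.mean(downside) / mean_actual,
        "negative_pred_num": sum(1 for p in predicted if p < 0),
    }


# Made-up toy data for illustration only (not taken from the component output).
result = evaluate(predicted=[2.0, 4.0, 9.0], actual=[1.0, 5.0, 6.0])
print(result["mean_abs_error"])           # (1 + 1 + 3) / 3
print(result["max_upside_err_mean_obs"])  # 0.75
```

Because the *_err_mean_obs indices divide by the mean of the actual values, they change sign when that mean is negative, which can happen on a standardized scale.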
External Format¶
When convert_process is executed, the evaluation results are saved as a CSV file with the evaluation indices as the header of the CSV.
This file describes the evaluation of the prediction result of the component.
root_mean_squared_error,root_median_squared_error,mean_abs_error,median_abs_error,max_abs_error,relative_root_mean_squared_error,relative_root_median_squared_error,relative_mean_abs_error,relative_median_abs_error,relative_max_abs_error,positive_side_root_mean_squared_error,positive_side_root_median_squared_error,positive_side_mean_abs_error,positive_side_median_abs_error,positive_side_max_abs_error,negative_side_root_mean_squared_error,negative_side_root_median_squared_error,negative_side_mean_abs_error,negative_side_median_abs_error,negative_side_max_abs_error,max_upside_err_mean_obs,mean_upside_err_mean_obs,max_downside_err_mean_obs,mean_downside_err_mean_obs,negative_pred_num,std_root_mean_squared_error,std_root_median_squared_error,std_mean_abs_error,std_median_abs_error,std_max_abs_error,std_relative_root_mean_squared_error,std_relative_root_median_squared_error,std_relative_mean_abs_error,std_relative_median_abs_error,std_relative_max_abs_error,std_positive_side_root_mean_squared_error,std_positive_side_root_median_squared_error,std_positive_side_mean_abs_error,std_positive_side_median_abs_error,std_positive_side_max_abs_error,std_negative_side_root_mean_squared_error,std_negative_side_root_median_squared_error,std_negative_side_mean_abs_error,std_negative_side_median_abs_error,std_negative_side_max_abs_error,std_max_upside_err_mean_obs,std_mean_upside_err_mean_obs,std_max_downside_err_mean_obs,std_mean_downside_err_mean_obs,std_negative_pred_num
1.350699e+01,8.838062e+00,1.036833e+01,8.837361e+00,3.377979e+01,8.686816e-01,3.416881e-01,5.519667e-01,3.413176e-01,3.334933e+00,1.356254e+01,9.506539e+00,1.098259e+01,9.506539e+00,3.377979e+01,1.338252e+01,5.174490e+00,9.001098e+00,5.174490e+00,2.990477e+01,1.240240e+00,2.782290e-01,1.097967e+00,1.024486e-01,2,1.135126e+00,7.427497e-01,8.713529e-01,7.426909e-01,2.838850e+00,2.076069e+00,6.108897e-01,1.178233e+00,6.105078e-01,9.166376e+00,1.139795e+00,7.989285e-01,9.229754e-01,7.989285e-01,2.838850e+00,1.124665e+00,4.348636e-01,7.564513e-01,4.348636e-01,2.513193e+00,-2.609989e+00,-5.855117e-01,-2.310587e+00,-2.155952e-01,74