BinarizeFD Component Specification
Overview
The BinarizeFD component is a feature descriptor that binarizes its input attributes against one or more thresholds. The scale of each input attribute must be INTEGER or REAL.
Example:
SPD:
dl1 -> bin
---
components:
    dl1:
        component: DataLoader
    bin:
        component: BinarizeFDComponent
        features: scale == 'real' or scale == 'integer'
        binarize_param: [["re_match('.*Length', name)", [{'threshold': 5.0}, {'threshold': 1.4}]]]
Input of the component:
_sid | Sepal.Length | Petal.Length |
---|---|---|
0 | 5.1 | 1.4 |
1 | inf | 1.4 |
2 | 4.7 | 1.3 |
3 | 4.6 | -inf |
4 | NaN | NaN |
Output of the component:
_sid | bin(5.0)_Sepal.Length | bin(1.4)_Sepal.Length | bin(5.0)_Petal.Length | bin(1.4)_Petal.Length |
---|---|---|---|---|
0 | 1 | 1 | 0 | 1 |
1 | 1 | 1 | 0 | 1 |
2 | 0 | 1 | 0 | 0 |
3 | 0 | 1 | 0 | 0 |
4 | NaN | NaN | NaN | NaN |
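The example is consistent with a simple rule: a value maps to 1 when it is greater than or equal to the threshold, to 0 otherwise, and NaN stays NaN. The sketch below reproduces the output rows under that assumption. The rule, in particular the `>=` comparison, is inferred from the example rows and is not stated by this specification; `binarize` is a hypothetical helper, not part of the SAMPO API.

```python
import math

def binarize(value, threshold=0.0):
    """Hypothetical helper: 1 if value >= threshold, else 0; NaN is passed through."""
    if math.isnan(value):
        return math.nan
    return 1 if value >= threshold else 0  # assumed comparison, inferred from the example

# Input columns (Sepal.Length, Petal.Length) copied from the example above.
inf, nan = math.inf, math.nan
rows = [
    (5.1, 1.4),
    (inf, 1.4),
    (4.7, 1.3),
    (4.6, -inf),
    (nan, nan),
]

# Same column order as the output table: Sepal at 5.0 and 1.4, then Petal at 5.0 and 1.4.
output = [
    [binarize(sepal, 5.0), binarize(sepal, 1.4),
     binarize(petal, 5.0), binarize(petal, 1.4)]
    for sepal, petal in rows
]
for sid, row in enumerate(output):
    print(sid, row)
```

The Petal.Length value 1.4 with threshold 1.4 in row 0 is the case that suggests the comparison is inclusive.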
This component has no component-specific external formats.
See also
Component-common external format files in convert_process
Parameters
Here are the component-specific parameters for the BinarizeFD component.
SPD
The following parameter belongs in the “components” section of the SPD.
Parameter Name | Type | Domain | Default Value | Description |
---|---|---|---|---|
binarize_param [1] | [[feature_expr, [{params}]]] | – | – | Specifies the target features and the binarization parameters. |
[1] Required parameter
Details of binarize_param
feature_expr follows the format described in Attribute Selection of SPD (SAMPO Process Description) File Specification:
're_match(".*Length", name)'
params is written as a dict:
{'threshold': 5.0}
Key of params | Type | Domain | Default Value | Description |
---|---|---|---|---|
threshold | float | (-inf, inf) | 0.0 | Threshold used to binarize the input attribute. |
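To illustrate the nesting of binarize_param, the sketch below walks the value from the example SPD and lists each (feature_expr, threshold) pair, falling back to the documented default of 0.0 when a params dict omits threshold. `expand_binarize_param` is a hypothetical helper written for this illustration; how SAMPO itself parses and validates the parameter is not described here.

```python
import ast

DEFAULT_THRESHOLD = 0.0  # default value documented for the 'threshold' key

def expand_binarize_param(binarize_param):
    """Hypothetical helper: flatten [[feature_expr, [{params}, ...]], ...] into pairs."""
    pairs = []
    for feature_expr, params_list in binarize_param:
        for params in params_list:
            threshold = float(params.get("threshold", DEFAULT_THRESHOLD))
            pairs.append((feature_expr, threshold))
    return pairs

# The value from the SPD example happens to be a valid Python literal as well.
spd_value = """[["re_match('.*Length', name)", [{'threshold': 5.0}, {'threshold': 1.4}]]]"""
pairs = expand_binarize_param(ast.literal_eval(spd_value))
for feature_expr, threshold in pairs:
    print(feature_expr, threshold)
```

Each feature_expr is paired with every params dict in its list, so one expression with two thresholds yields two binarizations per selected attribute.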
Output Attributes
The BinarizeFD component generates the following attributes:
Attribute Name | Scale | Description |
---|---|---|
<component_id>(<threshold>)_<original_attribute_name> | INTEGER | Value of the original attribute binarized by the threshold. |
These attributes are part of the component output data. They can be loaded through the SAMPO API or saved as data.csv after executing convert_process.
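As an illustration of the naming pattern, the sketch below builds the output attribute names for the example SPD. It assumes the threshold is rendered the way Python formats a float, which matches 5.0 and 1.4 in the example but is not confirmed for other values. `output_attribute_name` is a hypothetical helper, and the regular expression only stands in for the feature_expr re_match('.*Length', name).

```python
import re

def output_attribute_name(component_id, threshold, original_name):
    """Hypothetical helper for <component_id>(<threshold>)_<original_attribute_name>."""
    return f"{component_id}({threshold})_{original_name}"

input_names = ["_sid", "Sepal.Length", "Petal.Length"]
# Stand-in for the feature_expr re_match('.*Length', name) from the example.
selected = [name for name in input_names if re.match(r".*Length", name)]

# The order of this list is incidental and says nothing about SAMPO's column order.
names = [
    output_attribute_name("bin", threshold, name)
    for name in selected
    for threshold in (5.0, 1.4)
]
print(names)
```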
See also
Obtaining process results via ProcessResultLoader.
Attribute Metadata
The metadata of the output attributes is created with the following rules.
Context Rule
Attribute Name | Context Name | Description |
---|---|---|
<component_id>(<threshold>)_<original_attribute_name> | threshold | Set to the value of threshold in params. |
Derivation Rule
Each new attribute is derived from the corresponding attribute selected by the features parameter of the component.
Example
{
  "nodes": [
    {"aid": "_sid", "name": "_sid", ... },
    {"aid": "dl1[0]", "name": "Sepal.Length", ... },
    {"aid": "dl1[1]", "name": "Petal.Length", ... },
    {"aid": "bin[0]", "name": "bin(5.0)_Sepal.Length", "scale": "integer",
     "is_excluded": false, "cid": "bin", "cindex": 0, "values": null, "is_kept": false,
     "context": {"threshold": 5.0000000000000000e+00}},
    {"aid": "bin[1]", "name": "bin(5.0)_Petal.Length", "scale": "integer",
     "is_excluded": false, "cid": "bin", "cindex": 1, "values": null, "is_kept": false,
     "context": {"threshold": 5.0000000000000000e+00}},
    {"aid": "bin[2]", "name": "bin(1.4)_Sepal.Length", "scale": "integer",
     "is_excluded": false, "cid": "bin", "cindex": 2, "values": null, "is_kept": false,
     "context": {"threshold": 1.40000000000000000e+00}},
    {"aid": "bin[3]", "name": "bin(1.4)_Petal.Length", "scale": "integer",
     "is_excluded": false, "cid": "bin", "cindex": 3, "values": null, "is_kept": false,
     "context": {"threshold": 1.40000000000000000e+00}},
  ],
  "links": [
    {"source": "dl1[0]", "target": "bin[0]"},
    {"source": "dl1[0]", "target": "bin[1]"},
    {"source": "dl1[1]", "target": "bin[2]"},
    {"source": "dl1[1]", "target": "bin[3]"}
  ]
}
See also
Attribute metadata file format in Attribute Metadata File Specification
Model
The model of this component can be described by its fd_params.
fd_params | Type | Description |
---|---|---|
source_attr_names | list of string | List of the attribute names from which the output attribute is derived. |
params | dict | The keys of this dictionary are the same as the context of this component’s Attribute Metadata. |
When loaded in the SAMPO API, the model is represented as a dict of its fd_params.
See also
Obtaining process results via ProcessResultLoader.
{'fd_params':
[{'source_attr_names': ['Sepal.Length'], 'params': {'threshold': 5.0000000000000000e+00}},
{'source_attr_names': ['Petal.Length'], 'params': {'threshold': 5.0000000000000000e+00}},
{'source_attr_names': ['Sepal.Length'], 'params': {'threshold': 1.40000000000000000e+00}},
{'source_attr_names': ['Petal.Length'], 'params': {'threshold': 1.40000000000000000e+00}}]}
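A short sketch of how the loaded model dict might be consumed: it walks fd_params and applies each threshold to one input record. The `>=` comparison and the NaN pass-through are assumptions inferred from the example in the Overview, and `apply_model` is a hypothetical helper, not a SAMPO API function.

```python
import math

# Model dict as shown above, with the thresholds written in plain float notation.
model = {'fd_params': [
    {'source_attr_names': ['Sepal.Length'], 'params': {'threshold': 5.0}},
    {'source_attr_names': ['Petal.Length'], 'params': {'threshold': 5.0}},
    {'source_attr_names': ['Sepal.Length'], 'params': {'threshold': 1.4}},
    {'source_attr_names': ['Petal.Length'], 'params': {'threshold': 1.4}},
]}

def apply_model(model, record):
    """Hypothetical helper: binarize one record (dict of name -> value) per fd_params entry."""
    results = []
    for fd in model['fd_params']:
        source = fd['source_attr_names'][0]  # the example has one source per entry
        threshold = fd['params']['threshold']
        value = record[source]
        if math.isnan(value):
            binarized = math.nan  # assumed: NaN is passed through
        else:
            binarized = 1 if value >= threshold else 0  # assumed comparison
        results.append((source, threshold, binarized))
    return results

results = apply_model(model, {'Sepal.Length': 4.7, 'Petal.Length': 1.3})
for entry in results:
    print(entry)
```

The record used here is the row with _sid 2 from the Overview example, so the binarized values can be checked against that row of the output table.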