LinearHSICFS Component Specification
Overview
The LinearHSICFS component is a feature selector. It calculates the Linear HSIC (Hilbert-Schmidt Independence Criterion) score of each input feature with respect to the target and outputs the features with the highest scores. The scale of each input feature must be INTEGER or REAL.
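As a rough illustration of what such a score measures, the sketch below computes the textbook biased empirical HSIC with linear kernels for one feature and the target. For one-dimensional inputs this quantity reduces to the squared sample covariance. The function name `linear_hsic` and the NaN handling (dropping incomplete rows) are assumptions made for this sketch. The normalization and missing-value treatment that the component itself applies are not stated in this specification, so the values produced here will not necessarily match the scores the component reports.

```python
import math


def linear_hsic(x, y):
    """Biased empirical HSIC with linear kernels for two 1-D samples.

    With K = x x^T, L = y y^T and the centering matrix H,
    tr(K H L H) / (n - 1)^2 equals the squared sample covariance.
    """
    # Assumption for this sketch: rows with a NaN in either column are dropped.
    pairs = [(a, b) for a, b in zip(x, y)
             if not (math.isnan(a) or math.isnan(b))]
    n = len(pairs)
    if n < 2:
        return float("nan")  # not enough data to estimate dependence

    mean_x = sum(a for a, _ in pairs) / n
    mean_y = sum(b for _, b in pairs) / n
    # x^T H y: sum of products of the centered values.
    cross = sum((a - mean_x) * (b - mean_y) for a, b in pairs)
    return (cross / (n - 1)) ** 2
```

A feature that varies together with the target gets a large score, while a constant feature gets a score of 0.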
Example:
SPD:
dl1 -> ts1 -> fs1
---
components:
    dl1:
        component: DataLoader
    ts1:
        component: TimeshiftFDComponent
        features: scale == 'integer' or scale == 'real'
        shift: [["name == 'humidity'", [3, -2]], ["name == 'temperature'", [-1]]]
    fs1:
        component: LinearHSICFSComponent
        features: all()
        target: name == 'price'
        max_num_output_features: 2
global_settings:
    keep_attributes:
        - 'price'
    feature_exclude:
        - 'price'
Input of the fs1 component:
_sid | ts1(3)_humidity | ts1(-2)_humidity | ts1(-1)_temperature | price |
---|---|---|---|---|
0 | 73 | NaN | NaN | 100 |
1 | 66 | NaN | 22.3 | 110 |
2 | NaN | 89 | 21.8 | 90 |
3 | NaN | 88 | 22.8 | 100 |
4 | NaN | 50 | 23.4 | 120 |
Linear HSIC score of each input feature in fs1:
ts1(3)_humidity | ts1(-2)_humidity | ts1(-1)_temperature |
---|---|---|
0.038 | 0.488 | 0.505 |
Output of the fs1 component:
Since the value of max_num_output_features is 2, fs1 outputs the two features with the highest scores: ts1(-2)_humidity and ts1(-1)_temperature.
_sid | fs1_ts1(-2)_humidity | fs1_ts1(-1)_temperature |
---|---|---|
0 | NaN | NaN |
1 | NaN | 22.3 |
2 | 89 | 21.8 |
3 | 88 | 22.8 |
4 | 50 | 23.4 |
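The selection step in this example can be sketched as follows. This is a minimal illustration, not the component's implementation: the helper name `select_top_features` is hypothetical, the scores are the rounded values listed above, and the choices to keep the selected features in input order and to break ties by input order are assumptions based only on the output shown in this example.

```python
def select_top_features(scores, max_num_output_features, component_id):
    """Keep the highest-scoring features and prefix them with the component id.

    scores: dict mapping input feature name -> Linear HSIC score,
            in the order the features arrive at the component.
    """
    # Rank feature names by score, highest first (sorted() is stable,
    # so equal scores keep their input order).
    ranked = sorted(scores, key=scores.get, reverse=True)
    kept = set(ranked[:max_num_output_features])
    # Emit the kept features in their original input order.
    return [f"{component_id}_{name}" for name in scores if name in kept]


scores = {
    "ts1(3)_humidity": 0.038,
    "ts1(-2)_humidity": 0.488,
    "ts1(-1)_temperature": 0.505,
}
selected = select_top_features(scores, max_num_output_features=2, component_id="fs1")
print(selected)  # ['fs1_ts1(-2)_humidity', 'fs1_ts1(-1)_temperature']
```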
This component has no component-specific external formats.
See also
Component-common external format files in convert_process
Parameters
Here are the component-specific parameters for the LinearHSICFS component.
SPD
The following parameter is specified in the “components” section of the SPD.
Parameter Name | Type | Domain | Default Value | Description |
---|---|---|---|---|
max_num_output_features [1] | int | [1, inf) | – | Specifies the upper limit on the number of output features. |
[1] Required parameter
The target is also required. It specifies the attribute against which the Linear HSIC of each input feature is calculated.
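The constraints above can be expressed as a small check. The sketch below is only an illustration of the documented type and domain: the function `validate_linear_hsic_fs_params` is hypothetical and is not part of any SAMPO API, and it assumes the component's SPD entry has already been parsed into a Python dict.

```python
def validate_linear_hsic_fs_params(params):
    """Check a parsed LinearHSICFS component entry against the documented rules."""
    # target is required.
    if "target" not in params:
        raise ValueError("'target' is required")

    # max_num_output_features is required, of type int, with domain [1, inf).
    if "max_num_output_features" not in params:
        raise ValueError("'max_num_output_features' is required")
    value = params["max_num_output_features"]
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(value, bool) or not isinstance(value, int):
        raise ValueError("'max_num_output_features' must be an int")
    if value < 1:
        raise ValueError("'max_num_output_features' must be >= 1")


# The fs1 entry from the example SPD passes the check.
validate_linear_hsic_fs_params({
    "component": "LinearHSICFSComponent",
    "features": "all()",
    "target": "name == 'price'",
    "max_num_output_features": 2,
})
```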
Output Attributes
The LinearHSICFS component generates the following attributes:
Attribute Name | Scale | Description |
---|---|---|
<component_id>_<original_attribute_name> | Same as the scale of the original attribute. | Same as the value of the original attribute. |
The resulting objects or converted output files contain these output attributes and other attributes specified in keep_attributes of the SPD.
See also
Obtaining process results via ProcessResultLoader.
Attribute Metadata
The metadata of the output attributes is created with the following rules.
Context Rule
Attribute Name | Context Name | Description |
---|---|---|
<component_id>_<original_attribute_name> | score | The calculated Linear HSIC score of the attribute. |
Derivation Rule
Each output attribute is derived from the corresponding input attribute that was selected according to its Linear HSIC score.
Example
{
"nodes": [
{"aid": "_sid", "name": "_sid", "scale": "integer", ...},
...,
{"aid": "dl1[0]", "name": "temperature", "scale": "real", "is_excluded": false, ...},
{"aid": "dl1[1]", "name": "humidity", "scale": "integer", "is_excluded": false, ...},
{"aid": "dl1[2]", "name": "price", "scale": "real", "is_excluded": true, ...},
...,
{"aid": "ts1[0]", "name": "ts1(3)_humidity", "scale": "integer", "cid": "ts1", "cindex": 0, "context": {"shift": 3}, ...},
{"aid": "ts1[1]", "name": "ts1(-2)_humidity", "scale": "integer", "cid": "ts1", "cindex": 1, "context": {"shift": -2}, ...},
{"aid": "ts1[2]", "name": "ts1(-1)_temperature", "scale": "real", "cid": "ts1", "cindex": 2, "context": {"shift": -1}, ...},
...,
{"aid": "fs1[0]", "name": "fs1_ts1(-2)_humidity", "scale": "integer", "cid": "fs1", "cindex": 0,
"context": {"score": 4.8804398568390484e-01}},
{"aid": "fs1[1]", "name": "fs1_ts1(-1)_temperature", "scale": "real", "cid": "fs1", "cindex": 1,
"context": {"score": 5.0526028145921520e-01}}
],
"links": [
{"source": "dl1[1]", "target": "ts1[0]"},
{"source": "dl1[1]", "target": "ts1[1]"},
{"source": "dl1[0]", "target": "ts1[2]"},
...,
{"source": "ts1[2]", "target": "fs1[1]"},
{"source": "ts1[1]", "target": "fs1[0]"}
]
}
See also
Attribute metadata file format in Attribute Metadata File Specification
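Once metadata of this shape has been parsed into a Python dict, the score context and the derivation links can be read with plain dictionary access. The sketch below assumes such a dict is already available; the helper names `selected_scores` and `source_aids` are hypothetical, and the sample data is a trimmed copy of the example above with the elided entries left out.

```python
def selected_scores(metadata, component_id):
    """Map each output attribute name of a component to its 'score' context."""
    return {
        node["name"]: node["context"]["score"]
        for node in metadata["nodes"]
        if node.get("cid") == component_id and "score" in node.get("context", {})
    }


def source_aids(metadata, aid):
    """Return the aids of the attributes that link directly into `aid`."""
    return [link["source"] for link in metadata["links"] if link["target"] == aid]


# Trimmed copy of the example metadata (only the fields used here).
metadata = {
    "nodes": [
        {"aid": "ts1[1]", "name": "ts1(-2)_humidity", "cid": "ts1",
         "context": {"shift": -2}},
        {"aid": "ts1[2]", "name": "ts1(-1)_temperature", "cid": "ts1",
         "context": {"shift": -1}},
        {"aid": "fs1[0]", "name": "fs1_ts1(-2)_humidity", "cid": "fs1",
         "context": {"score": 4.8804398568390484e-01}},
        {"aid": "fs1[1]", "name": "fs1_ts1(-1)_temperature", "cid": "fs1",
         "context": {"score": 5.0526028145921520e-01}},
    ],
    "links": [
        {"source": "ts1[2]", "target": "fs1[1]"},
        {"source": "ts1[1]", "target": "fs1[0]"},
    ],
}

fs1_scores = selected_scores(metadata, "fs1")
fs1_0_sources = source_aids(metadata, "fs1[0]")  # ['ts1[1]']
```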
Model
The model of this component can be described by its fd_params.
fd_params | Type | Description |
---|---|---|
source_attr_names | list of string | A list of the attribute names from which the output attribute is derived. |
params | dict | The keys of this dictionary are the same as the context of this component’s Attribute Metadata. |
These attributes are included in the component output data. They can be loaded through the SAMPO API or saved as data.csv after executing convert_process.
See also
Obtaining process results via ProcessResultLoader.
{'fd_params':
[{'source_attr_names': ['humidity'], 'params': {}},
{'source_attr_names': ['temperature'], 'params': {}}]}
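A model of this shape can be inspected with ordinary dictionary and list operations. The sketch below assumes the model is already available as a Python dict like the one shown above; how it is obtained is outside the scope of this sketch, and the helper name `source_attribute_names` is hypothetical.

```python
def source_attribute_names(model):
    """Collect the distinct source attribute names of a model, in order."""
    names = []
    for entry in model["fd_params"]:
        for name in entry["source_attr_names"]:
            if name not in names:  # keep each source attribute once
                names.append(name)
    return names


model = {
    "fd_params": [
        {"source_attr_names": ["humidity"], "params": {}},
        {"source_attr_names": ["temperature"], "params": {}},
    ]
}
sources = source_attribute_names(model)  # ['humidity', 'temperature']
```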