PolynomializeFD Component Specification¶
Overview¶
The PolynomializeFD component is a feature descriptor that polynomializes its input: it generates new attributes whose values are products of the input attributes. The scale of each input attribute must be INTEGER or REAL.
Example:
SPD:
    dl1 -> poly1

    ---

    components:
        dl1:
            component: DataLoader

        poly1:
            component: PolynomializeFDComponent
            features: scale == 'real' or scale == 'integer'
            kmin: 2
            kmax: 2
Input of the component:
_sid | temperature | pressure |
---|---|---|
0 | 22.3 | 1001 |
1 | 21.8 | 1002 |
2 | inf | NaN |
3 | 23.4 | 1002 |
4 | -inf | 1002 |
Output of the component:
_sid | poly1_temperature:temperature | poly1_temperature:pressure | poly1_pressure:pressure |
---|---|---|---|
0 | 497.290000 | 2.232230e+04 | 1002001 |
1 | 475.240000 | 2.184360e+04 | 1004004 |
2 | inf | NaN | NaN |
3 | 547.560000 | 2.344680e+04 | 1004004 |
4 | inf | -inf | 1004004 |
This component has no component-specific external formats.
See also
Component-common external format files in convert_process
Parameters¶
Here are the component-specific parameters for the PolynomializeFD component.
SPD¶
The following parameters are for the “components” section of the SPD.
Parameter Name | Type | Domain | Default Value | Description |
---|---|---|---|---|
kmin | int | [2,4] | 2 | The minimum number of selected attributes for combination. The details are further explained below. |
kmax | int | [2,4] | 2 | The maximum number of selected attributes for combination. The details are further explained below. |
combinatoric_type | str | ‘combinations_with_replacement’, ‘combinations’ | ‘combinations_with_replacement’ | Identifies whether the combination is with repetition or not. If set as ‘combinations_with_replacement’, repetitions are permitted (multi-combination). If set as ‘combinations’, repetitions are not permitted (combination). |
This component generates new attributes whose values are the products of combinations of \(k\) attributes selected from the \(n\) input attributes. \(k\) takes every value from kmin to kmax. Whether repetitions are permitted is determined by combinatoric_type.
kmax must be greater than or equal to kmin.
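The generation rule can be sketched in Python as follows. This is a minimal illustration, not the component's actual implementation: the function name `polynomialize`, the dict-based row, and the use of `itertools` are assumptions made for this example. The attribute names follow the convention described under Output Attributes.

```python
from itertools import combinations, combinations_with_replacement
from math import prod


def polynomialize(row, component_id="poly1", kmin=2, kmax=2,
                  combinatoric_type="combinations_with_replacement"):
    """Return {new_attribute_name: product} for one input row (hypothetical helper)."""
    if kmax < kmin:
        raise ValueError("kmax must be greater than or equal to kmin")
    # Map the parameter value to the matching itertools generator.
    pick = {
        "combinations_with_replacement": combinations_with_replacement,
        "combinations": combinations,
    }[combinatoric_type]
    out = {}
    for k in range(kmin, kmax + 1):
        # Each combination of k attribute names yields one new attribute.
        for names in pick(list(row), k):
            new_name = f"{component_id}_{':'.join(names)}"
            out[new_name] = prod(row[a] for a in names)
    return out


row = {"temperature": 22.3, "pressure": 1001}
result = polynomialize(row)
for name, value in result.items():
    print(name, value)
```

With the default settings, the sketch produces the three attributes shown in the Overview example for this row; with `combinatoric_type="combinations"` only `poly1_temperature:pressure` remains, because an attribute can no longer be paired with itself.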
Output Attributes¶
PolynomializeFD component generates the following attribute:
Attribute Name | Scale | Description |
---|---|---|
<component_id>_<original_attribute_name1>:<original_attribute_name2>: … <original_attribute_nameN> | REAL | Product of the N original attributes. |
These attributes are included in the component output data. They can be loaded through the SAMPO API or saved as data.csv after executing convert_process.
See also
Obtaining process results via ProcessResultLoader.
Attribute Metadata¶
The metadata of the output attributes is created with the following rules.
Context Rule¶
Attribute Name | Context Name | Description |
---|---|---|
<component_id>_<original_attribute_name1>:<original_attribute_name2>: … <original_attribute_nameN> | aids | Set the applied attributes’ IDs (aids). |
Derivation Rule¶
Each new attribute is derived from the corresponding attributes selected by the features parameter of the component.
Example¶
{
"nodes": [
{"aid": "_sid", "name": "_sid", ... },
{"aid": "dl1[0]", "name": "temperature", ... },
{"aid": "dl1[1]", "name": "pressure", ... },
{"aid": "poly1[0]", "name": "poly1_temperature:temperature",
"scale": "real", "is_excluded": false, "cid": "poly1", "cindex": 0,
"values": null, "is_kept": false, "context": {"aids": ["dl1[0]", "dl1[0]"]}},
{"aid": "poly1[1]", "name": "poly1_temperature:pressure",
"scale": "real", "is_excluded": false, "cid": "poly1", "cindex": 1,
"values": null, "is_kept": false, "context": {"aids": ["dl1[0]", "dl1[1]"]}},
{"aid": "poly1[2]", "name": "poly1_pressure:pressure",
"scale": "real", "is_excluded": false, "cid": "poly1", "cindex": 2,
"values": null, "is_kept": false, "context": {"aids": ["dl1[1]", "dl1[1]"]}}
],
"links": [
{"source": "dl1[0]", "target": "poly1[0]"},
{"source": "dl1[0]", "target": "poly1[1]"},
{"source": "dl1[1]", "target": "poly1[1]"},
{"source": "dl1[1]", "target": "poly1[2]"}
]
}
See also
Attribute metadata file format in Attribute Metadata File Specification
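The context and derivation rules can be illustrated with a short Python sketch that rebuilds the `poly1` nodes and links of the example above. The helper name `build_metadata` is hypothetical, only the fields that vary between nodes are produced, and the assumption that `cindex` follows the `itertools` enumeration order is inferred from the example rather than stated by the specification.

```python
from itertools import combinations_with_replacement


def build_metadata(source_aids, source_names, cid="poly1", k=2):
    """Sketch: derive node and link entries for the generated attributes."""
    nodes, links = [], []
    index_combos = combinations_with_replacement(range(len(source_aids)), k)
    for cindex, idxs in enumerate(index_combos):
        aid = f"{cid}[{cindex}]"
        aids = [source_aids[i] for i in idxs]
        name = f"{cid}_" + ":".join(source_names[i] for i in idxs)
        nodes.append({"aid": aid, "name": name, "scale": "real",
                      "cid": cid, "cindex": cindex,
                      "context": {"aids": aids}})
        # A source attribute used twice (e.g. temperature:temperature)
        # appears twice in "aids" but contributes a single link.
        for src in dict.fromkeys(aids):
            links.append({"source": src, "target": aid})
    return nodes, links


nodes, links = build_metadata(["dl1[0]", "dl1[1]"], ["temperature", "pressure"])
```

Note that `context.aids` keeps repeated IDs, while the link list in the example contains each source-target pair only once; the sketch mirrors that by deduplicating before emitting links.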
Details¶
In the learning phase, this component only validates its parameters by checking that the following conditions are satisfied:
- kmax must be greater than or equal to kmin.
- The number of generated attributes must be less than or equal to one hundred thousand (100,000).
In the running phase, this component polynomializes the input.
If the input data contains NaN values, they propagate to the output: every generated attribute whose product involves a NaN value is also NaN (see the example in the Overview).
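The special values in rows 2 and 4 of the Overview example are consistent with ordinary IEEE 754 floating-point multiplication. The sketch below reproduces them with plain Python floats; it assumes the component multiplies values this way, which the specification does not state explicitly, and `degree2_terms` is a hypothetical helper.

```python
import math


def degree2_terms(temperature, pressure):
    """Return the three degree-2 products for one row (kmin = kmax = 2)."""
    return (temperature * temperature,
            temperature * pressure,
            pressure * pressure)


# Row 2 of the Overview example: temperature = inf, pressure = NaN.
row2 = degree2_terms(math.inf, math.nan)
# Row 4 of the Overview example: temperature = -inf, pressure = 1002.
row4 = degree2_terms(-math.inf, 1002.0)

print(row2)
print(row4)
```

Any product that includes NaN is NaN, the square of an infinity is positive infinity, and a negative infinity multiplied by a positive finite value stays negative infinity.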
The number of generated attributes is obtained by the following formulas.

When combinatoric_type is ‘combinations_with_replacement’:

\[\sum_{k=kmin}^{kmax} {}_{n} H _{k} = \sum_{k=kmin}^{kmax} {}_{n+k-1} C _{k} = \sum_{k=kmin}^{kmax} \frac{(n + k - 1)!}{k! \times (n - 1)!}\]

When combinatoric_type is ‘combinations’:

\[\sum_{k=kmin}^{kmax} {}_{n} C _{k} = \sum_{k=kmin}^{kmax} \frac{n!}{k! \times (n - k)!}\]
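The counting formulas can be evaluated before running a process to see whether a configuration stays within the limit of one hundred thousand generated attributes. The following is a minimal sketch using `math.comb`, where `n` is the number of input attributes selected by features; the names `count_generated` and `MAX_GENERATED` are hypothetical and not part of the SAMPO API.

```python
from math import comb

MAX_GENERATED = 100_000  # limit stated in the learning-phase checks


def count_generated(n, kmin=2, kmax=2,
                    combinatoric_type="combinations_with_replacement"):
    """Number of attributes generated from n input attributes."""
    if kmax < kmin:
        raise ValueError("kmax must be greater than or equal to kmin")
    ks = range(kmin, kmax + 1)
    if combinatoric_type == "combinations_with_replacement":
        # nHk = (n + k - 1) choose k
        return sum(comb(n + k - 1, k) for k in ks)
    if combinatoric_type == "combinations":
        # nCk = n choose k
        return sum(comb(n, k) for k in ks)
    raise ValueError(f"unknown combinatoric_type: {combinatoric_type}")


print(count_generated(2))                                    # Overview example
print(count_generated(2, combinatoric_type="combinations"))
print(count_generated(50, kmin=2, kmax=4) <= MAX_GENERATED)
```

For the Overview example (two input attributes, kmin = kmax = 2, repetitions allowed) the count is 3, matching the three output attributes. With 50 input attributes and k from 2 to 4 the count is 1275 + 22100 + 292825 = 316200, which exceeds the limit.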