MovingStatsFD Component Specification
Overview
The MovingStatsFD component is a feature descriptor. For each sample, it calculates a statistic (such as a moving mean, max, or min) over a window, i.e. a subset of the input data around that sample. The scale of the input attributes must be INTEGER or REAL.
The input is treated as series data, so the result depends on the order of the input samples. Each window is specified by start and end positions relative to the current sample; for example, window_start: -2 and window_end: 0 cover the current sample and the two samples before it.
Example:
SPD:
    dl1 -> ms1

    ---

    components:
        dl1:
            component: DataLoader

        ms1:
            component: MovingStatsFDComponent
            features: scale == 'real' or scale == 'integer'
            movingstats_param: [["re_match('Sepal.*', name)", [{'window_start': -2, 'window_end':0, 'stats_function': 'mean'}]],
                                ["re_match('Petal.*', name)", [{'window_start': -1, 'window_end':1, 'stats_function': 'nanmean'}]]]
Input of the component:
_sid | Sepal.Length | Petal.Length |
---|---|---|
0 | 5.1 | 1.4 |
1 | 4.8 | 1.2 |
2 | 4.5 | 1.3 |
3 | NaN | NaN |
4 | 4.6 | 1.3 |
The window for each sample:
_sid | Sepal.Length | Petal.Length |
---|---|---|
0 | [<Out of input>, <Out of input>, 5.1] | [<Out of input>, 1.4, 1.2] |
1 | [<Out of input>, 5.1, 4.8] | [1.4, 1.2, 1.3] |
2 | [5.1, 4.8, 4.5] | [1.2, 1.3, NaN] |
3 | [4.8, 4.5, NaN] | [1.3, NaN, 1.3] |
4 | [4.5, NaN, 4.6] | [NaN, 1.3, <Out of input>] |
Output of the component:
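If a window extends beyond the input (<Out of input>), the output is NaN regardless of the stats_function.
If a window includes NaN, the output is also NaN, unless the stats_function starts with "nan" (e.g. nanmean), in which case the statistic is calculated ignoring the NaN values.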
_sid | ms1(-2:0:mean)_Sepal.Length | ms1(-1:1:nanmean)_Petal.Length |
---|---|---|
0 | NaN | NaN |
1 | NaN | 1.30 |
2 | 4.8 | 1.25 |
3 | NaN | 1.30 |
4 | NaN | NaN |
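The windowing rules in this example can be sketched in plain Python. This is only an illustration of the behavior described above, not the component's actual implementation: the function name moving_stats and the BASE_FUNCTIONS table are hypothetical, and the handling of a window that contains nothing but NaN under a nan* function (returning NaN) is an assumption.

```python
import math


def _mean(values):
    return sum(values) / len(values)


# Hypothetical lookup table; the "nan" variants reuse these after dropping NaN.
BASE_FUNCTIONS = {"max": max, "min": min, "mean": _mean}


def moving_stats(values, window_start, window_end, stats_function):
    """Return one statistic per sample over positions [i + start, i + end]."""
    ignore_nan = stats_function.startswith("nan")
    reducer = BASE_FUNCTIONS[stats_function[3:] if ignore_nan else stats_function]
    n = len(values)
    result = []
    for i in range(n):
        lo, hi = i + window_start, i + window_end
        if lo < 0 or hi >= n:
            # The window reaches outside the input: always NaN.
            result.append(math.nan)
            continue
        window = values[lo:hi + 1]
        if ignore_nan:
            window = [v for v in window if not math.isnan(v)]
        if not window or any(math.isnan(v) for v in window):
            # NaN remains in the window, or nothing is left (assumed NaN).
            result.append(math.nan)
        else:
            result.append(reducer(window))
    return result


sepal = [5.1, 4.8, 4.5, math.nan, 4.6]
petal = [1.4, 1.2, 1.3, math.nan, 1.3]
sepal_mean = moving_stats(sepal, -2, 0, "mean")
petal_nanmean = moving_stats(petal, -1, 1, "nanmean")
print([round(v, 2) for v in sepal_mean])
print([round(v, 2) for v in petal_nanmean])
```

Run on the example input, the two lists match the output columns of the table above, up to floating-point rounding.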
This component has no component-specific external formats.
See also
Component-common external format files in convert_process
Parameters
Here are the component-specific parameters for the MovingStatsFD component.
SPD
The following parameter is specified in the "components" section of the SPD.
Parameter Name | Type | Domain | Default Value | Description |
---|---|---|---|---|
movingstats_param [1] | [[feature_expr, [{params}]]] | – | – | Specifies the target features and the parameters of the moving statistics function. |

[1] Required parameter
Details of movingstats_param
feature_expr follows the format described in Attribute Selection of SPD (SAMPO Process Description) File Specification:
're_match(".*Length", name)'
params is described as a dict:
{'window_start': -2, 'window_end':0, 'stats_function': 'max'}
Key of params | Type | Domain | Default Value | Description |
---|---|---|---|---|
window_start | int | (-inf, inf) | 0 | The start position of the window, relative to the sample. |
window_end | int | (-inf, inf) | 0 | The end position of the window, relative to the sample. |
stats_function [2] | str | max, min, mean, nanmax, nanmin, nanmean | – | The statistical function. If the name starts with "nan" (e.g. nanmean), the statistic is calculated ignoring NaN values. |

[2] Required key of params
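As a rough illustration of the table above, one params dict could be checked and completed with its defaults as follows. The helper normalize_params is hypothetical and not part of SAMPO; rejecting unknown keys is an assumption, since the specification does not say how extra keys are treated.

```python
ALLOWED_STATS_FUNCTIONS = {"max", "min", "mean", "nanmax", "nanmin", "nanmean"}
KNOWN_KEYS = {"window_start", "window_end", "stats_function"}


def normalize_params(params):
    """Validate one params dict and fill in the documented defaults."""
    unknown = set(params) - KNOWN_KEYS
    if unknown:
        # Assumption: keys outside the documented three are treated as errors.
        raise ValueError(f"unknown keys: {sorted(unknown)}")
    if "stats_function" not in params:
        raise ValueError("stats_function is required")
    if params["stats_function"] not in ALLOWED_STATS_FUNCTIONS:
        raise ValueError(f"unsupported stats_function: {params['stats_function']!r}")
    # window_start and window_end both default to 0.
    normalized = {"window_start": 0, "window_end": 0, **params}
    for key in ("window_start", "window_end"):
        value = normalized[key]
        if isinstance(value, bool) or not isinstance(value, int):
            raise TypeError(f"{key} must be an int")
    return normalized


full = normalize_params({"window_start": -2, "window_end": 0, "stats_function": "max"})
minimal = normalize_params({"stats_function": "nanmean"})
print(full)
print(minimal)
```

With only stats_function given, both window positions fall back to 0, so the window contains just the sample itself.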
Output Attributes
MovingStatsFD component generates the following attribute:
Attribute Name | Scale | Description |
---|---|---|
<component_id>(<window_start>:<window_end>:<stats_function>)_<original_attribute_name> | REAL | Calculated statistical value. |
These attributes are included in the component output data. They can be loaded through the SAMPO API or saved as data.csv after executing convert_process.
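The naming pattern of the output attributes can be illustrated with a small sketch. Both helpers below are hypothetical and only mirror the pattern in the table; the regular expression additionally assumes that the component ID contains no parentheses.

```python
import re

# Assumed shape: <component_id>(<start>:<end>:<stats_function>)_<original_name>
NAME_PATTERN = re.compile(
    r"^(?P<component_id>[^()]+)"
    r"\((?P<window_start>-?\d+):(?P<window_end>-?\d+):(?P<stats_function>\w+)\)"
    r"_(?P<original_name>.+)$"
)


def output_attribute_name(component_id, params, original_name):
    """Build an output attribute name from a component ID and a params dict."""
    return "{}({}:{}:{})_{}".format(
        component_id,
        params["window_start"],
        params["window_end"],
        params["stats_function"],
        original_name,
    )


def parse_output_attribute_name(name):
    """Split an output attribute name into its parts, or return None."""
    match = NAME_PATTERN.match(name)
    if match is None:
        return None
    parts = match.groupdict()
    parts["window_start"] = int(parts["window_start"])
    parts["window_end"] = int(parts["window_end"])
    return parts


built = output_attribute_name(
    "ms1",
    {"window_start": -2, "window_end": 0, "stats_function": "mean"},
    "Sepal.Length",
)
parsed = parse_output_attribute_name("ms1(-1:1:nanmean)_Petal.Length")
print(built)
print(parsed)
```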
See also
Obtaining process results via ProcessResultLoader.
Attribute Metadata
The metadata of the output attributes is created with the following rules.
Context Rule
Attribute Name | Context Name | Description |
---|---|---|
<component_id>(<window_start>:<window_end>:<stats_function>)_<original_attribute_name> | window_start | The value of window_start. |
<component_id>(<window_start>:<window_end>:<stats_function>)_<original_attribute_name> | window_end | The value of window_end. |
<component_id>(<window_start>:<window_end>:<stats_function>)_<original_attribute_name> | stats_function | The value of stats_function. |
Derivation Rule
Each new attribute is derived from the corresponding attribute selected by the features parameter of the component.
Example
    {
      "nodes": [
        {"aid": "dl1[1]", "name": "Petal.Length", ... },
        {"aid": "_sid", "name": "_sid", ... },
        {"aid": "dl1[0]", "name": "Sepal.Length", ... },
        {"aid": "ms1[1]", "name": "ms1(-1:1:nanmean)_Petal.Length", "scale": "real",
         "is_excluded": false, "cid": "ms1", "cindex": 1, "values": null, "is_kept": false,
         "context": {"window_end": 1, "stats_function": "nanmean", "window_start": -1}},
        {"aid": "ms1[0]", "name": "ms1(-2:0:mean)_Sepal.Length", "scale": "real",
         "is_excluded": false, "cid": "ms1", "cindex": 0, "values": null, "is_kept": false,
         "context": {"window_end": 0, "stats_function": "mean", "window_start": -2}}
      ],
      "links": [
        {"source": "dl1[1]", "target": "ms1[1]"},
        {"source": "dl1[0]", "target": "ms1[0]"}
      ]
    }
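To show how the nodes and links in the metadata example relate, the following sketch resolves each derived attribute to its source attribute and context. The metadata literal is a trimmed copy of the example (the fields elided with "..." are simply left out), and how the metadata file is actually loaded is not covered here.

```python
# Trimmed copy of the example metadata; elided fields are omitted.
metadata = {
    "nodes": [
        {"aid": "dl1[1]", "name": "Petal.Length"},
        {"aid": "_sid", "name": "_sid"},
        {"aid": "dl1[0]", "name": "Sepal.Length"},
        {"aid": "ms1[1]", "name": "ms1(-1:1:nanmean)_Petal.Length", "cid": "ms1",
         "context": {"window_end": 1, "stats_function": "nanmean", "window_start": -1}},
        {"aid": "ms1[0]", "name": "ms1(-2:0:mean)_Sepal.Length", "cid": "ms1",
         "context": {"window_end": 0, "stats_function": "mean", "window_start": -2}},
    ],
    "links": [
        {"source": "dl1[1]", "target": "ms1[1]"},
        {"source": "dl1[0]", "target": "ms1[0]"},
    ],
}

# Index the nodes by attribute ID so that links can be resolved to names.
nodes_by_aid = {node["aid"]: node for node in metadata["nodes"]}

derivations = {}
for link in metadata["links"]:
    target = nodes_by_aid[link["target"]]
    source = nodes_by_aid[link["source"]]
    entry = derivations.setdefault(
        target["name"], {"sources": [], "context": target.get("context")}
    )
    entry["sources"].append(source["name"])

for name, info in derivations.items():
    print(name, "<-", info["sources"], info["context"])
```

Each link points from a source attribute to the attribute derived from it, so following the links recovers the derivation rule described above.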
See also
Attribute metadata file format in Attribute Metadata File Specification
Model
The model of this component can be described by its fd_params.
fd_params | Type | Description |
---|---|---|
source_attr_names | list of string | A list of the attribute names from which the output attribute is derived. |
params | dict | The keys of this dictionary are the same as the context names in this component's Attribute Metadata. |
When loaded in the SAMPO API, the model is represented as a dict of its fd_params.
See also
Obtaining process results via ProcessResultLoader.
    {'fd_params':
        [{'source_attr_names': ['Petal.Length'], 'params': {'window_end': 1, 'stats_function': 'nanmean', 'window_start': -1}},
         {'source_attr_names': ['Sepal.Length'], 'params': {'window_end': 0, 'stats_function': 'mean', 'window_start': -2}}]}
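A dict of this shape can be summarized per source attribute, as in the minimal sketch below. The model literal is copied from the example above; the function windows_by_source is hypothetical, and obtaining the dict through ProcessResultLoader is not shown.

```python
# Copied from the example above.
model = {
    "fd_params": [
        {"source_attr_names": ["Petal.Length"],
         "params": {"window_end": 1, "stats_function": "nanmean", "window_start": -1}},
        {"source_attr_names": ["Sepal.Length"],
         "params": {"window_end": 0, "stats_function": "mean", "window_start": -2}},
    ]
}


def windows_by_source(model):
    """Map each source attribute to its (start, end, stats_function) tuples."""
    summary = {}
    for entry in model["fd_params"]:
        params = entry["params"]
        window = (params["window_start"], params["window_end"], params["stats_function"])
        for source in entry["source_attr_names"]:
            summary.setdefault(source, []).append(window)
    return summary


summary = windows_by_source(model)
for source, windows in summary.items():
    print(source, windows)
```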