MovingHistogramFD Component Specification

Overview

The MovingHistogramFD component is a feature descriptor. It calculates a moving histogram: a histogram computed for each window (a subset of the input data), in the same way that a moving average computes a mean for each window. The scale of the input features must be INTEGER or REAL.

The input data is treated as series data ordered by the input sequence of samples. Each window is specified by start and end positions relative to the current sample.

For each input feature, this component generates as many output features as there are bins. Each output feature represents the frequency of the observations that fall in the corresponding bin.

Example:

  • SPD:

    dl1 -> hist1
    
    ---
    
    components:
        dl1:
            component: DataLoader
    
        hist1:
            component: MovingHistogramFDComponent
            features: scale == 'real' or scale == 'integer'
            moving_histogram_param: [["re_match('.*Length', name)",
                                      [{'window_start': -1, 'window_end': 1,
                                        'num_bins': 2}]]]
    
  • Input of the component:

    _sid  Sepal.Length  Petal.Length
    0     4.9           inf
    1     5.1           1.2
    2     4.7           1.3
    3     4.6           1.0
    4     4.6           -inf
    5     5.0           NaN

  • The window for each sample:

    _sid  Sepal.Length                Petal.Length
    0     [<Out of input>, 4.9, 5.1]  [<Out of input>, inf, 1.2]
    1     [4.9, 5.1, 4.7]             [inf, 1.2, 1.3]
    2     [5.1, 4.7, 4.6]             [1.2, 1.3, 1.0]
    3     [4.7, 4.6, 4.6]             [1.3, 1.0, -inf]
    4     [4.6, 4.6, 5.0]             [1.0, -inf, NaN]
    5     [4.6, 5.0, <Out of input>]  [-inf, NaN, <Out of input>]

  • Output of the component (if a window includes <Out of input>, the output is NaN):

    _sid  hist1(-1:1:2:0)_Sepal.Length  hist1(-1:1:2:1)_Sepal.Length  hist1(-1:1:2:0)_Petal.Length  hist1(-1:1:2:1)_Petal.Length
    0     NaN                           NaN                           NaN                           NaN
    1     1                             2                             0                             2
    2     2                             1                             1                             2
    3     3                             0                             1                             1
    4     2                             1                             1                             0
    5     NaN                           NaN                           NaN                           NaN
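The example above can be reproduced with a short pure-Python sketch. It is not the component's implementation: the function name moving_histogram is hypothetical, and three behaviors are assumptions inferred from the example tables rather than stated rules (bin edges are taken over the finite values only; inf, -inf and NaN are not counted; bins are half-open except the last, which includes the maximum).

```python
import bisect
import math


def moving_histogram(values, window_start, window_end, num_bins):
    """Return (bin_edges, rows), where rows[i] holds the bin counts for sample i."""
    finite = [v for v in values if math.isfinite(v)]
    low, high = min(finite), max(finite)
    # Assumption: edges are equally spaced over the finite range of the feature.
    edges = [low + (high - low) * k / num_bins for k in range(num_bins)] + [high]
    rows = []
    for i in range(len(values)):
        first, last = i + window_start, i + window_end
        if first < 0 or last >= len(values):
            # The window reaches outside the input: every bin is NaN.
            rows.append([math.nan] * num_bins)
            continue
        counts = [0] * num_bins
        for v in values[first:last + 1]:
            if not math.isfinite(v):
                continue  # Assumption: inf, -inf and NaN are not counted.
            # Assumption: bins are half-open, except the last, which includes the maximum.
            k = min(bisect.bisect_right(edges, v) - 1, num_bins - 1)
            counts[k] += 1
        rows.append(counts)
    return edges, rows


sepal = [4.9, 5.1, 4.7, 4.6, 4.6, 5.0]
petal = [math.inf, 1.2, 1.3, 1.0, -math.inf, math.nan]
sepal_edges, sepal_rows = moving_histogram(sepal, -1, 1, 2)
petal_edges, petal_rows = moving_histogram(petal, -1, 1, 2)
for sid, (s, p) in enumerate(zip(sepal_rows, petal_rows)):
    print(sid, s, p)
```

Under these assumptions, the rows printed for _sid 1 to 4 agree with the output table, and the rows for _sid 0 and 5 are NaN because their windows extend past the input.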

This component has no component-specific external formats.

See also

Component-common external format files in convert_process


Parameters

Here are the component-specific parameters for the MovingHistogramFD component.

SPD

The following parameter is specified in the “components” section of the SPD.

Parameter Name              Type                          Domain  Default Value  Description
moving_histogram_param [1]  [[feature_expr, [{params}]]]  -       -              Specifies the target features and the parameters of the MovingHistogram function.

[1] Required parameter

Details of moving_histogram_param

  • feature_expr follows the format described in Attribute Selection of SPD (SAMPO Process Description) File Specification:

    're_match(".*Length", name)'
    
  • params is described as a dict:

    {'window_start':-1, 'window_end':1, 'num_bins':2}
    

Key of params  Type  Domain       Default Value  Description
window_start   int   (-inf, inf)  0              The start position of the window.
window_end     int   (-inf, inf)  0              The end position of the window. window_end must be greater than or equal to window_start.
num_bins [2]   int   [1, inf)     -              The number of bins.

[2] Required key of params
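The defaults and constraints listed for params can be summarized in a small check. This is only an illustrative sketch: check_params is a hypothetical helper, not part of SAMPO, and the error messages are invented for the example.

```python
def check_params(params):
    """Apply the documented defaults and check the documented constraints."""
    if "num_bins" not in params:
        raise ValueError("num_bins is required")
    # window_start and window_end both default to 0.
    merged = {"window_start": 0, "window_end": 0, **params}
    for key in ("window_start", "window_end", "num_bins"):
        value = merged[key]
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(value, bool) or not isinstance(value, int):
            raise ValueError(f"{key} must be an int, got {value!r}")
    if merged["num_bins"] < 1:
        raise ValueError("num_bins must be at least 1")
    if merged["window_end"] < merged["window_start"]:
        raise ValueError("window_end must be greater than or equal to window_start")
    return merged


checked = check_params({"window_start": -1, "window_end": 1, "num_bins": 2})
defaults = check_params({"num_bins": 3})
print(checked)
print(defaults)
```

With the defaults, a params dict that gives only num_bins describes a window that contains just the current sample.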


Utilizable Sample Metadata

No component-specific sample metadata is available.


Output Attributes

The number of output attributes is the number of original attributes \(\times\) num_bins.

MovingHistogramFD component generates the following attribute:

Attribute Name                                                                                Scale    Description
<component_id>(<window_start>:<window_end>:<num_bins>:<bin_index>)_<original_attribute_name>  INTEGER  The frequency of the observations in the bin.

These attributes are included in the component output data. They can be loaded via the SAMPO API or saved as data.csv after executing convert_process.

See also

Obtaining process results via ProcessResultLoader.
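The naming pattern for the output attributes can be illustrated with a short sketch. The function output_attribute_names is hypothetical, and it assumes that bin_index runs from 0 to num_bins - 1, as it does in the example output.

```python
def output_attribute_names(component_id, attribute_name, window_start, window_end, num_bins):
    """Build one output attribute name per bin for a single input attribute."""
    return [
        f"{component_id}({window_start}:{window_end}:{num_bins}:{bin_index})_{attribute_name}"
        for bin_index in range(num_bins)  # assumption: bin indices start at 0
    ]


names = output_attribute_names("hist1", "Sepal.Length", -1, 1, 2)
print(names)
```

For the hist1 example this yields the two Sepal.Length column names shown in the output table.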


Attribute Metadata

The metadata of the output attributes is created with the following rules.

Context Rule

Attribute Name                                                                                Context Name  Description
<component_id>(<window_start>:<window_end>:<num_bins>:<bin_index>)_<original_attribute_name>  window_start  The value of window_start.
<component_id>(<window_start>:<window_end>:<num_bins>:<bin_index>)_<original_attribute_name>  window_end    The value of window_end.
<component_id>(<window_start>:<window_end>:<num_bins>:<bin_index>)_<original_attribute_name>  num_bins      The value of num_bins.
<component_id>(<window_start>:<window_end>:<num_bins>:<bin_index>)_<original_attribute_name>  bin_index     The index of the bin that the attribute represents.
<component_id>(<window_start>:<window_end>:<num_bins>:<bin_index>)_<original_attribute_name>  bin_edges     The values of the bin edges. The number of edges is num_bins + 1.

Derivation Rule

Each new attribute is derived from the corresponding attribute selected by the features parameter of the component.

Example

{
    "nodes": [
        {"aid": "_sid", "name": "_sid", "scale": "integer", "is_excluded": false,
         "cid": null, "cindex": 0, "values": null, "is_kept": false, "context": null},
        {"aid": "dl1[0]", "name": "Sepal.Length", "scale": "real", "is_excluded": false,
         "cid": "dl1", "cindex": 0, "values": null, "is_kept": false, "context": null},
        {"aid": "dl1[1]", "name": "Petal.Length", "scale": "real", "is_excluded": false,
         "cid": "dl1", "cindex": 1, "values": null, "is_kept": false, "context": null},
        {"aid": "hist1[0]", "name": "hist1(-1:1:2:0)_Sepal.Length", "scale": "integer", "is_excluded": false,
         "cid": "hist1", "cindex": 0, "values": null, "is_kept": false,
            "context": {"window_start": -1, "window_end": 1, "num_bins": 2, "bin_index": 0,
                        "bin_edges": [4.5999999999999996e+00, 4.8499999999999996e+00, 5.0999999999999996e+00]
            }
        },
        {"aid": "hist1[1]", "name": "hist1(-1:1:2:1)_Sepal.Length", "scale": "integer", "is_excluded": false,
         "cid": "hist1", "cindex": 1, "values": null, "is_kept": false,
            "context": {"window_start": -1, "window_end": 1, "num_bins": 2, "bin_index": 1,
                        "bin_edges": [4.5999999999999996e+00, 4.8499999999999996e+00, 5.0999999999999996e+00]
            }
        },
        {"aid": "hist1[2]", "name": "hist1(-1:1:2:0)_Petal.Length", "scale": "integer", "is_excluded": false,
         "cid": "hist1", "cindex": 2, "values": null, "is_kept": false,
            "context": {"window_start": -1, "window_end": 1, "num_bins": 2, "bin_index": 0,
                        "bin_edges": [1.0000000000000000e+00, 1.1499999999999999e+00, 1.3000000000000000e+00]
            }
        },
        {"aid": "hist1[3]", "name": "hist1(-1:1:2:1)_Petal.Length", "scale": "integer", "is_excluded": false,
         "cid": "hist1", "cindex": 3, "values": null, "is_kept": false,
            "context": {"window_start": -1, "window_end": 1, "num_bins": 2, "bin_index": 1,
                        "bin_edges": [1.0000000000000000e+00, 1.1499999999999999e+00, 1.3000000000000000e+00]
            }
        }
    ],
    "links": [
        {"source": "dl1[0]", "target": "hist1[0]"},
        {"source": "dl1[0]", "target": "hist1[1]"},
        {"source": "dl1[1]", "target": "hist1[2]"},
        {"source": "dl1[1]", "target": "hist1[3]"}
    ]
}

See also

Attribute metadata file format in Attribute Metadata File Specification
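As a reading aid for the example above, the following sketch shows how the context and the links might be combined to find the value range that each histogram attribute counts. The helper describe_bins is hypothetical, and the metadata is an abridged copy of the example with only the fields used here and with the edges written in short form.

```python
import json

# Abridged copy of the example metadata: only the fields used below are kept.
METADATA_JSON = """
{
  "nodes": [
    {"aid": "dl1[0]", "name": "Sepal.Length", "context": null},
    {"aid": "hist1[0]", "name": "hist1(-1:1:2:0)_Sepal.Length",
     "context": {"bin_index": 0, "bin_edges": [4.6, 4.85, 5.1]}},
    {"aid": "hist1[1]", "name": "hist1(-1:1:2:1)_Sepal.Length",
     "context": {"bin_index": 1, "bin_edges": [4.6, 4.85, 5.1]}}
  ],
  "links": [
    {"source": "dl1[0]", "target": "hist1[0]"},
    {"source": "dl1[0]", "target": "hist1[1]"}
  ]
}
"""


def describe_bins(metadata):
    """Map each histogram attribute name to (source name, lower edge, upper edge)."""
    nodes = {node["aid"]: node for node in metadata["nodes"]}
    result = {}
    for link in metadata["links"]:
        target = nodes[link["target"]]
        context = target["context"]
        if not context or "bin_edges" not in context:
            continue  # not an attribute with histogram context
        k = context["bin_index"]
        edges = context["bin_edges"]
        # Bin k lies between edge k and edge k + 1.
        result[target["name"]] = (nodes[link["source"]]["name"], edges[k], edges[k + 1])
    return result


bins = describe_bins(json.loads(METADATA_JSON))
for name, (source, lower, upper) in bins.items():
    print(f"{name}: counts {source} values between {lower} and {upper}")
```

The pairing of bin_index with bin_edges is what makes the num_bins + 1 edges sufficient to describe every bin.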


Model

The model of this component can be described by its fd_params.

fd_params          Type            Description
source_attr_names  list of string  A list of the attribute names from which the output attribute is derived.
params             dict            The keys of this dictionary are the same as the context of this component’s Attribute Metadata.

When loaded in the SAMPO API, the model is represented as a dict of its fd_params.

See also

Obtaining process results via ProcessResultLoader.

{'fd_params':
    [{'source_attr_names': ['Sepal.Length'], 'params': {'window_start': -1,
                                                        'window_end': 1,
                                                        'num_bins': 2,
                                                        'bin_index': 0,
                                                        'bin_edges': [4.5999999999999996e+00,
                                                                      4.8499999999999996e+00,
                                                                      5.0999999999999996e+00]}},
     {'source_attr_names': ['Sepal.Length'], 'params': {'window_start': -1,
                                                        'window_end': 1,
                                                        'num_bins': 2,
                                                        'bin_index': 1,
                                                        'bin_edges': [4.5999999999999996e+00,
                                                                      4.8499999999999996e+00,
                                                                      5.0999999999999996e+00]}},
     {'source_attr_names': ['Petal.Length'], 'params': {'window_start': -1,
                                                        'window_end': 1,
                                                        'num_bins': 2,
                                                        'bin_index': 0,
                                                        'bin_edges': [1.0000000000000000e+00,
                                                                      1.1499999999999999e+00,
                                                                      1.3000000000000000e+00]}},
     {'source_attr_names': ['Petal.Length'], 'params': {'window_start': -1,
                                                        'window_end': 1,
                                                        'num_bins': 2,
                                                        'bin_index': 1,
                                                        'bin_edges': [1.0000000000000000e+00,
                                                                      1.1499999999999999e+00,
                                                                      1.3000000000000000e+00]}}]}

Details

  • The window for the sample at <sample_index> is the following closed range of sample positions:

    [<sample_index> + window_start, <sample_index> + window_end]

  • This component calculates the bin edges by dividing the values of each feature, from minimum to maximum, into num_bins equally spaced bins.
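
The two rules above can be written compactly as follows. The symbols \(W_i\), \(e_k\), \(x_{\min}\) and \(x_{\max}\) are notation introduced here for illustration. Taking the minimum and maximum over the finite values only is an assumption, although it is consistent with the bin_edges shown in the Attribute Metadata example.

```latex
\[
  W_i = [\, i + \text{window\_start},\; i + \text{window\_end} \,],
  \qquad
  e_k = x_{\min} + k \cdot \frac{x_{\max} - x_{\min}}{\text{num\_bins}},
  \quad k = 0, 1, \dots, \text{num\_bins}
\]
\[
  \text{Sepal.Length: } x_{\min} = 4.6,\; x_{\max} = 5.1,\; \text{num\_bins} = 2
  \;\Rightarrow\;
  (e_0, e_1, e_2) = (4.6,\; 4.85,\; 5.1)
\]
\[
  \text{Petal.Length: } x_{\min} = 1.0,\; x_{\max} = 1.3,\; \text{num\_bins} = 2
  \;\Rightarrow\;
  (e_0, e_1, e_2) = (1.0,\; 1.15,\; 1.3)
\]
```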