====================================
LinearHSICFS Component Specification
====================================

.. contents:: Contents
    :local:

Overview
========

The **LinearHSICFS component** is a feature selector.
It calculates the Linear HSIC
(Hilbert-Schmidt Independence Criterion) score of each input feature
and outputs the features with the highest scores.
The scale of each input feature must be INTEGER or REAL.

**Example**:

* SPD:

  .. code-block:: python

      dl1 -> ts1 -> fs1

      ---

      components:
          dl1:
              component: DataLoader

          ts1:
              component: TimeshiftFDComponent
              features: scale == 'integer' or scale == 'real'
              shift: [["name == 'humidity'", [3, -2]],
                      ["name == 'temperature'", [-1]]]

          fs1:
              component: LinearHSICFSComponent
              features: all()
              target: name == 'price'
              max_num_output_features: 2

      global_settings:
          keep_attributes:
              - 'price'
          feature_exclude:
              - 'price'

* Input of the *fs1* component:

 +-------+-------------------+--------------------+--------------------+-----------+
 | _sid  |ts1(3)_humidity    |ts1(-2)_humidity    |ts1(-1)_temperature | price     |
 +=======+===================+====================+====================+===========+
 | 0     | 73                | NaN                | NaN                | 100       |
 +-------+-------------------+--------------------+--------------------+-----------+
 | 1     | 66                | NaN                | 22.3               | 110       |
 +-------+-------------------+--------------------+--------------------+-----------+
 | 2     | NaN               | 89                 | 21.8               |  90       |
 +-------+-------------------+--------------------+--------------------+-----------+
 | 3     | NaN               | 88                 | 22.8               | 100       |
 +-------+-------------------+--------------------+--------------------+-----------+
 | 4     | NaN               | 50                 | 23.4               | 120       |
 +-------+-------------------+--------------------+--------------------+-----------+

* Linear HSIC score of each input feature in *fs1*:

 +------------------+-------------------+--------------------+
 |ts1(3)_humidity   |ts1(-2)_humidity   | ts1(-1)_temperature|
 +==================+===================+====================+
 | 0.038            | 0.488             | 0.505              |
 +------------------+-------------------+--------------------+

* Output of the *fs1* component:

  Since the value of ``max_num_output_features`` is 2, *fs1* outputs the two
  highest-scoring features, ts1(-2)_humidity and ts1(-1)_temperature.

 +------+----------------------+-------------------------+
 | _sid | fs1_ts1(-2)_humidity | fs1_ts1(-1)_temperature |
 +======+======================+=========================+
 | 0    | NaN                  | NaN                     |
 +------+----------------------+-------------------------+
 | 1    | NaN                  | 22.3                    |
 +------+----------------------+-------------------------+
 | 2    | 89                   | 21.8                    |
 +------+----------------------+-------------------------+
 | 3    | 88                   | 22.8                    |
 +------+----------------------+-------------------------+
 | 4    | 50                   | 23.4                    |
 +------+----------------------+-------------------------+
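
The selection step in this example can be sketched in a few lines of Python.
This is only an illustration, not the component's implementation: the function
name ``select_top_features`` is hypothetical, the scores are copied from the
table above, and keeping the selected features in input order is an assumption
based on the output table. How ties are broken is not specified here.

.. code-block:: python

    def select_top_features(scores, max_num_output_features, component_id):
        """Return output attribute names for the highest-scoring features.

        ``scores`` maps input attribute names to Linear HSIC scores,
        listed in input order.
        """
        # Rank the names by score (descending) and keep the k best.
        ranked = sorted(scores, key=scores.get, reverse=True)
        kept = set(ranked[:max_num_output_features])
        # Emit in the original input order (assumption), using the
        # <component_id>_<original_attribute_name> naming rule.
        return [f"{component_id}_{name}" for name in scores if name in kept]


    scores = {
        "ts1(3)_humidity": 0.038,
        "ts1(-2)_humidity": 0.488,
        "ts1(-1)_temperature": 0.505,
    }
    selected = select_top_features(
        scores, max_num_output_features=2, component_id="fs1"
    )
    print(selected)

With the scores above, ts1(3)_humidity is dropped because it has the lowest score.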

This component has no component-specific external formats.

.. seealso::

    Component-common external format files in :ref:`convert_process`

|

Parameters
==========
Here are the component-specific parameters for the **LinearHSICFS component**.

SPD
---

The following parameter is for the "components" section of the SPD.

.. list-table::
  :header-rows: 1
  :widths: 10, 5, 15, 10, 50

  * - Parameter Name
    - Type
    - Domain
    - Default Value
    - Description
  * - max_num_output_features [1]_
    - int
    - [1, inf)
    - --
    - Specifies the maximum number of output features.

.. [1] Required parameter

The ``target`` is also required. It is used to calculate the Linear HSIC
score of each input feature.
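
For readers unfamiliar with the criterion, the following is a minimal sketch of
the textbook (biased) empirical HSIC estimator with linear kernels, computed
between one feature and the target using NumPy. It is not taken from this
component's source: the function name ``linear_hsic`` is hypothetical, dropping
samples that contain NaN is an assumption, and the component may apply a
different normalization, so this sketch is not expected to reproduce the scores
in the example above.

.. code-block:: python

    import numpy as np


    def linear_hsic(x, y):
        """Biased empirical HSIC between two 1-D samples, linear kernels."""
        x = np.asarray(x, dtype=float)
        y = np.asarray(y, dtype=float)
        # Assumption: ignore samples where either value is missing.
        valid = ~(np.isnan(x) | np.isnan(y))
        x, y = x[valid], y[valid]
        n = x.size
        if n < 2:
            return float("nan")
        K = np.outer(x, x)                    # linear kernel on the feature
        L = np.outer(y, y)                    # linear kernel on the target
        H = np.eye(n) - np.ones((n, n)) / n   # centering matrix
        return float(np.trace(K @ H @ L @ H) / (n - 1) ** 2)


    score = linear_hsic([1.0, 2.0, 3.0, 4.0], [2.0, 4.0, 6.0, 8.0])

For one-dimensional inputs this quantity reduces to the squared centered inner
product of the two samples divided by ``(n - 1) ** 2``, so it is zero when the
feature and the target are linearly uncorrelated.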

Output Attributes
=================
**LinearHSICFS component** generates the following attributes:

.. list-table::
  :header-rows: 1
  :widths: 4,1,3

  * - Attribute Name
    - Scale
    - Description
  * - *<component_id>*\ _\ *<original_attribute_name>*
    - Same as the scale of the original attribute.
    - Same as the value of the original attribute.

The resulting objects or converted output files contain these output attributes and
other attributes specified in ``keep_attributes`` of the SPD.

.. seealso::

    Obtaining process results via `ProcessResultLoader <../../api/process_result_loader.html>`_.

|

Attribute Metadata
==================
The metadata of the output attributes is created with the following rules.

Context Rule
------------
.. list-table::
  :header-rows: 1
  :widths: 4,1,3

  * - Attribute Name
    - Context Name
    - Description
  * - *<component_id>*\ _\ *<original_attribute_name>*
    - score
    - The Linear HSIC score calculated for the original attribute.


Derivation Rule
---------------

Each new attribute is derived from the corresponding input attribute
that was selected according to its Linear HSIC score.


Example
-------
.. code-block:: javascript

    {
        "nodes": [
            {"aid": "_sid", "name": "_sid", "scale": "integer", ...},
            ...,
            {"aid": "dl1[0]", "name": "temperature", "scale": "real",    "is_excluded": false, ...},
            {"aid": "dl1[1]", "name": "humidity",    "scale": "integer", "is_excluded": false, ...},
            {"aid": "dl1[2]", "name": "price",       "scale": "real",    "is_excluded": true,  ...},
            ...,
            {"aid": "ts1[0]", "name": "ts1(3)_humidity",     "scale": "integer", "cid": "ts1", "cindex": 0, "context": {"shift": 3}, ...},
            {"aid": "ts1[1]", "name": "ts1(-2)_humidity",    "scale": "integer", "cid": "ts1", "cindex": 1, "context": {"shift": -2}, ...},
            {"aid": "ts1[2]", "name": "ts1(-1)_temperature", "scale": "real",    "cid": "ts1", "cindex": 2, "context": {"shift": -1}, ...},
            ...,
            {"aid": "fs1[0]", "name": "fs1_ts1(-2)_humidity", "scale": "integer", "cid": "fs1", "cindex": 0,
                             "context": {"score": 4.8804398568390484e-01}},
            {"aid": "fs1[1]", "name": "fs1_ts1(-1)_temperature", "scale": "real", "cid": "fs1", "cindex": 1,
                             "context": {"score": 5.0526028145921520e-01}}
        ],
        "links": [
            {"source": "dl1[1]", "target": "ts1[0]"},
            {"source": "dl1[1]", "target": "ts1[1]"},
            {"source": "dl1[0]", "target": "ts1[2]"},
            ...,
            {"source": "ts1[2]", "target": "fs1[1]"},
            {"source": "ts1[1]", "target": "fs1[0]"}
        ]
    }
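
As an illustration of how this metadata might be consumed, the sketch below
reads the ``score`` context and the source attribute of each *fs1* output using
only the standard ``json`` module. The inline document is a trimmed-down copy
of the example above (the elided entries are omitted), and the variable names
are hypothetical. How the metadata file is obtained is not covered here.

.. code-block:: python

    import json

    # Trimmed-down copy of the example above; real metadata carries more keys.
    metadata_text = """
    {
        "nodes": [
            {"aid": "ts1[1]", "name": "ts1(-2)_humidity", "cid": "ts1",
             "context": {"shift": -2}},
            {"aid": "fs1[0]", "name": "fs1_ts1(-2)_humidity", "cid": "fs1",
             "context": {"score": 4.8804398568390484e-01}},
            {"aid": "fs1[1]", "name": "fs1_ts1(-1)_temperature", "cid": "fs1",
             "context": {"score": 5.0526028145921520e-01}}
        ],
        "links": [
            {"source": "ts1[2]", "target": "fs1[1]"},
            {"source": "ts1[1]", "target": "fs1[0]"}
        ]
    }
    """
    metadata = json.loads(metadata_text)

    # Linear HSIC score of every attribute generated by the fs1 component.
    scores = {
        node["name"]: node["context"]["score"]
        for node in metadata["nodes"]
        if node.get("cid") == "fs1"
    }

    # Map each fs1 output attribute id to the attribute it was derived from.
    sources = {
        link["target"]: link["source"]
        for link in metadata["links"]
        if link["target"].startswith("fs1[")
    }

    print(scores)
    print(sources)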




.. seealso::

    Attribute metadata file format in :ref:`Attribute Metadata File Specification <attribute-metadata>`

|

Model
=====
The model of this component can be described by its fd_params.

.. list-table::
  :header-rows: 1
  :widths: 3,1,3

  * - fd_params
    - Type
    - Description
  * - source_attr_names
    - list of string
    - A list of the attribute names from which the output attribute is derived.
  * - params
    - dict
    - The keys of this dictionary are the same as the context of this component's Attribute Metadata.

These attributes are included in the component output data. They can be loaded
via the SAMPO API or saved as data.csv after executing :ref:`convert_process`.

.. seealso::

    Obtaining process results via `ProcessResultLoader <../../api/process_result_loader.html>`_.

::

    {'fd_params':
        [{'source_attr_names': ['humidity'], 'params': {}},
         {'source_attr_names': ['temperature'], 'params': {}}]}

Details
=======

Nothing.
