==================================
BinarizeFD Component Specification
==================================

.. contents:: Contents
    :local:

Overview
========
**BinarizeFD component** is a feature descriptor.
It binarizes each selected input attribute against one or more thresholds and outputs the resulting 0/1 attributes.
The scale of the input attributes must be INTEGER or REAL.

**Example**:

* SPD:

  .. code-block:: python

    dl1 -> bin

    ---

    components:
        dl1:
            component: DataLoader

        bin:
            component: BinarizeFDComponent
            features: scale == 'real' or scale == 'integer'
            binarize_param: [["re_match('.*Length', name)",
                              [{'threshold': 5.0}, {'threshold': 1.4}]]]

* Input of the component:

 +--------+--------------+--------------+
 |   _sid | Sepal.Length | Petal.Length |
 +========+==============+==============+
 |   0    | 5.1          | 1.4          |
 +--------+--------------+--------------+
 |   1    | inf          | 1.4          |
 +--------+--------------+--------------+
 |   2    | 4.7          | 1.3          |
 +--------+--------------+--------------+
 |   3    | 4.6          | -inf         |
 +--------+--------------+--------------+
 |   4    | NaN          | NaN          |
 +--------+--------------+--------------+

* Output of the component:

 +--------+-----------------------+-----------------------+-----------------------+-----------------------+
 |   _sid | bin(5.0)_Sepal.Length | bin(1.4)_Sepal.Length | bin(5.0)_Petal.Length | bin(1.4)_Petal.Length |
 +========+=======================+=======================+=======================+=======================+
 |   0    | 1                     | 1                     | 0                     | 1                     |
 +--------+-----------------------+-----------------------+-----------------------+-----------------------+
 |   1    | 1                     | 1                     | 0                     | 1                     |
 +--------+-----------------------+-----------------------+-----------------------+-----------------------+
 |   2    | 0                     | 1                     | 0                     | 0                     |
 +--------+-----------------------+-----------------------+-----------------------+-----------------------+
 |   3    | 0                     | 1                     | 0                     | 0                     |
 +--------+-----------------------+-----------------------+-----------------------+-----------------------+
 |   4    | NaN                   | NaN                   | NaN                   | NaN                   |
 +--------+-----------------------+-----------------------+-----------------------+-----------------------+

This component has no component-specific external formats.

.. seealso::

    Component-common external format files in :ref:`convert_process`

|

Parameters
==========
Here are the component-specific parameters for the **BinarizeFD component**.

SPD
---

The following parameter is for the "components" section of the SPD.

.. list-table::
   :header-rows: 1
   :widths: 1, 25, 1, 1, 25

   * - Parameter Name
     - Type
     - Domain
     - Default Value
     - Description
   * - binarize_param [1]_
     - | [[**feature_expr**, [{**params**}]]]
     - --
     - --
     - Specifies the target features and the binarization parameters.

.. [1] Required parameter

Details of binarize_param
-------------------------

* **feature_expr** follows the format described in *Attribute Selection* of *SPD (SAMPO Process Description) File Specification*::

      're_match(".*Length", name)'

* **params** is described as a dict::

      {'threshold': 5.0}

 .. list-table::
    :header-rows: 1
    :widths: 15, 5, 15, 10, 30

    * - Key of params
      - Type
      - Domain
      - Default Value
      - Description
    * - threshold
      - float
      - (-inf, inf)
      - 0.0
      - The threshold used to binarize the input attribute.
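
As an illustration of the structure described above, the following Python sketch walks a ``binarize_param`` value and resolves the threshold of each params dict, applying the documented default of ``0.0`` when the key is absent. The helper ``resolve_thresholds`` is hypothetical and not part of SAMPO, and the assumption that an empty params dict is accepted and falls back to the default is not confirmed by this specification.

.. code-block:: python

    DEFAULT_THRESHOLD = 0.0  # default value documented for 'threshold'

    # Same shape as in the SPD: [[feature_expr, [params, ...]], ...]
    binarize_param = [
        ["re_match('.*Length', name)", [{"threshold": 5.0}, {"threshold": 1.4}]],
        ["scale == 'real'", [{}]],  # assumption: a missing key means the default
    ]


    def resolve_thresholds(binarize_param):
        """Return (feature_expr, [threshold, ...]) pairs with defaults applied."""
        resolved = []
        for feature_expr, params_list in binarize_param:
            thresholds = [
                float(params.get("threshold", DEFAULT_THRESHOLD))
                for params in params_list
            ]
            resolved.append((feature_expr, thresholds))
        return resolved


    for feature_expr, thresholds in resolve_thresholds(binarize_param):
        print(feature_expr, thresholds)

Each params dict yields one output attribute per selected feature, so the first entry here corresponds to two output attributes for every feature it matches.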

|

Output Attributes
=================
**BinarizeFD component** generates the following attribute:

.. list-table::
  :header-rows: 1
  :widths: 3,1,3

  * - Attribute Name
    - Scale
    - Description
  * - *<component_id>*\ (<threshold>)\ _\ *<original_attribute_name>*
    - INTEGER
    - The original value binarized by the threshold.

These attributes are included in the component output data. They can be loaded
via the SAMPO API or saved as data.csv after executing :ref:`convert_process`.
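
The naming pattern in the table above can be sketched in Python as follows. The function ``output_attribute_name`` is a hypothetical helper, not a SAMPO API, and it assumes the threshold is rendered with Python's default float formatting; this matches the ``5.0`` and ``1.4`` cases in the Overview example, but how the component formats other values is not stated in this specification.

.. code-block:: python

    def output_attribute_name(component_id, threshold, original_name):
        """Build '<component_id>(<threshold>)_<original_attribute_name>'."""
        return f"{component_id}({threshold})_{original_name}"


    # Reproduce the output column names of the Overview example.
    names = [
        output_attribute_name("bin", threshold, name)
        for name in ("Sepal.Length", "Petal.Length")
        for threshold in (5.0, 1.4)
    ]
    print(names)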

.. seealso::

    Obtaining process results via `ProcessResultLoader <../../api/process_result_loader.html>`_.

|

Attribute Metadata
==================
The metadata of the output attributes is created with the following rules.

Context Rule
------------
.. list-table::
  :header-rows: 1
  :widths: 3,1,3

  * - Attribute Name
    - Context Name
    - Description
  * - *<component_id>*\ (<threshold>)\ _\ *<original_attribute_name>*
    - threshold
    - The value of ``threshold`` used to generate the attribute.

Derivation Rule
---------------
Each new attribute is derived from the corresponding attribute selected by the ``features`` parameter of the component.

Example
-------
.. code-block:: javascript

    {
        "nodes": [
            {"aid": "_sid", "name": "_sid", ... },
            {"aid": "dl1[0]", "name": "Sepal.Length", ... },
            {"aid": "dl1[1]", "name": "Petal.Length", ... },
            {"aid": "bin[0]", "name": "bin(5.0)_Sepal.Length", "scale": "integer",
             "is_excluded": false, "cid": "bin", "cindex": 0, "values": null, "is_kept": false,
             "context": {"threshold": 5.0000000000000000e+00}},
            {"aid": "bin[1]", "name": "bin(5.0)_Petal.Length", "scale": "integer",
             "is_excluded": false, "cid": "bin", "cindex": 1, "values": null, "is_kept": false,
             "context": {"threshold": 5.0000000000000000e+00}},
            {"aid": "bin[2]", "name": "bin(1.4)_Sepal.Length", "scale": "integer",
             "is_excluded": false, "cid": "bin", "cindex": 2, "values": null, "is_kept": false,
             "context": {"threshold": 1.40000000000000000e+00}},
            {"aid": "bin[3]", "name": "bin(1.4)_Petal.Length", "scale": "integer",
             "is_excluded": false, "cid": "bin", "cindex": 3, "values": null, "is_kept": false,
             "context": {"threshold": 1.40000000000000000e+00}},
        ],
        "links": [
            {"source": "dl1[0]", "target": "bin[0]"},
            {"source": "dl1[1]", "target": "bin[1]"},
            {"source": "dl1[0]", "target": "bin[2]"},
            {"source": "dl1[1]", "target": "bin[3]"}
        ]
    }

.. seealso::

    Attribute metadata file format in :ref:`Attribute Metadata File Specification <attribute-metadata>`

|

Model
=====
The model of this component can be described by its fd_params.

.. list-table::
  :header-rows: 1
  :widths: 3,1,3

  * - fd_params
    - Type
    - Description
  * - source_attr_names
    - list of string
    - A list of attribute names from which the output attribute is derived.
  * - params
    - dict
    - The keys of this dictionary are the same as the context of this component's Attribute Metadata.

When loaded in the SAMPO API, the model is represented as a dict of its fd_params.

.. seealso::

    Obtaining process results via `ProcessResultLoader <../../api/process_result_loader.html>`_.

::

    {'fd_params':
        [{'source_attr_names': ['Sepal.Length'], 'params': {'threshold': 5.0000000000000000e+00}},
         {'source_attr_names': ['Petal.Length'], 'params': {'threshold': 5.0000000000000000e+00}},
         {'source_attr_names': ['Sepal.Length'], 'params': {'threshold': 1.40000000000000000e+00}},
         {'source_attr_names': ['Petal.Length'], 'params': {'threshold': 1.40000000000000000e+00}}]}
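
A minimal sketch of how such a model dict might be inspected, grouping the thresholds by source attribute. The dict is typed in literally here to mirror the example above; how it is obtained from the SAMPO API is not shown, and ``thresholds_by_source`` is a hypothetical helper.

.. code-block:: python

    # Model dict mirroring the example above (thresholds written as plain floats).
    model = {
        "fd_params": [
            {"source_attr_names": ["Sepal.Length"], "params": {"threshold": 5.0}},
            {"source_attr_names": ["Petal.Length"], "params": {"threshold": 5.0}},
            {"source_attr_names": ["Sepal.Length"], "params": {"threshold": 1.4}},
            {"source_attr_names": ["Petal.Length"], "params": {"threshold": 1.4}},
        ]
    }


    def thresholds_by_source(model):
        """Map each source attribute name to the thresholds applied to it."""
        grouped = {}
        for fd_param in model["fd_params"]:
            threshold = fd_param["params"]["threshold"]
            for source_name in fd_param["source_attr_names"]:
                grouped.setdefault(source_name, []).append(threshold)
        return grouped


    print(thresholds_by_source(model))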

|

Details
=======
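* In the running phase, the output values are calculated from the input values and the threshold as follows.

    * :math:`\text{original value} \geq` ``threshold``:
        **output value** = ``1``

    * :math:`\text{original value} <` ``threshold``:
        **output value** = ``0``

    * original value is ``NaN``:
        **output value** = ``NaN`` (as in the example in the Overview)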
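
The rule above can be sketched in Python as follows. This is an illustration rather than the component's actual implementation: ``binarize`` is a hypothetical function, and the NaN handling is an assumption drawn from the example in the Overview, where NaN inputs produce NaN outputs.

.. code-block:: python

    import math


    def binarize(value, threshold=0.0):
        """Return 1 if value >= threshold, 0 if value < threshold, NaN for NaN."""
        if math.isnan(value):
            return math.nan  # assumption: NaN passes through, as in the Overview
        return 1 if value >= threshold else 0


    # Sepal.Length column of the Overview example, including inf and NaN.
    sepal_length = [5.1, math.inf, 4.7, 4.6, math.nan]
    print([binarize(value, 5.0) for value in sepal_length])

With a threshold of ``5.0`` this reproduces the ``bin(5.0)_Sepal.Length`` column of the Overview example (1, 1, 0, 0, NaN). Infinite inputs need no special case because they compare against the threshold like any other value.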
