================
sampo_ps convert
================

.. contents:: Contents
    :local:

Overview
========
.. warning::

   The **sampo_ps convert** command is deprecated.

**sampo_ps convert** converts data stored in a ProcessStore from the SAMPO-internal format to a human-readable external format.

The convertible data and their corresponding external formats are listed in the table below:

.. list-table::
    :header-rows: 1
    :widths: 1, 2

    * - Convertible Data
      - External Format
    * - SPD (SAMPO Process Description)
      - Same as SPD input file format.
        See the :ref:`SPD File Specification <spd>`.
    * - SRC (SAMPO Run Configuration)
      - Same as SRC input file format.
        See the :ref:`SRC File Specification <src>`.
    * - ASD (Attributes Schema Description)
      - Same as ASD input file format.
        See the :ref:`ASD File Specification <asd>`.
        The attributes in the ASD are component-specific. See each component specification.
    * - Attribute metadata
      - See the :ref:`Attribute Metadata File Specification <attribute-metadata>`.
    * - Selected attributes
      - See the :ref:`Selected Attributes File Specification <selected-attributes>`.
    * - Model
      - A component-specific format. See each component specification.
    * - Component output data
      - Same as SAMPO CSV input file format.
        See the :doc:`SAMPO CSV File Specification <../../../input/csv>`.
        The attributes in the output data are component-specific. See each component specification.
    * - Prediction result evaluation
      - A component-specific format. See each component specification.

|

Synopsis
========
See the **sampo_ps convert** command help::

    $ sampo_ps convert --help

|

Examples
========
You can limit the conversion to a specific process or component, or convert every convertible file in the ProcessStore.

* Converting data in a process.

  All the convertible files in the process **my_process_a** are output.

  * Command::

      $ sampo_ps convert -s file:///var/process_store_storage -l "my_process_a/*" -o output_dir/

  * Output::

      output_dir
      └── my_process_a
          ├── attr_metadata
          │   └── attr_metadata.json
          ├── components
          │   ├── dl1
          │   │   └── component_output_data
          │   │       ├── data.asd
          │   │       └── data.csv
          │   └── fabreg1
          │       ├── comp_output_data
          │       │   └── rg_predict_result.csv
          │       ├── comp_output_evaluation
          │       │   └── comp_output_evaluation.csv
          │       ├── model
          │       │   ├── fabhmerg_info.csv
          │       │   ├── gate_tree.json
          │       │   └── prediction_formulas.csv
          │       └── selected_attrs
          │           └── selected_attrs.json
          ├── spd
          │   └── my_process_a.spd
          └── src
              └── my_process_a_predict.src

|

* Converting data of a specific component in a process.

  All the convertible files in the component **fabreg1** in the process **my_process_a** are output along with attr_metadata, spd, and src.

  * Command::

      $ sampo_ps convert -s file:///var/process_store_storage -l "my_process_a/fabreg1" -o output_dir/

  * Output::

      output_dir
      └── my_process_a
          ├── attr_metadata
          │   └── attr_metadata.json
          ├── components
          │   └── fabreg1
          │       ├── comp_output_data
          │       │   └── rg_predict_result.csv
          │       ├── comp_output_evaluation
          │       │   └── comp_output_evaluation.csv
          │       ├── model
          │       │   ├── fabhmerg_info.csv
          │       │   ├── gate_tree.json
          │       │   └── prediction_formulas.csv
          │       └── selected_attrs
          │           └── selected_attrs.json
          ├── spd
          │   └── my_process_a.spd
          └── src
              └── my_process_a_predict.src

|

* Converting all the convertible files in a ProcessStore.

  All the convertible files in the ProcessStore are output.

  * Command::

      $ sampo_ps convert -s file:///var/process_store_storage -o output_dir/

  * Output::

      output_dir
      ├── my_process_a
      │   ├── attr_metadata
      │   │   └── attr_metadata.json
      │   ├── components
      │   │   ├── dl1
      │   │   │   └── ...
      │   │   └── fabreg1
      │   │       ├── ...
      │   │       .
      │   │       .
      │   │       .
      │   ├── spd
      │   │   └── my_process_a.spd
      │   └── src
      │       └── my_process_a_predict.src
      ├── my_process_b
      │   ├── ...
      │   .
      │   .
      │   .
      ├── my_process_c
      │   ├── ...
      .   .
      .   .
      .   .
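
The hierarchical layout shown in the outputs above can also be inspected programmatically. The following is a minimal sketch, assuming the ``<process>/components/<component>/...`` layout from the examples. The helper name ``group_converted_files`` is hypothetical and not part of SAMPO.

.. code-block:: python

    from collections import defaultdict
    from pathlib import Path


    def group_converted_files(output_dir):
        """Group converted file names by (process, component).

        Files outside a ``components`` directory (attr_metadata, spd, src)
        are grouped under the component key ``None``.
        """
        root = Path(output_dir)
        grouped = defaultdict(list)
        for path in sorted(root.rglob("*")):
            if not path.is_file():
                continue
            parts = path.relative_to(root).parts
            if len(parts) < 2:
                # A file directly under output_dir (for example, --flat output)
                # does not follow the hierarchical layout; skip it.
                continue
            process = parts[0]
            # Assumed shape: (process, "components", component, kind, file)
            if len(parts) > 3 and parts[1] == "components":
                component = parts[2]
            else:
                component = None
            grouped[(process, component)].append(path.name)
        return dict(grouped)

For the first example above, ``group_converted_files("output_dir")`` would contain the keys ``("my_process_a", None)``, ``("my_process_a", "dl1")``, and ``("my_process_a", "fabreg1")``.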

|

* Converting process data in a flat directory structure.

  The converted files are output directly beneath **output_dir**.

  * Command::

      $ sampo_ps convert -s "file:///var/process_store_storage" -o output_dir/ --flat

  * Output::

      $ tree output_dir
      output_dir/
      ├── my_process_a_learn_attr_metadata.json
      ├── my_process_a_learn_dl1_data.asd
      ├── my_process_a_learn_dl1_data.csv
      ├── my_process_a_learn_fabreg1_comp_output_evaluation.csv
      ├── my_process_a_learn_fabreg1_fabhmerg_info.csv
      ├── my_process_a_learn_fabreg1_gate_tree.json
      ├── my_process_a_learn_fabreg1_prediction_formulas.csv
      ├── my_process_a_learn_fabreg1_rg_predict_result.csv
      ├── my_process_a_learn_fabreg1_selected_attrs.json
      ├── my_process_a_learn_my_process_a.spd
      ├── my_process_a_learn_my_process_a_predict.src
      ├── my_process_b_learn_attr_metadata.json
      ├── ...
      ├── my_process_c_predict_attr_metadata.json
      ├── ...
      .
      .
      .

|

Output Format
=============

Attribute Metadata File Format
------------------------------
* The Attribute Metadata File describes the metadata of the attributes in a process and their derivation relationships.
* Attribute metadata is represented as a DAG (Directed Acyclic Graph) consisting of nodes and links.

  * The nodes section holds the information of each attribute.
  * The links section holds the derivation relationships between attributes.

* The file is in JSON format.

**Example**::

    {
        "nodes": [
            {"aid": "dl1[0]", "name": "A", "scale": "integer", "is_excluded": false, "cid": "dl1",
             "cindex": 0, "values": null, "is_kept": true, "context": null},
            {"aid": "dl1[1]", "name": "B", "scale": "nominal", "is_excluded": true, "cid": "dl1",
             "cindex": 1, "values": ["A", "B", "O"], "is_kept": true, "context": null},
            {"aid": "rg1[0]", "name": "actual", "scale": "real", "is_excluded": false, "cid": "rg1",
             "cindex": 0, "values": null, "is_kept": false, "context": {"field_path": ["regression", "actual"]}}
        ],
        "links": [
            {"source": "dl1[0]", "target": "rg1[0]"}
        ]
    }

Nodes Section
^^^^^^^^^^^^^
The nodes section holds the information of all attributes generated in a process.

The properties of each attribute are defined as follows:

.. list-table::
    :header-rows: 1
    :widths: 10, 50

    * - Property
      - Description
    * - aid
      - Attribute ID.
    * - name
      - Attribute name.
    * - scale
      - Scale of the attribute.
    * - is_excluded
      - Whether the attribute is excluded as a feature or not.
    * - cid
      - ID of the component by which the attribute was generated.
    * - cindex
      - Index of the attribute in the component by which the attribute was generated.
    * - values
      - Domain of a NOMINAL attribute (**null** if the scale is not NOMINAL).
    * - is_kept
      - Whether the attribute is kept even after every component has run.
    * - context
      - Context information of the attribute.
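
As an illustration of how these properties can be read, the sketch below collects the domain of each nominal attribute, relying on ``values`` being **null** for every other scale. It assumes data in the shape of the example above; the function name ``nominal_domains`` is hypothetical.

.. code-block:: python

    import json

    # Sample data in the shape of the example above (abbreviated).
    SAMPLE = """
    {
        "nodes": [
            {"aid": "dl1[0]", "name": "A", "scale": "integer", "is_excluded": false,
             "cid": "dl1", "cindex": 0, "values": null, "is_kept": true, "context": null},
            {"aid": "dl1[1]", "name": "B", "scale": "nominal", "is_excluded": true,
             "cid": "dl1", "cindex": 1, "values": ["A", "B", "O"], "is_kept": true, "context": null}
        ],
        "links": []
    }
    """


    def nominal_domains(metadata):
        """Map the aid of each attribute that has a value domain to that domain."""
        return {
            node["aid"]: node["values"]
            for node in metadata["nodes"]
            if node["values"] is not None
        }


    metadata = json.loads(SAMPLE)
    print(nominal_domains(metadata))  # prints {'dl1[1]': ['A', 'B', 'O']}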

Links Section
^^^^^^^^^^^^^
The links section represents the derivation relationships between attributes::

    "links": [
        {"source": "dl1[0]", "target": "rg1[0]"}
    ]

In the example above, the links section indicates that the attribute **rg1[0]** was derived from the attribute **dl1[0]**.
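
Because the links form a DAG, the full derivation history of an attribute can be recovered by following links backwards. The sketch below is one possible way to do so, assuming parsed metadata in the shape shown above; the function names ``derivation_sources`` and ``ancestors`` are hypothetical.

.. code-block:: python

    from collections import defaultdict


    def derivation_sources(metadata):
        """Map each derived attribute ID to the IDs it was directly derived from."""
        sources = defaultdict(list)
        for link in metadata["links"]:
            sources[link["target"]].append(link["source"])
        return dict(sources)


    def ancestors(metadata, aid):
        """Return every attribute ID that ``aid`` was derived from, directly or indirectly."""
        sources = derivation_sources(metadata)
        seen = set()
        stack = [aid]
        while stack:
            for src in sources.get(stack.pop(), []):
                if src not in seen:
                    seen.add(src)
                    stack.append(src)
        return seen


    metadata = {"links": [{"source": "dl1[0]", "target": "rg1[0]"}]}
    print(ancestors(metadata, "rg1[0]"))  # prints {'dl1[0]'}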

|

.. _selected-attributes:

Selected Attributes File Format
-------------------------------
* The Selected Attributes File contains the information of the attributes that a learning component selected.
* The file is in JSON format.

**Example**::

    {
        "selected_features": [
            {"aid": "dl1[0]", "name": "A", "scale": "integer", "is_excluded": false,
             "cid": "dl1", "cindex": 0, "values": null, "is_kept": false, "context": null},
            {"aid": "dl1[1]", "name": "B", "scale": "real", "is_excluded": false,
             "cid": "dl1", "cindex": 1, "values": null, "is_kept": false, "context": null},
            {"aid": "dl1[2]", "name": "C", "scale": "real", "is_excluded": false,
             "cid": "dl1", "cindex": 2, "values": null, "is_kept": false, "context": null}
        ],
        "selected_targets": [
            {"aid": "dl1[3]", "name": "Z", "scale": "integer", "is_excluded": true,
             "cid": "dl1", "cindex": 3, "values": null, "is_kept": true, "context": null}
        ]
    }

Selected Features Section
^^^^^^^^^^^^^^^^^^^^^^^^^
The Selected Features section describes the attributes selected as features.

The properties of each attribute are the same as those defined in the Nodes Section of the Attribute Metadata File Format.

Selected Targets Section
^^^^^^^^^^^^^^^^^^^^^^^^
The Selected Targets section describes the attributes selected as targets.

The properties of each attribute are the same as those defined in the Nodes Section of the Attribute Metadata File Format.
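
As a small illustration, the sketch below extracts the names of the selected features and targets from parsed Selected Attributes data. It assumes the structure shown in the example above; the function name ``selected_names`` is hypothetical, and the sample objects are abbreviated to the properties the sketch uses.

.. code-block:: python

    import json

    # Abbreviated sample; real files carry every property listed in the Nodes Section.
    SAMPLE = """
    {
        "selected_features": [
            {"aid": "dl1[0]", "name": "A"},
            {"aid": "dl1[1]", "name": "B"},
            {"aid": "dl1[2]", "name": "C"}
        ],
        "selected_targets": [
            {"aid": "dl1[3]", "name": "Z"}
        ]
    }
    """


    def selected_names(selected):
        """Return (feature names, target names) from parsed Selected Attributes data."""
        features = [attr["name"] for attr in selected["selected_features"]]
        targets = [attr["name"] for attr in selected["selected_targets"]]
        return features, targets


    features, targets = selected_names(json.loads(SAMPLE))
    print(features, targets)  # prints ['A', 'B', 'C'] ['Z']

To apply the same function to a converted file, read it with ``json.load`` from a path such as ``output_dir/my_process_a/components/fabreg1/selected_attrs/selected_attrs.json``, as in the example outputs above.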
