sampo.api.process_store

create

sampo.api.process_store.create(url)

Creates a process store.

Parameters
url: str

Process store URL.

Raises
ValidationError
  • If url is not str.

Examples

>>> from sampo.api import process_store
>>> process_store.create('file:///var/process_store_storage')

open_process

sampo.api.process_store.open_process(url, process_name)

Opens a process in the process store and returns a ProcessResultLoader object.

Parameters
url: str

Process store URL.

process_name: str or ProcessKey

A process name or a ProcessKey object.

Returns
prl: ProcessResultLoader

A ProcessResultLoader object.

Raises
ValidationError
  • If url is not str.

  • If process_name is neither string nor ProcessKey.

Examples

>>> from sampo.api import process_store
>>> with process_store.open_process('tmpstore', 'fabhmerg_predict') as prl:
...     output_df = prl.load_comp_output('fabrg1')

list_process_metadata

sampo.api.process_store.list_process_metadata(url, all=False)

Lists the process metadata of the processes in a process store as a pandas.DataFrame.

Parameters
url: str

Process store URL.

all: bool

If True, lists all processes, including in-progress processes and previous versions of processes in the process store.

Returns
list_df: pandas.DataFrame

DataFrame that lists the processes from the process store and their metadata.

Columns:

  • process name <dtype: object>
    • Name of the process.

  • version <dtype: object>
    • Version of the process.

  • started at <dtype: datetime64[ns]>
    • Date and time that the process started.

  • running time <dtype: timedelta64[ns]>
    • Time that the process has taken to run. If the status is In progress, this column is NaT.

  • status <dtype: object>
    • There are three statuses:
      • Succeeded : Process has finished successfully.

      • Failed : Process has terminated with a failure.

      • In progress : Process is in execution.

Raises
ValidationError
  • If url is not str.

  • If all is not bool.

Examples

Example1: Listing current versions of processes in the process store.

In [1]:
from sampo.api import process_store
pstore_url = './pstore'
process_store.list_process_metadata(pstore_url)
Out[1]:
process name version started at running time status
0 fabhmerg_learn a2ec5777-59e9-48b5-9ca3-8403cb642692 2018-06-05 16:02:39.371195 00:00:00.779124 Succeeded
1 fabhmerg_predict 4eea84be-7b72-47af-8152-6e1f9974b21c 2018-06-05 16:02:40.164337 00:00:00.302370 Succeeded

Example2: Listing current and older versions of processes and in-progress processes.

In [2]:
from sampo.api import process_store
pstore_url = './pstore'
process_store.list_process_metadata(pstore_url, True)
Out[2]:
process name version started at running time status
0 fabhmerg_learn a2ec5777-59e9-48b5-9ca3-8403cb642692 2018-06-05 16:02:39.371195 00:00:00.779124 Succeeded
1 fabhmerg_learn 8e7dcdca-783c-47b8-b480-39e2a784d8c6 2018-06-05 15:46:35.923370 00:00:00.865393 Succeeded
2 fabhmerg_predict 4eea84be-7b72-47af-8152-6e1f9974b21c 2018-06-05 16:02:40.164337 00:00:00.302370 Succeeded
3 fabhmerg_predict 3081f6a4-ce08-4c4d-a72d-031513eb47df 2018-06-05 15:46:36.802132 NaT In progress

list_comp_metadata

sampo.api.process_store.list_comp_metadata(url, all=False)

Lists the component metadata of the processes in a process store as a pandas.DataFrame.

Parameters
url: str

Process store URL.

all: bool

If True, lists all processes, including in-progress processes and previous versions of processes in the process store.

Returns
list_df: pandas.DataFrame

DataFrame that lists the components of each process in the process store and their metadata.

Columns:

  • process name <dtype: object>
    • Name of the process that the component belongs to.

  • version <dtype: object>
    • Version of the process that the component belongs to.

  • cid <dtype: object>
    • Component ID

  • started at <dtype: datetime64[ns]>
    • Date and time that the component started.

  • running time <dtype: timedelta64[ns]>
    • Time that the component has taken to run. If the status is In progress, this column is NaT.

Raises
ValidationError
  • If url is not str.

  • If all is not bool.

Examples

Example1: Listing the components of current versions of processes in the process store.

In [1]:
from sampo.api import process_store
pstore_url = './pstore'
process_store.list_comp_metadata(pstore_url)
Out[1]:
process name version cid started at running time
0 fabhmerg_learn a2ec5777-59e9-48b5-9ca3-8403cb642692 dl 2018-06-05 16:02:39.373138 00:00:00.040583
1 fabhmerg_learn a2ec5777-59e9-48b5-9ca3-8403cb642692 bexp 2018-06-05 16:02:39.414316 00:00:00.057005
2 fabhmerg_learn a2ec5777-59e9-48b5-9ca3-8403cb642692 std 2018-06-05 16:02:39.472218 00:00:00.091999
3 fabhmerg_learn a2ec5777-59e9-48b5-9ca3-8403cb642692 rg 2018-06-05 16:02:39.565387 00:00:00.584080
4 fabhmerg_predict 4eea84be-7b72-47af-8152-6e1f9974b21c dl 2018-06-05 16:02:40.177940 00:00:00.063480
5 fabhmerg_predict 4eea84be-7b72-47af-8152-6e1f9974b21c bexp 2018-06-05 16:02:40.246826 00:00:00.047985
6 fabhmerg_predict 4eea84be-7b72-47af-8152-6e1f9974b21c std 2018-06-05 16:02:40.299318 00:00:00.034860
7 fabhmerg_predict 4eea84be-7b72-47af-8152-6e1f9974b21c rg 2018-06-05 16:02:40.348850 00:00:00.117187

Example2: Listing the components of current and older versions of processes in the process store.

In [2]:
from sampo.api import process_store
pstore_url = './pstore'
process_store.list_comp_metadata(pstore_url, all=True)
Out[2]:
process name version cid started at running time
0 fabhmerg_learn a2ec5777-59e9-48b5-9ca3-8403cb642692 dl 2018-06-05 16:02:39.373138 00:00:00.040583
1 fabhmerg_learn a2ec5777-59e9-48b5-9ca3-8403cb642692 bexp 2018-06-05 16:02:39.414316 00:00:00.057005
2 fabhmerg_learn a2ec5777-59e9-48b5-9ca3-8403cb642692 std 2018-06-05 16:02:39.472218 00:00:00.091999
3 fabhmerg_learn a2ec5777-59e9-48b5-9ca3-8403cb642692 rg 2018-06-05 16:02:39.565387 00:00:00.584080
4 fabhmerg_learn 8e7dcdca-783c-47b8-b480-39e2a784d8c6 dl 2018-06-05 15:46:35.925226 00:00:00.041683
5 fabhmerg_learn 8e7dcdca-783c-47b8-b480-39e2a784d8c6 bexp 2018-06-05 15:46:35.967444 00:00:00.055198
6 fabhmerg_learn 8e7dcdca-783c-47b8-b480-39e2a784d8c6 std 2018-06-05 15:46:36.023242 00:00:00.090001
7 fabhmerg_learn 8e7dcdca-783c-47b8-b480-39e2a784d8c6 rg 2018-06-05 15:46:36.114127 00:00:00.674080
8 fabhmerg_predict 4eea84be-7b72-47af-8152-6e1f9974b21c dl 2018-06-05 16:02:40.177940 00:00:00.063480
9 fabhmerg_predict 4eea84be-7b72-47af-8152-6e1f9974b21c bexp 2018-06-05 16:02:40.246826 00:00:00.047985
10 fabhmerg_predict 4eea84be-7b72-47af-8152-6e1f9974b21c std 2018-06-05 16:02:40.299318 00:00:00.034860
11 fabhmerg_predict 4eea84be-7b72-47af-8152-6e1f9974b21c rg 2018-06-05 16:02:40.348850 00:00:00.117187

remove_process

sampo.api.process_store.remove_process(url, process_name)

Removes a process from the process store.

If the removed process was the current version, the remaining version with the latest finished_at date in the process metadata becomes the current version.

Warning

If a process name or a ProcessKey with no version is specified, all versions will be removed.

Parameters
url: str

Process store URL.

process_name: str or ProcessKey

Process name or ProcessKey object.

Raises
ValidationError
  • If url is not str.

  • If process_name is neither string nor ProcessKey.

Examples

Example1: Removing by ProcessKey

In [1]:
# Initial state
from sampo.api import process_store
process_store.list_process_metadata('./pstore1', all=True)
Out[1]:
process name version started at running time status
0 fabhmerg_learn 57120ab6-5e60-40da-95fe-a9b412b55ccf 2018-06-16 16:52:18.035794 00:00:00.488717 Succeeded
1 fabhmerg_learn 6e657f5d-7f20-4b3d-8f7b-087d0d93a2f3 2018-06-16 16:52:16.970459 00:00:00.717158 Succeeded
2 fabhmerg_predict 88ac2dd5-54c0-4794-83b1-faca6a979420 2018-06-16 16:52:18.538106 00:00:00.272759 Succeeded
3 fabhmerg_predict e55782cd-7c6d-4afa-b58b-7b4f4fb08b11 2018-06-16 16:52:17.738271 00:00:00.286107 Succeeded
In [2]:
# Removes and lists result.
process_store.remove_process('./pstore1', 'fabhmerg_learn.57120ab6-5e60-40da-95fe-a9b412b55ccf')
process_store.list_process_metadata('./pstore1', all=True)
Out[2]:
process name version started at running time status
0 fabhmerg_learn 6e657f5d-7f20-4b3d-8f7b-087d0d93a2f3 2018-06-16 16:52:16.970459 00:00:00.717158 Succeeded
1 fabhmerg_predict 88ac2dd5-54c0-4794-83b1-faca6a979420 2018-06-16 16:52:18.538106 00:00:00.272759 Succeeded
2 fabhmerg_predict e55782cd-7c6d-4afa-b58b-7b4f4fb08b11 2018-06-16 16:52:17.738271 00:00:00.286107 Succeeded

Example2: Removing all versions of a process by process name

In [3]:
# Initial state
from sampo.api import process_store
process_store.list_process_metadata('./pstore2', all=True)
Out[3]:
process name version started at running time status
0 fabhmerg_learn c22ca106-6bfb-42a8-ad6f-c6d8f3166fb3 2018-06-16 16:52:20.151958 00:00:00.537344 Succeeded
1 fabhmerg_learn e4e88fbc-8ce2-43cc-9e29-49b9966d2221 2018-06-16 16:52:19.222021 00:00:00.608088 Succeeded
2 fabhmerg_predict 01d1388f-e5ee-4f58-8e94-1032abcd4a31 2018-06-16 16:52:20.703726 00:00:00.288623 Succeeded
3 fabhmerg_predict 58617243-bb8f-454f-bbc5-12ecc6261c74 2018-06-16 16:52:19.844308 00:00:00.295551 Succeeded
In [4]:
# Removes and lists result.
process_store.remove_process('./pstore2', 'fabhmerg_learn')
process_store.list_process_metadata('./pstore2', all=True)
Out[4]:
process name version started at running time status
0 fabhmerg_predict 01d1388f-e5ee-4f58-8e94-1032abcd4a31 2018-06-16 16:52:20.703726 00:00:00.288623 Succeeded
1 fabhmerg_predict 58617243-bb8f-454f-bbc5-12ecc6261c74 2018-06-16 16:52:19.844308 00:00:00.295551 Succeeded
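
The reassignment rule described above (after the current version is removed, the version with the latest finished_at becomes current) can be sketched in plain Python. This is only an illustration, not the library's implementation: the dictionary layout and the pick_current helper are hypothetical, and finished_at is assumed here to equal "started at" plus "running time" from the Example1 listing.

```python
from datetime import datetime, timedelta

# Hypothetical stand-in for the process metadata of 'fabhmerg_learn' in Example1.
# Assumption: finished_at = started at + running time.
runs = {
    "57120ab6-5e60-40da-95fe-a9b412b55ccf": (
        datetime(2018, 6, 16, 16, 52, 18, 35794), timedelta(microseconds=488717)),
    "6e657f5d-7f20-4b3d-8f7b-087d0d93a2f3": (
        datetime(2018, 6, 16, 16, 52, 16, 970459), timedelta(microseconds=717158)),
}
finished_at = {version: start + took for version, (start, took) in runs.items()}


def pick_current(finished_by_version):
    """Return the version with the latest finished_at, or None if none are left."""
    if not finished_by_version:
        return None
    return max(finished_by_version, key=finished_by_version.get)


current = pick_current(finished_at)          # current version before removal
del finished_at[current]                     # remove the current version
new_current = pick_current(finished_at)      # latest remaining version takes over
print(current, "->", new_current)
```

With a single remaining version the choice is trivial; the rule only matters when several older versions are left in the store.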

rename_process

sampo.api.process_store.rename_process(url, process_name, new_process_name)

Renames the process.

Only the process name of the specified process changes; the process version is kept unchanged.

When you rename a specific process, the SRC file in the ProcessStore is reconfigured to reflect the new name.

If the renamed process was the current version, the version with the latest finished_at date in the process metadata becomes the current version.

If a process name or a ProcessKey with no version is specified, all versions will be renamed.

Parameters
url: str

Process store URL.

process_name: str or ProcessKey

Process name or ProcessKey object.

new_process_name: str

New process name.

Raises
ValidationError
  • If url is not str.

  • If process_name is neither string nor ProcessKey.

  • If new_process_name is not str.

Examples

Example1: Renaming by ProcessKey

In [1]:
# Initial state
from sampo.api import process_store
process_store.list_process_metadata('./pstore1', all=True)
Out[1]:
process name version started at running time status
0 fabhmerg_learn 15d61636-73e1-4323-9313-bcdf58ea9785 2018-06-16 16:48:04.947117 00:00:00.887448 Succeeded
1 fabhmerg_learn f7e20450-56da-4977-8416-32dc2ff04460 2018-06-16 16:48:04.041156 00:00:00.579013 Succeeded
2 fabhmerg_predict 008e2a5c-dc97-4b36-8bdd-09b8443684f8 2018-06-16 16:48:05.852845 00:00:00.279540 Succeeded
3 fabhmerg_predict df0aae9b-0b6e-46b7-9fd6-60aa4321a360 2018-06-16 16:48:04.633552 00:00:00.302591 Succeeded
In [2]:
# Renames and lists result.
process_store.rename_process('./pstore1', 'fabhmerg_learn.15d61636-73e1-4323-9313-bcdf58ea9785', 'new_name')
process_store.list_process_metadata('./pstore1', all=True)
Out[2]:
process name version started at running time status
0 fabhmerg_learn f7e20450-56da-4977-8416-32dc2ff04460 2018-06-16 16:48:04.041156 00:00:00.579013 Succeeded
1 fabhmerg_predict 008e2a5c-dc97-4b36-8bdd-09b8443684f8 2018-06-16 16:48:05.852845 00:00:00.279540 Succeeded
2 fabhmerg_predict df0aae9b-0b6e-46b7-9fd6-60aa4321a360 2018-06-16 16:48:04.633552 00:00:00.302591 Succeeded
3 new_name 15d61636-73e1-4323-9313-bcdf58ea9785 2018-06-16 16:48:04.947117 00:00:00.887448 Succeeded

Example2: Renaming all versions of a process by process name

In [3]:
# Initial state
from sampo.api import process_store
process_store.list_process_metadata('./pstore2', all=True)
Out[3]:
process name version started at running time status
0 fabhmerg_learn 1c441d17-1bde-41fb-ba06-8873e92284e4 2018-06-16 16:48:07.692683 00:00:00.533209 Succeeded
1 fabhmerg_learn 9bdc83cf-3bd9-47cd-a040-98ba23356ac7 2018-06-16 16:48:06.564030 00:00:00.826740 Succeeded
2 fabhmerg_predict a406eaa2-fd70-4c2f-8598-a193ec041502 2018-06-16 16:48:08.241632 00:00:00.282350 Succeeded
3 fabhmerg_predict c06d88f1-3791-4e40-8000-428658f7884c 2018-06-16 16:48:07.404485 00:00:00.277315 Succeeded
In [4]:
# Renames and lists result.
process_store.rename_process('./pstore2', 'fabhmerg_learn', 'new_name')
process_store.list_process_metadata('./pstore2', all=True)
Out[4]:
process name version started at running time status
0 fabhmerg_predict a406eaa2-fd70-4c2f-8598-a193ec041502 2018-06-16 16:48:08.241632 00:00:00.282350 Succeeded
1 fabhmerg_predict c06d88f1-3791-4e40-8000-428658f7884c 2018-06-16 16:48:07.404485 00:00:00.277315 Succeeded
2 new_name 1c441d17-1bde-41fb-ba06-8873e92284e4 2018-06-16 16:48:07.692683 00:00:00.533209 Succeeded
3 new_name 9bdc83cf-3bd9-47cd-a040-98ba23356ac7 2018-06-16 16:48:06.564030 00:00:00.826740 Succeeded

convert_process

sampo.api.process_store.convert_process(url, process_name, dest_dir, cids=None, flat=False)

Converts stored data in a process store from a SAMPO-internal format to a human-readable process external format.

Parameters
url: str

Process store URL.

process_name: str

A process name.

dest_dir: str

Path to destination directory.

cids: list or None

Target components to convert. Data of each component whose ID matches an item in this list is converted. If None, all components in the specified process are converted.

Example of specifying some components:

['dl', 'rg', 'bexp']

flat: bool

If True, outputs files in a flat directory structure. See the examples for more details. Each output filename consists of the process name, the cid, and the data name, joined with underscores ('_'), as follows.

<process name>_<cid>_<data name>

Examples:

  • fabreg_learn_fabreg1_reg_predict_result.csv

  • fabreg_learn_fabreg1_selected_attrs.json

Raises
ValidationError
  • If url is not str.

  • If process_name is not str.

  • If dest_dir is not str.

  • If cids is neither list nor None.

  • If flat is not bool.

Examples

  • Converting data in a process.

>>> from sampo.api import process_store
>>> process_store.convert_process('./pstore', 'my_process_a', './convert_dir')
  • Output:

    convert_dir/
    └── my_process_a
        ├── attr_metadata
        │   └── attr_metadata.json
        ├── components
        │   ├── dl
        │   │   └── comp_output_data
        │   │       ├── data.asd
        │   │       └── data.csv
        │   ├── rg
        │   │   ├── comp_output_data
        │   │   │   └── rg_predict_result.csv
        │   │   ├── comp_output_evaluation
        │   │   │   └── comp_output_evaluation.csv
        │   │   ├── model
        │   │   │   ├── fabhmerg_info.csv
        │   │   │   ├── gate_tree.json
        │   │   │   └── prediction_formulas.csv
        │   │   └── selected_attrs
        │   │       └── selected_attrs.json
        │   └── std
        │       ├── comp_output_data
        │       │   ├── data.asd
        │       │   └── data.csv
        │       └── selected_attrs
        │           └── selected_attrs.json
        ├── spd
        │   └── my_process_a.spd
        └── src
            └── dump.src
    
  • Converting data in specific components.

>>> from sampo.api import process_store
>>> process_store.convert_process('./pstore', 'my_process_a', './convert_dir', ['dl', 'rg'])
  • Output:

    convert_dir/
    └── my_process_a
        ├── attr_metadata
        │   └── attr_metadata.json
        ├── components
        │   ├── dl
        │   │   └── comp_output_data
        │   │       ├── data.asd
        │   │       └── data.csv
        │   └── rg
        │        ├── comp_output_data
        │        │   └── rg_predict_result.csv
        │        ├── comp_output_evaluation
        │        │   └── comp_output_evaluation.csv
        │        ├── model
        │        │   ├── fabhmerg_info.csv
        │        │   ├── gate_tree.json
        │        │   └── prediction_formulas.csv
        │        └── selected_attrs
        │             └── selected_attrs.json
        ├── spd
        │   └── my_process_a.spd
        └── src
            └── dump.src
    
  • Converting data in a flat directory structure.

>>> from sampo.api import process_store
>>> process_store.convert_process('./pstore', 'my_process_a', './convert_dir', flat=True)
  • Output:

    convert_dir
    ├── my_process_a_attr_metadata.json
    ├── my_process_a_bexp_data.asd
    ├── my_process_a_bexp_data.csv
    ├── my_process_a_bexp_selected_attrs.json
    ├── my_process_a_dl_data.asd
    ├── my_process_a_dl_data.csv
    ├── my_process_a_dump.src
    ├── my_process_a_my_process_a.spd
    ├── my_process_a_rg_comp_output_evaluation.csv
    ├── my_process_a_rg_fabhmerg_info.csv
    ├── my_process_a_rg_gate_tree.json
    ├── my_process_a_rg_prediction_formulas.csv
    ├── my_process_a_rg_rg_predict_result.csv
    ├── my_process_a_rg_selected_attrs.json
    ├── my_process_a_std_data.asd
    ├── my_process_a_std_data.csv
    └── my_process_a_std_selected_attrs.json
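
The flat naming rule can be illustrated with a small helper. This is a sketch based only on the listing above, not on the library's code: flat_filename is a hypothetical function, and treating process-level files (such as dump.src) as having no cid part is an assumption read off the example output.

```python
def flat_filename(process_name, data_name, cid=None):
    """Join the name parts with underscores, as in the flat listing above.

    cid is omitted for process-level files (assumption based on the example).
    """
    parts = [process_name]
    if cid is not None:
        parts.append(cid)
    parts.append(data_name)
    return "_".join(parts)


# Component-level file: <process name>_<cid>_<data name>
print(flat_filename("my_process_a", "rg_predict_result.csv", cid="rg"))
# Process-level file: no cid part
print(flat_filename("my_process_a", "attr_metadata.json"))
```

Because the data name itself may start with the cid (for example rg_predict_result.csv), the cid can appear twice in a flat filename.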
    

Process External Format

Convertible data and their corresponding external format

The convertible data and their corresponding external format are shown in the table below:

Convertible Data | External Format
--- | ---
SPD (SAMPO Process Description) | Same as SPD input file format. See the SPD File Specification.
SRC (SAMPO Run Configuration) | Same as SRC input file format. See the SRC File Specification.
ASD (Attributes Schema Description) | Same as ASD input file format. See the ASD File Specification. The attributes in the ASD are component-specific. See component specification.
Attribute metadata | See Attribute Metadata File Specification.
Selected attributes | See Selected Attributes File Specification.
Model | A component-specific format. See each component specification.
Component output data | Same as SAMPO CSV input file format. See the SAMPO CSV File Specification. The attributes in the output data are component-specific. See component specification.
Prediction result evaluation | A component-specific format. See component specification.

Attribute Metadata File Format

  • The Attribute Metadata file describes the metadata of attributes and the derivation relations between them in a process.

  • Attribute metadata is represented as a DAG (Directed Acyclic Graph) structure consisting of nodes and links.
    • The Nodes section holds the information of each attribute.

    • The Links section holds the derivation relationships between attributes.

  • The file follows the JSON format.

Example:

{
    "nodes": [
        {"aid": "dl1[0]", "name": "A", "scale": "integer", "is_excluded": false, "cid": "dl1",
         "cindex": 0, "values": null, "is_kept": true, "context": null},
        {"aid": "dl1[1]", "name": "B", "scale": "nominal", "is_excluded": true, "cid": "dl1",
         "cindex": 1, "values": ["A", "B", "O"], "is_kept": true, "context": null},
        {"aid": "rg1[0]", "name": "actual", "scale": "real", "is_excluded": false, "cid": "rg1",
         "cindex": 0, "values": null, "is_kept": false, "context": {"field_path": ["regression", "actual"]}}
    ],
    "links": [
        {"source": "dl1[0]", "target": "rg1[0]"}
    ]
}
Nodes Section

The Nodes section holds the information of all attributes generated in a process.

The properties of each attribute are defined as follows:

Property | Description
--- | ---
aid | Attribute ID.
name | Attribute name.
scale | Scale of the attribute.
is_excluded | Whether the attribute is excluded as a feature or not.
cid | ID of the component by which the attribute was generated.
cindex | Index of the attribute in the component by which the attribute was generated.
values | Domain of NOMINAL attribute. (null if the scale is not NOMINAL.)
is_kept | Whether the attribute is kept or not even after running every component.
context | Context information of the attribute.
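
Since the file is plain JSON, it can be read with Python's standard json module. The sketch below parses the example shown earlier and resolves each link to attribute names. It assumes that a link points from the original attribute (source) to the attribute derived from it (target); the variable names are illustrative only.

```python
import json

# The example Attribute Metadata file from above, embedded as a string.
metadata_text = """
{
    "nodes": [
        {"aid": "dl1[0]", "name": "A", "scale": "integer", "is_excluded": false, "cid": "dl1",
         "cindex": 0, "values": null, "is_kept": true, "context": null},
        {"aid": "dl1[1]", "name": "B", "scale": "nominal", "is_excluded": true, "cid": "dl1",
         "cindex": 1, "values": ["A", "B", "O"], "is_kept": true, "context": null},
        {"aid": "rg1[0]", "name": "actual", "scale": "real", "is_excluded": false, "cid": "rg1",
         "cindex": 0, "values": null, "is_kept": false,
         "context": {"field_path": ["regression", "actual"]}}
    ],
    "links": [
        {"source": "dl1[0]", "target": "rg1[0]"}
    ]
}
"""

metadata = json.loads(metadata_text)

# Index the nodes by attribute ID so that links can be resolved.
nodes = {node["aid"]: node for node in metadata["nodes"]}

# Group the source attribute IDs by the attribute derived from them.
sources_of = {}
for link in metadata["links"]:
    sources_of.setdefault(link["target"], []).append(link["source"])

for target, sources in sources_of.items():
    source_names = [nodes[aid]["name"] for aid in sources]
    print(f"{nodes[target]['name']} ({target}) <- {source_names}")
```

In a real conversion result, the same parsing would apply to the attr_metadata.json file written by convert_process.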

Selected Attributes File Format

  • The Selected Attributes file contains information about the attributes that a learning component selected.

  • The file follows the JSON format.

Example:

{
    "selected_features": [
        {"aid": "dl1[0]", "name": "A", "scale": "integer", "is_excluded": false,
         "cid": "dl1", "cindex": 0, "values": null, "is_kept": false, "context": null},
        {"aid": "dl1[1]", "name": "B", "scale": "real", "is_excluded": false,
         "cid": "dl1", "cindex": 1, "values": null, "is_kept": false, "context": null},
        {"aid": "dl1[2]", "name": "C", "scale": "real", "is_excluded": false,
         "cid": "dl1", "cindex": 2, "values": null, "is_kept": false, "context": null}
    ],
    "selected_targets": [
        {"aid": "dl1[3]", "name": "Z", "scale": "integer", "is_excluded": true,
         "cid": "dl1", "cindex": 3, "values": null, "is_kept": true, "context": null}
    ]
}
Selected Features Section

The Selected Features section describes the attributes selected as features.

Each attribute property is defined in the same way as in the Nodes section of the Attribute Metadata file.

Selected Targets Section

The Selected Targets section describes the attributes selected as targets.

Each attribute property is defined in the same way as in the Nodes section of the Attribute Metadata file.
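
As a minimal sketch of consuming this format, the example file above can be parsed with the standard json module to list the selected feature and target names. The variable names are illustrative and not part of the SAMPO API.

```python
import json

# The example Selected Attributes file from above, embedded as a string.
selected_text = """
{
    "selected_features": [
        {"aid": "dl1[0]", "name": "A", "scale": "integer", "is_excluded": false,
         "cid": "dl1", "cindex": 0, "values": null, "is_kept": false, "context": null},
        {"aid": "dl1[1]", "name": "B", "scale": "real", "is_excluded": false,
         "cid": "dl1", "cindex": 1, "values": null, "is_kept": false, "context": null},
        {"aid": "dl1[2]", "name": "C", "scale": "real", "is_excluded": false,
         "cid": "dl1", "cindex": 2, "values": null, "is_kept": false, "context": null}
    ],
    "selected_targets": [
        {"aid": "dl1[3]", "name": "Z", "scale": "integer", "is_excluded": true,
         "cid": "dl1", "cindex": 3, "values": null, "is_kept": true, "context": null}
    ]
}
"""

selected = json.loads(selected_text)

# Both sections share the attribute layout of the Nodes section.
feature_names = [attr["name"] for attr in selected["selected_features"]]
target_names = [attr["name"] for attr in selected["selected_targets"]]

print("features:", feature_names)
print("targets:", target_names)
```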

export_process

sampo.api.process_store.export_process(url, process_name, dest_dir, truncate_comp_output=False)

Exports stored process data in a process store to a directory. The exported data can be imported into any process store with the sampo.api.process_store.import_process() API. Only successful processes can be exported; failed or in-progress processes cannot.

Parameters
url: str

Process store URL.

process_name: str

Process name to be exported.

dest_dir: str

Exported data output path.

truncate_comp_output: bool

Whether to truncate the component outputs of the exported processes.

Raises
ValidationError
  • If url is not str.

  • If process_name is not str.

  • If dest_dir is not str.

  • If truncate_comp_output is not bool.

Examples

Example1: Exporting the current version of a selected process in the process store

In [1]:
from sampo.api import process_store
pstore_url = './pstore'

process_store.export_process(pstore_url, 'fabhmerg_learn', './export_dir')

# Checking exported processes
! ls ./export_dir
fabhmerg_learn.a2ec5777-59e9-48b5-9ca3-8403cb642692

Example2: Exporting current version of all processes in the process store

In [2]:
from sampo.api import process_store

pstore_url = './pstore'
process_list = process_store.list_process_metadata(pstore_url)['process name'].values

for process_name in process_list:
    process_store.export_process(pstore_url, process_name, './export_dir')

# Checking exported processes
! ls ./export_dir
fabhmerg_learn.a2ec5777-59e9-48b5-9ca3-8403cb642692
fabhmerg_predict.4eea84be-7b72-47af-8152-6e1f9974b21c
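
The directory names in the listings above appear to follow a <process name>.<version> pattern. That pattern is inferred from the example output only and is not a documented guarantee. Under that assumption, and assuming the version is always a UUID as in the examples, a hypothetical helper could split such a name as follows.

```python
import uuid


def split_exported_name(dirname):
    """Split '<process name>.<version>' into (process_name, version).

    Assumes the version is the part after the last dot and is a UUID,
    as in the example listings. Raises ValueError otherwise.
    """
    process_name, sep, version = dirname.rpartition(".")
    if not sep or not process_name:
        raise ValueError(f"not of the form <process name>.<version>: {dirname!r}")
    uuid.UUID(version)  # raises ValueError if the version is not a UUID
    return process_name, version


for entry in [
    "fabhmerg_learn.a2ec5777-59e9-48b5-9ca3-8403cb642692",
    "fabhmerg_predict.4eea84be-7b72-47af-8152-6e1f9974b21c",
]:
    print(split_exported_name(entry))
```

Splitting on the last dot keeps the helper usable even if a process name itself contains dots.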

import_process

sampo.api.process_store.import_process(input_dir_path, process_name, url)

Import process data exported by sampo.api.process_store.export_process() to a process store.

Parameters
input_dir_path: str

Path to the directory containing the exported process data to import.

process_name: str or None

The process name. If None, all processes will be imported.

url: str

Process store URL.

Raises
ValidationError
  • If input_dir_path is not str.

  • If process_name is neither str nor None.

  • If url is not str.

Examples

Example: Importing processes exported by export_process() from a directory to a new process store.

In [1]:
import os
from sampo.api import process_store

new_pstore_url = './new_pstore'
process_store.create(new_pstore_url)

with os.scandir('./export_dir') as exported_processes:
    for entry in exported_processes:
        if entry.is_dir():
            process_store.import_process(entry.path, None, new_pstore_url)

# Checking imported processes
process_store.list_process_metadata(new_pstore_url)
Out[1]:
process name version started at running time status
0 fabhmerg_learn a2ec5777-59e9-48b5-9ca3-8403cb642692 2018-06-05 16:02:39.371195 00:00:00.779124 Succeeded
1 fabhmerg_predict 4eea84be-7b72-47af-8152-6e1f9974b21c 2018-06-05 16:02:40.164337 00:00:00.302370 Succeeded

process_to_spd

sampo.api.process_store.process_to_spd(url, process_name, dest_dir)

Converts a learned process into an SPD file. If the process has a Feature Learner, it is converted to a Feature Descriptor. Output attributes that are not used by a successor component are removed, and components left without output attributes are removed from the process.

Parameters
url: str

Process store URL.

process_name: str

A process name.

dest_dir: str

The SPD file is written to this directory with the following name.

<dest_dir>/<original SPD file name>_<process name>.spd
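
The naming pattern above can be sketched with pathlib. The helper below is hypothetical and not part of the SAMPO API. It assumes that "<original SPD file name>" means the file name without its .spd extension, which matches the aad.spd example in this section.

```python
from pathlib import PurePosixPath


def spd_output_path(dest_dir, original_spd, process_name):
    """Build <dest_dir>/<original SPD file name>_<process name>.spd.

    Assumption: the original SPD file name is used without its extension.
    """
    stem = PurePosixPath(original_spd).stem
    return PurePosixPath(dest_dir) / f"{stem}_{process_name}.spd"


print(spd_output_path("./output_dir", "aad.spd", "my_process_A"))
```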

Raises
ValidationError
  • If url is not str.

  • If process_name is not str.

  • If dest_dir is not str.

Examples

  • aad.spd

    dl -> hr  -> fs
       -> bin -> fs
       -> ts  -> fs
    
    ---
    
    components:
        dl:
            component: DataLoader
    
        hr:
            component: HingeRampFLComponent
            features: scale == 'real' or scale == 'integer'
            max_num_output_features: 15
    
        bin:
            component: BinarizeFLComponent
            features: scale == 'real' or scale == 'integer'
            max_num_output_features: 10
    
        ts:
            component: TimeshiftFDComponent
            features: scale == 'real' or scale == 'integer'
            shift: [["all()", [2,10]]]
    
        fs:
            component: LinearHSICFSComponent
            features: scale == 'real' or scale == 'integer'
            target: name == 'target_value'
            max_num_output_features: 5
    
    global_settings:
        keep_attributes:
            - target_value
        feature_exclude:
            - target_value
    

Assume that pstore is a process store containing a process that was learned using the aad.spd shown above.

>>> from sampo.api import process_store
>>> process_store.process_to_spd('pstore', 'my_process_A', './output_dir')

aad_my_process_A.spd is output.

  • aad_my_process_A.spd

    dl -> bin -> fs
    dl -> ts  -> fs
    
    ---
    
    components:
        dl:
            component: DataLoader
    
        ts:
            component: TimeshiftFDComponent
            features: scale == 'real' or scale == 'integer'
            shift:
            - [name =="attr2", [2]]
            - [name =="attr4", [2]]
    
        bin:
            component: BinarizeFDComponent
            features: scale == 'real' or scale == 'integer'
            binarize_param:
            - [name =="attr2", [{threshold: 3.5}]]
    
        fs:
            component: DummyFDComponent
            features: name == 'bin(3.5)_attr2'
                                or name == 'ts(2)_attr2'
                                or name == 'ts(2)_attr4'
    
    global_settings:
        keep_attributes:
        - target_value
    
        feature_exclude:
        - target_value
    

Each component is thus generated according to the following rules.

  • BinarizeFLComponent is converted to BinarizeFDComponent, and LinearHSICFSComponent is converted to DummyFDComponent.

  • Output attributes of TimeshiftFDComponent that were not used by the successor are removed. (Unused output attributes of BinarizeFLComponent and HingeRampFLComponent are removed in the same way.)

  • HingeRampFLComponent is removed from the process, because none of its output attributes were selected by feature selection.