sampo_util db2asd

Overview

Warning

sampo_util db2asd command is deprecated.

sampo_util db2asd generates ASD (Attribute Schema Description) file from a select sql query, database table, or view. Output attribute names, scales, and domains are inferred from each column of the result of an SQL query, a database table, or view.


Synopsis

See sampo_util command help:

$ sampo_util db2asd --help

Supported Database

Supported database specification are following:

PostgreSQL

Supported schema is public only and the recommended data types are as follows:

database data type category

database data type

SAMPO

Remarks

Numeric Types

SMALLINT

INTEGER

REAL if NULL values exist.

INTEGER

INTEGER

REAL if NULL values exist.

BIGINT

INTEGER

REAL if NULL values exist.

SMALLSERIAL

INTEGER

REAL if NULL values exist.

SERIAL

INTEGER

REAL if NULL values exist.

BIGSERIAL

INTEGER

REAL if NULL values exist.

DECIMAL

REAL

REAL

REAL

DOUBLE PRECISION

REAL

NUMERIC

REAL

Monetary Types

MONEY

NOMINAL

Character Types

CHARACTER

NOMINAL

CHARACTER VARYING

NOMINAL

TEXT

NOMINAL

Date/Time Types

TIMESTAMP

DATE

TIMESTAMP WITH TIME ZONE

DATE

DATE

DATE

INTERVAL

DATE

TIME

NOMINAL

TIME WITH TIME ZONE

NOMINAL

Boolean Type

BOOL

NOMINAL

Enumerated Types

NOMINAL

Network Address Types

CIDR

NOMINAL

INET

NOMINAL

MACADDR

NOMINAL

Note

Some database data types not specified in this table may still create NOMINAL scale attributes in the output ASD. However, SAMPO may not be able to fully support those attributes in learning and prediction, thus it is not recommended to use those data types.


Examples

The following examples assume that a PostgreSQL database “testdb” and the user “aapfuser” have been prepared.

  • Generates from a select sql query.

    • Input:

      table_a

      col1

      col2

      col3

      col4

      col5

      col6

      col7

      1

      2013-08-02

      2013-08-02 10:10:10

      10.1

      aaa

      ccc

      2013-08-02 10:10:10

      2

      2013-08-03

      2013-08-03 10:10:10

      10.2

      bbb

      ddd

      2013-08-02 10:10:10

      3

      2013-08-04

      2013-08-04 10:10:10.999999

      11.0

      ccc

      ddd

      2013-08-02 10:10:10

    • Command:

      $ sampo_util db2asd --connection-uri postgresql://aapfuser:aapfpass@localhost:5432/testdb --output-path . 'SELECT * FROM table_a'
      

    Note

    Table or column names with spaces in sql queries must be enclosed in double quotations.

    • Output:

      output.asd::
      
          col1: {scale: INTEGER}
          col2: {scale: DATE}
          col3: {scale: DATE}
          col4: {scale: REAL}
          col5: {scale: NOMINAL, domain: [aaa, bbb, ccc]}
          col6: {scale: NOMINAL, domain: [ccc, ddd]}
          col7: {scale: DATE}
      

    Note

    Checking whether the attribute scales and domains are properly generated is strongly recommended.

  • Generates from database table.

    • Input:

      table_a

      col1

      col2

      col3

      col4

      col5

      col6

      col7

      1

      2013-08-02

      2013-08-02 10:10:10

      10.1

      aaa

      ccc

      2013-08-02 10:10:10

      2

      2013-08-03

      2013-08-03 10:10:10

      10.2

      bbb

      ddd

      2013-08-02 10:10:10

      3

      2013-08-04

      2013-08-04 10:10:10.999999

      11.0

      ccc

      ddd

      2013-08-02 10:10:10

    • Command:

      $ sampo_util db2asd --connection-uri postgresql://aapfuser:aapfpass@localhost:5432/testdb --output-path . table_a
      
    • Output:

      table_a.asd::
      
          col1: {scale: INTEGER}
          col2: {scale: DATE}
          col3: {scale: DATE}
          col4: {scale: REAL}
          col5: {scale: NOMINAL, domain: [aaa, bbb, ccc]}
          col6: {scale: NOMINAL, domain: [ccc, ddd]}
          col7: {scale: DATE}
      

    Note

    Checking whether the attribute scales and domains are properly generated is strongly recommended.


Output Format

Output file is named in the following rules:

  • output.asd if an sql query is specified.

  • <table_name>.asd if a table or view name is specified.

See ASD File Specification.