sampo_util db2asd¶
Overview¶
Warning
sampo_util db2asd command is deprecated.
sampo_util db2asd generates ASD (Attribute Schema Description) file from a select sql query, database table, or view. Output attribute names, scales, and domains are inferred from each column of the result of an SQL query, a database table, or view.
Supported Database¶
Supported database specification are following:
PostgreSQL¶
Supported schema is public only and the recommended data types are as follows:
database data type category
database data type
SAMPO
Remarks
Numeric Types
SMALLINT
INTEGER
REAL if NULL values exist.
INTEGER
INTEGER
REAL if NULL values exist.
BIGINT
INTEGER
REAL if NULL values exist.
SMALLSERIAL
INTEGER
REAL if NULL values exist.
SERIAL
INTEGER
REAL if NULL values exist.
BIGSERIAL
INTEGER
REAL if NULL values exist.
DECIMAL
REAL
REAL
REAL
DOUBLE PRECISION
REAL
NUMERIC
REAL
Monetary Types
MONEY
NOMINAL
Character Types
CHARACTER
NOMINAL
CHARACTER VARYING
NOMINAL
TEXT
NOMINAL
Date/Time Types
TIMESTAMP
DATE
TIMESTAMP WITH TIME ZONE
DATE
DATE
DATE
INTERVAL
DATE
TIME
NOMINAL
TIME WITH TIME ZONE
NOMINAL
Boolean Type
BOOL
NOMINAL
Enumerated Types
NOMINAL
Network Address Types
CIDR
NOMINAL
INET
NOMINAL
MACADDR
NOMINAL
Note
Some database data types not specified in this table may still create NOMINAL scale attributes in the output ASD. However, SAMPO may not be able to fully support those attributes in learning and prediction, thus it is not recommended to use those data types.
Examples¶
The following examples assume that a PostgreSQL database “testdb” and the user “aapfuser” have been prepared.
Generates from a select sql query.
Input:
table_a¶ col1
col2
col3
col4
col5
col6
col7
1
2013-08-02
2013-08-02 10:10:10
10.1
aaa
ccc
2013-08-02 10:10:10
2
2013-08-03
2013-08-03 10:10:10
10.2
bbb
ddd
2013-08-02 10:10:10
3
2013-08-04
2013-08-04 10:10:10.999999
11.0
ccc
ddd
2013-08-02 10:10:10
Command:
$ sampo_util db2asd --connection-uri postgresql://aapfuser:aapfpass@localhost:5432/testdb --output-path . 'SELECT * FROM table_a'
Note
Table or column names with spaces in sql queries must be enclosed in double quotations.
Output:
output.asd:: col1: {scale: INTEGER} col2: {scale: DATE} col3: {scale: DATE} col4: {scale: REAL} col5: {scale: NOMINAL, domain: [aaa, bbb, ccc]} col6: {scale: NOMINAL, domain: [ccc, ddd]} col7: {scale: DATE}
Note
Checking whether the attribute scales and domains are properly generated is strongly recommended.
Generates from database table.
Input:
table_a¶ col1
col2
col3
col4
col5
col6
col7
1
2013-08-02
2013-08-02 10:10:10
10.1
aaa
ccc
2013-08-02 10:10:10
2
2013-08-03
2013-08-03 10:10:10
10.2
bbb
ddd
2013-08-02 10:10:10
3
2013-08-04
2013-08-04 10:10:10.999999
11.0
ccc
ddd
2013-08-02 10:10:10
Command:
$ sampo_util db2asd --connection-uri postgresql://aapfuser:aapfpass@localhost:5432/testdb --output-path . table_a
Output:
table_a.asd:: col1: {scale: INTEGER} col2: {scale: DATE} col3: {scale: DATE} col4: {scale: REAL} col5: {scale: NOMINAL, domain: [aaa, bbb, ccc]} col6: {scale: NOMINAL, domain: [ccc, ddd]} col7: {scale: DATE}
Note
Checking whether the attribute scales and domains are properly generated is strongly recommended.
Output Format¶
Output file is named in the following rules:
output.asd if an sql query is specified.
<table_name>.asd if a table or view name is specified.