sampo_util db2csv

Overview

Warning

sampo_util db2csv command is deprecated.

sampo_util db2csv generates CSV file/s from a database.


Synopsis

See sampo_util command help:

$ sampo_util db2csv --help

Supported Database

The database data type will be converted to a corresponding SAMPO internal scale after the data get loaded into SAMPO.

PostgreSQL

Supported schema is public only and types are as follows:

database data type

SAMPO

Remarks

INTEGER

INTEGER

NUMERIC

REAL

Deprecated

REAL

REAL

DOUBLE PRECISION

REAL

DATE

DATE

TIMESTAMP WITHOUT TIMEZONE

DATE

CHARACTER

NOMINAL

CHARACTER VARYING

NOMINAL

TEXT

NOMINAL

BOOL

NOMINAL


Examples

The following examples assume that a PostgreSQL database test_db and the user test_user have been prepared.

  • Generating a CSV file from a single table.

  • Input:

table_a

key1

a1

a2

1

1.1

2.1

2

NaN

2.2

3

1.3

2.3

  • Command:

    $ sampo_util db2csv --connection-uri postgresql://test_user:test_pass@localhost:5432/test_db table_a --output-path .
    
  • Output:

    key1,a1,a2
    1,1.1,2.1
    2,?,2.2
    3,1.3,2.3
    

    null or NaN in the tables will be converted to a question mark (‘?’) in the csv file.


  • Generating a CSV file from multiple tables.

  • Input:

table_a

key1

a1

a2

1

1.1

2.1

2

1.2

2.2

3

1.3

2.3

table_b

key1

b1

1

3.1

2

3.2

table_c

key1

c1

2

4.2

3

4.3

  • Command::

    $ sampo_util db2csv –connection-uri postgresql://test_user:test_pass@localhost:5432/test_db table_a,table_b,table_c –output-path .

  • Output:

    key1,a1,a2,b1,c1
    1,1.1,2.1,3.1,?
    2,1.2,2.2,3.2,4.2
    3,1.3,2.3,?,4.3
    

    The tables are merged by the common columns if common columns exist.


  • Generating CSV files grouped by values of the column/s specified by --group-by-key.

  • Input:

table_a

key1

a1

a2

a3

1

1.1

2.1

3.1

2

1.1

2.1

3.2

3

1.2

2.2

3.3

  • Command:

    $ sampo_util db2csv --connection-uri postgresql://test_user:test_pass@localhost:5432/test_db --group-by-key a1,a2 table_a --output-path .
    
  • Output:

    output_1.csv:

    key1,a1,a2,a3
    1,1.1,2.1,3.1
    2,1.1,2.1,3.2
    

    output_2.csv:

    key1,a1,a2,a3
    1,1.2,2.2,3.3
    

    output_grouplist.csv:

    output_1.csv,1.1,2.1
    output_2.csv,1.2,2.2