=========================
sampo_util csv2arffheader
=========================

.. contents:: Contents
    :local:

Overview
========

.. warning::

   The ``sampo_util csv2arffheader`` command is deprecated.

**sampo_util csv2arffheader** reads a CSV file and, for each column, prints a candidate
@ATTRIBUTE statement, the distinct values found, and how often each value occurs.
Use this information to finalize the @ATTRIBUTE statements and their respective data scales
in the header section of an ARFF file.
The result is written to stdout.
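
The per-column counting described above can be sketched in a few lines of Python.
This is only an illustration, not the command's actual implementation. The function name
``column_frequencies`` is hypothetical. Treating empty fields and ``?`` as missing, and ordering
values by descending frequency and then by value, are assumptions inferred from the
Examples section.

.. code-block:: python

    import csv
    import io
    from collections import Counter

    # Assumption: empty fields and "?" are missing values and are not counted.
    MISSING = {"", "?"}


    def column_frequencies(csv_text):
        """Return {column name: [(value, count), ...]} for CSV text with a header row."""
        reader = csv.DictReader(io.StringIO(csv_text))
        names = reader.fieldnames or []
        counters = {name: Counter() for name in names}
        for row in reader:
            for name in names:
                value = row.get(name)
                if value is None or value in MISSING:
                    continue
                counters[name][value] += 1
        # Assumption: most frequent first, ties broken by the value as a string.
        return {
            name: sorted(counter.items(), key=lambda item: (-item[1], item[0]))
            for name, counter in counters.items()
        }


    sample = (
        "dayofweek,flowername\n"
        "MON,Iris-setosa\n"
        "TUE,Iris-setosa\n"
        "?,Iris-setosa\n"
        "MON,Iris-versicolor\n"
    )
    print(column_frequencies(sample)["dayofweek"])  # [('MON', 2), ('TUE', 1)]

The standard ``csv`` module is used so that quoted fields containing commas,
such as ``"John, Doe"``, are read as a single value.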

|

Synopsis
========
See the ``sampo_util`` command help::

    $ sampo_util csv2arffheader --help

|

Examples
========
* No designated nominal threshold.

  * Command::

      $ sampo_util csv2arffheader data2.csv

  * Input::

      _sid,dayofweek,flowername,temperature,date,comment
      0,MON,Iris-setosa,23.5,2012-01-01T01:23:45.678+09:00,
      1,TUE,Iris-setosa,20.1,2012-01-01T01:23:45.678+09:00,"John, Doe"
      2,?,Iris-setosa,,,
      3,THU,Iris-setosa,22.1,2012-04-01T04:23:45.678-10:00,
      4,FRI,Iris-setosa,21,2012-06-01T06:23:45.678Z,
      5,SAT,Iris-versicolor,24.1,2012-06-01T06:23:45.678Z,"Jacob, Smith"
      6,SUN,Iris-versicolor,21.5,2012-08-01T06:23:45.678Z,
      7,MON,Iris-versicolor,19.2,2012-09-01T06:23:45.678Z,"Ethan, Johnson"
      8,TUE,Iris-virginica,22.3,2012-04-01T06:23:45.678Z,

  * Output::

      @ATTRIBUTE _sid {0,1,2,3,4,5,6,7,8}
      0,1,2,3,4,5,6,7,8
      1,1,1,1,1,1,1,1,1

      @ATTRIBUTE dayofweek {MON,TUE,FRI,SAT,SUN,THU}
      MON,TUE,FRI,SAT,SUN,THU
      2,2,1,1,1,1

      @ATTRIBUTE flowername {Iris-setosa,Iris-versicolor,Iris-virginica}
      Iris-setosa,Iris-versicolor,Iris-virginica
      5,3,1

      @ATTRIBUTE temperature {19.2,20.1,21,21.5,22.1,22.3,23.5,24.1}
      19.2,20.1,21,21.5,22.1,22.3,23.5,24.1
      1,1,1,1,1,1,1,1

      @ATTRIBUTE date DATE "yyyy-MM-dd'T'HH:mm:ss"
      2012-01-01T01:23:45.678+09:00,2012-06-01T06:23:45.678Z,2012-04-01T04:23:45.678-10:00,2012-04-01T06:23:45.678Z,2012-08-01T06:23:45.678Z,2012-09-01T06:23:45.678Z
      2,2,1,1,1,1

      @ATTRIBUTE comment {"Ethan, Johnson","Jacob, Smith","John, Doe"}
      "Ethan, Johnson","Jacob, Smith","John, Doe"
      1,1,1

|

* With designated nominal thresholds.

  * Command::

      $ sampo_util csv2arffheader data2.csv --numeric-nominal-threshold 5 --string-nominal-threshold 5

  * Input:

    Same as the above example.

  * Output::

      @ATTRIBUTE _sid INTEGER
      0,1,2,3,4,5,6,7,8
      1,1,1,1,1,1,1,1,1

      @ATTRIBUTE dayofweek STRING
      MON,TUE,FRI,SAT,SUN,THU
      2,2,1,1,1,1

      @ATTRIBUTE flowername {Iris-setosa,Iris-versicolor,Iris-virginica}
      Iris-setosa,Iris-versicolor,Iris-virginica
      5,3,1

      @ATTRIBUTE temperature REAL
      19.2,20.1,21,21.5,22.1,22.3,23.5,24.1
      1,1,1,1,1,1,1,1

      @ATTRIBUTE date DATE "yyyy-MM-dd'T'HH:mm:ss"
      2012-01-01T01:23:45.678+09:00,2012-06-01T06:23:45.678Z,2012-04-01T04:23:45.678-10:00,2012-04-01T06:23:45.678Z,2012-08-01T06:23:45.678Z,2012-09-01T06:23:45.678Z
      2,2,1,1,1,1

      @ATTRIBUTE comment {"Ethan, Johnson","Jacob, Smith","John, Doe"}
      "Ethan, Johnson","Jacob, Smith","John, Doe"
      1,1,1
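
Comparing the two outputs above suggests how the thresholds act: a column keeps its nominal
``{...}`` declaration while its number of distinct values stays within the threshold, and
otherwise falls back to INTEGER, REAL, or STRING. The sketch below reproduces that behavior for
the non-date columns of the example. It is a guess at the rule, not the command's implementation.
The function name ``attribute_type`` is hypothetical, and the examples do not show whether the
real comparison is ``<=`` or ``<``. Date detection and quoting of values that contain commas
are omitted.

.. code-block:: python

    def _parses_as(convert, text):
        """Return True if convert(text) succeeds."""
        try:
            convert(text)
            return True
        except ValueError:
            return False


    def attribute_type(values, numeric_threshold=None, string_threshold=None):
        """Guess an ARFF type from a column's non-missing values (hypothetical rule)."""
        distinct = list(dict.fromkeys(values))  # keep first-seen order, drop repeats
        if all(_parses_as(int, v) for v in distinct):
            fallback, threshold = "INTEGER", numeric_threshold
        elif all(_parses_as(float, v) for v in distinct):
            fallback, threshold = "REAL", numeric_threshold
        else:
            fallback, threshold = "STRING", string_threshold
        # Assumption: no threshold means the column is always declared nominal,
        # and the comparison is inclusive.
        if threshold is None or len(distinct) <= threshold:
            return "{" + ",".join(distinct) + "}"
        return fallback


    print(attribute_type(["MON", "TUE", "FRI", "SAT", "SUN", "THU"], 5, 5))  # STRING
    print(attribute_type(["0", "1", "2", "3", "4", "5", "6", "7", "8"], 5, 5))  # INTEGER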

|

Input Format
============

See `CSV File Specification <../../input/csv.html>`_.

|

Output Format
=============
The following information is printed for each attribute, one item per line:

* The @ATTRIBUTE statement.
* The attribute values, separated by commas.
* The frequency of each attribute value, separated by commas, in the same order as the values.
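
Because each attribute occupies three lines, the output is straightforward to post-process.
The following Python sketch reads it back into pairs of statement and value counts. The
function name ``parse_header_report`` is hypothetical. The sketch assumes that blocks are
separated by blank lines, that every block has exactly three non-empty lines, and that the
values line uses CSV-style quoting as in the ``comment`` attribute of the examples.

.. code-block:: python

    import csv


    def parse_header_report(text):
        """Parse csv2arffheader-style output into [(statement, {value: count}), ...]."""
        lines = [line.strip() for line in text.splitlines() if line.strip()]
        blocks = []
        # Walk the non-empty lines three at a time: statement, values, frequencies.
        for start in range(0, len(lines) - 2, 3):
            statement, values_line, counts_line = lines[start:start + 3]
            # csv.reader keeps quoted values such as "John, Doe" in one piece.
            values = next(csv.reader([values_line]))
            counts = [int(count) for count in counts_line.split(",")]
            blocks.append((statement, dict(zip(values, counts))))
        return blocks


    report = (
        "@ATTRIBUTE flowername {Iris-setosa,Iris-versicolor,Iris-virginica}\n"
        "Iris-setosa,Iris-versicolor,Iris-virginica\n"
        "5,3,1\n"
        "\n"
        '@ATTRIBUTE comment {"Ethan, Johnson","Jacob, Smith","John, Doe"}\n'
        '"Ethan, Johnson","Jacob, Smith","John, Doe"\n'
        "1,1,1\n"
    )
    for statement, frequencies in parse_header_report(report):
        print(statement, frequencies)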
