SAMPO CSV File Specification

Overview

SAMPO CSV is a tabular data format whose values are comma-separated.

The format is based on RFC 4180.

https://tools.ietf.org/html/rfc4180

Example:

_sid,_datetime,SepalLength,SepalWidth,PetalLength,PetalWidth,Name
1,2017-01-23,5.1,3.5,1.4,0.2,Iris-setosa
2,2017-01-23,4.9,3.0,1.4,0.2,Iris-setosa
3,2017-01-23,4.7,3.2,1.3,0.2,Iris-setosa
4,2017-01-23,4.6,3.1,1.5,0.2,Iris-setosa
5,2017-01-23,5.0,3.6,1.4,0.2,Iris-setosa
6,2017-01-23,5.4,3.9,1.7,0.4,Iris-setosa
7,2017-01-23,4.6,3.4,1.4,0.3,Iris-setosa
8,2017-01-23,5.0,3.4,1.5,0.2,Iris-setosa
9,2017-01-23,4.4,2.9,1.4,0.2,Iris-setosa
.
.
.

A SAMPO CSV file must satisfy the following constraints:

Property        Constraint
File name       ASCII characters.csv
Character code  Python 3: UTF-8 (ASCII + Japanese Characters)
                Python 2: ASCII
Newline code    CRLF
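
As an illustration of the constraints above, the following minimal Python 3 sketch serializes rows as UTF-8 with CRLF line endings and checks a file name. The helper names to_sampo_csv_bytes and is_valid_file_name are hypothetical and not part of SAMPO; the file name check assumes that "ASCII characters.csv" means an ASCII-only name ending in .csv.

```python
import csv
import io


def to_sampo_csv_bytes(header, rows):
    """Serialize a header and rows as UTF-8 bytes with CRLF line endings."""
    # newline="" keeps the "\r\n" written by csv.writer untouched.
    buffer = io.StringIO(newline="")
    writer = csv.writer(buffer, lineterminator="\r\n")
    writer.writerow(header)
    writer.writerows(rows)
    return buffer.getvalue().encode("utf-8")


def is_valid_file_name(name):
    """Assumed check: ASCII-only name with a .csv extension."""
    return name.isascii() and name.endswith(".csv")
```

The resulting bytes can be written with open(path, "wb"), which avoids any platform-specific newline translation.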


Attribute Names

Attribute names are given in the first row. Attributes whose names begin with an underscore (‘_’) are treated as sample_metadata.
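
The underscore rule can be sketched in Python as follows. The function name split_attributes is hypothetical, and the sketch only illustrates how a header row might be partitioned, not how SAMPO does it internally.

```python
def split_attributes(header):
    """Split header names into sample metadata and ordinary attributes."""
    # Names starting with "_" are treated as sample metadata.
    metadata = [name for name in header if name.startswith("_")]
    ordinary = [name for name in header if not name.startswith("_")]
    return metadata, ordinary
```

For the header of the example file, this would place _sid and _datetime in the first list and the five remaining attribute names in the second.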


Reserved Attributes

Reserved attributes are defined as follows. You must use these attributes in your data as prescribed below.

Attribute name  Data type  Required or Optional  Description
_sid            INTEGER    Required              Sample ID. Must be a unique value.
_datetime       DATE       Optional              Date/Time data.
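
A possible pre-flight check for the _sid rules (present, integer, unique) is sketched below. The function name find_sid_errors and the error messages are hypothetical. Note that Python's int() is more lenient than a strict INTEGER type may be (it accepts surrounding whitespace, for example), so this is an approximation only.

```python
def find_sid_errors(header, rows):
    """Return a list of problems found with the _sid column."""
    if "_sid" not in header:
        return ["missing required attribute _sid"]
    index = header.index("_sid")
    seen = set()
    errors = []
    # Row numbers count data rows, starting at 1 after the header.
    for row_no, row in enumerate(rows, start=1):
        raw = row[index]
        try:
            sid = int(raw)
        except ValueError:
            errors.append(f"row {row_no}: _sid {raw!r} is not an integer")
            continue
        if sid in seen:
            errors.append(f"row {row_no}: duplicate _sid {sid}")
        seen.add(sid)
    return errors
```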


Date format

Values in the following formats are inferred as DATE:

  • yyyy-MM-dd

  • yyyy-MM-dd HH:mm:ss

  • yyyy-MM-dd'T'HH:mm:ss

  • yyyy-MM-dd'T'HH:mm:ss.S

  • yyyy/MM/dd

  • yyyy/MM/dd HH:mm:ss

  • yyyy/MM/dd'T'HH:mm:ss

  • yyyy/MM/dd'T'HH:mm:ss.S

  • MM-dd-yyyy

  • MM-dd-yyyy HH:mm:ss

  • MM-dd-yyyy'T'HH:mm:ss

  • MM-dd-yyyy'T'HH:mm:ss.S
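
To check in advance whether a value would match one of the patterns listed above, the patterns can be approximated with Python's datetime.strptime. This is a sketch under assumptions: the mapping to strptime codes is my own, parse_date is a hypothetical helper, and %f accepts one to six fractional digits, which may differ from how SAMPO interprets the .S part.

```python
from datetime import datetime

# Date parts: yyyy-MM-dd, yyyy/MM/dd, MM-dd-yyyy
_DATE_PARTS = ["%Y-%m-%d", "%Y/%m/%d", "%m-%d-%Y"]
# Time parts: none, " HH:mm:ss", "'T'HH:mm:ss", "'T'HH:mm:ss.S"
_TIME_PARTS = ["", " %H:%M:%S", "T%H:%M:%S", "T%H:%M:%S.%f"]
FORMATS = [d + t for d in _DATE_PARTS for t in _TIME_PARTS]


def parse_date(text):
    """Return a datetime if text matches one of the 12 formats, else None."""
    for fmt in FORMATS:
        try:
            return datetime.strptime(text, fmt)
        except ValueError:
            continue
    return None
```

Values for which parse_date returns None, such as 23.01.2017, would presumably not be inferred as DATE.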


Missing Values

A question mark (‘?’) or an empty string (‘’) in the data is treated as a missing value.
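
When preparing or inspecting data, the two missing-value markers can be normalized as in the sketch below. The names MISSING_MARKERS and normalize_missing are hypothetical; whether whitespace-only cells also count as missing is not stated in this specification, so they are left unchanged here.

```python
MISSING_MARKERS = {"?", ""}


def normalize_missing(row):
    """Replace '?' and empty-string cells with None; keep other cells as-is."""
    return [None if value in MISSING_MARKERS else value for value in row]
```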