Getting Started

You can execute any notebook via the execution API.

Example:

$ curl -N -X POST \
    --data-urlencode "token=<your_jupyter_token>" \
    --data-urlencode "notebook=<notebook.ipynb>" \
    "http://localhost:8888/api/executions"

nbexec does not require any additional libraries to execute a notebook, but the notebook must contain a cell that defines its parameters if you want to set parameters at runtime.
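The curl command above can also be built in plain Python. The sketch below only constructs the request with the standard library; the token and notebook values are placeholders, and the server must be running before the request can actually be sent.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Placeholder values; substitute your Jupyter token and notebook path.
body = urlencode({
    "token": "<your_jupyter_token>",
    "notebook": "basic_example.ipynb",
}).encode()

# Build the POST request to the execution API (sending it requires a
# running Jupyter server with nbexec enabled).
req = Request("http://localhost:8888/api/executions", data=body, method="POST")
print(req.get_method())  # POST
```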

Basic Example

Create a notebook with the following two cells and save it to basic_example.ipynb.

[1]:
# nbexec: params
a = 1
b = 1.5
{
    'a': dict(type='int', desc='Parameter a', examples=['1', '0xf']),
    'b': dict(type='float', desc='Parameter b', examples=['1.5', '2e-2'])
}
[1]:
{'a': {'desc': 'Parameter a', 'examples': ['1', '0xf'], 'type': 'int'},
 'b': {'desc': 'Parameter b', 'examples': ['1.5', '2e-2'], 'type': 'float'}}
[2]:
a * b
[2]:
1.5

The first cell contains the parameter definition. It must start with the comment # nbexec: params. The assignments a = 1 and b = 1.5 define the default values; they are also required so that the notebook can be executed interactively.

The dict {...} that follows is the parameter definition: each key is a parameter name and each value is a dict of options for that parameter. type is always required because parameter values are validated against it. desc and examples are optional and are displayed on the document page.
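To make the validation step concrete, here is a simplified sketch (not nbexec's actual implementation) of how string values received from the API could be checked against the definition's type option:

```python
# Parameter definition, as in the first notebook cell above.
PARAMS = {
    'a': dict(type='int', desc='Parameter a', examples=['1', '0xf']),
    'b': dict(type='float', desc='Parameter b', examples=['1.5', '2e-2']),
}

# int(s, 0) accepts both decimal ('1') and prefixed ('0xf') literals.
CASTERS = {'int': lambda s: int(s, 0), 'float': float}

def validate(name, raw):
    """Cast a raw string value to the declared type, raising on mismatch."""
    spec = PARAMS[name]
    return CASTERS[spec['type']](raw)

print(validate('a', '0xf'))   # 15
print(validate('b', '2e-2'))  # 0.02
```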

The second cell multiplies the variables a and b.

nbexec displays a cog button at the end of the Jupyter toolbar:

[image: cog button at the end of the Jupyter toolbar]

This is a link to the execution document which displays the API parameters and examples.

You can execute the notebook from your console as follows:

$ curl -N -X POST \
    --data-urlencode "token=<your_jupyter_token>" \
    --data-urlencode "notebook=<notebook.ipynb>" \
    "http://localhost:8888/api/executions"

It returns a JSON result immediately.

{
  "event": "notebook_start",
  "execution": {
    "completed_at": null,
    "exec_id": "50959a2f-d681-4a66-8c4c-0c2ef69d3be8",
    "progress": null,
    "output_path": null,
    "params": {},
    "jupyter_kernel": null,
    "path": "basic_example.ipynb",
    "status": "executing",
    "last_cell_source": null,
    "overwrite": false,
    "started_at": 1517209210.531868,
    "cell_timeout": null
  },
  "timestamp": 1517209210.531868
}

The execution has just started and is not yet finished; it takes at least a few seconds. You can check its status via the API by specifying the exec_id included in the JSON above.
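A small polling loop can wait for the status to become completed. This is a hedged sketch: the fetch callable stands in for the HTTP GET against /api/executions/&lt;exec_id&gt; (which would carry your token), so the loop itself can be shown without a running server.

```python
import json
import time

def poll_status(exec_id, fetch, interval=1.0, max_tries=30):
    """Poll the status endpoint until the execution completes.

    `fetch` is any callable returning the JSON body of
    GET /api/executions/<exec_id>; in real use it would issue the HTTP
    request with your token (hypothetical helper, not part of nbexec).
    """
    for _ in range(max_tries):
        execution = json.loads(fetch(exec_id))
        if execution["status"] == "completed":
            return execution
        time.sleep(interval)
    raise TimeoutError("execution did not finish in time")

# Stubbed fetcher that reports completion on the second call:
responses = iter(['{"status": "executing"}', '{"status": "completed"}'])
result = poll_status("50959a2f-d681-4a66-8c4c-0c2ef69d3be8",
                     lambda _id: next(responses), interval=0.0)
print(result["status"])  # completed
```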

$ curl -N -G \
    --data-urlencode "token=<your_jupyter_token>" \
    "http://localhost:8888/api/executions/50959a2f-d681-4a66-8c4c-0c2ef69d3be8"

When the execution finishes, status becomes completed.

Alternatively, you can execute the notebook while sending the X-Response-Encoding: chunked header to receive real-time progress.

$ curl -N -X POST \
    -H 'X-Response-Encoding: chunked' \
    --data-urlencode "token=<your_jupyter_token>" \
    --data-urlencode "notebook=<notebook.ipynb>" \
    "http://localhost:8888/api/executions"

It displays real-time events until the execution finishes.
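On the client side, the chunked stream can be consumed event by event. The sketch below assumes each chunk carries one JSON event object per line (the framing is an assumption, not documented here); the sample stream mimics the notebook_start event shown earlier.

```python
import json

def iter_events(lines):
    """Parse a chunked response body, assuming one JSON event object
    per non-empty line (framing is an assumption)."""
    for line in lines:
        line = line.strip()
        if line:
            yield json.loads(line)

stream = [
    '{"event": "notebook_start", "timestamp": 1517209210.5}',
    '',
    '{"event": "notebook_end", "timestamp": 1517209215.2}',
]
events = list(iter_events(stream))
print([e["event"] for e in events])  # ['notebook_start', 'notebook_end']
```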

Machine Learning Example

We are going to create a pair of notebooks that do machine learning with the Iris dataset.

  1. iris-train.ipynb

    • Create a model from data and save it as a file

  2. iris-predict.ipynb

    • Load a model from a file and predict the target value with given data

pandas, scikit-learn and matplotlib are required to execute the following example.

Preparing Data Files

Before creating notebooks, we need data files for training and prediction. Execute the following code in a notebook.

[3]:
import os
import pandas as pd
import sklearn.datasets

data = pd.read_csv(os.path.join(sklearn.datasets.__path__[0], 'data', 'iris.csv'),
                   skiprows=1, header=None)
train = data.sample(frac=0.8, random_state=1)
test = data.drop(train.index)
train.to_csv('training.csv', header=None, index=None)  # training dataset
test.to_csv('test.csv', header=None, index=None)  # test dataset

You can check the file contents as follows:

[4]:
!head -5 training.csv
5.8,4.0,1.2,0.2,0
5.1,2.5,3.0,1.1,1
6.6,3.0,4.4,1.4,1
5.4,3.9,1.3,0.4,0
7.9,3.8,6.4,2.0,2
[5]:
!head -5 test.csv
5.1,3.5,1.4,0.2,0
4.9,3.0,1.4,0.2,0
5.0,3.4,1.5,0.2,0
4.4,2.9,1.4,0.2,0
4.3,3.0,1.1,0.1,0

Training

Create the following notebook and save it as iris-train.ipynb.

[6]:
%matplotlib inline
[7]:
import pickle
import pandas as pd
from sklearn import svm
[8]:
# nbexec: params
data_file = 'training.csv'
C = 1.0
kernel = 'rbf'
degree = 3
gamma = 'auto'
{
    'data_file': dict(type=str, desc='Data file', examples=['example.csv']),
    'C': dict(type=float, desc='Penalty parameter C of the error term', examples=['1.0']),
    'kernel': dict(type=str, desc='Specifies the kernel type to be used in the algorithm',
                   examples=['linear', 'poly', 'rbf', 'sigmoid']),
    'degree': dict(type=int, desc="Degree of the polynomial kernel function ('poly'). "
                                  "Ignored by all other kernels.", examples=['3']),
    'gamma': dict(type=float, desc="Kernel coefficient for 'rbf', 'poly' and 'sigmoid'", examples=['0.2'])
}
[8]:
{'C': {'desc': 'Penalty parameter C of the error term',
  'examples': ['1.0'],
  'type': float},
 'data_file': {'desc': 'Data file', 'examples': ['example.csv'], 'type': str},
 'degree': {'desc': "Degree of the polynomial kernel function ('poly'). Ignored by all other kernels.",
  'examples': ['3'],
  'type': int},
 'gamma': {'desc': "Kernel coefficient for 'rbf', 'poly' and 'sigmoid'",
  'examples': ['0.2'],
  'type': float},
 'kernel': {'desc': 'Specifies the kernel type to be used in the algorithm',
  'examples': ['linear', 'poly', 'rbf', 'sigmoid'],
  'type': str}}
[9]:
df = pd.read_csv(data_file, names=['sepal length', 'sepal width',
                                   'petal length', 'petal width', 'target'])
[10]:
df.head()
[10]:
sepal length sepal width petal length petal width target
0 5.8 4.0 1.2 0.2 0
1 5.1 2.5 3.0 1.1 1
2 6.6 3.0 4.4 1.4 1
3 5.4 3.9 1.3 0.4 0
4 7.9 3.8 6.4 2.0 2
[11]:
pd.plotting.scatter_matrix(df, figsize=(10, 8))
[11]:
array([[<matplotlib.axes._subplots.AxesSubplot object at 0x7f0ab33920b8>,
        <matplotlib.axes._subplots.AxesSubplot object at 0x7f0ab3311080>,
        <matplotlib.axes._subplots.AxesSubplot object at 0x7f0ab32d7128>,
        <matplotlib.axes._subplots.AxesSubplot object at 0x7f0ab32a27f0>,
        <matplotlib.axes._subplots.AxesSubplot object at 0x7f0ab3262fd0>],
       [<matplotlib.axes._subplots.AxesSubplot object at 0x7f0ab32344a8>,
        <matplotlib.axes._subplots.AxesSubplot object at 0x7f0ab326dba8>,
        <matplotlib.axes._subplots.AxesSubplot object at 0x7f0ab31c34e0>,
        <matplotlib.axes._subplots.AxesSubplot object at 0x7f0ab3110710>,
        <matplotlib.axes._subplots.AxesSubplot object at 0x7f0ab30d5320>],
       [<matplotlib.axes._subplots.AxesSubplot object at 0x7f0ab30a71d0>,
        <matplotlib.axes._subplots.AxesSubplot object at 0x7f0ab3060e10>,
        <matplotlib.axes._subplots.AxesSubplot object at 0x7f0ab3034160>,
        <matplotlib.axes._subplots.AxesSubplot object at 0x7f0ab2ff1b70>,
        <matplotlib.axes._subplots.AxesSubplot object at 0x7f0ab2fbeac8>],
       [<matplotlib.axes._subplots.AxesSubplot object at 0x7f0ab2f89550>,
        <matplotlib.axes._subplots.AxesSubplot object at 0x7f0ab2ed5898>,
        <matplotlib.axes._subplots.AxesSubplot object at 0x7f0ab2e98c18>,
        <matplotlib.axes._subplots.AxesSubplot object at 0x7f0ab2e6b160>,
        <matplotlib.axes._subplots.AxesSubplot object at 0x7f0ab2e2ca58>],
       [<matplotlib.axes._subplots.AxesSubplot object at 0x7f0ab2df7f28>,
        <matplotlib.axes._subplots.AxesSubplot object at 0x7f0ab2dbc898>,
        <matplotlib.axes._subplots.AxesSubplot object at 0x7f0ab2d89da0>,
        <matplotlib.axes._subplots.AxesSubplot object at 0x7f0ab2d1ed30>,
        <matplotlib.axes._subplots.AxesSubplot object at 0x7f0ab2d0abe0>]], dtype=object)
[image: scatter matrix of the Iris features]
[12]:
X = df.drop(['target'], axis=1)
[13]:
y = df['target']
[14]:
clf = svm.SVC(C=C, kernel=kernel, degree=degree, gamma=gamma)
[15]:
clf.fit(X, y)
[15]:
SVC(C=1.0, cache_size=200, class_weight=None, coef0=0.0,
  decision_function_shape=None, degree=3, gamma='auto', kernel='rbf',
  max_iter=-1, probability=False, random_state=None, shrinking=True,
  tol=0.001, verbose=False)
[16]:
with open('iris.pickle', 'wb') as f:
    pickle.dump(clf, f)

You can execute this notebook with several parameter combinations.

Using ‘linear’ kernel:

$ curl -N -X POST \
    --data-urlencode "token=<your_jupyter_token>" \
    --data-urlencode "notebook=iris-train.ipynb" \
    --data-urlencode "kernel=linear" \
    --data-urlencode "data_file=training.csv" \
    "http://localhost:8888/api/executions"

Using ‘rbf’ kernel with gamma = 0.7:

$ curl -N -X POST \
    --data-urlencode "token=<your_jupyter_token>" \
    --data-urlencode "notebook=iris-train.ipynb" \
    --data-urlencode "kernel=rbf" \
    --data-urlencode "gamma=0.7" \
    --data-urlencode "data_file=training.csv" \
    "http://localhost:8888/api/executions"

Using ‘poly’ kernel with degree = 3:

$ curl -N -X POST \
    --data-urlencode "token=<your_jupyter_token>" \
    --data-urlencode "notebook=iris-train.ipynb" \
    --data-urlencode "kernel=poly" \
    --data-urlencode "degree=3" \
    --data-urlencode "data_file=training.csv" \
    "http://localhost:8888/api/executions"
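The three runs above differ only in their form parameters, so a small driver script (hypothetical, not part of nbexec) could build each request body and POST it to http://localhost:8888/api/executions in turn. Only the body construction is shown here:

```python
from urllib.parse import urlencode

# The three parameter combinations shown above.
runs = [
    {"kernel": "linear"},
    {"kernel": "rbf", "gamma": "0.7"},
    {"kernel": "poly", "degree": "3"},
]

bodies = []
for params in runs:
    form = {"token": "<your_jupyter_token>",     # placeholder token
            "notebook": "iris-train.ipynb",
            "data_file": "training.csv", **params}
    bodies.append(urlencode(form))

print(len(bodies))  # 3
```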

Prediction

Create the following notebook and save it as iris-predict.ipynb.

[17]:
import pickle
import pandas as pd
[18]:
# nbexec: params
data_file = 'test.csv'
{
    'data_file': dict(type=str, desc='Data file', examples=['example.csv'])
}
[18]:
{'data_file': {'desc': 'Data file', 'examples': ['example.csv'], 'type': str}}
[19]:
df = pd.read_csv(data_file, names=['sepal length', 'sepal width',
                                   'petal length', 'petal width', 'target'])
[20]:
with open('iris.pickle', 'rb') as f:
    clf = pickle.load(f)
[21]:
X = df.drop(['target'], axis=1)
[22]:
result = clf.predict(X)
[23]:
print(pd.DataFrame(result).to_csv(header=False, index=False))
0
0
0
0
0
0
0
0
0
0
1
1
1
1
1
1
1
1
1
1
1
1
1
2
2
2
2
2
2
2

You can execute this notebook via the execution API as follows:

$ curl -N -X POST \
    --data-urlencode "token=<your_jupyter_token>" \
    --data-urlencode "notebook=iris-predict.ipynb" \
    --data-urlencode "data_file=test.csv" \
    "http://localhost:8888/api/executions"

The result appears in the last cell above. You can retrieve it from the output notebook iris-predict-Executed1.ipynb.
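Since .ipynb files are plain JSON, the printed result can be pulled out of the output notebook with the standard library alone. This is a sketch; the demo notebook written below is a minimal stand-in for the real output file (iris-predict-Executed1.ipynb), whose cells follow the standard notebook format.

```python
import json

def last_stream_output(nb_path):
    """Return the text printed by the last cell of an executed notebook."""
    with open(nb_path) as f:
        nb = json.load(f)
    cell = nb["cells"][-1]
    return "".join(out.get("text", "")
                   for out in cell.get("outputs", [])
                   if out.get("output_type") == "stream")

# Minimal stand-in for an executed notebook:
nb = {"cells": [{"cell_type": "code",
                 "outputs": [{"output_type": "stream", "text": "0\n0\n1\n"}]}]}
with open("demo.ipynb", "w") as f:
    json.dump(nb, f)

print(last_stream_output("demo.ipynb"))
```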

You can also receive real-time progress by specifying the X-Response-Encoding: chunked header. Each progress payload includes the cell object, from which you can extract the result.

Example:

$ curl -N -X POST \
    -H 'X-Response-Encoding: chunked' \
    --data-urlencode "token=<your_jupyter_token>" \
    --data-urlencode "notebook=iris-predict.ipynb" \
    --data-urlencode "data_file=test.csv" \
    "http://localhost:8888/api/executions"
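A client consuming this stream can pull the printed result straight from a progress event's cell object. The sketch below assumes the payload mirrors the standard notebook cell format (an assumption); the sample event is constructed inline.

```python
import json

def cell_result(event):
    """Extract printed text from a progress event's cell object,
    assuming the payload mirrors the notebook cell format."""
    cell = event.get("cell") or {}
    return "".join(out.get("text", "")
                   for out in cell.get("outputs", [])
                   if out.get("output_type") == "stream")

event = json.loads('{"event": "cell_end", '
                   '"cell": {"outputs": [{"output_type": "stream", '
                   '"text": "0\\n1\\n2\\n"}]}}')
print(cell_result(event))
```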