Commit 4a6a32c9 authored by Robin Schnaidt's avatar Robin Schnaidt

Add comments

parent 65628292
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Simplenet\n",
"\n",
"A simple classification CNN based on TECO’s suggested example. It is used in different configurations in other notebooks.\n",
"\n",
"## Data Import"
]
},
{
"cell_type": "code",
"execution_count": 92,
......@@ -484,7 +495,10 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"# Preprocessing "
"# Preprocessing\n",
"\n",
"### Train/Test Split\n",
"We train on all probands except one and use that one for validation."
]
},
{
......@@ -506,7 +520,10 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"### Feature selection and scaling"
"### Feature selection and scaling\n",
"\n",
"After trying different options, we found that scaling all body sensor features with a StandardScaler (i.e. standardizing those features) and leaving the rtls sensor features as they are works best.\n",
"We use a scikit-learn transformer here to build a reproducible and easily changeable data transformation pipeline."
]
},
{
......@@ -536,6 +553,14 @@
" remainder='drop')"
]
},
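A minimal sketch of such a ColumnTransformer pipeline (the column names here are hypothetical; the notebook's real feature lists differ):

```python
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.preprocessing import StandardScaler

# Hypothetical column names standing in for the notebook's feature lists.
body_cols = ["acc_x", "acc_y"]
rtls_cols = ["rtls_x"]

df = pd.DataFrame({
    "acc_x": [1.0, 2.0, 3.0],
    "acc_y": [10.0, 20.0, 30.0],
    "rtls_x": [0.5, 0.6, 0.7],
    "label": [0, 1, 0],  # not listed, so dropped by remainder='drop'
})

column_trans = ColumnTransformer(
    [("body", StandardScaler(), body_cols),  # standardize body sensor features
     ("rtls", "passthrough", rtls_cols)],    # leave rtls features as-is
    remainder="drop")

out = column_trans.fit_transform(df)
```

Standardized columns come first in the output, followed by the passthrough columns, in the order the transformers are listed.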
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Define the size of the windowing. This is an important hyperparameter.\n",
"We take half_window frames in the past (including the frame to label) and half_window frames in the future."
]
},
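The windowing arithmetic described above can be sketched as follows (the value of `half_window` here is illustrative, not the notebook's choice):

```python
half_window = 15  # illustrative value; the notebook tunes this hyperparameter

# A window for the frame at index t covers half_window frames in the
# past (including frame t itself) and half_window frames in the future.
def window_bounds(t, half_window):
    return (t - half_window + 1, t + half_window)  # inclusive frame indices

window_size = 2 * half_window  # total frames per window
```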
{
"cell_type": "code",
"execution_count": 98,
......@@ -575,6 +600,14 @@
"test = column_trans.transform(test_df)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Function to create a windowed TensorFlow Dataset out of a (plain sequence) Dataset.\n",
"This has been adapted from the TensorFlow documentation."
]
},
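The sliding-window idea behind that TensorFlow pattern (`Dataset.window` followed by `flat_map`/`batch`) can be sketched without TensorFlow; a NumPy approximation of the same transformation:

```python
import numpy as np

def make_windows(seq, window_size):
    """Return all contiguous windows of length window_size over seq,
    shaped (n_windows, window_size, n_features)."""
    seq = np.asarray(seq)
    n = len(seq) - window_size + 1
    return np.stack([seq[i:i + window_size] for i in range(n)])

frames = np.arange(10).reshape(10, 1).astype(float)  # 10 frames, 1 feature
w = make_windows(frames, 4)  # 7 overlapping windows of 4 frames each
```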
{
"cell_type": "code",
"execution_count": 101,
......@@ -587,6 +620,13 @@
"test_recs = split_recs(test)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Function to create one continuous dataset and corresponding targets, to use with the model's prediction function (so we can manually check the final train and test performance):"
]
},
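A sketch of what such a helper does, assuming each recording is a `(features, labels)` pair (the names and the window layout follow the half_window convention above, but are not the notebook's exact implementation):

```python
import numpy as np

def make_dataset(recordings, half_window):
    """Collect windowed samples and their targets from all recordings
    into one flat (data, targets) pair for use with model.predict()."""
    data, targets = [], []
    for features, labels in recordings:
        # t ranges over frames with a full window of context around them
        for t in range(half_window - 1, len(features) - half_window):
            data.append(features[t - half_window + 1:t + half_window + 1])
            targets.append(labels[t])
    return np.stack(data), np.asarray(targets)

# Toy recording: 6 frames with 2 features each, labeled 0..5.
recs = [(np.arange(12).reshape(6, 2).astype(float), np.arange(6))]
train_data, train_targets = make_dataset(recs, half_window=2)
```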
{
"cell_type": "code",
"execution_count": 102,
......@@ -634,6 +674,14 @@
"test_data, test_targets = make_dataset(test_recs, half_window)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Create a training dataset containing windowed data and targets interleaved from all recordings (i.e. using one sample from each recording in a round-robin fashion).\n",
"This smooths out the learning process, as the model is not trained sequentially on the different probands but instead sees all training data mixed together."
]
},
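The round-robin interleaving described above can be sketched with the standard library (a simplification of what `tf.data.Dataset.interleave` does):

```python
from itertools import zip_longest

def round_robin(sequences):
    """Yield one element from each sequence in turn, skipping
    sequences that are already exhausted."""
    fill = object()  # sentinel marking an exhausted sequence
    for group in zip_longest(*sequences, fillvalue=fill):
        for item in group:
            if item is not fill:
                yield item

mixed = list(round_robin([[1, 2, 3], ["a", "b"], [10]]))
```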
{
"cell_type": "code",
"execution_count": 105,
......@@ -845,6 +893,13 @@
"plt.show()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Evaluate the final performance on the training data:"
]
},
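One way such a manual check works: take the class probabilities a `model.predict()` call would return, reduce them to class labels via argmax, and compare with the targets. A self-contained sketch with hypothetical probabilities in place of a fitted model:

```python
import numpy as np

# Hypothetical class probabilities, shaped (n_samples, n_classes),
# standing in for the output of model.predict().
pred_probs = np.array([[0.9, 0.1],
                       [0.2, 0.8],
                       [0.6, 0.4],
                       [0.3, 0.7]])
targets = np.array([0, 1, 1, 1])

# Class predictions via argmax over the class axis, then accuracy.
preds = pred_probs.argmax(axis=1)
accuracy = (preds == targets).mean()
```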
{
"cell_type": "code",
"execution_count": 111,
......@@ -886,14 +941,9 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"# Post processing"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Evaluation"
"# Evaluation\n",
"\n",
"Evaluate the performance on the test data:"
]
},
{
......@@ -963,6 +1013,13 @@
"print(classification_report(test_targets, baseline))"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"We visualize a small sample of the targets and predictions to see where our model performs well and what kinds of errors it makes."
]
},
{
"cell_type": "code",
"execution_count": 116,
......
......@@ -6,7 +6,9 @@
"source": [
"# Stacknet\n",
"\n",
"A CNN classifier that is built from layers of the Simplenet architecture to enhance classification behaviour from that starting point."
"A CNN classifier that is built from layers of the Simplenet architecture to enhance classification behaviour from that starting point.\n",
"\n",
"## Data Import"
]
},
{
......@@ -512,7 +514,10 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"# Preprocessing "
"# Preprocessing\n",
"\n",
"### Train/Test Split\n",
"We train on all probands except one and use that one for validation."
]
},
{
......@@ -534,7 +539,10 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"### Feature selection and scaling"
"### Feature selection and scaling\n",
"\n",
"After trying different options, we found that scaling all body sensor features with a StandardScaler (i.e. standardizing those features) and leaving the rtls sensor features as they are works best.\n",
"We use a scikit-learn transformer here to build a reproducible and easily changeable data transformation pipeline."
]
},
{
......@@ -603,6 +611,14 @@
"test = column_trans.transform(test_df)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Split up the data according to the recording ID to process the recordings separately.\n",
"We can't treat the whole dataset as one sequence, as we would otherwise create windows that contain data from different recordings."
]
},
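A sketch of splitting by recording ID, assuming the transformed array keeps the ID in its first column (the column position is an assumption for illustration):

```python
import numpy as np

def split_recs(data, id_col=0):
    """Split a 2-D array into per-recording arrays, grouped by the
    value in id_col, so windows never straddle recording boundaries."""
    ids = data[:, id_col]
    return [data[ids == rec_id] for rec_id in np.unique(ids)]

# Two recordings: ID 0 with two frames, ID 1 with one frame.
data = np.array([[0.0, 1.0], [0.0, 2.0], [1.0, 3.0]])
recs = split_recs(data)
```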
{
"cell_type": "code",
"execution_count": 38,
......@@ -615,6 +631,14 @@
"test_recs = split_recs(test)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Function to create a windowed TensorFlow Dataset out of a (plain sequence) Dataset.\n",
"This has been adapted from the TensorFlow documentation."
]
},
{
"cell_type": "code",
"execution_count": 39,
......@@ -631,6 +655,13 @@
" return windows"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Function to create one continuous dataset and corresponding targets, to use with the model's prediction function (so we can manually check the final train and test performance):"
]
},
{
"cell_type": "code",
"execution_count": 40,
......@@ -662,6 +693,14 @@
"test_data, test_targets = make_dataset(test_recs, half_window)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Create a training dataset containing windowed data and targets interleaved from all recordings (i.e. using one sample from each recording in a round-robin fashion).\n",
"This smooths out the learning process, as the model is not trained sequentially on the different probands but instead sees all training data mixed together."
]
},
{
"cell_type": "code",
"execution_count": 42,
......@@ -881,6 +920,13 @@
"plt.show()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Evaluate the final performance on the training data:"
]
},
{
"cell_type": "code",
"execution_count": 48,
......@@ -922,14 +968,9 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"# Post processing"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Evaluation"
"# Evaluation\n",
"\n",
"Evaluate the performance on the test data:"
]
},
{
......@@ -999,6 +1040,13 @@
"print(classification_report(test_targets, baseline))"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"We visualize a small sample of the targets and predictions to see where our model performs well and what kinds of errors it makes."
]
},
{
"cell_type": "code",
"execution_count": 53,
......