Commit d778a1cf authored by uqeih's avatar uqeih

Merge branch 'master' of git.scc.kit.edu:ubelj/psda-assignment-3-group-4

parents 2a57926c e388181a
......@@ -18,6 +18,9 @@ Dataframe_preparation.ipynb: Data preprocessing (as provided by TECO/Kinemic)
### Data Exploration:
exploration/Exploration.ipynb: Basic data exploration
exploration/RTLS_Exploration.ipynb: Further exploration of RTLS sensor data.
### CNNs:
#### Simplenet
......@@ -49,4 +52,4 @@ hybridnet.ipynb: An experimental setup combining the classification results of t
### Summary and Further Documents:
slides presentation.pdf: Main presentation slides from July 6th 2020
model performance.xlsx, model performance compact.png: Overview of model performance
\ No newline at end of file
model performance.xlsx, model performance compact.png: Overview of model performance
......@@ -5,4 +5,5 @@ plotly==4.8.2
scikit-learn==0.22.2.post1
tensorflow==2.2.0
tensorflow-datasets
tqdm==4.31.1
\ No newline at end of file
tqdm==4.31.1
seaborn==0.10.1
\ No newline at end of file
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Simplenet\n",
"\n",
"A simple classification CNN based on TECO’s suggested example. It is subsequently used, in different configurations, in other notebooks.\n",
"\n",
"## Data Import"
]
},
{
"cell_type": "code",
"execution_count": 92,
......@@ -484,7 +495,10 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"# Preprocessing "
"# Preprocessing\n",
"\n",
"### Train/Test Split\n",
"We train on all probands except one and use that one for validation."
]
},
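The leave-one-proband-out split described above can be sketched as follows; the `proband` column name and the pandas-based selection are illustrative assumptions, not the notebook's actual code.

```python
# Sketch of the leave-one-proband-out split: train on all probands except
# one, hold that one out for validation. Column names are assumptions.
import pandas as pd

def split_by_proband(df, test_proband):
    train_df = df[df['proband'] != test_proband]
    test_df = df[df['proband'] == test_proband]
    return train_df, test_df

df = pd.DataFrame({'proband': [1, 1, 2, 2, 3],
                   'x': [0.1, 0.2, 0.3, 0.4, 0.5]})
train_df, test_df = split_by_proband(df, test_proband=3)
```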
{
......@@ -506,7 +520,10 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"### Feature selection and scaling"
"### Feature selection and scaling\n",
"\n",
"After trying different options, we found that scaling all body sensor features with a StandardScaler (i.e. standardizing those features) and leaving the RTLS sensor features unchanged works best. \n",
"We use a scikit-learn transformer here to build a reproducible and easily changeable data transformation pipeline."
]
},
{
......@@ -536,6 +553,14 @@
" remainder='drop')"
]
},
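The transformer above follows the pattern below; this is a minimal self-contained sketch with placeholder column names (the notebook's real feature lists differ): body-sensor columns are standardized, RTLS columns pass through, everything else is dropped.

```python
# Sketch of the scaling pipeline: standardize body-sensor columns, pass
# RTLS columns through unchanged. Column names are placeholders.
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.preprocessing import StandardScaler

body_cols = ['acc_x', 'acc_y']  # assumed body-sensor features
rtls_cols = ['rtls_x']          # assumed RTLS features

column_trans = ColumnTransformer(
    [('body', StandardScaler(), body_cols),
     ('rtls', 'passthrough', rtls_cols)],
    remainder='drop')

df = pd.DataFrame({'acc_x': [1.0, 2.0, 3.0],
                   'acc_y': [4.0, 5.0, 6.0],
                   'rtls_x': [7.0, 8.0, 9.0]})
out = column_trans.fit_transform(df)
```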
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Define the window size; this is an important hyperparameter.\n",
"We take half_window frames in the past (including the frame to label) and half_window frames in the future."
]
},
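The window geometry can be made concrete with a small sketch; the value of `half_window` here is only an example, not the notebook's tuned setting.

```python
# A labelled frame at index i gets the frames [i - half_window, i + half_window]
# as its input window, so the window length is 2 * half_window + 1.
half_window = 3  # example value; the notebook tunes this hyperparameter

def window_bounds(i, half_window):
    # First and last frame index (inclusive) of the window around frame i.
    return i - half_window, i + half_window

lo, hi = window_bounds(10, half_window)
window_len = hi - lo + 1
```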
{
"cell_type": "code",
"execution_count": 98,
......@@ -575,6 +600,14 @@
"test = column_trans.transform(test_df)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Function to create a windowed TensorFlow Dataset out of a (plain sequence) Dataset. \n",
"This has been adapted from the TensorFlow documentation."
]
},
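The notebook does this with `tf.data.Dataset.window`, following the TensorFlow guide; as a dependency-free illustration, the same sliding-window idea can be sketched in plain NumPy:

```python
# Plain-NumPy sketch of the windowing idea: slide a fixed-size window
# over a sequence, shifting by `shift` frames each step.
import numpy as np

def make_windows(seq, window_size, shift=1):
    return np.stack([seq[i:i + window_size]
                     for i in range(0, len(seq) - window_size + 1, shift)])

windows = make_windows(np.arange(5), window_size=3)
# windows -> [[0, 1, 2], [1, 2, 3], [2, 3, 4]]
```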
{
"cell_type": "code",
"execution_count": 101,
......@@ -587,6 +620,13 @@
"test_recs = split_recs(test)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Function to create one contiguous dataset and the corresponding targets, to use with the model's prediction function (so we can manually check the final train and test performance):"
]
},
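A minimal sketch of that idea, assuming each recording is an array whose last column holds the label (the notebook's actual layout and helper signature may differ):

```python
# Build one contiguous array of windows plus the matching targets, so
# model.predict output can be compared against the targets directly.
# The label is assumed to be the last column of each recording.
import numpy as np

def make_dataset(recordings, half_window):
    data, targets = [], []
    for rec in recordings:  # rec: array of shape (frames, features + 1 label)
        for i in range(half_window, len(rec) - half_window):
            data.append(rec[i - half_window:i + half_window + 1, :-1])
            targets.append(rec[i, -1])
    return np.array(data), np.array(targets)

rec = np.column_stack([np.arange(6.0), np.zeros(6)])  # 6 frames, 1 feature
data, targets = make_dataset([rec], half_window=1)
```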
{
"cell_type": "code",
"execution_count": 102,
......@@ -634,6 +674,14 @@
"test_data, test_targets = make_dataset(test_recs, half_window)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Create a training dataset containing windowed data and targets interleaved from all recordings (i.e. taking one sample from each recording in a round-robin fashion). \n",
"This smooths out the learning process: the model is not trained sequentially on the different probands but instead sees all training data mixed together."
]
},
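The round-robin interleaving can be sketched without TensorFlow (the notebook uses tf.data interleaving; this pure-Python version only illustrates the ordering):

```python
# Take one element from each recording in turn, so no proband's data
# appears as one long sequential run.
from itertools import chain, zip_longest

def round_robin(recordings):
    # zip_longest pads shorter recordings with a sentinel; drop it afterwards.
    sentinel = object()
    mixed = chain.from_iterable(zip_longest(*recordings, fillvalue=sentinel))
    return [x for x in mixed if x is not sentinel]

mixed = round_robin([[1, 2, 3], ['a', 'b'], [9]])
# -> [1, 'a', 9, 2, 'b', 3]
```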
{
"cell_type": "code",
"execution_count": 105,
......@@ -845,6 +893,13 @@
"plt.show()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Evaluate the final performance on the training data:"
]
},
{
"cell_type": "code",
"execution_count": 111,
......@@ -886,14 +941,9 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"# Post processing"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Evaluation"
"# Evaluation\n",
"\n",
"Evaluate the performance on the test data:"
]
},
{
......@@ -963,6 +1013,13 @@
"print(classification_report(test_targets, baseline))"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"We visualize a small sample of the targets and predictions to see where our model performs well and what kinds of errors it makes."
]
},
{
"cell_type": "code",
"execution_count": 116,
......
......@@ -6,7 +6,9 @@
"source": [
"# Stacknet\n",
"\n",
"A CNN classifier that is built from layers of the Simplenet architecture to enhance classification behaviour from that starting point."
"A CNN classifier built from stacked layers of the Simplenet architecture to improve classification performance over that starting point.\n",
"\n",
"## Data Import"
]
},
{
......@@ -512,7 +514,10 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"# Preprocessing "
"# Preprocessing\n",
"\n",
"### Train/Test Split\n",
"We train on all probands except one and use that one for validation."
]
},
{
......@@ -534,7 +539,10 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"### Feature selection and scaling"
"### Feature selection and scaling\n",
"\n",
"After trying different options, we found that scaling all body sensor features with a StandardScaler (i.e. standardizing those features) and leaving the RTLS sensor features unchanged works best. \n",
"We use a scikit-learn transformer here to build a reproducible and easily changeable data transformation pipeline."
]
},
{
......@@ -603,6 +611,14 @@
"test = column_trans.transform(test_df)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Split up the data according to the recording ID so the recordings can be processed separately. \n",
"We can't treat the whole dataset as one sequence, as we would otherwise create windows that contain data from different recordings."
]
},
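A sketch of that per-recording split, assuming the recording ID sits in the first column (the notebook's `split_recs` may key on a different column or structure):

```python
# Split a combined array back into per-recording sequences by recording ID,
# so windows never cross recording boundaries. ID column position is assumed.
import numpy as np

def split_recs(data, id_col=0):
    ids = data[:, id_col]
    return [data[ids == rec_id] for rec_id in np.unique(ids)]

data = np.array([[1, 0.1], [1, 0.2], [2, 0.3]])
recs = split_recs(data)
```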
{
"cell_type": "code",
"execution_count": 38,
......@@ -615,6 +631,14 @@
"test_recs = split_recs(test)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Function to create a windowed TensorFlow Dataset out of a (plain sequence) Dataset.\n",
"This has been adapted from the TensorFlow documentation."
]
},
{
"cell_type": "code",
"execution_count": 39,
......@@ -631,6 +655,13 @@
" return windows"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Function to create one contiguous dataset and the corresponding targets, to use with the model's prediction function (so we can manually check the final train and test performance):"
]
},
{
"cell_type": "code",
"execution_count": 40,
......@@ -662,6 +693,14 @@
"test_data, test_targets = make_dataset(test_recs, half_window)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Create a training dataset containing windowed data and targets interleaved from all recordings (i.e. taking one sample from each recording in a round-robin fashion). \n",
"This smooths out the learning process: the model is not trained sequentially on the different probands but instead sees all training data mixed together."
]
},
{
"cell_type": "code",
"execution_count": 42,
......@@ -881,6 +920,13 @@
"plt.show()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Evaluate the final performance on the training data:"
]
},
{
"cell_type": "code",
"execution_count": 48,
......@@ -922,14 +968,9 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"# Post processing"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Evaluation"
"# Evaluation\n",
"\n",
"Evaluate the performance on the test data:"
]
},
{
......@@ -999,6 +1040,13 @@
"print(classification_report(test_targets, baseline))"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"We visualize a small sample of the targets and predictions to see where our model performs well and what kinds of errors it makes."
]
},
{
"cell_type": "code",
"execution_count": 53,
......