Commit 109105f2 authored by cihan.ates

Update Lecture_8.ipynb

parent e904adb7
%% Cell type:markdown id: tags:
 
# Active Session 6: State Space Models - I
 
%% Cell type:markdown id: tags:
 
# Important Note
 
Lecture notes and notebooks must not be copied and/or distributed without the express permission of ITS.
 
The notebook is prepared by using the Chapters 1-6 of the Book "[Kalman and Bayesian Filters in Python](https://nbviewer.jupyter.org/github/rlabbe/Kalman-and-Bayesian-Filters-in-Python/blob/master/table_of_contents.ipynb)".
 
 
 
 
%% Cell type:code id: tags:
 
```
%pip install filterpy
```
 
%%%% Output: stream
 
Requirement already satisfied: filterpy in /usr/local/lib/python3.7/dist-packages (1.4.5)
Requirement already satisfied: numpy in /usr/local/lib/python3.7/dist-packages (from filterpy) (1.19.5)
Requirement already satisfied: matplotlib in /usr/local/lib/python3.7/dist-packages (from filterpy) (3.2.2)
Requirement already satisfied: scipy in /usr/local/lib/python3.7/dist-packages (from filterpy) (1.4.1)
Requirement already satisfied: pyparsing!=2.0.4,!=2.1.2,!=2.1.6,>=2.0.1 in /usr/local/lib/python3.7/dist-packages (from matplotlib->filterpy) (2.4.7)
Requirement already satisfied: kiwisolver>=1.0.1 in /usr/local/lib/python3.7/dist-packages (from matplotlib->filterpy) (1.3.1)
Requirement already satisfied: python-dateutil>=2.1 in /usr/local/lib/python3.7/dist-packages (from matplotlib->filterpy) (2.8.1)
Requirement already satisfied: cycler>=0.10 in /usr/local/lib/python3.7/dist-packages (from matplotlib->filterpy) (0.10.0)
Requirement already satisfied: six>=1.5 in /usr/local/lib/python3.7/dist-packages (from python-dateutil>=2.1->matplotlib->filterpy) (1.15.0)
 
%% Cell type:code id: tags:
 
```
from __future__ import (absolute_import, division, print_function, unicode_literals)

# Fundamentals
import statistics
import numpy as np
from collections import namedtuple
import filterpy.stats as stats
from numpy.random import randn
from filterpy.stats import plot_gaussian_pdf, gaussian
import random

# Filtering
# https://filterpy.readthedocs.io/en/latest/#filterpy-gh
from filterpy.gh import GHFilter
# https://filterpy.readthedocs.io/en/latest/discrete_bayes/discrete_bayes.html
from filterpy.discrete_bayes import predict, normalize
from filterpy.discrete_bayes import update
# https://filterpy.readthedocs.io/en/latest/#filterpy-kalman
import filterpy.kalman as kf
from filterpy.kalman import KalmanFilter
from filterpy.common import Q_discrete_white_noise

# Data generation for the Kalman filter examples:
import copy
import math

# Plotting
import matplotlib as mpl
import plotly.graph_objects as go
import matplotlib.pylab as pylab
import matplotlib.pyplot as plt
```
 
%% Cell type:code id: tags:
 
```
import plotting as self_plots
```
 
%% Cell type:markdown id: tags:
 
# Alpha beta filter
 
You may have access to the state of a given system via alternative sources: these could be different sensors attached to the system, a predictive model, or simply your intuition (let's accept this as a black-box model). Alternatively, you can measure the state of the system (say, temperature) many times with one sensor (a thermocouple). In a real-world scenario, none of these sources/measurements will be perfectly accurate, hence we need some sort of transformation function that corrects our understanding of the system using the available information.

Here our motivation is to use the available information fully: we do not want to pick and rely on a single, relatively more accurate source, but rather use all our sources so as not to discard any insight about the state of the system we are interested in.

Let's try to build our perspective starting with a simple exercise: temperature measurement.

Imagine that you are taking repeated measurements within a high-pressure, high-temperature chamber:
 
%% Cell type:code id: tags:
 
```
# Creating noisy measurement data:
measurements = np.random.uniform(380, 400, size=10000)
mean = measurements.mean()
print(f'Average of measurements is {mean:.3f}')
```
 
%%%% Output: stream
 
Average of measurements is 389.943
 
%% Cell type:markdown id: tags:
 
The result seems reasonable at first glance; the expected value falls in the middle of our observations. Yet there is a critical assumption here:

Our sensor is assumed to report the same true temperature (say 390 C) as 380 and as 400 with equal likelihood. That is not usually the case: we expect the sensor to read values closer to the true value most of the time, and readings should become less likely the further they are from the true value.
 
%% Cell type:code id: tags:
 
```
# Let's implement a likelihood:
mean_val = np.random.normal(390, 5, size=10000).mean()
print(f'Average of measurements is {mean_val:.3f}')
```
 
%%%% Output: stream
 
Average of measurements is 390.055
 
%% Cell type:code id: tags:
 
```
# Let's see the measurements:
mean_val = np.random.normal(390, 5, size=10)
print(mean_val)
```
 
%%%% Output: stream
 
[393.76071419 389.74490525 386.29880507 395.50574243 391.24268202
386.38717826 382.76199889 387.44442706 384.85680646 391.95997472]
 
%% Cell type:markdown id: tags:
 
Is the temperature increasing, decreasing, or are we just seeing noise? How many times should I measure before deciding on a temperature value? It is definitely not practical to make 10k measurements to decide the state of the system! Can we use the trend in the data? Let's continue with the temperature example:
 
%% Cell type:code id: tags:
 
```
temperature = [358.1, 364.7, 360.9, 359.9, 362.8, 364.8, 369.5, 367.9, 366.2, 371.3]
error_1 = np.random.uniform(0, 5, size=len(temperature))
error_2 = np.random.uniform(0, 5, len(temperature))
```
 
%% Cell type:code id: tags:
 
```
hypothesis_y = [statistics.mean(temperature) for _ in range(len(temperature))]
```
 
%% Cell type:code id: tags:
 
```
self_plots.plotter(temperature,hypothesis_y,error_1,error_2)
```
 
%%%% Output: display_data
 
 
%% Cell type:markdown id: tags:
 
It is clearly seen that the mean approximation fails to capture the trend in the data, which indicates a dynamic change in the state of the system. Maybe a better option would be some sort of trend fitting?
 
%% Cell type:code id: tags:
 
```
poly = np.polyfit(list(range(len(temperature))), temperature, 1)
```
 
%% Cell type:code id: tags:
 
```
new_hypothesis = [poly[0]*i + poly[1] for i in range(len(temperature))]
```
 
%% Cell type:code id: tags:
 
```
self_plots.plotter(temperature,new_hypothesis,error_1,error_2)
```
 
%%%% Output: display_data
 
 
%% Cell type:markdown id: tags:
 
Here we have reached a bifurcation point: does my hypothesis reflect the true dynamics of the system? In other words, do I need to make additional measurements to further predict the state of the system, or am I fine with the model (hypothesis) I have?

What we are interested in here is a complex dynamical system, whose state may evolve differently from my current predictions. What we are after is a dynamical model that can reflect changes in the states I measure, even when they differ from my prior knowledge of the system. Therefore, we are going to use two sources of information (model and measurements) to update our understanding of the system (can you see the recurrent logic here as well?).

Let's imagine a case where we have started making new measurements and we have the model derived above. Our objective is to find a way to update our understanding of the system.

Let's return to our example.
 
%% Cell type:code id: tags:
 
```
# Consider that we have recorded the following degrees with given delta_t:
temp_observations = [358.3, 364.8, 360.6, 359.5, 362.8, 364.2, 369.9, 367.7, 366.2, 371.4, 371.4, 372.7]
delta_t = 1.0
#
#Prior knowledge: assume that we trust our measurement more than our model:
scaling_factor = 0.618
```
 
%% Cell type:markdown id: tags:
 
We will use our model output and the observation to make new predictions:
 
$\mathtt{estimation_{t}} = \mathtt{prediction_{t}} + 0.618(\mathtt{measurement_{t}} - \mathtt{prediction_{t}})$
 
Let's write a function accordingly:
 
%% Cell type:code id: tags:
 
```
def state_predict(estimated_temp, gain_rate, isPrint=False):
    # storage for the filtered results
    estimations, predictions = [estimated_temp], []

    for z in temp_observations:
        # predict the new temperature via our model:
        predicted_temp = estimated_temp + gain_rate * delta_t

        # updating the "filter": here we combine the information
        # coming from the model and the measurement:
        estimated_temp = predicted_temp + scaling_factor * (z - predicted_temp)

        if isPrint:
            print(f'prior estimation: {estimations[-1]:.1f}',
                  f'model prediction: {predicted_temp:.1f}',
                  f'estimation: {estimated_temp:.1f}')

        # Saving the loop state:
        estimations.append(estimated_temp)
        predictions.append(predicted_temp)

    return estimations, predictions
```
 
%% Cell type:code id: tags:
 
```
#Calling our function:
initial_temp = 360.0
estimations, predictions = state_predict(estimated_temp=initial_temp, gain_rate=poly[0], isPrint=True)
```
 
%%%% Output: stream
 
prior estimation: 360.0 model prediction: 361.2 estimation: 359.4
prior estimation: 359.4 model prediction: 360.6 estimation: 363.2
prior estimation: 363.2 model prediction: 364.4 estimation: 362.0
prior estimation: 362.0 model prediction: 363.2 estimation: 360.9
prior estimation: 360.9 model prediction: 362.1 estimation: 362.5
prior estimation: 362.5 model prediction: 363.7 estimation: 364.0
prior estimation: 364.0 model prediction: 365.2 estimation: 368.1
prior estimation: 368.1 model prediction: 369.3 estimation: 368.3
prior estimation: 368.3 model prediction: 369.5 estimation: 367.5
prior estimation: 367.5 model prediction: 368.6 estimation: 370.3
prior estimation: 370.3 model prediction: 371.5 estimation: 371.4
prior estimation: 371.4 model prediction: 372.6 estimation: 372.7
 
%% Cell type:markdown id: tags:
 
Let's assume that our model roughly captures the true dynamics of the system (~ over-estimates). Let's see the impact of our filtering on the estimations:
 
%% Cell type:code id: tags:
 
```
true_data = [poly[0]*i*0.85 + 360.0 for i in range(len(temp_observations)+1)]
```
 
%% Cell type:code id: tags:
 
```
self_plots.plotter_2(temp_observations, predictions, estimations, true_data)
```
 
%%%% Output: display_data
 
 
%% Cell type:markdown id: tags:
 
Our estimates improve as we move forward in time, lying between the measurements and the model predictions.

Since the current dynamics of the system are not that different from our model insight (in terms of the trend), we do an acceptable job. The question is, what happens when we enter a different regime or shift to a different equilibrium state?

Let's imagine that our model states that we expect a decrease in the temperature. Can measurements help us to correct this mistake?
 
%% Cell type:code id: tags:
 
```
#Calling our function:
initial_temp = 360.0
# note that we reverse the slope of the model to reflect the insight on decreasing temperatures:
estimations, predictions = state_predict(estimated_temp=initial_temp, gain_rate=-poly[0], isPrint=True)
```
 
%%%% Output: stream
 
prior estimation: 360.0 model prediction: 358.8 estimation: 358.5
prior estimation: 358.5 model prediction: 357.3 estimation: 361.9
prior estimation: 361.9 model prediction: 360.8 estimation: 360.7
prior estimation: 360.7 model prediction: 359.5 estimation: 359.5
prior estimation: 359.5 model prediction: 358.3 estimation: 361.1
prior estimation: 361.1 model prediction: 359.9 estimation: 362.6
prior estimation: 362.6 model prediction: 361.4 estimation: 366.6
prior estimation: 366.6 model prediction: 365.5 estimation: 366.8
prior estimation: 366.8 model prediction: 365.7 estimation: 366.0
prior estimation: 366.0 model prediction: 364.8 estimation: 368.9
prior estimation: 368.9 model prediction: 367.7 estimation: 370.0
prior estimation: 370.0 model prediction: 368.8 estimation: 371.2
 
%% Cell type:code id: tags:
 
```
self_plots.plotter_2(temp_observations, predictions, estimations, true_data)
```
 
%%%% Output: display_data
 
 
%% Cell type:markdown id: tags:
 
This does not look so good. We can still make estimations that yield an increasing trend, but the difference from the truth gets larger. In other words, if we simply combine our prior understanding of the system dynamics with the measurements without updating our "view of the world", the filter will fail when there is a change in the system behavior.

What could be the solution? Remember that we pass our prior knowledge in the form of a "gain", which was fixed based on our previous knowledge.

The trick here is to convert the gain parameter into a variable that learns from our measurements and estimations.
 
%% Cell type:code id: tags:
 
```
# Let's rewrite our function.
# We need to define a learning rate for the gain update:
learning_rate = 1.0 - 0.618

def state_predict_2(estimated_temp, gain_rate, isPrint=False):
    # storage for the filtered results
    estimations, predictions = [estimated_temp], []

    for z in temp_observations:
        # predict the new temperature via our model:
        predicted_temp = estimated_temp + gain_rate * delta_t

        # Updating our understanding of the "world":
        # we use the difference between the prediction and the measurement,
        # add temporal information by dividing by delta_t => residual/time,
        # and multiply by the learning rate to update the previous gain.
        residual = z - predicted_temp
        gain_rate = gain_rate + learning_rate * (residual / delta_t)

        # updating the "filter": here we combine the information
        # coming from the model and the measurement:
        estimated_temp = predicted_temp + scaling_factor * (z - predicted_temp)

        if isPrint:
            print(f'prior estimation: {estimations[-1]:.1f}',
                  f'model prediction: {predicted_temp:.1f}',
                  f'estimation: {estimated_temp:.1f}')

        # Saving the loop state:
        estimations.append(estimated_temp)
        predictions.append(predicted_temp)

    return estimations, predictions
```
 
%% Cell type:code id: tags:
 
```
#Calling our function:
initial_temp = 360.0
# note that we reverse the slope of the model to reflect the insight on decreasing temperatures:
estimations, predictions = state_predict_2(estimated_temp=initial_temp, gain_rate=-poly[0], isPrint=True)
```
 
%%%% Output: stream
 
prior estimation: 360.0 model prediction: 358.8 estimation: 358.5
prior estimation: 358.5 model prediction: 357.1 estimation: 361.9
prior estimation: 361.9 model prediction: 363.4 estimation: 361.7
prior estimation: 361.7 model prediction: 362.2 estimation: 360.5
prior estimation: 360.5 model prediction: 360.0 estimation: 361.7
prior estimation: 361.7 model prediction: 362.3 estimation: 363.5
prior estimation: 363.5 model prediction: 364.7 estimation: 367.9
prior estimation: 367.9 model prediction: 371.2 estimation: 369.0
prior estimation: 369.0 model prediction: 371.0 estimation: 368.0
prior estimation: 368.0 model prediction: 368.1 estimation: 370.1
prior estimation: 370.1 model prediction: 371.5 estimation: 371.4
prior estimation: 371.4 model prediction: 372.8 estimation: 372.7
 
%% Cell type:code id: tags:
 
```
self_plots.plotter_2(temp_observations, predictions, estimations, true_data)
```
 
%%%% Output: display_data
 
 
%% Cell type:markdown id: tags:
 
By comparing the measurements and our model predictions, we created a feedback signal, which is then used to update our understanding of the system, i.e. the gain variable. Here we see that our estimations are much better than pure model predictions and learn the new dynamics within a few steps.

Note that we have some parameters here: the learning rate and the scaling factor, which were chosen arbitrarily. We could also update the gain differently, take another model prediction, maybe average them, and so on. The key idea is to keep updating our understanding of the system as new information becomes available.
 
%% Cell type:markdown id: tags:
 
## $\alpha$-$\beta$ filter
 
The discussion above and the algorithm we worked on are formally called the $\alpha$-$\beta$ filter or g-h filter in the literature. There are many filter applications in the literature and they are, in one way or another, interpretations of these two hyperparameters:

+ $\alpha$ is the scaling factor between the measurement and the prediction. If $\alpha$ is very small, we ignore the information coming from the measurements. If it is too large, we rely on the measurements too much and hence reject very little of the noise (errors) in the measured values.

+ $\beta$ is the scaling factor between the measured residual and the model prediction, and can therefore be seen as a form of learning rate. A small $\beta$ value means we react to changes in the landscape rather slowly (see the one-step sketch below).


For instance, in the Kalman filter, $\alpha$ and $\beta$ are varied dynamically at each time step, which we will see later.
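
As a minimal sketch of where these two gains enter (made-up numbers, written out in plain Python rather than using filterpy):

%% Cell type:code id: tags:

```
# Illustrative single g-h (alpha-beta) step, not the filterpy implementation:
x, dx, dt = 360.0, 1.2, 1.0      # current state estimate, rate and time step
g, h = 0.618, 0.382              # illustrative gains (alpha, beta)
z = 362.5                        # illustrative new measurement

x_pred = x + dx * dt             # predict with the current model
residual = z - x_pred            # discrepancy between measurement and prediction
x_new = x_pred + g * residual    # state update: trust the measurement by factor g
dx_new = dx + h * residual / dt  # rate update: adapt the trend with factor h
print(x_new, dx_new)
```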
 
%% Cell type:markdown id: tags:
 
### Using filterpy
 
%% Cell type:code id: tags:
 
```
# Creating sensor readings for our simple filter:
def new_position(pos, vel, dt=1.):
    return pos + (vel * dt)

def measured_position(pos):
    return pos + randn() * 300

def gen_data(pos, vel, count):
    z = []
    for t in range(count):
        pos = new_position(pos, vel)
        vel += 0.2
        z.append(measured_position(pos))
    return np.asarray(z)

pos, vel = 30000., 15.
z = gen_data(pos, vel, 100)
```
 
%% Cell type:code id: tags:
 
```
z[:10]
```
 
%%%% Output: execute_result
 
array([29839.77543369, 29806.72659737, 30256.68074508, 30261.85269332,
30396.97753924, 29966.68070575, 29909.48209862, 30334.45624217,
30195.61219447, 30468.89539274])
 
%% Cell type:code id: tags:
 
```
x = pos
dx = vel
#Creating filter:
#https://filterpy.readthedocs.io/en/latest/gh/GHFilter.html
f_object = GHFilter(x,dx,dt=1,g=0.01,h=0.01)
print('x:',f_object.x)
print('dx:',f_object.dx)
```
 
%%%% Output: stream
 
x: 30000.0
dx: 15.0
 
%% Cell type:code id: tags:
 
```
#Applying the filter with measurements z:
x_estimates = []
for measurement in z:
x_estimates.append(f_object.update(z = measurement)[0])
x_estimates = np.array(x_estimates)
```
 
%% Cell type:code id: tags:
 
```
# Let's inspect the state of f_object after processing all measurements:
f_object
```
 
%%%% Output: execute_result
 
GHFilter object
dt = 1
g = 0.01
h = 0.01
x = 32337.904706776262
dx = 26.79157904065614
x_prediction = 32338.54292730415
dx_prediction = 27.429799568544656
y = -63.82205278885158
z = 0.0
 
%% Cell type:code id: tags:
 
```
self_plots.plotter_2(z/1000,[],x_estimates/1000,[])
```
 
%%%% Output: display_data
 
 
%% Cell type:markdown id: tags:
 
### Impact of filter parameters
 
%% Cell type:code id: tags:
 
```
x = pos
dx = vel
#Creating filter:
#https://filterpy.readthedocs.io/en/latest/gh/GHFilter.html
f_object = GHFilter(x,dx,dt=1,g=0.3,h=0.01)
```
 
%% Cell type:code id: tags:
 
```
#Applying the filter with measurements z:
x_estimates = []
for measurement in z:
x_estimates.append(f_object.update(z = measurement)[0])
x_estimates = np.array(x_estimates)
```
 
%% Cell type:code id: tags:
 
```
self_plots.plotter_2(z/1000,[],x_estimates/1000,[])
```
 
%%%% Output: display_data
 
 
%% Cell type:markdown id: tags:
 
When we increase the $\alpha$ value, our estimates rely more on the measurements. If we look at the data, however, there is a lot of noise, so we should limit the influence of the measurements.
 
%% Cell type:code id: tags:
 
```
x = pos
dx = vel
#Creating filter:
#https://filterpy.readthedocs.io/en/latest/gh/GHFilter.html
f_object = GHFilter(x,dx,dt=1,g=0.001,h=0.1)
```
 
%% Cell type:code id: tags:
 
```
#Applying the filter with measurements z:
x_estimates = []
for measurement in z:
x_estimates.append(f_object.update(z = measurement)[0])
x_estimates = np.array(x_estimates)
```
 
%% Cell type:code id: tags:
 
```
self_plots.plotter_2(z/1000,[],x_estimates/1000,[])
```
 
%%%% Output: display_data
 
 
%% Cell type:markdown id: tags:
 
Here we have increased the $\beta$ value. It means that our filter becomes more sensitive to changes in dx; it even reacts to the noise in the data.
 
%% Cell type:code id: tags:
 
```
x = pos
dx = vel
#Creating filter:
#https://filterpy.readthedocs.io/en/latest/gh/GHFilter.html
f_object = GHFilter(x,dx,dt=1,g=0.0001,h=0.003)
```
 
%% Cell type:code id: tags:
 
```
#Applying the filter with measurements z:
x_estimates = []
for measurement in z:
x_estimates.append(f_object.update(z = measurement)[0])
x_estimates = np.array(x_estimates)
```
 
%% Cell type:code id: tags:
 
```
self_plots.plotter_2(z/1000,[],x_estimates/1000,[])
```
 
%%%% Output: display_data
 
 
%% Cell type:markdown id: tags:
 
# [Bayesian](https://en.wikipedia.org/wiki/Bayesian_probability) Filtering
 
We have seen that our understanding of the system, and how we update this knowledge, has a significant impact on the predictive accuracy of our filtering operation.

While building our knowledge towards Kalman filters, we will make a quick stop at Bayesian learning (if you would like to revise the concept, you may check Lecture 4 of DDE-I) and see how we can use this strategy to update our understanding of the "world".
 
%% Cell type:markdown id: tags:
 
The beauty of the Bayesian approach is the way it interprets and updates the past knowledge (i.e. prior):
 
$\mathtt{posterior} = \frac{\mathtt{likelihood}\times \mathtt{prior}}{\mathtt{evidence}}$
 
which can be translated into the following algorithm for our problem:
 
%% Cell type:markdown id: tags:
 
1. initialize the prior knowledge,
 
2. predict the state for the next time step,
 
3. measure the state,
 
4. set a belief about the measurement accuracy,
 
5. compute how likely the measurement matches the state,
 
6. update our knowledge with this likelihood (posterior)
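
As a minimal numeric sketch of steps 5-6 (the numbers are made up, and plain numpy is used rather than the filterpy helpers):

%% Cell type:code id: tags:

```
# Illustrative discrete Bayes update: posterior is proportional to likelihood * prior.
prior = np.array([0.1, 0.4, 0.4, 0.1])         # current belief over four states
likelihood = np.array([0.2, 0.2, 0.9, 0.2])    # how well the measurement matches each state
unnormalized = likelihood * prior
posterior = unnormalized / unnormalized.sum()  # dividing by the evidence normalizes the result
print(posterior)
```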
 
%% Cell type:markdown id: tags:
 
We will discuss the idea behind Bayesian learning using a classic example.
 
%% Cell type:markdown id: tags:
 
### Problem definition: A classic case
 
Imagine that we are trying to predict the position of a robot along a 1D path (along its railway), given an order to move to a set position:

+ We may not have information about the initial state, that is, where the robot is (see the sketch below).

+ We may know how it moves between discrete positions.

+ We may also know how it moves: it goes through states following one another: 1=>2; 4=>5, etc.
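
As a minimal illustration of the first point (assuming, purely for the sketch, a track with 10 discrete positions), a belief with no initial information is simply a uniform distribution:

%% Cell type:code id: tags:

```
# Illustrative only: the belief about the robot's position on an assumed
# 10-cell track, represented as a discrete probability distribution.
# With no information about the initial state, the prior is uniform.
n_positions = 10
belief = np.ones(n_positions) / n_positions  # equal probability for every cell
print(belief)
```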