Commit c4e2c5e5 authored by Cihan Ates
Colab buttons are added

parent 109105f2
%% Cell type:markdown id: tags:
# Genetic Algorithms - I
%% Cell type:markdown id: tags:
[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/cihan-ates/data-driven-engineering/blob/master/DDE_II_Advanced_Topics/Lecture%2010/GA_I.ipynb)
%% Cell type:markdown id: tags:
“It is not the strongest of the species that survives,
not the most intelligent that survives.
It is the one that is the most adaptable to change.”
― Charles Darwin
%% Cell type:markdown id: tags:
# Important Note
Lecture notes and notebooks must not be copied and/or distributed without the express permission of ITS.
%% Cell type:markdown id: tags:
# Example 1: Guessing a password
In this example, we will code a GA-based function that relies on mutation only. Our objective is to guess a password; the only thing we know about it is its number of characters, which fixes the length of the chromosome.
%% Cell type:code id: tags:
```
# Libraries:
import datetime
import random
from bisect import bisect_left
from math import exp
```
%% Cell type:code id: tags:
```
#Step 0: decide on a fitness criterion:
#------------------------------------------
'''
Feedback score for natural selection:
Since we are trying to guess a password, we need to compare our guess with the feedback signal;
we need a score that defines how close our guess is. Since all characters should match the true value,
we can assign 1 for each match.
*
Here we will loop over the characters. Note that our example is simple; our chromosome includes only one 'gene'.
'''
def get_fitness(genes, target):
    # count the positions where the guess agrees with the target
    return sum(1 for expected, actual in zip(target, genes)
               if expected == actual)
```
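%% Cell type:markdown id: tags:
As a quick sanity check of this scoring idea, the sketch below counts position-wise matches for a few guesses. The helper name `count_matches` and the sample guesses are made up for illustration; the logic mirrors the fitness function above.
%% Cell type:code id: tags:
```python
def count_matches(guess, target):
    """Number of positions where guess and target hold the same character."""
    return sum(1 for expected, actual in zip(target, guess) if expected == actual)

target = "GA is fun!"
for guess in ["GA is fun!", "GA.isqfun!", "xxxxxxxxxx"]:
    # a perfect guess scores len(target); an unrelated one scores 0
    print(guess, count_matches(guess, target))
```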
%% Cell type:code id: tags:
```
#Step 1: create a population pool:
'''
* Our guess will be updated so population := 1
* We can consider that we have mitosis and we will immediately kill the parent.
'''
```
%% Cell type:code id: tags:
```
#Step 2: create a parent pool:
'''
Creating first parents:
We will generate a sequence of random characters from our database, geneSet
by using a size limitation length.
'''
def _generate_parent(length, geneSet):
    genes = []
    while len(genes) < length:
        sampleSize = min(length - len(genes), len(geneSet))
        genes.extend(random.sample(geneSet, sampleSize))
    return ''.join(genes)
```
%% Cell type:code id: tags:
```
#Step 3: Breeding & mutations:
'''
Mutation engine: randomly selecting newGene
* We have only mitosis.
* child => parent
For the mutation, let's consider a simple case where only one
character in the gene will be replaced.
Note that here we are using a trick:
- we will sample two characters randomly,
- if the randomly selected gene is identical to the previous gene,
we will use the second one.
- by doing so, we eliminate unlucky iterations in which nothing changes...
'''
def _mutate(parent, geneSet):
    index = random.randrange(0, len(parent))
    childGenes = list(parent)
    newGene, alternate = random.sample(geneSet, 2)
    childGenes[index] = alternate \
        if newGene == childGenes[index] \
        else newGene
    return ''.join(childGenes)
```
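%% Cell type:markdown id: tags:
The two-sample trick can be checked empirically. The sketch below is a self-contained copy of the mutation step under the hypothetical name `mutate_once`; it assumes the gene set has no repeated characters, so the two sampled genes always differ. Under that assumption every child differs from its parent in exactly one position.
%% Cell type:code id: tags:
```python
import random

def mutate_once(parent, gene_set):
    """Replace one randomly chosen character, never with itself."""
    index = random.randrange(len(parent))
    child = list(parent)
    new_gene, alternate = random.sample(gene_set, 2)
    child[index] = alternate if new_gene == child[index] else new_gene
    return ''.join(child)

random.seed(0)  # arbitrary seed, only to make the demo repeatable
parent = "abcde"
for _ in range(1000):
    child = mutate_once(parent, "abcdefgh")
    # Hamming distance between parent and child must be exactly 1
    assert sum(p != c for p, c in zip(parent, child)) == 1
print("every child differed from the parent in exactly one position")
```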
%% Cell type:code id: tags:
```
#Step 4: Survival of the fittest:
'''
#Starting the algorithm:
--------------------------------------------------------------------------
i. generates a guess,
ii. requests the fitness for that guess, then
iii. compares the fitness to that of the previous best guess, and
iv. keeps the guessed gene with the better fitness.
--------------------------------------------------------------------------
'''
def get_best(get_fitness, targetLen, optimalFitness, geneSet, display):
    '''
    get_fitness: fitness function to be called
    targetLen: length of the password
    optimalFitness: best score expected
    geneSet: pool to create chromosomes (genes)
    display: terminal output for visualization
    '''
    random.seed(2021)
    bestParent = _generate_parent(targetLen, geneSet)
    bestFitness = get_fitness(bestParent)
    display(bestParent)
    if bestFitness >= optimalFitness:
        return bestParent
    while True:
        child = _mutate(bestParent, geneSet)
        childFitness = get_fitness(child)
        if bestFitness >= childFitness:
            continue
        display(child)
        if childFitness >= optimalFitness:
            return child
        bestFitness = childFitness
        bestParent = child
```
%% Cell type:code id: tags:
```
# Visualization & helper functions:
'''
Given the genes, target and start time;
calls the fitness function & prints out the fitness and the elapsed time.
'''
def display(genes, target, startTime):
    timeDiff = datetime.datetime.now() - startTime
    fitness = get_fitness(genes, target)
    print("{0}\t{1}\t{2}".format(genes, fitness, str(timeDiff)))
```
%% Cell type:code id: tags:
```
#Main program:
#------------------------
def guess_password(target):
    geneset = " abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ!.?_*/#+-"
    startTime = datetime.datetime.now()

    def fnGetFitness(genes):
        return get_fitness(genes, target)

    def fnDisplay(genes):
        display(genes, target, startTime)

    optimalFitness = len(target)
    get_best(fnGetFitness, len(target), optimalFitness, geneset, fnDisplay)

def test_password(target):
    guess_password(target)

target = "GA is fun!"
test_password(target)
```
%%%% Output: stream
!y.NHqo/bB 0 0:00:00.000172
!A.NHqo/bB 1 0:00:00.001200
!A.NHqf/bB 2 0:00:00.002791
!A.NHqfubB 3 0:00:00.003519
!A.iHqfubB 4 0:00:00.003846
!A.iHqfunB 5 0:00:00.004857
!A.isqfunB 6 0:00:00.006359
!A.isqfun! 7 0:00:00.009981
GA.isqfun! 8 0:00:00.013876
GA.is fun! 9 0:00:00.016180
GA is fun! 10 0:00:00.024178
%% Cell type:markdown id: tags:
# Example 2: N-Queens problem
8 queens are to be placed on a standard chessboard such that none are under attack!
Is it a simple task?
There are 64 x 63 x 62 x 61 x 60 x 59 x 58 x 57 ordered ways to place 8 queens (4,426,165,368 distinct placements once the order is ignored); only 92 of them work, and just 12 remain when symmetry (rotations and reflections) is taken into account.
https://en.wikipedia.org/wiki/Eight_queens_puzzle
We will code it so that it also works for imaginary chessboards of size N.
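The size of the search space can be checked directly. The sketch below uses `math.perm` and `math.comb` for the placement counts and a brute-force search over permutations (one queen per row and per column) to count every valid 8-queens placement; the function name `count_solutions` is illustrative. The brute-force count includes rotations and reflections, so it is larger than the number of symmetry-distinct solutions.
%% Cell type:code id: tags:
```python
from itertools import permutations
from math import comb, perm

ordered = perm(64, 8)     # 64 x 63 x ... x 57, placement order counted
unordered = comb(64, 8)   # placement order ignored

def count_solutions(n):
    """Count n-queens solutions by brute force: one queen per row and column."""
    count = 0
    for cols in permutations(range(n)):  # cols[row] = column of the queen in that row
        # both diagonal directions must hold n distinct indices
        if (len({r + c for r, c in enumerate(cols)}) == n and
                len({r - c for r, c in enumerate(cols)}) == n):
            count += 1
    return count

print(ordered, unordered, count_solutions(8))
```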
%% Cell type:markdown id: tags:
## What are we going to do?
* Create a measure of fitness.
* Create a genetic code as a Chromosome class.
* Use random numbers to generate the parents' chromosomes.
* Define ways to alter the genetic material:
  - Mutations
* Define helper functions:
  - Displaying the results
%% Cell type:code id: tags:
```
#Step 0: decide on a fitness criterion:
#------------------------------------------
'''
Now the problem is more difficult. We need to think about how a queen moves: anywhere horizontally, vertically,
and also diagonally. So there should not be any other queen along these paths: we need to check the '+' and 'x' lines across the board.
We can think of alternative ways, but since the board is square and has symmetric indices, we can scan the rows, columns and diagonals easily.
We first need to set the origin: here it is the lower left corner. Then we need a smart way to count the number of hits horizontally,
vertically and diagonally.
Fitness idea is as follows:
- save the indices of rows and columns for '+' scanning.
- save the indices of diagonals for 'x' scanning.
- For every found 'Q', we will update the lists.
Example of a board:
- - - Q - - - -
- - - - - - Q -
- - - - Q - - -
- - Q - - - - -
Q - - - - - - -
- - - - - Q - -
- - - - - - - Q
- Q - - - - - -
In order to do that, we should first code
- an object representation of a chess board and
- how we can encode the locations of 'Q' on a chromosome.
- Additional notes -------------------------------------------------------------
Magic methods are special methods describing how certain objects should behave.
They are always surrounded by double underscores (e.g. __init__, __lt__).
The double underscores indicate that these are magic methods that are not meant to be called
directly by the programmer; they are normally called by the interpreter itself.
https://docs.python.org/3/tutorial/classes.html
https://docs.python.org/3/reference/datamodel.html#special-method-names
--------------------------------------------------------------------------------
'''
#Creating Chromosome class for storing genes and the fitness:
class Chromosome:
    # Chromosome object that has Genes and Fitness attributes.
    # Genes will include 2N elements, for the row and column indices.
    Genes = None
    Fitness = None

    def __init__(self, genes, fitness):
        self.Genes = genes
        self.Fitness = fitness

#Creating the chess board as a class:
class Board:
    #Queens' locations on a board:
    def __init__(self, genes, size):
        #empty board:
        board = [['-'] * size for _ in range(size)]
        # placing the queens according to the genes:
        # row: even indices, 0, 2, ...
        # column: odd indices 1, 3, ...
        for index in range(0, len(genes), 2):
            row = genes[index]
            column = genes[index + 1]
            board[column][row] = "Q"
        self._board = board

    # We will call it later with "get":
    def get(self, row, column):
        return self._board[column][row]

    # We will print the board later on with "print"
    def print(self):
        # 0,0 is at the bottom left corner...
        for i in reversed(range(0, len(self._board))):
            print(' '.join(self._board[i]))

# Creating a fitness class:
class Fitness:
    Total = None

    #__init__ method which initialises the object. Here total is the fitness of the chromosome.
    def __init__(self, total):
        self.Total = total

    #__gt__(self, other) defines the behaviour of the greater-than operator >
    # A lower Total means fewer conflicts, so "greater" here means "fitter".
    # We will use it in the adaptive learning.
    def __gt__(self, other):
        return self.Total < other.Total

    #__str__(self) defines the behaviour when str() is called on an instance of the class
    def __str__(self):
        return "{0}".format(self.Total)
```
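%% Cell type:markdown id: tags:
The inverted comparison in the fitness class is easy to misread, so here is a minimal sketch of the same idea. `Score` is a hypothetical stand-in class, not part of the notebook; it only shows that with a "lower is better" total, the object with fewer conflicts compares as greater.
%% Cell type:code id: tags:
```python
class Score:
    """Hypothetical stand-in for a 'lower is better' fitness wrapper."""
    def __init__(self, total):
        self.total = total

    def __gt__(self, other):
        # fewer conflicts => fitter => "greater"
        return self.total < other.total

print(Score(0) > Score(3))   # True: zero conflicts beats three
print(Score(3) > Score(0))   # False
print(Score(3) < Score(0))   # True: Python falls back to the reflected __gt__
```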
%% Cell type:code id: tags:
```
def get_fitness(genes, size):
    r'''
    genes: chromosome
    size: board dimension N (the chromosome holds 2N genes); added to make the function more general.
    ***
    Now we are ready to create a fitness function.
    Similar to Example 1, here we will create a score to drive the algorithm.
    We will create a board based on the genetic material; "genotype" ==> "phenotype"
    Phenotype is an individual's observable traits, such as height, eye color, and blood type.
    Here it refers to the queens on the board.
    Then, we will check '+' and 'x' lines over the board. Each queen found adds its line indices to the sets.
    Ideally;
    - N queens should be on N different horizontal (-) lines,
    - N queens should be on N different vertical (|) lines,
    - N queens should be on N different diagonal (/) lines,
    - N queens should be on N different diagonal (\) lines,
    ------------------------------------------------------------
    idealScore := 4N
    currentScore := len(rowsWithQueens) + len(colsWithQueens) + len(northEastDiagonalsWithQueens) + len(southEastDiagonalsWithQueens)
    Fitness = idealScore - currentScore
    ***
    '''
    board = Board(genes, size)
    # a set keeps only distinct elements, so adding the same line index twice has no effect
    rowsWithQueens = set()
    colsWithQueens = set()
    northEastDiagonalsWithQueens = set()
    southEastDiagonalsWithQueens = set()
    for row in range(size):
        for col in range(size):
            if board.get(row, col) == "Q":
                rowsWithQueens.add(row)
                colsWithQueens.add(col)
                northEastDiagonalsWithQueens.add(row + col)
                southEastDiagonalsWithQueens.add(size - 1 - row + col)
    total = size - len(rowsWithQueens) \
        + size - len(colsWithQueens) \
        + size - len(northEastDiagonalsWithQueens) \
        + size - len(southEastDiagonalsWithQueens)
    return Fitness(total)
```
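%% Cell type:markdown id: tags:
To make the 4N score concrete, the sketch below applies the same counting rule to two hand-picked placements. The function `conflicts` and its list-of-pairs input are assumptions made for this illustration (the notebook itself goes through the board class); the diagonal indices follow the formulas used above.
%% Cell type:code id: tags:
```python
def conflicts(queens, size):
    """queens: list of (row, col) pairs; returns 0 for a valid placement."""
    rows = {r for r, c in queens}
    cols = {c for r, c in queens}
    north_east = {r + c for r, c in queens}
    south_east = {size - 1 - r + c for r, c in queens}
    # ideal score 4N minus the number of distinct occupied lines
    return 4 * size - (len(rows) + len(cols) + len(north_east) + len(south_east))

solution = list(enumerate([0, 4, 7, 5, 2, 6, 1, 3]))  # a known 8-queens solution
diagonal = [(i, i) for i in range(8)]                  # all queens on one diagonal
print(conflicts(solution, 8), conflicts(diagonal, 8))  # 0 7
```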
%% Cell type:code id: tags:
```
#Step 2: create a parent pool:
#---------------------------------
def _generate_parent(length, geneSet, get_fitness):
    '''
    length: indicator of the board dimensions; 2N
    geneSet: allowed values for the row and column indices (0 .. N-1)
    get_fitness: fitness function
    '''
    genes = []
    while len(genes) < length:
        sampleSize = min(length - len(genes), len(geneSet))
        genes.extend(random.sample(geneSet, sampleSize))
    fitness = get_fitness(genes)
    return Chromosome(genes, fitness)
```
%% Cell type:code id: tags:
```
#Step 3: Breeding & mutations:
# --------------------------
# note that we use an alternate replacement in case the randomly selected
# newGene is the same as the one it is supposed to replace...
def _mutate(parent, geneSet, get_fitness):
    mutated = parent.Genes[:]
    index = random.randrange(0, len(parent.Genes))
    newGene, alternate = random.sample(geneSet, 2)
    mutated[index] = alternate \
        if newGene == mutated[index] \
        else newGene
    fitness = get_fitness(mutated)
    return Chromosome(mutated, fitness)
```
%% Cell type:code id: tags:
```
#Step 4: Survival of the fittest!
'''
Improving genes from generation to generation...
This will be done in two steps:
(i) generating successively better gene sequences ==> we will use an infinite loop;
(ii) displaying improvements and breaking the loop when we reach the desired fitness.
'''
# Part I:
# creating a function for the evolution. we will replace the parent if child is fitter.
def _get_improvement(mutated, generate_parent):
    bestParent = generate_parent()
    # yield is a keyword that is used like return,
    # except that the function becomes a generator:
    yield bestParent
    while True:
        child = mutated(bestParent)
        if bestParent.Fitness > child.Fitness:
            # child is worse: discard it
            continue
        if not child.Fitness > bestParent.Fitness:
            # child is equally fit: keep it as the new parent, but do not report it
            bestParent = child
            continue
        yield child
        bestParent = child

# Part II:
def get_best(get_fitness, targetLen, optimalFitness, geneSet, display):
    '''
    get_fitness: fitness function used
    targetLen: desired gene length
    optimalFitness: coded as zero
    geneSet: possible values for the indices.
    display: helper display function to be called during the evolution.
    '''
    random.seed()

    def fnMutate(parent):
        return _mutate(parent, geneSet, get_fitness)

    def fnGenerateParent():
        return _generate_parent(targetLen, geneSet, get_fitness)

    # Creating the loop with break:
    # optimalFitness will be used as the stopping criterion:
    for improvement in _get_improvement(fnMutate, fnGenerateParent):
        display(improvement)
        if not optimalFitness > improvement.Fitness:
            return improvement
```
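%% Cell type:markdown id: tags:
The generator pattern used above (report only strict improvements, let the caller decide when to stop) can be seen in isolation on a toy problem. In the sketch below, `improvements` and the list of numbers are made up for illustration, and "better" simply means "smaller".
%% Cell type:code id: tags:
```python
def improvements(candidates):
    """Yield a candidate only when it is strictly better (lower) than the best so far."""
    iterator = iter(candidates)
    best = next(iterator)   # assumes at least one candidate
    yield best
    for value in iterator:
        if value < best:
            best = value
            yield value

for step in improvements([5, 7, 3, 3, 1, 4]):
    print(step)
    if step <= 1:   # stopping criterion, analogous to reaching the optimal fitness
        break
```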
%% Cell type:code id: tags:
```
# Helper functions:
def display(candidate, startTime, size):
    timeDiff = datetime.datetime.now() - startTime
    board = Board(candidate.Genes, size)
    board.print()
    print("{0}\t- {1}\t{2}".format(
        ' '.join(map(str, candidate.Genes)),
        candidate.Fitness,
        str(timeDiff)))
```
%% Cell type:code id: tags:
```
# Main program:
def EightQueensTests(size=8):
    #creating indices
    geneset = [i for i in range(size)]
    #for performance evaluation if needed
    startTime = datetime.datetime.now()

    #callable functions:
    def fnDisplay(candidate):
        display(candidate, startTime, size)

    def fnGetFitness(genes):
        return get_fitness(genes, size)

    #Setting the goal:
    optimalFitness = Fitness(0)
    #"Survival of the fittest":
    get_best(fnGetFitness, 2 * size, optimalFitness, geneset, fnDisplay)

EightQueensTests(size=8)
```
%%%% Output: stream
Q - - - - - - -
- Q - - - - - -
- - - - Q - - -
- - - - - - - -
- - Q - - - - -
- - - Q - - - -
- - - - - - Q -
- - - - - - - Q
7 0 4 5 2 3 6 1 0 7 4 5 3 2 1 6 - 9 0:00:00.000269
Q - - - - - - -
- Q - - - - - -
- - - - Q - - -
- - - - - - - -
- - Q - - - - -
- - - Q Q - - -
- - - - - - Q -
- - - - - - - Q
7 0 4 5 2 3 6 1 0 7 4 2 3 2 1 6 - 7 0:00:00.003093
Q - - - Q - - -
- - - - - - Q -
- - - - Q - - -
- - - - - - - -
- - Q - - - - -
- - - - Q - - -
- Q - - - - - -
- - - - - - - Q
7 0 4 5 2 3 1 1 0 7 4 7 4 2 6 6 - 6 0:00:00.005138
Q - - Q - - - -
- - - - - - Q -
- - - - Q - - -
- - - - - - - -
- - Q - - - - -
- - - - Q - - -
- Q - - - - - -
- - - - - - - Q
7 0 4 5 2 3 1 1 0 7 3 7 4 2 6 6 - 5 0:00:00.006275
- - - Q - - - -
- - - - - - Q -
Q - - - - - - -
- - - - - - - -
Q Q - - - - - -
- - - - Q - - -
- Q - - - - - -
- - - - - - - Q
7 0 0 5 1 3 1 1 0 3 3 7 4 2 6 6 - 4 0:00:00.006762
- - - Q - - - -
- - - - - - Q -
Q - - - - - - -
- - - - - - - -
Q Q - - - - - -
- - - - Q - - -
- - - - - Q - -
- - - - - - - Q
7 0 0 5 1 3 5 1 0 3 3 7 4 2 6 6 - 3 0:00:00.008262
- - - Q - - - Q
- - - - - Q - -
Q - - - - - - -
- - - - - - - -
- Q - - - - - -
- - - - Q - - -
- - Q - - - - -
- - - - - - - Q
7 0 0 5 1 3 2 1 7 7 3 7 4 2 5 6 - 2 0:00:00.010638
- - Q Q - - - -
- - - - - Q - -
- - - - - - - Q
- - - - - - - -
- Q - - - - - -
- - - - - - Q -
- - - - Q - - -
Q - - - - - - -
4 1 0 0 1 3 2 7 7 5 3 7 6 2 5 6 - 1 0:00:00.023153
- - - Q - - - -
- - - - - Q - -
- - - - - - - Q
- - Q - - - - -
Q - - - - - - -
- - - - - - Q -
- - - - Q - - -
- Q - - - - - -
4 1 1 0 0 3 2 4 7 5 3 7 6 2 5 6 - 0 0:00:00.049218
%% Cell type:markdown id: tags:
# Example 3: Solving a system of linear equations
Now we will move on to a more difficult problem, one in which it is hard to find the global minimum or maximum point if we do not regularize the evolutionary path.
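As a minimal sketch of what a fitness score for such a problem can look like, consider the system x + z = 6, -3y + z = 7, 2x + y + 3z = 15. The code below assumes the fitness is the sum of the absolute residuals of the equations; that choice, and the names `equations` and `total_residual`, are assumptions made for this illustration rather than the notebook's final implementation.
%% Cell type:code id: tags:
```python
# Each row holds [a, b, c, d] for the equation a*x + b*y + c*z + d = 0
equations = [[1, 0, 1, -6], [0, -3, 1, -7], [2, 1, 3, -15]]

def total_residual(genes, equations):
    """Sum of absolute residuals (an assumed score); 0 means every equation is satisfied."""
    x, y, z = genes
    return sum(abs(a * x + b * y + c * z + d) for a, b, c, d in equations)

print(total_residual((0, 0, 0), equations))   # 28
print(total_residual((2, -1, 4), equations))  # 0
```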
%% Cell type:code id: tags:
```
# Measure of fitness:
'''
---------------------------------------------------------------------------
x + 0 + z = 6
0 − 3y + z = 7
2x + y + 3z = 15
Here the fitness is easier to code: we can move the constants to the left-hand side and force each summation to be zero
for the right (x, y, z) triple, which will also be our genetic code.
---------------------------------------------------------------------------
'''
#Coefficients:
#-------------
# note that we can make these allocatable and read from the data for
# many equation problems.
eqs = [[1, 0, 1, -6],[0, -3, 1, -7],[2, 1, 3, -15]]
#-------------
#We will again keep the fitness and chromosome together. Similar to previous example,
# we will create a class for Fitness and the Chromosome:
class Fitness:
    TotalDifference = None

    #__init__ method which initialises the object. Here total is the fitness value.
    def __init__(self, TotalDifference):
        self.TotalDifference = TotalDifference

    #__gt__(self, other) Defines the behaviour of the greater-than operator >
    def __gt__(self, other):
        return self.TotalDifference < other.TotalDifference

    #__str__(self) Defines behaviour for when str() is called on
    #an instance of your class
    def __str__(self):
        return "diff: {0:0.2f}".format(float(self.TotalDifference))

class Chromosome:
    # Chromosome object that has Genes and Fitness attributes
    Genes = None
    Fitness = None

    def __init__(self, genes, fitness):
        self.Genes = genes
        self.Fitness = fitness