Evolutionary Algorithms are a family of meta-heuristic search methods for global optimization, inspired by biological evolution. The candidate solutions typically span an n-dimensional vector space over the problem domain, and a population of such candidates is evolved over successive generations to approach a global optimum.

An optimization problem, in its basic form, consists of maximizing or minimizing a real function by choosing input values (vectors) from a pool of feasible solutions according to the procedure defined by the algorithm. Evolutionary approaches usually follow a specific strategy, with different variations, to select candidate elements from the population and apply crossover and/or mutation to modify them, while trying to improve the quality of the modified elements.
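Formally, for a cost function \(f\) defined over a feasible set \(\Omega \subseteq R^n\), the minimization problem is to find

\[x^* = \arg\min_{x \in \Omega} f(x)\]

with maximization handled analogously by minimizing \(-f\).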

These algorithms have a number of interesting applications and have been shown to perform very well in optimizing NP-hard problems, including the Travelling Salesman Problem, Job-Shop Scheduling, and Graph Coloring, while also finding use in domains such as Signals and Systems, Mechanical Engineering, and general mathematical optimization.

One such algorithm belonging to the family of Evolutionary Algorithms is the Differential Evolution (DE) algorithm. In this post, we shall discuss a few properties of the Differential Evolution algorithm while implementing it in Python (github link) for optimizing a few test functions.

Differential Evolution

DE approaches an optimization problem iteratively, trying to improve a set of candidate solutions with respect to a given measure of quality (the cost function). This family of algorithms falls under meta-heuristics, since they make few or no assumptions about the problem being optimized and can search very large spaces of possible solutions. The algorithm maintains a population of candidate solutions subjected to iterations of recombination, evaluation, and selection. A new candidate solution is created by applying a linear operation on elements selected from the population, using a parameter \(F\) called the differential weight, to generate a donor vector, and then randomly applying crossover based on the crossover probability \(CR\).

The algorithm follows the steps listed below:

  1. Initialize a population of \(P\) agents/elements \(x\) with random positions in the search space.
  2. Until a termination criterion is met (number of iterations or required optimality), repeat the following for each agent \(x_i\):
    • Pick three distinct agents \(a, b\), and \(c\) from the population at random, all different from \(x_i\).
    • Pick a random index \(R \in \{1,...,n\}\), where \(n\) is the dimensionality of the problem.
    • Compute a temporary vector \(y\) as follows:

      \[y = a + F (b-c)\]
    • Now, for each \(j \in \{1,...,n\}\), pick a uniformly distributed number \(r_j \sim U(0, 1)\).
    • If \(r_j \lt CR\) or \(j = R\), then
      • set \(x_{I, j} = y_{j}\)
    • Otherwise, set \(x_{I, j} = x_{i, j}\)

    • If \(f(x_{I}) \lt f(x_i)\) (\(f\) is the cost function for minimization), then
      • replace \(x_i\) with the trial vector \(x_{I}\).
    • Otherwise, \(x_i\) remains unchanged.
  3. Pick the agent from the population that has the highest fitness or lowest cost function value as the solution.
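Before moving on to the class-based implementation, here is a minimal NumPy sketch of a single DE generation to make the steps above concrete (the function de_generation and its names are illustrative, not part of the implementation that follows):

import numpy as np

def de_generation(pop, f, F=0.48, CR=0.4, rng=None):
    """Run one DE generation in place.

    pop is a (P, n) array of candidate vectors; f maps a vector to a scalar cost.
    """
    rng = rng or np.random.default_rng()
    P, n = pop.shape
    for i in range(P):
        # Three distinct agents a, b, c, all different from the current one
        idx = [j for j in range(P) if j != i]
        a, b, c = pop[rng.choice(idx, size=3, replace=False)]
        R = rng.integers(n)              # this index always takes the mutated value
        cross = rng.random(n) < CR
        cross[R] = True
        y = np.where(cross, a + F * (b - c), pop[i])
        if f(y) < f(pop[i]):             # greedy selection
            pop[i] = y
    return pop

For instance, de_generation(np.random.uniform(-10, 10, (50, 2)), lambda v: float(np.sum(v ** 2))) runs one generation of 50 agents on the 2-D sphere function.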

Implementing the Algorithm

The directory structure for the code follows the design as given below:

.
├── differential_evolution.py
└── helpers
    ├── __init__.py
    ├── point.py
    ├── population.py
    └── test_functions.py

Here, differential_evolution.py is the main file we’ll run to execute the algorithm. The helpers directory consists of helper classes and functions for several operations: handling the point objects and vector operations related to candidate elements (point.py), methods for handling the collection of all such points and building the population (population.py), and the test functions to be used as objective/cost functions for evaluating the efficiency of the algorithm (test_functions.py).

Building The Point Class

# helpers/point.py

import numpy as np


class Point:
    """A single candidate solution: a vector in the search space with its objective value."""

    def __init__(self, dim=2, upper_limit=10, lower_limit=-10, objective=None):
        self.dim = dim
        self.coords = np.zeros((self.dim,))
        self.z = None
        self.range_upper_limit = upper_limit
        self.range_lower_limit = lower_limit
        self.objective = objective
        self.evaluate_point()

    def generate_random_point(self):
        # Sample each coordinate uniformly within the search-space bounds
        self.coords = np.random.uniform(self.range_lower_limit, self.range_upper_limit, (self.dim,))
        self.evaluate_point()

    def evaluate_point(self):
        # Cache the objective function value for this point
        self.z = self.objective.evaluate(self.coords)

Here, we’re initializing the Point class with dim, which is the dimension of the vector, while lower_limit and upper_limit specify the domain of each co-ordinate of the vector. self.z is the objective function value of the point, stored with each instance to make it easy to rank points based on their objective function value. The evaluate_point method evaluates the objective function at the given point. The Point class creates instances of vector objects representing the individuals in the population. The collection of individuals is defined in the Population class.
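As a quick sanity check, a Point can be created and evaluated on its own (a usage sketch, assuming the Function class described below):

# usage sketch
from helpers.point import Point
from helpers.test_functions import Function

p = Point(dim=3, upper_limit=5, lower_limit=-5, objective=Function('sphere'))
p.generate_random_point()
print(p.coords, p.z)  # random coordinates in [-5, 5] and their squared norm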

The Population Class

# helpers/population.py

import numpy as np

from helpers.point import Point


class Population:
    """A collection of Point instances forming the DE population."""

    def __init__(self, dim=2, num_points=50, upper_limit=10, lower_limit=-10, init_generate=True, objective=None):
        self.points = []
        self.num_points = num_points
        self.init_generate = init_generate
        self.dim = dim
        self.range_upper_limit = upper_limit
        self.range_lower_limit = lower_limit
        self.objective = objective
        # If the initial generation flag is set, generate a random initial population
        if self.init_generate:
            for ix in range(num_points):
                new_point = Point(dim=dim, upper_limit=self.range_upper_limit,
                                  lower_limit=self.range_lower_limit, objective=self.objective)
                new_point.generate_random_point()
                self.points.append(new_point)

    def get_average_objective(self):
        # Mean objective value across the current population
        avg = 0.0
        for px in self.points:
            avg += px.z
        avg = avg / float(self.num_points)
        return avg

The Population class contains the set of Point instances acting as individuals in the population. The individuals are stored in the self.points list. The parameters of the class are num_points, which specifies the population size, and dim, upper_limit, and lower_limit as discussed above. The optional parameter init_generate controls the generation of the initial population: if set to False, the initial population will be empty and the elements will need to be added through the main procedure of the algorithm. Finally, objective refers to an object of the Function class and is the objective function (discussed in the next section). The get_average_objective method returns the mean evaluated objective value of the population.
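As a usage sketch (with hypothetical values), the initial population and its mean objective can be inspected directly:

# usage sketch
from helpers.population import Population
from helpers.test_functions import Function

pop = Population(dim=2, num_points=20, objective=Function('rastrigin'))
print(len(pop.points))              # 20 randomly initialized individuals
print(pop.get_average_objective())  # mean objective value of the initial population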

The Objective Functions

# helpers/test_functions.py

import numpy as np


class Function:
    """Wraps an objective function, selected by name or passed in directly."""

    def __init__(self, func=None):
        self.objectives = {
            'sphere': self.sphere,
            'ackley': self.ackley,
            'rosenbrock': self.rosenbrock,
            'rastrigin': self.rastrigin,
        }

        if func is None:
            # Default to the sphere function
            self.func_name = 'sphere'
            self.func = self.objectives[self.func_name]
        elif isinstance(func, str):
            # Look up a built-in test function by name
            self.func_name = func
            self.func = self.objectives[self.func_name]
        else:
            # A custom callable: must accept an ndarray and return a scalar
            self.func = func
            self.func_name = func.__name__

    def evaluate(self, point):
        return self.func(point)

    def sphere(self, x):
        # Unimodal convex function; global minimum f(0, ..., 0) = 0
        d = x.shape[0]
        f = 0.0
        for dx in range(d):
            f += x[dx] ** 2
        return f

    def ackley(self, x):
        # Multi-modal Ackley function; global minimum f(0, ..., 0) = 0
        z1, z2 = 0, 0
        for i in range(len(x)):
            z1 += x[i] ** 2
            z2 += np.cos(2.0 * np.pi * x[i])
        return (-20.0 * np.exp(-0.2 * np.sqrt(z1 / len(x)))) - np.exp(z2 / len(x)) + np.e + 20.0

    def rosenbrock(self, x):
        # Rosenbrock valley; global minimum f(1, ..., 1) = 0
        v = 0
        for i in range(len(x) - 1):
            v += 100 * (x[i + 1] - x[i] ** 2) ** 2 + (x[i] - 1) ** 2
        return v

    def rastrigin(self, x):
        # Multi-modal Rastrigin function; global minimum f(0, ..., 0) = 0
        v = 0
        for i in range(len(x)):
            v += (x[i] ** 2) - (10 * np.cos(2 * np.pi * x[i]))
        return (10 * len(x)) + v

The test_functions.py file contains the implementation of the Function class, which creates an objective function object. The parameter to the constructor is func, which can be either a string or a function. If None, the sphere function is stored in self.func; for a string, the built-in function with the same name (stored in the dictionary self.objectives) is assigned; for a function, it is assumed to accept a numpy ndarray as input and return a scalar quantity as the objective function value.
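For example, a custom objective can be passed directly as a callable (a usage sketch; the booth function here is an illustration, not part of the module):

# usage sketch: wrapping a custom objective
import numpy as np
from helpers.test_functions import Function

def booth(x):
    # Booth function: global minimum f(1, 3) = 0
    return (x[0] + 2 * x[1] - 7) ** 2 + (2 * x[0] + x[1] - 5) ** 2

f = Function(func=booth)
print(f.func_name)                       # 'booth'
print(f.evaluate(np.array([1.0, 3.0])))  # 0.0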

The objective functions implemented by default currently include the sphere, ackley, rosenbrock, and rastrigin functions. A list of optimization test functions can be found here. These are all defined in a multi-dimensional vector space and exhibit either unimodal or multi-modal properties. For example, the sphere function is a unimodal convex function, while the rastrigin function is a multi-modal non-convex function. The representation of the rastrigin function in 3-D space is shown below (the vertical axis is the value of the objective function):

[Figure: 3-D surface plot of the Rastrigin function]
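For reference, the rastrigin function implemented above is

\[f(x) = 10n + \sum_{i=1}^{n} \left( x_i^2 - 10 \cos(2 \pi x_i) \right)\]

with global minimum \(f(0, ..., 0) = 0\).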

The Differential Evolution Class

# differential_evolution.py

import copy
import random

from helpers.population import Population
from helpers import get_best_point
from helpers.test_functions import Function


class DifferentialEvolution(object):
    def __init__(self, num_iterations=10, CR=0.4, F=0.48, dim=2, population_size=10, print_status=False, func=None):
        random.seed()
        self.print_status = print_status
        self.num_iterations = num_iterations
        self.iteration = 0
        self.CR = CR
        self.F = F
        self.population_size = population_size
        self.func = Function(func=func)
        self.population = Population(dim=dim, num_points=self.population_size, objective=self.func)

    def iterate(self):
        for ix in range(self.population.num_points):
            x = self.population.points[ix]
            # Pick three distinct agents, all different from the current one
            [a, b, c] = random.sample(self.population.points, 3)
            while x == a or x == b or x == c:
                [a, b, c] = random.sample(self.population.points, 3)

            # Random index that is guaranteed to take the mutated value
            R = random.randrange(x.dim)
            y = copy.deepcopy(x)

            for iy in range(x.dim):
                ri = random.random()
                # Binomial crossover: mutate this coordinate with probability CR
                if ri < self.CR or iy == R:
                    y.coords[iy] = a.coords[iy] + self.F * (b.coords[iy] - c.coords[iy])

            # Greedy selection: keep the trial vector only if it improves the cost
            y.evaluate_point()
            if y.z < x.z:
                self.population.points[ix] = y
        self.iteration += 1

    def simulate(self):
        pnt = get_best_point(self.population.points)
        print("Initial best value: " + str(pnt.z))
        while self.iteration < self.num_iterations:
            if self.print_status and self.iteration % 50 == 0:
                pnt = get_best_point(self.population.points)
                print(pnt.z, self.population.get_average_objective())
            self.iterate()

        pnt = get_best_point(self.population.points)
        print("Final best value: " + str(pnt.z))
        return pnt.z

Here, in the DifferentialEvolution class, the initializing parameters are:

  1. num_iterations controls the number of generations/iterations the optimization loop runs for, and acts as the stopping criterion.
  2. CR and F are the Crossover Probability and the Differential Weight as defined in the algorithm.
  3. dim is the number of dimensions of the individual vectors (the size of the vector space, \(x \in R^n\), where \(x\) is an individual vector).
  4. population_size is passed to the Population class and the population object is stored in self.population.
  5. print_status is a boolean value used for verbosity (prints the best objective function value every 50 iterations).
  6. func accepts either the function name or the actual function and is used to create the self.func object, which is an instance of the Function class.
  7. self.iteration keeps track of the current iteration/generation.

There are essentially two member functions, self.iterate and self.simulate. The self.iterate function runs one iteration of the Differential Evolution procedure, applying the mutation and crossover operations to each individual in the population, while the self.simulate function calls the iterate function until the stopping criterion is met and then prints the best value of the objective function.
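One helper used above, get_best_point, is imported from the helpers package but not shown in this post; a minimal sketch consistent with its usage (assuming it lives in helpers/__init__.py) would be:

# helpers/__init__.py (sketch)

def get_best_point(points):
    # Return the point with the lowest objective value (best for minimization)
    return min(points, key=lambda p: p.z)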

Demo

Now that we have an implementation for all the required classes for the Differential Evolution algorithm, we can write a small script to test everything out and see the results.

# demo.py

from differential_evolution import DifferentialEvolution
import datetime

if __name__ == '__main__':
    number_of_runs = 5
    val = 0
    print_time = True

    for i in range(number_of_runs):
        start = datetime.datetime.now()
        de = DifferentialEvolution(num_iterations=200, dim=50, CR=0.4, F=0.48, population_size=75, print_status=False, func='sphere')
        val += de.simulate()
        if print_time:
            print("\nTime taken:", datetime.datetime.now() - start)
    print('-' * 80)
    print("\nFinal average of all runs:", val / number_of_runs)

This script initializes the variables number_of_runs, val, and print_time. number_of_runs initiates several runs of the algorithm, after which the average outcome of the optimized objective function is reported. val accumulates the optimized objective function value of each run and is later used to compute the average. print_time is a boolean which controls whether the computation time should be printed for each run.

The output of the above code, i.e. using the differential evolution algorithm to optimize the sphere test function in 50 dimensions (a 50-D vector space), running for 200 iterations in each run, is the following:

# Output

Initial best value: 1285.50913073
Final best value: 0.0258755727525

Time taken: 0:00:05.931056
Initial best value: 1218.54112743
Final best value: 0.0323126608382

Time taken: 0:00:05.560921
Initial best value: 1253.1145944
Final best value: 0.0340955810298

Time taken: 0:00:06.081233
Initial best value: 1298.5615981
Final best value: 0.0439433666035

Time taken: 0:00:04.511034
Initial best value: 1228.13894559
Final best value: 0.0405344973595

Time taken: 0:00:05.081286
--------------------------------------------------------------------------------

Final average of all runs: 0.0353523357167

The plots of the objective function value against the number of iterations for the sphere test function in 50D and the Rastrigin test function in 50D are shown below:

[Figure: objective function value vs. iterations for the sphere function in 50D]

[Figure: objective function value vs. iterations for the Rastrigin function in 50D]

The code is available in a github repository here.