Evolutionary Algorithms I: Differential Evolution
Evolutionary Algorithms are a family of meta-heuristic search algorithms for global optimization, inspired by biological evolution. The possible solutions usually span an n-dimensional vector space over the problem domain, and we simulate a population of candidate solutions that evolves toward a global optimum.
An optimization problem, in its basic form, consists of maximizing or minimizing a real function by choosing values from a pool of possible solution elements (vectors) according to the procedural instructions of the algorithm. Evolutionary approaches usually follow a specific strategy, with different variations, to select candidate elements from the population set and apply crossover and/or mutation to modify them while trying to improve their quality.
These algorithms can be applied to several interesting problems and have been shown to perform well in optimizing NP-hard problems, including the Travelling Salesman Problem, Job-Shop Scheduling, and Graph Coloring, while also finding applications in domains such as Signals and Systems, Mechanical Engineering, and mathematical optimization.
One such algorithm belonging to the family of Evolutionary Algorithms is the Differential Evolution (DE) algorithm. In this post, we shall discuss a few properties of the Differential Evolution algorithm while implementing it in Python (github link) for optimizing a few test functions.
Differential Evolution
DE approaches an optimization problem iteratively, trying to improve a set of candidate solutions with respect to a given measure of quality (the cost function). These algorithms fall under meta-heuristics since they make few or no assumptions about the problem being optimized and can search very large spaces of possible solution elements. The algorithm maintains a population of candidate solutions subjected to iterations of recombination, evaluation, and selection. Creating a new candidate solution requires applying a linear operation to elements selected from the population, using a parameter \(F\) called the differential weight, to generate a donor vector, and then randomly applying crossover based on the crossover probability parameter \(CR\).
The algorithm follows the steps listed below (a minimal sketch of one update step in Python follows the list):
- Initialize a population of \(P\) agents/elements \(x\) with random positions in the search space.
- Until a termination criterion is met (number of iterations or required optimality), repeat the following for each agent \(x_i\):
- Pick three distinct agents \(a, b\), and \(c\) from the population at random.
- Pick a random index \(R \in \{1,...,n\}\) (\(n\) is the dimensionality of the problem).
- Compute a temporary (donor) vector \(y\) as follows: \[y = a + F (b-c)\]
- Now, for each \(j \in \{1,...,n\}\), pick a uniformly distributed number \(r_j \sim U(0, 1)\).
  - If \(r_j \lt CR\) or \(j = R\), set the trial vector's component \(x'_{j} = y_{j}\).
  - Otherwise, set \(x'_{j} = x_{i, j}\).
- If \(f(x') \lt f(x_i)\) (\(f\) is the cost function for minimization), replace \(x_i\) with the trial vector \(x'\); otherwise, \(x_i\) remains unchanged.
- Pick the agent from the population that has the highest fitness or lowest cost function value as the solution.
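A minimal sketch of one such update step, assuming the population is a list of numpy arrays and \(f\) is the cost function (the values of F and CR here are illustrative defaults, not prescriptions):

```python
import numpy as np

def de_step(population, i, f, F=0.8, CR=0.9):
    """One DE update of agent i: mutation, binomial crossover, greedy selection.

    population is a list of 1-D numpy arrays; f maps an array to a scalar cost.
    """
    x = population[i]
    n = x.size
    # Pick three distinct agents a, b, c, all different from x.
    others = [j for j in range(len(population)) if j != i]
    a, b, c = (population[j] for j in np.random.choice(others, 3, replace=False))
    y = a + F * (b - c)                  # donor vector
    mask = np.random.rand(n) < CR        # uniform draws r_j < CR
    mask[np.random.randint(n)] = True    # index R always crosses over
    trial = np.where(mask, y, x)         # binomial crossover
    # Greedy selection: keep the better of the trial and the current agent.
    return trial if f(trial) < f(x) else x
```

Applying de_step once to every index \(i\) constitutes one generation of the algorithm.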
Implementing the Algorithm
The directory structure for the code follows the design as given below:
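A layout consistent with the files described below (the __init__.py making helpers importable as a package is an assumption):

```
differential_evolution.py
helpers/
    __init__.py
    point.py
    collection.py
    test_functions.py
```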
Here, differential_evolution.py is the main file we’ll run to execute the algorithm. The helpers directory consists of helper classes and functions for several operations: handling the point objects and vector operations related to candidate elements (point.py), methods for handling the collection of all such points and building the population (collection.py), and test functions to be used as objective/cost functions for testing the efficiency of the algorithm (test_functions.py).
Building The Point Class
Here, we’re initializing the Point class with dim, which is the dimension size of the vector; lower_limit and upper_limit specify the domain of each co-ordinate of the vector. self.z is the objective function value of the point, associated with each instance to make it easy to rank points based on their objective function value. The evaluate_point function runs the objective function for the given point on the test function. The Point class creates instances of vector objects signifying each individual in the population. The collection of individuals is defined in the Population class.
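A minimal sketch consistent with this description (the coords attribute name is an assumption; the exact signatures in the repository may differ):

```python
import numpy as np

class Point:
    """A candidate solution: a dim-dimensional vector and its objective value."""

    def __init__(self, dim, lower_limit, upper_limit):
        self.dim = dim
        self.lower_limit = lower_limit
        self.upper_limit = upper_limit
        # Random position inside the box [lower_limit, upper_limit]^dim.
        self.coords = np.random.uniform(lower_limit, upper_limit, dim)
        self.z = None  # objective value, filled in by evaluate_point

    def evaluate_point(self, func):
        """Run the objective (test) function on this point and cache the value."""
        self.z = func(self.coords)
        return self.z
```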
The Population Class
The Population class contains the set of Point class instances acting as individuals in the population. The individuals are stored in the self.points list. The parameters of the class are num_points, which specifies the population size, and dim, upper_limit, and lower_limit as discussed above. As optional parameters, init_generate controls the generation of the initial population, and objective refers to an object of the Function class and is the objective function (discussed in the next section). If init_generate is set to False, the initial population will be empty and the elements will need to be added through the main procedure of the algorithm. The get_average_objective function returns the mean evaluated objective value of the population.
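A sketch along these lines, reusing the Point sketch above (again, the repository's exact code may differ):

```python
import numpy as np

class Population:
    """The collection of Point instances making up the DE population."""

    def __init__(self, num_points, dim, upper_limit, lower_limit,
                 init_generate=True, objective=None):
        self.num_points = num_points
        self.dim = dim
        self.upper_limit = upper_limit
        self.lower_limit = lower_limit
        self.objective = objective  # a Function object (see next section)
        self.points = []
        if init_generate:
            # Build and evaluate a random initial population; objective is
            # expected to be a Function instance when init_generate is True.
            for _ in range(num_points):
                p = Point(dim, lower_limit, upper_limit)
                p.evaluate_point(objective.func)
                self.points.append(p)

    def get_average_objective(self):
        """Mean evaluated objective value over the population."""
        return np.mean([p.z for p in self.points])
```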
The Objective Functions
The test_functions.py file contains the implementation of the Function class, which creates an objective function object. The parameter to the constructor is func, which can be either a string or a function. If None, it stores the sphere function in self.func; otherwise it checks for a string value. For a string, it assigns the function with the same name implemented in the class (stored in the dictionary self.objectives). For a function, it is assumed that the function accepts a numpy ndarray as input and returns a scalar as the objective function value.
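A sketch of this dispatch logic, with two of the built-in test functions filled in for illustration (the others follow the same pattern):

```python
import numpy as np

class Function:
    """Wraps an objective function chosen by name, by callable, or by default."""

    def __init__(self, func=None):
        # Name -> implementation table for the built-in test functions.
        self.objectives = {'sphere': self.sphere, 'rastrigin': self.rastrigin}
        if func is None:
            self.func = self.sphere            # default objective
        elif isinstance(func, str):
            self.func = self.objectives[func]  # look up by name
        else:
            self.func = func                   # assumed ndarray -> scalar

    @staticmethod
    def sphere(x):
        """Unimodal, convex; global minimum 0 at the origin."""
        return np.sum(x ** 2)

    @staticmethod
    def rastrigin(x):
        """Multi-modal, non-convex; global minimum 0 at the origin."""
        return 10 * x.size + np.sum(x ** 2 - 10 * np.cos(2 * np.pi * x))
```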
The objective functions implemented by default currently include the sphere, ackley, rosenbrock, and rastrigin functions. A list of optimization test functions can be found here. These are all defined in a multi-dimensional vector space and exhibit either unimodal or multi-modal properties. For example, the sphere function is a unimodal convex function, while the rastrigin function is a multi-modal non-convex function. The representation of the rastrigin function in a 3-D space is shown (the vertical axis is the value of the objective function):
The Differential Evolution Class
Here, in the DifferentialEvolution class, the initializing parameters are:
- num_iteration controlling the number of generations/iterations the optimization loop runs. Acts as the stopping criterion.
- CR and F are the Crossover Probability and the Differential Weight as defined in the algorithm.
- dim is the number of dimensions of the individual vectors (the dimension \(n\) of the vector space, \(x \in \mathbb{R}^n\), where \(x\) is an individual vector).
- population_size is passed to the Population class and the population object is stored in self.population.
- print_status is a boolean value used for verbosity (prints the best objective function value at each iteration).
- func accepts either the function name or the actual function and is used to create the self.func object, which is an instance of the Function class.
- self.iteration keeps track of the current iteration/generation.
There are essentially two member functions, self.iterate and self.simulate. The self.iterate function runs one iteration of the Differential Evolution procedure by applying the transformation operation and crossover to each individual in the population, and the self.simulate function calls the iterate function until the stopping criterion is met, then prints the best value of the objective function.
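Tying the pieces together, a compact sketch of this class built on the Population and Function sketches above (the default parameter values and the search bounds here are illustrative, not the repository's):

```python
import numpy as np

class DifferentialEvolution:
    """DE optimizer built on the Population and Function sketches above."""

    def __init__(self, num_iteration=200, CR=0.9, F=0.8, dim=50,
                 population_size=20, print_status=False, func=None):
        self.num_iteration = num_iteration
        self.CR, self.F = CR, F
        self.func = Function(func)
        self.population = Population(population_size, dim,
                                     upper_limit=10, lower_limit=-10,
                                     objective=self.func)
        self.print_status = print_status
        self.iteration = 0

    def iterate(self):
        """Run one DE generation over every individual in the population."""
        pts = self.population.points
        for i, x in enumerate(pts):
            # Mutation: three distinct agents, none equal to x_i.
            others = [j for j in range(len(pts)) if j != i]
            a, b, c = (pts[j].coords
                       for j in np.random.choice(others, 3, replace=False))
            donor = a + self.F * (b - c)
            # Binomial crossover with one forced component R.
            mask = np.random.rand(x.dim) < self.CR
            mask[np.random.randint(x.dim)] = True
            trial = np.where(mask, donor, x.coords)
            # Greedy selection: replace x_i only if the trial is better.
            z = self.func.func(trial)
            if z < x.z:
                x.coords, x.z = trial, z
        self.iteration += 1

    def simulate(self):
        """Call iterate until the stopping criterion, then report the best value."""
        while self.iteration < self.num_iteration:
            self.iterate()
            if self.print_status:
                print(self.iteration, min(p.z for p in self.population.points))
        best = min(p.z for p in self.population.points)
        print('Best objective value:', best)
        return best
```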
Demo
Now that we have an implementation for all the required classes for the Differential Evolution algorithm, we can write a small script to test everything out and see the results.
This script initializes the variables number_of_runs, val, and print_time. number_of_runs sets how many runs of the algorithm to perform; the average outcome of the optimized objective function is reported after those runs. val stores the optimized objective function value for each run and is later used to compute the average. print_time is a boolean that controls whether the computation time should be printed for each run.
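A driver script along those lines (the number of runs here is illustrative; the dimension and iteration count match the run described below):

```python
import time

number_of_runs = 5      # average the final objective over this many runs
val = 0                 # accumulates the optimized objective value of each run
print_time = True       # whether to print the wall-clock time per run

for i in range(number_of_runs):
    start = time.time()
    de = DifferentialEvolution(num_iteration=200, dim=50, func='sphere')
    val += de.simulate()
    if print_time:
        print('Run %d took %.2f seconds' % (i + 1, time.time() - start))

print('Average objective over %d runs: %f' % (number_of_runs, val / number_of_runs))
```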
The output for the above code, i.e. using the differential evolution algorithm to optimize the sphere test function in 50 dimensions (a 50-D vector space) and running for 200 iterations in each run, is the following:
The plots of the objective function value against the iterations for the sphere test function in 50-D and the rastrigin test function in 50-D are shown below:
The code is available in a github repository here.