Symbolic Regression
This page provides an example of using the OakGP genetic programming Java framework to perform symbolic regression.
For an overview of OakGP, please read the Getting Started with OakGP guide.
Approach
Configuration
Return Type
Variable Set
Constant Set
Function Set
Java Source Code
Output
Problem Description
The aim of this example is to demonstrate how genetic programming can be used to evolve a program that best fits a given dataset. The process of generating a computer program to fit numerical data is called symbolic regression. In this example the dataset contains inputs/outputs for the expression x^2 + x + 1.
This is the same problem as described in "A Field Guide to Genetic Programming" (R. Poli, W. B. Langdon, and N. F. McPhee, with contributions by J. R. Koza, 2008).
Approach
There is no need to implement any specialised functions, types or fitness functions for this problem. The function set consists of functions provided by the org.oakgp.function.math.IntegerUtils class. The org.oakgp.rank.fitness.TestDataFitnessFunction class provides a suitable fitness function.
The genetic programming run is configured in SymbolicRegressionExample using an org.oakgp.util.RunBuilder.
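To make the role of the fitness function concrete, the following is a minimal sketch in plain Java that does not use OakGP. It assumes that a candidate's fitness is the sum of the absolute differences between expected and actual outputs over the data set, so that a perfect candidate scores 0 (matching the target fitness used in the example). The class and method names (FitnessSketch, createDataSet, fitness) are hypothetical, and the actual behaviour of TestDataFitnessFunction should be checked against the OakGP source.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.IntUnaryOperator;

public class FitnessSketch {
    /** Maps each input x in [-10, 10] to the expected output x*x + x + 1. */
    public static Map<Integer, Integer> createDataSet() {
        Map<Integer, Integer> tests = new LinkedHashMap<>();
        for (int x = -10; x <= 10; x++) {
            tests.put(x, x * x + x + 1);
        }
        return tests;
    }

    /** Assumed error measure: sum of |expected - actual| over all test cases (lower is better). */
    public static long fitness(IntUnaryOperator candidate, Map<Integer, Integer> tests) {
        long total = 0;
        for (Map.Entry<Integer, Integer> test : tests.entrySet()) {
            int actual = candidate.applyAsInt(test.getKey());
            total += Math.abs((long) test.getValue() - actual);
        }
        return total;
    }

    public static void main(String[] args) {
        Map<Integer, Integer> tests = createDataSet();
        System.out.println(fitness(x -> x * x + x + 1, tests)); // exact fit: prints 0
        System.out.println(fitness(x -> x * x + x, tests));     // off by one on all 21 cases: prints 21
    }
}
```

Under this assumption, a run can stop as soon as any candidate reaches a fitness of 0, because no candidate can fit the data better.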
Configuration
Return Type
Type | Description |
---|---|
integer | The output of applying the input value to the arithmetic expression represented by the generated candidate. |
Variable Set
ID | Type | Description |
---|---|---|
v0 | integer | The input to the generated candidate. |
Constant Set
Type | Values |
---|---|
integer | 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 |
Function Set
Class | Symbol | Return Type | Arguments |
---|---|---|---|
org.oakgp.function.math.Add | + | integer | integer, integer |
org.oakgp.function.math.Multiply | * | integer | integer, integer |
org.oakgp.function.math.Subtract | - | integer | integer, integer |
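As an illustration of how the variable, constant and function sets combine into candidate programs, here is a minimal sketch in plain Java that builds random prefix expressions from the three sets above. It is not OakGP's own tree generator: the class name RandomTreeSketch, the "grow" style strategy, the one-in-three chance of stopping early, and the interpretation of depth as the number of nested function calls are all assumptions made for this example.

```java
import java.util.Random;

public class RandomTreeSketch {
    // the function set from the table above, represented by symbol only
    private static final String[] FUNCTIONS = { "+", "-", "*" };

    /** Returns a random prefix expression with at most maxDepth levels of nested function calls. */
    public static String grow(Random random, int maxDepth) {
        // at the depth limit, or with probability 1/3, emit a terminal:
        // either the variable v0 or an integer constant in the range 0-10
        if (maxDepth == 0 || random.nextInt(3) == 0) {
            return random.nextBoolean() ? "v0" : Integer.toString(random.nextInt(11));
        }
        // otherwise emit a function applied to two randomly generated arguments
        String symbol = FUNCTIONS[random.nextInt(FUNCTIONS.length)];
        return "(" + symbol + " " + grow(random, maxDepth - 1) + " " + grow(random, maxDepth - 1) + ")";
    }

    /** Returns the deepest level of parenthesis nesting in a prefix expression. */
    public static int depth(String expression) {
        int current = 0;
        int deepest = 0;
        for (char c : expression.toCharArray()) {
            if (c == '(') {
                deepest = Math.max(deepest, ++current);
            } else if (c == ')') {
                current--;
            }
        }
        return deepest;
    }

    public static void main(String[] args) {
        Random random = new Random();
        for (int i = 0; i < 5; i++) {
            System.out.println(grow(random, 4));
        }
    }
}
```

Because every function takes two integers and returns an integer, any tree built this way is type-correct, which is why this problem needs only a single type.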
Java Source Code
package org.oakgp.examples.simple;

import java.util.HashMap;
import java.util.Map;

import org.oakgp.Assignments;
import org.oakgp.Type;
import org.oakgp.function.Function;
import org.oakgp.function.math.IntegerUtils;
import org.oakgp.node.ConstantNode;
import org.oakgp.node.Node;
import org.oakgp.rank.RankedCandidates;
import org.oakgp.rank.fitness.FitnessFunction;
import org.oakgp.rank.fitness.TestDataFitnessFunction;
import org.oakgp.util.RunBuilder;
import org.oakgp.util.Utils;

/** An example of using symbolic regression to evolve a program that best fits a given data set for the function {@code x^2 + x + 1}. */
public class SymbolicRegressionExample {
   private static final int TARGET_FITNESS = 0;
   private static final int INITIAL_POPULATION_SIZE = 50;
   private static final int INITIAL_POPULATION_MAX_DEPTH = 4;

   public static void main(String[] args) {
      // the function set will be the addition, subtraction and multiplication arithmetic operators
      Function[] functions = { IntegerUtils.INTEGER_UTILS.getAdd(), IntegerUtils.INTEGER_UTILS.getSubtract(), IntegerUtils.INTEGER_UTILS.getMultiply() };
      // the constant set will contain the integers in the range 0-10 inclusive
      ConstantNode[] constants = Utils.createIntegerConstants(0, 10);
      // the variable set will contain a single variable - representing the integer value input to the function
      Type[] variableTypes = { Type.integerType() };
      // the fitness function will compare candidates against a data set which maps inputs to their expected outputs
      FitnessFunction fitnessFunction = TestDataFitnessFunction.createIntegerTestDataFitnessFunction(createDataSet());

      RankedCandidates output = new RunBuilder().setReturnType(Type.integerType()).setConstants(constants).setVariables(variableTypes).setFunctions(functions)
            .setFitnessFunction(fitnessFunction).setInitialPopulationSize(INITIAL_POPULATION_SIZE).setTreeDepth(INITIAL_POPULATION_MAX_DEPTH)
            .setTargetFitness(TARGET_FITNESS).process();
      Node best = output.best().getNode();
      System.out.println(best);
   }

   /**
    * Returns the data set used to assess the fitness of candidates.
    * <p>
    * Creates a map of input values in the range [-10,+10] to the corresponding expected output value.
    */
   private static Map<Assignments, Integer> createDataSet() {
      Map<Assignments, Integer> tests = new HashMap<>();
      for (int i = -10; i < 11; i++) {
         Assignments assignments = Assignments.createAssignments(i);
         tests.put(assignments, getExpectedOutput(i));
      }
      return tests;
   }

   private static int getExpectedOutput(int x) {
      return (x * x) + x + 1;
   }
}
Output
Here is the solution generated by the SymbolicRegressionExample:
(+ (+ 1 v0) (* v0 v0))
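The output is written in prefix notation: each parenthesised group starts with a function symbol followed by its two arguments, and v0 is the input variable. The following sketch, in plain Java without OakGP, evaluates such an expression and checks that the solution above agrees with x^2 + x + 1 on every input in [-10, 10]. The class PrefixEvaluator is hypothetical and supports only the three functions, integer literals and the single variable used in this example.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Deque;

public class PrefixEvaluator {
    /** Evaluates a prefix expression over +, -, *, integer literals and the variable v0. */
    public static int evaluate(String expression, int v0) {
        // pad parentheses with spaces so that splitting on whitespace yields one token per symbol
        String spaced = expression.replace("(", " ( ").replace(")", " ) ").trim();
        Deque<String> tokens = new ArrayDeque<>(Arrays.asList(spaced.split("\\s+")));
        int result = parse(tokens, v0);
        if (!tokens.isEmpty()) {
            throw new IllegalArgumentException("Unexpected trailing tokens: " + tokens);
        }
        return result;
    }

    private static int parse(Deque<String> tokens, int v0) {
        String token = tokens.removeFirst();
        if (token.equals("v0")) {
            return v0;
        }
        if (!token.equals("(")) {
            return Integer.parseInt(token);
        }
        // a group has the form: ( symbol argument argument )
        String symbol = tokens.removeFirst();
        int left = parse(tokens, v0);
        int right = parse(tokens, v0);
        if (!tokens.removeFirst().equals(")")) {
            throw new IllegalArgumentException("Expected closing parenthesis");
        }
        switch (symbol) {
            case "+": return left + right;
            case "-": return left - right;
            case "*": return left * right;
            default: throw new IllegalArgumentException("Unknown function: " + symbol);
        }
    }

    public static void main(String[] args) {
        String solution = "(+ (+ 1 v0) (* v0 v0))";
        for (int x = -10; x <= 10; x++) {
            if (evaluate(solution, x) != x * x + x + 1) {
                throw new AssertionError("Mismatch for input " + x);
            }
        }
        System.out.println("Solution matches x^2 + x + 1 on all 21 inputs");
    }
}
```

Read in infix form, the evolved expression is (1 + v0) + (v0 * v0), which is the target expression with its terms reordered.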