Heuristics and Machine Learning to Improve Symbolic Computation Algorithms
: Speeding Up Cylindrical Algebraic Decomposition

Student thesis: Doctoral ThesisDoctor of Philosophy

Abstract

This thesis investigates the optimisation of Cylindrical Algebraic Decomposition (CAD) through the right choice of variable ordering, offering methodologies that have the potential to optimise other symbolic computation algorithms similarly. Different chapters of this thesis study how humans can propose informed heuristics to make these choices, how machine learning can be trained to substitute heuristics, and finally, it investigates how insights can be drawn from trained machine learning models to create simple heuristics that are interpretable to humans and often easier to include in implementations.
After motivating the importance of choosing the right variable ordering when running the CAD algorithm we go on to present the methodology that will be used to evaluate the different strategies for choosing the variable ordering. Then, the complexity analysis of the CAD algorithm is used to motivate a simple heuristic that outperforms all existing heuristics created by experts in the field.
Machine learning models have been able to outperform experts in many areas, and researchers had attempted to train models to make this choice. During an exploration of the dataset used by existing research to train their models, we discovered that this dataset had issues previously uncovered that affect the conclusions drawn from it. We propose a methodology to improve the use of the dataset, resulting in a reduction of 38% of the total time needed to build a CAD for each of the sets of polynomials in our testing dataset.
To structure the approaches that can be taken when using machine learning we present a categorisation of machine learning paradigms with respect to their input, feedback mechanism, and output. This categorisation motivated a shift of paradigm from classification to regression to increase the information provided to the models. This paradigm change did not substantially improve performance on average, but it generated the best-performing model in this thesis. Moreover, even though for choosing the variable ordering in CAD this methodology is still limited to a fixed number of variables, it has more potential applications for other symbolic computation choices.
Our last contribution regarding machine learning was proposing a methodology for training models capable of making a series of choices rather than a single choice. This methodology has an even broader potential for making choices in symbolic computation, and in particular, for our case study it can choose a variable ordering for sets of polynomials in an arbitrary number of variables while not losing performance in our three-variable testing dataset.
Finally, we close the cycle coming back to heuristics by being the first researchers in symbolic computation to make use of Explainable AI to extract insights from trained machine learning models, which otherwise behave as black boxes, to deduce heuristics that can be understood by humans. Thanks to this pioneering effort, we were able to uncover a new state-of-the-art heuristic, highlighting the potential of explainable AI in revealing deeper understandings within symbolic computation.
Date of AwardJun 2024
Original languageEnglish
Awarding Institution
  • Coventry University
SupervisorMatthew England (Supervisor) & AmirHosein Sadeghimanesh (Supervisor)

Keywords

  • Machine Learning
  • Symbolic Computation
  • Cylindrical Algebraic Decomposition
  • Heuristics

Cite this

'