Please use this identifier to cite or link to this item:
https://ri.ujat.mx/handle/200.500.12107/6016
Title: | Differential-Evolution-based methods for inducing Decision Trees |
metadata.dc.creator: | Rivera López, Rafael |
metadata.dc.creator.id: | 0000-0002-5254-4195 |
Abstract: | This thesis describes the application of the differential evolution algorithm to induce oblique and axis-parallel decision trees. The differential evolution algorithm distinguishes itself as a simple and straightforward metaheuristic that has been successfully applied to efficiently solve a large number of problems whose parameters are real-valued variables, producing better results than those obtained by other approaches. Although the differential evolution algorithm has already been utilized in data mining tasks, its use for inducing decision trees has been limited to a single approach that builds oblique trees through a global search. In this thesis, the differential evolution algorithm is applied to induce decision trees through two strategies: by using it inside a traditional recursive partitioning scheme, and by employing it to conduct a global search in the space of possible trees. In the first case, the metaheuristic searches for the hyperplane coefficients that best split a set of training instances, optimizing some splitting criterion. With this scheme, the differential evolution algorithm is applied as many times as internal nodes are required to build the oblique decision tree. In the second case, the evolutionary algorithm carries out a global search for a single near-optimal decision tree. Each individual in the population encodes only the internal nodes of a complete binary decision tree, stored in a fixed-length real-valued vector. The size of this vector is determined by both the number of attributes and the number of class labels of the training set whose model is induced. To obtain a feasible decision tree, the vector is first analyzed to build a partial tree containing only internal nodes, and then the training set is used to insert the corresponding leaf nodes. Two types of decision trees can be obtained using the global search strategy: oblique and axis-parallel decision trees. 
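As an illustration of the recursive strategy described above, the per-node search can be sketched as a standard DE/rand/1/bin loop that minimizes a splitting criterion over hyperplane coefficients. This is a minimal sketch, not the thesis's implementation: the Gini impurity criterion, the parameter values, and all function names are illustrative assumptions.

```python
import random

def gini(labels):
    # Gini impurity of a collection of class labels (illustrative criterion)
    n = len(labels)
    if n == 0:
        return 0.0
    counts = {}
    for y in labels:
        counts[y] = counts.get(y, 0) + 1
    return 1.0 - sum((c / n) ** 2 for c in counts.values())

def split_impurity(w, X, y):
    # Weighted Gini impurity of the oblique split coeffs.x + bias <= 0
    coeffs, bias = w[:-1], w[-1]
    left, right = [], []
    for xi, yi in zip(X, y):
        side = sum(a * b for a, b in zip(coeffs, xi)) + bias
        (left if side <= 0 else right).append(yi)
    n = len(y)
    return (len(left) * gini(left) + len(right) * gini(right)) / n

def de_hyperplane(X, y, pop_size=20, gens=50, F=0.5, CR=0.9):
    # DE/rand/1/bin search for hyperplane coefficients plus a bias term
    dim = len(X[0]) + 1
    pop = [[random.uniform(-1, 1) for _ in range(dim)] for _ in range(pop_size)]
    fit = [split_impurity(ind, X, y) for ind in pop]
    for _ in range(gens):
        for i in range(pop_size):
            # mutation: three distinct individuals other than the target
            a, b, c = random.sample([j for j in range(pop_size) if j != i], 3)
            j_rand = random.randrange(dim)
            # binomial crossover between target and mutant
            trial = [pop[a][k] + F * (pop[b][k] - pop[c][k])
                     if (random.random() < CR or k == j_rand) else pop[i][k]
                     for k in range(dim)]
            t_fit = split_impurity(trial, X, y)
            if t_fit <= fit[i]:  # keep the trial if it splits no worse
                pop[i], fit[i] = trial, t_fit
    best = min(range(pop_size), key=fit.__getitem__)
    return pop[best], fit[best]
```

In the recursive scheme, a loop like this would run once per internal node, each time on the subset of training instances reaching that node.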
Using the differential evolution algorithm to find the near-optimal hyperplanes of an oblique decision tree is very intuitive, since the hyperplane coefficient values are taken from a continuous space and this metaheuristic was devised to optimize real-valued vectors. However, constructing axis-parallel decision trees encoded as real-valued vectors is a more complicated task, because each of their internal nodes uses only one attribute to split the training instances. This thesis introduces a procedure for selecting the attribute used in each test condition of this type of decision tree. In this procedure, a mapping scheme that uses the smallest-position-value rule together with the training instances is successfully applied to build a feasible axis-parallel decision tree from an individual in the population. Once the evolutionary process reaches its stopping condition, the best individual in the final population is refined by replacing non-optimal leaf nodes with subtrees, and is then pruned to reduce the possible overfitting introduced by this refinement. This procedure allows inducing feasible decision trees with varying numbers of nodes, even though they are represented using a fixed-length parameter vector. To obtain reliable estimates of the predictive performance of the two strategies implemented in this thesis, and to compare their results with those achieved by other classification methods, a repeated stratified tenfold cross-validation procedure is applied in the experimental study. Since the evolutionary process that finds a near-optimal decision tree uses the training accuracy of each tree as its fitness value, the decision trees in the last population could be overtrained, and the tree with the best training accuracy in this population could show poor predictive ability. To mitigate the effects of this overtraining, this thesis introduces an alternative scheme for selecting one decision tree from the population of trained trees. 
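The smallest-position-value mapping can be illustrated as follows. This is a minimal sketch under assumed conventions (the segment layout and function names are not the thesis's exact encoding): each internal node owns one length-d segment of the real-valued vector, where d is the number of attributes, and the index of the smallest entry in the segment selects the attribute for that node's test condition.

```python
def spv_attribute(segment):
    # smallest-position-value rule: the attribute chosen for a node is
    # the index of the smallest entry in that node's segment
    return min(range(len(segment)), key=lambda i: segment[i])

def decode_internal_nodes(vector, n_nodes, n_attrs):
    # Split a fixed-length real-valued vector into one segment per
    # internal node and map each segment to an attribute index.
    nodes = []
    for k in range(n_nodes):
        seg = vector[k * n_attrs:(k + 1) * n_attrs]
        nodes.append(spv_attribute(seg))
    return nodes
```

For example, decoding the vector [0.9, -0.3, 0.5, 0.1, 0.2, -0.7] with two internal nodes and three attributes assigns attribute 1 to the first node and attribute 2 to the second; the split thresholds would then be derived from the training instances.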
This scheme uses a subset of instances of the dataset, not utilized in the cross-validation process, to determine an independent accuracy for each decision tree in the final population and to select the best tree according to this new value. |
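The selection scheme can be sketched as re-ranking the final population on the held-out subset. This is an illustrative sketch, not the thesis's code: `predict` stands for whatever routine classifies an instance with a given tree, and all names are assumptions.

```python
def select_best_tree(population, X_holdout, y_holdout, predict):
    # Re-rank the final population on held-out instances (kept out of
    # the cross-validation folds) and return the tree with the best
    # independent accuracy, rather than the best training accuracy.
    def independent_accuracy(tree):
        hits = sum(1 for x, t in zip(X_holdout, y_holdout)
                   if predict(tree, x) == t)
        return hits / len(y_holdout)
    return max(population, key=independent_accuracy)
```

The point of the design is that the held-out instances were never seen during evolution, so an overtrained tree that merely memorized the training set no longer has an advantage in this final ranking.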
Issue Date: | 1-Feb-2018 |
metadata.dc.rights.license: | http://creativecommons.org/licenses/by-nc/4.0 |
URI: | https://ri.ujat.mx/handle/200.500.12107/6016 |
metadata.dc.language.iso: | eng |
Appears in Collections: | Doctorado en Ciencias de la Computación (PNPC) |
Files in This Item:
File | Description | Size | Format | |
---|---|---|---|---|
Rafael Rivera López.pdf | | 39,41 MB | Adobe PDF | View/Open |
Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.