New Hybrid Learning Models for Multi-label Classification and Label Ranking

Reyes Pupo, Oscar Gabriel

dc.contributor.advisor	Ventura Soto, S.
dc.contributor.author	Reyes Pupo, Oscar Gabriel
dc.date.accessioned	2016-11-25T11:19:17Z
dc.date.available	2016-11-25T11:19:17Z
dc.date.issued	2016
dc.identifier.uri	http://hdl.handle.net/10396/14095
dc.description.abstract	Over the last decade, multi-label learning has become an important research task, largely because of the growing number of real-world problems that involve multi-label data. This thesis studied two problems on multi-label data: improving the performance of algorithms on complex multi-label data, and improving the performance of algorithms by using unlabeled data. The first problem was addressed by means of feature estimation methods. The effectiveness of the proposed feature estimation methods in improving the performance of neighborhood-based algorithms was evaluated by parametrizing the distance functions used to retrieve the nearest examples. The effectiveness of the estimation methods in the feature selection task was also demonstrated. In addition, a neighborhood-based algorithm inspired by the data gravitation-based classification approach was developed. This algorithm provides an adequate balance between efficiency and effectiveness on complex multi-label data. The second problem was addressed by means of active learning techniques, which reduce the costs of labeling data and of training a better model. Two active learning strategies were proposed. The first strategy solves the multi-label active learning problem effectively and efficiently by combining two measures that represent the utility of an unlabeled example. The second strategy focuses on the batch-mode multi-label active learning problem: a multi-objective problem was formulated in which three measures are optimized, and the resulting optimization problem was solved with an evolutionary algorithm.
As complementary results derived from this thesis, a computational tool was developed that supports the implementation of active learning methods and experimentation in this area of study. In addition, two approaches were proposed that allow the performance of active learning techniques to be evaluated in a more adequate and robust way than is common in the literature. All methods proposed in this thesis were evaluated within an adequate experimental framework: numerous datasets were used, and the performance of the algorithms was compared against other state-of-the-art methods. The results, which were verified through non-parametric statistical tests, demonstrate the effectiveness of the proposed methods and thereby confirm the hypotheses put forward in this thesis.	es_ES
dc.description.abstract	In the last decade, multi-label learning has become an important area of research due to the large number of real-world problems that contain multi-label data. This doctoral thesis is focused on the multi-label learning paradigm. Two problems were studied: firstly, improving the performance of the algorithms on complex multi-label data, and secondly, improving the performance through unlabeled data. The first problem was solved by means of feature estimation methods. The effectiveness of the proposed feature estimation methods was evaluated by improving the performance of multi-label lazy algorithms. Parametrizing the distance functions with a weight vector made it possible to retrieve examples with relevant label sets for classification. The effectiveness of the feature estimation methods in the feature selection task was also demonstrated. In addition, a lazy algorithm based on a data gravitation model was proposed. This lazy algorithm offers a good trade-off between effectiveness and efficiency in multi-label lazy learning. The second problem was solved by means of active learning techniques. The active learning methods reduced the costs of the data labeling process and of training an accurate model. Two active learning strategies were proposed. The first strategy effectively solves the multi-label active learning problem. In this strategy, two measures that represent the utility of an unlabeled example were defined and combined. The second active learning strategy addresses the batch-mode active learning problem, where the aim is to select a batch of unlabeled examples that are informative and have minimal information redundancy. Batch-mode active learning was formulated as a multi-objective problem in which three measures were optimized. The multi-objective problem was solved through an evolutionary algorithm.
This thesis also led to the creation of a computational framework for developing active learning methods and supporting the experimentation process in the active learning area. In addition, a methodology based on non-parametric tests that allows a more adequate evaluation of active learning performance was proposed. All proposed methods were evaluated by means of extensive and adequate experimental studies. Several multi-label datasets from different domains were used, and the methods were compared to the most significant state-of-the-art algorithms. The results were validated using non-parametric statistical tests. The evidence showed the effectiveness of the proposed methods, supporting the hypotheses formulated at the beginning of this thesis.	es_ES
dc.format.mimetype	application/pdf	es_ES
dc.language.iso	eng	es_ES
dc.publisher	Universidad de Córdoba, UCOPress	es_ES
dc.rights	https://creativecommons.org/licenses/by-nc-nd/4.0/	es_ES
dc.subject	Clasificación Multi-Etiqueta	es_ES
dc.subject	Tratamiento de datos	es_ES
dc.subject	Aprendizaje híbrido	es_ES
dc.subject	Estimación de atributos	es_ES
dc.subject	Minería de datos	es_ES
dc.subject	Inteligencia artificial	es_ES
dc.subject	Multi-label classification	es_ES
dc.subject	Data treatment	es_ES
dc.subject	Hybrid learning	es_ES
dc.subject	Feature estimation	es_ES
dc.subject	Evolutionary algorithms	es_ES
dc.subject	Data Mining (DM)	es_ES
dc.subject	Artificial intelligence	es_ES
dc.title	New Hybrid Learning Models for Multi-label Classification and Label Ranking	en
dc.title.alternative	Nuevos Modelos de Aprendizaje Híbrido para Clasificación y Ordenamiento Multi-Etiqueta	es_ES
dc.type	info:eu-repo/semantics/doctoralThesis	es_ES
dc.rights.accessRights	info:eu-repo/semantics/openAccess	es_ES

Files in this item

Name: 2016000001530.pdf
Size: 2.089Mb
Format: PDF
