Label repopulation through evolutionary computation for addressing challenging multilabel problems
Author
García-Pedrajas, Nicolás
Romero-del-Castillo, Juan A.
Haro García, Aída de
Publisher
Elsevier
Date
2025
Subject
Evolutionary computation
Multilabel classification
Label repopulation
Missing labels
Abstract
Multilabel classification has recently attracted considerable attention from the research community in data mining. Multilabel classification is concerned with learning where each instance can be associated with multiple classes (or labels). One of the characteristics of many multilabel problems is the low density of relevant labels. This fact makes the classification problem challenging, as there is only sparse evidence to predict most labels. In this paper, we propose a new method for improving the performance of any multilabel method using a label repopulation strategy. We assume that denser datasets may improve the performance of the algorithms. This assumption is based on the fact that adding new labels may make learning the separation surfaces easier; we do not assume that the added labels correspond to actual relevant labels absent from the dataset due to erroneous labeling. Given the uncertainty of which new relevant labels could enhance the learned models, we approach the task as an optimization problem, utilizing evolutionary algorithms to identify the optimal set of new labels. An extensive comparison using 45 datasets and nine different classification models demonstrates the advantageous performance of our approach. The method is applicable to any multilabel classification model.
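The optimization framing described in the abstract can be illustrated with a minimal sketch. This is not the authors' algorithm: the encoding (a binary mask over currently absent labels), the truncation selection, the mutation operator, and every name used (`label_density`, `apply_mask`, `mutate`, `repopulate`, `fitness`) are assumptions made for illustration only. In a realistic setting `fitness` would presumably train a multilabel classifier on the repopulated labels and score it on held-out data against the original labels; here it is a toy stand-in so the example stays self-contained.

```python
import random

def label_density(Y):
    """Fraction of (instance, label) pairs marked as relevant."""
    total = sum(len(row) for row in Y)
    return sum(sum(row) for row in Y) / total

def apply_mask(Y, mask):
    """Return a copy of Y with extra labels switched on; original 1s are kept."""
    return [[y | m for y, m in zip(yrow, mrow)] for yrow, mrow in zip(Y, mask)]

def mutate(mask, Y, rate, rng):
    """Flip each mask bit with probability `rate`, only where the original label is 0."""
    return [[(m ^ 1) if (y == 0 and rng.random() < rate) else m
             for y, m in zip(yrow, mrow)]
            for yrow, mrow in zip(Y, mask)]

def repopulate(Y, fitness, generations=50, pop_size=10, rate=0.1, seed=0):
    """Hypothetical evolutionary search for a set of added labels (assumed design)."""
    rng = random.Random(seed)
    empty = [[0] * len(row) for row in Y]  # "add nothing" is always a candidate
    population = [empty] + [mutate(empty, Y, rate, rng) for _ in range(pop_size - 1)]
    for _ in range(generations):
        scored = sorted(population,
                        key=lambda m: fitness(apply_mask(Y, m)),
                        reverse=True)
        parents = scored[: pop_size // 2]  # elitist truncation selection
        children = [mutate(rng.choice(parents), Y, rate, rng)
                    for _ in range(pop_size - len(parents))]
        population = parents + children
    best = max(population, key=lambda m: fitness(apply_mask(Y, m)))
    return apply_mask(Y, best)

# Toy usage: a sparse label matrix and a stand-in fitness (an assumption, not
# the paper's objective) that simply rewards a label density close to 0.5.
Y = [[1, 0, 0, 0],
     [0, 0, 1, 0],
     [0, 0, 0, 0]]
toy_fitness = lambda Z: -abs(label_density(Z) - 0.5)
Z = repopulate(Y, toy_fitness)
```

Because the empty mask is part of the initial population and the best candidates are always carried over, the returned label matrix never scores worse than the original under the supplied `fitness`, and no original relevant label is ever removed.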

