Show simple item record

dc.contributor.author: García Pedrajas, Nicolás
dc.contributor.author: Romero-del-Castillo, Juan A.
dc.contributor.author: Cerruela García, Gonzalo
dc.date.accessioned: 2024-01-24T17:09:58Z
dc.date.available: 2024-01-24T17:09:58Z
dc.date.issued: 2021
dc.identifier.issn: 0031-3203
dc.identifier.uri: http://hdl.handle.net/10396/26740
dc.description.abstract: Data reduction is becoming increasingly relevant due to the enormous amounts of data that are constantly being produced in many fields of research. Instance selection is one of the most widely used methods for this task. At the same time, most recent pattern recognition problems involve highly complex datasets with a large number of possible explanatory variables. For many reasons, this abundance of variables significantly hinders classification and recognition tasks. There are efficiency issues, too, because the speed of many classification algorithms is greatly improved when the complexity of the data is reduced. Thus, feature selection is also a widely used method for data reduction and for gaining an understanding of feature information. Although most methods address instance and feature selection separately, the two problems are interwoven, and benefits are expected from performing these two tasks jointly. However, few algorithms have been proposed for simultaneously addressing the tasks of instance and feature selection. Furthermore, most of those methods are based on complex heuristics that are very difficult to scale up even to moderately large datasets. This paper proposes a new algorithm for dealing with many instances and many features simultaneously by performing joint instance and feature selection using a simple heuristic search and several scaling-up mechanisms that can be successfully applied to datasets with millions of features and instances. In the proposed method, a forward selection search is performed in the feature space jointly with the application of standard instance selection in a constructive subspace built stepwise. Several simplifications are adopted in the search to obtain a scalable method. An extensive comparison using 95 large datasets shows the usefulness of our method and its ability to deal with millions of instances and features simultaneously. The method is able to obtain better classification performance results than state-of-the-art approaches while achieving considerable data reduction. [es_ES]
dc.format.mimetype: application/pdf [es_ES]
dc.language.iso: eng [es_ES]
dc.publisher: Elsevier [es_ES]
dc.rights: https://creativecommons.org/licenses/by-nc-nd/4.0/ [es_ES]
dc.source: García-Pedrajas, N., Romero Del Castillo, J. A., & García, G. C. (2021). SI(FS)2: fast simultaneous instance and feature selection for datasets with many features. Pattern Recognition, 111, 107723. https://doi.org/10.1016/j.patcog.2020.107723 [es_ES]
dc.subject: Instance selection [es_ES]
dc.subject: Feature selection [es_ES]
dc.subject: Evolutionary algorithms [es_ES]
dc.subject: K nearest neighbor rule [es_ES]
dc.title: SI(FS)2: Fast simultaneous instance and feature selection for datasets with many features [es_ES]
dc.type: info:eu-repo/semantics/article [es_ES]
dc.relation.publisherversion: https://doi.org/10.1016/j.patcog.2020.107723 [es_ES]
dc.relation.projectID: Gobierno de España. PID2019-109481GB-I00 [es_ES]
dc.relation.projectID: Junta de Andalucía. UCO-1264182 [es_ES]
dc.rights.accessRights: info:eu-repo/semantics/openAccess [es_ES]
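
The abstract in this record describes a forward selection search over features combined with standard instance selection in a subspace that grows one feature at a time. The following is a minimal illustrative sketch of that general idea only, not the authors' SI(FS)2 implementation. Everything specific in it is an assumption: Hart's condensed nearest neighbor stands in for the unnamed "standard instance selection", candidates are ranked by 1-NN training accuracy with ties broken by fewer retained instances, the search stops when accuracy no longer improves, and the function names are hypothetical. The paper's scaling-up mechanisms and search simplifications are not described in the abstract and are omitted.

```python
from typing import List, Sequence, Tuple

Row = Sequence[float]


def sq_dist(a: Row, b: Row, feats: Sequence[int]) -> float:
    """Squared Euclidean distance restricted to the feature subset `feats`."""
    return sum((a[f] - b[f]) ** 2 for f in feats)


def nearest_label(x: Row, X: List[Row], y: List[int],
                  keep: Sequence[int], feats: Sequence[int]) -> int:
    """Label of the nearest retained instance to `x` in the subspace `feats`."""
    best = min(keep, key=lambda i: sq_dist(x, X[i], feats))
    return y[best]


def condensed_nn(X: List[Row], y: List[int], feats: Sequence[int]) -> List[int]:
    """Hart's condensed nearest neighbor (assumed stand-in for instance selection).

    Starts from the first instance and adds every instance that the current
    retained set misclassifies, repeating until a full pass adds nothing.
    """
    keep = [0]
    changed = True
    while changed:
        changed = False
        for i in range(len(X)):
            if i not in keep and nearest_label(X[i], X, y, keep, feats) != y[i]:
                keep.append(i)
                changed = True
    return keep


def accuracy(X: List[Row], y: List[int],
             keep: Sequence[int], feats: Sequence[int]) -> float:
    """1-NN training accuracy using only the retained instances."""
    hits = sum(nearest_label(X[i], X, y, keep, feats) == y[i] for i in range(len(X)))
    return hits / len(X)


def joint_forward_selection(X: List[Row], y: List[int]) -> Tuple[List[int], List[int]]:
    """Greedy forward feature search with instance selection at every step.

    Returns (selected feature indices, retained instance indices).
    """
    n_features = len(X[0])
    selected: List[int] = []
    best_keep: List[int] = list(range(len(X)))
    best_score = -1.0
    while len(selected) < n_features:
        candidates = []
        for f in range(n_features):
            if f in selected:
                continue
            feats = selected + [f]           # subspace built stepwise
            keep = condensed_nn(X, y, feats)  # instance selection in that subspace
            # Rank by accuracy, then by fewer retained instances (assumed criterion).
            candidates.append((accuracy(X, y, keep, feats), -len(keep), f, keep))
        score, _, f, keep = max(candidates)
        if score <= best_score:               # assumed stopping rule
            break
        selected, best_keep, best_score = selected + [f], keep, score
    return selected, best_keep


# Toy data: feature 0 separates the classes, feature 1 is noise.
X = [[0.0, 5.0], [0.1, 1.0], [0.2, 9.0], [1.0, 2.0], [1.1, 8.0], [1.2, 4.0]]
y = [0, 0, 0, 1, 1, 1]
selected, keep = joint_forward_selection(X, y)
print(selected, keep)
```

On this toy data the sketch selects only the informative feature and retains one instance per class. Each candidate evaluation reruns instance selection from scratch, which is quadratic or worse in the number of instances, so the sketch only illustrates the structure of the search and would not scale to the dataset sizes mentioned in the abstract.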

