Scalable CAIM Discretization on Multiple GPUs Using Concurrent Kernels
Ventura Soto, S.
Cios, Krzysztof J.
Parallel implementation of CAIM algorithm
METS:Mostrar el registro METS
PREMIS:Mostrar el registro PREMIS
MetadataShow full item record
CAIM(Class-Attribute InterdependenceMaximization) is one of the stateof- the-art algorithms for discretizing data for which classes are known. However, it may take a long time when run on high-dimensional large-scale data, with large number of attributes and/or instances. This paper presents a solution to this problem by introducing a GPU-based implementation of the CAIM algorithm that significantly speeds up the discretization process on big complex data sets. The GPU-based implementation is scalable to multiple GPU devices and enables the use of concurrent kernels execution capabilities ofmodernGPUs. The CAIMGPU-basedmodel is evaluated and compared with the original CAIM using single and multi-threaded parallel configurations on 40 data sets with different characteristics. The results show great speedup, up to 139 times faster using 4 GPUs, which makes discretization of big data efficient and manageable. For example, discretization time of one big data set is reduced from 2 hours to less than 2 minutes