Assessing Machine Learning Models for Gap Filling Daily Rainfall Series in a Semiarid Region of Spain

: The presence of missing data in hydrometeorological datasets is a common problem, usually due to sensor malfunction, deficiencies in records storage and transmission, or other recovery procedures issues. These missing values are the primary source of problems when analyzing and modeling their spatial and temporal variability. Thus, accurate gap ‐ filling techniques for rainfall time series are necessary to have complete datasets, which is crucial in studying climate change evolution. In this work, several machine learning models have been assessed to gap ‐ fill rainfall data, using different approaches and locations in the semiarid region of Andalusia (Southern Spain). Based on the obtained results, the use of neighbor data, located within a 50 km radius, highly outperformed the rest of the assessed approaches, with RMSE (root mean squared error) values up to 1.246 mm/day, MBE (mean bias error) values up to − 0.001 mm/day, and R 2 values up to 0.898. Besides, inland area results outperformed coastal area in most locations, arising the efficiency effects based on the distance to the sea (up to an improvement of 63.89% in terms of RMSE). Finally, machine learning (ML) models (especially MLP (multilayer perceptron)) notably outperformed simple linear regression estimations in the coastal sites, whereas in inland locations, the improvements were not such significant.


Introduction
The spatial and temporal analysis of meteorological parameters, such as rainfall is crucial to numerous environmental, hydrological, and agroclimatic studies, as well as optimizing issues, such as water resource management or irrigation scheduling [1][2][3][4]. However, one of the most common problems in time series analyses, such as rainfall datasets, is the presence of gaps of different widths, making this task harder to carry out. This usually results from malfunctioning sensors or data loggers, lack of maintenance, meteorological events, or power outages. Sometimes, the solution is not instantaneous and causes delays because it needs the interaction of qualified personnel. Therefore, before starting with analyses, a common practice is to fill these gaps using different methodologies and applying automatic detection algorithms to detect spurious signals in automated weather stations [5].
Due to its high spatio-temporal variability and the large number of interconnected variables involved, rainfall is one of the most challenging atmospheric variables to characterize, estimate, and forecast [6], especially on a daily basis, with higher volatility and chaotic patterns [7]. A variety of techniques have been developed on both a monthly and daily basis. One of the most frequent algorithms to estimate missing rainfall records is the inverse distance weighting method (IDWM), where the estimated values are calculated with a weighted average (it resorts to the inverse of the distance when assigning the weights) from neighbor stations [8,9]. Another simple method to apply is developed a deep belief network model to forecast rainfall one day ahead in Seoul, performing better than MLP. Chen et al. [29] studied the performance of convLSTM with group normalization (GN) to improve the optimization process and employ a multisigmoid loss, inspired in the critical success index (CSI) and compared to the COTREC model. COTREC obtained better performance, in terms of intensity in some areas, whereas convLSTM got a generally more reliable forecast.
This study aims to create a daily rainfall estimation model using only precipitation data, with different approaches in semiarid regions, such as Andalusia, to fill possible gaps in precipitation datasets. Additionally, a new approach is tested in daily rainfall estimations, which uses future precipitation values for this purpose. Thus, in this work, several machine learning models (MLP, SVM, and RF) and approaches for estimating missing rainfall data were tested and compared to empirical algorithms, such as linear interpolation (LI), in 14 locations from two different regions of Andalusia (coastal and inland areas) in Southern Spain. The first approach (A) uses neighbor stations' rainfall data of the same gap day and its distance to the target station. All these neighbor stations are located within a radius up to 50 km, following the recommendations of Barrios et al. [9] on a monthly basis and Estévez et al. [30] on a daily basis. Secondly, a new approach is considered, using only rainfall data (past and future values) from the target station as the model's inputs. Specifically, two different configurations were tested: (B) one day before and after the gap day and (C) two days before and after the gap day.
The rest of the work is organized as follows. Section 2 shows the information about the locations, the dataset, the theoretical background of the different machine learning (ML) models assessed, the preprocessing algorithms, and the evaluation metrics. Then, in Section 3, the results are reported and discussed. Finally, Section 4 describes the conclusions achieved in this work.

Source of Data
This study was carried out in Andalusia, Southern Spain, located in the southwest of Europe. Andalusia is a semiarid region with the following features: the meridians range from 1 to 7° W, the parallels from 37° to 39° N, an elevation above mean sea level from 26 to 822 m, and a total area of 87 268 m 2 .
The datasets used belong to the Agroclimatic Information Network of Andalusia (RIAA) and can be downloaded at the following link: https://www.juntadeandalucia.es/agriculturaypesca/ifapa/riaweb/web (accessed on 8 September 2021). A total of 14 stations, divided into two areas (coastal and inland locations), were evaluated. The first group of areas included Jaen, La Higuera de Arjona, Linares, Mancha Real, Marmolejo, Sabiote, and Torreblascopedro and the second group included Málaga, Antequera, Archidona, Cártama, Churriana, Pizarra, and Vélez. Figure  1 shows their geographical locations, and Table 1 shows their geo-climatic characteristics.

Methodology
An essential prerequisite to guarantee reliable results using raw meteorological data is the application of quality assurance procedures. The quality control guidelines, reported by Estévez et al. [31], were followed, as well as the procedure to detect spurious precipitation signals from automated weather stations (AWS), also Estévez et al. [5].
Afterward, data preprocessing was required for every approach, obtaining the corresponding input configuration, according to every strategy (see Table 2). Three different methodologies were evaluated: approach (i)-the use of rainfall neighbor data and its distance data to estimate the precipitation values at a different site (all locations are located within a 50 km radius); approach (ii)-the use of one day before and ahead rainfall data values from the target station; and approach (iii)-the use of two days before and ahead rainfall data values from the target station.
Later, in order to tune all the different hyperparameters from the different models, train them, and evaluate their performance, the full dataset was split into training, validation, and test. The train (to fit all the final weights and biases from the final model) and test dataset (never-seen data to assess the performance) were randomly split into 70% and 30%, respectively. Prior to this stage, it is necessary to determine all the hyperparameters of the models (such as the number of hidden layers and neurons in a multilayer perceptron). To this purpose, the training dataset was divided into train_2 and validation (random 80% and 20%, respectively) to train and test the different hyperparameters until the fittest set is found. It is worth noting that the seed used in the random algorithm is the same in all cases, so all assessed models (from different approaches) have the same train, test, and validation dataset. Then, the Bayesian optimization algorithm took place, where different hyperparameters were tested, using the validation dataset, until the fittest set was found. Afterward, the entire train dataset from the initial split was used to adjust all the different weights and biases. Finally, the performance accuracy was assessed, using the testing dataset, which was never seen during previous processes. All this methodology is shown in a flowchart in Figure 2. Table 2. Inputs configurations of the different models and approaches assessed. DOY represents the day of year, P corresponds to precipitation, D corresponds to distance, and i represents an index to the dataset-specific day.

Target Station Inputs Approach
Besides, after splitting the dataset into train and test, a standardization was carried out, which is highly recommended to outperform machine learning models, especially neural network-based models [32]. This can be expressed as Equation (1) where represents the input data and ̅ and correspond to the mean and standard deviation of the training dataset, respectively, and * is the standardized data.

Multilayer Perceptron (MLP)
Multilayer perceptron is one of the most used models in different sectors, especially in hydrology [22,33]. Its functionality is based on neurons in the biological nervous system, where many interconnected neurons work together to generate an interaction, based on different stimuli. It is structured in three types of layers, the input and output correspond to the input and output of the model, respectively, as well as the hidden layer, where neurons are located. The activation function determines the output of a node, given a set of inputs. For example, rectified linear output (ReLU) represents a ramp for positive input values. The process in which the neurons learn (value adjustment of weights and biases) is carried out automatically, which is why this layer is called hidden. ADAM, a very common algorithm for this purpose, uses squared gradients to scale the learning rate and a moving average of the gradients.
A single neuron mathematical logic is represented in Figure 3, where w represents the weight and b is the bias factor.  For further information, the following works can be reviewed [34,35]

Support Vector Machine (SVM)
Support vector machine (SVM) is a supervised machine learning model that analyzes data for classification and regression tasks (also known as support vector regression (SVR)). For classification tasks, its functionality is based on searching the fittest hyperplane to separate different datapoints' classes (classification). On regression, it finds the hyperplane and margins that fit all of them (regression). Thus, an easy way to understand SVM for regression is similar to a linear regression, where a hyperplane (that includes the data) is searched, while having the flexibility to define how much error is considered acceptable. Figure 4 shows an example of SVM for classification (a) and regression (b).  The main feature of SVM models is the use of kernels (linear, sigmoid, or gaussian, among others) to enable operation in a high-dimensional feature map, where the number of features is greater than the number of observations. SVM models are often used in rainfall forecasts, with promising results [36][37][38]. For further details, the following work can be reviewed [36,37].

Random Forest (RF)
Random Forest (RF) was first introduced by [39] as a supervised learning algorithm, where the "term" forest defines that it is built as an ensemble of decision tree models. The general idea is that the conjunction of multiple models increases the overall result. Additionally, RF introduces an extra-randomness when the number of trees starts to grow. Instead of searching for the best feature when splitting nodes, it searches for the best features among a random subset of them. The maximum number of features can be defined in scikit-learn as auto, sqrt, log2, none, or the exact number of maximum features (where auto and sqrt refer to the squared root of the initial number of features, log2 refers to the logarithm base 2 of the number of features, and none is to use all the features). This results in a broader diversity and, as a consequence, a better final performance.
Other researchers have already assessed RF in rainfall with promising results [40][41][42]. For further details, the following work can be revised [42].

Bayesian Optimization
One of the critical aspects of machine learning models' efficiency is hyperparameter selection. Depending on whether the correct values have been set, the performance can dramatically change from outstanding to very poor results. A common practice in the scientific community uses a trial-and-error technique [22], where different values are evaluated, varying from dozens to thousands of possibilities. However, this method is far from efficient because if the hyperparameter space is ample, the algorithm (apart from being very slow) wastes significant time in non-promising configurations. On the other hand, when the hyperparameter space is tiny, an accurate hyperparameter configuration set may be missing, despite being quick. To solve this problem, several algorithms have been assessed in different works. In [43], the authors studied the effectiveness of particleswarm optimization (PSO) and genetic algorithm (GA) to predict the monthly rainfall with MLP in a subtropical monsoon climate in Guilin, China. Wang et al. [44] assessed an artificial bee colony (ABC) with MLP to forecast rainfall values in 17 stations in the Wujiang River Basin. Banadkooki et al. [35] evaluated the flow regime optimization algorithm (FRA) with MLP and SVM to forecast monthly rainfall values in Iran.
In this study, Bayesian optimization was used, due to its high popularity in new automated machine learning (AML) models [45][46][47][48] and its good performance in [34,49]. It was first introduced by Wang et al. [44] as an algorithm, based on the Bayes theorem, to search the minimum/maximum function. Part of its popularity is due to its close relation to human behavior when tuning hyperparameters [50,51]. The prior results are taken into account to choose the following promising values to test, following the next four-step procedure: (1) the hyperparameter space is defined, which limits the values of the hyperparameter space; (2) the algorithm considers previous evaluations, in order to choose the following set of values to be assessed (acquisition function)-two kinds of possibilities can be handled, exploitation (consists of testing hyperparameters values that are assumed to be optimal) and exploration (the opposite of exploitation, to identify new best options); (3) to assess the new hyperparameter configuration using an objective function; and (4) if the optimization process has not finished yet, it goes to the second point. In this work, this algorithm was implemented using Python and the scikit-optimize library, following the instructions of Bellido-Jiménez et al. [34]. All the final hyperparameter sets, used for each model, approach, and location, can be seen in Table 3. Table 3. Hyperparameter set for each model, approach, and location, after carrying out Bayesian optimization, where activation represents the activation function, the optimizer represents the optimizer function, epochs represents the number of epochs, neurons represents the number of hidden layers and the number of neurons of each, kernel is the kernel function, c and epsilon represent internal hyperparameters of SVR, n_estimators is the number of trees in RF, and max_features is the number of features to consider when looking for the best split.

Evaluation Metrics
To assess the efficiency of the developed models, the statistics root mean square error (RMSE), mean bias error (MBE), and coefficient of determination (R 2 ) were used. All of them are mathematically expressed as Equations (2)-(4), respectively: where n represents the number of prediction days, measi corresponds to the measured value for a specific day, predi is the predicted value, i represents every single gap day, and corresponds to the mean.

Results and Discussion
In order to help the reproducibility of this work, the best ML models were uploaded to an open access repository in Github (https://github.com/Smarity/gap-fillingprecipitation-atmosphere-special-issue.git, accessed on 8 September 2021). Table 3 shows the RMSE, MBE, and R 2 performances for all locations in Area 1 (inland locations) using the first approach (A), information from other AWS located within 50 km. In Higuera de Arjona, MLP obtained the best RMSE and R 2 values (RMSE = 1.363 mm/day and R 2 = 0.894), very close to RF (RMSE = 1.384 mm/day and R 2 = 0.889). In terms of MBE, LI outperformed the rest of the ML models (MBE = −0.008 mm/day), followed closely to MLP and RF (MBE = 0.016 mm/day and MBE = 0.026 mm/day, respectively). In Jaen, all ML models outperformed LI in RMSE and R 2 , where MLP obtained the best values (RMSE = 1.767 mm/day and R 2 = 0.827), whereas RF beat the rest, regarding MBE (MBE = 0.023 mm/day). In Linares, RF and LI obtained the best performance, in terms of MBE (MBE = 0.001 mm/day and MBE = −0.001 mm/day). Moreover, MLP outperformed the others, regarding RMSE and R 2 (RMSE = 1.723 mm/day and R 2 = 0.817), followed closely by RF (RMSE = 1.730 mm/day and R 2 = 0.815). In Mancha Real, MLP outperformed the other models in all statistics (RMSE = 1.662 mm/day, MBE = −0.072 mm/day, and R 2 = 0.831), whereas SVM was the worst (RMSE = 1.948 mm/day, MBE = −0.195 mm/day, and R 2 = 0.780). In Marmolejo, with the highest mean annual rainfall (523.36 mm/year), the performance, in terms of RMSE and R 2 , showed that RF obtained the best values (RMSE = 2.129 mm/day and R 2 = 0.801), followed closely by SVM (RMSE = 2.154 mm/day and R 2 = 0.795) and MLP (RMSE = 2.176 mm/day and R 2 = 0.791). In Sabiote, the location with the highest altitude, MLP obtained the best performance in RMSE and R 2 (RMSE = 2.049 mm/day and R 2 = 0.752), but LI beat ML in MBE (MBE = −0.006 mm/day). Finally, in Torreblascopedro, SVM outperformed the rest for all statistics (RMSE = 1.246 mm/day, MBE = −0.005, and R 2 = 0.894), being the most accurate from this first region. It is worth noting that MLP generally obtained the best results, regarding RMSE and R 2 , in most locations, whereas RF and LI obtained the best values for MBE. Additionally, even though ML outperformed LI in all locations, the average improvement was not very significant. Tables 4 and 5 shows the RMSE, MBE, and R 2 values for all locations and models in the coastal locations (Area 2). In Antequera, MLP beat the other models for all statistics (RMSE = 1.596 mm/day, MBE = 0.035 mm/day, and R 2 = 0.875), sharing the same R 2 performance with SVM (R 2 = 0.875). All ML models highly outperformed LI, considering all statistics (especially RMSE and R 2 ), except for MBE using SVM. In Archidona, MLP also obtained the most accurate modeling in RMSE and R 2 (RMSE = 1.811 mm/day and R 2 = 0.844), followed closely to SVM (RMSE = 1.817 mm/day and R 2 = 0.844). Regarding MBE, RF outperformed the rest (MBE = −0.019 mm/day). In Cártama, RF obtained the best MBE value (MBE = 0.002 mm/day), whereas SVM got the best RMSE and R 2 performance (RMSE = 2.502 mm/day and R 2 = 0.778). In Churriana, MLP highly outperformed the rest, in terms of RMSE and R 2 (RMSE = 2.192 mm/day and R 2 = 0.876), whereas RF beat MLP in MBE (MBE = 0.019 mm/day and MBE = −0.052 mm/day, respectively). In Málaga, RF obtained the best values for RMSE and MBE (RMSE = 2.433 mm/day and MBE = 0.012 mm/day), whereas MLP got the most accurate values for R 2 (R 2 = 0.830). In Pizarra, all models obtained very similar performance (even LI). RMSE ranged from 2.032 mm/day (by MLP and SVM) to 2.108 mm/day (by LI), MBE ranged from 0.039 mm/day (by RF) to −0.112 mm/day (by SVM), and R 2 ranged from 0.842 (by LI) to 0.854 (by MLP). Finally, in Vélez, MLP outperformed the rest of the models, in terms of RMSE and R 2 (RMSE = 3.219 mm/day and R 2 = 0.742), while RF obtained the best MBE performance (MBE = −0.020 mm/day), followed closely to MLP (MBE = −0.074 mm/day). Generally, the results obtained by ML highly outperformed LI in most locations and statistics, except for MBE, in which LI obtained very accurate results. Thus, the use of ML models to gap-fill daily rainfall data is highly recommended for coastal sites, performing significantly better than LI, arising the effect of sea distance in rainfall modelling. Eventually, in Figure 5, all these RMSE, MBE, and R 2 values, from both areas and all models, are represented in a scatter plot. Due to the different performances between the ML models, it can be stated that MLP obtained the best results, or very close to them, in most locations. On the other hand, SVM had accurate performances in coastal sites, whereas the behavior was not so good on inland locations. Finally, RF behaved opposite to SVM, having an accurate performance on inland locations and a worse modeling on inland sites.   Tables 6 and 7 show the RMSE and MBE values for the inland and coastal locations, using two different approaches, one day after and before (approach B) and two days after and before (approach C), as inputs, respectively. Generally, all the results are much worse than in Tables 2 and 3, for all cases. Mancha Real obtained an RMSE value above 4.0 for all models, whereas Churriana got the worse values (RMSE > 6.0 mm/day). In terms of MBE, La Higuera de Arjona obtained the best performance (MBE = −0.021 mm/day) using MLP and approach B, whereas Marmolejo was the worst (MBE = −1.459 mm/day), using MLP and this same approach. Finally, in terms of R 2 , the values obtained are low, from R 2 = 0.004 (by SVM in Archidona) to R 2 = 0.079 (by MLP in Málaga), highlighting the nonautocorrelation between precipitation values from the previous and following days. Comparing the results between B and C, in terms of RMSE, on average, approach C (RMSE = 4.359 mm/day) obtained a slightly better performance than approach B (RMSE = 4.323 mm/day). However, in area 2, the use of approach B (RMSE = 4.986 mm/day) was significantly better than approach C (RMSE = 5.588 mm/day).

Using Data from the Target Station
Finally, in Figures 5 and 6, all the RMSE, MBE, and R 2 values, from both areas and all models, are represented in a scatter plot. Table 6. RMSE, MBE, and R 2 performance values from testing dataset for all locations and models in the first area (inland locations), using data from the target station in two different approaches, with the use of the previous and following day and the use of the two previous and two following days. The best values from each station are in bold.

One Day (B)
Two Days (C)  Table 7. RMSE, MBE, and R 2 performance values from testing dataset for all locations and models in the second area (coastal locations), using data from the target station in two different approaches, with the use of the previous and following day and the use of the two previous and two following days. The best values from each station are in bold.

Comparison of the Two Areas
In order to compare the results in the two different areas, Figure 7 shows the RMSE, MBE, and R 2 performance values for these two kinds of locations (inland and coastal), using the best approach (data from neighbor stations). In terms of RMSE mean values, MLP outperformed RF and SVM. Besides, the models applied on the coastal locations underperformed, on average, in all cases and obtained higher variability across sites, rather than inland ones. In terms of MBE mean values, RF and LI obtained values very close to 0, whereas SVM overestimated in most stations. Finally, in terms of R 2 , the results by ML models were quite similar in both inland and coastal locations. However, the results of LI were significantly worse than ML in coastal sites, whereas SVM performed worse on inland sites than coastal. Additionally, Table 8 displays the maximum improvement, in terms of RMSE, R 2 , and MBE, comparing ML to LI (using the first approach). In inland sites, the RMSE improvement ranged from 0.031 mm/day in Torreblascopedro to 0.263 mm/day in Marmolejo, as well as from 0.004 (Torreblascopedro) to 0.048 (Marmolejo), in terms of R 2 . On the other hand, the upgrades in coastal sites ranged from RMSE = 0.076 mm/day and R 2 = 0.012 (in Pizarra) to RMSE = 1.475 mm/day and R 2 = 0.25 (in Archidona). Thus, coastal locations significantly differed between linear interpolation and ML models for gap-filling daily rainfall. In contrast, in inland areas, the improvement was not substantial.
Thus, using empirical approaches (such as LI) to gap-fill daily rainfall data is not recommended, especially in coastal sites; the results are worse than ML, due to the effect of sea distance.

Seasonality Performance
In order to assess seasonal performance, the RMSE, MBE, and R 2 of all the stations and approaches, for the different evaluated models (SVM, MLP, and RF), are represented in Figure 8. Regarding RMSE, summer, autumn, and spring obtained very similar average performances, whereas, in winter, the mean results were the worst. Moreover, summer obtained the narrowest interquartile range, but spring and winter got the more extensive range, with LI being the model with the worst range (the less confident between different stations) among all seasons and models. MBE, MLP, RF, and LI always performed with very similar average results, although LI had the widest interquartile range for all seasons. Besides, SVM always performed the worst, in terms of MBE. In terms of R 2 , the highest mean values were carried out in winter, whereas the worst results were achieved in summer and spring. Regarding mean, all models performed with similar values during the same season. Additionally, in Figures 9 and 10, the values predicted by the different ML models using the first approach are shown and compared to LI. In Figure 9, the predictions from Torreblascopedro are plotted (the site that obtained the best performance, in terms of RMSE and R 2 ). In winter, all predictions are close to the 1:1 line, which denotes the excellent performance of this model during this season. The predictions were also close to the 1:1 line in spring and autumn, although the points were more dispersed than in winter. Finally, summer obtained the worst results, with farthest points to the 1:1 line, especially with high rainfall values. Finally, Figure 10 plots the prediction rainfall values in Archidona. Spring obtained the best general predictions among all models, followed by autumn, summer, and winter, in this order. The highest differences between ML and LI were found in winter and autumn, where most LI predictions were farther from the 1:1 line.
Generally, summer obtained worse results than the rest of the seasons, due to the Mediterranean climate; during summers in Andalusia, precipitation is very occasional. They usually respond to local events, such as local torments. So, gap-filling rainfall data using neighbor stations with very different pluviometry makes models fail in those specific dates. Comparing the results between Torreblascopedro and Archidona, the most significant differences can be seen in winter, where LI performed much worse than ML approaches.

General Discussion
In terms of R 2 , the results obtained in this work outperformed those obtained by Kim and Ryu [52] (Pocatello, ID, USA) using IDWM, OK, and GME, in conjunction with cluster analysis, having the best R 2 performance, with a value below 0.7 (R 2 = 0.689 or R = 0.83). Besides, the models developed in this work highly improved the RMSE and R 2 performance of Wuthiwongyothin [53] in Northern Thailand, using the K-means technique with the inverse distance weighting (IDW) and correlation coefficient weighting (CCW), where the mean R 2 values among all stations were below 0.6. Moreover, in terms of R 2 , the values obtained by Sehad et al. [54] in North Algeria using multispectral MSG SEVIRI imagery were slightly worse, on average, than the obtained in this work, with a mean R 2 = 0.7241. However, in absolute terms, its developed model outperformed this work's best results (R 2 = 0.921 against R 2 = 0.898 in Torreblascopedro using SVM). Thus, ML models with neighbor station data located within a 50 km radius are highly recommended to gap-fill rainfall values in coastal locations, due to their accurate performance (among other approaches) in the different areas assessed along the Andalusia region, being the preferred use of neighbor stations, over the use of cluster analysis with stations located within a further radius distance. However, in inland sites, the performances carried out by ML against LI were not as significant as in coastal sites. Finally, in order to improve the state of the art of these approaches, future works could analyze the possibility of false alarms and missing rainfall cases using the models developed in this work.

Conclusions
Three different approaches were evaluated for gap-filling daily rainfall values: (A) the use of data from neighbor stations within 50 km, (B) the use of one day before and ahead from the target station, and (C) the use of two days before and ahead from the target station. Fourteen different locations were evaluated from two areas, corresponding to inland and coastal sites. Additionally, three different ML models were assessed for this purpose: MLP, SVM, and RF. Daily large datasets of around 21 years were used (from 2000 to 2021), where 70% was used for training and a random 30% for testing purposes. Besides, 20% from the training dataset was used to find the fittest hyperparameters. Finally, a seasonality analysis was carried out. Based on the arisen results, no ML model significantly outperformed the rest, although MLP obtained the best results, or very close to them, in most locations. On the other hand, SVM had accurate performances in coastal sites, whereas the behavior was not so good at inland locations. RF behaved the opposite to SVM, having an accurate performance at inland locations and worse modeling at inland sites. Moreover, the first approach (the use of neighbor data) was notably better than the other approaches, with RMSE values below 2.0 mm/day and R 2 values above 0.85 in most stations. There were no significant seasonal differences in performance, in terms of RMSE and MBE values in winter, spring, and autumn, but the results obtained in summer were generally worse for all locations. Besides, coastal area location models performed slightly worse and with higher performance differences between ML and LI, in most sites and models, highlighting the differences in rainfall prediction efficiency, depending on the sea distance. In conclusion, it could be stated that the use of neighbor data with MLP is highly recommended as a rainfall gap-filling technique, rather than the use of data from the target station from the past and future. Moreover, when these work's results are compared to different paper's approaches using a cluster analysis from wider ranges, the use of closer stations (within a 50 km radius) obtained better results in terms of R 2 .
Finally, due to the significant need to have a complete time series rainfall dataset on a daily basis and the increasing interest in installing low-cost wireless sensors (IoT), the models developed and assessed in this work can help with gap-filling datasets in this work near-real-time, thanks to the decreasing price of the low-cost automated weather stations using these new devices. Data Availability Statement: Publicly available datasets were analyzed in this study. These data can be found here: https://www.juntadeandalucia.es/agriculturaypesca/ifapa/riaweb/web (accessed on 8 September 2021).