STATISTICAL FOUNDATIONS OF NEURAL NETWORKS: BRIDGING THEORY AND PRACTICE – DR. C. G. UDOMBOSO

  1. PREAMBLES
    With utmost humility, I express profound gratitude to Almighty God, the Creator of all, for His unwavering blessings and guidance, underpinning my existence. His omnipotence, omnipresence, and omniscience merit my deepest reverence. I particularly thank Him for the salvation of my soul. I extend heartfelt appreciation to the former Dean of Science, Prof. A. A. Bakare, for initiating this Faculty Lecture, and to the current Dean, Prof. O. O. Sonibare, for making it a reality. When I accepted this challenge, I did so with mixed emotions, aware of the daunting task of summarizing years of research in sixty minutes before such a knowledgeable audience. This Faculty Lecture marks the third from the Department of Statistics. The first, “Statistical Literacy and Empirical Modeling for National Transformation,” by Prof. O. E. Olubusoye, was presented on October 23, 2014. The second, “Statistical Modeling in Environmetrics,” was delivered by Prof. K. O. Obisesan on May 26, 2021. Today’s lecture, representing the Department’s Computational Statistics Unit, encompasses nearly 16 years of work, constituting approximately 60% of my cumulative research endeavors.
    My journey into academia began with a childhood aspiration. When I was a student at Moremi High School, Road 7, University of Ife (now Obafemi Awolowo University), I was passing through the senior staff quarters with my late father, who was a pioneer staff member at the university. We encountered a young man whom my father addressed as “Doctor”. I was intrigued and inquired if he was a medical doctor. He clarified that the title meant he was a “doctor of books”. My little brain did not understand what that meant, but at that moment, I resolved to become a “doctor” too, whether in medicine or academia. Years later, I realized he was referring to a Doctor of Philosophy (PhD) as the “doctor of books”.
    Passion for mathematics led me to statistics during a time of limited mathematical prospects. My Ph.D. focus then was to apply probability theory to environmental studies. In 2007, one of my supervisors, Prof. Amahia, encouraged collaboration with Geography, which connected me to Prof. Akintola, who ignited my interest in neural networks through his Ph.D. student at the time, Dr. Onafeso. Neural networks were a concept I had previously encountered but had not delved into due to my limited understanding. In 2009, I visited the Electrical/Electronics Engineering Department at Obafemi Awolowo University to see Engr. Alawode, at the instance of Dr. Oluwaseun A. Otekunrin. There, I acquainted myself with MATLAB 2007a, essential for my PhD research. Also in 2009, through Prof. Angela U. Chukwu, I connected with Prof. I. K. Dontwi of the Department of Mathematics, Kwame Nkrumah University of Science and Technology, Kumasi, Ghana, who later became one of my supervisors. By 2012, a pre-doctoral fellowship took me to his university, where I formally entered the world of neural networks and dedicated myself to mastering MATLAB 2009a.
    1.1 The Earliest Forms of Statistics
    Though the concept of Statistics was not known in the early days of human existence, some rudimentary forms of data collection and counting are mentioned in the Bible, especially in the context of censuses, population counts, and record-keeping. For example, the Bible occasionally mentions censuses or population counts. The most well-known census is described in the Book of Numbers (Old Testament), where a census of the Israelites was conducted in the wilderness. Additionally, the Bible contains records of contributions and offerings made by individuals and communities. For example, tithes (one-tenth of income) were collected for the support of religious institutions. It also includes genealogies and lists of names, especially in the Old Testament, to establish lineages, trace family histories, as well as maintain records, and also makes references to agricultural practices and harvests, although these are not presented in a systematic statistical manner.
    1.2 The Historical Development of Statistics
    The history of statistics is a complex, centuries-long journey with contributions from diverse cultures and fields. Early statistical thinking can be traced to ancient civilizations like Babylon and Egypt, using statistical methods for taxation and census-taking. In the Renaissance (16th-17th centuries), European scholars like Cardano and Graunt advanced demographic data analysis. Graunt notably pioneered mortality statistics. The 18th century witnessed the rise of probability theory, foundational for modern statistics. Mathematicians like de Moivre and Laplace made significant contributions to probability theory, crucial for statistical inference.
    The 19th century saw significant advances in statistics. Adolphe Quetelet, often called the “father of statistics,” pioneered the normal distribution and the concept of the “average man.” Francis Galton introduced regression and correlation. Sir Francis Ysidro Edgeworth contributed to statistical theory, including the Edgeworth series. William Sealy Gosset, known as “Student,” developed the t-distribution for hypothesis testing. In the late 19th and early 20th centuries, Karl Pearson made notable contributions, including the Pearson correlation coefficient. Ronald A. Fisher, another influential statistician, advanced experimental design and hypothesis testing, among other areas.
    The 20th century marked the formalization and rapid growth of statistics. It became pivotal in scientific research, social sciences, and industry. Modern computing further empowered statistical analysis. Bayesian statistics experienced a resurgence due to computing advances, offering a distinct approach. The late 20th and early 21st centuries ushered in the “big data” era, posing new challenges. Statisticians developed techniques for large datasets, and data science emerged. Statistics evolved further with machine learning, artificial intelligence, and data analytics, broadening applications in predictive modeling, data mining, and deep learning.
    Today, statistics is an integral part of various fields, including science, technology, medicine, economics, social sciences, humanities, and business. It plays a vital role in decision-making, research, and understanding the world through data-driven insights. The historical development of statistics reflects the human quest to make sense of data and harness its power for various purposes.
    1.3 The Science and Arts of Statistics
    Statistics is a mathematical and scientific discipline that encompasses data collection, analysis, interpretation, presentation, and organization. Its core objective is to extract meaningful insights from data, facilitating informed decisions and conclusions. It involves various mathematical processes, ranging from basic to intricate. Common challenges in statistics courses often stem from students’ limited foundational mathematics knowledge. In the past, some students avoided essential mathematical statistics courses like probability and stochastic processes, as they were elective. Now, these subjects are mandatory to ensure comprehensive understanding. Requiring students to study these mathematical courses is crucial. They form the foundation for statisticians, are vital at the postgraduate level, and are indispensable for data scientists. These courses provide the theoretical basis for mathematical and statistical algorithms, essential for success in machine and statistical learning. Machine learning and data science are deeply rooted in mathematical and statistical principles, extending beyond coding and algorithms, involving software solutions like Python, MATLAB, R, SAS, and SPSS, which all rely on these foundations.
    The term ‘art of statistics’ captures the creative and interpretive aspects of statistical analysis. Beyond its scientific and mathematical identity, statistics possesses an inherent artistic dimension. This encompasses decoding data, effective communication of findings, and using statistics beyond the quantitative realm. Elements include data visualization, storytelling through data, model crafting, ethics, artful data presentation, communication skills, and iterative refinement. It’s about transforming data into meaningful insights that inform decisions and deepen understanding. Statistics applies broadly, from sciences to arts. The University of Ibadan’s Postgraduate College recently approved programs in Computational Linguistics and Natural Language Processing within the Department of Communication and Language Arts. These programs, part of AI and data science, encourage collaboration involving the CLA, Computer Science, Statistics, Mathematics, and Data and Information Science Departments.
  2. HISTORICAL DEVELOPMENTS IN STATISTICAL NEURAL NETWORKS RESEARCH
    The inception of Neurocomputing and Neural Networks dates back to McCulloch and Pitts’ seminal 1943 article, showcasing neural networks’ capacity to compute arithmetic and logical functions. The late 1950s saw Rosenblatt introducing perceptrons, followed by the development of the perceptron convergence theorem (1957, 1958). Yet, Minsky and Papert’s 1969 findings dampened enthusiasm for perceptrons. However, the early 1980s witnessed renewed interest, driven by Hopfield networks (1982), Kohonen’s Self-Organizing Maps, the resurgence of Werbos’ back-propagation algorithm (1974), and Anderson and Rosenfeld’s historical account (1988). Notably, despite perceptron limitations, silent research persisted during the 1970s, manifesting as adaptive signal processing, pattern recognition, and biological modeling.
    2.1 Neural Networks in Statistics
    Research into statistical aspects of Artificial Neural Networks (ANNs) began in 1989 with White’s study. Hornik et al. (1990) established ANNs for approximating unknown mappings. Subsequent work connected ANNs to statistical pattern recognition and delved into various aspects, including analysis, modeling, and econometrics. Kumar et al. (1995) compared ANNs with logistic regression. Anders and Korn (1999) advanced SNN model selection. Further developments included time series approaches, comparative studies, and mathematical methods in ANNs.

2.2 Neural Networks and Statistical Terminologies
Despite many similarities between ANN models and statistical models, the terminologies used in the two fields are quite different. For example, Sarle (1994) claimed that the terminology ‘back-propagation’ should refer only to the simple process of applying the chain rule to compute derivatives for the generalized delta rule algorithm, and not the training method itself. He further added that this confusion is symptomatic of the general failure in the ANN literature to differentiate between models and estimation methods. Sarle (1996) gave a list of statistical terminology that has equivalents in the ANN literature.

Table 1: Statistical and ANN Terminologies

Statistical Terminology                  ANN Terminology
variables                                features
independent variables                    inputs
predicted values                         outputs
dependent variables                      targets or training values
residuals                                errors
estimation                               training, learning, adaptation, or self-organization
estimation criterion                     error function, cost function, or Lyapunov function
observations                             patterns or training pairs
parameter estimates                      (synaptic) weights
interactions                             higher-order neurons
transformations                          functional links
regression and discriminant analysis     supervised learning, self-organization or heteroassociation
data reduction                           unsupervised learning, encoding, or autoassociation
cluster analysis                         competitive learning or adaptive vector quantization
interpolation and extrapolation          generalization
intercept                                bias
error term                               noise
forecasting                              prediction

The statistical terms “sample” and “population” do not seem to have ANN equivalents. However, data are often divided into a training set and test set for cross-validation.
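As a minimal sketch of this training/test division (the data, shapes, and 80/20 split below are hypothetical, not from the lecture):

```python
# Hypothetical illustration of the ANN analogue of sampling:
# dividing data into a training set and a test set for cross-validation.
import numpy as np

rng = np.random.default_rng(seed=0)
X = rng.normal(size=(100, 3))   # 100 observations ("patterns"), 3 inputs
y = rng.normal(size=100)        # targets

idx = rng.permutation(len(y))   # shuffle observation indices
split = int(0.8 * len(y))       # assumed 80/20 train-test split
train, test = idx[:split], idx[split:]
X_train, y_train = X[train], y[train]
X_test, y_test = X[test], y[test]
```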

2.3 Similarity of Neural Network Models to Statistical Models
2.3.1 General Linear Models of the Statistical Neural Networks
In general, feedforward networks with no hidden layer are basically generalized linear models.

Figure 1: Simple Linear Regression (input: independent variable; output: predicted value; target: dependent variable)

Figure 1 is a chart of the simple linear regression predicting the dependent variable (target) as the predicted value (output) from a set of independent variable(s) (input).

Figure 2: A Simple Sketch of the Artificial Neural Network (input: independent variables; hidden: summing junction; output: dependent variable)

Figure 2 depicts an artificial neural network (ANN) framework comprising input, hidden, and output units. The input layer accepts the variables; the hidden layer preprocesses them, separating the variables across the hidden neurons to enhance precision; and the output layer displays the results. This architecture is also recognized as the Multilayer Perceptron (MLP).
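To make the correspondence concrete, the following is a hedged Python sketch (not code from the lecture) showing that a feedforward network with no hidden layer computes exactly a linear model, while adding a hidden layer with a nonlinear transfer function yields the MLP of Figure 2; all weights and data here are placeholders:

```python
# Illustrative sketch: no hidden layer -> (generalized) linear model;
# one hidden layer + nonlinear transfer function -> MLP.
import numpy as np

def linear_net(X, w, b):
    # No hidden layer: output = Xw + b, i.e. ordinary linear regression.
    return X @ w + b

def mlp(X, W1, b1, w2, b2, transfer=np.tanh):
    # One hidden layer: inputs -> summing junction + transfer -> output.
    hidden = transfer(X @ W1 + b1)
    return hidden @ w2 + b2

X = np.random.default_rng(1).normal(size=(5, 2))   # 5 patterns, 2 inputs
print(linear_net(X, np.ones(2), 0.0))
print(mlp(X, np.ones((2, 3)), np.zeros(3), np.ones(3), 0.0))
```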
Sarle (1994) stated that the perceptron model with different transfer functions has been shown to have equivalent statistical models. For example:
An ADALINE is a linear two-group discriminant, that is, a linear discriminant function (Weisberg 1985; McLachlan 1992; Weiss and Kulikowski 1991).
A simple nonlinear perceptron is a logistic regression (Hosmer and Lemeshow 1989).

2.3.2 Nonlinear Models of the Statistical Neural Networks (MLPs)
i. A Multilayer Perceptron with one output is a Simple Nonlinear Regression (Sarle 1994).
ii. A Multilayer Perceptron with several outputs is a Multivariate Multiple Nonlinear Regression.
iii. A Functional Link Network is a Polynomial Regression.
iv. A Linear Multilayer Perceptron is a Maximum Redundancy Analysis.
v. A Nonlinear Multilayer Perceptron is a Nonlinear Maximum Redundancy Analysis.

There are SNN models that represent principal component analysis in various forms, including the linear and nonlinear. An MLP with a moderate number of hidden neurons is essentially the same as a projection pursuit regression. The difference between the two is that an MLP uses a predetermined transfer function, while the projection pursuit regression model uses a flexible nonlinear smoother (Sarle 1994). Also, an MLP results in a nonparametric sieve if the number of hidden neurons increases with the sample size (White 1988). This makes it a useful alternative to methods like kernel regression (Hardle 1990) as well as smoothing splines.

  3. FROM HOMOGENEOUS TO HETEROGENEOUS: THE POWER OF TRANSFER FUNCTIONS
    The Multi-Layer Perceptron (MLP), a common Artificial Neural Network (ANN), focuses on neuron weights and transfer functions (TFs). However, the complex TFs in MLPs reduce model interpretability and adaptability (Tayfur, 2002). Many MLP neurons use identical TFs, limiting versatility and causing errors (Adeyiga et al., 2011), leading to user reluctance (Toprak and Cigizoglu, 2008). To address this, integrating mixed transfer functions or using networks with multiple functions (Ashigwuike, 2012) is suggested. Transitioning from Homogeneous Transfer Functions (HOMTF) to Heterogeneous Transfer Functions (HETTF) can enhance comprehensibility while maintaining precision (Adewole et al., 2011). HETTFs aim to create transparent neural networks capable of precise modeling, akin to MLPs (Adepoju et al., 2007), improving adaptability and transparency.
    Anders (1996) proposed a mathematical model for the SNN, given as

    y = f(x, w) + u        (1)

    where y is the dependent variable, x is a vector of independent variables, w = (α, β, γ) is the vector of network weights: α is the weight of the input unit, β is the weight of the hidden unit, and γ is the weight of the output unit, and u is the stochastic term, normally distributed (that is, u ~ N(0, σ²)). Basically, f(x, w) is the ANN, expressed as

    f(x, w) = \gamma + \sum_{h=1}^{H} \beta_h \, g\!\left( \sum_{i=1}^{I} \alpha_{ih} x_i \right)        (2)

    where g(·) is the homogeneous transfer function.
    In Udomboso (2013 and 2021, respectively), two HETTFs (satlins_tanh and satlins_tansig) were developed for the SNN model from three HOMTFs (satlins, tanh, and tansig), using convolution of functions:
    (3)
    for satlins_tanh, and
    (4)
    for satlins_tansig, where the CDFs, means, and variances were obtained in both cases.
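The closed-form HETTFs of Eqs. (3) and (4) are derived analytically in the cited papers; purely as a hedged numerical sketch of the underlying idea, the snippet below convolves two homogeneous transfer functions on a grid (the grid, scaling, and choice of functions are assumptions, not the published derivation):

```python
# Numerical illustration: forming a heterogeneous transfer function
# by convolving two homogeneous ones (satlins and tanh) on a grid.
import numpy as np

def satlins(x):                 # symmetric saturating linear TF
    return np.clip(x, -1.0, 1.0)

def tansig(x):                  # MATLAB's tansig; numerically equals tanh
    return 2.0 / (1.0 + np.exp(-2.0 * x)) - 1.0

x = np.linspace(-5, 5, 1001)
dx = x[1] - x[0]
# Discrete approximation of the convolution (satlins * tanh)(x),
# scaled by the grid step so it approximates the integral.
het = np.convolve(satlins(x), np.tanh(x), mode="same") * dx
```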
    Analyses were performed to demonstrate the models, and the results are presented below:

Table 2: Model Output with Satlins, Tanh and Satlins_Tanh

MODELS          MSE                     NAR2                    NIC                     ANIC
                2 Vars      3 Vars      2 Vars      3 Vars      2 Vars      3 Vars      2 Vars      3 Vars
SATLINS         0.00109     0.017271    0.89792     0.883231    0.004233*   0.059485    1.523973    2.044839
TANH            0.000973*   0.017089    0.906587*   0.883705*   0.004397    0.127796    1.519695*   2.133884
SATLINS_TANH    0.000973*   0.01583*    0.906587*   0.882506    0.004383    0.048318*   1.519665*   2.043398*

Table 3: Model Output with Satlins, Tansig and Satlins_Tansig

MODELS            MSE                     NAR2                    NIC                     ANIC
                  2 Vars      3 Vars      2 Vars      3 Vars      2 Vars      3 Vars      2 Vars      3 Vars
SATLINS           0.00109     0.017271    0.89792     0.883231    0.004233    0.059485    1.523973    2.044839
TANSIG            0.000975    0.016814    0.899762    0.886528*   0.00416     0.053603    1.522966    2.039348
SATLINS_TANSIG    0.000946*   0.016055*   0.900028*   0.885393    0.003642*   0.03882*    1.522208*   2.032813*

In Udomboso and Ilori (2022), data on petroleum oil leakages in the Niger Delta region of Nigeria were obtained, and the above methodology was applied to observe its behaviour on real-life data. An HETTF (hardlims_tansig) was developed from two HOMTFs (hardlims and tansig).
The resulting HETSNN is as follows:
(5)
where the CDF, mean, and variance are given.

Table 4: Predictive Performance of the HOMSNN and HETSNN Models

Transfer Function    Predicted               Error                   NIC
                     Mean        Variance    Mean        Variance
HARDLIMS             5.0999      0.1564      0.1001      0.5744      1.0854
TANSIG               5.1373      0.1776      0.0627*     0.4357*     1.1899
HARDLIMS_TANSIG      4.8648*     0.0395*     0.3352      0.5989      0.6938*

  4. DOMINANCE AND ADMISSIBILITY OF THE SNN ESTIMATOR OVER THE OLS ESTIMATOR
    Estimation theory, crucial in various fields, involves minimizing norms in Hilbert space, with methods like Least Squares, Maximum Likelihood, and Bayesian techniques. Stein’s 1956 discovery of an estimator superior to Least Squares, known as Stein’s phenomenon, was initially met with skepticism (Stein, 1956). James and Stein refined this estimator in 1961, creating the James-Stein estimator, which faced criticism in the 1960s and 1970s (James & Stein, 1961). Empirical Bayes support from Efron and Morris in 1972 gradually bolstered the James-Stein estimator’s credibility (Efron & Morris, 1972).
    Lemma (Stein): Let X ~ N(μ, σ²), and let g be a differentiable function such that E|g′(X)| < ∞. Then

    E[g(X)(X − μ)] = σ² E[g′(X)]        (6)
    Various attempts have been made to improve upon the James-Stein estimator, including Baranchik’s positive-part estimator (Baranchik, 1964) and Thompson estimator (Thompson, 1968). These attempts aimed to address limitations and complexity concerns. Ben-Haim later introduced the spherical blind minimax estimator (SBME) and ellipsoidal blind minimax estimator (EBME), which outperformed traditional least squares (OLS) estimators (Ben-Haim, 2006). These advancements represent significant developments in the pursuit of better estimators beyond the James-Stein framework.
    Udomboso and Dontwi (2022) considered the problem of estimation in both the Ordinary Least Squares and the Statistical Neural Network settings. We investigated the estimator that achieves as low a risk as possible by adopting an approach based on Spherical Blind Minimax Estimation (SBME). We proved that the Statistical Neural Network estimator strictly dominates the Ordinary Least Squares estimator, by establishing a result that dominates the general SBME obtained by Ben-Haim in 2006.

Theorem: In the class of minimax estimators, the SNN estimator always has a mean square error less than that of the OLS estimator.
The proof of this theorem starts by expressing the MSE of the SNN estimator, given by
(7)
By Stein's lemma introduced above, rigorous analysis shows that
(8)
where the bound involves the largest eigenvalue of the associated matrix. Under the stated conditions, the second term vanishes, and the expectation is taken over a strictly negative range. This result makes the MSE of the SNN estimator lower than that of the OLS estimator; hence, the SNN estimator strictly dominates OLS. We note that this result dominates the general SBME obtained by Ben-Haim (2006), which he showed to dominate the least squares estimator under its own condition. This is further proof of the superiority of the statistical neural network.
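A small Monte Carlo sketch of Stein's phenomenon, which underlies this dominance result, is given below. It illustrates the classical James-Stein shrinkage estimator beating the least-squares/maximum-likelihood estimator in total mean squared error for dimension p ≥ 3; the dimension, seed, and true mean are hypothetical, and this is not the SNN estimator itself:

```python
# Monte Carlo illustration of Stein's phenomenon: for p >= 3 the
# James-Stein estimator has lower total MSE than X itself (the MLE).
import numpy as np

rng = np.random.default_rng(42)
p, reps = 10, 20000
theta = np.ones(p)                       # assumed true mean vector

X = theta + rng.normal(size=(reps, p))   # X ~ N_p(theta, I)
norm2 = np.sum(X**2, axis=1, keepdims=True)
shrink = 1.0 - (p - 2) / norm2           # James-Stein shrinkage factor
js = shrink * X                          # James-Stein estimate

mse_mle = np.mean(np.sum((X - theta) ** 2, axis=1))    # approximately p
mse_js = np.mean(np.sum((js - theta) ** 2, axis=1))    # strictly smaller
print(mse_mle, mse_js)
```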
The study showed SNN dominance and admissibility over OLS with both HOMSNN and HETSNN models. Data were divided into two sets, with 2 and 3 variables respectively.

Figure 3: Graph of MSE Performance across the Models (panels: 2 Vars, 3 Vars)

It is important to note here that both the SNN and the classical statistical models are empirical models with their own strengths and weaknesses, while also sharing some similarities.

  5. STATISTICAL INFERENCE AND NEURAL NETWORKS
    Determining the optimal number of hidden units for peak network performance has long perplexed researchers. Too many hidden neurons ensure accurate learning and predictions on trained data, while too few hinder relationship comprehension, leaving error levels unacceptable. Resop (2006) probed the consequences of exceeding the hidden neuron count needed for statistical model-level accuracy, sparking questions about increased accuracy. Panchal et al. (2011) emphasized the critical nature of hidden layer neuron and layer counts in network architecture. Underutilized neurons lead to underfitting, while an excess risks overfitting. Udomboso et al. (2012) addressed these dilemmas with an R²-based approach and statistical inference, highlighting the significance of variable selection for optimal neural network performance and the perils of redundant variables.

5.1 R² for Change in Hidden and Input Units
The performance of a network can be determined by the coefficient of determination, R². Let R²_H denote the coefficient of determination of a network with a given number of hidden units, H, and R²_{H±ΔH} denote the coefficient of determination given a change in the number of hidden units. We note here that the change, ΔH, can be an increase, H + ΔH, or a decrease, H − ΔH. The error produced by this change is given as e = ŷ_H − ŷ_{H±ΔH}, where ŷ_H is the output with a given number of hidden units, and ŷ_{H±ΔH} is the output given a change in the number of hidden units.
Hence, the R² contribution is

ΔR²_H = R²_{H±ΔH} − R²_H        (9)

and similarly, for a change in input units, the difference in contribution is

ΔR²_I = R²_{I±ΔI} − R²_I        (10)

5.2 Statistical Inference on R² for Change in Hidden and Input Units
We noted that ΔR²_H = R²_{H±ΔH} − R²_H. We derive the F test for a change in hidden neurons as

F = \frac{\Delta R^2_H / \Delta H}{\left(1 - R^2_{H \pm \Delta H}\right) / \left(n - (H \pm \Delta H) - 1\right)}        (11)

which, in terms of sums of squares, is given as

F = \frac{\left(SSE_H - SSE_{H \pm \Delta H}\right) / \Delta H}{SSE_{H \pm \Delta H} / \left(n - (H \pm \Delta H) - 1\right)}        (12)

The hypothesis for this problem is formulated as H₀: θ_H = 0 versus H₁: θ_H ≠ 0, where θ_H is the parameter of the hidden unit. We reject the null hypothesis if |F| exceeds the critical value.
In the same way, without conflict of symbols, we show that for a change in input neurons,

F = \frac{\Delta R^2_I / \Delta I}{\left(1 - R^2_{I \pm \Delta I}\right) / (n - k)}        (13)

where k is the number of input parameters or variables after the change.
The hypothesis is set up by assuming that an input variable x_i has no effect on the output y. That is, H₀: θ_I = 0 against H₁: θ_I ≠ 0, where θ_I is the parameter of the input unit. We reject the null hypothesis if |F| exceeds the critical value.
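A minimal sketch of this incremental-R² F test follows; the sample size, parameter counts, and R² values passed in are hypothetical rather than those of Tables 5 and 6:

```python
# Incremental-R^2 F test for two nested network fits.
from scipy import stats

def f_change(r2_old, r2_new, n, p_old, p_new, alpha=0.05):
    q = p_new - p_old                      # parameters added by the change
    f = ((r2_new - r2_old) / q) / ((1.0 - r2_new) / (n - p_new))
    crit = stats.f.ppf(1.0 - alpha, q, n - p_new)
    return f, crit, abs(f) > crit          # reject H0 if |F| > critical value

# Hypothetical figures: R^2 rises from 0.478 to 0.811 after adding neurons.
print(f_change(r2_old=0.478, r2_new=0.811, n=30, p_old=4, p_new=7))
```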
In the illustration, we considered a stepwise increase of the hidden neurons from 1 to 10, keeping the input units constant. The second set considers input units from 2 to 6, keeping the hidden neurons constant.

Table 5: R² Values for Change in Hidden Neurons

HN    R²          F           HN change    ΔR² change    F change    |F| change    Decision on H₀
1     0.478346    7.335835    1 to 2       0.332489      26.95519    26.95519      Reject
2     0.810835    34.28647    2 to 3       -0.02534      -4.99637    4.99637       Accept
3     0.785492    29.29465    3 to 4       0.087695      25.79043    25.79043      Accept
4     0.873187    55.08508    4 to 5       0.011567      6.331858    6.331858      Accept
5     0.884754    61.40357    5 to 6       -0.12842      -36.5849    36.5849       Accept
6     0.756336    24.83205    6 to 7       0.092212      19.98969    19.98969      Accept
7     0.848547    44.81431    7 to 8       0.046143      23.14491    23.14491      Accept
8     0.894691    67.96664    8 to 9       0.084534      309.0962    309.0962      Reject
9     0.979224    377.0628    9 to 10      -0.20962      -350.339    350.339       Accept
10    0.769609    26.72359

From Table 5, it is noticed that, except for the hidden neuron changes from 2 to 3, 5 to 6, and 9 to 10, which show a negative contribution, all other results show a positive contribution of the change. In the same vein, the inference results show rejection of the null hypothesis at the changes from 1 to 2 and from 8 to 9, implying a significant effect on the network model when the number of hidden neurons was increased to 2 and to 9, respectively.

Table 6: R² Values for Change in Input Units

K    R²          F           K change    ΔR² change    F change    |F| change    Decision on H₀
2    0.769609    26.72359    2 to 3      0.219852      724.4002    724.4002      Reject
3    0.989462    327.8361    3 to 4      -0.00063      -18.6582    18.6582       Accept
4    0.988834    176.722     4 to 5      0.009836      1324.805    1324.805      Reject
5    0.99867     938.7028    5 to 6      -0.01089      -837.658    837.658       Accept
6    0.98778     64.66858

Table 6 shows that a change from an even to an odd number of input units gives a positive contribution, while a change from an odd to an even number gives a negative contribution. This is attested to by the rejection of the null hypothesis whenever the input count is increased from even to odd, and acceptance whenever it is increased from odd to even.

  6. INFORMATION CRITERION FOR THE STATISTICAL NEURAL NETWORKS
    Model selection is vital in data analysis, and Akaike (1973, 1974) introduced the AIC to evaluate model fit. The AIC inspired other criteria such as the SIC (Schwarz, 1978), BIC (Akaike, 1978), and HQ (Hannan & Quinn, 1979). Sugiura (1978) introduced the AICc, a correction for small samples. In neural networks, determining optimal parameters, especially the number of hidden units, is challenging. Murata et al. (1994) introduced the NIC, inspired by the AIC, to find the best model and parameters for approximating the system’s distribution from training examples, measuring the network’s risk against the target distribution.
    The Network Information Criterion, as developed by Murata et al. (1994), is given as
    (14)
    expressed in terms of the empirical and the estimated parameters.
    Udomboso et al. (2016) expanded on model selection in Statistical Neural Networks using their developed Network Information Criterion (NIC). The NIC, though sample-biased, offers an objective, data-driven criterion for choosing the optimal parameter from candidate models. This study employed a criterion designed to be an asymptotically unbiased estimator of the expected Kullback-Leibler information, following Akaike (1973).

6.1 Adjusted Network Information Criterion (ANIC)
In deriving an Adjusted NIC, we assume that the estimated network model includes the true network model; the approach uses a correction based on Kullback's symmetric divergence, as used by Hafidi and Mkhadri (2006). We note that
(15)
where the added term is a quantity that improves the NIC, yielding an asymptotically unbiased estimator of the expected Kullback-Leibler information.
The process of developing the ANIC stemmed from proving the expression
(16)
which, after some mathematical analysis, results in
(17)
which is a correction for the biased NIC.
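The exact NIC and ANIC expressions are given in the cited derivations; purely as an illustrative analogue of criterion-based selection, the sketch below scores candidate hidden sizes with an AIC-style penalty and picks the minimizer (the residuals are simulated stand-ins, and the parameter count is an assumption):

```python
# Illustrative analogue of information-criterion model selection:
# score networks of increasing hidden size and keep the minimizer.
import numpy as np

def penalized_criterion(residuals, k, n):
    # n*log(MSE) + 2k, the familiar AIC form for Gaussian errors
    mse = np.mean(residuals**2)
    return n * np.log(mse) + 2 * k

rng = np.random.default_rng(7)
n = 200
scores = {}
for hidden in (1, 2, 5, 10):
    k = 3 * hidden + 1                               # rough count for 2 inputs
    resid = rng.normal(scale=1.0 / hidden, size=n)   # stand-in residuals
    scores[hidden] = penalized_criterion(resid, k, n)
best = min(scores, key=scores.get)                   # hidden size with lowest score
```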

Table 7: Sample Sizes at which NIC and ANIC exhibit Local Minima

Transfer Function | NIC (2 Vars) | NIC (3 Vars) | ANIC (2 Vars) | ANIC (3 Vars)
SATLINS | 20, 80, 150, 250 | 40, 80, 125, 200 | 250 | 60, 125, 200
TANH | 60, 100, 250 | 40, 125, 175, 250 | 80, 200 | 60, 125, 175
TANSIG | 40, 100, 150, 250 | 40, 80, 125, 175, 250 | 20, 60, 200, 300 | 40, 80, 150, 300
SATLINS_TANH | 60, 100, 150, 300 | 60, 100, 175, 250 | 80, 150, 200 | 150, 400
SATLINS_TANSIG | 60, 100, 250 | 60, 125, 250 | 150, 250 | 40, 250

Rates of efficiency for the NIC and ANIC are 33% and 50% for the 2-variable case, and 30% and 43% for the 3-variable case, respectively. The results of the ANIC demonstrate the high precision of SNN models at large samples. When comparing NIC and ANIC for neural network model selection, more local minima with increasing sample size do not necessarily mean better performance. The ANIC, with fewer local minima than the NIC, better handles the impact of sample size due to its complexity-fit trade-off, reduced sensitivity, bias correction, reduced overfitting, enhanced generalization, efficiency, and practicality.

  7. THE WAVELET NEURAL NETWORK
    The wavelet transform is given by

    W(a, b) = \frac{1}{\sqrt{a}} \int_{-\infty}^{\infty} f(t) \, \psi\!\left(\frac{t - b}{a}\right) dt        (18)

    where a and b are the scale and location parameters, respectively.
    By adjusting the scale parameter, a, a series of different frequency components in the signal can be obtained.
    Restricting a and b to discrete values, a = a_0^m and b = n b_0 a_0^m, then

    \psi_{m,n}(t) = a_0^{-m/2} \, \psi\!\left(a_0^{-m} t - n b_0\right)        (19)

    where ψ is known as the mother wavelet.
    In both cases, it is assumed that \int \psi(t) \, dt = 0.
    Wavelet methods have been most studied in the nonparametric regression problem of estimating a function f on the basis of observations y_i at time points t_i, modeled as

    y_i = f(t_i) + \varepsilon_i        (20)

    where \varepsilon_i is the noise.
    Wavelet Neural Networks (WNNs) merge wavelet theory and neural networks. The WNN is structured after feed-forward neural networks, comprising input, hidden, and output layers with linear combiners. Hidden neurons employ activation functions drawn from an orthonormal wavelet family.
    In estimating the WNN, we minimize the usual least-squares cost function

    E = \frac{1}{2N} \sum_{n=1}^{N} \left( y_n - \hat{y}_n \right)^2        (21)

    where N is the number of estimation (training) samples for each class, and \hat{y}_n is the optimal output for the n-th input vector.
    The partial derivatives of E with respect to the parameters a, b, and w are obtained, and the parameters are adjusted by

    \theta(t+1) = \theta(t) - \eta \, \frac{\partial E}{\partial \theta}        (22)

    where θ is the vector of the parameters a, b, and w, and η is the learning rate, between 0.1 and 0.9.
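As a numerical companion to Eq. (18), the following is a hedged sketch of a continuous wavelet transform over a grid of scales; the Ricker ("Mexican hat") mother wavelet, the grid, and the test signal are assumptions for illustration:

```python
# Grid-based continuous wavelet transform: W(a, b) for each scale a
# and every location b on the time grid.
import numpy as np

def ricker(t):
    # Ricker wavelet, a common zero-mean mother wavelet
    return (1.0 - t**2) * np.exp(-t**2 / 2.0)

def cwt(signal, t, scales):
    coeffs = []
    for a in scales:
        # psi[i, j] = ricker((t_i - t_j) / a) / sqrt(a)
        psi = ricker((t[:, None] - t[None, :]) / a) / np.sqrt(a)
        coeffs.append(psi @ signal * (t[1] - t[0]))   # Riemann-sum integral
    return np.array(coeffs)                           # shape: (scales, time)

t = np.linspace(0, 10, 256)
x = np.sin(2 * np.pi * t) + 0.3 * np.sin(8 * np.pi * t)
W = cwt(x, t, scales=[0.25, 0.5, 1.0, 2.0])
```

Small scales pick out the fast 4 Hz component, while larger scales respond to the slow 1 Hz component, which is the frequency-separation behaviour described above.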
  8. APPLICATIONS OF THE SNN MODEL TO SOLVING LIFE PROBLEMS
    Applications have been made to diverse real-world challenges, spanning oil and gas, climate, economics, health, and more. Wavelet transforms and time series were utilized in some cases.
    8.1 Applications to the Environment
    8.1.1 Modeling of Rainfall Precipitation
    In a comparative study by Udomboso and Amahia (2011) focusing on rainfall prediction in Ibadan, Nigeria, they assessed the performance of Ordinary Least Squares (OLS) and Statistical Neural Network (SNN) models. Utilizing data from the Nigerian Meteorological Agency (NIMET) station in Ibadan, they analyzed rainfall, temperature, and humidity. Their findings indicated that, as sample size increased, OLS’s Mean Squared Error (MSE) decreased, while SNN’s MSE increased. However, SNN outperformed OLS in terms of MSE, AIC, and SIC across different sample sizes, highlighting its superior performance in modeling rainfall patterns. (See Table 8)
    Table 8: Model Selection for both the OLS and SNN

n      Model    HN     MSE     R²      Adj R²    AIC      SIC
132    OLS      -      8.00    0.25    0.237     8.368    8.934
       SNN      2      3.40    0.03    0.02      3.558    3.799
       SNN      5      2.41    0.31    0.30      2.522    2.693
       SNN      10     2.60    0.26    0.25      2.721    2.905
       SNN      50     2.53    0.28    0.27      2.648    2.827
       SNN      100    2.39    0.32    0.31      2.501    2.671
264    OLS      -      6.78    0.28    0.275     6.933    7.220
       SNN      2      5.91    0.06    0.05      6.046    6.297
       SNN      5      5.46    0.13    0.12      5.586    5.817
       SNN      10     4.44    0.29    0.29      4.542    4.730
       SNN      50     3.53    0.44    0.44      3.611    3.761
       SNN      100    0.76    0.89    0.89      0.778    0.810
396    OLS      -      6.70    0.29    0.289     6.795    7.003
       SNN      2      9.43    0.00    0.00      9.574    9.867
       SNN      5      8.88    0.06    0.06      9.016    9.292
       SNN      10     7.98    0.17    0.17      8.102    8.350
       SNN      50     5.41    0.43    0.43      5.493    5.661
       SNN      100    2.48    0.74    0.74      2.518    2.595

Udomboso et al. (2014) employed Artificial Neural Networks and Continuous Wavelet Transform to model monthly rainfall in Ibadan (Jan 1971 to Dec 2003). This approach blended wavelet and neural network techniques for improved rainfall simulation and forecasting, analyzing 48 sub-time series but focusing on the top 10. Table 9 displays sample statistics for the original data and CWT. Original data: Mean 6.08, Std. Dev. 14.33, Std. Err. 1.01. Sub-time series: Mean -0.01 (CWT 1) to 0.23 (CWT 12), Std. Dev. 5.27 (CWT 1) to 13.74 (CWT 12), Std. Err. 0.37 (CWT 1) to 0.97 (CWT 12), respectively.
Table 9: Sample Statistics of the Original Data and the CWT

Series                Mean          Standard Deviation    Standard Error of the Mean
Original Data         6.081095      14.33177              1.010885
Sub-Time Series 1     -0.0105473    5.268888              0.3716387
Sub-Time Series 2     0.0268657     11.987                0.8454978
Sub-Time Series 4     0.1049254     12.90514              0.9102586
Sub-Time Series 3     0.0257711     13.10204              0.9241467
Sub-Time Series 6     0.1443781     11.59102              0.8175676
Sub-Time Series 8     0.1827363     11.85019              0.8358478
Sub-Time Series 5     0.2015423     12.08643              0.852511
Sub-Time Series 10    0.2196518     12.79639              0.9025877
Sub-Time Series 12    0.2327861     13.73769              0.9689816
Sub-Time Series 7     0.1508955     11.57113              0.8161646

This study used three transfer functions (SATLINS, TANSIG, TANH) selected based on low errors, with daily data for training. The continuous wavelet neural network consistently outperformed other methods. Table 10 displays CWNN results for these transfer functions. For TANH, error variances ranged from 0.0007 (CWT 1) to 0.0097 (CWT 6), compared to the original data’s 0.0601. TANSIG results ranged from 0.0007 (CWT 1) to 0.0393 (CWT 2), with only CWT 2 exceeding the original data’s 0.0082. SATLINS results showed smaller error variances.
Table 10: CWNN Result based on the Transfer Functions

Sub-Time Series       Mean Absolute Error                       Error Variance
                      Tanh          Tansig        Satlins       Tanh          Tansig        Satlins
Sub-Time Series 1     0.013951741   0.014383582   0.083100995   0.000729592   0.000739871   0.099012345
Sub-Time Series 2     0.024978109   0.131156219   0.216885075   0.003239021   0.039282011   0.493311007
Sub-Time Series 4     0.030443781   0.029195522   0.217135323   0.002145085   0.002329514   0.42369717
Sub-Time Series 3     0.05209204    0.048633333   0.286941791   0.004102457   0.004778149   0.709184674
Sub-Time Series 6     0.071986567   0.028030348   0.247026866   0.009715933   0.002541222   0.427719785
Sub-Time Series 8     0.023469652   0.040067164   0.352124876   0.002512283   0.006018716   0.951654567
Sub-Time Series 5     0.032890547   0.035047761   0.285989552   0.002339264   0.002771785   0.587782167
Sub-Time Series 10    0.046734328   0.016926866   0.381137811   0.005444832   0.000814949   0.791104316
Sub-Time Series 12    0.022665174   0.025757214   0.407753731   0.00073663    0.001418593   0.763670662
Sub-Time Series 7     0.039681095   0.043672637   0.270354229   0.002958337   0.002743456   0.62697559

Conducting tests of hypotheses on the results obtained, it was found that there are significant differences between each decomposed series and the original data, as shown in Tables 11 and 12. Three alternative hypotheses were constructed: H₁: μ_d ≠ μ₀, H₁: μ_d < μ₀, and H₁: μ_d > μ₀, where μ_d is the mean of the decomposed data. The variance ratio test was used to test the validity of the model based on the error generated by the network from each decomposed series: H₁: σ_d² ≠ σ₀², H₁: σ_d² < σ₀², and H₁: σ_d² > σ₀², where σ_d is the standard deviation of the decomposed data. The test shows that at α = 0.05, sub-series 1, 2, 4, and 6 are significant, while at α = 0.10, sub-series 1, 2, 4, 6, and 5 are significant. These can be seen in Table 12.
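A hedged sketch of the variance-ratio (F) test used here is given below; the two series are hypothetical stand-ins whose scales loosely mimic Table 9, not the study's data:

```python
# Variance-ratio (F) test: H0 of equal variances against a
# one-sided (upper-tail) alternative.
import numpy as np
from scipy import stats

rng = np.random.default_rng(3)
original = rng.normal(scale=14.3, size=397)     # stand-in "original" series
decomposed = rng.normal(scale=5.3, size=397)    # stand-in decomposed series

f = np.var(original, ddof=1) / np.var(decomposed, ddof=1)
df = len(original) - 1, len(decomposed) - 1
p_upper = stats.f.sf(f, *df)                    # upper-tail p-value
print(f, p_upper)
```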
Table 11: Paired Sample Statistics of the Original Data and the CWNN

Pair                             95% Confidence Interval of the Difference    t         p-value
                                 Lower         Upper
Original Data - Sub-Series 1     4.605974      7.577309                       8.0853    0.000
Original Data - Sub-Series 2     4.788468      7.31999                        9.4317    0.000
Original Data - Sub-Series 4     4.412868      7.539471                       7.5381    0.000
Original Data - Sub-Series 3     4.331777      7.77887                        6.9278    0.000
Original Data - Sub-Series 6     4.078572      7.794861                       6.3002    0.000
Original Data - Sub-Series 8     3.893714      7.903002                       5.8020    0.000
Original Data - Sub-Series 5     3.841665      7.917439                       5.6892    0.000
Original Data - Sub-Series 10    3.731568      7.991318                       5.4267    0.000
Original Data - Sub-Series 12    3.608266      8.088351                       5.1482    0.000
Original Data - Sub-Series 7     3.809         8.051398                       5.5128    0.000

Table 12: Variance Ratio Test of the Original Data and the CWNN

Pair                             F          1/F      p (Upper Tail)    p (Lower Tail)
Original Data - Sub-Series 1     171.375    0.006    0.0000            0.0000
Original Data - Sub-Series 2     7.016      0.143    0.0319            0.016
Original Data - Sub-Series 4     9.256      0.108    0.0159            0.008
Original Data - Sub-Series 3     3.314      0.302    0.1705            0.0853
Original Data - Sub-Series 6     9.142      0.109    0.0164            0.0082
Original Data - Sub-Series 8     1.830      0.546    0.4806            0.2403
Original Data - Sub-Series 5     4.807      0.208    0.0776            0.0388
Original Data - Sub-Series 10    2.653      0.377    0.2602            0.1301
Original Data - Sub-Series 12    2.834      0.353    0.2305            0.1153
Original Data - Sub-Series 7     4.232      0.236    0.1027            0.0513

The analysis showed that, except in extremely rare cases, all the series performed optimally compared to the original data. The study has thus shown that incorporating the continuous wavelet transform into the ANN technique yields better network performance.

8.1.2 Global Solar Radiation
Sustainable development emphasizes eco-friendly energy (Boeker & van Grondelle, 2011). Solar power, abundant in places like Nigeria (Lana and Lamont, 2009), requires precise radiation measurements often hampered by costs in developing nations (Trinka et al., 2005). Artificial Neural Networks (ANNs) increasingly offer accurate predictions. Nymphas and Udomboso (2020) employed ANNs for solar radiation estimation in Ibadan, Nigeria, using temperature and humidity inputs, enhancing existing methods. Data from January 1995 to December 2004 displayed climate parameters, including daily Tmax (32.02°C) and Tmin (22.44°C), rainfall (108.95 mm), RHmax (98.25%), temperature range (14.47°C to 32.43°C), 153.3 hrs of monthly sunshine, and daily solar radiation fluctuations (1.8 MJ/m²/day to 29.1 MJ/m²/day). The ANNs achieved competitive Mean Squared Errors (MSEs), corroborating Rehmann et al. (2008).

Figure 4: Estimated and measured solar radiation (panels: mean temperature; relative humidity; daily temperature and relative humidity)

8.1.3 Soil Physico-Chemical Properties on Adsorption
Heavy metal contamination in soils is a significant environmental issue, with detrimental economic and health impacts. Heavy metals like lead (Pb), copper (Cu), zinc (Zn), and cadmium (Cd) pose threats to agriculture, human well-being, and soil ecosystems, prompting concern (Adriano, 2001). While these metals are naturally occurring, human activities and industrial waste have heightened their presence, affecting soil quality and public health. Udomboso et al. (2017) examined the adsorption of these heavy metals in soil, considering factors like pH, goethite content, humic acid, time, and sorbate concentration. Results varied across properties, with Cd showing the highest adsorption, except for humic acid, where Zn exhibited the most significant adsorption. This underscores the importance of soil management and pollution control in Nigeria.

8.1.4 Predicting River Discharge
Surface water is pivotal for various sectors like agriculture, power generation, and fisheries. Managing its flow is crucial for hydropower, water supply, and flood prevention. Amid rising global water demand and uncertainties in water availability, flood prediction, low-flow areas, and hydrological droughts become vital. Predicting discharge aids in mitigating floods, managing droughts, environmental needs, sectoral water demands, reservoir levels, and disaster response. Fashae, Olusola, Ndubuisi and Udomboso (2018) compared ANN and ARIMA models in modeling the Opeki River, a significant tributary of the River Ogun in Oyo State. This catchment’s unique features offer insights into groundwater recharge and discharge dynamics, supported by an existing gauging station on the river.

Figure 5: ARIMA and ANN predicted discharge from 1982-2010

The study favoured ARIMA modeling for river discharge, especially with limited data. River Opeki’s significance in economic, social, and environmental aspects highlights the need for improved modeling amid Africa’s infrastructure challenges. This offers opportunities for efficient agriculture, flood alerts, hydroelectricity, water supply, and river health promotion.
8.1.5 Estimation of Monthly Evapotranspiration
Evapotranspiration (ET) is vital for water resource management. Chukwu, Udomboso, and Onafeso (2011) studied ET estimation using climatic data from IITA, Nigeria, comparing Statistical Neural Network (SNN) and classical regression. SNN consistently outperformed regression, showing lower Mean Squared Errors (MSE). Increasing hidden neurons in SNN reduced MSE, demonstrating its capacity to capture complex relationships. This highlights SNN’s potential for accurate ET estimation, especially in climate modeling and water resource management. Lorentz et al. (2010) also noted that relative humidity changes may better indicate evaporation variations. Further research is essential for validating these findings and addressing climate change challenges.
8.2 Application to Financial Time Series
8.2.1 Time Series Forecasting with SNN and CWT
In Udomboso and Amahia’s (2016) study, a time series forecasting model for the naira-dollar exchange rate was developed, combining Statistical Neural Networks (SNN) and continuous wavelet transforms. Using annual exchange rate data from late 2012, they focused on three exchange rate variables: Buying Rate (BR), Selling Rate (SR), and Central Rate (CR), represented by corresponding SNN models (PBR, PSR, and PCR). These rates were decomposed into ten continuous wavelet signals (W1 to W10), creating 120 data points per rate. Each signal served as input for the network, and predictions were cross-referenced with the original data. The study showed strong correlations between the decomposed series and the original data, indicating mean prediction stability and generally improved accuracy compared to the original dataset.
In Udomboso and Saliu’s (2016) research, they developed an inference procedure for neural networks using the bootstrap method to assess the market efficiency of the Nigerian exchange rate from 2001 to 2015. They employed a multilayer perceptron network architecture and estimated various SNN models at different lags and hidden neuron configurations. Selection criteria identified the best model for each lag, and model evaluation confirmed that residuals were independently and identically distributed without serial autocorrelation, suggesting potential abnormal earnings in the market.

8.3 Applications in the Oil and Gas Sector
8.3.1 Gas Modeling with TRSM and TSNN
Artificial Neural Networks (ANNs) are crucial in petroleum engineering, especially when conventional models face data limitations. Falode and Udomboso (2016) analyzed Nigeria’s natural gas production, utilization, and flaring using the Time Series Regression Model (TSRM) and Time Series Neural Network (TSNN). Gas flaring, a long-standing issue in Nigeria, impacts the environment and local agriculture. ANNs find applications in reservoir analysis, drill bit grading, pump malfunction detection, and reservoir modeling. In Figure 6, trends in Nigeria’s natural gas production, utilization, and flaring reveal complex interdependencies. Flaring surpassed utilization until 2004, highlighting the intricate relationship between production, utilization, and flaring over time.

Figure 6: Trends in Nigeria’s natural gas production, utilization, and flaring (1958-2006)

The correlogram showed non-stationarity, confirmed by the Augmented Dickey Fuller test with trend and intercept (p-values > 5%). After differencing, variables became stationary with constant means and reduced autocorrelation.
Table 13: Estimated Model Adequacy for Gas Production, Utilization and Flaring

Variable       Model    HL    MSE         R²       Adj R²    AIC         SIC
Production     TSRM     -     4.029e19    0.878    0.875     4.370e19    4.720e19
               TSNN     2     2.110e19    0.936    0.935     2.196e19    2.373e19
               TSNN     5     1.900e19    0.942    0.941     1.977e19    2.136e19
               TSNN     10    1.805e19    0.945    0.944     1.878e19    2.029e19
Utilization    TSRM     -     4.148e19    0.670    0.663     4.500e19    4.860e19
               TSNN     2     2.652e19    0.789    0.784     2.761e19    2.982e19
               TSNN     5     2.237e19    0.823    0.818     2.328e19    2.515e19
               TSNN     10    1.034e19    0.918    0.916     1.076e19    1.162e19
Flaring        TSRM     -     2.998e19    0.672    0.665     3.250e19    3.510e19
               TSNN     2     1.285e19    0.859    0.856     1.338e19    1.445e19
               TSNN     5     1.272e19    0.861    0.858     1.324e19    1.431e19
               TSNN     10    1.032e19    0.887    0.885     1.074e19    1.160e19

Table 13 summarizes the model adequacy results. MSE, R², AIC, and SIC indicate model fit. The results reveal significantly smaller MSEs for the TSNN compared to the TSRM.

8.3.2 Prediction of Oilfield Scale Formation
Falode, Udomboso, and Ebere (2016) addressed critical challenges in the oil and gas sector by predicting the formation of BaSO4 and CaSO4 oilfield scale. These scales can disrupt operations due to BaSO4’s extreme insolubility and the variable crystallization forms of CaSO4. The research involved two primary cases: predicting BaSO4 precipitation under varying temperature and pressure and determining CaSO4 kinetic rate constants, considering factors like temperature, pressure drawdown, and flow rate. Input variables were standardized within a (0, 1) range for precise modeling. Neural network architectures were optimized based on their performance in minimizing mean squared error (MSE). The results indicated that the neural network models achieved low MSE values, showcasing their potential for effectively managing oilfield scale-related challenges in the industry.
8.3.3 Efficient Crude Oil Pricing
Crude oil, a vital global commodity, comprises a third of worldwide energy consumption, with a profound impact, especially on developing economies and oil-dependent nations. Oil price fluctuations influence inflation rates and overall economic performance. Forecasting oil prices aids stakeholders in making informed decisions, considering demand, supply, and geopolitical factors. Price volatility significantly affects the global economy, particularly oil-importing countries, necessitating accurate price forecasts. In their study, Falode and Udomboso (2021) utilized Autoregressive Neural Networks (ARNN) to forecast crude oil prices, using data spanning January 2006 to October 2020, aiming to mitigate risks and facilitate informed planning.
Typically, the Autoregressive model of order p, written as AR(p), is given as

y_t = \phi_0 + \phi_1 y_{t-1} + \cdots + \phi_p y_{t-p} + \varepsilon_t        (23)

where \phi_0, \ldots, \phi_p are the autoregressive parameters, and \varepsilon_t is white noise.
The Autoregressive Neural Network model used in this study is modified from Medeiros et al. (2006) as

y_t = G(x_t; \psi) + \varepsilon_t        (24)

where G(·) is a non-linear function of x_t with parameter vector ψ, and x_t is a vector of lagged values of y_t. The term \varepsilon_t, a sequence of independently normally distributed random variables, is the stochastic term, also known as the error. The function G is built on the transfer function (also known as the activation or link function).
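A hedged sketch of fitting the AR(p) model of Eq. (23) by least squares follows; the simulated AR(1) series and its coefficient are hypothetical, and the ARNN of Eq. (24) would replace the linear map below with the network G:

```python
# Least-squares fit of an AR(p) model on a lagged design matrix.
import numpy as np

def fit_ar(y, p):
    # Design matrix [1, y_{t-1}, ..., y_{t-p}] aligned with target y_t.
    X = np.column_stack([np.ones(len(y) - p)] +
                        [y[p - i - 1:len(y) - i - 1] for i in range(p)])
    target = y[p:]
    phi, *_ = np.linalg.lstsq(X, target, rcond=None)
    return phi                      # intercept and AR coefficients

rng = np.random.default_rng(11)
y = np.zeros(300)
for t in range(1, 300):             # simulate AR(1) with phi_1 = 0.7
    y[t] = 0.7 * y[t - 1] + rng.normal()
print(fit_ar(y, p=1))               # estimates near (0, 0.7)
```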

Figure 7: Graph of Crude Oil Production, Export and Price (Jan 2006 –Oct 2020)
Figure 7 displays Nigeria’s crude oil production, export, and global price trends. All variables exhibit seasonal variations. Nigeria maintains relatively consistent production and export, while global prices show sharp fluctuations.
Table 14: Exploratory Data Analyses of Crude Oil Production, Export and Price (Jan 2006 – Oct 2020)

Variable                Interquartile Range    Mean       Median     Skewness    Excess Kurtosis
Crude Oil Production    0.3000                 2.1292     2.1400     -0.2491     0.1928
Crude Oil Export        0.3000                 1.6792     1.6900     -0.2491     0.1928
Crude Oil Price         48.1975                77.2874    73.2800    0.2721      -0.9300

Table 14 summarizes 14 years of data on crude oil production, export, and global price. The mean and median values indicate stable production and export (mean 2.13 mbd, median 2.14 mbd; mean 1.68 mbd, median 1.69 mbd), with a mean price of 77.29 US$/barrel against a median of 73.28 US$/barrel. Production and export have a narrow interquartile range of 0.3 mbd. Skewness values near zero suggest roughly symmetric distributions, while excess kurtosis values indicate slightly leptokurtic distributions for production and export (0.1928) and a slightly platykurtic distribution for price (-0.9300).
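A short sketch reproducing the summary measures of Table 14 is given below; the price series is a hypothetical stand-in, and scipy's Fisher convention is assumed (a normal distribution scores an excess kurtosis of 0):

```python
# Computing IQR, mean, median, skewness, and excess kurtosis.
import numpy as np
from scipy import stats

price = np.random.default_rng(5).gamma(shape=4.0, scale=20.0, size=178)

iqr = np.subtract(*np.percentile(price, [75, 25]))   # 75th minus 25th percentile
summary = {
    "IQR": iqr,
    "mean": price.mean(),
    "median": np.median(price),
    "skewness": stats.skew(price),
    "excess kurtosis": stats.kurtosis(price),        # Fisher: normal -> 0
}
print(summary)
```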
Table 15: SNN Model Determination for Crude Oil Price

Hidden Neuron    MSE         R²        p-value    NIC
1                558.3831    0.2336    0.0366     560.4167
2                520.2403    0.2860    0.0280     519.1867
3                520.7263    0.2853    0.0281     523.4690
4                539.1649    0.2600    0.0318     540.6032
5                511.3642    0.2981    0.0264     513.9875
10               491.6980    0.3251    0.0233     493.2909
20**             306.8214    0.5789    0.0082     307.3058
30               435.7316    0.4019    0.0168     439.0139
40               366.2549    0.4973    0.0114     365.9786
50               408.9640    0.4387    0.0144     409.1884
75               329.1152    0.5483    0.0093     328.6609
100*             198.9716    0.7269    0.0043     199.3477

  * and ** preferred models for forecasting

Table 15 displays the model determinations from the SNN part of the study, favoring the models with 100 and 20 hidden neurons based on the maximum R² values (0.73 and 0.58), significant at the 1% level.
Overall, Nigeria’s crude oil production and export have declined over the years, despite seasonal variations. The autoregressive neural network with 100 neurons outperformed others. COVID-19 significantly disrupted the oil industry, causing disruptions, reduced demand, price crashes, and operational challenges. This study highlights the need for economic diversification away from oil dependence.

8.4 Applications in the Health Sector
8.4.1 Modelling Cholera in Nigeria and South Africa
Cholera, a historic disease dating back to the early 1800s, gained notoriety through the Calcutta outbreak, linked to poor living conditions and tainted water sources. In 1854, the Italian scientist Filippo Pacini identified Vibrio cholerae as the causative bacterium, the same year as London’s Broad Street cholera outbreak. Cholera is characterized by symptoms such as severe diarrhea, vomiting, dehydration, and muscle cramps. It remains a global health concern, predominantly affecting developing tropical and subtropical regions. WHO reports documented cholera cases in African countries like Angola, Congo, Mozambique, Nigeria, Somalia, Tanzania, and South Africa. In Oduaran and Udomboso (2017), cholera modeling for Nigeria and South Africa revealed peaks in Nigeria’s cholera incidence, death, and fatality rates in 1970, 1990, 2000, and 2010, with declining fatality rates post-1990. In South Africa, high incidences were observed in 2000, with relatively elevated death rates in 1980, 2000, and 2010, accompanied by fluctuating fatality rates. The study suggests a need for increased hidden neurons for accurate cholera predictions.

Figure 8: Graph of Fatality, Predicted Fatality, Fatality Growth, and Predicted Fatality Growth Rates due to Cholera in Nigeria (1970 – 2013) (panels: fatality and predicted fatality; fatality growth rate and predicted fatality growth rate)

Figure 9: Graph of Fatality, Predicted Fatality, Fatality Growth, and Predicted Fatality Growth Rates due to Cholera in South Africa (1970 – 2013) (panels: fatality and predicted fatality; fatality growth rate and predicted fatality growth rate)

Figures 8 and 9 display smoother predicted fatality and growth rates, with South Africa exhibiting a notably smoother trend compared to Nigeria.

8.4.2 Contraception Usage
Contraception and involuntary fertility control are key components of family planning recognized by the World Health Organization. Contraception and abortion, affecting fertility before and after gestation, are pivotal for married and unmarried individuals. Condom use stands as the sole proven method to significantly reduce STI and HIV/AIDS transmission. In sub-Saharan Africa, contraceptive prevalence is 25%, with Ghana and Nigeria reporting rates of 23.5% and 14.1%, respectively, impacting maternal mortality rates. It is against this backdrop that Udomboso et al. (2015) and Udomboso and Amoateng (2018) examined factors influencing contraception in Nigeria and Ghana using neural networks and DHS data.

Table 16: Statistics of Model Determination

Never Married
Hidden Neuron    MSE                   R2                   NIC
                 Nigeria    Ghana      Nigeria    Ghana     Nigeria    Ghana
1                0.253      0.252      0.02       0.00      0.259      0.284
5()(+)           0.240      0.252      0.07       0.00      0.237      0.295
10+(*)(+)        0.251      0.231      0.03       0.09      0.259      0.236
25+()(+)         0.237      0.126      0.08       0.50      0.234      0.127
50()(+)          0.215      0.245      0.16       0.03      0.213      0.252
75+()(+)         0.059      0.206      0.77       0.19      0.060      0.203
100+()(+)        0.008**    0.081**    0.97       0.68      0.009      0.079

Ever Married
Hidden Neuron    MSE                   R2                   NIC
                 Nigeria    Ghana      Nigeria    Ghana     Nigeria    Ghana
1                0.196      0.253      0.00       0.00      0.203      0.204
5()(+)           0.175      0.238      0.10       0.06      0.176      0.236
10+(*)(+)        0.167      0.226      0.15       0.11      0.165      0.228
25+()(+)         0.18       0.221      0.08       0.12      0.180      0.224
50()(+)          0.142      0.138      0.27       0.45      0.142      0.139
75+()(+)         0.089      0.109      0.55       0.57      0.089      0.110
100+()(+)        0.007**    0.106**    0.96       0.58      0.008      0.106

** Model selected for prediction

* Significant at 5% level of significance (P>F) for Nigeria (never married)
+ Significant at 5% level of significance (P>F) for Ghana (never married)
(*) Significant at 5% level of significance (P>F) for Nigeria (ever married)
(+) Significant at 5% level of significance (P>F) for Ghana (ever married)

Table 17: Estimates of usage and rate of usage of contraception (in percent)

Method    Mean Usage (%)        Standard Error of Usage (%)    Rate of Usage (%)
          Nigeria    Ghana      Nigeria    Ghana               Nigeria    Ghana
AM        11.9       19.2       0.8        1.2                 4.1        4.7
AMM       7.2        10.5       0.9        1.5                 4.3        5.9
ATM       4.7        8.7        0.1        -0.1                0.6        -0.3
NCU       88.1       80.8       -0.1       -0.3                -0.4       -1.0

Table 18: Forecasts of usage and rate of usage of contraception till 2030 (in percent)

Method    Mean of Usage Forecast (%)    Standard Error of Usage Forecast (%)    Rate of Usage Forecast (%)
          Nigeria    Ghana              Nigeria    Ghana                        Nigeria    Ghana
AM        16.0       25.2               1.0        2.1                          3.8        3.4
AMM       10.6       18.8               0.7        1.4                          4.0        4.2
ATM       5.4        6.4                0.1        0.2                          0.6        -1.8
NCU       84.0       74.8               0.7        1.1                          -0.4       -0.8

Udomboso et al. (2015) investigated factors affecting contraception use in “never married” women in Nigeria and Ghana, highlighting education, desire for children, opposition to contraception, and location as key influencers. Udomboso and Amoateng (2018) observed a preference for modern contraception over traditional methods, with a 4.4% overall growth in contraception use and declining reluctance (0.7%). Injectables are expected to dominate by 2030.
8.4.3 Vesico-Vaginal Causality
Vesicovaginal fistula (VVF) is a devastating condition characterized by abnormal communication between the vaginal wall and the bladder or rectum, leading to uncontrollable urinary leakage. It’s a significant public health issue, notably in Nigeria, affecting 150,000 to 200,000 patients. While Statistical Neural Networks (SNNs) are effective for nonlinear regression, their application to VVF causative factors is limited. James, Udomboso, and Onwuka (2012) compared SNNs and linear regression models to understand VVF causes. SNNs outperformed, yielding a higher R-squared (0.8 vs. 0.46) and lower mean square error (2011.0 vs. 5439.55). This highlights SNNs’ potential in comprehending complex medical conditions like VVF for prevention and treatment. Factors like prolonged obstructed labor and instrument misuse emerged as significant contributors to VVF, emphasizing the value of artificial neural networks in medical research.
8.5 Application to Aviation Transport
Aviation is a pivotal economic sector, reflecting a nation’s development. In Nigeria, it drives rapid growth, employing 245,500 people and contributing NGN185 billion to the economy. Air passenger traffic has surged, reaching 4.7 million and 10.7 million international and domestic passengers in 2013 (National Bureau of Statistics, 2014). Udomboso and Ojo (2021) developed statistical learning algorithms, including support vector machines and statistical neural networks, to analyze air passenger traffic data from 2007 to 2018 at the Murtala Mohammed International Airport (MMIA), Lagos, using records from the Nigerian Airspace Management Agency (NAMA), MMIA.

Figures 10 and 11 depict MMIA’s domestic and foreign airline traffic. They reveal seasonal variations and occasional upward trends, indicating regular passenger influx.
Table 19: Performance Evaluation of SVM Models for Foreign and Domestic Airline Traffic

Kernels       Domestic Airline                          Foreign Airline
              RMSE        C Value    Degree Value       RMSE        C Value    Degree Value
Linear        34958.41    13000      -                  13345.96    25000      -
Rbf           34959.27    15000      -                  13345.93    40000      -
Polynomial    34974.59    16000      -                  13345.93    30000      -
Sigmoid       34958.97    110        -                  13345.96    130        -

Figure 12: Plot of SVM Actual and Predicted (panels: domestic air traffic; foreign air traffic)
Figure 13: Plot of ANN Actual and Predicted (panels: domestic air traffic; foreign air traffic)

Table 19 reveals that the Linear Kernel SVM with C=13000 excels in predicting domestic airline traffic with the lowest RMSE, while the polynomial SVM kernel with degree 130 performs exceptionally well for foreign airline traffic. Figures 12 and 13 depict the actual and predicted plots for domestic and foreign air traffic training data.
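A hedged sketch of this kernel comparison using scikit-learn's SVR is shown below; the synthetic monthly series and most settings are placeholders rather than the NAMA traffic records, with only the C = 13000 value echoing Table 19:

```python
# Comparing SVM regression kernels on a seasonal traffic-like series.
import numpy as np
from sklearn.svm import SVR
from sklearn.metrics import mean_squared_error

rng = np.random.default_rng(8)
t = np.arange(144, dtype=float)        # e.g. 12 years of monthly observations
y = 1000 + 5 * t + 50 * np.sin(2 * np.pi * t / 12) + rng.normal(0, 20, 144)
X = t.reshape(-1, 1)

for kernel in ("linear", "rbf", "poly", "sigmoid"):
    model = SVR(kernel=kernel, C=13000).fit(X, y)
    rmse = mean_squared_error(y, model.predict(X)) ** 0.5
    print(kernel, round(rmse, 2))      # in-sample RMSE per kernel
```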

Table 20: Performance Evaluation of ANN Model using Domestic and Foreign Airline Traffic

No. of Neurons    Domestic Traffic              International Traffic
                  RMSE        Best Iterations   RMSE        Best Iterations
10                33071.80    5900              13359.09    5200
20                33003.88    5300              13357.38    3200
30                33058.59    4000              13387.26    2800
40                33020.20    3700              13348.67    2900
50                33016.37    3700              13381.88    2300
60                33012.22    3700              13382.27    2200
70                33034.08    2800              13397.82    2200
80                33034.05    2800              13420.63    1900
90                33031.52    2700              13431.18    1800

Table 20 shows that the ANN with 20 hidden neurons gives the lowest RMSE for domestic airline traffic, while 40 hidden neurons perform best for international traffic. Together, the SVM and ANN models provide accurate predictions.

Table 21: Optimal metrics for Domestic and Foreign Air Traffic of SVM and ANN

Model    Domestic Traffic             International Traffic
         Test RMSE       Train RMSE   Test RMSE       Train RMSE
SVM      34958.412592    46028.98     13345.925823    38473.20
ANN      33003.878906    43868.25     13348.665039    37559.26

Table 21 provides the optimal metrics for the SVM and ANN models. The SVM outperforms in foreign air traffic prediction, while the ANN excels in domestic traffic prediction. Future studies should explore advanced forecasting techniques, incorporate exogenous factors, and leverage daily data for improved predictive accuracy, contributing to more effective aviation services.
8.6 Other Areas of Application and Ongoing Research
Other areas not covered in this lecture include the application of the neural network model to the study of crime rates (James, Suleiman, Udomboso, and Babayemi, 2015), as well as students’ academic performance (Asogwa and Udomboso, 2016). The current focus includes (but is not limited to) statistical image recognition, which covers vehicle recognition, number plate recognition, and OCR (Optical Character Recognition). Additionally, there is ongoing research in dynamic modeling of internet traffic, graphical models, machine learning-based spatiotemporal analysis, greenhouse gas (GHG) estimation, and neural network estimations with time series models.

  9. THE DEPARTMENT OF STATISTICS
    The Department of Statistics at the University of Ibadan holds a distinguished reputation as the oldest and most prestigious Statistics Department in Nigeria, and it stands as the foremost institution for statistical research and education in the nation. It was established to provide quality education and research in the field of statistics. The Department offers a range of academic programs at both the undergraduate and postgraduate levels, providing opportunities for advanced research and specialization in various units, viz. Biometry, Computational Statistics, Economic and Financial Statistics, Environmental Statistics, and Statistical Design of Investigation. The Department is proud to have a dedicated and experienced faculty of statisticians and researchers who are actively engaged in both teaching and research. It is currently conducting a comprehensive review of its curricula at both the undergraduate and postgraduate levels. As part of this review, a new unit, Mathematical Statistics, will be introduced, and some existing units will undergo name changes to align with global trends in the field.
    The Department of Statistics is renowned for its significant research contributions across various domains of statistics, encompassing mathematical statistics, applied statistics, and data science. These contributions are firmly rooted in the diverse units outlined above, showcasing the Department’s wide-ranging expertise and impact in the field of statistics. Faculty and students often engage in research projects and contribute to academic journals, both nationally and internationally. The Department maintains collaborative partnerships with other academic institutions, government agencies, and industry collaborators, which create valuable research opportunities and internships for students. These collaborations enhance the educational and research experiences offered by the Department, providing students with real-world exposure and networking opportunities. It is also actively involved in community outreach and consultancy services, providing statistical expertise to address real-world problems.
    My Dean, the Department of Statistics at the University of Ibadan, like any academic department, has various needs for its continued functioning. For instance, the newly constructed annex building for the Department of Statistics is in ruins. We have received promises for its revitalization, but to date, nothing has been done. We need help. We also desire that the university, among other things, do the following for us:
    (i) Restrict the teaching of statistics courses in relevant Departments and Faculties to the Department of Statistics.
    (ii) Require non-Statistics faculty who wish to teach Statistics to be chartered by the Chartered Institute of Statisticians of Nigeria (CISON).
    (iii) Maintain and fund classrooms, laboratories, software and research facilities for conducive learning.
    (iv) Recruit and retain diverse, experienced faculty for teaching and research.
    (v) Allocate grants for innovative statistical research projects, inspiring faculty innovation.
    (vi) Promote collaborations with Departments and institutions within and outside Nigeria for interdisciplinary research and global exposure.
    (vii) Establish a Statistical-consulting base for the university within the Department.
  10. CONCLUSION
    My Dean, there has always been a concern about whether Data Science should reside within the domain of Computer Science or Statistics. Statisticians worldwide see Data Science as an integral part of Statistics, since it builds upon statistical foundations, encompassing data analysis, modeling, inferential statistics, experimental design, and more. The question has always remained: should a non-statistician assume the role of a Data Scientist? While the field spans diverse skills, the theory of statistics remains its core, enabling the extraction of insight from data. Non-statisticians can enter Data Science by acquiring skills in Statistics, Probability, and Machine Learning, and can then specialize according to their backgrounds, be it in machine learning, data engineering, data analytics, or business intelligence.
    There has often been confusion about the differences, as well as the similarities, between Artificial Intelligence (AI), Machine Learning (ML), and Data Science. At this juncture, it is important that I compare and contrast the various skills and techniques needed by these related fields for a better understanding.
    10.1 Interplay between Artificial Intelligence, Machine Learning and Data Science
    Artificial Intelligence, Machine Learning, and Data Science are related fields, but they have distinct focuses, methodologies, and applications.
    (i) Scope and Focus: Artificial Intelligence (AI) is a field aiming to create intelligent machines capable of tasks requiring human-like thinking. It includes problem-solving, understanding language, recognizing patterns, and learning from experience. Machine Learning (ML), a subset of AI, focuses on creating algorithms that allow computers to learn from data and improve task performance. Data Science is an interdisciplinary field combining statistics, computer science, and domain expertise to extract insights from data, involving collection, cleaning, and analysis to inform decision-making.
    (ii) Data Usage: Artificial Intelligence does not necessarily rely on extensive datasets; it focuses on replicating human-like intelligence, incorporating reasoning and rule-based decision-making. Machine Learning, in contrast, is inherently data-driven, demanding substantial datasets to train models for pattern recognition and prediction. Data Science, centered on data, involves collecting, cleaning, and analyzing data to derive insights.
    (iii) Goals: AI replicates human cognition, aiming for autonomous task execution and adaptation to change. Machine Learning focuses on data-driven algorithm improvement, emphasizing prediction and classification. Data Science extracts insights from data to inform decisions, using techniques like statistics, visualization, and ML as needed.
    (iv) Interdisciplinary Nature: AI involves computer science, psychology, and philosophy to mimic human intelligence. Machine Learning mainly utilizes computer science, statistics, and optimization to create data-learning algorithms. Data Science, being interdisciplinary, combines statistics, computer science, domain knowledge, and data engineering to tackle real-world issues through data analysis.
    10.2 Role Play among the Statistician, Mathematician and Computer Scientist in Artificial Intelligence, Machine Learning, and Data Science
    Statisticians, mathematicians, and computer scientists play pivotal roles in advancing Artificial Intelligence, Machine Learning, and Data Science. Their contributions vary with expertise and project phases, enhancing data analysis and model development.
    (i) Statistician: Statisticians are often at the heart of data science. They drive AI and ML with experiment design, hypothesis tests, and foundational statistical learning. They guide model selection and result evaluation, anchoring data science in exploration, hypothesis testing, and traditional methods.
    (ii) Mathematician: Mathematicians enhance AI and ML with mathematical algorithms, theoretical frameworks, and optimization skills. They strengthen neural networks’ foundations with linear algebra. In data science, they excel in complex mathematical modeling.
    (iii) Computer Scientist: Computer Scientists drive AI/ML with algorithm development, system design, and optimization of model processes. They handle data engineering tasks in data science—collection, preprocessing, database, and pipelines.
  11. RECOMMENDATIONS
    The University of Ibadan should now adopt Artificial Intelligence, Machine Learning, and Data Science as a broad-based program in its curriculum. These fields are pivotal in the global technological landscape, and embracing them will ensure the university’s relevance in a swiftly evolving world. Furthermore, they can stimulate economic growth in Nigeria by fostering innovation, entrepreneurship, and employment in technology-driven sectors, and by addressing local issues in healthcare, agriculture, energy, and education. Integrating these disciplines encourages interdisciplinary cooperation among Statistics, Mathematics, and Computer Science, promoting holistic problem-solving approaches. Establishing AI, ML, and Data Science programs will produce highly sought-after graduates for both the local and global job markets, and will enable cutting-edge research opportunities, research funding, and industry collaborations that enhance student experiences.
    The following recommendations are offered for establishing an AI/ML/Data Science hub in the University of Ibadan’s Faculty of Science for a sustainable, all-encompassing practice:
    (i) Curriculum Development: Establish a task force composed of experts from Statistics, Mathematics, Computer Science, and other relevant Departments to design a comprehensive and interdisciplinary curriculum. Include foundational courses in mathematics, statistics, programming, and data analysis, as well as advanced courses in AI, ML, and Data Science.
    (ii) Faculty Development: Encourage faculty to pursue advanced degrees and certifications in AI, ML, and Data Science. Facilitate faculty exchanges, workshops, and seminars to enhance their expertise.
    (iii) Infrastructure and Resources: Invest in computing infrastructure, including high-performance computing clusters and specialized hardware. Establish dedicated AI, ML, and Data Science labs equipped with software tools, cloud computing resources, and diverse datasets.
    (iv) Interdisciplinary Collaboration: Encourage interdisciplinary research and projects involving Statistics, Mathematics, Computer Science, and other relevant disciplines. Establish interdisciplinary research centers or institutes focused on AI, ML, and Data Science.
    (v) Industry Collaboration: Forge strong partnerships with local and international industries, startups, and technology companies. Create an industry advisory board to guide curriculum development and ensure alignment with industry needs.
    (vi) Student Engagement and Support: Encourage students to participate in AI/ML/Data Science competitions, research projects, and internships. Offer scholarships, mentorship programs, and career counseling to support students pursuing AI, ML, and Data Science education.
    (vii) Ethical Considerations: Integrate discussions on ethics, fairness, and responsible AI into the curriculum. Promote awareness of ethical issues and responsible AI practices among students and faculty.
    (viii) Community Engagement: Host workshops, seminars, and conferences to engage with the academic community, industry professionals, and policymakers. Launch outreach programs to promote STEM education and AI awareness in local schools and communities.
    (ix) Evaluation and Continuous Improvement: Implement regular program evaluations and gather feedback from stakeholders to refine and update the curriculum. Stay up-to-date with AI/ML/Data Science advancements and adjust course offerings accordingly.
    By embracing AI, ML, and Data Science and implementing these recommendations, the University of Ibadan can become a leader in interdisciplinary education and research, contributing to local development and global advancements, and preparing graduates for careers in these dynamic fields.
  12. ACKNOWLEDGEMENTS
    Let me start by acknowledging the presence of the Almighty God in my life, and especially for today’s Faculty Lecture. I am reminded of the slogan of my state’s immediate past governor, Udom Gabriel Emmanuel, which resonated throughout his 8 years in office – “ONLY GOD”. Without a doubt, God has been kind to me, as one might say, “If not for the Lord, now may Israel say”.

My Dean, I am a person surrounded by many persons, so I might bore you with names in this section; please pardon me, sir. At this point, I would like to express my gratitude to the Committee on Faculty Lecture, under the leadership of Prof. S. T. Ogunbanwo, as well as all the committee members, for granting me the opportunity to deliver the first Faculty Lecture of the 2022/2023 session.

My Immediate Family
The core of my existence, my immediate family, holds a special place in my heart. I am profoundly grateful to my wife, Deaconess (Mrs.) Joy Ejebosele Oluwaseun Udomboso, for her unwavering support throughout my academic journey. Your remarkable understanding and ability to manage our household during my frequent absences have been a pillar of strength. I love you dearly. To my cherished children, Faith, Favour, Florish, and Fountain, I extend my heartfelt gratitude. Your understanding and patience during my prolonged absences have been truly remarkable. I pray that the Lord, who has guided me through every challenge, continues to watch over, guide and abundantly reward each of you, in Jesus’ name.
Spiritual Mentors and Friends
I am profoundly grateful to those who have shaped my spiritual journey since I surrendered my life to the Lord on June 22, 1986. The Scripture Union (Nigeria), notably the Uyo Region, laid the foundation for my spiritual growth. Special thanks to my late patron, Mr. Umoh, my school’s math teacher, who mentored me. Engr. Enoto Udo Oton, my first discipler, and Mrs. Eka Udomboso, my life mentor and sister-in-law, have been pivotal. I also appreciate Pastor Marcus
