1. PREAMBLES

With utmost humility, I express profound gratitude to Almighty God, the Creator of all, for His unwavering blessings and guidance, underpinning my existence. His omnipotence, omnipresence, and omniscience merit my deepest reverence. I particularly thank Him for the salvation of my soul. I extend heartfelt appreciation to the former Dean of Science, Prof. A. A. Bakare, for initiating this Faculty Lecture, and to the current Dean, Prof. O. O. Sonibare, for making it a reality. When I accepted this challenge, I did so with mixed emotions, aware of the daunting task of summarizing years of research in sixty minutes before such a knowledgeable audience. This Faculty Lecture marks the third from the Department of Statistics. The first, “Statistical Literacy and Empirical Modeling for National Transformation,” by Prof. O. E. Olubusoye, was presented on October 23, 2014. The second, “Statistical Modeling in Environmetrics,” was delivered by Prof. K. O. Obisesan on May 26, 2021. Today’s lecture, representing the Department’s Computational Statistics Unit, encompasses nearly 16 years of work, constituting approximately 60% of my cumulative research endeavors.

My journey into academia began with a childhood aspiration. When I was a student at Moremi High School, Road 7, University of Ife (now Obafemi Awolowo University), I was passing through the senior staff quarters with my late father, who was a pioneer staff member at the university. We encountered a young man whom my father addressed as “Doctor”. I was intrigued and inquired if he was a medical doctor. He clarified that the title meant he was a “doctor of books”. My little brain did not understand what that meant, but at that moment, I resolved to become a “doctor” too, whether in medicine or academia. Years later, I realized he had been referring to a Doctor of Philosophy (PhD) as the “doctor of books”.

Passion for mathematics led me to statistics during a time of limited mathematical prospects. My Ph.D. focus then was to apply probability theory to environmental studies. In 2007, one of my supervisors, Prof. Amahia, encouraged collaboration with Geography, which connected me to Prof. Akintola, who ignited my interest in neural networks through his Ph.D. student at the time, Dr. Onafeso. Neural networks were a concept I had previously encountered but hadn’t delved into due to my limited understanding. In 2009, I visited the Electrical and Electronics Engineering Department at Obafemi Awolowo University to see Engr. Alawode, at the instance of Dr. Oluwaseun A. Otekunrin. There, I acquainted myself with MATLAB 2007a, essential for my PhD research. Also in 2009, through Prof. Angela U. Chukwu, I connected with Prof. I. K. Dontwi of the Department of Mathematics, Kwame Nkrumah University of Science and Technology, Kumasi, Ghana, who later became one of my supervisors. By 2012, a pre-doctoral fellowship took me to his university, where I formally entered the world of neural networks and dedicated myself to mastering MATLAB 2009a.

1.1 The Earliest Forms of Statistics

Though the concept of Statistics was not known in the early days of human existence, some rudimentary forms of data collection and counting are mentioned in the Bible, especially in the context of censuses, population counts, and record-keeping. For example, the Bible occasionally mentions censuses or population counts. The most well-known census is described in the Book of Numbers (Old Testament), where a census of the Israelites was conducted in the wilderness. Additionally, the Bible contains records of contributions and offerings made by individuals and communities. For example, tithes (one-tenth of income) were collected for the support of religious institutions. It also includes genealogies and lists of names, especially in the Old Testament, to establish lineages, trace family histories, as well as maintain records, and also makes references to agricultural practices and harvests, although these are not presented in a systematic statistical manner.

1.2 The Historical Development of Statistics

The history of statistics is a complex, centuries-long journey with contributions from diverse cultures and fields. Early statistical thinking can be traced to ancient civilizations like Babylon and Egypt, using statistical methods for taxation and census-taking. In the Renaissance (16th-17th centuries), European scholars like Cardano and Graunt advanced demographic data analysis. Graunt notably pioneered mortality statistics. The 18th century witnessed the rise of probability theory, foundational for modern statistics. Mathematicians like de Moivre and Laplace made significant contributions to probability theory, crucial for statistical inference.

The 19th century saw significant advances in statistics. Adolphe Quetelet, often called the “father of statistics,” pioneered the normal distribution and the concept of the “average man.” Francis Galton introduced regression and correlation. Sir Francis Ysidro Edgeworth contributed to statistical theory, including the Edgeworth series. William Sealy Gosset, known as “Student,” developed the t-distribution for hypothesis testing. In the late 19th and early 20th centuries, Karl Pearson made notable contributions, including the Pearson correlation coefficient. Ronald A. Fisher, another influential statistician, advanced experimental design and hypothesis testing, among other areas.

The 20th century marked the formalization and rapid growth of statistics. It became pivotal in scientific research, social sciences, and industry. Modern computing further empowered statistical analysis. Bayesian statistics experienced a resurgence due to computing advances, offering a distinct approach. The late 20th and early 21st centuries ushered in the “big data” era, posing new challenges. Statisticians developed techniques for large datasets, and data science emerged. Statistics evolved further with machine learning, artificial intelligence, and data analytics, broadening applications in predictive modeling, data mining, and deep learning.

Today, statistics is an integral part of various fields, including science, technology, medicine, economics, social sciences, humanities, and business. It plays a vital role in decision-making, research, and understanding the world through data-driven insights. The historical development of statistics reflects the human quest to make sense of data and harness its power for various purposes.

1.3 The Science and Arts of Statistics

Statistics is a mathematical and scientific discipline that encompasses data collection, analysis, interpretation, presentation, and organization. Its core objective is to extract meaningful insights from data, facilitating informed decisions and conclusions. It involves various mathematical processes, ranging from basic to intricate. Common challenges in statistics courses often stem from students’ limited foundational mathematics knowledge. In the past, some students avoided essential mathematical statistics courses like probability and stochastic processes, as they were elective. Now, these subjects are mandatory to ensure comprehensive understanding. Requiring students to study these mathematical courses is crucial. They form the foundation for statisticians, are vital at the postgraduate level, and are indispensable for data scientists. These courses provide the theoretical basis for mathematical and statistical algorithms, essential for success in machine and statistical learning. Machine learning and data science are deeply rooted in mathematical and statistical principles, extending beyond coding and algorithms, involving software solutions like Python, MATLAB, R, SAS, and SPSS, which all rely on these foundations.

The term ‘art of statistics’ captures the creative and interpretive aspects of statistical analysis. Beyond its scientific and mathematical identity, statistics possesses an inherent artistic dimension. This encompasses decoding data, effective communication of findings, and using statistics beyond the quantitative realm. Elements include data visualization, storytelling through data, model crafting, ethics, artful data presentation, communication skills, and iterative refinement. It’s about transforming data into meaningful insights that inform decisions and deepen understanding. Statistics applies broadly, from the sciences to the arts. The University of Ibadan’s Postgraduate College recently approved programs in Computational Linguistics and Natural Language Processing within the Department of Communication and Language Arts (CLA). These programs, part of AI and data science, encourage collaboration involving the CLA, Computer Science, Statistics, Mathematics, and Data and Information Science Departments.

2. HISTORICAL DEVELOPMENTS IN STATISTICAL NEURAL NETWORKS RESEARCH

The inception of Neurocomputing and Neural Networks dates back to McCulloch and Pitts’ seminal 1943 article, showcasing neural networks’ capacity to compute arithmetic and logical functions. The late 1950s saw Rosenblatt introducing perceptrons, followed by the development of the perceptron convergence theorem (1957, 1958). Yet, Minsky and Papert’s 1969 findings dampened enthusiasm for perceptrons. However, the early 1980s witnessed renewed interest, driven by Hopfield networks (1982), Kohonen’s Self-Organizing Maps, the resurgence of Werbos’ back-propagation algorithm (1974), and Anderson and Rosenfeld’s historical account (1988). Notably, despite perceptron limitations, silent research persisted during the 1970s, manifesting as adaptive signal processing, pattern recognition, and biological modeling.

2.1 Neural Networks in Statistics

Research into statistical aspects of Artificial Neural Networks (ANNs) began in 1989 with White’s study. Hornik et al. (1990) established ANNs for approximating unknown mappings. Subsequent work connected ANNs to statistical pattern recognition and delved into various aspects, including analysis, modeling, and econometrics. Kumar et al. (1995) compared ANNs with logistic regression. Anders and Korn (1999) advanced SNN model selection. Further developments included time series approaches, comparative studies, and mathematical methods in ANNs.

2.2 Neural Networks and Statistical Terminologies

Despite many similarities between ANN models and statistical models, the terminologies used in the two fields are quite different. For example, Sarle (1994) claimed that the term ‘back-propagation’ should refer only to the simple process of applying the chain rule to compute derivatives for the generalized delta rule algorithm, and not to the training method itself. He further added that this confusion is symptomatic of a general failure in the ANN literature to differentiate between models and estimation methods. Sarle (1996) gave a list of statistical terminology that has its equivalence in the ANN literature.

Table 1: Statistical and ANN Terminologies

| Statistical Terminology | ANN Terminology |
|---|---|
| variables | features |
| independent variables | inputs |
| predicted values | outputs |
| dependent variables | targets or training values |
| residuals | errors |
| estimation | training, learning, adaptation, or self-organization |
| estimation criterion | error function, cost function, or Lyapunov function |
| observations | patterns or training pairs |
| parameter estimates | (synaptic) weights |
| interactions | higher-order neurons |
| transformations | functional links |
| regression and discriminant analysis | supervised learning, self-organization or heteroassociation |
| data reduction | unsupervised learning, encoding, or autoassociation |
| cluster analysis | competitive learning or adaptive vector quantization |
| interpolation and extrapolation | generalization |
| intercept | bias |
| error term | noise |
| forecasting | prediction |

The statistical terms “sample” and “population” do not seem to have ANN equivalents. However, data are often divided into a training set and test set for cross-validation.
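The split into a training set and a test set can be sketched in a few lines; the function name, the 75/25 split, and the seed below are illustrative choices, not from the lecture:

```python
import random

def train_test_split(data, test_fraction=0.25, seed=42):
    """Partition observations ("patterns") into a training set and a test set."""
    rng = random.Random(seed)
    shuffled = list(data)      # copy so the original sample is untouched
    rng.shuffle(shuffled)
    n_test = int(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]   # training set, test set

train, test = train_test_split(range(100))
```

A model is then estimated ("trained") on `train` and its generalization assessed on `test`, which plays the role a fresh sample from the population would play in classical statistics.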

2.3 Similarity of Neural Network Models to Statistical Models

2.3.1 General Linear Models of the Statistical Neural Networks

In general, feedforward networks with no hidden layer are basically generalized linear models.

```
Input                        Output                   Target
(Independent Variable) ---> (Predicted Value) ---> (Dependent Variable)
```

Figure 1: Simple Linear Regression

Figure 1 is a sketch of the simple linear regression, predicting the dependent variable (target) as the predicted value (output) from a set of independent variable(s) (input).

```
Input                         Hidden                    Output
(Independent Variables) ---> (Summing Junction) ---> (Dependent Variable)
```

Figure 2: A Simple Sketch of the Artificial Neural Network

Figure 2 depicts an artificial neural network (ANN) framework comprising input, hidden, and output units. The input layer accepts the variables; the hidden layer preprocesses them, separating the variables across its hidden neurons to enhance precision; and the output layer displays the results. This architecture is also recognized as the Multilayer Perceptron (MLP).

Sarle (1994) stated that the perceptron model with different transfer functions has been shown to have equivalent statistical models. For example:

An ADALINE is a linear two-group discriminant function (Weisberg 1985; McLachlan 1992; Weiss and Kulikowski 1991).

A simple nonlinear perceptron is a logistic regression (Hosmer and Lemeshow 1989).
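The equivalence between a simple nonlinear (sigmoid-output) perceptron and logistic regression can be illustrated with a short simulation; the data, learning rate, and iteration count below are illustrative assumptions:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# A simple nonlinear perceptron: inputs feeding one sigmoid output unit.
# Trained with the cross-entropy error function, its weights estimate the
# same quantities as a logistic regression fit (illustrative data below).
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2))
true_w, true_b = np.array([1.5, -2.0]), 0.5
y = (sigmoid(X @ true_w + true_b) > rng.uniform(size=200)).astype(float)

w, b, lr = np.zeros(2), 0.0, 0.5
for _ in range(2000):                    # batch gradient descent ("training")
    p = sigmoid(X @ w + b)               # predicted values ("outputs")
    w -= lr * (X.T @ (p - y)) / len(y)   # cross-entropy gradient w.r.t. weights
    b -= lr * np.mean(p - y)             # ... and w.r.t. the bias (intercept)

accuracy = np.mean((sigmoid(X @ w + b) > 0.5) == (y == 1.0))
```

The recovered weights carry the signs of the true coefficients, exactly as the logistic regression maximum-likelihood estimates would.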

2.3.2 Nonlinear Models of the Statistical Neural Networks (MLPs)

i. A Multilayer Perceptron with one output is a simple nonlinear regression (Sarle 1994).

ii. A Multilayer Perceptron with several outputs is a multivariate multiple nonlinear regression.

iii. A Functional Link Network is a polynomial regression.

iv. A linear Multilayer Perceptron is a maximum redundancy analysis.

v. A nonlinear Multilayer Perceptron is a nonlinear maximum redundancy analysis.

There are SNN models that represent principal component analysis in various forms, including the linear and the nonlinear. An MLP with a moderate number of hidden neurons is essentially the same as a projection pursuit regression; the difference between the two is that an MLP uses a predetermined transfer function while the projection pursuit regression model uses a flexible nonlinear smoother (Sarle 1994). Also, an MLP becomes a nonparametric sieve if the number of hidden neurons increases simultaneously with the sample size (White 1988). This makes it a useful alternative to methods like kernel regression (Hardle 1990) as well as smoothing splines.

3. FROM HOMOGENEOUS TO HETEROGENEOUS: THE POWER OF TRANSFER FUNCTIONS

The Multi-Layer Perceptron (MLP), a common Artificial Neural Network (ANN), focuses on neuron weights and transfer functions (TFs). However, the complex TFs in MLPs reduce model interpretability and adaptability (Tayfur, 2002). Many MLP neurons use identical TFs, limiting versatility and causing errors (Adeyiga et al., 2011), leading to user reluctance (Toprak and Cigizoglu, 2008). To address this, integrating mixed transfer functions or using networks with multiple functions (Ashigwuike, 2012) is suggested. Transitioning from Homogeneous Transfer Functions (HOMTF) to Heterogeneous Transfer Functions (HETTF) can enhance comprehensibility while maintaining precision (Adewole et al., 2011). HETTFs aim to create transparent neural networks capable of precise modeling, akin to MLPs (Adepoju et al., 2007), improving adaptability and transparency.

Anders (1996) proposed a mathematical model for the SNN, giving it as

$$y = f(x, w) + u \qquad (1)$$

where $y$ is the dependent variable, $x$ is a vector of independent variables, and $w = (\alpha, \beta, \gamma)$ is the network weight: $\alpha$ is the weight of the input unit, $\beta$ is the weight of the hidden unit, and $\gamma$ is the weight of the output unit; $u$ is the stochastic term that is normally distributed (that is, $u \sim N(0, \sigma_u^2)$). Basically, $f(x, w)$ is the ANN, expressed as

$$f(x, w) = \sum_{h=1}^{H} \gamma_h\, \psi\!\left(\beta_h + \sum_{i=1}^{I} \alpha_{ih}\, x_i\right) \qquad (2)$$

where $\psi$ is the homogeneous transfer function.
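A minimal sketch of a single-hidden-layer SNN forward pass may help fix ideas; the exact weight layout here is an assumption for illustration, with the homogeneous transfer function supplied as a single callable applied at every hidden unit:

```python
import numpy as np

def snn_forward(x, alpha, beta, gamma, psi=np.tanh):
    """f(x, w) = sum_h gamma_h * psi(alpha_h . x + beta_h); psi is the
    homogeneous transfer function shared by every hidden unit."""
    hidden = psi(alpha @ x + beta)   # hidden-unit activations
    return float(gamma @ hidden)     # linear output unit

rng = np.random.default_rng(1)
alpha = rng.normal(size=(4, 2))   # input-unit weights (4 hidden units, 2 inputs)
beta  = rng.normal(size=4)        # hidden-unit weights
gamma = rng.normal(size=4)        # output-unit weights
yhat = snn_forward(np.array([0.3, -1.2]), alpha, beta, gamma)
```

Because every hidden unit shares the same `psi`, swapping that one argument is all it takes to move between candidate transfer functions.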

In Udomboso (2013, 2021), two HETTFs (satlins_tanh and satlins_tansig, respectively) were developed for the SNN model from three respective HOMTFs, using convolution of functions:

(3)

for satlins_tanh, and

(4)

for satlins_tansig, where the CDFs, means, and variances were obtained in both cases.
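The convolution operation used to merge two HOMTFs can be approximated numerically; the grid, integration limits, and step size below are illustrative and do not reproduce the closed forms derived in the papers:

```python
import numpy as np

def satlins(x):
    """Symmetric saturating linear transfer function: clips to [-1, 1]."""
    return np.clip(x, -1.0, 1.0)

# Numerical convolution (f * g)(t) = integral of f(s) g(t - s) ds on a
# finite grid, as a sketch of the satlins_tanh construction.
dx = 0.01
s = np.arange(-6.0, 6.0, dx)
conv = np.convolve(satlins(s), np.tanh(s), mode="same") * dx

mid = len(conv) // 2   # grid point nearest t = 0
```

At t = 0 the convolution equals minus the integral of satlins(s)·tanh(s), which is negative since the two parent functions share their sign everywhere.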

Analyses were performed to demonstrate the models, and the results are presented below:

Table 2: Model Output with Satlins, Tanh and Satlins_Tanh

| MODELS | MSE (2 Vars) | MSE (3 Vars) | NAR2 (2 Vars) | NAR2 (3 Vars) | NIC (2 Vars) | NIC (3 Vars) | ANIC (2 Vars) | ANIC (3 Vars) |
|---|---|---|---|---|---|---|---|---|
| SATLINS | 0.00109 | 0.017271 | 0.89792 | 0.883231 | 0.004233* | 0.059485 | 1.523973 | 2.044839 |
| TANH | 0.000973* | 0.017089 | 0.906587* | 0.883705* | 0.004397 | 0.127796 | 1.519695* | 2.133884 |
| SATLINS_TANH | 0.000973* | 0.01583* | 0.906587* | 0.882506 | 0.004383 | 0.048318* | 1.519665* | 2.043398* |

Table 3: Model Output with Satlins, Tansig and Satlins_Tansig

| MODELS | MSE (2 Vars) | MSE (3 Vars) | NAR2 (2 Vars) | NAR2 (3 Vars) | NIC (2 Vars) | NIC (3 Vars) | ANIC (2 Vars) | ANIC (3 Vars) |
|---|---|---|---|---|---|---|---|---|
| SATLINS | 0.00109 | 0.017271 | 0.89792 | 0.883231 | 0.004233 | 0.059485 | 1.523973 | 2.044839 |
| TANSIG | 0.000975 | 0.016814 | 0.899762 | 0.886528* | 0.00416 | 0.053603 | 1.522966 | 2.039348 |
| SATLINS_TANSIG | 0.000946* | 0.016055* | 0.900028* | 0.885393 | 0.003642* | 0.03882* | 1.522208* | 2.032813* |

In Udomboso and Ilori (2022), data on petroleum oil leakages in the Niger Delta region of Nigeria were obtained, and the above methodology was applied to observe the behaviour when real-life data are involved. An HETTF (hardlims_tansig) was developed from two respective HOMTFs (hardlims and tansig).

The resulting HETSNN is as follows:

(5)

where the CDF, mean and variance are given.

Table 4: Predictive Performance of the HOMSNN and HETSNN Models

| Transfer Function | Predicted Mean | Predicted Variance | Error Mean | Error Variance | NIC |
|---|---|---|---|---|---|
| HARDLIMS | 5.0999 | 0.1564 | 0.1001 | 0.5744 | 1.0854 |
| TANSIG | 5.1373 | 0.1776 | 0.0627* | 0.4357* | 1.1899 |
| HARDLIMS_TANSIG | 4.8648* | 0.0395* | 0.3352 | 0.5989 | 0.6938* |

4. DOMINANCE AND ADMISSIBILITY OF THE SNN ESTIMATOR OVER THE OLS ESTIMATOR

Estimation theory, crucial in various fields, involves minimizing norms in Hilbert space, with methods like Least Squares, Maximum Likelihood, and Bayesian techniques. Stein’s 1956 discovery of an estimator superior to Least Squares, known as Stein’s phenomenon, was initially met with skepticism (Stein, 1956). James and Stein refined this estimator in 1961, creating the James-Stein estimator, which faced criticism in the 1960s and 1970s (James & Stein, 1961). Empirical Bayes support from Efron and Morris in 1972 gradually bolstered the James-Stein estimator’s credibility (Efron & Morris, 1972).

Lemma (Stein): Let $X \sim N(\theta, 1)$, and let $g$ be a differentiable function such that $E\,|g'(X)| < \infty$. Then

$$E\left[g(X)\,(X - \theta)\right] = E\left[g'(X)\right] \qquad (6)$$

Various attempts have been made to improve upon the James-Stein estimator, including Baranchik’s positive-part estimator (Baranchik, 1964) and Thompson estimator (Thompson, 1968). These attempts aimed to address limitations and complexity concerns. Ben-Haim later introduced the spherical blind minimax estimator (SBME) and ellipsoidal blind minimax estimator (EBME), which outperformed traditional least squares (OLS) estimators (Ben-Haim, 2006). These advancements represent significant developments in the pursuit of better estimators beyond the James-Stein framework.
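Stein's phenomenon itself is easy to verify by simulation; the dimension, true mean vector, and replication count below are illustrative:

```python
import numpy as np

# Monte Carlo sketch of Stein's phenomenon: for dimension p >= 3, the
# James-Stein shrinkage estimator beats the ordinary (least-squares /
# maximum-likelihood) estimator X in total mean squared error.
rng = np.random.default_rng(7)
p, n_rep = 10, 5000
theta = np.linspace(-1.0, 1.0, p)              # illustrative true mean vector

x = rng.normal(theta, 1.0, size=(n_rep, p))    # X ~ N(theta, I_p)
norms2 = np.sum(x**2, axis=1, keepdims=True)
js = (1.0 - (p - 2) / norms2) * x              # James-Stein estimator

mse_ols = np.mean(np.sum((x - theta) ** 2, axis=1))   # risk of X is exactly p
mse_js = np.mean(np.sum((js - theta) ** 2, axis=1))   # strictly smaller risk
```

With these settings the shrinkage estimator cuts the total squared-error risk roughly in half, despite shrinking toward the origin regardless of the true mean.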

Udomboso and Dontwi (2022) considered the problem of estimation in both the Ordinary Least Squares and the Statistical Neural Network. We investigated the estimator that achieves a risk that is as low as possible by adopting an approach based on Spherical Blind Minimax Estimation (SBME). We proved that the Statistical Neural Network estimator strictly dominates the Ordinary Least Squares estimator, by proving a result that dominates the general SBME obtained by Ben-Haim in 2006.

Theorem: In the class of minimax estimators, the SNN estimator always has a mean squared error that is less than that of the OLS estimator.

The proof of this theorem starts by expressing the mean squared error of the SNN estimator, given by

(7)

And by Stein’s lemma introduced above, the rigorous mathematics showed that

(8)

where the bound involves the largest eigenvalue of the relevant matrix. Under the stated condition, the second term vanishes; and under the further conditions, the expectation is taken over a strictly negative range. This result makes the risk of the SNN estimator lower than that of the OLS estimator. Hence, the SNN estimator strictly dominates the OLS estimator. We note that this result dominates the general SBME obtained by Ben-Haim (2006), which he showed to dominate the OLS estimator. This is further proof of the superiority of the statistical neural network.

The study showed SNN dominance and admissibility over OLS with both the HOMSNN and HETSNN models. The data were divided into two sets, with 2 and 3 variables respectively.

Figure 3: Graph of MSE Performance across the Models (left panel: 2 Vars; right panel: 3 Vars)

It is important to note here that both the SNN and the classical statistical models are empirical models with their own strength and weaknesses, while possessing some similarities.

5. STATISTICAL INFERENCE AND NEURAL NETWORKS

Determining the optimal number of hidden units for peak network performance has long perplexed researchers. Too many hidden neurons ensure accurate learning and predictions on trained data, while too few hinder relationship comprehension, leaving error levels unacceptable. Resop (2006) probed the consequences of exceeding the hidden neuron count needed for statistical model-level accuracy, sparking questions about increased accuracy. Panchal et al. (2011) emphasized the critical nature of hidden layer neuron and layer counts in network architecture. Underutilized neurons lead to underfitting, while an excess risks overfitting. Udomboso et al. (2012) addressed these dilemmas with an R²-based approach and statistical inference, highlighting the significance of variable selection for optimal neural network performance and the perils of redundant variables.

5.1 R² for Change in Hidden and Input Units

The performance of a network can be determined by the coefficient of determination, R². Let R²_h denote the coefficient of determination of a network with a given number of hidden units, and R²_{h+Δ} denote the coefficient of determination of the network given a change in the number of hidden units. We note here that the change, Δ, can be an increase, h + Δ, or a decrease, h - Δ. The error produced by this change is given as e_Δ = ŷ_h - ŷ_{h+Δ}, where ŷ_h is the output with a given number of hidden units, and ŷ_{h+Δ} is the output given a change in the number of hidden units.

Hence, the contribution of the change is

$$\Delta R^2_h = R^2_{h+\Delta} - R^2_h \qquad (9)$$

And similarly, for a change in input units, the difference in contribution is

$$\Delta R^2_k = R^2_{k+\Delta} - R^2_k \qquad (10)$$

5.2 Statistical Inference on R² for Change in Hidden and Input Units

We noted that 0 ≤ R² ≤ 1. We derive the F test for a change in hidden neurons as

$$F = \frac{\Delta R^2_h / \Delta}{\left(1 - R^2_{h+\Delta}\right) / \left(n - (h+\Delta) - 1\right)} \qquad (11)$$

which, in terms of sums of squares, is given as

$$F = \frac{\left(SSE_h - SSE_{h+\Delta}\right) / \Delta}{SSE_{h+\Delta} / \left(n - (h+\Delta) - 1\right)} \qquad (12)$$

The hypothesis for this problem is formulated as H₀: β_h = 0 versus H₁: β_h ≠ 0, where β_h is the parameter of the hidden unit, h = 1, 2, …, H. We reject the null hypothesis if F > F_α.

In the same vein, without conflict of symbols, we show that for a change in input neurons,

$$F = \frac{\Delta R^2_k / \Delta}{\left(1 - R^2_{k+\Delta}\right) / \left(n - (k+\Delta) - 1\right)} \qquad (13)$$

where k + Δ is the number of input parameters or variables after a change.

The hypothesis is set up by assuming that an input variable x_k has no effect on the output y. That is, H₀: α_k = 0 against H₁: α_k ≠ 0, where α_k is the parameter of the input unit, k = 1, 2, …, K. We reject the null hypothesis if F > F_α.

In the illustration, we considered a stepwise increase of the hidden neurons from 1 to 10, keeping the input units constant. The second set considers input units from 2 to 6, keeping the hidden neurons constant.
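A nested-model (partial) F statistic of the kind described above can be sketched as follows; the helper name and the sample size used in the example are assumptions for illustration, not the exact formulation in the paper:

```python
def partial_f(r2_small, r2_big, n, p_big, q):
    """Partial F statistic for adding q parameters to a nested model:
    did R^2 rise by more than chance, given n observations and p_big
    parameters in the larger model?"""
    numerator = (r2_big - r2_small) / q
    denominator = (1.0 - r2_big) / (n - p_big - 1)
    return numerator / denominator

# Table-5-style change from 1 to 2 hidden neurons; the sample size n and
# parameter counts here are hypothetical, for illustration only.
f_change = partial_f(0.478346, 0.810835, n=34, p_big=3, q=1)
```

The statistic grows when the R² gain is large relative to the residual variation left in the bigger model, which is exactly the trade-off the decision column of Tables 5 and 6 records.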

Table 5: R² and F Values for Change in Hidden Neurons

| HN | R² | F | HN change | R² change | F change | \|F change\| | Decision on H₀ |
|---|---|---|---|---|---|---|---|
| 1 | 0.478346 | 7.335835 | 1 to 2 | 0.332489 | 26.95519 | 26.95519 | Reject |
| 2 | 0.810835 | 34.28647 | 2 to 3 | -0.02534 | -4.99637 | 4.99637 | Accept |
| 3 | 0.785492 | 29.29465 | 3 to 4 | 0.087695 | 25.79043 | 25.79043 | Accept |
| 4 | 0.873187 | 55.08508 | 4 to 5 | 0.011567 | 6.331858 | 6.331858 | Accept |
| 5 | 0.884754 | 61.40357 | 5 to 6 | -0.12842 | -36.5849 | 36.5849 | Accept |
| 6 | 0.756336 | 24.83205 | 6 to 7 | 0.092212 | 19.98969 | 19.98969 | Accept |
| 7 | 0.848547 | 44.81431 | 7 to 8 | 0.046143 | 23.14491 | 23.14491 | Accept |
| 8 | 0.894691 | 67.96664 | 8 to 9 | 0.084534 | 309.0962 | 309.0962 | Reject |
| 9 | 0.979224 | 377.0628 | 9 to 10 | -0.20962 | -350.339 | 350.339 | Accept |
| 10 | 0.769609 | 26.72359 | | | | | |

From Table 5, it is noticed that, except for the hidden neuron changes from 2 to 3, 5 to 6, and 9 to 10, which show a negative contribution, the results show a positive contribution of the change. In the same vein, the inference results show rejection of the null hypothesis at the hidden neuron changes from 1 to 2 and 8 to 9, implying that increasing the number of hidden neurons had a significant effect on the network model only at those steps.

Table 6: R² and F Values for Change in Input Units

| K | R² | F | K change | R² change | F change | \|F change\| | Decision on H₀ |
|---|---|---|---|---|---|---|---|
| 2 | 0.769609 | 26.72359 | 2 to 3 | 0.219852 | 724.4002 | 724.4002 | Reject |
| 3 | 0.989462 | 327.8361 | 3 to 4 | -0.00063 | -18.6582 | 18.6582 | Accept |
| 4 | 0.988834 | 176.722 | 4 to 5 | 0.009836 | 1324.805 | 1324.805 | Reject |
| 5 | 0.99867 | 938.7028 | 5 to 6 | -0.01089 | -837.658 | 837.658 | Accept |
| 6 | 0.98778 | 64.66858 | | | | | |

Table 6 shows that the change from an even number of input units to an odd number is positive, while the change from an odd number to an even number is negative. This is attested to by the rejection of the null hypothesis whenever the number of input units is increased from even to odd, and its acceptance whenever the number is increased from odd to even.

6. INFORMATION CRITERION FOR THE STATISTICAL NEURAL NETWORKS

Model selection is vital in data analysis, and Akaike (1973, 1974) introduced the AIC to evaluate model fit. The AIC inspired other criteria such as the SIC (Schwarz, 1978), BIC (Akaike, 1978), and HQ (Hannan & Quinn, 1979). Sugiura (1978) introduced the AICc, a correction for small samples. In neural networks, determining optimal parameters, especially the number of hidden units, is challenging. Murata et al. (1994) introduced the NIC, inspired by the AIC, to find the best model and parameters for approximating the system’s distribution from training examples, measuring the network’s risk against the target distribution (Murata et al., 1994).
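The fit-versus-complexity trade-off that AIC-type criteria encode can be made concrete with a toy Gaussian-model selection; the candidate RSS values and parameter counts below are invented for illustration:

```python
import numpy as np

def aic(n, rss, k):
    """Gaussian AIC up to an additive constant: n*ln(RSS/n) + 2k."""
    return n * np.log(rss / n) + 2 * k

def bic(n, rss, k):
    """Schwarz (Bayesian) information criterion: n*ln(RSS/n) + k*ln(n)."""
    return n * np.log(rss / n) + k * np.log(n)

# Toy selection: the 5-parameter fit barely lowers the RSS, so the penalty
# term makes the 2-parameter model the AIC choice (figures are invented).
n = 100
candidates = {"2 params": (52.0, 2), "5 params": (50.0, 5)}
scores = {name: aic(n, rss, k) for name, (rss, k) in candidates.items()}
best = min(scores, key=scores.get)
```

The NIC plays the same role for neural networks, with the parameter count replaced by a trace term that reflects the network's effective complexity.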

The Network Information Criterion, as developed by Murata et al. (1994), is given as

$$\mathrm{NIC} = L_N(\hat{w}) + \frac{1}{N}\,\mathrm{tr}\!\left(\widehat{Q}\,\widehat{G}^{-1}\right) \qquad (14)$$

where $L_N(\hat{w})$ is the empirical loss at the estimated parameter $\hat{w}$, $N$ is the number of training examples, and $\widehat{Q}$ and $\widehat{G}$ are estimates of the gradient covariance and the Hessian of the loss, respectively.

Udomboso et al. (2016) expanded on model selection in Statistical Neural Networks using their developed Network Information Criterion (NIC). The NIC, though sample-biased, offers an objective, data-driven criterion for choosing the optimal parameter from candidate models. This study employed a criterion designed to be an asymptotically unbiased estimator of the expected Kullback-Leibler information, following Akaike (1973).

6.1 Adjusted Network Information Criterion (ANIC)

In deriving an Adjusted NIC, we assume that the estimated network model includes the true network model, and the approach used the corrected NIC based on Kullback's symmetric divergence, as used by Hafidi and Mkhadri (2006). We note that

(15)

where the adjustment term is some value that improves the NIC, yielding an asymptotically unbiased estimator of the expected divergence; the term depends on the dimension of the parameter vector.

The process of developing the ANIC stemmed from proving the expression

(16)

which, after some mathematical analysis, results in

(17)

a correction for the biased NIC.

Table 7: Sample Sizes at which NIC and ANIC exhibit Local Minima

| Transfer Function | NIC (2 Vars) | NIC (3 Vars) | ANIC (2 Vars) | ANIC (3 Vars) |
|---|---|---|---|---|
| SATLINS | 20, 80, 150, 250 | 40, 80, 125, 200 | 250 | 60, 125, 200 |
| TANH | 60, 100, 250 | 40, 125, 175, 250 | 80, 200 | 60, 125, 175 |
| TANSIG | 40, 100, 150, 250 | 40, 80, 125, 175, 250 | 20, 60, 200, 300 | 40, 80, 150, 300 |
| SATLINS_TANH | 60, 100, 150, 300 | 60, 100, 175, 250 | 80, 150, 200 | 150, 400 |
| SATLINS_TANSIG | 60, 100, 250 | 60, 125, 250 | 150, 250 | 40, 250 |

Rates of efficiency for the NIC and ANIC are 33% and 50% for the 2-variable case, and 30% and 43% for the 3-variable case, respectively. The results of the ANIC demonstrate the high precision of SNN models at large samples. When comparing the NIC and ANIC for neural network model selection, more local minima with increasing sample size do not necessarily mean better performance. The ANIC, with fewer local minima than the NIC, better handles the impact of sample size owing to its complexity-fit trade-off, reduced sensitivity, bias correction, reduced overfitting, enhanced generalization, efficiency, and practicality.

7. THE WAVELETS NEURAL NETWORK

The wavelet transform is given by

$$W_f(a, b) = \frac{1}{\sqrt{a}} \int_{-\infty}^{\infty} f(t)\, \psi\!\left(\frac{t - b}{a}\right) dt \qquad (18)$$

where $a$ and $b$ are the scale and location parameters, respectively.

By adjusting the scale parameter, $a$, a series of different frequency components in the signal can be obtained.

Restricting $a$ and $b$ to discrete values, $a = 2^j$ and $b = k2^j$, then

$$\psi_{j,k}(t) = 2^{-j/2}\, \psi\!\left(2^{-j} t - k\right) \qquad (19)$$

where $\psi$ is known as the mother wavelet.

In both cases, it is assumed that $\int_{-\infty}^{\infty} \psi(t)\, dt = 0$.

Wavelet methods have been most studied in the nonparametric regression problem of estimating a function $f$ on the basis of observations $y_i$ at time points $t_i$, modeled as

$$y_i = f(t_i) + \varepsilon_i, \qquad i = 1, \ldots, n \qquad (20)$$

where $\varepsilon_i$ is the noise.

Wavelet Neural Networks (WNNs) merge wavelet theory and neural networks. WNN is structured after the feed-forward neural networks, comprising input, hidden, and output layers with linear combiners. Hidden neurons employ orthonormal wavelet family activation functions.

In estimating the WNN, we minimize the usual least-squares cost function

$$J = \frac{1}{2} \sum_{n=1}^{N} \left(y_n - \hat{y}_n\right)^2 \qquad (21)$$

where $N$ is the number of estimation (training) samples for each class, and $\hat{y}_n$ is the optimal output for the input vector.
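A wavelet-network output and its least-squares cost can be sketched as follows; the Mexican-hat mother wavelet and the parameter values are illustrative choices:

```python
import numpy as np

def mexican_hat(t):
    """Mexican-hat (Ricker) mother wavelet, a common WNN activation."""
    return (1.0 - t**2) * np.exp(-(t**2) / 2.0)

def wnn_forward(x, w, a, b):
    """Wavelet network output for scalar x: sum_j w_j * psi((x - b_j)/a_j),
    with dilation a_j and translation b_j per hidden wavelet neuron."""
    return float(np.sum(w * mexican_hat((x - b) / a)))

def cost(xs, ys, w, a, b):
    """Least-squares cost: half the sum of squared output errors."""
    preds = np.array([wnn_forward(x, w, a, b) for x in xs])
    return 0.5 * float(np.sum((ys - preds) ** 2))

w = np.array([1.0, -0.5])   # output weights
a = np.array([1.0, 2.0])    # dilation (scale) parameters
b = np.array([0.0, 1.0])    # translation (location) parameters
xs = np.linspace(-2.0, 3.0, 25)
ys = np.array([wnn_forward(x, w, a, b) for x in xs])  # targets hit exactly
```

Training then consists of nudging `w`, `a`, and `b` downhill on this cost, which is exactly what the update rule below formalizes.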

The partial derivatives of $J$ with respect to the parameters $w$, $a$, and $b$ are obtained, and the parameters are adjusted by the following equation:

$$\theta(t+1) = \theta(t) - \eta\, \frac{\partial J}{\partial \theta} \qquad (22)$$

where $\theta$ is the vector of the parameters $w$, $a$, and $b$, and $\eta$ is the learning rate, between 0.1 and 0.9.

8. APPLICATIONS OF THE SNN MODEL TO SOLVING LIFE PROBLEMS

Applications have been made to diverse real-world challenges, spanning oil and gas, climate, economics, health, and more. Wavelet transforms and time series were utilized in some cases.

8.1 Applications to the Environment

8.1.1 Modeling of Rainfall Precipitation

In a comparative study by Udomboso and Amahia (2011) focusing on rainfall prediction in Ibadan, Nigeria, they assessed the performance of Ordinary Least Squares (OLS) and Statistical Neural Network (SNN) models. Utilizing data from the Nigerian Meteorological Agency (NIMET) station in Ibadan, they analyzed rainfall, temperature, and humidity. Their findings indicated that, as the sample size increased, the OLS Mean Squared Error (MSE) decreased, while the SNN MSE increased. However, the SNN outperformed the OLS in terms of MSE, AIC, and SIC across different sample sizes, highlighting its superior performance in modeling rainfall patterns (see Table 8).

Table 8: Model Selection for both the OLS and SNN

| n | Model | HN | MSE | R² | Adj. R² | AIC | SIC |
|---|---|---|---|---|---|---|---|
| 132 | OLS | - | 8.00 | 0.25 | 0.237 | 8.368 | 8.934 |
| 132 | SNN | 2 | 3.40 | 0.03 | 0.02 | 3.558 | 3.799 |
| 132 | SNN | 5 | 2.41 | 0.31 | 0.30 | 2.522 | 2.693 |
| 132 | SNN | 10 | 2.60 | 0.26 | 0.25 | 2.721 | 2.905 |
| 132 | SNN | 50 | 2.53 | 0.28 | 0.27 | 2.648 | 2.827 |
| 132 | SNN | 100 | 2.39 | 0.32 | 0.31 | 2.501 | 2.671 |
| 264 | OLS | - | 6.78 | 0.28 | 0.275 | 6.933 | 7.220 |
| 264 | SNN | 2 | 5.91 | 0.06 | 0.05 | 6.046 | 6.297 |
| 264 | SNN | 5 | 5.46 | 0.13 | 0.12 | 5.586 | 5.817 |
| 264 | SNN | 10 | 4.44 | 0.29 | 0.29 | 4.542 | 4.730 |
| 264 | SNN | 50 | 3.53 | 0.44 | 0.44 | 3.611 | 3.761 |
| 264 | SNN | 100 | 0.76 | 0.89 | 0.89 | 0.778 | 0.810 |
| 396 | OLS | - | 6.70 | 0.29 | 0.289 | 6.795 | 7.003 |
| 396 | SNN | 2 | 9.43 | 0.00 | 0.00 | 9.574 | 9.867 |
| 396 | SNN | 5 | 8.88 | 0.06 | 0.06 | 9.016 | 9.292 |
| 396 | SNN | 10 | 7.98 | 0.17 | 0.17 | 8.102 | 8.350 |
| 396 | SNN | 50 | 5.41 | 0.43 | 0.43 | 5.493 | 5.661 |
| 396 | SNN | 100 | 2.48 | 0.74 | 0.74 | 2.518 | 2.595 |

Udomboso et al. (2014) employed Artificial Neural Networks and the Continuous Wavelet Transform (CWT) to model monthly rainfall in Ibadan (Jan 1971 to Dec 2003). This approach blended wavelet and neural network techniques for improved rainfall simulation and forecasting, analyzing 48 sub-time series but focusing on the top 10. Table 9 displays sample statistics for the original data and the CWT sub-series: the original data had mean 6.08, standard deviation 14.33, and standard error 1.01, while across the sub-time series the means ranged from -0.01 (sub-series 1) to 0.23 (sub-series 12), the standard deviations from 5.27 (sub-series 1) to 13.74 (sub-series 12), and the standard errors from 0.37 (sub-series 1) to 0.97 (sub-series 12).
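A discretised continuous wavelet transform row of the kind used to build such sub-time series can be sketched as follows; the Ricker mother wavelet and the stand-in seasonal series are illustrative, not the NIMET data:

```python
import numpy as np

def ricker(t):
    """Ricker (Mexican-hat) mother wavelet."""
    return (1.0 - t**2) * np.exp(-(t**2) / 2.0)

def cwt_row(signal, scale, dt=1.0):
    """One row of the continuous wavelet transform: correlate the series
    with a stretched, 1/sqrt(scale)-normalised mother wavelet."""
    t = np.arange(-5.0 * scale, 5.0 * scale + dt, dt)
    kernel = ricker(t / scale) / np.sqrt(scale)
    return np.convolve(signal, kernel, mode="same") * dt

# Decompose a stand-in monthly series into sub-time series at two scales.
months = np.arange(396)                      # e.g. Jan 1971 - Dec 2003
rain = np.sin(2.0 * np.pi * months / 12.0)   # toy seasonal signal
sub3, sub12 = cwt_row(rain, 3.0), cwt_row(rain, 12.0)
```

Each row is a sub-time series of the same length as the input, so the collection of rows across scales can be fed to the neural network exactly as the original series would be.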

Table 9: Sample Statistics of the Original Data and the CWT

                     Mean        Standard Deviation  Standard Error of the Mean
Original Data        6.081095    14.33177            1.010885
Sub-Time Series 1    -0.0105473  5.268888            0.3716387
Sub-Time Series 2    0.0268657   11.987              0.8454978
Sub-Time Series 4    0.1049254   12.90514            0.9102586
Sub-Time Series 3    0.0257711   13.10204            0.9241467
Sub-Time Series 6    0.1443781   11.59102            0.8175676
Sub-Time Series 8    0.1827363   11.85019            0.8358478
Sub-Time Series 5    0.2015423   12.08643            0.852511
Sub-Time Series 10   0.2196518   12.79639            0.9025877
Sub-Time Series 12   0.2327861   13.73769            0.9689816
Sub-Time Series 7    0.1508955   11.57113            0.8161646
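The wavelet decomposition step can be sketched by convolving the series with a zero-mean wavelet at several scales (here a Ricker/Mexican-hat wavelet, chosen purely for illustration; the study's wavelet choice and data are not reproduced) and then summarizing each resulting sub-time series as in Table 9:

```python
import numpy as np

def ricker(points, a):
    """Ricker (Mexican hat) wavelet sampled on `points` points, width `a`."""
    t = np.arange(points) - (points - 1) / 2.0
    A = 2.0 / (np.sqrt(3.0 * a) * np.pi ** 0.25)
    return A * (1.0 - (t / a) ** 2) * np.exp(-(t ** 2) / (2.0 * a ** 2))

def cwt(x, widths):
    """One sub-time series (row) per wavelet scale, via convolution."""
    out = np.empty((len(widths), len(x)))
    for i, w in enumerate(widths):
        wav = ricker(min(10 * w, len(x)), w)
        out[i] = np.convolve(x, wav, mode="same")
    return out

rng = np.random.default_rng(7)
months = np.arange(396)                      # Jan 1971 - Dec 2003 = 396 months
# Hypothetical rainfall-like series: annual seasonality plus skewed noise
rain = 6.0 + 10.0 * np.maximum(0, np.sin(2 * np.pi * months / 12)) \
       + rng.gamma(2.0, 2.0, 396)

sub = cwt(rain, widths=np.arange(1, 11))     # ten sub-time series, as in the study
for i, s in enumerate(sub, start=1):
    se = s.std(ddof=1) / np.sqrt(len(s))     # standard error of the mean
    print(f"CWT {i:2d}: mean={s.mean():8.4f}  sd={s.std(ddof=1):8.4f}  se={se:.4f}")
```

Because the wavelet has (near-)zero mean, each sub-time series has a mean close to zero while retaining scale-dependent variability, mirroring the pattern of Table 9.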

This study used three transfer functions (SATLINS, TANSIG, TANH) selected based on low errors, with daily data for training. The continuous wavelet neural network consistently outperformed other methods. Table 10 displays CWNN results for these transfer functions. For TANH, error variances ranged from 0.0007 (CWT 1) to 0.0097 (CWT 6), compared to the original data’s 0.0601. TANSIG results ranged from 0.0007 (CWT 1) to 0.0393 (CWT 2), with only CWT 2 exceeding the original data’s 0.0082. SATLINS results showed smaller error variances.

Table 10: CWNN Result based on the Transfer Functions

                     Mean Absolute Error                      Error Variance
Sub-Time Series      Tanh         Tansig       Satlins        Tanh         Tansig       Satlins
Sub-Time Series 1    0.013951741  0.014383582  0.083100995    0.000729592  0.000739871  0.099012345
Sub-Time Series 2    0.024978109  0.131156219  0.216885075    0.003239021  0.039282011  0.493311007
Sub-Time Series 4    0.030443781  0.029195522  0.217135323    0.002145085  0.002329514  0.42369717
Sub-Time Series 3    0.05209204   0.048633333  0.286941791    0.004102457  0.004778149  0.709184674
Sub-Time Series 6    0.071986567  0.028030348  0.247026866    0.009715933  0.002541222  0.427719785
Sub-Time Series 8    0.023469652  0.040067164  0.352124876    0.002512283  0.006018716  0.951654567
Sub-Time Series 5    0.032890547  0.035047761  0.285989552    0.002339264  0.002771785  0.587782167
Sub-Time Series 10   0.046734328  0.016926866  0.381137811    0.005444832  0.000814949  0.791104316
Sub-Time Series 12   0.022665174  0.025757214  0.407753731    0.00073663   0.001418593  0.763670662
Sub-Time Series 7    0.039681095  0.043672637  0.270354229    0.002958337  0.002743456  0.62697559

Tests of hypotheses on these results revealed significant differences between each decomposed series and the original data, as shown in Tables 11 and 12. Three alternative hypotheses were constructed on μ_d, the mean of the decomposed data, relative to the mean of the original data (two-sided, lower-tailed, and upper-tailed). The variance ratio test was then used to check the validity of the model, based on the error generated by the network from each decomposed series, through corresponding hypotheses on σ_d, the standard deviation of the decomposed data. The test shows that at α = 0.05, sub-series 1, 2, 4 and 6 are significant, while at α = 0.10, sub-series 1, 2, 4, 6 and 5 are significant. These can be seen in Table 12.

Table 11: Paired Sample Statistics of the Original Data and the CWNN

                               95% Confidence Interval of the Difference
                               Lower      Upper      t         Sig.
Original Data – Sub-Series 1   4.605974   7.577309   8.0853    0.000
Original Data – Sub-Series 2   4.788468   7.31999    9.4317    0.000
Original Data – Sub-Series 4   4.412868   7.539471   7.5381    0.000
Original Data – Sub-Series 3   4.331777   7.77887    6.9278    0.000
Original Data – Sub-Series 6   4.078572   7.794861   6.3002    0.000
Original Data – Sub-Series 8   3.893714   7.903002   5.8020    0.000
Original Data – Sub-Series 5   3.841665   7.917439   5.6892    0.000
Original Data – Sub-Series 10  3.731568   7.991318   5.4267    0.000
Original Data – Sub-Series 12  3.608266   8.088351   5.1482    0.000
Original Data – Sub-Series 7   3.809      8.051398   5.5128    0.000

Table 12: Variance Ratio Test of the Original Data and the CWNN

                               F          1/F      p (Upper Tail)  p (Lower Tail)
Original Data – Sub-Series 1   171.375    0.006    0.0000          0.0000
Original Data – Sub-Series 2   7.016      0.143    0.0319          0.016
Original Data – Sub-Series 4   9.256      0.108    0.0159          0.008
Original Data – Sub-Series 3   3.314      0.302    0.1705          0.0853
Original Data – Sub-Series 6   9.142      0.109    0.0164          0.0082
Original Data – Sub-Series 8   1.830      0.546    0.4806          0.2403
Original Data – Sub-Series 5   4.807      0.208    0.0776          0.0388
Original Data – Sub-Series 10  2.653      0.377    0.2602          0.1301
Original Data – Sub-Series 12  2.834      0.353    0.2305          0.1153
Original Data – Sub-Series 7   4.232      0.236    0.1027          0.0513
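The two tests behind Tables 11 and 12, a paired t-test on the mean difference and a variance-ratio (F) test, can be sketched with scipy on stand-in data (the series below are simulated, not the study's):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(3)
n = 201
original = rng.gamma(2.0, 5.0, n)                    # stand-in for the rainfall series
decomposed = original - 6.0 + rng.normal(0, 1.0, n)  # stand-in for one CWNN sub-series

# Paired t-test on the mean difference (Table 11's construction)
t_stat, p_two = stats.ttest_rel(original, decomposed)
diff = original - decomposed
lo, hi = stats.t.interval(0.95, df=n - 1, loc=diff.mean(), scale=stats.sem(diff))

# Variance-ratio (F) test: F = s1^2 / s2^2 on (n-1, n-1) degrees of freedom
F = np.var(original, ddof=1) / np.var(decomposed, ddof=1)
p_upper = stats.f.sf(F, n - 1, n - 1)                # upper-tail p-value

print(f"t={t_stat:.3f}, p={p_two:.4f}, 95% CI=({lo:.3f}, {hi:.3f})")
print(f"F={F:.3f}, upper-tail p={p_upper:.4f}")
```

A significant t rejects equality of means; a significant F rejects equality of variances, the check applied scale-by-scale in Table 12.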

The analysis showed that, except in rare cases, the decomposed series performed better than the original data. The study thus demonstrated that incorporating the continuous wavelet transform into the ANN technique improves the performance of the network.

8.1.2 Global Solar Radiation

Sustainable development emphasizes eco-friendly energy (Boeker & van Grondelle, 2011). Solar power, abundant in places like Nigeria (Lana and Lamont, 2009), requires precise radiation measurements often hampered by costs in developing nations (Trinka et al., 2005). Artificial Neural Networks (ANNs) increasingly offer accurate predictions. Nymphas and Udomboso (2020) employed ANNs for solar radiation estimation in Ibadan, Nigeria, using temperature and humidity inputs, enhancing existing methods. Data from January 1995 to December 2004 covered climate parameters including daily Tmax (32.02°C) and Tmin (22.44°C), rainfall (108.95 mm), RHmax (98.25%), a temperature range of 14.47°C to 32.43°C, 153.3 hours of monthly sunshine, and daily solar radiation fluctuating between 1.8 MJ/m²/day and 29.1 MJ/m²/day. The ANNs achieved competitive Mean Squared Errors (MSEs), corroborating Rehmann et al. (2008).

Fig 4: Estimated and measured solar radiation (panels: mean temperature; relative humidity; daily temperature and relative humidity)

8.1.3 Soil Physico-Chemical Properties on Adsorption

Heavy metal contamination in soils is a significant environmental issue, with detrimental economic and health impacts. Heavy metals like lead (Pb), copper (Cu), zinc (Zn), and cadmium (Cd) pose threats to agriculture, human well-being, and soil ecosystems, prompting concern (Adriano, 2001). While these metals are naturally occurring, human activities and industrial waste have heightened their presence, affecting soil quality and public health. Udomboso et al. (2017) examined the adsorption of these heavy metals in soil, considering factors like pH, goethite content, humic acid, time, and sorbate concentration. Results varied across properties, with Cd showing the highest adsorption, except for humic acid, where Zn exhibited the most significant adsorption. This underscores the importance of soil management and pollution control in Nigeria.

8.1.4 Predicting River Discharge

Surface water is pivotal for various sectors like agriculture, power generation, and fisheries. Managing its flow is crucial for hydropower, water supply, and flood prevention. Amid rising global water demand and uncertainties in water availability, flood prediction, low-flow areas, and hydrological droughts become vital. Predicting discharge aids in mitigating floods, managing droughts, environmental needs, sectoral water demands, reservoir levels, and disaster response. Fashae, Olusola, Ndubuisi and Udomboso (2018) compared ANN and ARIMA models in modeling the Opeki River, a significant tributary of the River Ogun in Oyo State. This catchment's unique features offer insights into groundwater recharge and discharge dynamics, supported by an existing gauging station on the river.

Figure 5: ARIMA and ANN predicted discharge from 1982-2010

The study favoured ARIMA modeling for river discharge, especially with limited data. River Opeki’s significance in economic, social, and environmental aspects highlights the need for improved modeling amid Africa’s infrastructure challenges. This offers opportunities for efficient agriculture, flood alerts, hydroelectricity, water supply, and river health promotion.

8.1.5 Estimation of Monthly Evapotranspiration

Evapotranspiration (ET) is vital for water resource management. Chukwu, Udomboso, and Onafeso (2011) studied ET estimation using climatic data from IITA, Nigeria, comparing Statistical Neural Network (SNN) and classical regression. SNN consistently outperformed regression, showing lower Mean Squared Errors (MSE). Increasing hidden neurons in SNN reduced MSE, demonstrating its capacity to capture complex relationships. This highlights SNN’s potential for accurate ET estimation, especially in climate modeling and water resource management. Lorentz et al. (2010) also noted that relative humidity changes may better indicate evaporation variations. Further research is essential for validating these findings and addressing climate change challenges.

8.2 Application to Financial Time Series

8.2.1 Time Series Forecasting with SNN and CWT

In Udomboso and Amahia’s (2016) study, a time series forecasting model for the naira-dollar exchange rate was developed, combining Statistical Neural Networks (SNN) and continuous wavelet transforms. Using annual exchange rate data from late 2012, they focused on three exchange rate variables: Buying Rate (BR), Selling Rate (SR), and Central Rate (CR), represented by corresponding SNN models (PBR, PSR, and PCR). These rates were decomposed into ten continuous wavelet signals (W1 to W10), creating 120 data points per rate. Each signal served as input for the network, and predictions were cross-referenced with the original data. The study showed strong correlations between the decomposed series and the original data, indicating mean prediction stability and generally improved accuracy compared to the original dataset.

In Udomboso and Saliu’s (2016) research, they developed an inference procedure for neural networks using the bootstrap method to assess the market efficiency of the Nigerian exchange rate from 2001 to 2015. They employed a multilayer perceptron network architecture and estimated various SNN models at different lags and hidden neuron configurations. Selection criteria identified the best model for each lag, and model evaluation confirmed that residuals were independently and identically distributed without serial autocorrelation, suggesting potential abnormal earnings in the market.
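The core of a bootstrap inference procedure, resampling a series with replacement and reading off percentile intervals for a statistic of interest, can be sketched as follows (the return series and statistic here are hypothetical stand-ins, not the study's data or full procedure):

```python
import numpy as np

rng = np.random.default_rng(5)
# Hypothetical stand-in for daily exchange-rate log-returns (not the study's data)
returns = rng.normal(0.0004, 0.01, 750)

def bootstrap_ci(x, stat, B=2000, level=0.95, seed=0):
    """Percentile bootstrap confidence interval for stat(x)."""
    g = np.random.default_rng(seed)
    boots = np.array([stat(g.choice(x, size=len(x), replace=True))
                      for _ in range(B)])
    return np.quantile(boots, [(1 - level) / 2, (1 + level) / 2])

lo, hi = bootstrap_ci(returns, np.mean)
print(f"95% bootstrap CI for the mean return: ({lo:.5f}, {hi:.5f})")
```

The same resampling scheme extends to network residuals: refit (or re-evaluate) the model on each bootstrap sample and collect the statistic of interest across replicates.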

8.3 Applications in the Oil and Gas Sector

8.3.1 Gas Modeling with TRSM and TSNN

Artificial Neural Networks (ANNs) are crucial in petroleum engineering, especially when conventional models face data limitations. Falode and Udomboso (2016) analyzed Nigeria’s natural gas production, utilization, and flaring using the Time Series Regression Model (TSRM) and Time Series Neural Network (TSNN). Gas flaring, a long-standing issue in Nigeria, impacts the environment and local agriculture. ANNs find applications in reservoir analysis, drill bit grading, pump malfunction detection, and reservoir modeling. In Figure 6, trends in Nigeria’s natural gas production, utilization, and flaring reveal complex interdependencies. Flaring surpassed utilization until 2004, highlighting the intricate relationship between production, utilization, and flaring over time.

Figure 6: Trends in Nigeria’s natural gas production, utilization, and flaring (1958-2006)

The correlogram showed non-stationarity, confirmed by the Augmented Dickey Fuller test with trend and intercept (p-values > 5%). After differencing, variables became stationary with constant means and reduced autocorrelation.

Table 13: Estimated Model Adequacy for Gas Production, Utilization and Flaring

             Model  HL   MSE        R²     adj. R²  AIC        SIC
Production   TSRM   –    4.029e19   0.878  0.875    4.370e19   4.720e19
             TSNN   2    2.110e19   0.936  0.935    2.196e19   2.373e19
             TSNN   5    1.900e19   0.942  0.941    1.977e19   2.136e19
             TSNN   10   1.805e19   0.945  0.944    1.878e19   2.029e19
Utilization  TSRM   –    4.148e19   0.670  0.663    4.500e19   4.860e19
             TSNN   2    2.652e19   0.789  0.784    2.761e19   2.982e19
             TSNN   5    2.237e19   0.823  0.818    2.328e19   2.515e19
             TSNN   10   1.034e19   0.918  0.916    1.076e19   1.162e19
Flaring      TSRM   –    2.998e19   0.672  0.665    3.250e19   3.510e19
             TSNN   2    1.285e19   0.859  0.856    1.338e19   1.445e19
             TSNN   5    1.272e19   0.861  0.858    1.324e19   1.431e19
             TSNN   10   1.032e19   0.887  0.885    1.074e19   1.160e19

Table 13 summarizes the model adequacy results: MSE, R², AIC, and SIC indicate model fit. The results reveal significantly smaller MSEs for TSNN compared to TSRM.
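The information criteria used throughout these comparisons penalize model complexity on top of MSE. One multiplicative form that is consistent with the magnitudes reported in Table 8 (with k parameters and n observations) is AIC = MSE · e^(2k/n) and SIC = MSE · n^(k/n); a minimal sketch, with k and n hypothetical:

```python
import math

def aic(mse, k, n):
    """Multiplicative AIC: MSE inflated by exp(2k/n)."""
    return mse * math.exp(2.0 * k / n)

def sic(mse, k, n):
    """Multiplicative SIC (Schwarz): MSE inflated by n**(k/n)."""
    return mse * n ** (k / n)

# Against the first OLS row of Table 8 (n = 132, MSE = 8.00), taking k = 3
# gives AIC ≈ 8.37 and SIC ≈ 8.94, close to the reported 8.368 and 8.934.
print(aic(8.00, 3, 132), sic(8.00, 3, 132))
```

Both criteria reward fit (small MSE) while inflating the score as parameters accumulate relative to the sample size, with SIC penalizing complexity more heavily for large n.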

8.3.2 Prediction of Oilfield Scale Formation

Falode, Udomboso, and Ebere (2016) addressed critical challenges in the oil and gas sector by predicting the formation of BaSO4 and CaSO4 oilfield scale. These scales can disrupt operations due to BaSO4’s extreme insolubility and the variable crystallization forms of CaSO4. The research involved two primary cases: predicting BaSO4 precipitation under varying temperature and pressure and determining CaSO4 kinetic rate constants, considering factors like temperature, pressure drawdown, and flow rate. Input variables were standardized within a (0, 1) range for precise modeling. Neural network architectures were optimized based on their performance in minimizing mean squared error (MSE). The results indicated that the neural network models achieved low MSE values, showcasing their potential for effectively managing oilfield scale-related challenges in the industry.

8.3.3 Efficient Crude Oil Pricing

Crude oil, a vital global commodity, comprises a third of worldwide energy consumption, with a profound impact, especially on developing economies and oil-dependent nations. Oil price fluctuations influence inflation rates and overall economic performance. Forecasting oil prices aids stakeholders in making informed decisions, considering demand, supply, and geopolitical factors. Price volatility significantly affects the global economy, particularly oil-importing countries, necessitating accurate price forecasts. In their study, Falode and Udomboso (2021) utilized Autoregressive Neural Networks (ARNN) to forecast crude oil prices, using data spanning January 2006 to October 2020, aiming to mitigate risks and facilitate informed planning.

Typically, the Autoregressive model of order p, written as AR(p), is given as

y_t = φ_0 + φ_1 y_{t-1} + φ_2 y_{t-2} + … + φ_p y_{t-p} + ε_t, (23)

where φ_0 is a constant, φ_1, …, φ_p are the autoregressive parameters, and ε_t is N(0, σ²) white noise.

The Autoregressive Neural Network model used in this study is modified from Medeiros et al. (2006) as

y_t = G(x_t; ψ) + ε_t, (24)

where G(·) is a non-linear function in x_t, having a vector of parameters ψ, and is given as G(x_t; ψ) = α′x_t + Σ_{j=1}^{h} β_j f(ω_j′x_t − b_j). Also, vector ψ is given as ψ = (α′, β′, ω′, b′)′, where x_t = (1, y_{t−1}, …, y_{t−p})′ is a vector of lagged values of y_t. The term ε_t, a sequence of independently normally distributed random variables, is the stochastic term, also known as the error. The function f is the transfer function (also known as the activation or link function).
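The AR(p) structure of equation (23) can be estimated by least squares on a lag matrix; a minimal numpy sketch on a simulated series (all coefficients hypothetical, not the crude-oil data):

```python
import numpy as np

rng = np.random.default_rng(1)

# Simulate an AR(2) process: y_t = 0.5 + 0.6*y_{t-1} - 0.2*y_{t-2} + e_t
T = 500
y = np.zeros(T)
for t in range(2, T):
    y[t] = 0.5 + 0.6 * y[t - 1] - 0.2 * y[t - 2] + rng.normal(0.0, 1.0)

def fit_ar(y, p):
    """Least-squares estimate of AR(p); returns (phi_0, phi_1, ..., phi_p)."""
    Y = y[p:]
    X = np.column_stack([np.ones(len(Y))] +
                        [y[p - i:-i] for i in range(1, p + 1)])
    coef, *_ = np.linalg.lstsq(X, Y, rcond=None)
    return coef

coef = fit_ar(y, 2)
print("estimated (phi_0, phi_1, phi_2):", np.round(coef, 3))

# One-step-ahead forecast from the fitted AR(2)
forecast = coef[0] + coef[1] * y[-1] + coef[2] * y[-2]
print("one-step forecast:", round(float(forecast), 3))
```

The ARNN of equation (24) keeps this linear part α′x_t and adds hidden units β_j f(ω_j′x_t − b_j) on the same lag vector, which is what allows it to pick up nonlinear dynamics the AR terms miss.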

Figure 7: Graph of Crude Oil Production, Export and Price (Jan 2006 –Oct 2020)

Figure 7 displays Nigeria’s crude oil production, export, and global price trends. All variables exhibit seasonal variations. Nigeria maintains relatively consistent production and export, while global prices show sharp fluctuations.

Table 14: Exploratory Data Analysis of Crude Oil Production, Export and Price (Jan 2006 – Oct 2020)

Variable               Interquartile Range  Mean     Median   Skewness  Excess Kurtosis
Crude Oil Production   0.3000               2.1292   2.1400   -0.2491   0.1928
Crude Oil Export       0.3000               1.6792   1.6900   -0.2491   0.1928
Crude Oil Price        48.1975              77.2874  73.2800  0.2721    -0.9300

Table 14 summarizes 14 years of data on crude oil production, export, and global price. Mean and median values indicate stable production (about 2.13 mbd) and export (about 1.68 mbd), with a mean price of 77.29 US$/barrel (median 73.28). Production and export have a narrow interquartile range of 0.3 mbd. Skewness values near zero suggest roughly symmetric distributions, while excess kurtosis values indicate slightly leptokurtic distributions for production and export (0.1928) and a slightly platykurtic distribution for price (-0.9300).

Table 15: SNN Model Determination for Crude Oil Price

Hidden Neuron  MSE       R²      p-value  NIC
1              558.3831  0.2336  0.0366   560.4167
2              520.2403  0.2860  0.0280   519.1867
3              520.7263  0.2853  0.0281   523.4690
4              539.1649  0.2600  0.0318   540.6032
5              511.3642  0.2981  0.0264   513.9875
10             491.6980  0.3251  0.0233   493.2909
20**           306.8214  0.5789  0.0082   307.3058
30             435.7316  0.4019  0.0168   439.0139
40             366.2549  0.4973  0.0114   365.9786
50             408.9640  0.4387  0.0144   409.1884
75             329.1152  0.5483  0.0093   328.6609
100*           198.9716  0.7269  0.0043   199.3477

* and ** preferred models for forecasting

Table 15 displays the model determinations from the SNN part of the study, favoring the models with 100 and 20 hidden neurons based on maximum R² values (0.73 and 0.58), significant at the 1% level.

Overall, Nigeria’s crude oil production and export have declined over the years, despite seasonal variations. The autoregressive neural network with 100 neurons outperformed others. COVID-19 significantly disrupted the oil industry, causing disruptions, reduced demand, price crashes, and operational challenges. This study highlights the need for economic diversification away from oil dependence.

8.4 Applications in the Health Sector

8.4.1 Modelling Cholera in Nigeria and South Africa

Cholera, a historic disease dating back to the early 1800s, gained notoriety through the Calcutta outbreak, linked to poor living conditions and tainted water sources. In 1854, the year of London's Broad Street cholera outbreak, the Italian scientist Filippo Pacini identified Vibrio cholerae as the causative bacterium. Cholera is characterized by symptoms such as severe diarrhea, vomiting, dehydration, and muscle cramps. It remains a global health concern, predominantly affecting developing tropical and subtropical regions. WHO reports documented cholera cases in African countries like Angola, Congo, Mozambique, Nigeria, Somalia, Tanzania, and South Africa. In Oduaran and Udomboso (2017), cholera modeling for Nigeria and South Africa revealed peaks in Nigeria's cholera incidence, death, and fatality rates in 1970, 1990, 2000, and 2010, with declining fatality rates post-1990. In South Africa, high incidences were observed in 2000, with relatively elevated death rates in 1980, 2000, and 2010, accompanied by fluctuating fatality rates. The study suggests a need for increased hidden neurons for accurate cholera predictions.


Figure 8: Graph of Fatality, Predicted Fatality, Fatality Growth, and Predicted Fatality Growth Rates due to Cholera in Nigeria (1970 – 2013)


Figure 9: Graph of Fatality, Predicted Fatality, Fatality Growth, and Predicted Fatality Growth Rates due to Cholera in South Africa (1970 – 2013)

Figures 8 and 9 display smoother predicted fatality and growth rates, with South Africa exhibiting a notably smoother trend compared to Nigeria.

8.4.2 Contraception Usage

Contraception and involuntary fertility control are key components of family planning recognized by the World Health Organization. Contraception and abortion, affecting fertility before and after gestation, are pivotal for married and unmarried individuals. Condom use stands as the sole proven method to significantly reduce STI and HIV/AIDS transmission. In sub-Saharan Africa, contraceptive prevalence is 25%, with Ghana and Nigeria reporting rates of 23.5% and 14.1%, respectively, impacting maternal mortality rates. It is against this backdrop that Udomboso et al. (2015) and Udomboso and Amoateng (2018) examined factors influencing contraception in Nigeria and Ghana using neural networks and DHS data.

Table 16: Statistics of Model Determination

Never Married
Hidden Neuron  MSE (Nigeria)  MSE (Ghana)  R² (Nigeria)  R² (Ghana)  NIC (Nigeria)  NIC (Ghana)
1              0.253          0.252        0.02          0.00        0.259          0.284
5*             0.240          0.252        0.07          0.00        0.237          0.295
10+            0.251          0.231        0.03          0.09        0.259          0.236
25*+           0.237          0.126        0.08          0.50        0.234          0.127
50*            0.215          0.245        0.16          0.03        0.213          0.252
75*+           0.059          0.206        0.77          0.19        0.060          0.203
100*+          0.008**        0.081**      0.97          0.68        0.009          0.079

Ever Married
Hidden Neuron  MSE (Nigeria)  MSE (Ghana)  R² (Nigeria)  R² (Ghana)  NIC (Nigeria)  NIC (Ghana)
1              0.196          0.253        0.00          0.00        0.203          0.204
5(*)(+)        0.175          0.238        0.10          0.06        0.176          0.236
10(*)(+)       0.167          0.226        0.15          0.11        0.165          0.228
25(*)(+)       0.18           0.221        0.08          0.12        0.180          0.224
50(*)(+)       0.142          0.138        0.27          0.45        0.142          0.139
75(*)(+)       0.089          0.109        0.55          0.57        0.089          0.110
100(*)(+)      0.007**        0.106**      0.96          0.58        0.008          0.106

** Model selected for prediction

* Significant at 5% level of significance (P>F) for Nigeria (never married); + Significant at 5% level of significance (P>F) for Ghana (never married)

(*) Significant at 5% level of significance (P>F) for Nigeria (ever married); (+) Significant at 5% level of significance (P>F) for Ghana (ever married)

Table 17: Estimates of usage and rate of usage of contraception (in percent)

        Mean Usage (%)      Standard Error of Usage (%)  Rate of Usage (%)
Method  Nigeria   Ghana     Nigeria   Ghana              Nigeria   Ghana
AM      11.9      19.2      0.8       1.2                4.1       4.7
AMM     7.2       10.5      0.9       1.5                4.3       5.9
ATM     4.7       8.7       0.1       -0.1               0.6       -0.3
NCU     88.1      80.8      -0.1      -0.3               -0.4      -1.0

Table 18: Forecasts of usage and rate of usage of contraception till 2030 (in percent)

        Mean of Usage       Standard Error of            Rate of Usage
        Forecast (%)        Usage Forecast (%)           Forecast (%)
Method  Nigeria   Ghana     Nigeria   Ghana              Nigeria   Ghana
AM      16.0      25.2      1.0       2.1                3.8       3.4
AMM     10.6      18.8      0.7       1.4                4.0       4.2
ATM     5.4       6.4       0.1       0.2                0.6       -1.8
NCU     84.0      74.8      0.7       1.1                -0.4      -0.8

Udomboso et al. (2015) investigated factors affecting contraception use in “never married” women in Nigeria and Ghana, highlighting education, desire for children, opposition to contraception, and location as key influencers. Udomboso and Amoateng (2018) observed a preference for modern contraception over traditional methods, with a 4.4% overall growth in contraception use and declining reluctance (0.7%). Injectables are expected to dominate by 2030.

8.4.3 Vesico-Vaginal Causality

Vesicovaginal fistula (VVF) is a devastating condition characterized by abnormal communication between the vaginal wall and the bladder or rectum, leading to uncontrollable urinary leakage. It’s a significant public health issue, notably in Nigeria, affecting 150,000 to 200,000 patients. While Statistical Neural Networks (SNNs) are effective for nonlinear regression, their application to VVF causative factors is limited. James, Udomboso, and Onwuka (2012) compared SNNs and linear regression models to understand VVF causes. SNNs outperformed, yielding a higher R-squared (0.8 vs. 0.46) and lower mean square error (2011.0 vs. 5439.55). This highlights SNNs’ potential in comprehending complex medical conditions like VVF for prevention and treatment. Factors like prolonged obstructed labor and instrument misuse emerged as significant contributors to VVF, emphasizing the value of artificial neural networks in medical research.

8.5 Application to Aviation Transport

Aviation is a pivotal economic sector, reflecting a nation's development. In Nigeria, it drives rapid growth, employing 245,500 people and contributing NGN185 billion to the economy. Air passenger traffic has surged, reaching 4.7 million international and 10.7 million domestic passengers in 2013 (National Bureau of Statistics, 2014). Udomboso and Ojo (2021) developed statistical learning algorithms, including support vector machines and statistical neural networks, to analyze air passenger traffic from 2007 to 2018 at the Murtala Muhammed International Airport (MMIA), Lagos, using records from the Nigerian Airspace Management Agency (NAMA).

Figures 10 and 11 depict MMIA’s domestic and foreign airline traffic. They reveal seasonal variations and occasional upward trends, indicating regular passenger influx.

Table 19: Performance Evaluation of SVM Models for Foreign and Domestic Airline Traffic

            Domestic Airline                      Foreign Airline
Kernels     RMSE       C Value  Degree Value     RMSE       C Value  Degree Value
Linear      34958.41   13000    –                13345.96   25000    –
Rbf         34959.27   15000    –                13345.93   40000    –
Polynomial  34974.59   16000    110              13345.93   30000    130
Sigmoid     34958.97   –        –                13345.96   –        –

Figure 12: Plot of SVM Actual and Predicted (Domestic and Foreign Air Traffic)

Figure 13: Plot of ANN Actual and Predicted (Domestic and Foreign Air Traffic)

Table 19 reveals that the Linear Kernel SVM with C=13000 excels in predicting domestic airline traffic with the lowest RMSE, while the polynomial SVM kernel with degree 130 performs exceptionally well for foreign airline traffic. Figures 12 and 13 depict the actual and predicted plots for domestic and foreign air traffic training data.
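The effect of the kernel choice can be sketched with kernel ridge regression, a closed-form kernel method standing in here for SVR (the passenger-like series, regularization, gamma, and degree below are all hypothetical, not the NAMA data or the tuned C values of Table 19):

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical monthly passenger-like series: trend + annual seasonality + noise
t = np.arange(144, dtype=float)
y = 30000 + 50 * t + 4000 * np.sin(2 * np.pi * t / 12) + rng.normal(0, 800, 144)
X = ((t - t.mean()) / t.std())[:, None]        # single standardized feature

def K(a, b, kind):
    """A few common kernels (gamma and degree fixed arbitrarily)."""
    if kind == "linear":
        return a @ b.T
    if kind == "rbf":
        return np.exp(-((a[:, None, :] - b[None, :, :]) ** 2).sum(-1))
    if kind == "poly":
        return (a @ b.T + 1.0) ** 3
    raise ValueError(kind)

train, test = slice(0, 120), slice(120, 144)   # hold out the last 24 months
results = {}
for kind in ("linear", "rbf", "poly"):
    Ktr = K(X[train], X[train], kind)
    # Ridge-regularized kernel fit on centred targets (closed form)
    alpha = np.linalg.solve(Ktr + 1e-2 * np.eye(Ktr.shape[0]),
                            y[train] - y[train].mean())
    pred = K(X[test], X[train], kind) @ alpha + y[train].mean()
    results[kind] = float(np.sqrt(np.mean((y[test] - pred) ** 2)))
    print(f"{kind:6s} kernel  test RMSE = {results[kind]:10.2f}")
```

The comparison illustrates why kernel choice matters for held-out RMSE: the RBF kernel, for instance, reverts to the training mean when extrapolating beyond the training range, while linear and polynomial kernels track the trend.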

Table 20: Performance Evaluation of ANN Model using Domestic and Foreign Airline Traffic

                 Domestic Traffic             International Traffic
No. of Neurons   RMSE       Best Iterations   RMSE       Best Iterations
10               33071.80   5900              13359.09   5200
20               33003.88   5300              13357.38   3200
30               33058.59   4000              13387.26   2800
40               33020.20   3700              13348.67   2900
50               33016.37   3700              13381.88   2300
60               33012.22   3700              13382.27   2200
70               33034.08   2800              13397.82   2200
80               33034.05   2800              13420.63   1900
90               33031.52   2700              13431.18   1800

Table 20 shows that the ANN model with 20 hidden neurons achieves the lowest RMSE for domestic airline traffic (33003.88), while 40 hidden neurons performs best for international traffic (13348.67). Figures 12 and 13 display the actual and predicted data for domestic and foreign air traffic; the SVM and ANN models, respectively, provide accurate predictions.

Table 21: Optimal metrics for Domestic and Foreign Air Traffic of SVM and ANN

        Domestic Traffic              International Traffic
Model   Test RMSE      Train RMSE     Test RMSE      Train RMSE
SVM     34958.412592   46028.98       13345.925823   38473.20
ANN     33003.878906   43868.25       13348.665039   37559.26

Table 21 provides optimal metrics for SVM and ANN. SVM outperforms in foreign air traffic prediction, while ANN excels in domestic. Future studies should delve into advanced forecasting techniques, incorporate exogenous factors, and leverage daily data for improved predictive accuracy, contributing to more effective aviation services.

8.6 Other Areas of Application and Ongoing Research

Other areas not covered in this lecture include the application of the neural network model to the study of crime rates (James, Suleiman, Udomboso, and Babayemi, 2015), as well as students’ academic performance (Asogwa and Udomboso, 2016). The current focus includes (but is not limited to) statistical image recognition, which covers vehicle recognition, number plate recognition, and OCR (Optical Character Recognition). Additionally, there is ongoing research in dynamic modeling of internet traffic, graphical models, machine learning-based spatiotemporal analysis, greenhouse gas (GHG) estimation, and neural network estimations with time series models.

- THE DEPARTMENT OF STATISTICS

The Department of Statistics at the University of Ibadan holds a distinguished reputation as the oldest and most prestigious Statistics Department in Nigeria. It stands as the foremost institution for statistical research and education in the nation. It was established to provide quality education and research in the field of statistics. The Department offers a range of academic programs at both the undergraduate and postgraduate levels, providing opportunities for advanced research and specializations in various units: viz. Biometry, Computational Statistics, Economic and Financial Statistics, Environmental Statistics, and Statistical Design of Investigation. The Department is proud to have a dedicated and experienced faculty of statisticians and researchers who are actively engaged in both teaching and research endeavors. The Department is currently conducting a comprehensive review of its curricula at both the undergraduate and postgraduate levels. As part of this review, a new unit, Mathematical Statistics, would be introduced. Additionally, some existing units will undergo name changes to align with global trends in the field.

The Department of Statistics is renowned for its significant research contributions across various domains of statistics, encompassing mathematical statistics, applied statistics, and data science. These contributions are firmly rooted in the diverse units outlined above, showcasing the Department’s wide-ranging expertise and impact in the field of statistics. Faculty and students often engage in research projects and contribute to academic journals, both nationally and internationally. The Department maintains collaborative partnerships with other academic institutions, government agencies, and industry collaborators, which create valuable research opportunities and internships for students. These collaborations enhance the educational and research experiences offered by the Department, providing students with real-world exposure and networking opportunities. It is also actively involved in community outreach and consultancy services, providing statistical expertise to address real-world problems.

My Dean, the Department of Statistics at the University of Ibadan, like any academic department, has various needs for its continued functioning. For instance, the newly constructed annex building for the Department of Statistics is in ruins. We have received promises for its revitalization, but to date, nothing has been done. We need help. We also desire that the university, among other things, do the following for us:

(i) Restrict the teaching of statistics courses in relevant Departments and Faculties to the Department of Statistics.

(ii) Non-Statistics faculty willing to teach Statistics must get chartered by the Chartered Institute of Statisticians of Nigeria (CISON).

(iii) Maintain and fund classrooms, laboratories, software and research facilities for conducive learning.

(iv) Recruit and retain diverse, experienced faculty for teaching and research.

(v) Allocate grants for innovative statistical research projects, inspiring faculty innovation.

(vi) Promote collaborations with Departments and institutions within and outside Nigeria for interdisciplinary research and global exposure.

(vii) Establish a statistical consulting base for the university within the Department.

- CONCLUSION

My Dean, there has always been a concern about whether Data Science should reside within the domain of Computer Science or Statistics. Statisticians worldwide see Data Science as an integral part of Statistics, since it builds upon statistical foundations, encompassing data analysis, modeling, inferential statistics, experimental design, and more. The question remains: should a non-statistician assume the role of a Data Scientist? While the field spans diverse skills, the theory of statistics remains its core, enabling insight extraction. Non-statisticians can enter Data Science by acquiring skills in Statistics, Probability, and Machine Learning. Data Science offers specialization options based on one's background, be it machine learning, data engineering, data analytics, or business intelligence.

There is often confusion about the differences, and the similarities, between Artificial Intelligence (AI), Machine Learning (ML), and Data Science. At this juncture, it is important that I compare and contrast the various skills and techniques needed by these related fields for a better understanding.

10.1 Interplay between Artificial Intelligence, Machine Learning and Data Science

Artificial Intelligence, Machine Learning, and Data Science are related fields, but they have distinct focuses, methodologies, and applications.

- Scope and Focus: Artificial Intelligence (AI) is a field aiming to create intelligent machines capable of tasks requiring human-like thinking. It includes problem-solving, understanding language, recognizing patterns, and learning from experience. Machine Learning (ML), a subset of AI, focuses on creating algorithms allowing computers to learn from data and improve task performance. Data Science is an interdisciplinary field combining statistics, computer science, and domain expertise to extract insights from data, involving collection, cleaning, and analysis to inform decision-making.
- Data Usage: Artificial Intelligence does not necessarily rely on extensive datasets; it focuses on replicating human-like intelligence, incorporating reasoning and rule-based decision-making. Machine Learning, in contrast, is inherently data-driven, demanding substantial datasets to train models for pattern recognition and prediction. Data Science, centered on data, involves collecting, cleaning, and analyzing data to derive insights.
- Goals: AI replicates human cognition, aiming for autonomous task execution and adaptation to change. Machine Learning focuses on data-driven algorithm improvement, emphasizing prediction and classification. Data Science extracts insights from data to inform decisions, using techniques like statistics, visualization, and ML as needed.
- Interdisciplinary Nature: AI involves computer science, psychology, and philosophy to mimic human intelligence. Machine Learning mainly utilizes computer science, statistics, and optimization to create data-learning algorithms. Data Science, being interdisciplinary, combines statistics, computer science, domain knowledge, and data engineering to tackle real-world issues through data analysis.
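The data-driven character of Machine Learning described above can be illustrated with a minimal sketch: a one-nearest-neighbour classifier, written in plain Python, whose "model" is nothing more than the stored labelled examples, and whose prediction for a new point is the label of its closest stored example. The data and function names here are purely illustrative, not drawn from the lecture.

```python
import math

def nearest_neighbour_predict(training_data, point):
    """Predict the label of `point` as the label of its closest training example.

    `training_data` is a list of ((x, y), label) pairs; the "model" is simply
    the stored data -- a minimal illustration of learning from examples.
    """
    closest = min(
        training_data,
        key=lambda pair: math.dist(pair[0], point),  # Euclidean distance
    )
    return closest[1]

# Illustrative labelled data: two well-separated clusters in the plane.
data = [
    ((1.0, 1.0), "low"),
    ((1.5, 2.0), "low"),
    ((8.0, 9.0), "high"),
    ((9.0, 8.5), "high"),
]

print(nearest_neighbour_predict(data, (2.0, 1.5)))  # near the "low" cluster
print(nearest_neighbour_predict(data, (8.5, 8.0)))  # near the "high" cluster
```

Unlike a rule-based AI system, nothing here encodes what "low" or "high" means; the behaviour comes entirely from the data supplied, which is exactly the sense in which Machine Learning is data-driven.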

10.2 Role Play among the Statistician, Mathematician and Computer Scientist in Artificial Intelligence, Machine Learning, and Data Science

Statisticians, mathematicians, and computer scientists play pivotal roles in advancing Artificial Intelligence, Machine Learning, and Data Science. Their contributions vary with expertise and project phases, enhancing data analysis and model development.

(i) Statistician: Statisticians are often at the heart of data science. They drive AI and ML with experiment design, hypothesis tests, and foundational statistical learning. They guide model selection and result evaluation, anchoring data science in exploration, hypothesis testing, and traditional methods.

(ii) Mathematician: Mathematicians enhance AI and ML with mathematical algorithms, theoretical frameworks, and optimization skills. They strengthen neural networks’ foundations with linear algebra. In data science, they excel in complex mathematical modeling.

(iii) Computer Scientist: Computer Scientists drive AI/ML with algorithm development, system design, and optimization of model processes. In data science, they handle data engineering tasks: collection, preprocessing, databases, and pipelines.

- RECOMMENDATIONS

The University of Ibadan should now adopt Artificial Intelligence, Machine Learning, and Data Science as broad-based programs in its curriculum. These fields are pivotal in the global technological landscape, ensuring the university’s relevance in a swiftly evolving world. Furthermore, they can stimulate economic growth in Nigeria by fostering innovation, entrepreneurship, and employment in technology-driven sectors, addressing local issues such as healthcare, agriculture, energy, and education. Integrating these disciplines encourages interdisciplinary cooperation, including Statistics, Mathematics, and Computer Science, promoting holistic problem-solving approaches. Establishing AI, ML, and Data Science programs will produce highly sought-after graduates, benefiting both the local and global job markets and enabling cutting-edge research opportunities, research funding, and industry collaborations that enhance student experiences.

The following recommendations can make the University of Ibadan’s Faculty of Science a sustainable, all-encompassing hub for AI/ML/Data Science practice:
- Curriculum Development: Establish a task force composed of experts from Statistics, Mathematics, Computer Science, and other relevant Departments to design a comprehensive and interdisciplinary curriculum. Include foundational courses in mathematics, statistics, programming, and data analysis, as well as advanced courses in AI, ML, and Data Science.
- Faculty Development: Encourage faculty to pursue advanced degrees and certifications in AI, ML and Data Science. Facilitate faculty exchanges, workshops, and seminars to enhance their expertise.
- Infrastructure and Resources: Invest in computing infrastructure, including high-performance computing clusters and specialized hardware. Establish dedicated AI, ML and Data Science labs equipped with software tools, cloud computing resources, and diverse datasets.
- Interdisciplinary Collaboration: Encourage interdisciplinary research and projects involving Statistics, Mathematics, Computer Science, and other relevant disciplines. Establish interdisciplinary research centers or institutes focused on AI, ML, and Data Science.
- Industry Collaboration: Forge strong partnerships with local and international industries, startups, and technology companies. Create an industry advisory board to guide curriculum development and ensure alignment with industry needs.
- Student Engagement and Support: Encourage students to participate in AI/ML/Data Science competitions, research projects, and internships. Offer scholarships, mentorship programs, and career counseling to support students pursuing AI, ML, and Data Science education.
- Ethical Considerations: Integrate discussions on ethics, fairness, and responsible AI into the curriculum. Promote awareness of ethical issues and responsible AI practices among students and faculty.
- Community Engagement: Host workshops, seminars, and conferences to engage with the academic community, industry professionals, and policymakers. Launch outreach programs to promote STEM education and AI awareness in local schools and communities.
- Evaluation and Continuous Improvement: Implement regular program evaluations and gather feedback from stakeholders to refine and update the curriculum. Stay up-to-date with AI/ML/Data Science advancements and adjust course offerings accordingly.

By embracing AI, ML, and Data Science and implementing these recommendations, the University of Ibadan can become a leader in interdisciplinary education and research, contributing to local development and global advancements, and preparing graduates for careers in these dynamic fields.

- ACKNOWLEDGEMENTS

Let me start by acknowledging the presence of the Almighty God in my life, and especially for today’s Faculty Lecture. I am reminded of the slogan of my state’s immediate past governor, Udom Gabriel Emmanuel, which resonated throughout his 8 years in office – “ONLY GOD”. Without a doubt, God has been kind to me, as one might say, “If not for the Lord, now may Israel say”.

My Dean, I am a person surrounded by many persons, so I might bore you with names in this section. Please pardon me, sir. At this point, I would like to express my gratitude to the Committee on Faculty Lecture, under the leadership of Prof. S. T. Ogunbanwo, as well as all the committee members, for granting me the opportunity to deliver the first Faculty Lecture of the 2022/2023 session.

My Immediate Family

The core of my existence, my immediate family, holds a special place in my heart. I am profoundly grateful to my wife, Deaconess (Mrs.) Joy Ejebosele Oluwaseun Udomboso, for her unwavering support throughout my academic journey. Your remarkable understanding and ability to manage our household during my frequent absences have been a pillar of strength. I love you dearly. To my cherished children, Faith, Favour, Florish, and Fountain, I extend my heartfelt gratitude. Your understanding and patience during my prolonged absences have been truly remarkable. I pray that the Lord, who has guided me through every challenge, continues to watch over, guide and abundantly reward each of you, in Jesus’ name.

Spiritual Mentors and Friends

I am profoundly grateful to those who have shaped my spiritual journey since I surrendered my life to the Lord on June 22, 1986. The Scripture Union (Nigeria), notably the Uyo Region, laid the foundation for my spiritual growth. Special thanks to my late patron, Mr. Umoh, my school’s math teacher, who mentored me. Engr. Enoto Udo Oton, my first discipler, and Mrs. Eka Udomboso, my life mentor and sister-in-law, have been pivotal. I also appreciate Pastor Marcus