

ARTIFICIAL NEURAL NETWORKS

FOR

SENSOR VALIDATION, FAULT DIAGNOSIS,

MODELLING AND CONTROL OF

DIESEL ENGINES

A DOCTORAL THESIS

BY

EHSAN MESBAHI

DEPARTMENT OF MARINE TECHNOLOGY


TO MY FATHER

ABBAS MESBAHI


ABSTRACT

This thesis demonstrates various novel applications of Artificial Neural Networks (ANNs) in the field of marine engineering, concerning sensor validation, fault diagnosis and the modelling and adaptive control of diesel engines.

A biological resemblance between natural and artificial neural networks is identified and studied. Inspired by the intensity thresholds and sensitivity characteristics of the human sensory mechanism, a novel non-linear normalisation technique is proposed for the recognition of a highly non-linear engineering application. This technique paves the way to faster learning and better generalisation for ANNs.

An integrated system is proposed for on-line sensor validation and intelligent engine fault diagnosis using auto-associative and standard ANNs. The proposed method utilises the data recovery capability of the auto-associative ANNs to arrive at a highly reliable platform for the condition monitoring of diesel engines. The method is successfully implemented on a Ruston medium speed diesel engine as a case study.

An alternative technique for system identification utilising ANNs is introduced and applied to a time-variant dynamic system, a high speed diesel engine. It is demonstrated that a global and dynamic model of the diesel engine at varying operational conditions can be established successfully by using recurrent ANNs, in contrast to the conventional system identification techniques.

For the applications in the area of intelligent control, the structure of a Model Reference Neural Adaptive Controller (MRNAC) is used for the control of the engine speed. The inverse model of the identified engine is successfully utilised for the training of the controller ANN. Finally, based on the successful application of the MRNAC, a novel Neuro-Governor is designed and implemented for the real-time speed control of the engine.


DECLARATION

No portion of the work presented in this thesis has been submitted in support of an

application for another degree or qualification of this or any other university or other

institute of learning.


ACKNOWLEDGEMENTS

The opportunity for this study would never have arisen without the support and encouragement of my beloved father, Abbas Mesbahi, who did not live long enough to see the end of it; to him I am most indebted and grateful.

I am very grateful to all academic, secretarial and technical members of staff in the Department of Marine Technology, Profs Roskilly and Sen, Dr Pei Lin Zhou, and particularly Dr Mehmet Atlar for his endless, unforgettable support. Mr John Smith and Mr John Pierce in the Jones Marine Engineering Laboratory have given me the best technical support one may get; I am very thankful to both of them.

This work was partially supported by an Overseas Research Student Award (ORS) from the Committee of Vice Chancellors and Principals (CVCP) and by the Stanley Grey Fellowship from the Institute of Marine Engineers. I am thankful to both organisations for their financial support.

My thanks to Profs Thompson and Ruxton for permission to use the Ruston engine database.

I wish to thank all my students during the last four years for their informative discussions, comments and support; I learned a lot when I was teaching you.


CONTENTS

LIST OF FIGURES
LIST OF TABLES
NOMENCLATURE

CHAPTER 1: INTRODUCTION
1.1 ARTIFICIAL NEURAL NETWORKS (ANNs): BACKGROUND AND MOTIVATION
1.2 DIESEL ENGINES: BACKGROUND AND MOTIVATION
1.3 AIMS, OBJECTIVES AND STRUCTURE OF THE THESIS

CHAPTER 2: ARTIFICIAL NEURAL NETWORKS: FUNDAMENTALS
2.1 THE ARTIFICIAL NEURAL NETWORKS: A DEFINITION
2.1.1 COMPARISONS
2.1.2 GENERAL FRAMEWORK
2.2 NETWORK TOPOLOGIES
2.3 PATTERN TYPES
2.3.1 DATA ADDRESSING AND DISTRIBUTION
2.4 LEARNING IN ANNs
2.4.1 MCCULLOCH-PITTS' NEURONE, THE FIRST BUT NO LEARNING
2.4.2 HEBBIAN LEARNING RULE
2.4.3 ROSENBLATT'S NEURONE, THE 'PERCEPTRON'
2.4.4 WIDROW AND HOFF'S ADALINE, THE DELTA RULE
2.5 BACKPROPAGATION, A LEARNING ALGORITHM FOR MLPs
2.5.1 BP METHOD
2.5.2 A GRAPHICAL REPRESENTATION
2.5.3 INITIALISATION
2.5.4 DATA SELECTION
2.6 NETWORK TOPOLOGY (REPRESENTATION)
2.7 NETWORK CAPABILITY (GENERALISATION)
2.8 NETWORK LIMITATIONS

CHAPTER 3: NON-LINEAR NORMALISATION FOR IMPROVED LEARNING AND GENERALISATION
3.1 INTRODUCTION
3.1.1 SIGNAL RECEPTION AND PROCESSING WITHIN THE RETINA; NATURAL NON-LINEAR SIGNAL PROCESSING
3.2 SENSORY PHYSIOLOGY
3.2.1 BASIC DIMENSIONS OF SENSATION
3.2.2 INTENSITY
3.2.3 METHODS OF DETERMINING THRESHOLDS
3.3 ARTIFICIAL NEURAL NETWORKS
3.3.1 THE STANDARD NORMALISATION PROCEDURE
3.3.2 INPUT DATA (STIMULI) SELECTION
3.3.3 NETWORK SELECTION
3.3.4 DEFINITION OF ABSOLUTE THRESHOLDS AND JUST NOTICEABLE DIFFERENCE
3.3.5 PARAMETERS AFFECTING ABSOLUTE THRESHOLDS AND JNDs
3.3.6 TEST RESULTS I: ABSOLUTE THRESHOLDS
3.3.7 TEST RESULTS II: VALIDITY OF WEBER FUNCTION
3.3.8 TEST RESULTS III: DEFINITION OF JNDs FOR ANNs
3.4 AN ENGINEERING CASE STUDY: THE MOODY DIAGRAM
3.4.1 THE PIPE FLOW
3.4.2 ANN TO REPRESENT MOODY DIAGRAM
3.4.3 TRAINING RESULTS FOR THE FRICTION FACTOR EXAMPLE
3.4.4 GENERALISATION RESULTS FOR THE FRICTION FACTOR EXAMPLE
3.5 CONCLUSIONS

CHAPTER 4: INTELLIGENT SENSOR VALIDATION AND FAULT DIAGNOSTIC SYSTEMS FOR DIESEL ENGINES
4.1 INTRODUCTION
4.1.1 DIESEL ENGINE CONDITION MONITORING SYSTEMS
4.1.2 DIESEL ENGINE FAULT DIAGNOSTIC SYSTEMS
4.1.3 SENSOR VALIDATION & DATA RECOVERY
4.1.4 ARTIFICIAL NEURAL NETWORKS FOR SENSOR VALIDATION AND FAULT DIAGNOSIS
4.1.5 AUTO-ASSOCIATIVE ARTIFICIAL NEURAL NETWORKS: A REVIEW
4.2 RUSTON DIESEL: EXPERIMENTAL DATA
4.3 RUSTON ENGINE SENSOR VALIDATION
4.3.1 DETECTION OF A FAILED SENSOR
4.3.2 DATA RECOVERY
4.4 RUSTON ENGINE FAULT DIAGNOSTICS
4.4.1 FAULT DIAGNOSTICS RESULTS
4.4.2 ON-LINE RE-TRAINING OF THE ANN
4.4.3 ENGINE FAULT DIAGNOSIS WITH FAULTY SENSORS
4.4.4 ENGINE FAULT DIAGNOSIS WITH RECOVERED DATA FOR FAILED SENSORS
4.5 CONCLUSIONS

CHAPTER 5: IDENTIFICATION OF A HIGH SPEED DIESEL ENGINE DYNAMICS
5.1 INTRODUCTION
5.1.1 PHYSICAL MODELLING OF DIESEL ENGINES
5.1.2 IDENTIFICATION OF DIESEL ENGINES
5.1.3 DIESEL ENGINE MODELLING USING ANN TECHNOLOGY
5.2 PERKINS DIESEL ENGINE CASE STUDY
5.3 LEAST SQUARES METHOD FOR ENGINE IDENTIFICATION
5.3.1 EXPERIMENT DESIGN GUIDELINES
5.3.2 DIESEL ENGINE CASE STUDY EXPERIMENTS
5.4 ARTIFICIAL NEURAL NETWORKS AND IDENTIFICATION
5.5 ENGINE DYNAMICS IDENTIFICATION: OFF-LINE IMPLEMENTATION
5.5.1 MODE I: DIESEL ENGINE AS SISO SYSTEM
5.5.2 MODE II: ENGINE AS MISO SYSTEM
5.5.3 PREDICTION OF UNTRAINED DISCONTINUOUS SIGNALS
5.6 ENGINE DYNAMICS IDENTIFICATION: ON-LINE IMPLEMENTATION
5.6.1 TRAINING INTENSITY ADJUSTMENT
5.6.2 PERKINS HIGH SPEED DIESEL IDENTIFICATION
5.6.3 ANN ON-LINE IDENTIFICATION: RUN I
5.6.4 ANN ON-LINE IDENTIFICATION: RUN II
5.6.5 ANN ON-LINE IDENTIFICATION: RUN III
5.6.6 A MATHEMATICAL MODEL OF THE PERKINS ENGINE
5.6.7 UNTRAINED DISCONTINUOUS PREDICTION
5.7 CONCLUSIONS

CHAPTER 6: NEURO-GOVERNOR: A NEURAL ADAPTIVE CONTROLLER FOR DIESEL ENGINES
6.1 INTRODUCTION
6.1.1 LINEAR DIESEL ENGINE CONTROL
6.1.2 NON-LINEAR DIESEL ENGINE CONTROL
6.2 ARTIFICIAL NEURAL NETWORKS AND CONTROL
6.2.1 NEURAL CONTROL ARCHITECTURES
6.2.1.1 SUPERVISED CONTROL
6.2.1.2 ADAPTIVE CRITIC AND REINFORCEMENT TRAINING
6.2.1.3 BACKPROPAGATION THROUGH TIME (BPTT)
6.2.1.4 DIRECT INVERSE CONTROL
6.2.2 MODEL REFERENCE NEURAL ADAPTIVE CONTROL
6.3 A MODEL REFERENCE NEURAL ADAPTIVE CONTROL (MRNAC) FOR PERKINS DIESEL ENGINE
6.4 OFF-LINE IMPLEMENTATION OF MRNAC AND THE EFFECT OF UNCERTAINTIES
6.4.1 SIMULATION OF MRNAC
6.4.2 MODE I: LOAD IS NOT INTRODUCED TO IDENTIFIER & CONTROLLER ANNs
6.4.3 MODE II: INPUT LOAD SIGNAL USED IN IDENTIFIER AND CONTROLLER ANNs
6.4.4 TESTING MRNAC WITH UNSEEN INPUT SIGNALS
6.5 THE NEURO-GOVERNOR, DESIGN AND IMPLEMENTATION
6.5.1 NEURO-GOVERNOR IN ACTION
6.5.2 MODE I: LOAD IS NOT INTRODUCED TO IDENTIFIER AND CONTROLLER ANNs
6.5.3 MODE II: LOAD SIGNAL USED IN IDENTIFICATION AND CONTROLLER DESIGN
6.6 CONCLUSIONS

CHAPTER 7: CONCLUSIONS AND RECOMMENDATION FOR FURTHER WORK
7.1 CONCLUSIONS

LIST OF FIGURES

Figure 2.1 A processing unit
Figure 2.2 A feedforward network
Figure 2.3 A simple recurrent or feedback network
Figure 2.4 Three TLUs with their connections
Figure 2.5 LAM structure with linear nodes
Figure 2.6 Organisation of a perceptron
Figure 2.7 The perceptron
Figure 2.8 A single layer perceptron
Figure 2.9 Geometric representation of a hyperplane
Figure 2.10 NOT logical function represented by perceptron
Figure 2.11 AND logical function represented by perceptron and its hyperplane
Figure 2.12 OR logical function represented by perceptron and its hyperplane
Figure 2.13 Exclusive-OR or XOR logical function
Figure 2.14 Solution to XOR problem using perceptron logical gates
Figure 2.15 The ADALINE and Widrow-Hoff Delta rule
Figure 2.16 A fully connected feedforward neural network
Figure 2.17 A possible solution to XOR problem by MLP
Figure 2.18 A feedforward neural network
Figure 2.19 Backpropagation diagram
Figure 3.1A The Weber fraction and Weber's law
Figure 3.1B Dependence of the Weber fraction on the intensity of the initial stimulus, for auditory stimuli
Figure 3.2 Fully connected 3-layered feed-forward ANN with sigmoidal activation function
Figure 3.3 The growth of the intensity of the stimuli used for the XOR function

Figure 3.4 Absolute thresholds in ANNs
Figure 3.6 Δφ/φ for ANN with 5 neurones in the hidden layer
Figure 3.7 Δφ/φ for ANN with 10 neurones in the hidden layer
Figure 3.8 Δφ/φ for ANN with 20 neurones in the hidden layer
Figure 3.9 JNDs for ANN with 3 neurones in its hidden layer
Figure 3.10 JNDs for ANN with 5 neurones in its hidden layer
Figure 3.11 JNDs for ANN with 10 neurones in its hidden layer
Figure 3.12 JNDs for ANN with 20 neurones in its hidden layer
Figure 3.13 The Moody diagram
Figure 3.14 Mathematical flow chart of the pipe simulation and application of ANN
Figure 3.15 Distribution of f when Reynolds number and relative roughness are linearly normalised
Figure 3.16 Pre-processing of the input stimuli before training/classification
Figure 3.17 Distribution of f when Reynolds number and relative roughness are non-linearly normalised
Figure 3.18 Linear and asinh normalisation of Reynolds number
Figure 3.19 Linear and asinh normalisation of relative roughness
Figure 3.20 Training results of linear and non-linear (asinh) normalised input data for ANNs randomly initialised with variance 0.2
Figure 3.21 Training results of linear and non-linear (asinh) normalised input data for ANNs randomly initialised with variance 0.5
Figure 3.22 Training results of linear and non-linear (asinh) normalised input data for ANNs randomly initialised with variance 0.9
Figure 3.23 Mean squared error for test data set after 25000 iterations
Figure 3.24 Linear correlation coefficient for test data set after 25000 iterations
Figure 4.1 A fully connected feed-forward ANN with m inputs and n outputs
Figure 4.2 An auto-associative ANN structure
Figure 4.3 Ruston engine and 25 signals (sensors) selected for validation

Figure 4.4 Sensor Confidence Levels when sensor 102 failed
Figure 4.5 Sensor Confidence Levels when sensor 106 failed
Figure 4.6 Sensor Confidence Levels when sensor 303 failed
Figure 4.7 Sensor Confidence Levels when sensor 308 failed
Figure 4.8 Sensor Confidence Levels when sensor 310 failed
Figure 4.9 Sensor Confidence Levels when sensor 312 failed
Figure 4.10 Sensor Confidence Levels when sensor 408 failed
Figure 4.11 Sensor Confidence Levels when sensor 409 failed
Figure 4.12 ANN outputs (user defined) for faults 1 to 7 and healthy condition respectively
Figure 4.13 ANN and user interface for engine fault diagnostics
Figure 4.14 ANN response to untrained set of patterns representing faults 2, 4, 1, 5 and healthy
Figure 4.15 On-line re-training of the fault diagnostic ANN
Figure 4.16 The combination of an auto-associative and diagnosis ANN
Figure 5.1 Least Squares identification method
Figure 5.2 Perkins Diesel Engine identification at LL-LS, RMS=0.0271
Figure 5.3 Perkins Diesel Engine identification at LL-MS, RMS=0.0172
Figure 5.4 Perkins Diesel Engine identification at LL-HS, RMS=0.0325
Figure 5.5 Perkins Diesel Engine identification at ML-LS, RMS=0.0249
Figure 5.6 Perkins Diesel Engine identification at ML-MS, RMS=0.0225
Figure 5.7 Perkins Diesel Engine identification at ML-HS, RMS=0.0286
Figure 5.8 Perkins Diesel Engine identification at HL-LS, RMS=0.0370
Figure 5.9 Perkins Diesel Engine identification at HL-MS, RMS=0.0414
Figure 5.10 Perkins Diesel Engine identification at HL-HS, RMS=0.0122
Figure 5.11 Variation in estimator parameters at different operating conditions
Figure 5.12 PERKINS diesel engine simulation and the identifier ANN
Figure 5.13 ANN activation function
Figure 5.14 Normalised fuel rack and load input signals
Figure 5.15 Simulated and identified engine speed
Figure 5.16 Identification and prediction error, RMS=0.0609
Figure 5.17 Number of training iterations at each sampling step

Figure 5.18 Normalised fuel rack and load input signals
Figure 5.19 Simulated and identified engine speed
Figure 5.20 Identification and prediction error, RMS=0.0408
Figure 5.21 Number of training iterations at each sampling step and its trend
Figure 5.22 Untrained square shaped movement of the fuel rack
Figure 5.23 Simulated and ANN predicted output signals, no training
Figure 5.24 Prediction error in responding to untrained input signals, RMS=0.0628
Figure 5.25 Untrained square shaped movement of the fuel rack
Figure 5.26 Simulated and ANN predicted output signals, no training
Figure 5.27 Prediction error in responding to untrained input signals, RMS=0.05175
Figure 5.28 On-line ANN identification interface
Figure 5.29 On-line ANN identification diagram for Perkins engine
Figure 5.30 Normalised input signals to the engine, RUN I
Figure 5.31 Real engine speed and ANN identified/predicted speed, RUN I
Figure 5.32 Identification/prediction error, RUN I, RMS=0.04039
Figure 5.33 Input stimulating signals to the engine, RUN II
Figure 5.34 Real engine speed and ANN identified/predicted speed, RUN II
Figure 5.35 Identification/prediction error, RUN II, RMS=0.0109
Figure 5.36 Input stimulating signals to the engine, RUN III
Figure 5.37 Real engine speed and ANN identified/predicted speed, RUN III
Figure 5.38 Identification/prediction error, RUN III, RMS=0.00392
Figure 5.39 Untrained fuel rack and load input presented for engine speed prediction
Figure 5.40 Real engine speed and ANN predicted speed
Figure 5.41 Prediction error, RMS=0.0916
Figure 6.1 Widrow and Smith's first neural controller
Figure 6.2 Training ANN controller using BPTT
Figure 6.3 Indirect learning structure
Figure 6.4 General learning structure
Figure 6.5 Specialised learning architecture

Figure 6.6 General arrangement of a Model Reference Indirect Adaptive Controller
Figure 6.7 A Model Reference Neural Adaptive Control (MRNAC) for Perkins Engine
Figure 6.8 Utilisation of the inverse model of the system in providing e1(t)
Figure 6.9 Input signal and load signal used in all simulations
Figure 6.10 MRNAC block diagram when load was not used in identification and control
Figure 6.11 Basis run in simulation of MRNAC, unidentified load
Figure 6.12 Run 1 in simulation of MRNAC, unidentified load
Figure 6.13 Run 2 in simulation of MRNAC, unidentified load
Figure 6.14 Run 3 in simulation of MRNAC, unidentified load
Figure 6.15 Run 4 in simulation of MRNAC, unidentified load
Figure 6.16 Run 5 in simulation of MRNAC, unidentified load
Figure 6.17 Run 6 in simulation of MRNAC, unidentified load
Figure 6.18 Run 7 in simulation of MRNAC, unidentified load
Figure 6.19 RMS of the error between the reference speed and engine speed, MRNAC, Mode I
Figure 6.20 MRNAC block diagram when load was used in identification and control
Figure 6.21 Basis run in simulation of MRNAC, with introduction of load signal
Figure 6.22 Run 8 in simulation of MRNAC, with introduction of load signal
Figure 6.23 Run 9 in simulation of MRNAC, with introduction of load signal
Figure 6.24 Run 10 in simulation of MRNAC, with introduction of load signal
Figure 6.25 Run 11 in simulation of MRNAC, with introduction of load signal
Figure 6.26 Run 12 in simulation of MRNAC, with introduction of load signal
Figure 6.27 Run 13 in simulation of MRNAC, with introduction of load signal
Figure 6.28 Run 14 in simulation of MRNAC, with introduction of load signal
Figure 6.29 RMS of the error between the reference speed and engine speed, MRNAC, Mode II

Figure 6.31 Response of MRNAC to untrained input signal of high frequency (RMS=0.047)
Figure 6.32 Test 2 input signals used for MRNAC test
Figure 6.33 Response of MRNAC to untrained square shaped signal of low frequency (RMS=0.127)
Figure 6.34 Neuro-Governor user interface
Figure 6.35 Block diagram of the Neuro-Governor
Figure 6.36 Basis run in implementation of the Neuro-Governor, unidentified load
Figure 6.37 Run 1 in implementation of the Neuro-Governor, unidentified load
Figure 6.38 Run 2 in implementation of the Neuro-Governor, unidentified load
Figure 6.39 Run 3 in implementation of the Neuro-Governor, unidentified load
Figure 6.40 Run 4 in implementation of the Neuro-Governor, unidentified load
Figure 6.41 Run 5 in implementation of the Neuro-Governor, unidentified load
Figure 6.42 RMS values of the error between the input reference signal and engine speed for experimental runs of the Neuro-Governor in Mode I
Figure 6.43 Basis run in implementation of Neuro-Governor, identified load
Figure 6.44 Run 6 in implementation of Neuro-Governor, identified load
Figure 6.45 Run 7 in implementation of Neuro-Governor, identified load
Figure 6.46 Run 8 in implementation of Neuro-Governor, identified load
Figure 6.47 Run 9 in implementation of Neuro-Governor, identified load
Figure 6.48 Run 10 in implementation of Neuro-Governor, identified load
Figure 6.49 RMS values of the error between the input reference signal and engine speed for experimental runs of the Neuro-Governor in Mode II

LIST OF TABLES

Table 3.1 The Moody diagram region for ANN training
Table 4.1 Data sets of healthy and faulty engine conditions
Table 4.2 Signals selected for validation and engine fault diagnosis
Table 4.3 Comparison between recovered (predicted) and correct readings with a faulty sensor
Table 4.4 Numerical presentations of faults 1 to 7 and healthy condition for training
Table 4.5 Results of engine fault diagnosis using different ANN architectures and 80 unseen data sets
Table 4.6 Engine fault diagnosis using ANNs with sensor failure
Table 4.7 Engine fault diagnosis with failed sensors and recovered data
Table 5.1 Perkins Diesel particulars
Table 5.2 Discrete transfer functions representing PERKINS diesel engine
Table 5.3 Total numbers of data used for best fit Least Squares identification
Table 5.4 RMS value of the prediction error when the models of LL-HS, ML-MS and HL-LS were used for other engine operational conditions
Table 5.5 ANN models used for off-line identification

NOMENCLATURE

Vectors and matrices are represented in bold. Scalars are presented in lower case. Common symbols and abbreviations are given below:

AAM Address Addressable Memory
ADALINE Adaptive Linear Neurone
ANN Artificial Neural Network
BP Backpropagation
BPTT Backpropagation Through Time
CAM Content Addressable Memory
CI Compression Ignition
F(.) Activation function
FAM Fuzzy Associative Memory
FDCL Fault Diagnostic Confidence Level
H(.) Uncertainty function
HS-HL High Speed-High Load
HS-LL High Speed-Low Load
HS-ML High Speed-Medium Load
Hz Hertz
JND Just Noticeable Difference
LAM Linear Associative Memory
LCC Linear Correlation Coefficient
LMS Least Mean Square
LS Least Square
LS-HL Low Speed-High Load
LS-LL Low Speed-Low Load
LS-ML Low Speed-Medium Load
M Mega
MADALINE Multiple ADALINE
MLP Multi Layer Perceptron
MS-HL Medium Speed-High Load
MS-LL Medium Speed-Low Load
MS-ML Medium Speed-Medium Load
MSE Mean Squared Error
OLAM Optimal Linear Associative Memory
RBF Radial Basis Functions
Re Reynolds Number
RMS Root Mean Square
SI Spark Ignition
SCL Sensor Confidence Level
TLU Threshold Logic Unit
XOR Exclusive OR
asinh Inverse hyperbolic sine
c Weber fraction
d Diameter of the pipe
exp Exponential
f Friction factor
ε Average roughness of the pipe
sig Sigmoid function
tanh Hyperbolic tangent
x Input
y Output
η Learning rate
φ Intensity of the stimulus


CHAPTER 1

INTRODUCTION: BACKGROUND AND MOTIVATIONS

1.1 ARTIFICIAL NEURAL NETWORKS (ANNs): BACKGROUND AND MOTIVATION

ANNs are computational algorithms based on our understanding of biological nervous systems. They are an alternative computational technique to the conventional approach of sequential and algorithmic methods.

ANNs, like the human nervous system, process incoming data in many simple, parallel processing units, unlike the common and traditional processing structure of one powerful processor dealing with data in turn. This enables ANNs to respond rapidly to tasks that involve the real-time, simultaneous processing of several signals.

The other advantageous feature of ANNs is their adaptive, teachable and non-linear structure. This makes them strong candidates for mapping non-linear multi-dimensional input-output spaces as well as for modelling the dynamic behaviour of complex systems.

Research in ANN models goes back to the time when the first digital computers were being developed in the 1940s (McCulloch & Pitts, 1943), (Pitts & McCulloch, 1947), (Hebb, 1949). After a massive downfall at the end of the 1960s because of deficiencies in their functional capabilities (Minsky & Papert, 1969), they reappeared in the mid-1980s with a strong reply to their critics (McClelland & Rumelhart, 1981). Today, they are the subject of study in areas as diverse as medicine, engineering and economics, to solve problems that cannot easily be tackled by conventional techniques.

1.2 DIESEL ENGINES: BACKGROUND AND MOTIVATION

Diesel engines are the prime movers for the majority of ship propulsion systems and are also highly utilised in marine power generation, mainly because of their high power-to-weight ratio and high efficiency compared with other heat engines such as gas or steam turbines.

Diesel engines operate in the off-shore and on-shore industries with power ratings ranging from 10 kW to 80 MW and speeds from 60 to 5000 rpm, displaying an extremely wide application profile.

Their reliability and high level of performance depend on their faultless operation. With the current advancement in electronic instrumentation, all the information required for their health assessment is readily available. Knowledge-based decision-making algorithms such as Expert Systems are currently a favourite choice as a fault diagnosis utility for engine manufacturers. Nevertheless, the accuracy of the existing fault diagnostic systems is highly dependent on the accuracy of the supplied data as well as on the precision of the built-in physical models representing individual subsystems. In addition, none of the existing algorithms has the ability to learn new engine conditions in situ, i.e. during engine operation. It is hoped that, firstly, by applying an ANN-based sensor validation and data recovery procedure, it will be possible to raise the reliability of any data-dependent fault diagnosis tool. Secondly, by training ANNs to recognise known engine conditions, they can provide a highly reliable fault diagnosis facility. It is also hoped that, by employing easily teachable pattern recognition ANNs, new healthy and faulty diesel engine operating conditions can be trained on-line and recalled during engine operation.

The attempts to model diesel engine behaviour go back to when they were first introduced to industry. These models were initially developed to help engine designers to arrive at a more efficient heat engine as well as to improve the power-to-weight ratio, component endurance and maintainability. Today, design offices face a new challenge: a significant reduction in diesel engine emissions, which has been adopted by various governmental authorities as part of the growing concern for the global protection of the environment.

Diesel engine modelling techniques vary according to the application of the model. Unavoidably, engine models which aim to improve our understanding of the physical events occurring during the engine cycles must include chemical, thermodynamic and dynamic mathematical representations of the various subsystems. These models may finally be employed to investigate the effect of various changes in engine design parameters. The main disadvantage of dealing with such models is our lack of understanding of what is really happening during the different processes.

If the objective of diesel engine modelling is to define the engine's dynamic behaviour during transient conditions, system identification or black-box modelling techniques may be employed. The main disadvantage of such techniques is their local validity: once the system's operating conditions change, the accuracy of the prediction deteriorates considerably. Re-modelling the system, although possible, is computationally expensive and may not always lead to an optimal solution. Moreover, the latter models do not retain any physical understanding of the system; they are simply used for the representation of the system dynamics and are mainly employed in discrete-time system identification and adaptive control.

ANNs have been found rather successful in capturing and learning the dynamic behaviour of various systems. In addition, the feasibility of on-line training of ANNs has encouraged many researchers to adopt this technique as an adaptive system identification tool. Therefore, in this research study an attempt will be made to apply ANN technology to model and identify a high speed diesel engine. It is hoped that the ANN model will be able to represent accurately the complex and time-variant behaviour of the engine at varying operational conditions.

Speed control of diesel engines has traditionally been achieved by simple mechanical or hydraulic governing devices. However, due to rapid and substantial changes in their dynamic behaviour at different operational conditions, simple linear controllers may not respond as effectively as required for certain applications. The application of adaptive control schemes to a wide range of time-varying dynamic systems has shown promising results. Therefore an effort will be made to demonstrate the ability of ANNs, in combination with the adaptive control technique, to operate as an intelligent governor.

The on-line adaptation, rapid convergence and learning capability of ANNs have been researchers' driving incentives for their employment as adaptive controllers. In essence, this thesis also aims to implement the ANN technique with the hope of developing an on-line adaptive controller for a diesel engine as a time-variant non-linear dynamic system.

1.3 AIMS, OBJECTIVES AND STRUCTURE OF THE THESIS

The background and motivation regarding recent promising developments in the area of ANNs and the future demands of marine engineering, particularly in the field of diesel engines, have prompted the author to set the aim of the thesis as the investigation of various novel applications of ANNs in this sphere.


In Chapter 2, the mathematical foundations of ANNs, from the very early concepts to the currently popular architectures, are discussed with a specific emphasis on learning strategies and their practical capabilities and limitations.

One of the ANN limitations mentioned in Chapter 2 is their slow convergence during training when mapping certain input/output data spaces. Therefore, Chapter 3 is concerned with an investigation into the similarity in discriminating capability between the human sensory mechanism and ANNs. Methods used in psychophysics are applied to a selected ANN structure, which is trained to map the famous Exclusive-OR (XOR) logic function. The dynamism and adaptability of the pre-processing part of the human sensory mechanism is a crucial factor in human survival, and these features of the human sensory system are to be imitated by the ANN structure. The main objective of Chapter 3 is to define the absolute and differential thresholds and the Just Noticeable Difference (JND) for ANNs. Based on this understanding, non-linear pre-processing and normalisation of the input patterns is introduced. The application of this normalisation technique is tested on a "real-world" engineering study, and the improvements in the speed of convergence and in the generalisation capabilities of ANNs are discussed.

The main objective of Chapter 4 is the application of ANNs to the sensor validation and fault diagnosis of diesel engines. A 6-cylinder medium speed turbocharged Ruston diesel is considered as the case study. The engine's operational data are collected under combinations of various faults and operational conditions and are used in training two ANNs for different purposes. The first one, an "auto-associative ANN", is trained to discriminate between a faulty and a healthy reading of a sensor based on the consistency amongst the elements of the overall data available. The same network is then used to replace the faulty sensor reading with a "close-enough" substitute and warn the operator. The second network, a "pattern recognition ANN", is designed to compare and discriminate engine fault conditions. A close match between the stored faulty/healthy patterns and the data measured by the data acquisition system results in a fault being found. In the case of an unknown faulty condition, the system will prompt the operator.
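To make the sensor-validation idea concrete, the following is a minimal sketch under stated assumptions, not the thesis implementation: simple linear per-sensor estimators stand in for the bottlenecked auto-associative ANN of Chapter 4, and all data, names and thresholds are illustrative. Each sensor is estimated from the remaining ones using models fitted to healthy data; a failed sensor is identified as the one whose replacement by its own estimate best restores the consistency of the whole reading, and that estimate serves as the recovered value.

```python
import numpy as np

rng = np.random.default_rng(0)

# Healthy training data: five redundant "sensor" channels driven by two signals
t = rng.normal(size=(400, 2))
mix = np.array([[1.0, 0.0, 1.0, 1.0, 2.0],
                [0.0, 1.0, 1.0, -1.0, 1.0]])
healthy = t @ mix + 0.02 * rng.normal(size=(400, 5))
n = healthy.shape[1]

# One linear model per sensor: predict channel i from the remaining channels
# (a linear stand-in for the auto-associative ANN described above)
models = []
for i in range(n):
    A = np.column_stack([np.delete(healthy, i, axis=1), np.ones(len(healthy))])
    coef, *_ = np.linalg.lstsq(A, healthy[:, i], rcond=None)
    models.append(coef)

def estimate(x, i):
    """Estimate sensor i from the other sensor readings."""
    return np.append(np.delete(x, i), 1.0) @ models[i]

def total_residual(x):
    """Overall inconsistency of a reading across all channels."""
    return sum((x[i] - estimate(x, i)) ** 2 for i in range(n))

def find_failed(x):
    """The failed sensor is the one whose replacement by its own estimate
    best restores the consistency of the whole reading."""
    scores = [total_residual(np.where(np.arange(n) == i, estimate(x, i), x))
              for i in range(n)]
    return int(np.argmin(scores))

x = healthy[0].copy()
x[2] += 5.0                     # simulate a failed sensor
i = find_failed(x)              # -> 2
print(i, estimate(x, i))        # recovered reading for the failed channel
```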

In Chapter 5, an adaptive technique for the system identification of the dynamic behaviour of a diesel engine is introduced as another main objective of the thesis. A conventional Least Squares (LS) technique is applied to identify a high-speed Perkins diesel engine under varying operational conditions. Observation of substantial changes in the dynamic behaviour of the engine shows that each identified model is only locally valid for the representation of the engine's performance. In order to obtain a globally accurate model of the diesel engine, recurrent ANNs are first applied off-line and in parallel while the behaviour of the engine is simulated using the models derived by the LS technique. During this process, a switching mechanism is used to roughly imitate the time-variant behaviour of the engine parameters. This representation is then used to investigate whether the processing time required by an ANN-based identifier is short enough to cope with rapid changes in the engine dynamics. To investigate further whether an on-line identification technique using ANNs can be applied to identify the dynamic behaviour of the Perkins engine, the same modelling procedure is applied in real time.
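As a sketch of the on-line identification scheme outlined above (a toy illustration only: the second-order "engine", the regressor choice and the single tanh unit here are assumptions, not the thesis' networks), a one-step-ahead predictor maps recent speed, fuel-rack and load samples to the next speed sample and is updated by a gradient step on the prediction error at every sample:

```python
import numpy as np

rng = np.random.default_rng(1)
theta = rng.normal(scale=0.1, size=5)   # identifier parameters

def predict(theta, phi):
    """One-step-ahead speed prediction from the regressor phi."""
    return np.tanh(phi @ theta)

def online_update(theta, phi, y, lr=0.05):
    """One gradient step on the squared prediction error at this sample."""
    y_hat = predict(theta, phi)
    grad = -2.0 * (y - y_hat) * (1.0 - y_hat ** 2) * phi
    return theta - lr * grad

# Toy "engine": a stable second-order discrete system driven by fuel rack u
y1 = y2 = 0.0
for t in range(2000):
    u = 0.5 * np.sin(0.02 * t)                  # fuel rack excitation
    load = 0.2 * np.sign(np.sin(0.005 * t))     # step-like load changes
    y = 0.6 * y1 - 0.1 * y2 + 0.4 * u - 0.2 * load
    phi = np.array([y1, y2, u, load, 1.0])      # past speeds and inputs
    theta = online_update(theta, phi, y)        # adapt at every sample
    y1, y2 = y, y1
print(theta)   # parameters of the adapted identifier
```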

In order to demonstrate an effective use of ANNs for the intelligent control of time-variant systems in marine engineering, a Model Reference Neural Adaptive Controller, MRNAC, is proposed in Chapter 6 for the speed control of a diesel engine. An inverse model, based on the process of 'Effect-to-Cause', is utilised to represent the inverse dynamic behaviour of the engine. Such an 'Effect-to-Cause' process is hardly possible when dealing with the physical modelling of systems; the computational ability of ANNs in performing such a process is extremely advantageous in the design of adaptive controllers. Off-line application of the proposed MRNAC design is implemented using the time-variant engine model developed in Chapter 5, and the results of the controller performance at varying parameters of the engine and its controller are presented.
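The model-reference idea can be sketched as follows; this is a hedged toy, in which a linear stand-in plant, a two-gain controller and an MIT-rule-style gradient replace the thesis' identifier and controller ANNs. The controller parameters are adapted so that the closed-loop speed follows a first-order reference model:

```python
import numpy as np

def reference_model(r, ym, a=0.9):
    """Desired closed-loop behaviour: a smooth first-order response."""
    return a * ym + (1.0 - a) * r

def plant(y, u, load):
    """Stand-in engine model (the identifier ANN plays this role in MRNAC)."""
    return 0.8 * y + 0.3 * u - 0.1 * load

k = np.array([0.5, 0.0])   # controller gains: u = k1*(r - y) + k2*y
y = ym = 0.0
lr = 0.01
for t in range(4000):
    r = 1.0 if (t // 500) % 2 == 0 else 0.2   # reference speed steps
    load = 0.3 if t > 2000 else 0.0           # a load disturbance
    u = k[0] * (r - y) + k[1] * y
    y_next = plant(y, u, load)
    ym = reference_model(r, ym)
    # Tracking error propagated through the known plant sensitivity:
    # d y_next / d u = 0.3 and d u / d k = [r - y, y]
    e = y_next - ym
    k -= lr * 2.0 * e * 0.3 * np.array([r - y, y])
    y = y_next
print(k)   # adapted gains; the closed loop now tracks the reference model
```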

In order to demonstrate a successful application of the MRNAC, an on-line ANN speed controller, called the "Neuro-Governor", is introduced and designed for the Perkins diesel engine. The Neuro-Governor, mathematically based on the MRNAC, is implemented by using a fast-acting electrical actuator to control the speed of the engine. The results of the real-time application of the Neuro-Governor at different operational conditions and different settings of its control parameters are presented next.

Finally, the last chapter of the thesis presents the overall conclusions of the thesis and recommendations for further work.


CHAPTER 2

ARTIFICIAL NEURAL NETWORKS: FUNDAMENTALS

Summary

The overall aim of this chapter is to become familiar with the background and fundamentals of ANN technology. The chapter's objectives may be summarised as: to define and represent the different structures and designs of ANNs, their learning paradigms, the backpropagation algorithm, data selection procedures and their limitations.

2.1 THE ARTIFICIAL NEURAL NETWORKS: A DEFINITION

Many researchers and pioneers have tried to define neural networks in a concise and brief format. The following is the one preferred by Kohonen (Kohonen, 1989):

"The artificial neural networks are massively parallel interconnected networks of simple (usually adaptive) elements and their hierarchical organisations which are intended to interact with the objects of the real world in the same way as the biological nervous system do."

He also categorises neural computers into two groups: first, those in which the interconnections between processing elements are formed adaptively; and second, machines whose parameters are time-invariant. In this context he also defines learning as the improvement of system performance, in the sense of some criterion, relating to its use.

2.1.1 COMPARISONS

Before any attempt to build a brain-like machine, or in other words to program a computer in such a way that it imitates brain behaviour, the differences and similarities between the two must be discussed. This comparison has many different aspects, of which the most general ones follow.

a) Degree of parallelism: Massively parallel computers, which are asynchronously operating networks of processing elements, are governed by abstract sequential processes called Petri nets. They still execute digital, especially arithmetic, operations according to pre-planned machine instructions. Neural network operation is also asynchronous, but neural signals have no exact format, so it is not possible to combine any control information, such as control codes or status bits. This makes all current parallel processing functions, such as multiplexing, data switching, time-shared operations and locally centralised operations, impossible. The number of neurones inside the brain is estimated to be between 10^11 and 10^14, each with between 10^3 and 10^4 abutting connections per neurone.

b) Neural signals: Biological neural signals are continuous-valued, continuous-time physical signals; therefore, they are not binary and the neurones are not bistable latches. The neural impulses, although electrical in nature, are not synchronised to any clock frequency, and synapses do not flip between two states of 0 or 1.

c) Precision and depth of recursion: A neurone may be defined as a low-precision processor (in terms of the number of significant digits of the response). Consequently, the depth of recursion in computations must be very small. Digital computers can easily execute sequential operations consisting of billions of successive steps; such an ability is not possible without highly stable, accurate and noise-tolerant circuit components.

d) Processing speed: With today's technology, digital computers using clock frequencies of up to 800 MHz take of the order of a nanosecond to execute a single instruction. We have seen in the previous sections that neurones operate in the millisecond range (approximately 4 ms) to complete a firing cycle.

e) Processing order: The processing operations in the brain are not centralised as in classical digital computers. There are no control or arithmetic-logic units. All functions seem to be distributed and mixed together, an anarchic system in contrast to the completely autocratic system in computers.

f) Data storage: In a computer, data is statically stored in an addressed memory location; new information, at the same address, destroys old information. Information in the brain is stored in the interconnections between neurones. New information is added to the brain by adjusting the interconnection strengths, the synaptic efficacy, between the neurones. Briefly, knowledge in the brain is adaptable, whilst in the computer it is replaceable.

g) Fault tolerance: Damage to individual neurones can occur in the brain without severe degradation of its overall performance, since the brain carries a distributed representation of the information. In contrast, most conventional computers are fault-intolerant: removing or damaging any processing component leads to an ineffective machine, and the corruption is irretrievable.

2.1.2 GENERAL FRAMEWORK

A large number of models exist, all different in their details but sharing the PDP concept. Here a general framework is introduced that possesses most of the common features (Rumelhart & McClelland, 1986).

Elements of the Model

Each model consists of eight principal aspects. They are introduced here with their possible analogies to the human brain and nervous system.


1. A set of processing units, equivalent to the neurones in our nervous system. It will be seen later that these processing units are called artificial neurones.

2. A state of activation, which in binary units may be "on" or "off" and in analogue units depends on the input through a functional relationship. It is equivalent to what has already been explained as a neurone at rest (resting potential) or in the conduction state, transmitting an action potential.

3. An output function for each unit. In the neurone this may be translated as the speed and/or frequency of impulse conduction.

4. A pattern of connectivity among units: the excitatory or inhibitory behaviour of the neurotransmitters through which impulses are received from other neurones.

5. A propagation rule for transferring patterns of activity through the network. It is believed that each neurone in the brain is connected to almost 10^4 other neurones.

6. An activation rule for combining the inputs to a single unit and, based on that, deciding on the new state of the unit. Spatial summation and the threshold of stimulation together make up the activation function of the neurone, which decides its final state.

7. A learning rule whereby patterns of connectivity are modified: learning and memory in the brain, habituation and sensitisation.

8. An environment within which the system must operate: sensory (afferent) and motor (efferent) neurones that deal with the outside world in order to receive and send impulses.

Fig 2.1 illustrates a processing unit. The vector [x1(t), x2(t), x3(t), ...] is called the input vector, which carries information either from outside or from previous neurones. The vector [w1j, w2j, w3j, ...] is the weighting or strength vector. Each input vector element is inhibited or excited by the corresponding member of the weighting vector. This vector (or matrix, in networks with many neurones) is said to contain the long-term memory of the network (LTM). Σ is the input function, which adds up all the incoming values. F is called the activation function; here the data is processed, and there are a number of possibilities for this function, such as:


the identity function: output = input

the threshold function: if input ≥ threshold, then output = 1; otherwise output = 0

the sigmoid function: output = 1 / (1 + exp(-input))

the tanh function: output = tanh(input)

Fig 2.1 A processing unit

A unit's job is simply to receive input from its neighbours and, as a function of the inputs it receives, to compute an output value which it sends to its neighbours:

s_j = F( Σ_i x_i w_ij )    (2.2)

Φ(.) is the output function, which in most cases is considered to be the identity function:

Φ(x) = x    (2.1)

In some cases it might be a threshold function, and in some others it is assumed to be a stochastic function in which the output of the unit depends in a probabilistic fashion on its activation values.
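A minimal sketch of such a processing unit follows; this is our illustration rather than code from the thesis, assuming the standard sigmoid of the form used with [0, 1] normalisation later in the thesis, and with illustrative function names:

```python
import numpy as np

def activation(net, kind="sigmoid", threshold=0.0):
    """Activation function F: identity, threshold, sigmoid or tanh."""
    if kind == "identity":
        return net
    if kind == "threshold":
        return 1.0 if net >= threshold else 0.0
    if kind == "sigmoid":
        return 1.0 / (1.0 + np.exp(-net))
    if kind == "tanh":
        return np.tanh(net)
    raise ValueError(kind)

def process_unit(x, w, kind="sigmoid"):
    """One processing unit: s_j = F(sum_i x_i * w_ij), Eq (2.2);
    the output function Phi is taken as the identity, Eq (2.1)."""
    net = float(np.dot(x, w))   # input function: weighted sum of inputs
    return activation(net, kind)

x = np.array([0.5, 1.0, -0.2])   # input vector [x1(t), x2(t), x3(t)]
w = np.array([0.8, -0.4, 0.3])   # weighting (long-term memory) vector
print(process_unit(x, w, "sigmoid"))
```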

Similar to afferent and efferent neurones, in our artificial system we may characterise three types of units: input, hidden and output. Input units receive data from external sources, output units send the processed signal out, and hidden units are those whose input and output signals remain within the system and are not 'visible' to the outside.

outside.

It is also necessary to represent the state of each unit. These may be discrete or continuous values. Depending on the unit's activation function, the state might be 0 or 1 (on or off) or restricted to values of {-1, 0, +1}. In some models the states may take any real value between some maximum and minimum. Such values, when represented in a vector format, are sometimes referred to as the short-term memory of the network.

Processing units are connected to each other. We assume that each unit provides a contribution to the next unit it is connected to, so that the weighted sum of the input signals is received by each element. The weights may be excitatory or inhibitory. The pattern of connectivity is represented by a weighting vector (or matrix) denoted by W. Each individual member of W specifies the strength of a connection. It plays an important role, since it represents the knowledge that is encoded in the network; because of this, the matrix W is said to contain the long-term memory of the system.

How and which inputs are transmitted to the recipient unit is decided by the rule of propagation. This rule, in simple networks, may be an excitatory or inhibitory weighting of the input signal. For more complex patterns, more complex rules of propagation are required.

The activation rule depends on the activation function, F, and determines the state of activation. Sometimes the new state of activation depends on the old states as well as on the current input signal, a case which will be discussed for dynamic networks. The activation function is assumed to be deterministic, and it is sometimes useful to impose the additional constraint that it be differentiable.


Learning involves modifying the patterns of interconnectivity, which in principle can involve the development of new connections, the loss of existing connections or the modification of the strength of connections that already exist. This will be discussed in detail in later sections.

The external environment interacts with the network to provide inputs to the network

and to receive its outputs. Input units (or layer) receive the external signals and output units (or layer) send out the processed signal to the environment.

2.2 NETWORK TOPOLOGIES

Network topology is categorised by the data flow within the system, so two major topologies may be named: feedforward networks and recurrent networks. In the first category, data flows only in one direction, from the input layer towards the output layer. Processing units do not receive signals from their own outputs or from the units ahead of them. This is illustrated in Fig 2.2.

Fig 2.2 A feedforward network

In recurrent or feedback networks, units, in addition to receiving the signals from the preceding units, are allowed to receive inputs either from their own output signals or from the outputs of other units. This is shown in Fig 2.3.

Fig 2.3 A simple recurrent or feedback network

A feedback or delayed output signal introduces dynamic behaviour to the system. The current output is a function, f, of the previous values of the output unit:

y(t) = f( y(t-1), y(t-2), ..., x(t), x(t-1), ... )    (2.3)

Therefore, in general, recurrent networks are nonlinear dynamical systems, and the stability of such networks is one of the main concerns.
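As an illustration of Equation (2.3), the sketch below (ours, with an arbitrary choice of f and weights, not an example from the thesis) steps a simple recurrent model in which the current output depends on past outputs as well as past inputs; dropping the fed-back output terms recovers a purely feedforward mapping:

```python
import numpy as np

def recurrent_step(y_hist, x_hist, a, b):
    """One step of y(t) = f(y(t-1), y(t-2), ..., x(t), x(t-1), ...), Eq (2.3);
    here f is a tanh of a linear combination of past outputs and inputs."""
    return np.tanh(np.dot(a, y_hist) + np.dot(b, x_hist))

a = np.array([0.6, -0.2])          # weights on y(t-1), y(t-2)
b = np.array([0.9, 0.1])           # weights on x(t), x(t-1)

y = [0.0, 0.0]                     # initial output history
x = np.sin(0.3 * np.arange(50))    # an arbitrary input sequence
for t in range(2, len(x)):
    y.append(recurrent_step([y[-1], y[-2]], [x[t], x[t - 1]], a, b))
print(y[-5:])   # setting a = 0 would recover a purely feedforward mapping
```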

Feedforward neural networks are extensively used to develop non-linear functional mappings between sets of multi-dimensional input/output patterns and, in one sense, are another tool for performing non-linear regression analysis. Once the internal structure of the relationship is captured, depending on the learning procedures, the network may be used for pattern recognition/classification or as a non-iterative interpolation tool to predict the output signal for in-bound untrained input signals.

(35)

De printer heeft niet genoeg geheugen voor deze job.

Voer 00n van de volgende stappen uit en druk opnieuw af:

Klik Portability optimaliseren in het PostScript-dialoogvenster.

Zorg dat de hoeveelheid printergeheugen in het dialoogvenster Apparaatopties juist is.

Verlaag het aantal lettertypen in het document. Druk het document in delen.al.

(36)

CHAPTER 3

NON-LINEAR NORMALISATION FOR IMPROVED LEARNING AND GENERALISATION

Summary

The main goal of this chapter is to investigate possible similarities between biological and artificial neurones in their sensitivity characteristics. Chapter 3's objectives may be briefly summarised as:

- to investigate whether the mathematical rules governing the human sensory mechanisms are applicable to artificial neurones;
- to define, similarly to biological neural networks, absolute and differential thresholds for artificial neurones;
- to define the Just Noticeable Difference (JND) for ANNs and its relationship to an ANN's structural and learning characteristics;
- to imitate the behaviour of the human sensory mechanism in the non-linear pre-processing of input stimuli for artificial neurones;
- to propose a non-linear data pre-processing technique to improve the learning procedure and generalisation capability of ANNs.

3.1 INTRODUCTION

CPUs work millions of times faster than a single neurone within the human brain, and yet the current attempts to implement Artificial Neural Networks for sensing, storing, recognising and recalling patterns have not effectively approached the capabilities of the brain. Popular neural networks (Hopfield, 1982), (Hopfield & Tank, 1986), (Anderson, 1983), (Grossberg, 1982, 1986), (Rumelhart & McClelland, 1986),


(Rumelhart & Zisper, 1985), (McClelland & Rumelhart, 1981), (McClelland, 1985), (Kohonen, 1988), (Fukushima, 1988), (Sejnowski & Kienker, 1989), (Sejnowski & Rosenberg, 1986), (Kienker et al, 1986), (Lippmann, 1987) have mainly followed an idea originally proposed by Hebb (Hebb, 1949). In his approach, synaptic inputs to an element are summed and the element fires if a threshold is exceeded. The strength of a synaptic connection is enhanced only if both the pre-synaptic and post-synaptic elements fire.

Non-linear training algorithms are extensively used to adjust the weights of the

network for each neural input. The major drawback of using such algorithms is that they are computationally intensive and normally require many training iterations for

each set of patterns before convergence is achieved.

There have been many attempts to reduce the number of iterations by speeding up the

training convergence (Darken & Moody, 1991), (Fahlman, 1990), (Jacobs, 1988), (Leonard and Kramer, 1990), (Silva & Almeida, 1990), (Krose & Van der Smagt, 1993).

Little attention has been paid to input data normalisation methods and, consequently, to the nature of the input pattern and the distribution of data between the saturation levels of [0, 1] or [-1, 1].

Observations of the objective sensory mechanisms that provide the pattern recognition and recall capabilities of humans have proved that our sensory organs and stimulus signals play an important role in pre-processing the flow of observed information.

The results of studies using electrophysiologic, imaging and biochemical methods suggest that the biologic functioning networks transform the signaling patterns to

present imaging patterns of objects sensed in the environment. These functions involve non-Hebbian synaptic transformations, molecular regulation of ion channels and


3.1.1 SIGNAL RECEPTION AND PROCESSING WITHIN THE RETINA; NATURAL

NON-LINEAR SIGNAL PROCESSING

When a cloud passes over the sun on a fine day, we notice a decrease in the brightness of our surroundings, to which we soon adapt. Even with a hundredfold change in the light intensity, the perceived relative lightness or darkness and the colors of the surrounding objects change only slightly. When a circumscribed area of the retina, in a constant state of adaptation, is illuminated, there is an approximate logarithmic relation between the perceived subjective brightness of the light spot and its luminance. Microelectrode recordings have demonstrated that the same relation applies to the discharge rate of the on-center neurones. The discharge rate of off-center neurones is also approximately a function of the logarithm of the preceding negative intensity step at light-off. [Schmidt, Thews, 1989]

The above statements clearly indicate that the input signals to our visual sensors are all processed by layers of neurones in such a way that light at different luminous intensities can be perceived and identified by the brain. A similar logarithmic treatment of sound intensity and the sense of hearing can be found in (Schmidt, Thews, 1989).

To suggest how Artificial Neural Network designs might benefit from these biological learning networks (Grossberg, 1982, 1986), (Alkon et al, 1994), the next section will

give a brief overview of major information processing steps in biological sensory mechanisms.

3.2 SENSORY PHYSIOLOGY

An important rule of subjective sensory physiology, "specific sensory energies", stated by Johannes Müller 150 years ago, is:

"The nature of a sensation is not determined by the stimulus but by the sense organ that is stimulated."


It is usually easy to discover the optimal stimulus for a sense organ by observing its response to a range of stimuli; this turns out to be the stimulus requiring minimal energy to excite that organ. The sensor is the cell, or part of a cell, that is responsible for the transduction of stimuli into neural excitation. Much of the information received by proprioceptors and interoceptors and sent to the Central Nervous System (CNS) rarely or never reaches our consciousness, whilst exteroceptor signals are generally sensed and felt by us. Because of the inaccessibility of, and the small receptive sites on, the cell membrane, the nature of the molecular mechanisms underlying the transduction of the stimulus in a sensor is, in most cases, not completely understood (Schmidt & Thews, 1989), (Bell et al, 1980), (Thibodeau, 1987).

The sensor potentials in afferent nerve endings are found to have the following properties:

- They are produced in the endings of the nerve and not in the cells surrounding the nerve endings; these cells are part of the structure of the sense organ.
- The sensor potential is a graded response: the endings are either depolarised or hyperpolarised to different degrees by stimuli of different intensity. In some cases the process of transduction also involves an amplification process.
- The sensor potential is a local potential; it spreads over the membrane electrotonically and is not actively conducted.
- Sensor potentials can undergo spatial and temporal summation.

3.2.1 BASIC DIMENSIONS OF SENSATION

Sensations are traditionally considered to have four basic dimensions: intensity, quality, temporal extent and spatial extent. In terms of quality, the sensory modes are known as seeing, hearing, smelling and tasting, although the number of senses available to humans will always be a matter of interpretation.

3.2.2 INTENSITY

Various methods of psychophysics are adopted to study the intensity of a sensation. Fechner was the first to devise a useful technique for the quantitative measurement of subjective experience. He attempted to describe the quantitative relation between the physical intensity (φ) and the subjective strength of sensation (ψ).

The absolute threshold was defined as the smallest stimulus that is just capable of producing a particular sensation. In the range of supra-threshold stimuli another kind of threshold can be defined: the "Just Noticeable Difference" (JND). This is the amount by which one stimulus must differ from another in order for the difference to be sensed. In 1834 Weber suggested that the change in stimulus intensity that can just be detected (Δφ) is a constant fraction (c) of the intensity (φ) of the initial stimulus. This is expressed as the Weber function:

Δφ / φ = c    (3.1)

This relationship applies over a wide range of stimulus intensities for many sensory modalities and is a useful measure of the relative sensitivity of the sensory system; however, when the absolute threshold is approached, (c) tends to increase.

The relation between the Weber fraction and stimulus intensity is shown in Figure 3.1 for the loudness of tones. It is evident that in this case Weber's law begins to apply only when the stimulus reaches 40 dB above the absolute threshold, because the Weber function remains almost constant from this intensity on. Similar curves are obtained for other modalities.

Fig 3.1A The Weber fraction and Weber's law: the relation between the initial stimulus magnitude (φ) and the increase (Δφ) required to exceed the differential threshold for the sense of force.

Fig 3.1B Dependence of the Weber fraction on the intensity of the initial stimulus, for auditory stimuli.

Fechner made an attempt to define a scale of intensity of the sensation (w) where zero

on such a scale is the absolute threshold, the next stronger sensation is greater by precisely one JND and the next by another JND. Fechner's psychophysical law is expressed by:

$$\psi = k \log\frac{\varphi}{\varphi_0} \qquad (3.2)$$

where $\psi$ is the intensity of sensation, $k$ is a constant, $\varphi$ is the intensity of the stimulus and $\varphi_0$ is the intensity of the stimulus at the absolute threshold.

The validity of this law is severely restricted to the range where Weber's law applies. Moreover, almost a hundred years later it was shown (Stevens, 1975) that the JND is not a constant unit, i.e. a JND in intensity does not produce the same difference in strength of sensation for all initial intensities. Therefore, $\psi$ in Fechner's law is more an expression of discriminability than of intensity of sensation.
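As a worked illustration of Equation (3.2), the short sketch below evaluates the sensation magnitude for a few stimulus intensities; the constant $k$ and the threshold $\varphi_0$ are illustrative assumptions only.

```python
import math

# Worked sketch of Fechner's law, Eq. (3.2): psi = k * log(phi / phi0).
# k and phi0 are illustrative assumptions, not values from the thesis.
def fechner_sensation(phi: float, k: float = 1.0, phi0: float = 0.01) -> float:
    return k * math.log(phi / phi0)

for phi in (0.01, 0.1, 1.0, 10.0):
    # equal ratios of stimulus intensity yield equal steps of sensation
    print(f"phi = {phi:6.2f} -> psi = {fechner_sensation(phi):.3f}")
```

This logarithmic compression is the same principle that underlies the decibel scale discussed later in this chapter.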

3.2.3 METHODS OF DETERMINING THRESHOLDS

There are several methods that can be used to determine thresholds, such as the method of limits, the method of adjustment and the psychometric function. In the method of limits, the initial intensity of

the stimulus is set so high that the subject easily perceives it; it is then gradually reduced until the stimulus reaches a sub-threshold value. The test is then repeated but

using a very weak stimulus, which is increased until the threshold is reached. This

process is repeated a number of times and the mean of the resulting threshold values is

taken as an estimate of true threshold value.
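A minimal sketch of this procedure is given below, assuming a hypothetical noisy detector with a true threshold of 0.3; descending and ascending series alternate and the resulting estimates are averaged.

```python
import random

# Sketch of the method of limits against a hypothetical noisy detector
# (true threshold 0.3, noise sd 0.02 -- all values are illustrative).
def detects(stimulus: float, true_threshold: float = 0.3) -> bool:
    return stimulus + random.gauss(0, 0.02) > true_threshold

def method_of_limits(runs: int = 10, step: float = 0.01) -> float:
    estimates = []
    for run in range(runs):
        if run % 2 == 0:                  # descending series
            s = 1.0
            while detects(s):
                s -= step
        else:                             # ascending series
            s = 0.0
            while not detects(s):
                s += step
        estimates.append(s)
    return sum(estimates) / len(estimates)  # mean threshold estimate

print(f"estimated threshold: {method_of_limits():.3f}")
```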


3.3 ARTIFICIAL NEURAL NETWORKS

Our understanding of the human nervous system inspired the concept and development of Artificial Neural Networks (ANNs). Psychophysics and other methods of assessing the sensitivity of our senses and the intensities of stimuli show that our sensory organs have adapted themselves so that they are able to receive and perceive the surrounding physical phenomena. For example, the eyes inform us less about absolute brightness levels and more accurately about differences in brightness within a scene, and hence about the boundaries of its individual elements. This sensitivity to differences, known as contrast enhancement by lateral inhibition, is a great help when extracting visual features of a scene. This understanding has also helped us in the measurement of physical and environmental data: the decibel and phon scales, logarithmic measurement units used in sensory physiology, are based on Fechner's law, Equation (3.2).

This study is based on the assumption that artificial neurones share the same behaviour when responding to the characteristics of the intensity of stimuli. If so, it may be possible to imitate such sensory mechanisms with the data processing capabilities of ANNs.

3.3.1 THE STANDARD NORMALISATION PROCEDURE

The standard and widely used procedure for normalising the input signal of a feedforward multi-layer ANN is to distribute the input data uniformly between the upper and lower activation threshold limits. Equation (3.3) shows how this method is commonly applied to normalise input data between 0 and 1, consistent with the implementation of a sigmoid activation function:

$$\bar{x}_i = \frac{x_i - x_{\min}}{x_{\max} - x_{\min}} \qquad (3.3)$$

where $\bar{x}_i$ is the normalised input, $x_i$ the input value, and $x_{\min}$ and $x_{\max}$ are the lowest and highest values in the input range respectively. Initial weights are primarily adjusted to a very small random value using a normal or uniform distribution technique.
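A minimal sketch of this linear (min-max) normalisation, using illustrative values:

```python
import numpy as np

# Sketch of the linear (min-max) normalisation of Eq. (3.3);
# the data values are illustrative.
def normalise(x: np.ndarray) -> np.ndarray:
    return (x - x.min()) / (x.max() - x.min())

raw = np.array([12.0, 30.0, 55.0, 80.0])
print(normalise(raw))  # every value mapped into [0, 1]
```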


The justification for these two procedures is to avoid saturation of the artificial neurones. Saturation would result in operation in regions where the derivative of the neurone output function is very small; consequently, if backpropagation or any other algorithm that uses first-derivative information is used to train the network, training will be very slow.

The main drawback of a linear normalisation method is that, for some real-world data sets, the difference between two normalised data points may not be large enough to stimulate the neurones. Hence it will take much longer for the ANN to learn to discriminate between two individual but relatively close patterns, as the short numeric sketch below illustrates.
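This loss of contrast can be seen with hypothetical figures: two readings of 900 and 905 differ meaningfully in context, yet after linear scaling over a wide range their normalised gap is tiny.

```python
import numpy as np

# Hypothetical illustration of the drawback of linear normalisation:
# the gap between 900 and 905 almost vanishes once the data are
# scaled over the full range [850, 1500].
raw = np.array([850.0, 900.0, 905.0, 1500.0])
scaled = (raw - raw.min()) / (raw.max() - raw.min())
print(f"normalised gap: {scaled[2] - scaled[1]:.4f}")  # ~ 0.0077
```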

3.3.2 INPUT DATA (STIMULI) SELECTION

The threshold levels for ANNs are determined by the saturation levels of their activation functions: [0, 1] for sigmoidal functions or [-1, 1] for tanh or similar functions. The XOR function is selected as the training input due to its popularity within the neural network research community and as a pattern where the network learns to classify two crossing decision surfaces. The input/output structure for the XOR function can be represented by (3.4).

3.3.3 NETWORK SELECTION

To investigate the possibility of defining a JND in ANNs, the most commonly used ANN representation was selected; it is shown in Fig 3.2. Standard backpropagation (with momentum) was selected for training, and the number of neurones in the hidden layer was treated as a variable in order to study its effect on network sensitivity.

$$\begin{array}{cc|c} in_1 & in_2 & out \\ \hline 0 & 0 & 0 \\ 0 & 1 & 1 \\ 1 & 0 & 1 \\ 1 & 1 & 0 \end{array} \qquad (3.4)$$


Fig 3.2 Fully connected 3-layered feed-forward ANN with sigmoidal activation function
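A minimal sketch, not the code used in this study, of the network of Fig 3.2 trained on the XOR patterns of (3.4) by standard backpropagation with momentum; the learning rate, momentum term, weight-initialisation variance and random seed are illustrative assumptions.

```python
import numpy as np

# Sketch of a fully connected 2-h-1 feedforward ANN with sigmoid
# activations, trained on XOR by backpropagation with momentum.
rng = np.random.default_rng(0)

X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
T = np.array([[0], [1], [1], [0]], dtype=float)

h, lr, mom = 2, 0.5, 0.9                     # hidden units, rate, momentum
W1 = rng.normal(0, 0.5, (2, h)); b1 = np.zeros(h)
W2 = rng.normal(0, 0.5, (h, 1)); b2 = np.zeros(1)
dW1 = np.zeros_like(W1); db1 = np.zeros_like(b1)
dW2 = np.zeros_like(W2); db2 = np.zeros_like(b2)

sig = lambda z: 1.0 / (1.0 + np.exp(-z))

for epoch in range(25000):
    # forward pass
    H = sig(X @ W1 + b1)
    Y = sig(H @ W2 + b2)
    err = Y - T
    if np.mean(err ** 2) < 1e-3:
        break
    # backward pass through the sigmoid derivatives
    d_out = err * Y * (1 - Y)
    d_hid = (d_out @ W2.T) * H * (1 - H)
    # weight updates with momentum
    dW2 = mom * dW2 - lr * (H.T @ d_out); W2 += dW2
    db2 = mom * db2 - lr * d_out.sum(0);  b2 += db2
    dW1 = mom * dW1 - lr * (X.T @ d_hid); W1 += dW1
    db1 = mom * db1 - lr * d_hid.sum(0);  b1 += db1

print(f"stopped after {epoch} epochs: {Y.round(3).ravel()}")
```

With two hidden neurones the network typically converges within a few hundred epochs, consistent with the figure of roughly 500 iterations quoted in section 3.3.4, although an unlucky initialisation can leave it in a poor local minimum.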

3.3.4 DEFINITION OF ABSOLUTE THRESHOLDS AND JUST NOTICEABLE DIFFERENCE

Similar to the method of limits described in section 3.2.3, the following technique is used to determine the absolute threshold and JND for each ANN.

Considering the fact that our perception of events in the environment should ideally remain the same at different intensity levels of stimuli, the output vector for the XOR function is kept the same as presented in Equation (3.4). The low value of zero for all inputs was assumed to be an obvious sub-threshold stimulus that the ANN is not able to sense. An ANN with two neurones in its hidden layer is capable of learning the XOR function in approximately 500 iterations when using a standard backpropagation technique. A maximum of 25000 iterations was therefore selected as the ultimate number of runs that should enable the ANN to make a decision on the noticeability of the input stimulus.

The following matrices show the input stimuli at each training run:

$$\begin{bmatrix} 0 & 0 \\ 0 & \varphi_0 \\ \varphi_0 & 0 \\ \varphi_0 & \varphi_0 \end{bmatrix},\quad \begin{bmatrix} 0 & 0 \\ 0 & \varphi_0 + \Delta\varphi_1 \\ \varphi_0 + \Delta\varphi_1 & 0 \\ \varphi_0 + \Delta\varphi_1 & \varphi_0 + \Delta\varphi_1 \end{bmatrix},\quad \ldots,\quad \begin{bmatrix} 0 & 0 \\ 0 & \varphi_0 + \sum_{i=1}^{n}\Delta\varphi_i \\ \varphi_0 + \sum_{i=1}^{n}\Delta\varphi_i & 0 \\ \varphi_0 + \sum_{i=1}^{n}\Delta\varphi_i & \varphi_0 + \sum_{i=1}^{n}\Delta\varphi_i \end{bmatrix} \qquad (3.5)$$


where $\varphi_0$ is the absolute threshold and each $\Delta\varphi_i$, $i = 1$ to $n$, is the JND at a particular threshold level. Graphically, the growth of the intensity of the stimuli may be represented as shown in Fig 3.3.


Fig 3.3 The growth of the intensity of the stimuli used for the XOR function

Since it is possible to save and reload the original randomised weights at the beginning of each training run, unlike when using the method of limits on living subjects, it is not necessary to repeat the same patterns from sub-threshold to supra-threshold and vice versa. However, as the weights are each drawn from a Gaussian random family, seven arbitrary randomised weight sets were tried for each run and the results (i.e. absolute thresholds and JNDs) averaged and presented in Fig 3.4.
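The threshold search itself may be sketched as follows; the training routine, learning rate, convergence criterion and step size are illustrative stand-ins, with the saved initial weights reloaded before every run as described above.

```python
import numpy as np

# Sketch of the absolute-threshold search: the "high" input level is
# raised from zero in small steps, restarting each run from the same
# saved weights, until XOR is learnt within the 25000-iteration budget.
sig = lambda z: 1.0 / (1.0 + np.exp(-z))

def learns_xor(high, w0, max_iter=25000, lr=0.5):
    """True if the net learns XOR with input levels {0, high}."""
    W1, b1, W2, b2 = (a.copy() for a in w0)   # reload the saved weights
    X = np.array([[0, 0], [0, high], [high, 0], [high, high]], dtype=float)
    T = np.array([[0.0], [1.0], [1.0], [0.0]])
    for _ in range(max_iter):
        H = sig(X @ W1 + b1); Y = sig(H @ W2 + b2)
        if np.mean((Y - T) ** 2) < 1e-2:
            return True
        d_out = (Y - T) * Y * (1 - Y)
        d_hid = (d_out @ W2.T) * H * (1 - H)
        W2 -= lr * (H.T @ d_out); b2 -= lr * d_out.sum(0)
        W1 -= lr * (X.T @ d_hid); b1 -= lr * d_hid.sum(0)
    return False

rng = np.random.default_rng(1)
w0 = (rng.normal(0, 0.5, (2, 2)), np.zeros(2),
      rng.normal(0, 0.5, (2, 1)), np.zeros(1))
phi = 0.0
while not learns_xor(phi, w0):   # slow but faithful to the iteration budget
    phi += 0.01                  # step the stimulus intensity up
print(f"absolute threshold phi_0 ~ {phi:.2f}")
```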

3.3.5 PARAMETERS AFFECTING ABSOLUTE THRESHOLDS AND JNDs

In humans, JNDs vary from one individual to another. A musician is far better qualified to discriminate between relatively close frequencies and tones; similarly, a painter can recognise colours and shades better than an ordinary individual. It is not possible to identify the parameters responsible for such differences in humans, since the underlying mechanisms are not known. However, in the case of ANNs, their structure and internal data

transduction is well known. To investigate parameters affecting the network response

to stimuli, the following alterations are considered (a sketch of the resulting experimental grid follows the list):

- A change in the structure of the ANNs in terms of the number of neurones in the hidden layer (h), namely 3, 5, 10 and 20.
- Alteration of the variance ($\sigma^2$) of the random distribution used to initialise the weights W1(3, h) and W2(h, 1) to 0.2, 0.5 and 0.9. The mean ($\mu$) of these distributions was assumed to be zero, and the learning rate and momentum were fixed.
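A minimal sketch of this grid, assuming zero-mean Gaussian initialisation whose standard deviation is the square root of the stated variance, and assuming the first dimension of W1 covers the two inputs plus a bias term:

```python
import numpy as np

# Sketch of the experimental grid of section 3.3.5: hidden-layer sizes
# crossed with weight-initialisation variances (zero-mean Gaussian).
rng = np.random.default_rng(3)

for h in (3, 5, 10, 20):
    for var in (0.2, 0.5, 0.9):
        std = np.sqrt(var)                  # variance -> standard deviation
        W1 = rng.normal(0.0, std, (3, h))   # 2 inputs + bias, as W1(3, h)
        W2 = rng.normal(0.0, std, (h, 1))   # hidden to output, as W2(h, 1)
        # ... run the threshold/JND search of section 3.3.4 here ...
        print(f"h={h:2d}, var={var}: W1 {W1.shape}, W2 {W2.shape}")
```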

3.3.6 TEST RESULTS I: ABSOLUTE THRESHOLDS

As discussed earlier, the stimuli were gradually increased from zero until the ANN just started to notice, or learn, the input pattern; these intensities were noted as absolute thresholds ($\varphi_0$). A summary of the averaged results is shown in Fig 3.4.

Fig 3.4 Absolute thresholds in ANNs, plotted against the number of neurones in the hidden layer (3, 5, 10 and 20) for initialisation variances of 0.2, 0.5 and 0.9

(47)

Chapter 3: Non-linear Normalisation for Improved Learning and Generalisation Page 52

It is clearly observed from Fig 3.4 that, by increasing the number of neurones in the hidden layer and/or using a higher value of the variance $\sigma^2$, the absolute threshold $\varphi_0$ declines almost linearly. This demonstrates that an ANN is more discriminative to input stimuli when the number of neurones in the hidden layer and the initialisation variance are increased.

3.3.7 TEST RESULTS II: VALIDITY OF WEBER FUNCTION

A close study of the results of the second test implies that the Weber function (3.1) is not applicable to ANNs, since the relationship between JNDs and absolute thresholds is not linear. These results are presented in Figs 3.5, 3.6, 3.7 and 3.8 for the three different variances.

Fig 3.5 $\Delta\varphi/\varphi$ for an ANN with 3 neurones in the hidden layer


Fig 3.6 $\Delta\varphi/\varphi$ for an ANN with 5 neurones in the hidden layer

Fig 3.7 $\Delta\varphi/\varphi$ for an ANN with 10 neurones in the hidden layer



Fig 3.8 $\Delta\varphi/\varphi$ for an ANN with 20 neurones in the hidden layer

3.3.8 TEST RESULTS III: DEFINITION OF JNDs FOR ANNs

In the final part of this investigation, the JNDs were determined for the ANNs. Once an absolute threshold had been found for each combination of the number of neurones in the hidden layer and the initialisation variance, the stimulus was gradually increased until the ANN was again capable of classifying the input pattern as an XOR function; obviously, the low and high values of the input stimuli are no longer 0 and 1. The results of these tests are shown in Figs 3.9, 3.10, 3.11 and 3.12; one possible sketch of the procedure is given below.
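One plausible reading of this procedure can be sketched as follows; the predicate learns(low, high) stands in for a training run in the spirit of learns_xor() above, each newly noticeable level is assumed to become the low level of the next search, and the step size and demonstration predicate are purely illustrative.

```python
# Hedged sketch of the JND search; the exact protocol is an assumption.
def find_jnds(learns, phi0, n_jnds=7, step=0.01, top=1.0):
    jnds, low = [], phi0
    while len(jnds) < n_jnds and low < top:
        high = low
        while not learns(low, high) and high < top:
            high += step              # raise the stimulus until it is noticed
        jnds.append(high - low)       # one just-noticeable difference
        low = high                    # the next search starts from here
    return jnds

# toy stand-in predicate: a difference is "noticed" once it exceeds
# 0.05 plus 10% of the lower level (purely illustrative)
demo = lambda low, high: (high - low) > 0.05 + 0.1 * low
print(find_jnds(demo, phi0=0.2))
```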

Fig 3.9 JNDs for an ANN with 3 neurones in the hidden layer


Fig 3.10 JNDs for an ANN with 5 neurones in the hidden layer


Fig 3.11 JNDs for an ANN with 10 neurones in the hidden layer

