
Technische Universiteit Delft

Hoofdrapport, 20-12-1995, R.F. van Kuilenburg, OEMO 95/27

TNO report

FEL-95-S307 (OEMO 95/27)

NOT FOR PUBLICATION

TNO Physics and Electronics Laboratory
Oude Waalsdorperweg 63
PO Box 96864
2509 JG The Hague
The Netherlands
Phone +31 70 326 42 21
Fax +31 70 328 09 61

All rights reserved.

No part of this publication may be reproduced and/or published by print, photoprint, microfilm or any other means without the previous written consent of TNO.

In case this report was drafted on instructions, the rights and obligations of the contracting parties are subject to either the Standard Conditions for Research Instructions given to TNO, or the relevant agreement concluded between the contracting parties.

Submitting the report for inspection to parties who have a direct interest is permitted.

© 1996 TNO

Fault Diagnosis using Neural Network Technology

Date: 20-12-1995
Author(s): R.F. van Kuilenburg
Classification: UNCLASSIFIED
  Title: Unclassified
  Managementuittreksel: Unclassified
  Abstract: Unclassified
  Report text: Unclassified
  Appendices: Unclassified

The TNO Physics and Electronics Laboratory is part of TNO Defence Research.

No of pages: 139 (incl appendices, excl RDP & distribution list)
No of appendices: 4

All information which is classified according to Dutch regulations shall be treated by the recipient in the same way as classified information of corresponding value in his own country. No part of this information will be disclosed to any party.

The classification designation Ongerubriceerd is equivalent to Unclassified, Stg. Confidentieel is equivalent to Confidential and Stg. Geheim is equivalent to Secret.

This report has been written within the framework of a traineeship. TNO-FEL is not responsible for its content, possible conclusions or recommendations.

Management summary (Managementuittreksel)

Nowadays much research is being carried out into fault diagnosis systems on board ships. The British and Canadian navies are doing a great deal of work in this field and have already made considerable progress. It is expected that fault diagnosis systems will play a modest but essential role on board ships in the future. At Delft University of Technology a research project has been started with the aim of gathering knowledge in the field of fault diagnosis on board ships. This ICMOS (Intelligent Condition Monitoring Systems) project covers several areas of attention; prototype condition monitoring systems are being developed for, among other things, diesel engines and compressor refrigeration systems. As part of this project, this traineeship investigated the use and the properties of neural networks for a fault diagnosis system. This system must be able to identify faults that can occur in a compressor refrigeration system, and this identification must have taken place before any damage to the system or loss of performance has occurred.

The conclusion of this investigation is that neural networks have a number of properties that are useful for fault diagnosis systems: the parallel processing of signals, insensitivity to noise and the self-learning capabilities are some of these. It has been demonstrated that various faults can be recognised quickly. However, not all network forms produced the same positive results. Additional problems were found in the need to have large data sets available, the impossibility of proving that a fault diagnosis system gives good results under all circumstances, and the black-box nature of neural networks. These problems can largely be solved by careful training. The development and operational deployment of fault diagnosis systems is a lengthy process. This implies that good cooperation between developer and user throughout the entire service life of the system is required.


Abstract

In this report a fault diagnosis system using neural technology is investigated. The focus is on the practical use of neural networks. Fault diagnosis is very important in areas where malfunctioning of equipment can have serious consequences. The fault diagnosis system treated in this report monitors a refrigeration plant (as the system for which a fault diagnosis must be performed). Refrigeration systems are often critical parts of a larger system; they provide, for example, cooling for the cargo or weapon systems. At the TUD fast and reliable models are being developed for refrigeration systems to be used in fault diagnosis systems. At the TUD a research programme has also been started to investigate the use of neural technology in fault diagnosis. The power of neural technology is that no deep knowledge is necessary for building a successful fault diagnosis system: to train neural networks, only data is necessary.

It was found that the principal component analysis can be very helpful in the design stage of a fault diagnosis system. It can rapidly show the basic properties of data in a comprehensive manner. Some problems with the principal component analysis are its sensitivity to noise and its dependence on the data representation (it is not scale invariant).

Neural networks can be used with success in fault diagnosis systems. The main reason for this conclusion is their ability to autonomously develop complex relations between input signals and desired output signals. This enables neural networks to recognise faults with very complex symptoms. The 0.1% error rate can be reached, but only if there is enough data that covers the operating range.

Neural networks can also be used to form a reference module to simulate a system. The neural network simulation was neither better nor worse in performance than the normal methods (a 5% error could be obtained). The normal feedforward network trained with the Marquardt method was found to be the best choice. Recurrent networks are less suited due to stability problems.

The main disadvantage of neural networks is the long training time and the huge amount of data that is needed to properly train a network. Also, to effectively train a neural network a large amount of knowledge about neural networks is necessary. The performance of a neural network is only guaranteed within the range of the training and test set.

It became clear during this investigation that the development of a fault diagnosis system is not a simple task. Fault diagnosis systems must react to small deviations of signals. This implies that each fault diagnosis system has to be periodically tuned and updated for optimal performance. Optimal performance can only be expected after some operational time. This makes a close cooperation between user and developer necessary.


Contents


1. Introduction 6

2. Refrigeration systems 7

2.1 General working of a refrigeration system 7

2.2 Used refrigeration plant I 9

2.3 Used refrigeration plant II 12

3. Fault diagnosis 15

3.1 Basic requirements 15

3.2 Fault Diagnosis Module 17

3.3 The Reference Module 18

3.4 The role of fault diagnosis 19

3.5 Existing Fault Diagnosis Systems 20

4. Principal component analysis 21

4.1 Theoretical Basis 21

4.2 Results System I 25

4.3 Results System II 31

4.4 Conclusions 33

5. Neural Networks, General Description and Applications 34

5.1 General Description of Neural Networks 34

5.2 Advantages and Disadvantages of neural networks 37

5.3 Applications with Neural Networks 39

6. Neural Network Topology and Definitions 41

6.1 Definitions 41

6.2 Symbols for the different neurons 42

6.3 Topology 43

6.4 Different transfer functions 45

7. Training Neural Networks 48

7.1 General formulation of training 48

7.2 Calculating the gradient of the error surface 50

7.3 Training algorithms general 56

7.4 Method of Steepest descent 58

7.5 Levenberg-Marquardt method 61



8. Results of Backpropagation training for classification 63

8.1 Training Setup 63

8.2 Used networks 64

8.3 Variation of parameters 65

8.4 Error calculation 66

8.5 Results of Training 68

8.6 Threshold Values 80

8.7 Final Conclusions for classifying modules 81

9. Simulation of dynamic systems using NN 82

9.1 Simulation with neural networks 89

9.2 The Backpropagation training 86

9.3 Results of the backpropagation training 87

9.4 Conclusion 103

10. Kohonen network 104

10.1 Kohonen network, Theory 104

10.2 Kohonen network, Training 105

10.3 Results of Kohonen training with the untransformed data 109

10.4 Results of the Kohonen network 112

10.5 Conclusion 114

11. Final Conclusion 115

12. Bibliography

Appendix A: Pressure Enthalpy Diagram freon 22

Appendix B: Definitions, Symbols, Notation

Appendix C: Definition of noise level

Appendix D: Radial Basis Networks

1. Introduction

In this report a fault diagnosis system using neural technology is investigated. The focus is on the practical use of neural networks. For a more theoretical view of neural networks the reader is referred to the literature; an overview of this literature can be found in the bibliography at the end of this report.

Fault diagnosis is very important in areas where malfunctioning of equipment can have serious consequences. Warships and nuclear power plants are some examples of such critical systems.

The fault diagnosis system treated in this report monitors a refrigeration plant (as the system for which a fault diagnosis must be performed). Refrigeration systems are often critical parts of a larger system; they provide, for example, cooling for the cargo or weapon systems. To build a working fault diagnosis system, deep knowledge is necessary to build the necessary models and fault matrices. These models are a necessary part of many fault diagnosis systems. The problem with refrigeration systems is that no fast and at the same time reliable models exist. At the TUD fast and reliable models are being developed for refrigeration systems to be used in fault diagnosis systems. At the TUD a research programme has also been started to investigate the use of neural technology in fault diagnosis. The power of neural technology is that no deep knowledge is necessary for building a successful fault diagnosis system: to train neural networks, only data is necessary. This eliminates the necessity of using deep knowledge. This is also a major disadvantage, because when the neural system fails no accurate explanation can be given about the cause of the failure.

The goal of this report is to provide an overview of refrigeration systems, fault diagnosis and neural networks, and of the combination of these.

The structure of the report follows the development path of a real fault diagnosis system. First the system for which a fault diagnosis system must be built is described (chapter two). A short description of fault diagnosis is given in chapter three. After this description an analysis of the data of this system is done (chapter four), in accordance with a real development path. When the data analysis has been done a suitable neural network must be chosen. To make a good choice of network form, some basic knowledge about neural networks must be present. This knowledge is given in chapters five, six and seven. Finally, when the network forms have been chosen, training can begin. The results of this training are given in chapters eight, nine and ten. At the end a conclusion is given about the suitability of the different networks for fault diagnosis and reference engines (chapter eleven).

2. Refrigeration systems

2.1 General working of a refrigeration system

To get a basic understanding of the process for which a fault diagnosis system is built, a general outline of a refrigeration system is given. The system boundaries and variables, and also the detectable faults, are given. All the components are modelled as black box models, because for neural networks only the data is of importance, not the underlying process which generated the data.

A refrigeration plant has the basic system scheme and basic components shown in figure 2-1.

[figure 2-1, basic scheme of a refrigeration plant]

The whole purpose of a refrigeration system is to transport energy (Q2) from a relatively low temperature level to a relatively high temperature level (Q1). The transportation of this energy costs energy due to mechanical losses in the various components and thermodynamic losses. The compressor (1) compresses the freon (refrigerant) which comes from the evaporator (4). To compress the freon an amount of mechanical energy is needed (W); this energy causes an increase of the enthalpy of the freon. At this high pressure the (still) gaseous freon is condensed in the condenser (2), releasing energy (Q1) to a cooling medium at a relatively high temperature level. Now the pressure of the freon is adiabatically decreased in the choking device (3). At this low pressure the freon is evaporated in the evaporator (4) at a relatively low temperature level.

[figure 2-1 legend: 1. Compressor, 2. Condenser, 3. Choking device, 4. Evaporator]
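As a recap of the description above, the three energy flows obey a simple steady-state balance (a standard textbook relation, summarised here for convenience):

$$Q_1 = Q_2 + W, \qquad \mathrm{COP} = \frac{Q_2}{W}$$

where the coefficient of performance COP expresses how much heat is transported per unit of mechanical energy supplied to the compressor.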

It is customary to plot the process in a pressure-enthalpy diagram of the refrigerant (freon), figure 2-2.

[figure 2-2, refrigeration process plotted in a log(p)-h diagram of freon 22]

In this diagram the co-existence region of liquid and gas of the refrigerant is drawn. After compression (1) the gas is superheated. In the condenser (2) the gas is first cooled to a saturated gas and then condensed to liquid. Often the temperature (energy) of the gas is lowered beyond the point where just all the gas has become liquid. This is done to prevent the liquefied gas from evaporating before it reaches the evaporator due to pressure losses in the pipes or height differences. This is called "subcooling". After the choking valve (3) the gas is evaporated (4). The temperature of the gas is increased more than absolutely necessary, to ensure that only gas reaches the compressor. The refrigeration system cannot be controlled if no superheating exists, and if liquid reaches the compressor, the compressor can break down (liquid is hard to compress). The extra temperature increase of the gas is called "superheating". See appendix A for a complete diagram of freon 22 (a trademark of DuPont).

A refrigeration system can function with only the components mentioned above; in practice the following additional components can be necessary:
- filter/dryer of the cooling medium
- oil separator
- control components
- additional heat exchangers
- buffer vessels

2.2 Used refrigeration plant I

In this chapter a short description of the test plant and the measurements is given; for a more detailed description see [Grimmelius, 1992].

The installation is a water chiller plant situated at the main office of Van Buuren - Van Swaay in Zoetermeer.

The unit consists of three six-cylinder compressors, two of which have a controllable cylinder bank, a water cooled condenser, a water chilling evaporator, refrigerant piping and valves, and an electrical control panel. The process diagram [Grimmelius, 1992] is given in figure 2-3.

2.2.1 System data

Cooling capacity: 270 kW
Electrical power: 3 x 27 kW
Cold water temperature: 6.5 - 12 °C

2.2.2 System Layout

[figure 2-3, system layout of the refrigeration plant located at Van Buuren - Van Swaay]

[figure 2-3 legend: 1. Compressor, 2. Condenser, 3. Filter/Dryer, 4. Solenoid, 5. Spy glass, 6. Expansion valve, 7. Evaporator, 8. Cooling tower, 9. Cooling water circulation pump, 10. Chilled water circulation pump, 11. Building air-conditioning, 12. Electromotor, 13. Oil pump, 14. Crankcase, 15. Oil injection bearings]

2.2.3 Measured Parameters

table 2-1, measured parameters refrigeration system I

nr | parameter | unit
1 | Suction pressure | bar
2 | Oil pressure | bar
5 | Pressure decrease over filter | bar
6 | Pressure after evaporator | bar
7 | Crankcase pressure | bar
8 | Pressure after compressor | bar
9 | Temperature before compressor | °C
10 | Temperature after compressor | °C
11 | Oil temperature compressor I | °C
12 | Temperature freon after condenser | °C
13 | Temperature freon before expansion valve | °C
14 | Temperature freon before evaporator | °C
15 | Temperature after evaporator | °C
16 | Temperature cooling water - out | °C
17 | Temperature cooling water - in | °C
18 | Temperature chilled water - in | °C
19 | Temperature chilled water - out | °C
20 | Current compressor I | A
21 | Current compressor II | A
22 | Number of cylinders reference voltage | V
23 | Oil temperature compressor | °C
24 | Room temperature | °C


2.2.4 Remarks on the measurements of the introduced faults

The data of the following faults were measured by Grimmelius (the codes follow those used in the report by [Grimmelius, 1992]).

Fault 1: 1.1a Compressor, increased resistance at suction pipe.
12 cylinders in use, 240 measurements done.
A valve was closed in the suction pipe of the compressor.

Fault 2: 1.5a Compressor output, increased resistance.
12 cylinders in use, 150 measurements done.
A valve was closed in the discharge pipe of the compressor.

Fault 3: 2.2e Condenser, cooling water side, too little cooling water.
12 cylinders in use, 160 measurements done.
The flow of the cooling water was reduced.

Fault 4: 3.1a Fluid trajectory (except filter), element increased resistance.
12 cylinders in use, 60 measurements done.
A valve placed in the freon circuit was partially closed.

Fault 5: 4.2e Expansion valve, bad contact temperature sensor.
10 cylinders in use, 150 measurements done.
The temperature sensor was isolated from the freon pipe.

Fault 6: 5.2a Reduced chilled water flow.
10 cylinders in use, 105 measurements done.
A valve was closed in the chilled water lines.

For a detailed description of the faults and the measurements see [Grimmelius, 1992a]. Of these measured faults only fault 1, fault 2 and fault 3 are used extensively in this report.

2.3 Used refrigeration plant II

The installation is a water chiller, used for cooling the research rooms at Van Buuren - Van Swaay. For this investigation the system is coupled to a reheater, in order to simulate the desired workloads for the refrigeration system. The installation is filled with freon 22, with a lowest temperature of -1 °C. The condenser is cooled with fresh water, of which the flow can be controlled by a valve. For more information on this refrigeration system see [vanderHeiden, 1994] and [vanKuilenburg, 1995].

The process diagram of the refrigeration system is given in figure 2-4.

[figure 2-4, system layout of the laboratorium plant located at Van Buuren - Van Swaay]

2.3.1 System data

Electrical power: 14 kW
Cooling capacity: 80 kW
Cold water temperature: 3.5 - 9 °C
Cooling water temperature: 20 - 35 °C
Amount of freon: 12.4 kg

2.3.2 Measured Parameters

table 2-2, measured parameters refrigeration system II

parameter | unit
Oil pressure | bar
Suction pressure compressor | bar
Discharge pressure compressor | bar
Pressure after condenser | bar
Pressure before expansion valve | bar
Pressure before evaporator | bar
Crankcase pressure | bar
Room temperature | °C
Oil temperature compressor | °C
Suction temperature compressor | °C
Discharge temperature compressor | °C
Liquid temperature after condenser | °C
Temperature before expansion valve | °C
Temperature before evaporator | °C
Temperature crankcase | °C
Temperature cooling water in | °C
Temperature cooling water out | °C
Temperature chilled water in | °C
Temperature chilled water out | °C
Flow freon | kg/s
Flow cooling water | m3/h
Flow chilled water | m3/h
Electrical power compressor | kW

2.3.3 Remarks on the measurements of the introduced faults

The refrigeration system was fitted with extra piping and valves to make it possible to simulate a large number of faults.

table 2-3, measured faults of refrigeration system II

Number | Main part involved | Description
1 | Compressor | Increased resistance suction side
2 | Compressor | Increased resistance discharge side
3 | Compressor | Main power, one phase disconnected
4 | Condenser | Too much cooling water
5 | Condenser | Too little cooling water
6 | Liquid line / Expansion valve | Increased resistance
7 | Liquid line / Expansion valve | No pressure correction
8 | Liquid line / Expansion valve | Valve stuck
9 | Evaporator | Leakage over evaporator
10 | Evaporator | Increased resistance water side

For more information about the precise location and specifications see [vanKuilenburg, 1995]. All these faults (except compressor, one phase disconnected) were measured in four different working points of the water chiller (table 2-4).

table 2-4, different settings of the four working points

[values only partly legible in the source: condenser pressures 15.32 / 15.32 / 17.30 / 17.20 bar, cooling water flows 2.02 / 2.60 / 2.02 / 1.62 m3/h, chilled water flows 11.1 - 11.2 m3/h, and water temperatures between 11.0 and 20.5 °C for the working points A to D]

3. Fault diagnosis

In this chapter the general layout of and demands on a fault diagnosis system are given. It is not the goal to describe a fault diagnosis system (FDS) down to the level of programming code, but only to mention the points that are of value for this report, i.e. the points that are related to neural networks. First the basic requirements are given. Secondly a more detailed description of the FDS is given. The last part deals with existing FDSs.

3.1 Basic requirements

The criteria which are important when the decision must be made to develop a FDS are given below. The basic function of fault diagnosis is to register an alarm when an abnormal condition develops in the monitored system and to identify the failed components. These criteria can be used to assess the probability of success of a FDS to be developed and to rate the existing FDSs that are already in use. When any of these criteria cannot be met in a satisfactory way, then in most cases the FDS can be regarded as useless. These criteria are partially taken from [Patton, 1986].

1. Overall cost
The cost of a FDS must be compensated by the savings that can be achieved when using a FDS. This means that a FDS may be very expensive when the savings that can be achieved are also very large. (This is in contrast with [Bonnier, 1994], who stated that a FDS must be cheap by itself.)

2. Promptness of detection
Faults have to be recognised before any damage (and preferably any effect on the system performance) has occurred.

3. False alarm rate
A high false alarm rate during normal (healthy) operation is unacceptable because it quickly leads to a lack of confidence in the detection system.

4. Missed fault detection
When a FDS cannot detect a fault that has major consequences for the operation of the system at an early stage, then it is useless. The faults that a FDS must detect are the faults that have a major effect on the functioning of the system; faults with less impact on the system have a lower priority.

5. Sensitivity to slowly evolving faults
The system must be able to detect faults that develop only slowly in time, such as fouling of piping or wear.

6. Incorrect fault identification
Identification is (for most conventional systems) the second step after diagnosing that a system has failed. The identification is the basis on which it is determined how serious the fault is and what performance is still possible given the diagnosed fault. From the viewpoint of cost effectiveness, correct classification is necessary for minimal maintenance costs and a minimum number of engine shutdowns.

7. Computational load
When a FDS has to be used in real-time applications, the computational load has to be small. Very large models and programs take a lot of time and computer power to reach a decision. This can generate many problems when the models used have to be implemented on normal desktop computers.

8. High reliability
Bonnier [1994] pointed this out for the sensor side; the same is valid for the computer side. The programs and computer systems on which the FDS runs have to be more reliable than the system itself. This seems a rather simple demand, but in practice it can give many problems. Many customers will specify the platform on which the software has to run; the Royal Netherlands Navy, for example, often demands Windows NT as a minimum platform.

The compromises in detection system design among false alarm rate, sensitivity to slowly evolving faults and promptness of detection are difficult to make, because they require extensive knowledge of the working environment and explicit understanding of the vital performance criteria of the monitored system.

Fault detection and identification can be performed in two ways. The first method uses the measured data directly, which is transformed to the correct input parameters for the identification module. The other method uses a reference model of the monitored system; the input parameters for the identification module are then the differences between the simulated and the monitored system. This second method requires much more information about the system than the first (direct) method. The advantage of the second method is that it gives much more information to the user, even when the FDS cannot reach a decision. Another advantage is that, by using a simulation model, the FDS may be independent of the different working conditions.

The two different methods can be expressed in the following block diagrams:

[figure 3-1, direct use of system parameters in a FDS module]

[figure 3-2, indirect use of system parameters in a FDS module]
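To make the indirect method of figure 3-2 concrete, a minimal sketch is given below. The names simulate_plant and classify_residual are hypothetical placeholders for the reference module and the classification module (they are not part of this report), and the assumption that the first columns of the measurement matrix are the driving system inputs is made purely for illustration.

  % Indirect method (figure 3-2): the FDS works on residuals instead of raw data.
  X_meas  = load('measurements.txt');       % rows = samples, columns = measured parameters
  X_model = simulate_plant(X_meas(:, 1:3)); % hypothetical reference module, driven by the system inputs
  R       = X_meas - X_model;               % residual vector per sample: deviations caused by faults
  fault   = classify_residual(R);           % hypothetical classification module working on the residuals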

3.2 Fault Diagnosis Module

Looking closer at the FDS modules, which contain the same parts for the direct method and the indirect method, we can identify the following stages.

1. The data processing stage
In this stage the raw input is processed to enhance the information contained in the signals. This transformation can include linear and non-linear transformations and the calculation of related signals. In this application for refrigeration systems the subcooling and the superheating have to be calculated from the measured signals (a small sketch of such a calculation is given after this list). In fact this stage adds extra information to the signals based on knowledge about the process itself. This module can be designed in advance; the design requires insight in the process, data processing and feature enhancement. In this module a first check for sensor faults can easily be performed.

2. The classification stage
The classification stage uses the data prepared by the data processing module. It is mostly a pattern recognition module. In standard applications often very elaborate and sophisticated methods are used to recognise patterns ([Grimmelius, 1992] and [Patton, 1989]). The neural technology used in this report is simple to use in this stage. This is because neural networks arrange themselves optimally for the demanded task and can generate a very simple output. This output can very often be used directly without much post-processing (the level of post-processing depends on the type of network that is being used). The neural networks can be trained to link certain patterns with certain faults. It is at the present time very difficult (if not impossible) to make a neural module that recognises unknown (not trained) faults.

3. Output stage
This is the last stage of a FDS. It is an important stage because the information gained by the FDS is only effective if used in a correct manner. This stage is responsible for the transformation of the internal information to the needs of the users of the FDS. The users can be operators and other computer systems, such as the ship's main computer system. This stage makes extensive use of databanks to provide more information about the faults, such as severity, system degradation, location and corrective measures. It is this stage that utilises the full potential of fault diagnosis.
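As an illustration of the derived signals mentioned in the data processing stage above, the following sketch computes the superheating at the compressor suction from a measured suction pressure and temperature. The saturation table for freon 22 contains rounded, approximate values and the helper function is hypothetical (not part of the report); the subcooling can be derived in the same way from the condenser pressure and the measured liquid temperature.

  function dT_sh = superheat(p_suc, t_suc)
  % SUPERHEAT  Approximate superheating [K] of freon 22 at the compressor suction.
  %   p_suc : suction pressure [bar, absolute]
  %   t_suc : measured suction temperature [degrees C]
  % Rounded saturation data for freon 22, for illustration only.
  p_sat = [ 1.0  2.0  3.0   5.0   7.0  10.0  15.0];   % bar
  t_sat = [-41  -25  -15    0.1  10.9  23.4  39.1];   % degrees C
  t_sat_at_p = interp1(p_sat, t_sat, p_suc);          % saturation temperature at p_suc
  dT_sh = t_suc - t_sat_at_p;                         % superheating = measured minus saturation temperature
  end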

Combining these three FDS building blocks, the system of figure 3-3 is formed.

[figure 3-3, internal modules of a FDS: input data -> pre-processing -> classification -> output generation]

3.3 The Reference Module

This is a very straightforward module. In this report neural technology is used to form a reference system. The main demand on the reference module is that it must enhance the patterns generated by the occurrence of faults. It is therefore not necessary that the system dynamics are simulated exactly. The problem is that it is not known which patterns other than the exact simulation are suitable for this enhancement. Consequently, in this report the reference module has to simulate the system as closely as possible.

The goal is to make models using neural technology that are better than the models used by Grimmelius and van Galen.

For the exact layout of and demands on the reference module, the reader is referred to chapter nine on identification using neural technology.

3.4 The role of fault diagnosis

What is the place of fault diagnosis within the total of systems that form a larger platform, or why does one need fault diagnosis systems in the first place? To define the need for and place of a FDS, first the platform (the larger system) itself has to be defined. In this case a ship is taken as an example platform.

A ship is a tool that can be used to fulfil a certain goal. This goal can be to transport containers from A to B or to protect merchant ships from enemy forces. Zooming in on the ship, a number of sub-goals can be defined in order to fulfil the main goal; sub-goals can be: not sinking, travelling economically, etc. The ship has a number of resources that can be used to fulfil the goals.

Some of the means that can be mobilised to meet the goals are:
1. Energy (electricity, fuel)
2. Information (speed, fuel level, engine speed, weather, currents, cargo, etc.)
3. Human resources
4. Physical systems (pumps, cranes, etc.)

To find the optimal combination of 1, 3 and 4, information (2) about all these resources is essential, in order to divide the resources in the best way over the systems. It is clear that a FDS module, which generates high-quality information, is an essential part of this system. The role of FDS will only grow in the future, when ships become more automated and less experienced personnel is available on board. This holds especially on board warships, where the availability of high-quality information about the health and performance of all sub-systems is very important when an emergency arises. Information must then be generated in a fast and reliable manner, with interfaces to the personnel and the other systems (preferably in an automated way). FDS will play a crucial role in the generation of this information. For a more detailed description of this subject see [Boasson, 1995] and [Schriek, 1995].

3.5 Existing Fault Diagnosis Systems

A short list of existing fault diagnosis systems, in use today or being developed, is given in table 3-1. All these systems are designed for real-time operation and are intended to monitor diesel engines on board ships. This list is taken from [Paas, 1995] and [Korse, 1994].

Looking at the list it is clear that neural networks are not used very often in fault diagnosis systems. In later chapters of this report (especially those on fault detection) it will become clear what the reasons could be for neural networks not being used.

table 3-1, real time fault diagnosis systems in use/development today

Name | Status | Manufacturer | Type
Amethyst | Operational | IRD Mechanalysis | Expert System
Capa | Operational | MAN/B&W |
Deeds | Under development | Lloyds Register of Shipping | Expert System / Deep knowledge
Despro | Under development | University Newcastle | Deep knowledge
Dexter | Operational | American President Lines | Empirical relations
Dicare | Operational | Krupp MaK |
Diesel - Prof | Operational | Promaco |
Diva | Operational | Alsthom | Expert System
Deimos | Under development | Ricardo Ltd | Base line model
Dymos | Operational | IHI |
Eds | | Esprit Programme |
Ems | | Esprit Programme |
Engineer Assist | Operational | Bentley - Nevada | System
Faks | Operational | Wartsila Diesel |
Icmos | Under development | Nevesbu / TU Delft / Stork Wartsila | Deep knowledge
Machinery Condition Analyser | Design phase | American Navy |
Modis - Geadit | Testing phase | MAN B&W / AEG | see Capa
Mapex | | New Sulzer Diesel | Expert System
Nspectr | Operational | |
Varmint | Operational | Design Maintenance Systems | Expert System

4. Principal component analysis

In this chapter the principal component analysis is described. This analysis is used to determine whether a neural network has some probability of success in detecting different faults from the data. In the literature it is shown that neural networks internally perform a kind of principal component analysis to classify the input data into different classes [Hecht Nielsen, 1991]. This method can provide the engineer with an easy tool with which one can see in advance whether or not a neural network is suitable for a certain decision task.

4.1 Theoretical Basis

The problem in pattern recognition is the extraction of features, or the selection of features. Feature selection refers to a process whereby a p-dimensional data space is transformed into an m-dimensional feature space that in theory has exactly the same dimensions as the original data space (p = m). The transformation is done in such a way that the data can be represented with a reduced number of features (m < p), while maintaining most of the contents of the data. This is called a dimension reduction. The principal component analysis (Karhunen-Loeve transformation) maximises the rate of decrease of variance and is a good choice for this transformation.

The following description of the method comes from [Haykin, 1994].

Let x be a p-dimensional random vector representing one set of the data of interest, with zero mean:

$$E[\mathbf{x}] = \mathbf{0} \tag{1-1}$$

Let u be a p-dimensional unit vector onto which x is projected. This projection is defined by the inner product of the vectors x and u, as shown by 1-2:

$$a = \mathbf{x}^T \mathbf{u} = \mathbf{u}^T \mathbf{x} \tag{1-2}$$

where u is subject to the constraint

$$(\mathbf{u}^T \mathbf{u})^{1/2} = 1 \tag{1-3}$$

The projection a is a random variable, with mean and variance related to the statistics of the vector x. Under the assumption that x has zero mean, it can be shown that the mean value of the projection a is zero too:

$$E[a] = \mathbf{u}^T E[\mathbf{x}] = 0 \tag{1-4}$$

The variance of the random variable a can now be described with the following equation:

$$E[a^2] = \mathbf{u}^T \mathbf{R} \mathbf{u} = \sigma^2 \tag{1-5}$$

with

$$\mathbf{R} = E[\mathbf{x}\mathbf{x}^T] \tag{1-6}$$

The p-by-p matrix R is the correlation matrix of the data vector x.

R is estimated from the measured data by the following equation [Hecht Nielsen, 1991]:

$$\mathbf{R} = \frac{1}{N}\sum_{k=1}^{N} (\mathbf{x}_k - \bar{\mathbf{x}})(\mathbf{x}_k - \bar{\mathbf{x}})^T \tag{1-7}$$

with
x̄ = average x vector
N = number of data points

The variance E[a²] of the projection a is, according to 1-5, a function of u. This variance function is called a variance probe and is denoted by ψ(u).

The next problem is to find the unit vectors u for which ψ(u) has local maxima or minima. The solution to this problem can be written as an eigenvalue problem:

$$\mathbf{R}\mathbf{u}_j = \lambda_j \mathbf{u}_j, \qquad j = 1,2,\ldots,p \tag{1-8}$$

The solution to this problem gives a set of eigenvalues and a set of orthogonal eigenvectors. It can be shown that the variance probes and the eigenvalues are equal:

$$\psi(\mathbf{u}_j) = \lambda_j, \qquad j = 1,2,\ldots,p \tag{1-9}$$

This gives two conclusions:
1. The eigenvectors of the correlation matrix R define the unit vectors u in which the variance probes have their local minima or maxima.
2. The associated eigenvalues define the extremal values of the variance probes ψ(u).

With the two rules above we have a tool to transform the basis vectors of the data in such a way that the new basis vectors are all orthogonal and point in the directions of the local maxima and minima of the variance probes. The transformation of x into a, and vice versa, is governed by the following simple equations:

$$\mathbf{a} = \mathbf{U}^T \mathbf{x} \tag{1-10}$$

$$\mathbf{x} = \mathbf{U}\mathbf{a} \tag{1-11}$$

where U is the matrix whose columns are the p possible u vectors formed through the PCA. The transformation is a tool for dimension reduction: discarding the features with the lowest variance (the eigenvectors with the lowest eigenvalues connected to them) reduces the number of dimensions of which the data is composed, while retaining as much as possible of the original signal.

[figure 4-1, original data with principal component axes] [figure 4-2, compressed data with principal component axes]

Figures 4-1 and 4-2 are an example of a principal component analysis, from [Haykin, 1994].

A cloud of data points is shown in two dimensions, and the density plots obtained by projecting this cloud onto each of two orthogonal axes 1 and 2 are indicated. The projection onto axis 1 has maximum variance and clearly shows the bimodal, or clustered, character of the data. The second figure shows the effect of discarding the second principal component: the data shrinks onto a line with the statistical properties of the first component (bimodal). For the purpose of clarity the figures do not show the complete shrinking onto a line; the points should in fact lie on the first principal component axis.

This is a simple example. For data of more than two dimensions it is very difficult to see the structure of the data, and an analysis such as a PCA is necessary. This method is used in several publications to show the effects of faults [Koivo, 1991], [Koivo, 1994].

There are four guidelines concerning the number of principal components that should be retained in order to effectively summarise the data [Rencher, 1995]:
1. Retain sufficient components to account for a specified percentage of the total variance (for example 80%).
2. Exclude components whose eigenvalues are less than the average of the eigenvalues.
3. Use a plot of λ_i versus i; in this plot a natural break can be seen between the large and the small eigenvalues.
4. Test the significance of the larger components.

In the following sections the first principal components are plotted, together with all the eigenvalues of each particular system. But even when a component is small its influence can be large, so care must be taken when evaluating the results of the analysis.

The data analysis of the refrigeration system is performed using the Matlab function COV for the calculation of the correlation matrix R, which has the following implementation:

  [n,p] = size(X);              % n data points (rows), p measured parameters (columns)
  X = X - ones(n,1)*mean(X);    % remove the mean of every parameter
  Y = X'*X/(n-1);               % sample covariance matrix of the centred data
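The remaining PCA steps follow directly from equations 1-8 to 1-11. A minimal sketch is given below; the variable names are chosen here for illustration, and the 80% threshold corresponds to guideline 1 above.

  % Continue from the centred data X and the covariance matrix Y computed above.
  [V, D]      = eig(Y);                       % columns of V are eigenvectors, diagonal of D the eigenvalues
  [lambda, i] = sort(diag(D), 'descend');     % order the eigenvalues from large to small
  U           = V(:, i);                      % eigenvectors in the same order
  explained   = cumsum(lambda)/sum(lambda);   % cumulative fraction of the total variance
  m           = find(explained >= 0.80, 1);   % number of components covering 80% of the variance
  A           = X*U(:, 1:m);                  % projections (equation 1-10) for all data points

Plotting the first two columns of A against each other gives pictures such as figure 4-4.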

4.2 Results System I

In the next figures the eigenvalues are plotted for the different numbers of cylinders in operation. For the complete series of plots the reader is referred to the other volume with all relevant plots.

Before the PCA algorithm is performed the data is transformed in the following way:

$$\mathbf{x}_{\mathrm{new},\,n\text{-cyl}} = \frac{\mathbf{x}_{\mathrm{old},\,n\text{-cyl}} - \bar{\mathbf{x}}_{\mathrm{old},\,n\text{-cyl}}}{\mathrm{std}\!\left(\mathbf{x}_{\mathrm{old},\,n\text{-cyl}} - \bar{\mathbf{x}}_{\mathrm{old},\,n\text{-cyl}}\right)}$$

with
n = number of working cylinders (4, 6, 10, 12)
x_old, n-cylinder operation = measurement vector before transformation
x_new, n-cylinder operation = measurement vector after transformation
std(x) = standard deviation of the vector x

In this way we get data with zero mean and a variance of one. This is done because the parameters could have widely different variances. According to [Rencher, 1995] it is best to standardise all the parameters, because the principal component analysis is not scale invariant.
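A minimal sketch of this standardisation, assuming the measurements of one cylinder-operation mode are stored row-wise in a matrix X (the variable names are chosen here for illustration):

  % Standardise every measured parameter (column) to zero mean and unit variance.
  [n, p] = size(X);
  Xc     = X - ones(n,1)*mean(X);        % subtract the column means
  Xnew   = Xc ./ (ones(n,1)*std(Xc));    % divide each column by its standard deviation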

The following parameters are used for the PCA (the choice was random):
1. Suction pressure
2. Oil pressure compressor I
3. Pressure decrease over filter
4. Pressure after evaporator
5. Crankcase pressure
6. Pressure after compressor
7. Temperature freon after condenser
8. Temperature freon before expansion valve
9. Temperature increase cooling water
10. Temperature freon before evaporator
11. Subcooling
12. Superheating

As can be seen for all the different working points (four, six, ten and twelve cylinder operation), the number of significant components is much lower than the number of used parameters (twelve in this case). In all cases only the first five principal components are of importance (based on the guidelines described in the previous section).

[figure 4-3, eigenvalues of refrigeration system I for different cylinder numbers; panels for twelve, ten, six and four cylinder operation]

If we plot the data using only the first and second principal components, we get figure 4-4.

What can be seen is that the variance in the x direction (first principal component) is much larger than the variance in the y direction (second principal component). This is expected because the PCA ensures that we have a decreasing variance. It can also be seen that the different cylinder numbers do not have much effect on the area in which the data points are located. This is caused by the fact that all the data is transformed before the PCA is performed.


[figure 4-4, first and second principal components of refrigeration system I; panels for twelve, ten, six and four cylinder operation (03-03-1995)]

What is really important is how the data of the measured faults is situated in this healthy area. When the data lies completely in the healthy area, problems can be expected when using neural networks for fault classification and detection. The data used in this analysis is unmodified data, i.e. no reference module was used. When the data of the faults lies completely within the healthy area, the use of reference modules could be necessary for a neural network to detect the faults. The four cylinder operation PCA is somewhat different from the other operation ranges; this can be explained by the fact that relatively few data points were available for the PCA.

When plotting the data of fault one in the healthy area, the situation does not look very promising (figure 4-5). The data of fault one lies completely within the healthy area, and therefore the chance of a neural network detecting the fault is considered small. The situation when we plot the first three components (figure 4-6) is much better. Clearly the fault can be seen as a deviation out of the normal operation range, but the question remains how significant this third component will be for a neural network.


[figure 4-5, 12 cylinder working range, fault one: principal components of the healthy and faulty situation]

[figure 4-6, 12 cylinder working range, fault one, three principal components]

Looking at the data of fault number two, large problems can be expected in using neural networks for identification and classification. The data of fault two lies completely within the normal data, even when plotted with three components instead of two.

[figure 4-7, 12 cylinder working range, fault two: principal components of the healthy and faulty situation]

For the data of faults three and four (which are located in the same file) the use of a neural network for identification and classification could well work. The data of faults three and four is situated well outside the normal operation range.

[figure 4-8, 12 cylinder working range, fault three and four (solid line): principal components of the healthy and faulty situation]

Figure 4-9 shows the influence of noise on the principal component analysis, i.e. the effect of adding noise to the data. The PCA itself was performed on "noise free" data: after inspection, the effect of noise on the PCA transformation could be neglected, so the normal data was used for the PCA.

As can be seen from figure 4-9, the effect of noise on the results of the PCA is very significant. Even with the low level of 10% noise (for the definition of the noise level see appendix C) the shape of the area changes completely. This shows that the effect of noise cannot be neglected when using PCA (or related methods) in real-time applications.

[figure 4-9, 0%, 10%, 20% and 40% noise added to the healthy data; panels show twelve cylinder operation with first versus second principal component]

4.3 Results System II

In the next figures the principal components of system II are plotted for some introduced faults. For the complete series of plots the reader is referred to the other volume with all relevant plots. The same parameters are used as with refrigeration system I.

When plotting the eigenvalues of refrigeration system II, we can see that in this case also only the first five components are of significance (figure 4-10).

[figure 4-10, eigenvalues of refrigeration system II]

In figure 4-11 the normal operating range is plotted (operation conditions A, B, C and D) together with fault D14d23 (too much cooling water). The normal operation range (black dots) has the shape of a diamond of which the points are the different working conditions (A, B, C and D). The fault (black line) starts in one corner and can be seen wandering out of the healthy operation range, until it finally returns to the normal operation range. This is clearly a fault that can be easily detected with a neural network.

[figure 4-11, first and second principal components of healthy situation and fault D14d23]

In figure 4-12 the normal operating range is plotted (black dots) together with fault A2d21 (compressor, increased resistance discharge pipe; black line). In this particular case the effects of the fault lie entirely within the normal operating range, thus reducing the chance of a neural network detecting the fault. Faults of this kind will be hard to detect with neural networks, and probably also with the normal methods. The use of a reference engine can be considered in this case, to enhance the effects of the faults.

[figure 4-12, first and second principal components of healthy situation and fault A2d21]

4.4 Conclusions

Principal Component Analysis (PCA) can be very helpful in the design stage of a fault diagnosis system. It can rapidly show the basic properties of data in a comprehensive manner. The working range of all the measured parameters can be expressed in one two-dimensional picture. For the system considered in this report two principal components were sufficient; for other systems it can be necessary to view more than two principal components to capture most of the contents of the original data. The two-dimensional picture can be used to determine whether or not the whole working range has been measured. No method is without drawbacks, and the principal component analysis is no exception.

First, the PCA is very sensitive to the manner in which the data is presented (it is not scale invariant). Therefore conclusions made on the basis of a PCA are only valid when exactly the same data representation is used in the FDS [Rencher, 1995]. Secondly, care must be taken in analysing the PCA plots: if someone knows that a certain fault is plotted, then it is easy to recognise this fault in the healthy data; when confronted with data in which a fault is only suspected, the situation is very different. Thirdly, the PCA is also sensitive to noise in the signals.

The results presented in the literature are often very clean results for very nice processes, see figures 4-13 and 4-14 [Koivo, 1994], but their significance when used in real, noisy processes is questionable.

[figure 4-13, two largest principal components of a roller mill [Koivo, 1994]]

5. Neural Networks, General Description and Applications

In this chapter a general overview of what neural networks are is given. It is not the goal of this chapter to give a full theoretical description of neural networks; for a more detailed (and mathematical) description of neural networks see chapters six and seven. At the end of this chapter a number of applications are given in which neural network technology is used.

5.1 General Description of Neural Networks

To immediately remove all wrong perceptions about neural networks: neural networks are not intelligent, cannot generate "smart" decisions, do not think, and are not in any form fully comparable to the human brain. In fact a neural network consists of only a compact set of mathematical formulas.

A neural network is a non-linear equation with a great number of adjustable parameters. These parameters can be changed to fit the output of the equation (the neural network) to a desired value, given a certain input value. The symbols and terms of neural networks are only a compact way of describing these highly non-linear equations. This special notation (and all the theories surrounding neural networks) originated from the field of biology. Neural network theory started when biologists sought to describe one of the basic functional blocks of the human brain: the neuron. They hoped that when they could understand the working of the neuron, they could explain some of the properties of the human brain. One of the consequences of the fact that neural networks originate from the human brain idea is that neural networks behave in some aspects the same as humans: training a neural network is a difficult task, just as teaching humans something is difficult, and both humans and neural networks are very good at pattern recognition.

As in biology, the basic building block of a neural network is the artificial neuron. This neuron is a very simple item. It can receive a signal, transform it (with its transfer function) and generate a single output. Sometimes the neuron possesses a local memory, to store past values or to perform additional functions.

In mathematical form the artificial neuron with one input and one output can be described in the following way:

$$y(x) = \varphi(x - \theta) \tag{1-1}$$

with
φ = transfer function
x = input of the neuron
θ = threshold
y(x) = output of the neuron given input x

[graphical form: a single input x passes through the block φ(x - θ) to give the output y]

In a neural network many of these neurons are connected to each other. In a network, neurons can have inputs from multiple other neurons, so the model of the one input - one output neuron needs some refinement. A neuron can receive multiple inputs but can only generate one output. In mathematical form this multiple input - single output neuron can be described in the following way:

$$y = \varphi\!\left(\sum_i w_i x_i - \theta\right) \tag{1-2}$$

with
w_i = weight factor connected with input i
x_i = input parameter i

[graphical form: inputs from other neurons are multiplied by their weights, summed, and passed through φ(·) before the output is sent to other neurons]

Note: this way of generating a single input value (the hyperplane input) is not the only possible method; see the radial basis networks for a different kind of input generation.

It is convenient to write the equations in matrix notation, because when networks grow large a simple set of equations will still be sufficient to describe the network. When a neuron is written in matrix form, the following equation emerges for a single neuron:

$$y(\mathbf{x}) = \varphi\!\left(\mathbf{w}^T \mathbf{x} - \theta\right) \tag{1-3}$$

As stated above, neurons can be connected with each other to form networks. The connections between neurons contain a weight factor (w_i). The output of one neuron is multiplied by this factor before it is fed to the next neuron. These weight factors are the adjustable parameters of the non-linear function, which can be changed to generate a desired output, given a certain input to the neural network. There are a number of standard networks. These networks are graphical network forms; the reason for this is that in this way a comprehensive understanding of a particular network can be gained in a fast manner.

A simple example: frequently a 3,2,1 feedforward network is used in the literature (the exact meaning of this code will be explained later). Graphically this network consists of three input neurons, two hidden neurons and one output neuron.

In matrix form the following equation is valid for this network:

$$y = \varphi_3\!\left(\mathbf{w}_{(2)}^T\,\boldsymbol{\varphi}_{1,2}\!\left(\mathbf{W}_{(1)}^T \mathbf{x} - \boldsymbol{\theta}_{1,2}\right) - \theta_3\right) \tag{1-4}$$

And finally the extended mathematical form for this network is given by:

$$y = \varphi_3\!\Big(w_7\,\varphi_1\big(w_1 x_1 + w_3 x_2 + w_5 x_3 - \theta_1\big) + w_8\,\varphi_2\big(w_2 x_1 + w_4 x_2 + w_6 x_3 - \theta_2\big) - \theta_3\Big) \tag{1-5}$$
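A minimal sketch of equation 1-5, computing the output of this 3,2,1 feedforward network for one input vector; the sigmoid transfer function and the random weights are assumptions made here purely for illustration (in practice the weights follow from training):

  phi    = @(z) 1./(1 + exp(-z));   % sigmoid transfer function, used here for all neurons
  W1     = randn(2,3);              % weights w1..w6: input layer (3) -> hidden layer (2)
  theta1 = randn(2,1);              % thresholds of the two hidden neurons
  w2     = randn(1,2);              % weights w7, w8: hidden layer (2) -> output neuron
  theta2 = randn;                   % threshold of the output neuron
  x = [0.5; -1.0; 2.0];             % one input vector with three parameters
  h = phi(W1*x - theta1);           % outputs of the two hidden neurons
  y = phi(w2*h - theta2);           % network output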

Clearly the graphical and the short matrix notation give the simplest, but still useful, description of this particular network. In practice several different neurons can be used, with different transfer functions. In the next chapters the different network forms, transfer functions and neuron types are given. When a particular network is used in this report a detailed description will be given (not all the networks given in the following chapter will be used). The topology of the network is mainly determined by the problem to be solved. The dimension of the input vector determines the number of input neurons, and the number of output neurons is determined by the solution that is desired. The only free parameters are the number of hidden neurons (neurons that are not directly connected to the outside world), the number of hidden layers and the learning rule. Closely connected to the different neural network topologies is the learning rule with which the neural network is trained. Training a neural network is the gradual adjustment of the free parameters of the non-linear equation, based on training examples, in order to minimise an error criterion.

The correct choice of learning rule, and of the parameters of the learning rule, can be very significant for the final result. Learning rules can be divided into two major classes:
1. Unsupervised learning
2. Supervised learning

The difference between supervised and unsupervised learning lies in the fact that with supervised learning the desired output of the neural network is known in advance and the learning rule tries to fit the calculated output to the desired output. With unsupervised learning the output of the network is not known in advance and the network trains "itself". After training, the meaning of the different output neurons must be determined by means of a test set. Unsupervised training is therefore not very well suited when using neural networks for simulation tasks; it is mainly used in classification problems.

5.2 Advantages and Disadvantages of neural networks

Advantages of neural networks according to [AGARD, 1991] and [Boullard, 1992]:
1. Unique solutions based on user data examples
2. No need to know algorithms
3. Inherent parallel processing structure yields faster solutions
4. Robust performance in view of noisy and disturbed input signals
5. Inherently fault tolerant (hardware)
6. Compatibility with existing technology


Disadvantages of neural networks:
1. Neural networks are not applicable to all processing problems
2. Neural networks do require training and test data samples
3. Neural networks are black box systems

In table 5-1 a comparison is given between a conventional computer and a neural computer.

table 5-1, Comparison of ANN with conventional digital computers

Processing order
  Digital computer: programs with serially performed instructions.
  Neural processing: parallel programs with comparatively few steps.

Knowledge storage
  Digital computer: a static copy of the knowledge is stored in an addressed memory location; new information destroys old information.
  Neural processing: information is stored in the interconnections of neurons; knowledge is adapted by changing the interconnection strengths.

Processing control
  Digital computer: a central processing unit monitors all activities and has access to global information, creating a processing bottleneck and a critical point of failure.
  Neural processing: no control nor monitoring of a neuron's activity; a neuron's output is only a function of the locally available information from its interconnected neighbours.

Fault tolerance
  Digital computer: removal of any processing component leads to a defect; corruption of memory is irretrievable and leads to a failure (fault intolerant).
  Neural processing: knowledge and information representation are distributed across many neurons and their interconnections; if a portion of the neurons is removed, the information is retained through the redundant distributed encoding (fault tolerant).

Looking at the advantages and disadvantages it can be concluded that neural networks are typically suited for problems where no information is available about the process, or where the process is so complex that it cannot be described mathematically. The famous generalisation capabilities of neural networks only concern their interpolation capabilities. A neural network cannot extrapolate (a neural network will always give an answer, but in most cases it will not be correct). This limits its use in environments where only incomplete training sets are available or where there is doubt about the working range of the system.


5.3 Applications with Neural Networks

The following list is taken from [Widrow, 1994] and contains applications that have been developed in science, business and industry using neural network technology. It can be seen that almost all of the applications use the neural network in a pattern recognition role.

table 5-2, Linear Neural Network Applications
  Telecommunications: adaptive line equalisers and adaptive echo cancellers (fully operational, very successful application).
  Sound: control of sound levels by generating anti-sound (used in air-conditioning systems and industry).
  Science: minimising disturbances in the beams of positrons (Stanford linear accelerator).

table 5-3, Non-linear applications in development
  Missile guidance and detonation (analogue neural networks, results are very promising).
  Fighter flight and battle pattern guidance (YF-22 advanced fighter).
  Optical telescope focusing.
  Vehicular trajectory control (truck backupper).
  Electric motor failure (Siemens will use a neural network in their new SAMMS controller).
  Speech recognition (results are very promising).

table 5-4, Non-Linear Multi Element Neural Network Applications
  Pattern classification: petroleum exploration; drug identification; credit card fraud detection; machine printed character recognition; hand printed character recognition; quality control in manufacturing; event detection in particle accelerators.
  Prediction and financial analysis: loan approval; real estate analysis; marketing analysis; airline seating allocation.
  Control and optimisation: electric arc furnace electrode position control; semiconductor process control; chemical process control; petroleum refinery process control; continuous casting control during steel production.
  All of these applications are reported as fully operational; several are noted as very successful and one is implemented with an analogue neural network.


6. Neural Network Topology and Definitions

In this chapter some different network topologies are given. First general parameters are defined, then the different network forms are given. Many other transfer functions or network topologies are possible; only the network forms that are used in this report are given. It is stressed that a neural network topology is only an aid in visualising the network. From a topology no conclusions about accuracy, performance or usefulness of a neural network can be drawn!

6.1 Definitions

Input vector with n input values:

$\mathbf{x} = \left(x_1, x_2, \ldots, x_n\right)^{T}$

Weight matrix:

$W = \begin{pmatrix} w_{11} & w_{12} & \cdots & w_{1m} \\ w_{21} & w_{22} & \cdots & w_{2m} \\ \vdots & & & \vdots \\ w_{n1} & w_{n2} & \cdots & w_{nm} \end{pmatrix}$

in which
  column = weight vector per node (neuron)
  row = weight vector per input
  element indices: $w_{\text{row number},\,\text{neuron number}}$

Neuron layout: [figure: inputs x1, x2, x3 feeding a single neuron]

Output per neuron k is given by:

$y_k = \varphi\!\left(\mathbf{x}^{T}\mathbf{w}_k + \text{bias}_k\right) = \varphi\!\left(\sum_{i=1}^{n} x_i\, w_{ik} + \text{bias}_k\right)$

The desired output is $d_k$, the calculated output is $y_k$. The function $\varphi$ can be a linear, non-linear or statistical function.

$\text{net}_k = \sum_{i=1}^{n} x_i\, w_{ik}$
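With these conventions (the columns of W are the weight vectors per neuron, so that net_k is the inner product of the input vector with column k), the output of a complete layer can be computed in a single matrix operation. The sketch below is a minimal example; the tanh transfer function and all numerical values are assumptions for illustration.

    import numpy as np

    def layer_output(x, W, bias, phi=np.tanh):
        # net_k = sum_i x_i * w_ik for every neuron k, i.e. net = x^T W
        net = x @ W
        # y_k = phi(net_k + bias_k)
        return phi(net + bias)

    x = np.array([0.2, 0.5, 0.1])          # n = 3 input values (assumed)
    W = np.array([[0.1, -0.4],             # W is n x m: 3 inputs, m = 2 neurons (assumed values)
                  [0.7,  0.2],
                  [-0.3, 0.6]])
    bias = np.array([0.05, -0.1])
    print(layer_output(x, W, bias))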

6.2 Symbols for the different neurons

Two types of neurons are used in this report: the fan-out neuron and the function neuron. The fan-out neuron "fans" the input value out to all the neurons in the next layer. These neurons are typically situated in the first layer; in the fan-out neuron the input value is not changed. In the function neuron the input value (which can be a vector or a single value) serves as an input for the transfer function, and the output of the neuron is the output of the transfer function. In this sense the fan-out neuron is a special case of the function neuron; it is regarded in this report as a different neuron type for simplicity.

[figure: symbols used for the function neuron and the fan-out neuron]


6.3 Topology

A collection of these basic components is called a neural network topology. The topology is determined by input parameters, output parameters, neuron specifications and connection specifications.

In this report every row of neurons is called a layer. The type of neurons of which the layer consists is not important. The direction of the data flow is from left to right (input at the left side, output at the right side). The neurons and connections are numbered from top to bottom. The input vectors and output vectors are also numbered from top to bottom. A layer is called a hidden layer if it has no direct connections with the outside world, except for the bias terms. A layer can only consist of the same type of neurons; if a different type of neuron is present in a layer, then this becomes a new layer. Layers with the same input vector are placed above each other. The numbering is from top to bottom, then from left to right.

6.3.1 Numbering

[figure 6-13, numbering of a feedforward neural network: fan-out neurons in the input layer, weight matrix 1, function neurons in the hidden layer and in the output layer]


6.3.2 Number of layers

[figure 6-14, four layer feedforward network; figure 6-15, 2 layer feedforward network; figure 6-16, radial basis network; figure 6-17, two dimensional Kohonen layer; figure 6-18, three layer feedforward network. Each figure indicates the input layer of fan-out neurons, the hidden layer(s) and the output layer.]


6.4 Different transfer functions

Sigmoid function
This is the most widely used transfer function:

$y = \varphi(\text{net}) = \dfrac{1}{1 + e^{-k \cdot \text{net}}}$

$\dfrac{d\varphi}{d\text{net}} = \dfrac{k\, e^{-k \cdot \text{net}}}{\left(1 + e^{-k \cdot \text{net}}\right)^{2}} = k\, \varphi(\text{net})\,\bigl(1 - \varphi(\text{net})\bigr)$

[figure 6-19, sigmoid function]

Threshold function
This is a special case of the sigmoid function: if k is very large, the sigmoid function approaches the threshold function.

$\varphi(\text{net}) = 0 \;\text{ if }\; \text{net} < b, \qquad \varphi(\text{net}) = 1 \;\text{ if }\; \text{net} > b$

with $\text{net} = \sum_{i} x_i\, w_i$ and threshold $b$.

[figure 6-20, threshold function]


Hyperbolic tangent function
A close relative of the sigmoid function. It can be shown that this function has some very desirable properties, making it an ideal function for use in neural networks [Masters, 1993].

$y = \varphi(\text{net}) = \tanh(k \cdot \text{net})$

$\dfrac{d\varphi}{d\text{net}} = k\,\bigl(1 - \varphi^{2}(\text{net})\bigr)$

[figure 6-21, hyperbolic tangent transfer function]

Linear function
Frequently used in early networks.

$y = \varphi(\text{net}) = k \cdot \text{net} - \theta$

$\dfrac{d\varphi}{d\text{net}} = k$

[figure 6-22, linear transfer function]
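The scalar transfer functions above translate directly into code. The sketch below is a minimal example; the slope k, the threshold b, the offset theta and the example values are assumed parameters, not values prescribed by this report.

    import numpy as np

    def sigmoid(net, k=1.0):
        return 1.0 / (1.0 + np.exp(-k * net))           # y = 1 / (1 + e^(-k*net))

    def d_sigmoid(net, k=1.0):
        y = sigmoid(net, k)
        return k * y * (1.0 - y)                        # k * phi(net) * (1 - phi(net))

    def threshold(net, b=0.0):
        return np.where(net > b, 1.0, 0.0)              # 0 if net < b, 1 if net > b

    def tanh_transfer(net, k=1.0):
        return np.tanh(k * net)

    def d_tanh(net, k=1.0):
        return k * (1.0 - np.tanh(k * net) ** 2)        # k * (1 - phi^2(net))

    def linear(net, k=1.0, theta=0.0):
        return k * net - theta                          # derivative is simply k

    net = np.linspace(-3.0, 3.0, 7)                     # example net values (assumed)
    print(sigmoid(net), threshold(net), tanh_transfer(net), linear(net))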


Radial Basis function

$\varphi_i = \varphi_i\bigl(\left\| \mathbf{x} - \mathbf{c}_i \right\|\bigr) = \varphi_i(r)$

where $r$ is the distance between the input vector and the centre $\mathbf{c}_i$ of radial basis function $i$.

linear radial function: $\varphi(r) = r$
quadratic radial function: $\varphi(r) = r^{2}$
gaussian radial function: $\varphi(r) = e^{-r^{2}}$
inverse multiquadratic radial function: $\varphi(r) = \dfrac{1}{\sqrt{r^{2} + \rho^{2}}}$

[figure 6-23, gaussian radial basis function]
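The radial functions listed above depend only on the distance r between the input vector and the centre of the basis function. The sketch below is a minimal example; the centre, the width rho and the example input are assumed values for illustration.

    import numpy as np

    def radial_linear(r):
        return r

    def radial_quadratic(r):
        return r ** 2

    def radial_gaussian(r):
        return np.exp(-r ** 2)

    def radial_inverse_multiquadratic(r, rho=1.0):
        return 1.0 / np.sqrt(r ** 2 + rho ** 2)

    def rbf_neuron(x, centre, phi=radial_gaussian):
        # phi_i = phi_i(||x - c_i||): the neuron responds to the distance
        # between the input vector and its centre
        r = np.linalg.norm(x - centre)
        return phi(r)

    x = np.array([0.4, 0.9])                 # example input vector (assumed)
    centre = np.array([0.5, 1.0])            # centre c_i of this basis function (assumed)
    print(rbf_neuron(x, centre))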
