
MULTI-8, A REAL-TIME MULTITASKING FOREGROUND/TIMESHARING BACKGROUND

OPERATING SYSTEM FOR A MINICOMPUTER

Thesis

submitted to obtain the degree of Doctor in the Technical Sciences at the Delft University of Technology, by authority of the Rector Magnificus

ir. H.B. Boerema

professor in the Department of Electrical Engineering, to be defended before a committee appointed by the Board of Deans on Thursday 1 May 1975 at 16.00, by

Johannes Floris Anthoni

electrical engineer, born at Sukabumi, Java.


This thesis has been approved by the promotor

Prof. dr. ir. W.L. van der Poel

This manuscript has been

- edited on a PDP8/I computer using the program EDIT,

- composed on a PDP8/I computer using the program PRINTR (courtesy of West Virginia University),

- typed on a papertape controlled IBM ball-head typewriter (courtesy of IWIS TNO),

- offset printed by C.C. Anthoni (courtesy of SUN).


MULTI-8, A REAL-TIME MULTITASKING FOREGROUND/TIMESHARING BACKGROUND OPERATING SYSTEM FOR A MINICOMPUTER.

Ir. J.F. Anthoni

(explanatory note for the press)

An operating system is a collection of control programs that regulates and coordinates the activities within a computer. Among other things, it ensures that user programs are executed in the most efficient way. An operating system may be compared to a well-organised company in which the activities are largely determined by orders from outside, and further by the interplay of the various departments. Entirely analogously, an operating system comprises services of a directing, administrative and executive nature. Whereas in a company the work is divided over several persons, the operating system has only one executive organ at its disposal: the central processing unit. The point is therefore to distribute this organ effectively over the various activities. Now a computer, like a human being, can really do only one thing at a time. In the course of a day, however, a person can occupy himself with several matters one after another. Taken over a week, it appears as if he has nevertheless been working on all his tasks simultaneously, each of which has progressed a little. In the same way a number of processes take place within the computer which make it appear to be engaged in several matters at once (parallel task processing, multitasking, timesharing, multiprogramming, etc.).

In a laboratory environment this property (of the operating system) comes in very handy for controlling a number of experiments concurrently with the normal computational work. Experiment control usually demands only a very brief involvement of the computer, and therefore hardly impedes the ongoing work. Occasionally, however, the computer must react very quickly (several thousand times per second) to capture electrical signals (electrocardiogram, nerve signals, etc.). This fact, among others, excludes most existing operating systems from advanced laboratory use. At the Medical Biological Laboratory TNO, and later also in co-operation with the Physiological Laboratory of the State University of Utrecht, the MULTI-8 operating system was developed out of a need for a research tool for the problems facing present-day medical and biological research. The design was aimed not only at speed but also at modularity, in response to the continually changing needs of a dynamic research environment.

This thesis describes the solutions that were found for this problem and also records the considerations that led to them.


MULTI-8, A REAL-TIME MULTITASKING FOREGROUND/TIMESHARING BACKGROUND OPERATING SYSTEM FOR A MINICOMPUTER

abstract

A system is described in which all activities are divided over two basic levels of operation: the foreground, providing multitasking, and the background, providing timesharing. The structure of the system, its design concepts, and implementation are discussed.

summary

A system is described in which all activities are divided over two fundamental levels of operation: the foreground, which provides parallel task processing, and the background, intended for timesharing. The design concepts, the structure and the implementation of the system are discussed.


CONTENTS

CHAPTER ONE - INTRODUCTION
1.1 SCOPE OF THIS MANUSCRIPT
1.2 FOREGROUND BACKGROUND: MOTIVATION
1.3 NECESSARY AND DESIRABLE FEATURES
1.4 HISTORY

CHAPTER TWO - THE KERNEL
2.1 THE PROCESSOR MULTIPLEXER
2.2 TASK COMMUNICATION
2.3 CORE AND DISK MANAGEMENT

CHAPTER THREE - THE BACKGROUND SUPPORT SUBSYSTEMS
3.1 THE EMULATOR SUBSYSTEM
3.2 THE BACKGROUND UTILITIES SUBSYSTEM
3.3 THE BACKGROUND SCHEDULER
3.4 THE BLOCKDRIVER SUBSYSTEM

CHAPTER FOUR - DEVELOPMENT TOOLS

CHAPTER FIVE - PERFORMANCE
5.1 FOREGROUND
5.2 BACKGROUND

CHAPTER SIX - DISCUSSION

CHAPTER SEVEN - APPENDICES
A- SELECTED BIBLIOGRAPHY
B- EXPLANATION OF TERMS
C- SUMMARY OF MONITOR REQUESTS
D- DIAGRAMS


CHAPTER ONE - INTRODUCTION

1.1 SCOPE OF THIS MANUSCRIPT

This document discusses a number of design motivations, alternative solutions and decisions related to the construction of the MULTI-8 operating system. However, no formal proof of its correctness is given. Since in the literature there exist too few documents describing the actual implementation of an operating system in a down-to-earth way, this manuscript goes into implementation details now and then. The content of this manuscript also serves as a manual for the future system user and fills the gap between two other manuals: "INTRODUCTION TO MULTI-8" (ANTHONI, LOPES CARDOZO 1975) and "MULTI-8 USER'S MANUAL" (ANTHONI, LOPES CARDOZO 1975). The reader is assumed to have some working knowledge of multiprogramming and related topics. However, in order to avoid confusion with respect to terminology, a number of important terms are marked with (*) and explained in appendix B.

For an introduction to real-time systems the reader is referred to (ANTHONI 1975; GEVINS 1972; HOARE 1972; MILLS 1969). If he is interested to know how MULTI-8 relates to PDP8 programming, he is furthermore referred to (ANTHONI, LOPES CARDOZO 1975).

The remainder of this chapter discusses our motivation for designing the system and which features have been considered necessary or desirable. Chapter two discusses the most central part of the operating system: the Kernel. Chapter three treats all aspects of the implementation of the background and timesharing. Chapter four presents the development tools that have proved to be necessary, and chapter five evaluates the performance of the system. Chapter six finally reviews the work in relation to the work of others.

1.2 FOREGROUND BACKGROUND: MOTIVATION

The use of a computer system in a research laboratory usually comprises the following activities:

1. on-line data-acquisition and control
2. computation and presentation of results
3. program development

Class 1 activities are grossly characterised as:
1. being time-critical (real-time (*))
2. requiring relatively short programs
3. requiring a small part of cpu (*) time

In contrast, category 2 requires large programs, mainly compute-bound (*), while category 3 requires medium-sized programs, often used interactively. Because in a laboratory environment data arise from experiments and the purpose of the computer is to process them and to present them in an intelligible way, minicomputers are conventionally used in either of two ways:


1. Data are gathered in real-time. Because of the required speed and compactness, the programs are usually written in an assembly language. The results are stored as an intermediate file (*) for the processing and presentation programs, usually written in a high level language. This method conveniently separates data-acquisition from end processing, so that any high level language can be used. Furthermore the two programs do not need to reside in memory together. A disadvantage of the method, however, is that two separate processing steps are required.

2. Both processing steps are located within the same program: end processing takes place in the time not needed for data-acquisition. This usually results in a monolithic (*) program, written in assembly language, performing all functions in one.

The last method has the advantage of requiring only one processing step, although at the cost of an entirely machine-dependent solution which is costly, difficult to change and difficult for others to use. The first method is a more systematic approach, but requires intermediate file storage and the necessity to load a different program in core.

Another alternative solution would be to have real-time facilities incorporated into a high level language. Existing solutions in this direction suffer from two basic deficiencies: They are usually slow and they do not offer parallel operations.

Another alternative is to appoint a small processor to collect data and to connect it to a central computer for final processing. This solution relieves the central computer from time-critical processing, but at the same time requires an intricate communication scheme. Because the small front end processors have to rely on mass storage for data originating at a speed dictated by an external process, the central computer should furthermore be able to respond quickly (10-100 ms) and to provide intermediate file storage, while analysis programs proceed on the very same machine. Thus the central computer should also provide for multiprocessing with a fast response and high throughput(*).

The best alternative would be to have real-time work separated from end processing in a foreground background system in which communication between the two 'grounds' exists. In such a system data-acquisition and analysis can be separated while still located within the same computer and even running concurrently (*). Communication between the two environments can be effectuated smoothly and efficiently while intermediate file storage is readily at hand.

From whichever direction we approach the problem, all paths seem to lead to the necessity of a multiprocessing computer in the immediate vicinity of the laboratory (fig.1.2.0).


Fig.1.2.0 The place of a foreground background system. (Diagram; legible labels: large computer ('number cruncher'), program development, computation.)

1.3 NECESSARY AND DESIRABLE FEATURES

We shall now try to be more specific about the required and desirable features of a foreground background operating system. These features relate to flexibility, performance, background processing capability and compatibility. First of all, as in any modern operating system, the real-time foreground part of the system should:

1. provide multitasking (*).
2. provide communication between tasks.
3. be adaptable to widely varying hardware and software configurations.
4. have a minimum overhead (*) in terms of cpu time.
5. have a minimum overhead in terms of core utilisation.
6. be well structured and open-ended (*).
7. provide protection among tasks.
8. grant the foreground tasks (*) control of the system as fast as possible after the occurrence of an external event (*).
9. have a response time of less than 0.5 ms.
10. be easily manageable: one should be able to have a 'grip' on all parts of the system.
11. support device independent input and output.


With respect to background use the system should:

12. be able to use all cpu time not spent in foreground processing for background processing.

13. have an identical file structure for both foreground and background in order to communicate by means of files.

14. provide adequate protection between foreground and background because of the required program development. Newly developed background programs are then allowed to crash without harming the system.

15. provide a direct means of communication between background and foreground tasks, across the protection mechanism.

16. support various high level languages.

With respect to compatibility with existing software and future front end processors, the system should:

17. be compatible with an existing system in order that available software can be used on the new system.

18. support development software for front end processors, which are preferably of the same type as the foreground background computer (both software and hardware).

1.4 HISTORY

At the time the MULTI-8 project was started, and presumably even now at the time of this writing, there exists no operating system satisfying the above requirements. While awaiting the availability of such a system, we investigated the usefulness of the available PDP8/I computer for this purpose.

The PDP8 protection mechanism (KT8), discussed in chapter 3.1, would be necessary to protect the foreground from maliciously being affected by the background. A foreground background capability has been described (ALDERMAN 1969; KENDRICK 1971) and DEC (*) has successfully been marketing their TSS8 timesharing system (DEC*). Because these systems provided for 4K background use only, whereas the newer PS/8 operating system (later OS/8) effectively used 8K of core or more, difficulties were expected from the presence of instructions necessary to cross the natural PDP8 4K boundaries. It was therefore decided first to study the behaviour of programs occupying more than 4K of core under emulation (*) (MEES, ANTHONI 1972). The results were rather discouraging until the idea of a small hardware change to the insufficient hardware protection mechanism (trap (*)) was propounded (KENDRICK 1972). This idea was further worked out, resulting in an "intelligent trap" for the PDP8/I, causing existing programs to run satisfactorily under emulation. Later a more flexible solution has been developed for the PDP8/E at the Physiological Department of the State University of Utrecht.

From here on the PDP8 was considered a good candidate foreground background computer and the above mentioned requirements became the design objectives of MULTI-8.

The concept of run-time (*) relocatability (*) was worked out, providing a high degree of modularity to the system, and a set of task communication


primitives was implemented, resulting in a working version of MULTI-8 (ANTHONI 1973).

A complete redesign was accomplished in co-operation with E. Lopes Cardozo ("we") from the Physiological Department of the University of Utrecht. The implementation of background timesharing was the last step.

In July 1974 DEC announced a similar system, RTSS for real-time multitasking with a background facility. Although consternation was our first reaction, RTSS proved to be essentially a core-only system, lacking some of our most important design objectives.

At the time of writing MULTI-8 has been installed in 2 laboratories, and has been heavily in use for over 3 months. It provides a basic framework for further developments, satisfying the stated design objectives to a high degree.

CHAPTER TWO - THE KERNEL

A conventional computer system consists of one central processor, surrounded by peripheral equipment for input and output, storage and retrieval of information.

When thinking in terms of processes (*), peripheral equipment may be considered as primitive information processors, dedicated to a single process. For example: a papertape reader is in fact an information processor, converting information from the papertape format to an electronic representation, readable by the central processor. Similar considerations apply to lineprinters, plotters, magnetic tape equipment and disks. These primitive processors are connected to the central processor by communication channels through which they synchronize(*) and convey their data. As independent processors they are able to operate in parallel with the computer. However since each peripheral process needs to be controlled by software (device handler, driver) and the cpu by its nature can do only one thing at a time, parallel operations of peripheral processes are not exploited to their full extent, unless the central processor is able to simulate a number of internal parallel processes. Such a feature is provided by the "Real-time Multiprocessing(*) Operating System", a software framework intended to unite external and internal processes harmoniously in order to use processor and peripherals efficiently. The minimal framework necessary to achieve this is called the "Kernel(*)" and will be discussed in this chapter.

2.1 THE PROCESSOR MULTIPLEXER

The cpu(*) multiplexer is the mechanism which distributes the cpu among the processes competing for it. It makes use of the fact that a central processor can easily be multiplexed by saving and restoring the processor status (*). The processor is thus almost instantly interruptable and reusable. Such a property is called 'preemptiveness', a very important feature for a resource in a multiprocessing system.
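The save-and-restore mechanism described above can be pictured with a minimal C sketch. The struct below models the PDP8's saved status (accumulator, link, program counter); the names and the two-process table are illustrative, not MULTI-8 code.

```c
/* Sketch of cpu multiplexing by saving and restoring processor status.
   On the PDP8 the status to preserve would include the accumulator,
   the link bit and the program counter; here modelled as a struct. */
typedef struct {
    int ac;    /* accumulator     */
    int link;  /* link bit        */
    int pc;    /* program counter */
} status_t;

static status_t saved[2];   /* one saved status slot per process */

/* Switch the processor from process `from` to process `to`:
   save the current status, then restore the other one. */
status_t switch_to(int from, int to, status_t current) {
    saved[from] = current;  /* preempt: save status  */
    return saved[to];       /* resume: restore status */
}
```

Because the whole status fits in a few words, the processor is almost instantly interruptable and reusable, which is exactly the preemptiveness the text refers to.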

Although the cpu is switched from one process to another, it is convenient to think of multiprocessing as an environment in which internal processes operate in parallel, competing for the system's resources. Resources are those shared items of which there are too few to give each process its own. The most important resource is the central processor, but memory and most peripheral equipment are resources as well. The operating system's main concern is thus to distribute its resources among its processes, or, as seen from the resources, to schedule the competing processes for each resource.

Various scheduling policies exist:

1. First come, first served.
2. Give each competitor a fair share of the resource.
3. Optimise resource utilisation.
4. Try to meet all competitors' deadlines (*).
5. Avoid deadlock (*).
6. Provide undisturbed sequential access as necessary for magnetic tape and papertape equipment.

Of course some policies can be combined. Because of their different nature it is sensible to have separate schedulers for different types of resource, instead of saddling all resources with the same scheduling policy.

The cpu scheduler is the most intricate, since the cpu is the most important resource of all, and thus it attempts to satisfy almost all stated policies. However, a scheduling algorithm needs more time as more decisions are involved. Thus by making one intelligent cpu scheduler, one ends up with a solution consuming most of the cpu time itself. Speed can be gained not only by applying a simple scheduling strategy, but also by imposing certain restrictions on the use of certain facilities: one can limit the number of processor registers (so fewer registers need to be saved and restored).

This is essentially why in MULTI-8 cpu scheduling has been established in three levels:

1. Fast unintelligent scheduling for time-critical processes. (Interrupt environment)

2. Moderately intelligent scheduling for multitasking. (Foreground environment)

3. Intelligent scheduling for background jobs to give each user a fair share of cpu time while also taking in account terminal and program behaviour, etc. (Background environment).

These levels also correspond to the basic PDP8 modes of operation (interrupt mode, kernel mode and user mode).

The most elementary cpu scheduler is the interrupt hardware. An interrupt activates this hardware 'scheduler', which instantly determines what to do. It stops the current process, saves important registers and schedules the desired process. In the case of a PDP8, having only one interrupt level, part of the scheduling is taken over by a so-called 'skip chain'.
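The skip chain's behaviour can be sketched in C: with one shared interrupt line, software polls each device flag in a fixed order, and that order is the priority. This is an illustration only; device count and names are invented, and the real PDP8 does this with IOT skip instructions.

```c
/* Sketch of a PDP8-style 'skip chain': a single interrupt line is
   shared by all devices, so on an interrupt the software tests each
   device flag in a fixed order; the order itself encodes priority. */
#define NDEV 4

int flag[NDEV];   /* 1 = device requests service */

/* Return the index of the first (highest-priority) device with its
   flag raised, or -1 when no flag is set (spurious interrupt). */
int skip_chain(void) {
    for (int dev = 0; dev < NDEV; dev++) {  /* fixed priority order */
        if (flag[dev]) {
            flag[dev] = 0;  /* clear the flag, as servicing would */
            return dev;
        }
    }
    return -1;
}
```

Note that the cost of the chain grows with the number of devices ahead of the one that interrupted, which is why the chain order matters for response time.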

The strategy used at this level is designed to meet stringent deadlines and to minimize overhead, because the scheduling frequency can be very high (a few kHz). The criterion chosen is based on a fixed priority for each interrupt (= process). Advantages of the fixed priority strategy are:

1. Processes with the shortest deadlines can be serviced first by giving them a high priority.

2. Scheduling is fast.

3. In general, scheduling can readily be implemented by hardware.

When talking about priority scheduling, a distinction must be made between the notions 'priority' and 'preemption', since priority scheduling can be implemented with and without preemption. Priority with preemption corresponds to the established notion of 'priority scheduling': a higher priority process interrupts a lower priority process (Fig.2.1.0.). Priority without preemption allows each process to terminate, and then the next process is scheduled according to its priority. The first method improves response times for higher priority processes, while the second method requires less switching (saving and restoring registers). This results in a lower overhead.



Fig.2.1.0. Priority scheduling with and without preemption.

Fig.2.1.0. tries to illustrate that scheduling without preemption requires less task-switching as opposed to scheduling with preemption. In the figure an interrupt for the higher priority process 1 occurs while the lower priority process 2 is in progress.

Response time is not only dependent on the strategy in use. It also depends on the worst-case condition of coinciding interrupts. Higher priority processes tend to stretch the worst-case response times for lower priority processes (fig.2.1.1). An example is the slower execution of the cpu due to direct memory access transfers, which proceed with higher priority. Thus although one employs a fixed priority scheme, one cannot exactly predict worst-case response times for each priority level.
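The stretching effect can be made concrete with a back-of-the-envelope sketch, under the simplifying assumption that when all interrupts coincide each higher-priority process is serviced exactly once before the lower one gets its turn. The service times are illustrative figures, not measurements from MULTI-8.

```c
/* Worst-case response when all interrupts coincide: a process at a
   given level must wait for the service times of every higher-
   priority process (lower index = higher priority) plus its own.
   Assumes each higher-priority process runs once, no re-triggering. */
#define NLEV 3

/* per-level service time in microseconds (illustrative figures) */
int service_us[NLEV] = { 50, 200, 800 };

int worst_case_response_us(int level) {
    int t = 0;
    for (int i = 0; i <= level; i++)  /* all higher levels, then own */
        t += service_us[i];
    return t;
}
```

Even this optimistic model shows the lowest level waiting for the sum of everything above it; with re-triggering interrupts the real worst case is longer still.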

Fig.2.1.1. High priority processes tend to stretch worst-case response times for lower priority processes.

The PDP8 lacks two hardware devices needed for an effective implementation of priority with preemption: a stack for saving registers, and a priority arbitration or interrupt identification unit. Without such a unit one has to traverse the skip chain to decide whether the new interrupt is higher in priority than the one in progress. Thus for a PDP8 computer, preemption at this level is a waste of cpu time.

Before discussing the structure and facilities of the interrupt environment, let us consider the foreground environment, which gives shape to the concept of multitasking. The primary design concept of MULTI-8 is to provide a multitasking layer in which a large number of tasks are able to run concurrently. These usually small tasks control peripheral devices and experiments, perform monitor functions, etc. Tasks are indivisible units from which larger units can be composed. They invite a modular system design. In MULTI-8 the multitasking concept has been carried to such an extent that tasks are automatically loaded from the disk when needed and decide themselves when to disappear from core. Note that the decision to leave core is left entirely to the task itself, for only the task knows best when to leave memory. Tasks are furthermore entirely relocatable (*): they are able to run in any set of consecutive pages in core. Within such a powerful environment it is advantageous to have all other resource schedulers implemented as tasks. The foreground task scheduler has consequently been designed with the utmost flexibility in mind. Various reasons have made us adopt two levels of priority without preemption of tasks (so tasks cannot be interrupted by other tasks at either level):

1. The strategy is very simple and requires little overhead (*).

2. It enables reentrant programming on a PDP8. Note that reentrancy is a prerequisite of the monitor code, since some requests take a rather long time due to implicit disk transfers.

3. When sharing common data, no problems occur because of unexpected interruptions by other tasks operating on the same set of data.

4. It was expected that time-critical processes can be serviced adequately in the interrupt environment. Very time-critical processes should use hardware FIFO (*) buffering (see chapter 5.1).

5. The total system load in foreground processing is expected to be low. For high system loads a priority strategy is preferable, in order to service time-critical tasks as soon as possible.

The two levels of priority become manifest in the fact that tasks awakened by external interrupts are serviced prior to tasks scheduled as a result of internal software activity. This has been implemented by means of two FIFO (*) queues (the interrupt queue and the schedule queue), emptied in a fixed order by the dispatcher. When neither foreground nor interrupt processing needs to be done, the remaining processor time is available for the background. The background scheduler is a task capable of handing out a complete 8K PDP8 processor to each background job in turn, giving a fair share of processor time to each.
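The fixed emptying order can be sketched in C as follows. The queue layout and all names are illustrative, not MULTI-8's actual data structures; the point is only that the interrupt queue is always drained before the schedule queue, and that an empty result hands the processor to the background.

```c
#include <stddef.h>

/* Tasks awakened by interrupts are queued separately from tasks
   scheduled by software; the dispatcher drains the interrupt queue
   first, realising the two levels of priority without preemption. */
typedef struct tcb { struct tcb *next; int id; } tcb_t;

typedef struct { tcb_t *head, *tail; } queue_t;

queue_t irq_q, sched_q;   /* interrupt queue and schedule queue */

void enqueue(queue_t *q, tcb_t *t) {
    t->next = NULL;
    if (q->tail) q->tail->next = t; else q->head = t;
    q->tail = t;
}

tcb_t *dequeue(queue_t *q) {
    tcb_t *t = q->head;
    if (t && !(q->head = t->next)) q->tail = NULL;  /* emptied */
    return t;
}

/* The dispatcher: interrupt-awakened tasks run prior to software-
   scheduled ones; NULL means the background may have the cpu. */
tcb_t *dispatch(void) {
    tcb_t *t = dequeue(&irq_q);
    return t ? t : dequeue(&sched_q);
}
```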

The layered processor multiplexer thus obtained is depicted in fig.2.1.2. The range of response times covered by each layer is illustrated in fig.2.1.3.


(Diagram: disk and memory connected through the memory multiplexer (direct memory access); below the processor, the interrupt environment (protection off), the foreground environment (interrupt on, protection off), the background scheduler with its idle position, and the background environment (interrupt on, protection on). When automatic priority hardware is present, this switch is more complex.)

Fig.2.1.2. Illustration of the layered processor multiplexer.

(Diagram: a time-scale running from about 1 µs up to seconds; direct memory access and the interrupt environment cover the shortest response times, foreground tasks the middle range, and background timesharing the longest.)

Fig.2.1.3. Response time-scale for the various multiplexer levels.

Let us consider fig.2.1.2. and imagine that a background job is frequently interrupted by a higher priority process. Then the stacked sequence of schedulers is entered twice for each event: once to put the event-waiting process to work and once to find out who is next. In the latter case the decisions descend to the lowest roots of the multiplexer tree. Reasons enough to separate scheduling into two parts: the scheduler, positioning the multiplexer switches, and the dispatcher (*), executing its logic. Compared to an electric switch, the scheduler is the planmaker selecting the switch's position, whereas the dispatcher is the current flowing through, discovering that it flows out at one of the ends. Thus the dispatcher executes a single decision of the scheduler repeatedly.

The construction of a very primitive scheduler is depicted in fig.2.1.4. It is merely intended to show that this type of scheduler is not suited, since it consumes all processor time in looping through the sequence of decisions. It is neither preemptive, nor can it preserve idle time for the next higher level (e.g. the background). Thus another property of our schedulers is that they are awakened only when a decision is required. This signal may arrive from requests at the very same level or from the underlying level.


Fig.2.1.4. A very simple scheduler/dispatcher.

In order to switch from one process to another, the status (*) of the process needs to be preserved. The Task Control Block serves this purpose: next to the processor's registers, other parameters are saved in it as well (Fig.2.1.5.).

Diagram 1 (Appendix D) shows the actual flow of processor time in MULTI-8. The scheme deviates from that in fig.2.1.2. because foreground tasks cannot be interrupted by other foreground tasks. The hatched boxes correspond to the switches of fig.2.1.2.

An essential difference between a core-only system and a disk based system is that some monitor requests require disk transfers, such as those necessary for reading tasks from disk, keeping the monitor waiting for some 20-80 milliseconds. Since waiting here, too, is intolerable, a disk based monitor has to be made reentrant (*). The MULTI-8 monitor has been made reentrant only at those places where the request could not be completed instantly. A 'fake' task is then created, having its own TCB, contained in a 'node' from the pool of free nodes. After completion of the disk transfer, the fake task vanishes and its TCB is returned to the free pool.

Note that foreground tasks, being uninterruptable by other foreground tasks, always run from one monitor request to another. Thus task-switching always coincides with a monitor request. Tasks have to be written in such a way that they allow for task-switching at least every 1-3 milliseconds. Compute-bound processes can use the monitor PRECEDE request for this purpose.
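The discipline described above amounts to cooperative yielding, which can be sketched as follows. The `precede` stub merely stands in for the monitor PRECEDE request; the loop body and the yield interval of 1000 iterations are invented for illustration.

```c
/* A compute-bound foreground task must offer the monitor a task-
   switching opportunity at least every 1-3 milliseconds.  precede()
   is a stand-in for the monitor PRECEDE request; here it only counts
   how often the task yielded. */
static int yields;                        /* task-switch points taken */
static void precede(void) { yields++; }   /* stub for PRECEDE */

void compute_bound_task(long n) {
    for (long i = 0; i < n; i++) {
        /* ... one unit of computation ... */
        if (i % 1000 == 999)  /* roughly every 1-3 ms of work */
            precede();        /* allow the monitor to switch tasks */
    }
}
```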

Because the three levels are strictly separated and because there is no preemptive priority within either level, no stacks need to be maintained in MULTI-8. Had we designed the system such that stacks were required, we would have had a separate stack for each level. The machine's


(Diagram: the TCB contains a thread pointer linking it into the system queues; the code address in core; the block number on the disk; the length; and the saved processor status: instruction field (IF), data field (DF), link (L), accumulator and program counter. A pointer points at the task image.)

Fig.2.1.5. Layout of the Task Control Block.

By allowing only single-stream processes, each TCB will occur in at most one of the system queues at a time (or in none, when inactive or waiting for an event). This enables a very fast queueing technique (fig.2.1.6.), while at the same time imposing no limitations on the size of system queues, now and in the future.

Fig.2.1.6. Task Control Blocks are queued ('threaded') for scheduling. (Diagram: head and tail pointers for the interrupt queue and the schedule queue, with the TCBs threaded between them.)

Now that we have implemented various levels of cpu multiplexers, what does the virtual cpu look like at each level as seen from its users? The interrupt environment provides a naked machine in which the user has no interrupt facility, while monitor functions are confined to trigger-like scheduling of a task in the foreground environment. The foreground environment provides a multitasking machine and protection from the background. However, there is no protection among tasks. Extensive monitor


facilities provide for the necessary inter-task communication and core and disk management. The background environment provides a number of complete 8K PDP8s, protected against each other and unable to corrupt the foreground. However, they are able to communicate across the protective boundary. For reasons of efficiency the background programs are deprived of an interrupt mechanism (see chapter 5.2).

2.2 TASK COMMUNICATION

Task communication is necessary for tasks to co-operate. It is a vehicle for sending and receiving information. Hence task communication can be considered as the task's input and output mechanism.

When talking about messages in their widest sense, four types can be distinguished that find application in multitasking:

1. A message not particularly addressed to a single task. Tasks are not waiting for the message, but look at it now and then in order to respond accordingly. Any mutually agreed memory location can serve this purpose.

2. A message breaking in on the course of the task to which it has been addressed: a sort of interrupt, forcing the task to immediate action. After this action the task is able to decide whether or not to resume. This type of interrupt at task level is not supported in MULTI-8.

3. A message to control the task's mode of operation, such as START, STOP and CONTINUE. In a certain sense these also break in on the course of the task, but do not force it to take a different course of action. MULTI-8 supports a STOP and a RESTART primitive for this purpose.

4. A message arriving at its destination only if the addressed task is prepared to receive it. This type of message consists of two parts: a synchronizer and the message's content.

The message part can take any shape, such as global or local variables, the processor's registers, etc.

In MULTI-8 messages of 12+3+1 bit are conveyed through the monitor communication services. Once a connection between the two tasks has been established, they can exchange messages from each other's code body directly. This is possible since in MULTI-8 no protection between tasks exists, and both tasks reside in memory during communication. This method costs substantially less overhead as compared to full monitor intervention. The synchronizer part needs monitor intervention. Whereas in a one-process-one-processor situation a process is allowed to consume all processor time in waiting for some internal or external condition, this is not tolerable in a real-time multitasking environment. Consequently the operating system should provide facilities to establish a direct cause-and-effect relationship between condition and waiter. This relationship is called synchronization. The simplest synchronizer is a two state element (flag(*)) which basically works as a latch between two asynchronous processes as depicted in Fig.2.2.0.

KERNEL - processor multiplexer

Fig.2.2.0. A two state flag for synchronizing processes.

This method is commonly used to synchronize hardware and software. The disadvantage of such a simple synchronizer is that it mutually excludes processes A and B from running. In order to provide overlapped operation of A and B a three state synchronizer is required, which is called eventflag or eventvariable (Fig.2.2.1.). It basically works in the following way:

Fig.2.2.1. The three states of the eventflag.

When a task has specified to the monitor that it wishes to wait for a certain event (through the WAIT request), the monitor remembers this in the corresponding eventflag by saving the task's Task Control Block pointer in the flag. By doing so, the eventflag is set to its WAITING state. An interrupt toggles it back to its FREE state, at the same time scheduling the waiting task. The INTERRUPTED state is necessary to remember that an interrupt has occurred before the WAIT has been issued. Excessive interrupts keep it INTERRUPTED. When issuing a WAIT request for an eventflag for which someone is waiting already, an error message results. The eventflag can also be considered a primitive PV semaphore (DIJKSTRA 1968) which lacks an index, while the queue of waiters has been reduced to one waiter only. The P operator now corresponds to WAIT and the V operator to 'interrupt'. Excessive V operations are ignored whereas excessive P operations evoke error messages.

We have deliberately decided to use the simpler eventflag because it was sufficient for our purpose and faster in execution. In addition we wanted no automatic queueing for the type of events in our system, and we wanted a timeout counter for each waiter.
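The three states and their transitions can be sketched in a modern pseudo-code model (Python). All names are illustrative only; the actual MULTI-8 code is PDP8 assembly, and here a simple callback stands in for scheduling a TCB.

```python
# Sketch of MULTI-8's three-state eventflag (names hypothetical).
FREE, WAITING, INTERRUPTED = "FREE", "WAITING", "INTERRUPTED"

class EventFlag:
    def __init__(self):
        self.state = FREE
        self.waiter = None            # would hold a TCB pointer in MULTI-8

    def wait(self, tcb, schedule):
        """The WAIT request; 'schedule' models rescheduling a task."""
        if self.state == WAITING:     # someone is waiting already:
            raise RuntimeError("excessive P operation evokes error message")
        if self.state == INTERRUPTED:
            self.state = FREE         # event already happened: do not block
            schedule(tcb)
        else:
            self.state = WAITING
            self.waiter = tcb         # task blocks until the interrupt

    def interrupt(self, schedule):
        """Called by an interrupt handler (the 'V' operation)."""
        if self.state == WAITING:
            tcb, self.waiter = self.waiter, None
            self.state = FREE
            schedule(tcb)             # wake the waiter
        else:
            self.state = INTERRUPTED  # remember it; excess interrupts ignored
```

Note how the INTERRUPTED state latches an event that arrives before the WAIT, exactly as described above.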


Since external events enter the machine as interrupts, the interrupt-handlers convert these to the internal representation of the event (the eventflag), eventually causing the scheduling of the waiting task.

Although any memory location can be used as an eventflag (for example located within the task's body), eventflags have been joined into a monitor resident table of eventflags for the following reasons:

1. Tasks in MULTI-8 are allowed to wait for events while residing on the disk themselves. Task-resident eventflags are therefore ruled out.

2. Each eventflag has a timeout counter which can be set with the WAIT request, serving as a secondary WAIT condition. By having the eventflags united in one table, the clock service routine is able to scan all eventflags efficiently.

Thus in MULTI-8 an eventflag can also be used for the purpose of timing only. This solution at the same time subsumes the timer function.

The event-wait mechanism in MULTI-8 can also be considered a rudimentary case of the general 'condition-wait' mechanism (HOARE 1974) in which the task is allowed to specify a number of conditions in the form of a boolean expression. The monitor schedules the task only if the boolean expression becomes true. This sort of scheduling invokes much overhead. In MULTI-8 one can specify only two conditions (event OR timeout).

Since MULTI-8 has been designed for a large number of tasks (40-200), only a few of which are active at the same time, the eventflag table could be kept short by allocating eventflags dynamically. When toggled to their FREE state, they become reusable for other tasks (Fig.2.2.2.).

Fig.2.2.2. State transitions of dynamic eventflags.

The dynamic eventflag has one state (RESERVED) more than the fixed eventflag. In fact the FREE state of the fixed eventflag has been split in two. When a dynamic eventflag is requested from the monitor, a FREE eventflag is searched for in the pool of dynamic eventflags and RESERVED for the requester. After use the dynamic eventflag returns to its FREE state, which automatically returns it to the pool.
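The allocation side of this mechanism can be sketched as follows (an illustrative Python model; in MULTI-8 the pool is a fixed monitor table and the states are encoded in the flag words themselves):

```python
# Sketch of dynamic eventflag allocation (state names from Fig.2.2.2).
FREE, RESERVED = "FREE", "RESERVED"

class EventFlagPool:
    def __init__(self, n):
        self.flags = [FREE] * n          # states of the dynamic eventflags

    def request(self):
        """Search the pool for a FREE flag and RESERVE it for the requester."""
        for i, state in enumerate(self.flags):
            if state == FREE:
                self.flags[i] = RESERVED
                return i                 # flag number handed to the task
        return None                      # pool exhausted

    def release(self, i):
        """After use the flag returns to FREE, i.e. back to the pool."""
        self.flags[i] = FREE
```

Because flags return to the pool as soon as they toggle back to FREE, the table can stay much shorter than the total number of tasks.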

With the eventflag mechanism we have created a simple means to synchronize tasks with either external processes or other tasks in the system (Fig.2.2.3.). For this purpose the monitor provides the SIGNAL function which simulates an interrupt for the specified eventflag.

Fig.2.2.3. The use of two eventflags to synchronize tasks.

The timeout facility has been a very useful feature in relating an event to real time. It provides an elegant escape from situations in which conventional eventflags would 'hang' indefinitely. The decision on adequate measures is left to the task itself.

It should be mentioned here that this scheduling operation takes place in the interrupt environment and should be designed as a trigger-like mechanism in order to meet high interrupt rates. This can be illustrated by a data-acquisition task constituting two parallel processes. One (A) samples data from an A/D converter and puts them into one of two available buffers. Since data rates are supposed to be very high (e.g. 4 kHz), it is located in the interrupt environment. The other process (B) transfers one buffer to the disk while the other is being filled. For the transfers it needs the facilities offered by the multitasking monitor, and is thus located in the foreground environment. The example shows that the scheduling of B should be as short as possible in order not to lose data (Fig.2.2.4.).

Fig. 2.2.4. Processes A and B seen at a macro time-scale and at a micro time-scale.
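The double-buffer interplay of processes A and B can be sketched as follows (an illustrative Python model; the buffer size and the callback standing in for the SIGNALled disk-writer task are hypothetical):

```python
# Sketch of the two-buffer scheme of Fig.2.2.4 (BUFSIZE hypothetical).
BUFSIZE = 4

class Acquisition:
    def __init__(self, disk_writer):
        self.buffers = [[], []]
        self.filling = 0                # buffer currently filled by process A
        self.disk_writer = disk_writer  # stands in for foreground process B

    def sample(self, value):
        """Interrupt-level process A: store one A/D sample."""
        buf = self.buffers[self.filling]
        buf.append(value)
        if len(buf) == BUFSIZE:         # buffer full:
            full = self.filling
            self.filling = 1 - full     # ...switch to the other buffer and
            self.signal(full)           # ...SIGNAL process B (trigger-like)

    def signal(self, which):
        """Models scheduling of process B; B empties the full buffer."""
        self.disk_writer(list(self.buffers[which]))
        self.buffers[which].clear()

written = []
acq = Acquisition(written.append)
for v in range(10):
    acq.sample(v)
```

In the model B runs immediately; in MULTI-8 it merely becomes schedulable, which is why the SIGNAL path itself must stay short.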


In some cases the synchronization is implicit in the type of service provided, for example when putting a task to work as a sort of subroutine. Completion of the subroutine task (RETURN) then automatically carries the SIGNAL to continue the CALLer (Fig.2.2.5.).

Fig.2.2.5. The use of CALL and RETURN.

The CALLed task implicitly WAITs for a message at its entrypoint when not in use, while the CALLer implicitly WAITs upon CALLing. The RETURN implies a SIGNAL message for the CALLer, while returning the CALLed task to its initial WAITing state. One can also say that the interlocked mode of operation carries the additional information to synchronize the two tasks (the backlink in the TCB). In MULTI-8 no eventflags are used for this interlocked procedure call.

A similar situation occurs when requesting a task to RUN, while not being interested in when it terminates. For this purpose no eventflag is required (Fig.2.2.6.).

Fig.2.2.6. A RUN request requires no synchronization.

An important question is what to do if the CALLed or RUNed task happens to be busy. The problem is that the monitor does not know which strategy to apply, because the task in question could have been another scheduler with its own scheduling strategy. So the monitor can only take general measures such as:

1. Report to the CALLer that the task is busy.
2. Put the CALLer in a queue of CALLers for that task.
3. Retry the CALL request now and then.

The advantage of putting the CALLer into a queue is that the task in question can take the next CALLer from the queue immediately upon completion of the previous one, so that no time is wasted. This is certainly recommended for a heavily used resource, such as the system disk and cpu. Furthermore, when knowing the queue's structure, the resource can re-arrange the order of the CALLers in order to satisfy its own scheduling policy (such as for instance the minimisation of rotational latency of the disk). A disadvantage is that the CALLer loses control once entered into the queue, being unable to recover from a 'hung' resource, or to respond to a request to cancel the CALL (manual abortion e.g.).

Therefore in MULTI-8 the way of handling these conditions is left to the CALLer and the CALLed task themselves. The CALLer returns at the busy-return, being able to retry the CALL at appropriate intervals. However, queueing is possible by implementing the CALLed task as two processes with a buffer in between, the CALLed part being instantly reusable.

MULTI-8 remembers the CALLer-CALLee relationship by linking their Task Control Blocks (Fig.2.2.7.).

Fig.2.2.7. The backlink serves to stack CALLed tasks.

The backlink furthermore serves to indicate whether the task is still processing a CALL, and whether the task has been claimed for successive CALLs, as is necessary for sequential-access devices like papertape equipment.

In MULTI-8 two sorts of task exist, not distinguishable from the outside: RUN-tasks and CALL-tasks. The difference is made at run-time by issuing either the RUN or the CALL request with the task's name as parameter. A RUN-task is always an independent parallel process, the top of a hierarchy of tasks. In terms of conventional programming a RUN-task is comparable with a main program, while CALL-tasks can be compared with program procedures. In MULTI-8, however, all tasks are written as procedures for the highest flexibility in systems structuring. They can be directed to run independently, or as a procedure, or both, depending on their nature. The RETURN request used to terminate the execution of a task is treated accordingly: in case it has been CALLed, it RETURNs control to the CALLer; in case the task has been RUNed it simply stops.

In order to do parallel operations using the CALL request, MULTI-8 provides a sort of intermediate RETURN. The purpose of this return is to acknowledge to the CALLer that his CALL has been granted, at the same time passing him the number of an eventflag obtained from the pool of dynamic eventflags (Fig.2.2.8.). The CALLer now continues processing and synchronizes with the CALLee through the normal SIGNAL-WAIT sequence.

Fig.2.2.8. Parallel processing using the CALL and RETURN+CONTINUE requests.

The STOP and RESTART monitor requests, serving primarily to suspend tasks by operator intervention, can be used for task synchronization too. The STOP-RESTART mechanism has three states, given by the task's STOP and STOPPED bits (Fig.2.2.9.).

Fig.2.2.9. The STOP and RESTART requests.

The STOP bit is a request to stop the task. If the task is inactive, then nothing really happens, but if the task is active, it eventually passes the dispatcher which stops it. In order to continue the task correctly (RESTART) one must know whether the task has actually been stopped or not. In case the task has been stopped it needs to be rescheduled. In case it has not (yet) been stopped, it is either inactive or waiting, thus may not be rescheduled. Only its STOP bit needs to be cleared. The STOPPED bit thus serves to notify the RESTART command that the task has actually been stopped and should be scheduled again.
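The interplay of the two bits can be sketched as follows (an illustrative Python model; in MULTI-8 these are bits in the TCB and the dispatcher is part of the kernel):

```python
# Sketch of the STOP / STOPPED bit protocol (field names hypothetical).
class Task:
    def __init__(self):
        self.stop = False       # request to stop the task
        self.stopped = False    # set by the dispatcher when it withholds the cpu
        self.scheduled = False

def dispatcher_pass(task):
    """Dispatcher honours a pending STOP by not starting the task."""
    if task.stop:
        task.stopped = True
        task.scheduled = False

def restart(task):
    """RESTART: reschedule only if the task was actually stopped."""
    task.stop = False
    if task.stopped:            # it was withheld by the dispatcher...
        task.stopped = False
        task.scheduled = True   # ...so it must be rescheduled
    # otherwise the task was inactive or waiting: clearing STOP suffices
```

The STOPPED bit is what lets RESTART distinguish the two cases safely.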

The SUSPEND command is a STOP for the task itself. Together with the RESTART command it can be used as a simple synchronizer (Fig.2.2.10.).

Fig.2.2.10. SUSPEND and RESTART as synchronizers.

The STALL request enables one to suspend the execution of the task for a specified period of time. It is identical to first requesting a dynamic eventflag and then using it for timing out only.

No provisions have been made for time-of-day scheduling since such a feature can be readily implemented in a disk or core resident task when required. The core resident clock routine, however, updates five memory locations: MILLISECONDS, SECONDS, MINUTES, HOURS and DATE, which enable each task to inspect time.

Monitor requests have been structured such that they can be combined in various ways, giving rise to a large number of differing requests (42). These are summarised in Appendix C. Diagram 2 reflects the structure of monitor request handling.

For the sake of modularity, tasks in MULTI-8 have names and are referenced by name in the monitor requests. The programmer thus need not worry about the monitor's internal tables. He merely invents a new name and attaches it to his task.

Each task can be assigned a 'nickname' too. In identifying the task the monitor lets the assigned name prevail by first searching the list of assigned names. This mechanism allows one to substitute tasks of the same class. For example a lineprinter blockdriver and a punch blockdriver behave equally from the viewpoint of the CALLer. Thus by assigning the punch blockdriver name to the lineprinter blockdriver, all punch output from the program will be sent to the lineprinter blockdriver.

After loading the task into the system, it obtains a fixed place in the monitor lists and tables (Fig.2.2.11.).

Fig.2.2.11. Organization of system tables. Arrows show how from a given item in one table other items belonging to the same task can be found.

These tables have been designed so as to contain a minimum of redundancy. Pointers have been used where quick lookup is necessary. In order to minimize the time spent in searching the namelists, the monitor applies a simple 'learning' strategy: after the very first name lookup it 'remembers' the TCB pointer by overwriting the name location in the task's body. At the next request the monitor is able to distinguish TCB pointer (negative) from name (positive). This method reduces overhead especially when tasks are called several times in succession.
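The learning lookup can be sketched as follows (an illustrative Python model; on the PDP8 a name and a TCB pointer share one 12-bit cell, told apart by the sign bit, which is modelled here by a negative integer versus a string; table contents are hypothetical):

```python
# Sketch of the monitor's 'learning' name lookup.
tcb_table = {"PUNCH": -17, "READR": -23}   # hypothetical name -> TCB pointer

def lookup(task_body, field):
    """'field' names the cell in the task body holding name or pointer."""
    ref = task_body[field]
    if isinstance(ref, int) and ref < 0:
        return ref                   # already learned: no search needed
    ptr = tcb_table[ref]             # slow path: search the name list
    task_body[field] = ptr           # overwrite the name with the TCB pointer
    return ptr
```

After the first CALL the name cell holds the pointer, so successive CALLs skip the search entirely.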


2.3 CORE AND DISK MANAGEMENT

Conventional use of a disk in a real-time operating system is through overlays (*) and tasks that need to be loaded in fixed partitions in core. The programmer has to take precautions as to where and when to load these overlays in order to make optimal use of core. These methods impose severe restrictions when the nature and timing of the real-time tasks can not be predicted accurately.

Another method of using memory effectively is so-called 'paging'. The principle of paging is based on the fact that a program uses only a small part of its code most of the time (the 'working set'). By dividing programs in pages of a convenient size, the monitor tries to keep only those pages in memory that are actually needed by the program. Unused or infrequently used pages have to be swapped(*) back to the disk. The time spent in swapping can be used to run other timesharing users. The problem is how to decide which pages have to be returned to the disk, since the monitor is not able to predict the next step of the program. Also a hardware device must be available for relocation and page escape detection.

In MULTI-8 an entirely different approach has been taken in an attempt to make tasks as independent of their position in memory as e.g. the hardware module boards in a modern register-bus design. As opposed to hardware, software 'modules' can be plugged in and out of memory at a rate only limited by the speed of the disk. Thus large numbers of tasks (modules) can be stored 'on the shelf', being almost instantly available. Only those tasks that are actually in need of memory remain in core. For example: out of some 40 tasks, occupying more than 60 pages (8K) of memory, in practice only 5-8 (2K) need to be simultaneously in core. Note that in MULTI-8 tasks are not swapped back to disk.

In order that tasks are able to run in any place in memory, they need to be run-time(*) relocatable(*). Furthermore, for fast core management, task sizes must be a small multiple of some convenient unit. Relocatability in MULTI-8 has been achieved by making good use of the detested page structure of the PDP8. The idea is basically as follows: if we make a small program that runs within one page, then that program will run in any page. One-page programs are said to be 'naturally' relocatable. Problems occur when attempting to address beyond the current memory page. So in order to relocate a program, the monitor has only to relocate the indirect addresses used to cross the page boundaries. This can be done in a fraction of the time a relocatable loader would need to relocate the same program. In general one has to locate position-dependent code at a place where it can easily be found and relocated. In MULTI-8 the indirect references are located at the beginning of each page in the so-called Page Header, which is of variable length (Fig.2.3.1.).

Fig.2.3.1. A task is composed of a Task Header followed by pages, each starting with its own Page Header.

KERNEL - core/disk management

The Task Header, located at the very beginning of each task, serves to specify the task's name, its length and the number of interrupt routines it contains. Each task can CONNECT one or more interrupt service routines to the interrupt scheduler. The monitor for this purpose patches the skip chain in order that the task will be instantly entered at a given label when the specified interrupt occurs. Since these CONNECTed interrupt processes reside within the body of the task, communication takes place through local variables. CONNECTion is automatically established when the task enters core and is broken at the RELEASE and SWPOUT requests, discussed later. A papertape reader-to-punch transfer task could for example make use of this facility as depicted in Fig.2.3.2. The detailed format of Task and Page Header is presented in Fig.2.3.3.

Fig.2.3.2. Example of CONNECTed processing

Task Header: NAME; LENGTH; number of CONNECTs; per CONNECT a DEVICE NUMBER and a LABEL. Page Header: ADDRESS 1, ADDRESS 2, ... (pointers and cross-page references that need to be relocated), closed by a 0000 terminator.

Fig.2.3.3. Format of Task and Page Header.

The relocation mechanism enables a task to run in any number of consecutive pages in core. Fig.2.3.4. gives an idea of how tasks can be located at a given instant.


Fig.2.3.4. Tasks are scattered through memory.

The core layout is managed in the coremap, with one memory location per page. A page can be: 1) free (0000), 2) occupied by the monitor (7775), 3) accommodating a task (TCBPOINTER), 4) accommodating a RELEASEd task (-TCBPOINTER) or 5) a borrowed page from memory (7776 for each page, followed by 7777 for the last page).

In order to understand core management, the RELEASE facility will be discussed first. The RELEASE request dismisses the task from memory while still retaining a claim on the place just occupied. The core manager tries to hand out free core pages first before giving away RELEASEd pages. So a RELEASEd task will most likely revive within its used code. This unique allocation mechanism drastically reduces the number of disk transfers, at the same time improving response time. This facility has especially proved its value for tasks that need to be accessed frequently and fast, but never really know when to swap(*) out of core. Tasks using this mechanism have to be written such that they rely neither on local variables nor on assembly-initialised code.

The SWPOUT request releases memory completely: all used pages become free. The task is, however, not swapped back to the disk. This also reduces the number of disk transfers and prevents deterioration of tasks because of the inability to check for correct write operations to the disk. Thus a CALL or RUN after a SWPOUT always executes in a new task image.

Note that background jobs are indeed swapped back to the disk. Deterioration of background jobs does not harm the foreground system.

The RELEASE and SWPOUT requests are specified as options to a number of monitor requests (Appendix C). A very useful combination is WAIT+SWPOUT which enables a task to wait on the disk (in a fresh image) until an event occurs. An example is the foreground command decoder, a task waiting its whole life for the foreground break character, while residing on disk. Such a feature is only possible if the task's TCB and the eventflag reside in core.

The core allocation strategy, flow-charted in diagram 3, uses the following priorities in order to keep holes large:

1. the first FREE hole that fits exactly;
2. the first FREE hole larger than requested;
3. the first FREE+RELEASED hole that fits exactly;
4. the first FREE+RELEASED hole larger than requested.

Requests that cannot yet be satisfied wait in the core queue. The core allocator keeps a record of the maximum hole size available. Each time a RELEASE or SWPOUT has been processed, the core-waiters are threaded to the schedule queue (Fig.2.3.5.) and their requests are retried.

Fig.2.3.5. The technique of appending a queue to another queue.

The core allocator is also used for allocating buffer storage in multiple-page units. The REQBLK and RELBLK requests serve this purpose. It is the task's responsibility to return the borrowed space using the RELBLK request.

Core memory can also be allocated in blocks of 8 words each (nodes). These nodes are taken from a pool of free nodes, chained to each other (Fig.2.3.6.). A node is large enough to be used as a Task Control Block.

Fig. 2.3.6. Free node allocation.

Note that the method of free node allocation deviates from the normal FIFO queueing technique employed elsewhere, as depicted in Fig.2.3.7.

Fig.2.3.7. The technique of FIFO queueing of TCBs.

The Task Library is contained in a file on the disk, located in a software-protected region (Fig.2.3.8.). This file is composed by the TASK BUILDER program out of pre-assembled binary files.

Fig.2.3.8. Layout of the Task Library.

Because there exists no protection among tasks and between tasks and monitor, it is most hazardous to insert a newly developed task into the system. The present Task Builder consequently only allows tasks to be inserted when MULTI-8 is not running. For the development of tasks it would be convenient to have some means of running the task in a protected environment (a pseudo MULTI-8 in the background) or under stringent supervision (a tracer in the foreground). Such a provision has not yet been realized. In future, however, a facility will be provided to insert (friendly) tasks into the system while it runs.


CHAPTER THREE - THE BACKGROUND SUPPORT SUBSYSTEMS

The background is the environment in which large (8K) programs run in the time not used by either foreground or interrupt environment. A protection mechanism must be available to protect the foreground from corruption by background programs, especially when program development is allowed. However, a means of communication between background and foreground should be established too, in order to exchange commands and data.

On a PDP8, having an address space of only 12 bits = 4096 words (1 FIELD), protection can effectively be reached by 'trapping' the special instructions that provide for 'extended' memory operation. Since I/O operations should be controlled by the operating system, all I/O instructions (IOTs) need to be trapped too. This is performed by a very simple circuit which for the mentioned class of instructions causes an interrupt while inhibiting the execution of the trapped instruction. Such a trap is a standard DEC option available on almost all PDP8 computers. The monitor upon recognition of the instruction 'emulates'(*) it in the foreground environment. The amount of code necessary depends on the complexity of the function to be performed, the peculiarities of the I/O device and its being shared or not. The ideal situation would be to emulate the trapped instructions so precisely that the background program runs in a virtual copy of a PDP8 computer. However, since each IOT takes a hundred or more instructions to be emulated, background programs will be slowed down in case IOTs occur quite often. Thus for the sake of speed we have deviated from the criterion of preciseness here and there. This approach of emulating every PDP8 I/O instruction (IOT) enables one to run almost all existing PDP8 software on the new system. It has been demonstrated that emulation could be speeded up by a hardware device, allowing certain harmless instructions to be executed by the processor after all (MEES, ANTHONI 1972). This particularly holds true for the extended memory instructions, which occur very frequently in 8K programs. For details with respect to the implementation of these hardware modifications the reader is referred to (HEMELAAR, ANTHONI 1975). Fig.5.2.1. illustrates the gain in performance for some typical background programs.

We would like to use this chapter to describe the structure and some implementation details of the background support subsystem, which has been implemented by foreground tasks. The background support subsystem consists of four parts:

1. The emulator.
2. The background utilities.
3. The background scheduler.
4. The blockdriver tasks.


BACKGROUND SUPPORT - emulator

3.1 THE EMULATOR SUBSYSTEM

The heart of the emulator subsystem is the Central Emulator (Fig.3.1.1.).

Fig.3.1.1. Simplified structure of the Central Emulator.

When a trap interrupt has been detected, the processor registers are saved in the Background Control Table (Fig.3.1.3.). Now a rather exceptional transition from interrupt environment to foreground environment is made for the sake of speed. This is allowed since the trap interrupt by definition occurs when the processor runs in background mode. Thus no foreground task runs either. The emulator can simply declare itself the running foreground task. From this time on, the Central Emulator runs as a normal foreground task, able to use all monitor facilities and to CALL other tasks.

Upon recognition of the trapped instruction, various actions may follow:

1. In-line core resident emulation (for the sake of speed).
3. CALL a disk resident emulator task.
4. Ignore the instruction (NOP).
5. Evoke an error typeout at the correct console and stop the corresponding background job.

The most important actions are table-driven so that the user can easily adapt them to his specific needs. He is for example able to assign a task (name) to a certain I/O instruction, which can very well be used as a means of communication between background and foreground. By assigning an unused IOT (of which there are many) to a data-acquisition task, the background program can specify sampling parameters as arguments(*) following the IOT. The acquisition task then stops the background program and starts sampling input channels as specified. The data can either be stored into an array of the background program or in a specified file on the disk. Upon completion of the sampling process the background program is resumed.
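The table-driven dispatch can be sketched as follows (an illustrative Python model; the IOT opcodes, task name and action set are hypothetical, and a dictionary of callbacks stands in for the monitor's dispatch table):

```python
# Sketch of table-driven trap handling: the action for each trapped IOT
# is looked up in a user-adjustable table.
def emulate_inline(state, iot):
    state["emulated"].append(iot)       # core resident in-line emulation

def nop(state, iot):
    pass                                # harmless instruction: ignore

def call_task(name):
    def action(state, iot):             # CALL a foreground emulator task
        state["called"].append((name, iot))
    return action

DISPATCH = {
    0o6046: emulate_inline,             # e.g. punch output
    0o6007: nop,
    0o6300: call_task("ACQUIR"),        # unused IOT assigned to a task
}

def trap(state, iot):
    action = DISPATCH.get(iot)
    if action is None:                  # unknown IOT: error typeout, stop job
        state["stopped"] = True
    else:
        action(state, iot)
```

Because the table is data rather than code, reassigning an unused IOT to a new foreground task is a one-entry change.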

Another example is to use the available background input/output channels (punch, reader, printer, card reader etc.) for a similar application. The main advantage of this method is that large parts of the application program can be written in any high-level language, and that the high-level language need not be modified. A number of similar solutions is not only conceivable but also practicable.

The set of emulator tasks has been organised according to the structure depicted in Fig.3.1.2. In this drawing a line represents a relationship between two objects.

Fig.3.1.2. Inter-relationship between emulator tasks.

The Central Emulator and a number of emulator routines and tasks have been made reentrant. This has been achieved for various resources in various ways. Each background job has its own incarnation of the Central Emulator with its own TCB. Thus the TCB pointers of the various Central Emulator tasks are different. Consequently an emulator task claimed (or busy) for one Central Emulator will be blocked for all others. This guarantees undisturbed sequential access to a claimed resource.

Since the disk is a heavily shared resource, it should be made reentrant in some way. This has been achieved by accumulating all disk instructions from the background into a single disk request, stored in the BCT (Fig.3.1.3.). During the actual disk transfer the background is still in emulation mode which means that it can not be swapped out. The information from the disk is thus deposited in the body of the background program.

DECtape IOTs have proved to be very tricky and elaborate to emulate. Fortunately in OS8 almost all programs use the OS8 DECtape handler. This enabled us to rewrite the handler so that it behaves as normal when not running under MULTI-8 (however occupying 2 pages instead of 1), but issues a 'giant' IOT (see below) when running under MULTI-8. This giant IOT is handled by the DECtape emulator task, which executes the complete transfer. During the search for the correct DECtape block, the background job is swapped out to allow the other background job(s) to use the cpu. The background job is swapped in again when the tape has been positioned just before the correct block.

Note that background programs are able to detect whether they are running under MULTI-8 or not. An unused IOT, now acronymed SMS (SKIP IF MULTI-8), skips the next instruction when running under MULTI-8, and is a no-operation (NOP) under normal circumstances.
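Seen from the emulator's side, the SMS trap amounts to advancing the background's saved program counter by one extra word. The following C fragment sketches this; the octal IOT code 6405 and the function name are assumptions for illustration only:

```c
#include <assert.h>

/* Hypothetical IOT code for SMS (SKIP IF MULTI-8); the real code is
 * not given in the text, so 6405 (octal) is an assumption. */
#define SMS_IOT 06405

/* Sketch of the Central Emulator's handling of a trapped IOT: for
 * SMS the background's saved program counter is advanced one extra
 * word, realising the skip.  On a bare machine the unused IOT is a
 * NOP, so the very same background program falls through instead. */
int emulate_iot_pc(int iot, int pc)
{
    if (iot == SMS_IOT)
        return pc + 1;  /* skip the next instruction under MULTI-8 */
    return pc;          /* any other IOT: no extra skip            */
}
```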

Now that the emulator is in full control of input and output operations, it is also able to deviate from exact PDP-8-like execution for the sake of speed, or to provide more facilities to the background user, such as:

1. Recognising special I/O instructions, e.g. for reading the time, etc. A very special one is the 'release all claims' IOT, which releases all used resources. The point is that neither the OS8 operating system nor conventional PDP8 programs have been written with a view to timesharing. Consequently they do not request and close their resources. Because in MULTI-8 resources are claimed as soon as the first IOT for that resource occurs, a provision had to be made to release (close) the resource after use. This has been achieved by including a 'release-all-claims' IOT in strategic places of the OS8 monitor and in some background programs. Thus resources are allocated and released automatically.

2. Providing buffering of input and output devices, which particularly pays off in timesharing. This buffering can be extended to SPOOLing (*) for e.g. lineprinter output.

3. Providing extensive communication facilities by a single IOT. These IOTs have been named 'giant' IOTs. For example, DECtape transfers occur through a single 'giant' IOT. The 'giant' IOT also enables the background user to 'call' a foreground task. The value held in the AC is an index in a list of task names.

4. Removing excessive IOTs by replacing them in the background's program code, in order to decrease the number of traps.
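The implicit claim-on-first-IOT discipline and the 'release all claims' IOT of point 1 can be sketched as follows; the claim table and all names are our own illustration, not MULTI-8 code:

```c
#include <assert.h>

#define NDEV 8  /* illustrative number of claimable devices */

/* Illustrative claim table: owner[d] holds the number of the job
 * that has claimed device d, or -1 when the device is free. */
static int owner[NDEV];

void init_claims(void)
{
    for (int d = 0; d < NDEV; d++)
        owner[d] = -1;
}

/* A device is claimed implicitly by the first IOT a job issues for
 * it; a job whose IOT hits a device claimed by another job blocks. */
int claim_on_first_iot(int dev, int job)
{
    if (owner[dev] == -1)
        owner[dev] = job;        /* first use: claim the device */
    return owner[dev] == job;    /* 0 means the job is blocked  */
}

/* The 'release all claims' IOT: free every device held by the job. */
void release_all_claims(int job)
{
    for (int d = 0; d < NDEV; d++)
        if (owner[d] == job)
            owner[d] = -1;
}
```

Planting the release call at strategic places in the monitor makes allocation and release fully automatic, exactly because the claim itself was already implicit in the first IOT.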

Each background job has its own Background Control Table (BCT), containing its status, registers, disk parameters, emulation parameters, storage locations for the reentrant programs, and a teletype input and output buffer. Successive BCTs are circularly linked for the purpose of round-robin (*)
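The circular linking of the BCTs can be sketched as follows; the struct fields and the next_job helper are illustrative and not taken from the MULTI-8 source:

```c
#include <assert.h>

/* Illustrative sketch of circularly linked Background Control
 * Tables; fields and names are invented here, not MULTI-8's. */
struct bct {
    int job;           /* background job number                  */
    int runnable;      /* 0 while the job is blocked/swapped out */
    struct bct *next;  /* circular link to the next BCT          */
};

/* Round-robin selection: walk the ring from the current job and
 * return the first runnable one (the current job itself when no
 * other job is runnable). */
struct bct *next_job(struct bct *cur)
{
    struct bct *p = cur->next;
    while (p != cur && !p->runnable)
        p = p->next;
    return p->runnable ? p : cur;
}
```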
