Turing machines and computer viruses

(1)


Luite van Zelst

Institute for Logic, Language and Computation

University of Amsterdam

3rd International Workshop on the Theory of Computer Viruses, 2008

(2)

1 Motivation

2 Turing machines

Modern Turing machines

Turing’s original machine

3 Computer viruses

Interpreted sequences

Cohen’s viruses

(3)

“A virus may be loosely defined as a sequence of symbols which, upon interpretation, causes other sequences of symbols to contain (possibly evolved) virus(es).” (Fred Cohen)

(4)

The definition

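Let $M$ be a Turing machine and $V \subseteq \Sigma^*$; then $\langle M, V \rangle \in VS$ if

\begin{align}
& \forall v \in V,\ h \in H_M \tag{1} \\
& \text{if } \exists\, n_1 < \omega \tag{2} \\
& \qquad \wedge\ h(0) = \langle s_0, \_, \_ \rangle \tag{3} \\
& \qquad \wedge\ h(n_1) = \langle s_0, t_1, p_1 \rangle \tag{4} \\
& \qquad \wedge\ t_1[p_1, |v|] = v \tag{5} \\
& \text{then } \exists\, v' \in V,\ n_2 < \omega,\ \mathit{pos} < \omega \tag{6} \\
& \qquad \wedge\ h(n_2) = \langle \_, t_2, \_ \rangle \tag{7} \\
& \qquad \wedge\ t_2[\mathit{pos}, |v'|] = v' \tag{8} \\
& \qquad \wedge\ \bigl(\, \mathit{pos} \geq p_1 + |v| \tag{9} \\
& \qquad \qquad \vee\ p_1 \geq \mathit{pos} + |v'| \,\bigr) \tag{10} \\
& \qquad \wedge\ \exists\, n_3 < \omega \tag{11} \\
& \qquad \qquad \wedge\ n_1 < n_3 < n_2 \tag{12} \\
& \qquad \qquad \wedge\ h(n_3) = \langle s_3, t_3, p_3 \rangle \tag{13} \\
& \qquad \qquad \wedge\ \mathit{pos} \leq p_3 \leq \mathit{pos} + |v'| \tag{14}
\end{align}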
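To make the quantifier structure of conditions (2) to (14) concrete, here is a minimal sketch that checks them on a finite prefix of a single history. It is only a bounded approximation: the definition quantifies over all histories and all times below ω. The names `viral_on_prefix`, `has_witness` and `read`, the sparse-dictionary tape, and the `□` glyph for an empty square are illustrative assumptions. The sketch also assumes that the tape in (8) is t₂, that the position in (14) is p₃, and that the strings in V are non-empty and contain no empty squares.

```python
from typing import Dict, List, Set, Tuple

# A configuration is (state, tape, head position). The tape is a sparse dict
# from square index to symbol (an illustrative representation, not the slides').
Config = Tuple[str, Dict[int, str], int]

def read(tape: Dict[int, str], pos: int, length: int) -> str:
    """Return t[pos, length]: the `length` symbols starting at square `pos`."""
    return "".join(tape.get(i, "□") for i in range(pos, pos + length))

def has_witness(h: List[Config], V: Set[str], v: str, n1: int, p1: int) -> bool:
    """Search the prefix for v', n2, pos and n3 satisfying (6) to (14)."""
    for n2 in range(n1 + 2, len(h)):          # leaves room for n1 < n3 < n2
        t2 = h[n2][1]
        upper = max(t2, default=0) + 1        # one past the last written square
        for v2 in V:
            for pos in range(upper):
                if read(t2, pos, len(v2)) != v2:                      # (8)
                    continue
                if not (pos >= p1 + len(v) or p1 >= pos + len(v2)):   # (9), (10)
                    continue
                if any(pos <= h[n3][2] <= pos + len(v2)               # (14)
                       for n3 in range(n1 + 1, n2)):                  # (12)
                    return True
    return False

def viral_on_prefix(h: List[Config], V: Set[str], s0: str = "s0") -> bool:
    """True if every trigger (2) to (5) in the prefix has a witness in the prefix."""
    if not h or h[0][0] != s0:                # (3) fails, so nothing to check
        return True
    for v in V:
        for n1, (s1, t1, p1) in enumerate(h):
            if s1 == s0 and read(t1, p1, len(v)) == v:                # (4), (5)
                if not has_witness(h, V, v, n1, p1):
                    return False
    return True
```

A `False` result only says that no witness occurs within the given prefix; a longer prefix of the same history may still contain one.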

(5)

[Figure: a Turing machine (TM) with its head on a tape that reads M Y V I R U S . . .]

Contiguous sequences (strings)

Any substring on the tape

Uses a special flavour of Turing machines


(7)

Depicted

[Figure: three snapshots of the tape, showing state $s_0$ and $v$ at time $n_1$, $v'$ at time $n_2$, and $v'$ at time $n_3$.]

(8)

Literature

Thimbleby et al. in 1998: A Framework for Modelling Trojans and Computer Virus Infection

Mäkinen in 2001: Comment on ‘A Framework for Modelling . . . ’

(9)

How are Turing machines defined precisely?

How are ‘interpreted sequences’ defined?

(10)

Davis (1958), Minsky (1967), Hopcroft et al. (1979):

Turing machine computes a function:

input → M → output

Turing (1936), On computable numbers, with an application to the Entscheidungsproblem:

Machine that computes an infinite sequence:

M → 1 1 0 0 1 1 1 1 0 0 0 0 1 1 1 1 . . .


(12)

Infinite tape

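Infinite tape: $t : \omega \to \Sigma$

Finite content: $\Box \notin \Sigma$ represents an empty square

Infinite tape: $t : \omega \to \Sigma \mathbin{\dot\cup} \{\Box\}$

Pure content: $P^\Sigma$

[Figure: tape 0 1 0 1 0 . . . , labelled Σ]

$P^\Sigma$ in one-one correspondence with $\Sigma^*$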


(16)

Definition

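Structure $\langle Q, \Sigma, tr, q_0 \rangle$ where

$Q$ is a finite set of states

$\Sigma$ is a finite set of tape symbols

$q_0$ is the starting state

$tr$ is a transition function such that

$tr : Q \times (\Sigma \mathbin{\dot\cup} \{\Box\}) \to Q \times \Sigma \times \{-1, 0, 1\}$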

(17)

Moves

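Configurations: $\langle s, t, p \rangle$ where

state: $s \in Q$

tape: $t : \omega \to \Sigma \mathbin{\dot\cup} \{\Box\}$

position: $p < \omega$

Moves: $\langle s, t, p \rangle \hookrightarrow \langle s', t', p' \rangle$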
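A single move can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the formalisation used in the talk: the tape is a sparse dictionary, `BLANK` stands for the empty square, a missing table entry means that no move exists, and a move to the left of square 0 leaves the head where it is (the slides do not say what happens there).

```python
from typing import Dict, Optional, Tuple

BLANK = "□"  # assumed glyph for the empty square; it is not a member of Σ

# tr maps (state, scanned symbol) to (new state, written symbol, move)
Transition = Dict[Tuple[str, str], Tuple[str, str, int]]
Config = Tuple[str, Dict[int, str], int]      # ⟨s, t, p⟩ with a sparse tape

def step(tr: Transition, config: Config) -> Optional[Config]:
    """Perform one move ⟨s, t, p⟩ ↪ ⟨s', t', p'⟩, or return None if no move exists."""
    s, t, p = config
    key = (s, t.get(p, BLANK))
    if key not in tr:
        return None
    s2, written, move = tr[key]
    t2 = dict(t)                 # copy, so earlier configurations stay intact
    t2[p] = written              # tr only yields symbols of Σ, never the blank
    return (s2, t2, max(p + move, 0))   # assumption: the head stays at the left end
```

Because `step` copies the tape, a list of successive results can serve directly as a (finite prefix of a) history.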

(18)

Computation

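Computations: $\to_M$ is a binary relation on infinite tapes $t : \omega \to \Sigma \mathbin{\dot\cup} \{\Box\}$

$t \to_M t' \iff \langle s, t, p \rangle \hookrightarrow^n \langle s', t', p' \rangle \not\hookrightarrow$

Condition 1: the machine starts with $\langle s_0, t \in P^B, 0 \rangle$

Condition 2: the machine does not write $\Box$:

$\to_M \subseteq (P^B)^2$

$\to_M \subseteq (\Sigma^*)^2$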


(21)

Semantics

Basic: $|M| : \Sigma^* \to \Sigma^*$

Function on naturals: encode input and output

Representing all functions: one extra ‘erasure’ symbol
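The basic semantics can be illustrated by running a machine on a word until no move applies and reading off the tape. The sketch below assumes a transition table given as a dictionary, the glyph `□` for the empty square, a starting state called `q0`, and a step bound `max_steps` that stands in for non-termination; none of these names come from the slides. `APPEND_ONE` is a hypothetical example machine.

```python
from typing import Dict, Optional, Tuple

BLANK = "□"  # assumed glyph for the empty square
Transition = Dict[Tuple[str, str], Tuple[str, str, int]]

def run(tr: Transition, word: str, q0: str = "q0",
        max_steps: int = 10_000) -> Optional[str]:
    """Return |M|(word), or None if M has not halted within max_steps."""
    tape = dict(enumerate(word))      # pure content: the word, then only blanks
    state, pos = q0, 0                # Condition 1: start in q0 at square 0
    for _ in range(max_steps):
        key = (state, tape.get(pos, BLANK))
        if key not in tr:             # no move applies: the machine halts
            return "".join(tape[i] for i in range(len(tape)))
        state, written, move = tr[key]
        tape[pos] = written           # written ∈ Σ, so no blank is ever written
        pos = max(pos + move, 0)      # assumption: the head stays at the left end
    return None

# Hypothetical example machine: skip to the first blank and write a 1 there.
APPEND_ONE: Transition = {
    ("q0", "0"): ("q0", "0", 1),
    ("q0", "1"): ("q0", "1", 1),
    ("q0", BLANK): ("q3", "1", 0),
}
print(run(APPEND_ONE, "01"))   # 011
```

Since every move writes a symbol of Σ on the scanned square, the written part of the tape stays contiguous, which is why the result can be read back as a word in Σ*.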

(22)

Computable Numbers

In 1936, Turing wrote: On computable numbers, with an application to the Entscheidungsproblem.

Real numbers

Binary expansion: π = 11.001001000011111101 . . .

Non-integer part: infinite sequence

Computable numbers

Binary expansion written by a machine → Computable sequences
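As a small illustration of a binary expansion being written digit by digit, the generator below produces the non-integer part of a rational number p/q by long division in base 2. It is a sketch for rationals only; the name `binary_fraction` is an assumption, and a number such as π would need a different digit-producing procedure.

```python
from itertools import islice
from typing import Iterator

def binary_fraction(p: int, q: int) -> Iterator[int]:
    """Yield the binary digits of the fractional part of p/q, one at a time, forever."""
    r = p % q                 # discard the integer part
    while True:
        r *= 2                # shift one binary place to the left
        yield r // q          # the next digit is the new integer part
        r %= q

print(list(islice(binary_fraction(1, 3), 8)))   # [0, 1, 0, 1, 0, 1, 0, 1]
```

The generator never terminates on its own, in the same way that a machine computing a sequence keeps writing; `islice` only inspects a finite prefix.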


(25)

Conditions

Mark the left-hand side

Figures: ‘output’

Auxiliaries: ‘notes’

F-squares

A contiguous sequence of figures

Not erasable (‘write-once’)

E-squares

A kind of scratchpad

(26)

Conditions

Tape with only the left-hand marks: B B

Figures written on the F-squares: B B 0 1 0 1 1

With auxiliaries on the E-squares in between: B B 0 ∗ 1 $ 0 ∗ 1 ∗ 1 $
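The relation between the tape and the computed sequence can be sketched as follows. The function assumes the layout shown on the slide: two marks at the left-hand side, followed by alternating F-squares and E-squares, starting with an F-square. The name `computed_sequence` and the list-of-symbols tape are illustrative.

```python
from typing import List

def computed_sequence(tape: List[str], marks: int = 2) -> str:
    """Read the figures off the F-squares of a tape laid out as on the slide."""
    body = tape[marks:]            # skip the marks at the left-hand side
    return "".join(body[0::2])     # F-squares alternate with E-squares

tape = "B B 0 ∗ 1 $ 0 ∗ 1 ∗ 1 $".split()
print(computed_sequence(tape))     # 01011
```

The E-squares (`body[1::2]`) are ignored: whatever the machine notes there is not part of the sequence it computes.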

(31)

Computed sequence

Computable sequence

0 1 0 1 1 0 1 1 1 0 . . .

Function output

Σ

0 1 0 1 1 0 1 1 1 0 . . .


(34)

On Turing machines

[Figure: a Turing machine (TM) with its head on a tape that reads M Y V I R U S . . .]

Can we define a virus as a (contiguous) ‘sequence of symbols’ that is ‘interpreted’ by a Turing machine?


(36)

For modern Turing machines

Turing machine: computes a function fully determined by its transition function

Universal machine:

Computes the universal function

Computes the function of some other machine

Defining a universal machine:

Encode the transition function: ‘program’

Encode the input to this machine: ‘input’

Encodings: injective and therefore decodable
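The ‘program’ and ‘input’ encoding can be illustrated with a small sketch. It is one possible encoding, chosen for brevity, and not the one used by any particular universal machine: the transition table is serialised as JSON, and the hypothetical separator `#` is assumed not to occur in state names or tape symbols. The function `universal` then decodes a code and simulates the encoded machine, assuming the starting state is called `q0` and `□` denotes the empty square.

```python
import json
from typing import Dict, Optional, Tuple

Transition = Dict[Tuple[str, str], Tuple[str, str, int]]
SEP = "#"   # hypothetical separator; assumed absent from states and symbols

def encode(tr: Transition, word: str) -> str:
    """Encode a 'program' (transition table) and its 'input' as one string."""
    rows = sorted([s, a, s2, b, m] for (s, a), (s2, b, m) in tr.items())
    return json.dumps(rows) + SEP + word

def decode(code: str) -> Tuple[Transition, str]:
    """Invert encode: split at the first separator and rebuild the table."""
    program, word = code.split(SEP, 1)
    tr = {(s, a): (s2, b, m) for s, a, s2, b, m in json.loads(program)}
    return tr, word

def universal(code: str, max_steps: int = 10_000) -> Optional[str]:
    """Simulate the encoded machine on the encoded input (bounded sketch)."""
    tr, word = decode(code)
    tape, state, pos = dict(enumerate(word)), "q0", 0
    for _ in range(max_steps):
        key = (state, tape.get(pos, "□"))
        if key not in tr:             # no move applies: the machine halts
            return "".join(tape[i] for i in range(len(tape)))
        state, written, move = tr[key]
        tape[pos] = written
        pos = max(pos + move, 0)      # assumption: the head stays at the left end
    return None
```

Sorting the rows makes the code independent of dictionary order, so each machine has exactly one code, and `decode(encode(tr, w))` returns `(tr, w)`. Strings that are not of this form are simply not valid codes, which is the kind of restriction the slide refers to as specifying ‘valid’ encodings.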

(37)

For modern Turing machines

Without specifying ‘valid’ encodings

Any machine ‘interprets’ any input

Empty string ‘encodes’ the machine itself

Entire input ‘encodes’ a constant function

Just one program

Interleaving of ‘program’, ‘input’ and simulated tape and temporary symbols

Not every substring of the (total) input is interpreted

(38)

For Turing’s original machines

Turing’s original machine:

no input

no interpreted sequences

Turing’s universal machine:

itself unlike Turing’s original machines

the entire input is the encoding of exactly one machine

F- and E-squares: program is not a contiguous sequence on the tape


(40)

Model at least two programs

Non-program cannot be a virus

Standard models inadequate

New (universal) Turing machine?

No benefit: non-standard


(42)

tape: $\omega \to \Sigma$

infinite tape with infinite content

compare with $t : \omega \to \Sigma \mathbin{\dot\cup} \{\Box\}$

starting state & position undefined

transition function unrestricted

$tr : K \times \Sigma \to K \times \Sigma \times \{-1, 0, 1\}$

even with $\Box \in \Sigma$ finite content undecidable


(45)

Not a modern Turing machine

Not Turing’s original (universal) machine

All sequences trivially computable

No distinction between figures and auxiliaries

No distinction between F-squares and E-squares

(46)

Can the difference between Cohen’s machine and modern Turing machines be overcome?

Viral equivalence: $M \equiv_{vir} N$ if:

$\langle M, V \rangle \in VS \iff \langle N, V \rangle \in VS$

Viral equivalence is incomparable to functional equivalence

No, it cannot.

(47)

Precise definitions essential for Turing machines

Turing machines are inappropriate to model viruses

Cohen’s modelling non-standard

Outlook

Open door for other modellings of computer viruses

Dissect Turing’s machine and unify the two Turing machine models: ‘Talkative Machine’ (TM)

(48)

Thank you for your attention!

(49)

[Figure: two diagrams relating the machines M and N, each with an input (in), a tape t and an output (out) of M.]

(50)

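$\equiv_{vir} \not\Rightarrow \equiv_{func}$

$M = \langle Q, \Sigma, tr, q_0 \rangle$

$N = \langle Q, \Sigma, tr', q_0 \rangle$

$\Sigma = \{a, b\}$

$Q = \{q_0, q_1\}$

$tr(q_0, \_) = \langle q_1, a, 0 \rangle$

$tr'(q_0, \_) = \langle q_1, b, 0 \rangle$

Virally equivalent (trivially)

Not functionally equivalent:

$|M|_{func} = t \mapsto t[0 \mapsto a]$

$|N|_{func} = t \mapsto t[0 \mapsto b]$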
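The functional half of this example is easy to check mechanically. The sketch below builds both transition tables and runs them on the same tape; the helper names `make_tr` and `func`, the dictionary tape and the glyph `□` for the empty square are assumptions made for illustration. It says nothing about viral equivalence, which concerns whole histories rather than results.

```python
from typing import Dict, Tuple

Transition = Dict[Tuple[str, str], Tuple[str, str, int]]

def make_tr(symbol: str) -> Transition:
    """tr(q0, _) = ⟨q1, symbol, 0⟩ for every scanned symbol, the blank included."""
    return {("q0", scanned): ("q1", symbol, 0) for scanned in ("a", "b", "□")}

def func(tr: Transition, word: str) -> str:
    """Run from ⟨q0, word, 0⟩ until no move applies and return the tape content."""
    tape, state, pos = dict(enumerate(word)), "q0", 0
    while (state, tape.get(pos, "□")) in tr:
        state, written, move = tr[(state, tape.get(pos, "□"))]
        tape[pos] = written
        pos = max(pos + move, 0)
    return "".join(tape[i] for i in range(len(tape)))

M, N = make_tr("a"), make_tr("b")
print(func(M, "bb"), func(N, "bb"))   # ab bb
```

Both machines halt after a single move because no transition leaves `q1`; the `while` loop has no step bound and would not terminate for a machine that runs forever.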

(51)

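$\equiv_{func} \not\Rightarrow \equiv_{vir}$

Two machines that compute $x \mapsto x \cdot 1$.

[Figure: state diagram of the first machine, with states $q_0$ and $q_3$ and transitions labelled “any,any,R” and “□,1,N”.]

[Figure: state diagram of the second machine, with states $q_0$, $q_1$, $q_2$ and $q_3$ and transitions labelled “any,any,R”, “□,0,N”, “0,0,N” and “0,1,N”.]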

(52)

M. Davis. Computability and Unsolvability. McGraw-Hill, 1958.

M. Minsky. Computation: Finite and Infinite Machines. Prentice-Hall, 1967.

J. Hopcroft and J. Ullman. Introduction to Automata Theory, Languages, and Computation. Addison Wesley, 1979.

(53)

B.J. Copeland. The Essential Turing: Seminal Writings in Computing, Logic, Philosophy, Artificial Intelligence, and Artificial Life plus the Secrets of Enigma. Oxford University Press, 2004.

H. Thimbleby, S. Anderson and P. Cairns. A Framework for Modelling Trojans and Computer Virus Infection. Computer Journal, 41: 444-458, 1998.

(54)

E. Mäkinen. Comment on ‘A framework for modelling trojans and computer virus infection’. Computer Journal, 44(4): 321-323, 2001.