
Volume 2007, Article ID 78943, 20 pages, doi:10.1155/2007/78943

Review Article

Protection and Retrieval of Encrypted Multimedia Content: When Cryptography Meets Signal Processing

Zekeriya Erkin,1 Alessandro Piva,2 Stefan Katzenbeisser,3 R. L. Lagendijk,1 Jamshid Shokrollahi,4 Gregory Neven,5 and Mauro Barni6

1 Electrical Engineering, Mathematics, and Computer Science Faculty, Delft University of Technology, 2628 CD, Delft, The Netherlands
2 Department of Electronics and Telecommunication, University of Florence, 50139 Florence, Italy
3 Information and System Security Group, Philips Research Europe, 5656 AE, Eindhoven, The Netherlands
4 Department of Electrical Engineering and Information Sciences, Ruhr-University Bochum, 44780 Bochum, Germany
5 Department of Electrical Engineering, Katholieke Universiteit Leuven, 3001 Leuven, Belgium
6 Department of Information Engineering, University of Siena, 53100 Siena, Italy

Correspondence should be addressed to Zekeriya Erkin, z.erkin@tudelft.nl

Received 3 October 2007; Revised 19 December 2007; Accepted 30 December 2007

Recommended by Fernando Pérez-González

The processing and encryption of multimedia content are generally considered sequential and independent operations. In certain multimedia content processing scenarios, it is, however, desirable to carry out processing directly on encrypted signals. The field of secure signal processing poses significant challenges for both signal processing and cryptography research; only a few ready-to-go, fully integrated solutions are available. This study first concisely summarizes cryptographic primitives used in existing solutions to the processing of encrypted signals, and discusses the implications of the security requirements on these solutions. The study then continues to describe two domains in which secure signal processing has been taken up as a challenge, namely, analysis and retrieval of multimedia content, as well as multimedia content protection. In each domain, state-of-the-art algorithms are described. Finally, the study discusses the challenges and open issues in the field of secure signal processing.

Copyright © 2007 Zekeriya Erkin et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

1. INTRODUCTION

In the past few years, the processing of encrypted signals has emerged as a new and challenging research field. The combination of cryptographic techniques and signal processing is not new. So far, encryption was always considered as an add-on after signal manipulations had taken place (see Figure 1). For instance, when encrypting compressed multimedia signals such as audio, images, and video, first the multimedia signals were compressed using state-of-the-art compression techniques, and next encryption of the compressed bit stream using a symmetric cryptosystem took place. Consequently, the bit stream must be decrypted before the multimedia signal can be decompressed. An example of this approach is JPSEC, the extension of the JPEG2000 image compression standard. This standard adds selective encryption to JPEG2000 bit streams in order to provide secure scalable streaming and secure transcoding [1].

In several application scenarios, however, it is desirable to carry out signal processing operations directly on encrypted signals. Such an approach is called secure signal processing, encrypted signal processing, or signal processing in the encrypted domain. For instance, given an encrypted image, can we


Figure 1: Separate processing and encryption of signals: x(n) → process (compress) → encrypt → channel → decrypt → process (decompress) → x(n).

The security requirements of signal processing in encrypted domains depend strongly on the considered application. In this survey paper, we take an application-oriented view on secure signal processing and give an overview of published applications in which the secure processing of signal amplitudes plays an important role. In each application, we show how signal processing algorithms and cryptosystems are brought together. It is not the purpose of the paper to describe either the signal processing algorithms or the cryptosystems in great detail, but rather to focus on possibilities, impossibilities, and open issues in combining the two. The paper includes many references to literature that contains more elaborate signal processing algorithms and cryptosystem solutions for the given application scenario. It is also crucial to state that the scenarios in this survey can be implemented more efficiently by using trusted third entities. However, it is not always easy to find trusted entities with high computational power, and even if one is found, it is not certain that it is applicable in these scenarios. Therefore, trusted entities either do not exist or play little role in the scenarios discussed in this paper.

In this paper, we will survey applications that directly manipulate encrypted signals. When scanning the literature on secure signal processing, it becomes immediately clear that there are currently two categories under which the secure signal processing applications and research can be roughly classified, namely, content retrieval and content protection. Although the security objectives of these application categories differ quite strongly, similar signal processing considerations and cryptographic approaches show up. The common cryptographic primitives are addressed in Section 2. This section also discusses the need for clearly identifying the security requirements of the signal processing operations in a given scenario. As we will see, many of the approaches for secure signal processing are based on homomorphic encryption, zero-knowledge proof protocols, commitment schemes, and multiparty computation. We will also show that there is ample room for alternative approaches to secure signal processing towards the end of Section 2. Section 3 surveys secure signal processing approaches that can be classified as "content retrieval," among them secure clustering and recommendation problems. Section 4 discusses problems of content protection, such as secure watermark embedding and detection. Finally, Section 5 concludes this survey paper on secure protection and retrieval of encrypted multimedia content.

2. ENCRYPTION MEETS SIGNAL PROCESSING

2.1. Introduction

The capability to manipulate signals in their encrypted form is largely thanks to two assumptions on the encryption strategies used in all applications discussed. In the first place, encryption is carried out independently on individual signal samples. As a consequence, individual signal samples can be identified in the encrypted version of the signal, allowing for processing of encrypted signals on a sample-by-sample basis. If we represent a one-dimensional (e.g., audio) signal X that consists of M samples as

X = [x_1, x_2, x_3, ..., x_{M-1}, x_M]^T, (1)

where x_i is the amplitude of the ith signal sample, then the encrypted version of X using key k is given as

E_k(X) = [E_k(x_1), E_k(x_2), E_k(x_3), ..., E_k(x_{M-1}), E_k(x_M)]^T. (2)

Here the superscript "T" refers to vector transposition. Note that no explicit measures are taken to hide the temporal or spatial structure of the signal; however, the use of sophisticated encryption schemes that are semantically secure (as the one in [2]) achieves this property automatically.

Secondly, only public key cryptosystems are used that have particular homomorphic properties. The homomorphic property that these public key cryptographic systems provide will be concisely discussed in Section 2.2.1. In simple terms, the homomorphic property allows for carrying out additions or multiplications on signal amplitudes in the encrypted domain. Public key systems are based on the intractability of some computationally complex problems, such as

(i) the discrete logarithm in a finite field with a large (prime) number of elements (e.g., ElGamal cryptosystem [3]);
(ii) factoring large composite numbers (e.g., RSA cryptosystem [4]);
(iii) deciding if a number is an nth power in Z_N for large enough composite N (e.g., Paillier cryptosystem [2]).

It is important to realize that public key cryptographic systems operate on very large algebraic structures. This means that signal amplitudes x_i that were originally represented in 8-to-16 bits will require at least 512 or 1024 bits per signal sample in their encrypted form E_k(x_i). This data expansion is usually not emphasized in literature, but this may be an important hurdle for practical applicability of secure signal processing solutions. In some cases, however, several signal samples can be packed into one encrypted value in order to reduce the size of the whole encrypted signal by a linear factor [5].

A characteristic of signal amplitudes x_i is that they are


Table 1: Some (probabilistic) encryption systems and their homomorphisms.

Encryption system                            f1(·,·)          f2(·,·)
Multiplicatively Homomorphic El-Gamal [3]    Multiplication   Multiplication
Additively Homomorphic El-Gamal [13]         Addition         Multiplication
Goldwasser-Micali [14]                       XOR              Multiplication
Benaloh [15]                                 Addition         Multiplication
Naccache-Stern [16]                          Addition         Multiplication
Okamoto-Uchiyama [17]                        Addition         Multiplication
Paillier [2]                                 Addition         Multiplication
Damgård-Jurik [18]                           Addition         Multiplication

about the signal. Consequently, probabilistic encryption has to be used, where each encryption uses a randomization or blinding factor such that even if two signal samples x_i and x_j have the same amplitude, their encrypted values E_pk[x_i] and E_pk[x_j] will be different. Here, pk refers to the public key used upon encrypting the signal amplitudes. Public key cryptosystems are constructed such that the decryption uses only the private key sk, and that decryption does not need the value of the randomization factor used in the encryption phase. All encryption schemes that achieve the desired strong notion of semantic security are necessarily probabilistic.

Cryptosystems operate on (positive) integer values in finite algebraic structures. Although sampled signal amplitudes are normally represented in 8-to-16 bit (integer) values when they are stored, played, or displayed, intermediate signal processing operations often involve noninteger signal amplitudes. Work-arounds for noninteger signal amplitudes may involve scaling signal amplitudes with constant factors (say factors of 10 to 1000), but the unavoidable successive operations of rounding (quantization) and normalization by division pose significant challenges for being carried out on encrypted signal amplitudes.
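As an illustration of this work-around, the following minimal Python sketch (not part of the original work; the scaling factor is an arbitrary choice) shows how real-valued amplitudes can be mapped to integers before encryption, and why renormalization after multiplication is the problematic step.

```python
# Minimal sketch (not from the paper): fixed-point representation of
# non-integer amplitudes by scaling with a constant factor before encryption.
SCALE = 1000  # hypothetical scaling factor

def to_fixed_point(samples, scale=SCALE):
    """Quantize real-valued amplitudes to integers suitable for encryption."""
    return [round(s * scale) for s in samples]

def from_fixed_point(values, scale=SCALE):
    """Map integer results back to the real-valued domain."""
    return [v / scale for v in values]

signal = [0.731, -0.204, 1.569]
encoded = to_fixed_point(signal)               # [731, -204, 1569]
print(encoded, from_fixed_point(encoded))

# After multiplying two scaled values the scale factor is squared
# (0.731 * 1.569 appears as 731 * 1569 = 1146939 at scale 10^6); the
# rounding and the division by SCALE needed to renormalize cannot be
# carried out on ciphertexts, which is exactly the difficulty noted above.
```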

In Section 2.2, we first discuss four important cryptographic primitives that are used in many secure signal processing applications, namely, homomorphic encryption, zero-knowledge proof protocols, commitment schemes, and secure multiparty computation. In Section 2.3, we then consider the importance of scrutinizing the security requirements of the signal processing application. It is meaningless to speak about secure signal processing in a particular application if the security requirements are not specified. The security requirements as such will also determine the possibility or impossibility of applying the cryptographic primitives. As we will illustrate by examples—and also in more detail in the following sections—some application scenarios simply cannot be made secure because of the inherent information leakage by the signal processing operation, because of the limitations of the cryptographic primitives to be used, or because of constraints on the number of interactions between parties involved. Finally, in Section 2.4, we briefly discuss the combination of signal encryption and compression using an approach quite different from the ones discussed in Sections 3 and 4, namely, by exploiting the concept of coding with side information. We discuss this approach here to emphasize that although many of the currently existing application scenarios are built on the four cryptographic primitives discussed in Section 2.2, there is ample room for entirely different approaches to secure signal processing.

2.2. Cryptographic primitives

2.2.1. Homomorphic cryptosystems

Many signal processing operations are linear in nature. Linearity implies that multiplying and adding signal amplitudes are important operations. At the heart of many signal processing operations, such as linear filters and correlation evaluations, is the calculation of the inner product between two signals X and Y. If both signals (or segments of the signals) contain M samples, then the inner product is defined as

⟨X, Y⟩ = X^T Y = [x_1, x_2, ..., x_M] · [y_1, y_2, ..., y_M]^T = ∑_{i=1}^{M} x_i y_i. (3)

This operation can be carried out directly on an encrypted signal X and plain text signal Y if the encryption system used has the additive homomorphic property, as we will discuss next.

Formally, a "public key" encryption system E_pk(·) and its decryption D_sk(·) are homomorphic if those two functions are maps between the message group with an operation f_1(·,·) and the encrypted group with an operation f_2(·,·), such that, if x and y are taken from the message space of the encryption scheme, we have

f_1(x, y) = D_sk(f_2(E_pk(x), E_pk(y))). (4)


Another important consideration is the representation of the individual signal samples. As encryption schemes usually operate in finite modular domains (and all messages to be encrypted must be represented in this domain), a mapping is required which quantizes real-valued signal amplitudes and translates the signal samples of X into a vector of modular numbers. In addition to the requirement that the computations must not overflow, special care must be taken to represent negative samples in a way which is compatible with the homomorphic operation offered by the cryptosystem. For the latter problem, depending on the algebraic structure of the cipher, one may either encode the negative value −x by the modular inverse x^{−1} in the underlying algebra of the message space or avoid negative numbers entirely by using a constant additive shift.
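A minimal sketch of the two encodings mentioned above, under an assumed message space Z_N (the modulus and shift below are illustrative, not taken from any particular cryptosystem):

```python
# Minimal sketch with assumed parameters: encoding signed integers in a
# modular message space Z_N, either via the additive inverse (-x becomes
# N - x) or via a constant additive shift.
N = 2**64 + 13          # hypothetical modulus of the message space
SHIFT = 2**32           # hypothetical constant shift

def encode(x, n=N):
    """Map a signed integer into Z_n; -x is represented by its additive inverse."""
    return x % n

def decode(m, n=N):
    """Interpret values above n/2 as negative."""
    return m - n if m > n // 2 else m

def encode_shifted(x, shift=SHIFT, n=N):
    """Alternative: avoid negative numbers entirely by a constant shift."""
    return (x + shift) % n

print(decode(encode(-42)))                        # -42
print(decode((encode(-42) + encode(100)) % N))    # 58: additions survive the encoding
```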

In the context of the above inner product example, we require an additively homomorphic scheme (see Table 1). Hence, f_1 is the addition, and f_2 is a multiplication:

x + y = D_sk(E_pk(x) · E_pk(y)), (5)

or, equivalently,

E_pk(x + y) = E_pk(x) · E_pk(y). (6)

Note that the latter equation also implies that

E_pk(c · x) = (E_pk(x))^c (7)

for every integer constant c. Thus, every additively homomorphic cryptosystem also allows one to multiply an encrypted value with a constant that is available or known as clear text.

The Paillier cryptosystem [2] provides the required homomorphism if both addition and multiplication are considered as modular. The encryption of a message m under a Paillier cryptosystem is defined as

E_pk(m) = g^m r^N mod N^2, (8)

where N = pq, p and q are large prime numbers, g ∈ Z*_{N^2} is a generator whose order is a multiple of N, and r ∈ Z*_N is a random number (blinding factor). We then easily see that

E_pk(x) E_pk(y) = (g^x r_x^N)(g^y r_y^N) mod N^2 = g^{x+y} (r_x r_y)^N mod N^2 = E_pk(x + y). (9)
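To make (8)-(9) concrete, the following toy Python implementation of the Paillier cryptosystem (deliberately insecure, with small fixed primes) demonstrates the additive homomorphism and the constant multiplication of (7); it is a sketch, not a reference implementation.

```python
# A toy, insecure Paillier implementation (small fixed primes, assumed
# parameters) just to make the additive homomorphism of (8)-(9) concrete.
import math
import random

p, q = 10007, 10009          # toy primes; real systems use >= 1024-bit primes
N = p * q
N2 = N * N
g = N + 1                    # a standard choice of generator
lam = math.lcm(p - 1, q - 1) # Carmichael function lambda(N)

def L(u):
    return (u - 1) // N

mu = pow(L(pow(g, lam, N2)), -1, N)   # precomputed decryption constant

def encrypt(m):
    while True:
        r = random.randrange(2, N)    # blinding factor in Z*_N
        if math.gcd(r, N) == 1:
            break
    return (pow(g, m, N2) * pow(r, N, N2)) % N2

def decrypt(c):
    return (L(pow(c, lam, N2)) * mu) % N

x, y, c_const = 37, 18, 5
cx, cy = encrypt(x), encrypt(y)
print(decrypt((cx * cy) % N2))        # 55 = x + y, as in (9)
print(decrypt(pow(cx, c_const, N2)))  # 185 = c * x, as in (7)
print(encrypt(x) != encrypt(x))       # True: probabilistic encryption
```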

Applying the additive homomorphic property of the Paillier encryption system, we can evaluate (3) under the assumption that X is an encrypted signal and Y is a plain text signal:

E_pk(⟨X, Y⟩) = E_pk(∑_{i=1}^{M} x_i y_i) = ∏_{i=1}^{M} E_pk(x_i y_i) = ∏_{i=1}^{M} (E_pk(x_i))^{y_i}. (10)

Here, we implicitly assume that x_i, y_i are represented as integers in the message space of the Paillier cryptosystem, that is, x_i, y_i ∈ Z_N. However, (10) essentially shows that it is possible to compute an inner product directly in case one of the two vectors is encrypted. One takes the encrypted samples E_pk(x_i), raises them to the power of y_i, and multiplies all obtained values. Obviously, the resulting number itself is also in encrypted form. To carry out further useful signal processing operations on the encrypted result, for instance, to compare it to a threshold, another cryptographic primitive is needed, namely, zero-knowledge proof protocols, which are discussed in the next section.
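As a sketch of (10), the following Python fragment computes an inner product between an encrypted signal and a plaintext signal; it assumes the third-party python-paillier ("phe") package is available, whose EncryptedNumber objects express the "raise to the power and multiply" operations of (10) as scalar multiplications and additions.

```python
# Sketch of (10) using the python-paillier ("phe") package, assuming it is
# installed; Alice encrypts her signal, Bob holds the plaintext weights.
from phe import paillier

public_key, private_key = paillier.generate_paillier_keypair(n_length=1024)

x = [3, -1, 4, 1, -5]                        # Alice's signal (integer-valued)
y = [2, 7, 1, 8, 2]                          # Bob's plaintext signal

enc_x = [public_key.encrypt(v) for v in x]   # Alice sends these to Bob

# Bob: "raise to the power y_i and multiply" in (10) corresponds to
# scalar multiplication and addition on phe's EncryptedNumber objects.
enc_inner = enc_x[0] * y[0]
for exi, yi in zip(enc_x[1:], y[1:]):
    enc_inner += exi * yi

# Alice decrypts the (still encrypted) inner product.
print(private_key.decrypt(enc_inner))        # 3*2 - 7 + 4 + 8 - 10 = 1
```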

In this paper, we focus mainly on public-key encryption schemes, as almost all homomorphic encryption schemes belong to this family. The notable exception is the one-time pad (and derived stream ciphers), where messages taken from a finite group are blinded by a sequence of uniformly random group elements. Despite its computationally efficient encryption and decryption processes, the application of a one-time pad usually raises serious problems with regard to key distribution and management. Nevertheless, it may be used to temporarily blind intermediate values in larger communication protocols. Finally, it should be noted that some recent work in cryptography (like searchable encryption [6] and order-preserving encryption [7]) may also yield alternative ways for the encryption of signal samples. However, these approaches have not yet been studied in the context of media encryption.

To conclude this section, we observe that directly computing the inner product of two encrypted signals is not possible since this would require a cryptographic system that has both multiplicative and additive (i.e., algebraic) homomorphism. Recent proposals in that direction like [8, 9] were later proven to be insecure [10, 11]. Therefore, no provably secure cryptographic system with these properties is known to date. The construction of an algebraic privacy homomorphism remains an open problem. Readers can refer to [12] for more details on homomorphic cryptosystems.

2.2.2. Zero-knowledge proof protocols

Zero-knowledge protocols are used to prove a certain statement or condition to a verifier, without revealing any "knowledge" to the verifier except the fact that the assertion is valid [19]. As a simple example, consider the case where the prover Peggy claims to have a way of factorizing large numbers. The verifier Victor will send her a large number and Peggy will send back the factors. Successful factorization of several large integers will decrease Victor's doubt in the truth of Peggy's claim. At the same time, Victor will gain no knowledge of the actual factorization method.

Although simple, the example shows an important property of zero-knowledge proof protocols, namely, that they are interactive in nature. The interaction should be such that, with an increasing number of "rounds," the probability that an adversary successfully proves an invalid claim decreases significantly. On the other hand, noninteractive protocols (based on the random oracle model) also do exist. A formal definition of interactive and noninteractive proof systems, such as zero-knowledge protocols, falls outside the scope of this paper, but can be found, for instance, in [19].


A well-known example is the zero-knowledge proof of knowledge of the discrete logarithm x of an element y to the base g in a finite field [20]. Having knowledge of the discrete logarithm x is of interest in some applications since if

y = g^x mod p, (11)

then, given p (a large prime number), g, and y, the calculation of the logarithm x is computationally infeasible. If Peggy (the prover) claims she knows the answer (i.e., the value of x), she can convince Victor (the verifier) of this knowledge without revealing the value of x by the following zero-knowledge protocol. Peggy picks a random number r ∈ Z_p and computes t = g^r mod p. She then sends t to Victor. He picks a random challenge c ∈ Z_p and sends this to Peggy. She computes s = r − cx mod p and sends this to Victor. He accepts Peggy's knowledge of x if g^s y^c = t, since if Peggy indeed used the correct logarithm x in calculating the value of s, we have

g^s y^c mod p = g^{r−cx} g^{xc} mod p = g^r mod p = t. (12)

(A toy sketch of this exchange is given after the list below.) In the literature, many different zero-knowledge proofs exist. We mention a number of them that are frequently used in secure signal processing:

(i) proof that an encrypted number is nonnegative [21];
(ii) proof that an encrypted number lies in a certain interval [22];
(iii) proof that the prover knows the plaintext x corresponding to the encryption E(x) [23];
(iv) proofs that committed values (see Section 2.2.3) satisfy certain algebraic relations [24].
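The following toy Python sketch runs one round of the interactive discrete-logarithm proof of (11)-(12); the parameters are tiny and illustrative, and the exponent arithmetic is reduced modulo the group order p − 1, a detail glossed over in the description above.

```python
# Toy sketch of the interactive proof of knowledge of a discrete logarithm
# from (11)-(12). Parameters are tiny and illustrative only; note that the
# exponent arithmetic is reduced modulo the group order p - 1.
import random

p = 101          # small prime (toy)
g = 2            # primitive root modulo 101
x = 69           # Peggy's secret
y = pow(g, x, p) # public value from (11)

def prover_commit():
    r = random.randrange(1, p - 1)
    t = pow(g, r, p)
    return r, t

def prover_respond(r, c):
    return (r - c * x) % (p - 1)      # s = r - c*x, reduced mod the group order

def verifier_check(t, c, s):
    return (pow(g, s, p) * pow(y, c, p)) % p == t   # g^s * y^c == t, as in (12)

# One round of the protocol; repeating rounds drives down the cheating probability.
r, t = prover_commit()
c = random.randrange(1, p - 1)        # Victor's challenge
s = prover_respond(r, c)
print(verifier_check(t, c, s))        # True
```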

In zero-knowledge protocols, it is sometimes necessary for the prover to commit to a particular integer or bit value. Commitment schemes are discussed in the next section.

2.2.3. Commitment schemes

An integer or bit commitment scheme is a method that allows Alice to commit to a value while keeping it hidden from Bob, and while also preserving Alice's ability to reveal the committed value later to Bob. A useful way to visualize a commitment scheme is to think of Alice as putting the value in a locked box, and giving the box to Bob. The value in the box is hidden from Bob, who cannot open the lock (without the help of Alice), but since Bob has the box, the value inside cannot be changed by Alice; hence, Alice is "committed" to this value. At a later stage, Alice can "open" the box and reveal its content to Bob.

Commitment schemes can be built in a variety of ways. As an example, we review a well-known commitment scheme due to Pedersen [25]. We fix two large primes p and q such that q | (p − 1) and a generator g of the subgroup of order q of Z*_p. Furthermore, we set h = g^a mod p for some random secret a. The values p, q, g, and h are the public parameters of the commitment scheme. To commit to a value m, Alice chooses a random value r ∈ Z_q and computes the commitment c = g^m h^r mod p. To open the commitment, Alice sends m and r to Bob, who verifies that the commitment c received previously indeed satisfies c = g^m h^r mod p. The scheme is hiding due to the random blinding factor r; furthermore, it is binding unless Alice is able to compute discrete logarithms.

For use in signal processing applications, commitment schemes that are additively homomorphic are of specific importance. As with homomorphic public key encryption schemes, knowledge of two commitments allows one to compute—without opening—a commitment of the sum of the two committed values. For example, the above-mentioned Pedersen commitment satisfies this property: given two commitments c_1 = g^{m_1} h^{r_1} mod p and c_2 = g^{m_2} h^{r_2} mod p of the numbers m_1 and m_2, a commitment c = g^{m_1+m_2} h^{r_1+r_2} mod p of m_1 + m_2 can be computed by multiplying the commitments: c = c_1 c_2 mod p. Note that the commitment c can be opened by providing the values m_1 + m_2 and r_1 + r_2. Again, the homomorphic property only supports additions. However, there are situations where it is not possible to prove a relation by mere additive homomorphism, as in proving that a committed value is the square of the value of another commitment. In such circumstances, zero-knowledge proofs can be used. In this case, the party which possesses the opening information of the commitments computes a commitment of the desired result, hands it to the other party, and proves in zero-knowledge that the commitment was actually computed in the correct manner. Among others, such zero-knowledge proofs exist for all polynomial relations between committed values [24].
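A toy sketch of the Pedersen commitment and its additive homomorphism follows; the parameters are small illustrative values (a safe prime p = 2q + 1 and a generator of the order-q subgroup), not secure choices.

```python
# Toy Pedersen commitment (assumed small parameters), illustrating the
# additive homomorphism c = c1 * c2 mod p described above.
import random

p, q = 2039, 1019        # toy safe prime p = 2q + 1
g = 4                    # 2^2 mod p generates the subgroup of order q
a = random.randrange(2, q)
h = pow(g, a, p)         # in a real deployment nobody may know a

def commit(m):
    r = random.randrange(1, q)
    return pow(g, m, p) * pow(h, r, p) % p, r

def verify(c, m, r):
    return c == pow(g, m, p) * pow(h, r, p) % p

m1, m2 = 123, 456
c1, r1 = commit(m1)
c2, r2 = commit(m2)

# Opening the product of the commitments with m1 + m2 and r1 + r2 succeeds.
print(verify(c1 * c2 % p, (m1 + m2) % q, (r1 + r2) % q))   # True
```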

2.2.4. Secure multiparty computation

The goal of secure multiparty computation is to evaluate a public function f(x^{(1)}, x^{(2)}, ..., x^{(m)}) based on the secret inputs x^{(i)}, i = 1, 2, ..., m, of m users, such that the users learn nothing except their own input and the final result. A simple example, called Yao's Millionaires' Problem, is the comparison of two (secret) numbers in order to determine if x^{(1)} > x^{(2)}. In this case, the parties involved will only learn if their number is the largest, but nothing more than that.

There is a large body of literature on secure multiparty computation; for example, it is known [26] that any (computable) function can be evaluated securely in the multiparty setting by using a general circuit-based construction. However, the general constructions usually require a large number of interactive rounds and a huge communication complexity. For practical applications in the field of distributed voting, private bidding and auctions, and private information retrieval, dedicated lightweight multiparty protocols have been developed. An example relevant to signal processing applications is the multiparty computation known as Bitrep, which finds the encryption of each bit in the binary representation of a number whose encryption under an additive homomorphic cryptosystem is given [27]. We refer the reader to [28] for an extensive summary of secure multiparty computations and to [29] for a brief introduction.

2.3. Importance of security requirements


When considering secure signal processing solutions, it is important to realize that in each application the security requirements have to be made explicit right from the start. Without wishing to turn to formal definitions, we choose to motivate the importance of what to expect from secure signal processing with three simple yet illustrative two-party computation examples.

The first simple example is the encryption of a (say audio) signal X that contains M samples. Due to the sample-by-sample encryption strategy as shown in (2), the encrypted signal E_pk(X) will also contain M encrypted values. Hence, the size M of the plain text signal cannot be hidden by the approaches followed in secure signal processing surveyed in this paper.

In the second example, we consider the linear filtering of the signal X. In an (FIR) linear filter, the relation between the input signal amplitudes X and output signal amplitudes Y is entirely determined by the impulse response (h_0, h_1, ..., h_r) through the following convolution equation:

y_i = h_0 x_i + h_1 x_{i−1} + ··· + h_r x_{i−r} = ∑_{k=0}^{r} h_k x_{i−k}. (13)

Let us assume that we wish to compute this convolution in a secure way. The first party, Alice, has the signal X and the second party, Bob, has the impulse response (h_0, h_1, ..., h_r). Alice wishes to carry out the convolution (13) using Bob's linear filter. However, both Bob and Alice wish to keep their data secret, that is, the impulse response and the input signal, respectively. Three different setups can now be envisioned.

(1) Alice encrypts the signal X under an additive homomorphic cryptosystem and sends the encrypted signal to Bob. Bob then evaluates the convolution (13) on the encrypted signal as follows:

E_pkA(y_i) = E_pkA(∑_{k=0}^{r} h_k x_{i−k}) = ∏_{k=0}^{r} E_pkA(h_k x_{i−k}) = ∏_{k=0}^{r} (E_pkA(x_{i−k}))^{h_k}. (14)

Notice that the additive homomorphic property is used in the above equation and that, indeed, individually encrypted signal samples should be available to Bob. Also notice that the above evaluation is only possible if both X and (h_0, h_1, ..., h_r) are integer-valued, which is actually quite unlikely in practice. After computing (14), Bob sends the result back to Alice, who decrypts the signal using her private key to obtain the result Y. In this setup, Bob does not learn the output signal Y.

(2) Bob encrypts his impulse response (h_0, h_1, ..., h_r) under a homomorphic cryptosystem and sends the result to Alice. Alice then evaluates the convolution (13) using the encrypted impulse response as follows:

E_pkB(y_i) = E_pkB(∑_{k=0}^{r} h_k x_{i−k}) = ∏_{k=0}^{r} E_pkB(h_k x_{i−k}) = ∏_{k=0}^{r} (E_pkB(h_k))^{x_{i−k}}. (15)

Alice then sends the result to Bob, who decrypts to obtain the output signal Y. In this solution, Bob learns the output signal Y.

(3) Alice and Bob engage in a formal multiparty protocol, where the function f(x_1, x_2, ..., x_M, h_0, h_1, ..., h_r) is the convolution equation, Alice holds the signal values x_i and Bob the impulse response h_i as secret inputs. Both parties will learn the resulting output signal Y.

Unfortunately, none of the above three solutions really provides a solution to the secure computation of a convolution due to inherent algorithm properties. For instance, in the first setup, Alice could send Bob a signal that consists of all-zero values and a single "one" value (a so-called "impulse signal"). After decrypting the result E_pkA(y_i) that she obtains from Bob, it is easy to see that Y is equal to (h_0, h_1, ..., h_r); hence Bob's impulse response is subsequently known to Alice. Similar attacks can be formulated for the other two cases. In fact, even for an arbitrary input, both parties can learn the other's input by a well-known signal processing procedure known as "deconvolution." In conclusion, although in some cases there may be a need for the secure evaluation of convolutions, the inherent properties of the algorithm make secure computing in a two-party scenario meaningless. (Nevertheless, the protocols have value if used as building blocks in a large application where the output signal Y is not revealed to the attacker.)
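The following sketch illustrates setup (1) and the impulse-signal attack, again assuming the python-paillier ("phe") package and integer-valued data; it is an illustration of the problem, not a protocol from the literature.

```python
# Sketch of setup (1) and the "impulse signal" attack, assuming the
# python-paillier ("phe") package and integer-valued data.
from phe import paillier

pk_A, sk_A = paillier.generate_paillier_keypair(n_length=1024)

h = [3, 5, 2]                                 # Bob's secret impulse response

def bob_filter_encrypted(enc_x):
    """Evaluate (14): y_i = sum_k h_k * x_{i-k} on encrypted samples."""
    out = []
    for i in range(len(enc_x)):
        acc = enc_x[i] * h[0]
        for k in range(1, len(h)):
            if i - k >= 0:
                acc += enc_x[i - k] * h[k]
        out.append(acc)
    return out

# Alice submits an impulse signal instead of real content.
impulse = [1, 0, 0, 0]
enc_y = bob_filter_encrypted([pk_A.encrypt(v) for v in impulse])
print([sk_A.decrypt(c) for c in enc_y])       # [3, 5, 2, 0]: Bob's h leaks to Alice
```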

The third and final example is to threshold a signal's (weighted) mean value in a secure way. The (secure) mean value computation is equivalent to the (secure) computation of the inner product of (3), with X the input signal and Y the weights that define how the mean value is calculated. In the most simple case, we have y_i = 1 for all i, but other definitions are quite common. Let us assume that Alice wishes Bob to determine if the signal's mean value is "critical," for instance, above a certain threshold value T_c, without revealing X to Bob. Bob, on the other hand, does not want to reveal his expert knowledge, namely, the weights Y and the threshold T_c. Two possible solutions to this secure decision problem are the following.

(i) Use secure multiparty computation, where the function f(x_1, x_2, ..., x_M, y_1, y_2, ..., y_M, T_c) is a combination of the inner product and threshold comparison. Both parties will only learn if the mean value is critical or not.

(ii) Alice sends Bob the signal X under additively homomorphic encryption. Bob securely evaluates the inner product using (10). After encrypting T_c using Alice's public key, Bob homomorphically computes the difference between the encrypted inner product and the encrypted threshold T_c. Bob sends the result to Alice, who decrypts the result using her secret key and checks if the value is larger or smaller than zero.

Figure 2: Compression of an encrypted signal, from [30]. A message source is encrypted using a key and then compressed; the compressed bit stream travels over a public channel observed by an eavesdropper, while the key travels over a secure channel; decompression and decryption are performed jointly to obtain the reconstructed source.

Although the operations performed are similar to the second example, in this example the processing is secure since Bob learns little about Alice's signal and Alice will learn little about Bob's expert knowledge. In fact, in the first implementation, the entire signal processing operation is ultimately condensed into a single bit of information; the second implementation leaks more information, namely, the distance between the correlation value and the threshold. In both cases, the result represents a high information abstraction level, which is insufficient for launching successful signal processing-based attacks. In contrast, in the example based on (13), the signal processing operation led to an enormous amount of information—the entire output signal—being available to either party, making signal processing-based attacks quite easy.

As we will see in Sections 3 and 4, many of the two-party secure signal processing problems eventually include an information condensation step, such as (in the most extreme case) a binary decision. We postulate that for two-party linear signal processing operations in which the amount of plain text information after processing is in the same order of magnitude as before processing, no secure solutions exist purely based on the cryptographic primitives discussed in the previous section, due to inherent properties of the signal processing problems and the related application scenario. For that reason, entirely other approaches to secure signal processing are also of interest. Although few results can be found in literature on approaches not using homomorphic encryption, zero-knowledge proofs, and multiparty computation protocols, the approach discussed in the next section may well show a possible direction for future developments.

2.4. Compression of encrypted signals

When transmitting signals that contain redundancy over an insecure and bandwidth-constrained channel, it is customary to first compress and then encrypt the signal. Using the principles of coding with side information, it is, however, also possible to interchange the order of (lossless) compression and encryption, that is, to compress encrypted signals [30].

The concept of swapping the order of compression and encryption is illustrated in Figure 2. A signal from the message source is first encrypted and then compressed. The compressor does not have access to the secret key used in the encryption. At the decoder, decompression and decryption are performed jointly. From classical information theory, it would seem that only minimal gain could be obtained as the encrypted signal has maximal entropy, that is, no redundancy is left after encryption. However, the decoder can use the cryptographic key to decode and decrypt the compressed and encrypted bit stream. This brings opportunities for efficient compression of encrypted signals based on the principle of coding with side information. In [30], it was shown that neither compression performance nor security need to be negatively impacted under some reasonable conditions.

In source coding with side information, the signal X is coded under the assumption that the decoder—but not the encoder—has statistically dependent information Y, called the side information, available. In conventional coding scenarios, the encoder would code the difference signal X − Y in some efficient way, but in source coding with side information, this is impossible since we assume that Y is only known at the decoder. In the Slepian-Wolf coding theory [31], the crucial observation is that the side information Y is regarded as a degraded version of X. The degradations are modeled as "noise" on the "virtual channel" between X and Y. The signal X can then be recovered from Y by the decoder if sufficient error-correcting information is transmitted over the channel. The required bit rate and amount of entropy are related as R ≥ H(X | Y). This shows that, at least theoretically, there is no loss in compression efficiency since the lower bound H(X | Y) is identical to the scenario in which Y is available at the encoder. Extensions of the Slepian-Wolf theory exist for lossy source coding [32]. In all practical cases of interest, the information bits that are transmitted over the channel are parity bits or syndromes of channel coding methods such as Hamming, Turbo, or LDPC codes.

In the scheme depicted in Figure 2, we have a similar scenario as in the above source coding with side information case. If we consider the encrypted signal E_k(X) at the input of the encoder, then we see that the decoder has the key k available as side information. The achievable bit rate is then the same as if the key k would be available during the source encoding process, that is, R ≥ H(E_k(X) | k) = H(X). This clearly says that the (lossless) coding of the encrypted signal E_k(X) should be possible with the same efficiency as the (lossless) coding of X. Hence, using the side information key k, the decoder can first recover E_k(X) from the compressed channel bit stream and subsequently decode E_k(X) into X.

A simple implementation of the above concept for a binary signal X uses a pseudorandomly generated key. The key k is in this case a binary signal K of the same dimension M as the signal X. The encrypted signal is computed as follows:

E_k(X) = X ⊕ K, E_k(x_i) = x_i ⊕ k_i, i = 1, 2, ..., M. (16)

The encrypted signal E_k(X) is now input to a channel coding strategy, for instance, Hamming coding. The strength of the Hamming code depends on the statistical dependency between E_k(X) and the side information K at the decoder. This strength obviously depends solely on the properties of the original signal X. This does, however, require the message source to inform the source encoder about the entropy H(X), which represents a small leak of information. The encoder calculates parity check bits over binary vectors of some length L created by concatenating L bits of the encrypted signal E_k(X), and sends only these parity check bits to the receiver.

The decoder recovers the encrypted signal by first appending the parity check bits to K, and then error correcting the resulting bit pattern. The success of this error correction step depends on the strength of the Hamming code, but as mentioned, this strength has been chosen to be sufficient with regard to the "errors" in K on the decoding side. Notice that in this particular setup the "errors" represent the bits of the original signal X. If the error correction step is successful, the decoder obtains E_k(X), from which the decryption can straightforwardly take place:

X = E_k(X) ⊕ K, x_i = E_k(x_i) ⊕ k_i, i = 1, 2, ..., M. (17)

The above example is too simple for any practical scenario for a number of reasons. In the first place, it uses only binary data, for instance, bit planes. More efficient coding can be obtained if the dependencies between bit planes are considered. This effectively requires an extension of the bit plane coding and encryption approach to coding and encryption of symbol values. Secondly, the decoder lacks a model of the dependencies in X. Soft decoders for Turbo or LDPC codes can exploit such message source models, yielding improved performance. Finally, the coding strategy is lossless. For most continuous or multilevel message sources, such as audio, images, and video, lossy compression is desirable.
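The following toy Python sketch combines (16) and (17) with a Hamming(7,4) syndrome as the transmitted parity information; the assumption that every 7-bit block of X contains at most one 1-bit is a stand-in for choosing the code strength according to H(X), and the whole construction is only meant to make the mechanism concrete.

```python
# Toy sketch of (16)-(17) with Slepian-Wolf style compression: the encoder
# sends only Hamming(7,4) syndromes of the encrypted bits, and the decoder
# uses the key K as side information. The sparsity of X (at most one 1-bit
# per block) is an assumption standing in for a properly dimensioned code.
import random

# Parity-check matrix of the Hamming(7,4) code; column j is the binary
# representation of j+1, so a syndrome directly names the position of a
# single nonzero bit.
H = [[(j + 1) >> i & 1 for j in range(7)] for i in range(3)]

def syndrome(block):
    return [sum(H[i][j] * block[j] for j in range(7)) % 2 for i in range(3)]

def xor(a, b):
    return [x ^ y for x, y in zip(a, b)]

# Source block X (sparse by assumption) and random key block K.
X = [0, 0, 0, 1, 0, 0, 0]
K = [random.randint(0, 1) for _ in range(7)]

enc = xor(X, K)                  # (16): E_k(X) = X xor K
sent = syndrome(enc)             # encoder transmits 3 parity bits instead of 7

# Decoder: combine the received syndrome with the syndrome of K.
s = xor(sent, syndrome(K))       # equals H * X because H is linear over GF(2)
pos = s[0] + 2 * s[1] + 4 * s[2] # nonzero syndrome names the single 1-bit of X
X_rec = [0] * 7
if pos:
    X_rec[pos - 1] = 1
print(X_rec == X)                # True: X recovered from K and the parity bits
```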

3. ANALYSIS AND RETRIEVAL OF CONTENT

In the today’s society, huge quantities of personal data are gathered from people and stored in databases for various

purposes ranging from medical researches to online person-alized applications. Sometimes, providers of these services may want to combine their data for research purposes. A classical example is the one where two medical institutions wish to perform joint research on the union of their pa-tients data. Privacy issues are important in this scenario be-cause the institutions need to preserve their private data dur-ing their cooperation. Lindell and Pinkas [33] and Agrawal and Srikant [34] proposed the notion of privacy preserving data mining, meaning the possibility to perform data analysis from distributed database, under some privacy constraints. Privacy preserving data mining [35–38] deals with mutual untrusted parties that on the one hand wish to cooperate to achieve a common goal but, on the other hand, are not will-ing to disclose their knowledge to each other.

There are several solutions that cope with exact matching of data in a secure way. However, it is more common in signal processing to perform inexact matching, that is, learning the distance between two signal values, rather than exact matching. Consider two signal values x_1 and x_2. Computing the distance between them or checking if the distance is within a threshold is important:

|x_1 − x_2| < ε. (18)

This comparison or fuzzy matching can be used in a variety of ways in signal processing. One example is quantizing data, which is of crucial importance for multimedia compression schemes. However, considering that these signal values are encrypted, and thus the ordering between them is totally destroyed, there is no efficient way known to fuzzily compare two values.

In the following sections, we give a summary of techniques that focus on extracting some information from protected datasets. Selected studies mostly use homomorphic encryption, zero-knowledge proofs, and, sometimes, multiparty computations. As we will see, most solutions still require substantial improvements in communication and computation efficiency in order to make them applicable in practice. Therefore, the last section addresses a different approach that uses other means of preserving privacy, to show that further research on combining signal processing and cryptography may result in new approaches rather than using encryption schemes and protocols.

3.1. Clustering

Clustering is a well-studied combinatorial problem in data mining [39]. It deals with finding a structure in a collection of unlabeled data. One of the basic algorithms of clustering is the K-means algorithm that partitions a dataset into K clusters with a minimum error. We review the K-means algorithm in the following subsection.


Figure 3: Clustered dataset (cluster centers and objects shown). Each object is a point in the 2-dimensional space; the K-means clustering algorithm assigns each object to the cluster with the smallest distance.

(1) Select K random objects representing the K initial centroids of the clusters.
(2) Assign each object to the cluster with the nearest centroid.
(3) Recalculate the centroids for each cluster.
(4) Repeat steps 2 and 3 until the centroids do not change or a certain threshold is achieved.

Algorithm 1: The K-means clustering algorithm.

3.1.1. K-means clustering algorithm

The K-means clustering algorithm partitions a dataset D of "objects," such as signal values or features thereof, into K disjoint subsets, called clusters. Each cluster is represented by its center, which is the centroid of all objects in that subset.

As shown in Algorithm 1, the K-means algorithm is an iterative procedure that refines the cluster centroids until a predefined condition is reached. The algorithm first chooses K random points as the cluster centroids in the dataset D and assigns the objects to the closest cluster centroid. Then, the cluster centroid is recomputed with the recently assigned objects. When the iterative procedure reaches the termination condition, each data object is assigned to the closest cluster (Figure 3). Thus, to carry out the K-means algorithm, the following quantities need to be computed:

(i) the cluster centroid, or the mean of the data objects in that cluster;
(ii) the distance between an object and the cluster centroid;
(iii) the termination condition, which is a distance measurement compared to a threshold.
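For reference, a plain (non-secure) NumPy sketch of Algorithm 1 computing exactly these three quantities is given below; the secure protocol of the next subsection evaluates the same steps on shared and encrypted data.

```python
# Plain (non-secure) K-means sketch with NumPy, for reference only.
import numpy as np

def kmeans(data, K, iters=100, tol=1e-6, seed=0):
    rng = np.random.default_rng(seed)
    centroids = data[rng.choice(len(data), K, replace=False)]   # step 1
    for _ in range(iters):
        # step 2: distance of every object to every centroid, then assignment
        dist = np.linalg.norm(data[:, None, :] - centroids[None, :, :], axis=2)
        labels = dist.argmin(axis=1)
        # step 3: recompute centroids from the assigned objects
        new = np.array([data[labels == k].mean(axis=0) for k in range(K)])
        if np.linalg.norm(new - centroids) < tol:                # step 4
            break
        centroids = new
    return centroids, labels

data = np.vstack([np.random.randn(50, 2), np.random.randn(50, 2) + 5])
centroids, labels = kmeans(data, K=2)
print(centroids)
```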

Figure 4: Shared dataset on which the K-means algorithm is run; the attribute columns are split between data owned by Alice and data owned by Bob.

In the following section, we describe a secure protocol that carries out the K-means algorithm on protected data objects.

3.1.2. Secure K-means clustering algorithm

Consider the scenario in which Alice and Bob want to apply the K-means algorithm on their joint datasets as shown in Figure 4, but at the same time they want to keep their own dataset private. Jagannathan and Wright proposed a solution for this scenario in [40].

In the proposed method, both Alice and Bob get the final output, but the values computed in the intermediate steps are unknown to both parties. Therefore, the intermediate values such as cluster centroids are uniformly shared between Alice and Bob in such a way that for a value x, Alice gets a random share a and Bob gets another random share b, where (a + b) mod N = x and N is the size of the field in which all operations take place. Alice and Bob keep their private shares of the dataset secret.
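A minimal sketch of this additive secret sharing (with an illustrative modulus N) is given below; note that shares of different values can be added locally, which is what the subprotocols below exploit.

```python
# Minimal sketch of additive secret sharing over Z_N: a value x is split
# into random shares a and b with (a + b) mod N = x.
import random

N = 2**32          # hypothetical size of the field/ring used by the protocol

def share(x):
    a = random.randrange(N)        # Alice's share: uniformly random
    b = (x - a) % N                # Bob's share
    return a, b

def reconstruct(a, b):
    return (a + b) % N

x1, x2 = 1234, 5678
a1, b1 = share(x1)
a2, b2 = share(x2)
print(reconstruct(a1, b1) == x1)                             # True
print(reconstruct((a1 + a2) % N, (b1 + b2) % N) == x1 + x2)  # shares add up locally
```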

The secure K-means clustering algorithm is separated into subprotocols in which Alice and Bob compute the following (Algorithm 2).

(i) Distance measurement and finding the closest cluster: the distance between each object and cluster centroid is computed by running a secure scalar product protocol by Goethals et al. [41]. The closest cluster centroid is determined by running Yao's circuit evaluation protocol [42] with the shared data of Alice and Bob.
(ii) New cluster centroid: the new cluster centroid requires an average computation over shared values of Alice and Bob. This function of the form (a + b)/(m + n) can be computed by applying Yao's protocol [42].


Randomly select K objects from the dataset D as initial cluster centroids
Randomly share the cluster centroids between Alice and Bob
repeat
  for all objects d_k in dataset D do
    Run the secure closest cluster protocol
    Assign d_k to the closest cluster
  end for
  Alice and Bob compute the random shares for the new centroids of the clusters
until cluster centroids are close to each other with an error of ε

Algorithm 2: Privacy preserving K-means clustering algorithm.

(iii) Termination condition: the termination condition of the algorithm is computed by running Yao's circuit evaluation protocol [42].

The squared distance between an object X_i = (x_{i,1}, ..., x_{i,M}) and a cluster centroid μ_j is given by the following equation:

dist(X_i, μ_j)^2 = (x_{i,1} − μ_{j,1})^2 + (x_{i,2} − μ_{j,2})^2 + ··· + (x_{i,M} − μ_{j,M})^2. (19)

Considering that the cluster centroids are shared between Alice and Bob, (19) can be written as

dist(X_i, μ_j)^2 = (x_{i,1} − (μ^A_{j,1} + μ^B_{j,1}))^2 + ··· + (x_{i,M} − (μ^A_{j,M} + μ^B_{j,M}))^2, (20)

where μ^A_j is Alice's share and μ^B_j is Bob's share such that the jth cluster centroid is μ_j = μ^A_j + μ^B_j. Then, (20) can be written as

dist(X_i, μ_j)^2 = ∑_{k=1}^{M} x_{i,k}^2 + ∑_{k=1}^{M} (μ^A_{j,k})^2 + ∑_{k=1}^{M} (μ^B_{j,k})^2 + 2 ∑_{k=1}^{M} μ^A_{j,k} μ^B_{j,k} − 2 ∑_{k=1}^{M} μ^A_{j,k} x_{i,k} − 2 ∑_{k=1}^{M} x_{i,k} μ^B_{j,k}. (21)

Equation (21) can be computed by Alice and Bob jointly. As the first term of the equation is shared between them, Alice computes the sum of the components of her share while Bob computes the rest of the components. The second and third terms can be computed by Alice and Bob individually, and the rest of the terms are computed by running a secure scalar product protocol between Alice and Bob, much like the evaluation of (3) via the secure form of (10). Alice first encrypts her data E_pkA(μ^A_j) = (E_pkA(μ^A_{j,1}), ..., E_pkA(μ^A_{j,M})) and sends it to Bob, who computes the scalar product of this data with his own by using the additive homomorphic property of the encryption scheme as follows:

E_pkA(μ^A_j)^{μ^B_j} = (E_pkA(μ^A_{j,1})^{μ^B_{j,1}}, ..., E_pkA(μ^A_{j,M})^{μ^B_{j,M}}). (22)

Then, multiplying the encrypted components gives the encrypted scalar product of Alice's and Bob's data:

E_pkA(∑_{k=1}^{M} μ^A_{j,k} μ^B_{j,k}) = ∏_{k=1}^{M} E_pkA(μ^A_{j,k})^{μ^B_{j,k}}. (23)

The computed distances between the objects and the cluster centroids can later be the input to Yao's circuit evaluation protocol [42], in which the closest cluster centroid is determined. We refer readers to [41, 42] for further details on this part.

Once the distances and the closest clusters to the objects are determined, each object is labeled with the nearest cluster index. At the end of each iteration, it is necessary to compute the new cluster centroids. Alice computes the sum of the corresponding coordinates of all objects, s_j, and the number of objects, n_j, within each of the K clusters for j, 1 ≤ j ≤ M. As shown in Figure 4, Alice has only some of the attributes of the objects, thus she treats these missing values as zero. Bob also applies the same procedure and determines the sum of coordinates t_j and the number of objects m_j in the clusters. Given s_j, t_j, n_j, and m_j, the jth component of the ith cluster is

μ_{i,j} = (s_j + t_j) / (n_j + m_j). (24)

Since there are only four values, this equation can be computed efficiently by using Yao's circuit evaluation protocol [42] with Alice's shares s_j and n_j and Bob's shares t_j and m_j.

In the last step of the K-means algorithm, the iteration is terminated if there is no further improvement between the previous and current cluster centroids. In order to do that, a distance is computed between the previous and current cluster centroids. This is done in the same way as computing distances between an object and a cluster centroid, but in addition, this distance is compared to a threshold value. Considering that the cluster centroids are shared between Alice and Bob, the result of the computation of the squared distance of cluster centroids for the kth and (k + 1)th iterations is again random shares for Alice and Bob:

dist(μ^{A,k+1}_j + μ^{B,k+1}_j, μ^{A,k}_j + μ^{B,k}_j)^2 = α_j + β_j, (25)

where α and β are the shares of Alice and Bob. Alice and Bob then apply Yao's protocol on their K-length vectors (α_1, ..., α_K) and (β_1, ..., β_K) to check if α_j + β_j < ε for 1 ≤ j ≤ K.

3.2. Recommender systems


Recommender systems predict the items a user may be interested in by implementing a signal processing algorithm known as collaborative filtering on user preferences to find similar users that share the same taste (likes or dislikes). Once similar users are found, this information can be used in a variety of ways, such as recommending restaurants, hotels, books, audio, and video.

Recommender systems store user data, also known as preferences, in servers, and the collaborative filtering algorithms work on these stored preferences to generate recommendations. The amount of data collected from each user directly affects the accuracy of the predictions. There are two concerns in collecting information from the users in such systems. First, in an ordinary system there are on the order of thousands of items, so that it is not realistic for the users to rate all of them. Second, users would not like to reveal too much privacy-sensitive information that can be used to track them. The first problem, also known as the sparseness problem in datasets, is addressed for collaborative filtering algorithms in [43–45]. The second problem, on user privacy, is of interest to this survey paper since users tend not to give more information about themselves due to privacy concerns, and yet they expect more accurate recommendations that fit their taste. This tradeoff between privacy and accuracy leads us to an entirely new perspective on recommender systems. Namely, how can the privacy of the users be protected in recommender systems without losing too much accuracy?

We describe two solutions that address the problem of preserving the privacy of users in recommender systems. In the first approach, user privacy is protected by means of encryption, and the recommendations are still generated by processing these encrypted preference values. In the second approach, protecting the privacy of the users is possible without encryption but by means of perturbation of user preference data.

3.2.1. Recommendations by partial SVD on encrypted preferences

Canny [46] addresses the user privacy problem in recommender systems and proposes to encrypt user preferences. Assume that the recommender system applies a collaborative filtering algorithm on a matrix P of users versus item ratings. Each row of this matrix represents the corresponding user's taste for the corresponding items. Canny proposes to use a collaborative filtering algorithm based on dimension reduction of P. In this way, an approximation matrix of the original preference matrix is obtained in a lower dimension that best represents the user taste for the overall system. When a new user enters the system, the recommendations are generated by simply reprojecting the user preference vector, which has many unrated items, over the approximation matrix. As a result, a new vector will be obtained which contains approximated values for the unrated items [43, 46].

The ratings in recommender systems are usually integer numbers within a small range, and items that are not rated are usually assigned to zero. To protect the privacy of the users, the user preference vector X = [x_1, x_2, ..., x_M] is encrypted individually as E_pk(X). To reduce the dimension of the preference matrix P, singular value decomposition (SVD) is an option. The SVD allows P to be written as

P = U D V^T, (26)

where the columns of U are the left singular vectors, D is a diagonal matrix containing the singular values, and V^T has rows that are the right singular vectors.

Once the SVD of the preference matrix P is computed, an approximation matrix in a lower-dimension subspace can be computed easily. Computing the SVD on P that contains encrypted user preferences is, however, more complicated.

Computing the decomposition of the users' preference matrix requires sums of products of vectors. If the preference vector of each user is encrypted, there is no efficient way of computing sums of products of vectors, since this would require an algebraic homomorphic cryptosystem. Using secure multiparty computation protocols on this complex function is costly considering the size of the circuit necessary for the complex operation.

Instead of a straightforward computation of the SVD, Canny [46] proposed to use an iterative approximation algorithm to obtain a partial decomposition of the user preference matrix. The conjugate gradient algorithm is an iterative procedure consisting merely of additions of vectors, which can be done under homomorphically encrypted user preference vectors. Each iteration in the protocol has two steps, that is, users compute (1) their contribution to the current gradient and (2) scalar quantities for the optimization of the gradient. Both steps require only additions of vectors, thus we only explain the first step.

For the first step of the iterations, each user computes his contribution G_k to the current gradient G by the following equation:

G_k = A X_k^T X_k (I − A^T A), (27)

where matrix A is the approximation of the preference matrix P and is initialized as a random matrix before the protocol starts. Each user encrypts his own gradient vector G_k with the public key of the user group by following Pedersen's threshold scheme [47], which uses the ElGamal cryptosystem modified to be additively homomorphic. All contributions from the users are then added up to form the encrypted gradient E_pk(G) by using the additive homomorphic property of the cryptosystem:

E_pk(G) = E_pk(∑_{k∈users} G_k) = ∏_{k∈users} E_pk(G_k). (28)

This resulting vector E_pk(G) is then jointly decrypted and used to update the approximated matrix A, which is publicly known and used to compute the new gradient for the next iteration.


Figure 5: Privacy preserving collaborative filtering with user preference perturbation. Users disguise their original data before it is sent to the central database on which collaborative filtering is performed.

The decryption shares and the contributions received from the users are also proven valid by running zero-knowledge proof protocols. Both groups of zero-knowledge proofs are checked by a subgroup of users whose majority is necessary for the validation.

Canny [48] also applies this approach to a different collaborative filtering method, namely, expectation maximization (EM) based factor analysis. Again this algorithm involves simple iterative operations that can be implemented by vector additions. In both recommender system solutions, multiple iterations are necessary for the algorithm to converge, and in each iteration, users need to participate in the cryptographic computations, as in joint decryption and zero-knowledge proofs for input validation. These computations are interactive and thus it is imperative for the users to be online and synchronized.

3.2.2. Randomized perturbation to protect preferences

The previous section showed that homomorphic cryptosystems, zero-knowledge proof protocols, and secure multiparty computations play an important role in providing solutions for processing encrypted data. However, there are other ways to preserve privacy. In the following, we discuss preserving privacy in recommender systems by perturbation of user data.

The randomized perturbation technique was first introduced in privacy-preserving data mining by Agrawal and Srikant [34]. Polat and Du [49, 50] proposed to use this randomization-based technique in collaborative filtering. User privacy is protected by simply randomizing user data while certain computations on aggregate data can still be done. The server then generates recommendations based on the blinded data but cannot derive the user's private information (Figure 5).

Consider the scalar product of two vectors X and Y. These vectors are blinded by R = [r_1, ..., r_M] and S = [s_1, ..., s_M] such that X' = X + R and Y' = Y + S. Here the r_i's and s_i's are uniformly distributed random values with zero mean. The scalar product of X and Y can be estimated from X' and Y':

X' · Y' = ∑_{k=1}^{M} (x_k y_k + x_k s_k + r_k y_k + r_k s_k) ≈ ∑_{k=1}^{M} x_k y_k. (29)

Since R and S are independent and independent of X and Y, we have ∑_{k=1}^{M} x_k s_k ≈ 0, ∑_{k=1}^{M} r_k y_k ≈ 0, and ∑_{k=1}^{M} r_k s_k ≈ 0. Similarly, the sum of the elements of any vector A can be estimated from its randomized form A'. Polat and Du used these two approximations to develop a privacy-preserving collaborative filtering method [49, 50].
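A small NumPy experiment illustrates the approximation in (29); the value ranges and noise distribution below are illustrative choices, not those of [49, 50].

```python
# Numerical illustration of (29): the scalar product of the blinded vectors
# X' = X + R and Y' = Y + S approximates X . Y when M is large and the
# blinding noise has zero mean.
import numpy as np

rng = np.random.default_rng(1)
M = 100_000
X = rng.integers(1, 6, size=M).astype(float)   # e.g., ratings in 1..5
Y = rng.integers(1, 6, size=M).astype(float)
R = rng.uniform(-2, 2, size=M)                 # zero-mean blinding noise
S = rng.uniform(-2, 2, size=M)

true_value = X @ Y
estimate = (X + R) @ (Y + S)
print(true_value, estimate, abs(estimate - true_value) / true_value)
```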

This method works if the number of users in the system is significantly large. Only then can the computations based on aggregated data still be carried out with sufficient accuracy. Moreover, it is also pointed out in [51, 52] that the idea of preserving privacy by adding random noise might not preserve privacy as much as it had been believed originally. The user data can be reconstructed from the randomly perturbed user data matrix. The main limitation in the original work of Polat and Du is shown to be the item-invariant perturbation [53]. Therefore, Zhang et al. [53] propose a two-way communication perturbation scheme for collaborative filtering in which the server and the user communicate to determine perturbation guidance that is used to blind user data before sending it to the server. Notwithstanding these approaches, the security of such schemes based on perturbation of data is not well understood.

4. CONTENT PROTECTION

4.1. Watermarking of content

In the past decade, content protection measures have been proposed based on digital watermarking technology. Digital watermarking [54, 55] allows hiding information in digital content such that it can be detected or extracted at a later moment in time by means of signal processing operations such as correlation. In this way, digital watermarking provides a communication channel multiplexed into original content through which it is possible to transmit information. The type of information transmitted from sender to receiver depends on the application at hand. As an example, in a forensic tracing application, a watermark is used to embed a unique code into each copy of the content to be distributed, where the code links a copy either to a particular user or to a specific device. When unauthorized published content is found, the watermark allows tracing the user who has redistributed the content.

Secure signal processing needs to be performed in case watermark detection or embedding is done in untrusted devices; watermarking schemes usually rely on a symmetric key for both embedding and detection, which is critical to both the robustness and security of the watermark and thus needs to be protected.


Figure 6: A digital watermarking model. The embedder inserts the message B into the host signal X using the secret key sk, producing the watermarked signal X_w; after the channel (attacks and manipulations), the detector/decoder uses sk (and possibly X) to recover B or to output a yes/no decision.

When individually watermarked copies are required for many users, bandwidth requirements become prohibitive. A proposed solution is to use client-side watermark embedding. Since the client is untrusted, the watermark needs to be embedded without the client having access to the original content and watermark.

The customer's rights problem relates to the intrinsic problem of ambiguity when watermarks are embedded at the distribution server: a customer whose watermark has been found on unauthorized copies can claim that he has been framed by a malicious seller who inserted his identity as watermark in an arbitrary object. The mere existence of this problem may discredit the accuracy of the forensic tracing architecture. Buyer-seller protocols have been designed to embed a watermark based on the encrypted identity of the buyer, making sure that the watermarked copy is available only to the buyer and not to the seller.

In the watermark detection process, a system has to prove to a verifier that a watermark is present in certain content. Proving the presence of such a watermark is usually done by revealing the required detection information to the verifying party. All current applications assume that the verifier is a trusted party. However, this is not always true, for instance, if the prover is a consumer device. A cheating verifier could exploit the knowledge acquired during watermark detection to break the security of the watermarking system. Cryptographic protocols, utilizing zero-knowledge proofs, have been constructed in order to mitigate this problem.

We will first introduce a general digital watermarking model to define the notation that will be useful in the remainder of the section. An example of a watermarking scheme is then presented, namely, the one proposed by Cox et al. [56], since this scheme is adopted in many content protection applications.

4.1.1. Watermarking model

Figure 6 shows a common model for a digital watermarking system [57]. The inputs of the system are the original host signal X and some application-dependent to-be-hidden information, here represented as a binary string B = [b_1, b_2, ..., b_L], with b_i taking values in {0, 1}. The embedder inserts the watermark code B into the host signal to produce a watermarked signal X_w, usually making use of a secret key sk to control some parameters of the embedding process and to allow the recovery of the watermark only by authorized users. The watermark channel takes into account all processing operations and (intentional or non-intentional) manipulations the watermarked content may undergo during distribution and use. As a result, the watermarked content X_w is modified into the "received" version X'. Based on X', either a detector verifies the presence of a specific message given to it as input, thus only answering yes or no, or a decoder reads the (binary) information conveyed by the watermark. Detectors and decoders may need to know the original content X in order to retrieve the hidden information (non-blind detector/decoder), or they do not require the original content (blind or oblivious detector/decoder).

4.1.2. Watermarking algorithm

Watermark information is embedded into host signals by making imperceptible modifications to the host signal. The modifications are such that they convey the to-be-hidden information B. The hidden information can be retrieved afterwards from the modified content by detecting the presence of these modifications. Embedding is achieved by modifying the set of features X = [x_1, x_2, ..., x_M]. In the most simple case, the features are simple signal amplitudes. In more complicated scenarios, the features can be DCT or wavelet coefficients. Several watermarking schemes make use of a spread-spectrum approach to code the to-be-hidden information B into W = [w_1, w_2, ..., w_M]. Typically, W is a realization of a normally distributed random signal with zero mean and unit variance.

The most well-known spread-spectrum technique was proposed by Cox et al. [56]. The host signal is first transformed into a discrete cosine transform (DCT) representation. Next, the largest magnitude DCT coefficients are selected, obtaining the set of features X. The multiplicative watermark embedding rule is defined as follows:

x_{w,i} = x_i + γ w_i x_i = x_i (1 + γ w_i), (30)

where x_{w,i} is the ith component of the watermarked feature vector and γ is a scaling factor controlling the watermark strength. Finally, an inverse DCT transform yields the watermarked signal X_w.
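A NumPy sketch of the embedding rule (30) together with a simple correlation detector is given below; for brevity it operates directly on a feature vector (the DCT step is omitted), and the detection statistic, its normalization by M, and the threshold are illustrative assumptions rather than the exact detector of [56].

```python
# Sketch of the embedding rule (30) and a correlation detector; parameters,
# the feature model, and the normalized correlation statistic are assumptions.
import numpy as np

rng = np.random.default_rng(0)
M, gamma, T = 10_000, 0.1, 0.5

X = rng.normal(10, 3, size=M)      # host features (e.g., large DCT coefficients)
W = rng.standard_normal(M)         # watermark: zero mean, unit variance

Xw = X * (1 + gamma * W)           # multiplicative embedding rule (30)

def detect(features, w, threshold=T):
    rho = np.dot(features, w) / len(w)   # correlation statistic (assumed form)
    return rho > threshold, rho

print(detect(Xw, W))                         # watermark present: rho near gamma * mean(X)
print(detect(X, W))                          # unmarked content: rho near 0
print(detect(Xw, rng.standard_normal(M)))    # wrong watermark: rho near 0
```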

To determine if a given signal Y contains the watermark W, the decoder computes the DCT of Y, extracts the set X' of largest DCT coefficients, and then computes the correlation ρ_{X'W} between the features X' and the watermark W. If the correlation is larger than a threshold T, that is,

ρ_{X'W} = ⟨X', W⟩
