4. Data Structures for RDF

4.1. Representing Trust Metrics with Annotations

One of the main problems of the Semantic Web environment is how to define the trust level of individual RDF triples. Because the Semantic Web is enormous and anybody may contribute to it and access it, the question arises of how far users can trust the information supplied by information providers. Trust metrics and annotations of RDF statements are solutions to this problem.

In this section we propose annotated RDF, trust metrics and inference principles. We also present ways to extend the query language with trust metrics and introduce algorithms for mapping to the pure RDF model.

Trust metrics are required both for individual RDF statements and for annotations of whole collections of RDF statements. In the first case a metric measures the degree of trust that an agent places in an RDF triple. Based on such annotations, agents can make inferences that calculate new trust metrics. In the second case, when calculating new metrics, agents annotate whole groups of RDF statements. This case is described in Section 4.2. The values of trust metrics can be the result of social interaction between agents.

4.1.1. Preliminaries

In this subsection we introduce RDF interpretation and semantic conditions, which should be satisfied, as the basis for extended interpretation. RDF consists of resources, properties, literals and classes. All the relationships between these parts are presented in Figure 4.1.

Figure 4.1: Hierarchy of RDF semantics

Let us denote by ρ a subset of the vocabularies of RDF [87], RDFS [26] and OWL [96] as follows: ρ = {rn, dm, tp, spo, sco, sa, df, io, ep, ec, pdw, dw}, where rn is range, dm is domain, tp is type, spo is subPropertyOf, sco is subClassOf, sa is sameAs, df is differentFrom, io is inverseOf, ep is equivalentProperty, ec is equivalentClass, pdw is propertyDisjointWith and dw is disjointWith.

Following [26, 96, 87] we introduce the concept of class. The set of classes CN is defined as CN = {x ∈ RN ∶ ⟨x, SN(C)⟩ ∈ EXTN(SN(tp))}, where RN is a nonempty set of named resources (see Definition 2.2). Additionally, the mapping CEXTN ∶ CN → 2^RN is defined as CEXTN(c) = {x ∈ RN ∶ ⟨x, c⟩ ∈ EXTN(SN(tp))}, where EXTN is an extension function used to associate properties with their property extension (see Definition 2.2).

Note that SN(x) ∈ CN and x ∈ {p, r, l, c}, where p is Property IRI, r is Resource IRI, l is Literal IRI and c is Class IRI. Let RN = CEXTN(r), PN = CEXTN(p), CN = CEXTN(c) and LVN = CEXTN(l), then a simple interpretation satisfies the following semantic conditions:

1. for properties:

1.1 x ∈ PN ⇔ ⟨x, SN(p)⟩ ∈ EXTN(SN(tp)),

3.4 (⟨x, y⟩ ∈ EXTN(SN(sa)) ∧ ⟨x, z⟩ ∈ EXTN(p)) ⇒ ⟨y, z⟩ ∈ EXTN(p),

3.5 (⟨x, y⟩ ∈ EXTN(SN(sa)) ∧ ⟨y, z⟩ ∈ EXTN(SN(sa))) ⇒ ⟨x, z⟩ ∈ EXTN(SN(sa)),

3.6 (⟨x, y⟩ ∈ EXTN(SN(sa)) ∧ ⟨x, z⟩ ∈ EXTN(p)) ⇒ ⟨y, z⟩ ∈ EXTN(p),

3.7 ⟨x, y⟩ ∈ EXTN(SN(df)) ⇒ (∃ p ∈ PN, z ∈ RN ∶ (⟨x, z⟩ ∈ EXTN(p) ⟺ ⟨y, z⟩ ∉ EXTN(p)) ∨ (⟨z, x⟩ ∈ EXTN(p) ⟺ ⟨z, y⟩ ∉ EXTN(p))).

The intention of these conditions is to determine the syntax of the language.

4.1.2. Annotated RDF with Trust Metrics

In this subsection we introduce annotated RDF and trust metrics. We also propose inference rules and an annotation algebra. The main benefit of this approach is that it allows private and public resources to be shared between sites according to the trust information.

Definition 4.1 (RDF quad). Assume that I is the set of all IRI references, B is an infinite set of blank nodes, and L is the set of all RDF literals (see Definition 2.1). Let O = I ∪ B ∪ L, S = I ∪ B, and let M be the set of real numbers M = {x ∈ R ∶ 0 ≤ x ≤ 1}. Then Q ⊆ S × I × O × M is the set of all RDF quads.

If q = ⟨s, p, o, m⟩ is an RDF quad, then s is the subject, p the predicate, o the object and m the trust metric.

Example 4.1. In this example we present the possible syntax of an RDF quad.

:me rdf:type :Teacher 0.9 .

:me foaf:name "John Smith" 1 .
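As a rough illustration of Definition 4.1, the quads of Example 4.1 could be held in a small Python structure. This is only a sketch: the names `Quad` and `make_quad` are hypothetical, and terms are represented as plain strings rather than as typed IRIs, blank nodes and literals.

```python
from typing import NamedTuple, Union

# Assumption: IRIs and blank nodes are plain strings; literals are strings or numbers.
Term = Union[str, int, float]


class Quad(NamedTuple):
    """An RDF quad <s, p, o, m> whose trust metric m lies in [0, 1]."""
    s: str
    p: str
    o: Term
    m: float


def make_quad(s: str, p: str, o: Term, m: float) -> Quad:
    """Build a quad, rejecting metrics outside the closed interval [0, 1]."""
    if not 0.0 <= m <= 1.0:
        raise ValueError(f"trust metric {m} is outside [0, 1]")
    return Quad(s, p, o, float(m))


# The two quads of Example 4.1.
quads = [
    make_quad(":me", "rdf:type", ":Teacher", 0.9),
    make_quad(":me", "foaf:name", '"John Smith"', 1),
]
```

The range check mirrors the condition M = {x ∈ R ∶ 0 ≤ x ≤ 1}; everything else about the representation is a simplification.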

Below we define a generalization of an RDF graph that adds a metric function.

Definition 4.2 (Extension of an RDF graph). An extension of an RDF graph is a tuple G = ⟨SO, ARC, f, t, I, larc, M, larcM⟩, where:

1. SO is a set of vertices,

2. ARC is a nonempty set of arcs,

3. f ∶ ARC → SO is a function which yields the source of each arc,

4. t ∶ ARC → SO is a function which yields the target of each arc,

5. I is a set of IRIs,

6. larc ∶ ARC → I is a function mapping arcs to IRIs,

7. M is a set of trust metrics,

8. larcM ∶ ARC → M is a function mapping arcs to trust metrics.

We denote by a_{s,p,o,m} an arc x ∈ ARC such that t(x) = o, f(x) = s, larc(x) = p and larcM(x) = m.

To define an RDF graph with a metric over Q we need the j-th projection map, written Proj^k_j, which takes an element x = ⟨x1, …, xj, …, xk⟩ of the Cartesian product X1 × … × Xj × … × Xk to the value Proj^k_j(x) = xj. This map is always surjective, provided that every factor Xi is nonempty.

Definition 4.3 (RDF graph with a metric). An RDF graph with a metric is an extension of an RDF graph over Q, that is, a tuple GQ = ⟨SOQ, ARCQ, fQ, tQ, IQ, lQarc, MQ, lQarcM⟩, where: SOQ = S ∪ O; ARCQ = Q; fQ = Proj^4_1∣Q; tQ = Proj^4_3∣Q; IQ = I; lQarc = Proj^4_2∣Q; MQ = M; lQarcM = Proj^4_4∣Q. Here ∣Q denotes the restriction of a projection to Q.
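The construction in Definition 4.3 can be sketched in Python by treating each quad as a 4-tuple and the arc functions as projections. The helper names `proj` and `graph_with_metric` are hypothetical. Unlike the definition, which takes SOQ = S ∪ O, IQ = I and MQ = M in full, this sketch keeps only the vertices, IRIs and metrics that actually occur in Q, since the full sets are infinite.

```python
def proj(j):
    """Return the j-th projection map (1-based) on tuples."""
    return lambda x: x[j - 1]


def graph_with_metric(quads):
    """Build a finite stand-in for G_Q from a collection of quads (s, p, o, m)."""
    arcs = set(quads)
    f, larc, t, larc_m = proj(1), proj(2), proj(3), proj(4)
    return {
        "SO": {f(q) for q in arcs} | {t(q) for q in arcs},  # vertices occurring in Q
        "ARC": arcs,                                         # every quad is an arc
        "f": f,                                              # source of an arc
        "t": t,                                              # target of an arc
        "I": {larc(q) for q in arcs},                        # IRIs occurring as predicates
        "larc": larc,                                        # arc -> predicate IRI
        "M": {larc_m(q) for q in arcs},                      # metrics occurring in Q
        "larcM": larc_m,                                     # arc -> trust metric
    }


# Quads of the sample FOAF profile used in Example 4.2.
Q = [
    ("<#js>", "rdf:type", "foaf:Person", 1.0),
    ("<#js>", "foaf:name", '"John Smith"', 0.9),
    ("<#js>", "foaf:workplaceHomepage", "<http://univ.com/>", 0.3),
]
g = graph_with_metric(Q)
```

Representing arcs by the quads themselves follows ARCQ = Q, so the source, target, label and metric of an arc are simply its four components.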

Example 4.2. Figure 4.2 presents an RDF graph with trust metrics constructed manually from a sample FOAF profile. A real number from the set {x ∈ R ∶ 0 ≤ x ≤ 1} is attached to every triple. We can see that the information about #js’s workplace homepage is not very trustworthy, whereas the information about his name is very trustworthy.

<#js> rdf:type foaf:Person 1 .

<#js> foaf:name "John Smith" 0.9 .

<#js> foaf:workplaceHomepage <http://univ.com/> 0.3 .

Figure 4.2: Simple example of model explanation

Operations on quads require an algebra.

Definition 4.4 (Boolean algebra). A Boolean algebra is M = ⟨M, ⊕, ⊗, ⊖, ⊥, ⊤⟩, where:

1. M is a partially ordered set of trust metrics,

2. ⊕ is a binary operation of addition on M,

3. ⊗ is a binary operation of multiplication on M,

4. ⊖ is a unary operation of complement on M,

5. ⊥ is the smallest element of M,

6. ⊤ is the largest element of M.

This structure complies with the following conditions:

— associativity: (a ⊕ b) ⊕ c = a ⊕ (b ⊕ c) ∧ (a ⊗ b) ⊗ c = a ⊗ (b ⊗ c),

— commutativity: a ⊕ b = b ⊕ a ∧ a ⊗ b = b ⊗ a,

— absorption: a ⊕ (a ⊗ b) = a ∧ a ⊗ (a ⊕ b) = a,

— distributivity: a ⊕ (b ⊗ c) = (a ⊕ b) ⊗ (a ⊕ c) ∧ a ⊗ (b ⊕ c) = (a ⊗ b) ⊕ (a ⊗ c),

— complements: a ⊕ ⊖a = ⊤ ∧ a ⊗ ⊖a = ⊥,

for arbitrary elements a, b, c ∈ M, where ⊤ and ⊥ denote the largest and the smallest element of M. For every a, b ∈ M we define a ⪯ b iff there exists c ∈ M such that a ⊕ c = b.

Example 4.3. In this example we present operations on the algebraic structure.

⊖(0.3 ⊕ 0.6) ⊗ 0.2 = (1 − max(0.3, 0.6)) ∗ 0.2 = 0.4 ∗ 0.2 = 0.08
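The calculation in Example 4.3 can be reproduced with a short Python sketch. It assumes the instantiation that the example itself suggests, namely ⊕ as max, ⊗ as ordinary multiplication and ⊖ as 1 − x; the function names are hypothetical.

```python
def t_add(a, b):
    """⊕ instantiated as the maximum of two metrics (assumption taken from Example 4.3)."""
    return max(a, b)


def t_mul(a, b):
    """⊗ instantiated as the ordinary product (assumption taken from Example 4.3)."""
    return a * b


def t_neg(a):
    """⊖ instantiated as the complement to 1 (assumption taken from Example 4.3)."""
    return 1.0 - a


def precedes(a, b):
    """a ⪯ b iff some c in [0, 1] gives max(a, c) = b, which here reduces to a <= b."""
    return a <= b


# ⊖(0.3 ⊕ 0.6) ⊗ 0.2
result = t_mul(t_neg(t_add(0.3, 0.6)), 0.2)
print(round(result, 2))
```

With max as ⊕, the order ⪯ from Definition 4.4 coincides with the usual order on real numbers, which is why `precedes` is a plain comparison.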

Definition 4.5 (M-annotated interpretation). An M-annotated interpretation is defined as a tuple E = ⟨RN, PN, EXTN, SN, LN, LVN, EXTM⟩, where RN, PN, EXTN, SN, LN and LVN are exactly the same as for a simple interpretation (see Definition 2.2).

EXTM is a partial function defined on triples consisting of a property and a pair of resources (EXTM ∶ RN × PN × RN → M), such that the triple ⟨x, p, y⟩ is in the domain of EXTM iff ⟨x, y⟩ ∈ EXTN(p), with p ∈ PN and x, y ∈ RN.

The set of classes CN is based on EXTN and is defined in exactly the same way as for a simple interpretation (see Subsection 4.1.1). Analogously, CEXTM is based on EXTM. CEXTM is a partial function CEXTM ∶ CN × RN → M whose domain is {⟨c, r⟩ ∶ c ∈ CN ∧ r ∈ CEXTN(c)}, and CEXTM(c, r) = EXTM(r, SN(tp), c), where c ∈ CN, r ∈ CEXTN(c). The interpretation is presented in Figure 4.3.

Example 4.4. In this example we present an M-annotated interpretation. This interpretation E for vocabulary V is given by:

RN = {a, b, c, α, β, γ, Ω, Υ, Φ, Ψ}

PN = {α, β, γ}

EXTN =α → {⟨a, b⟩}, β → {⟨a, Ω⟩}, γ → {⟨b, c⟩}

SN = :JS → a, foaf:Person → b, foaf:Agent → c

LN = ∅

LVN = {Ω}

EXTM = ⟨a, α, b⟩ → {Υ}, ⟨a, β, Ω⟩ → {Φ}, ⟨b, γ, c⟩ → {Ψ}

The interpretation E evaluates all three quads to true:

SN("0.9") = Υ ∈ EXTM(a, α, b) = EXTM(SN(:JS), SN(rdf:type), SN(foaf:Person))

SN("0.5") = Φ ∈ EXTM(a, β, Ω) = EXTM(SN(:JS), SN(foaf:name), SN("John Smith"))

SN("0.6") = Ψ ∈ EXTM(b, γ, c) = EXTM(SN(foaf:Person), SN(rdfs:subClassOf), SN(foaf:Agent))
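The domain condition of Definition 4.5 and the derived function CEXTM can be illustrated with the data of Example 4.4. The sketch below is a simplification: resources are spelled out as strings, and the metric resources Υ, Φ and Ψ are replaced directly by the numbers 0.9, 0.5 and 0.6 that they denote. The names `domain_is_consistent` and `cext_m` are hypothetical.

```python
# Extension function EXT_N: property -> set of (subject, object) pairs.
EXT_N = {
    "alpha": {("a", "b")},
    "beta": {("a", "Omega")},
    "gamma": {("b", "c")},
}

# Trust function EXT_M keyed by (x, p, y); numbers stand in for the metric resources.
EXT_M = {
    ("a", "alpha", "b"): 0.9,
    ("a", "beta", "Omega"): 0.5,
    ("b", "gamma", "c"): 0.6,
}

TP = "alpha"  # the resource denoted by rdf:type in Example 4.4


def domain_is_consistent(ext_n, ext_m):
    """Check that <x, p, y> is in the domain of EXT_M iff <x, y> is in EXT_N(p)."""
    expected = {(x, p, y) for p, pairs in ext_n.items() for (x, y) in pairs}
    return expected == set(ext_m)


def cext_m(c, r, ext_m=EXT_M, tp=TP):
    """CEXT_M(c, r) = EXT_M(r, S_N(tp), c); returns None outside the domain."""
    return ext_m.get((r, tp, c))


print(domain_is_consistent(EXT_N, EXT_M))
print(cext_m("b", "a"))
```

Returning `None` outside the domain is one way to model a partial function; raising an exception would be an equally reasonable choice.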

[Figure labels: literals (simple, typed), metric, IRIs; LVN, RN, PN; LN, SN; vocabulary V, interpretation E]

Figure 4.3: Example of M-annotated interpretation

We define the extended simple interpretation N′ = ⟨R′N, P′N, EXT′N, S′N, L′N, LV′N⟩, which extends the simple interpretation underlying E and satisfies RN ⊆ R′N, PN ⊆ P′N, p ∈ PN ⟹ EXTN(p) ⊆ EXT′N(p), S′N∣D(SN) = SN, L′N∣D(LN) = LN and LVN ⊆ LV′N, where D(f) denotes the domain of a function f. We also consider a partial function ℰ ∶ R′N × P′N × R′N → M, which satisfies D(EXTM) = D(ℰ) and EXTM(x, p, y) = ℰ(x, p, y), where ⟨x, p, y⟩ ∈ D(EXTM).

We construct the trust function EXT′M, which is a natural extension of ℰ, so that the structure N′ enriched with EXT′M is an interpretation. Additionally, we require that it satisfy the condition ℰ(x, p, y) ⪯ EXT′M(x, p, y), where ⟨x, p, y⟩ ∈ D(ℰ).

To this end, let us consider an auxiliary partial function Ê ∶ R′N × P′N × R′N → 2^M, such that D(Ê) = {⟨x, p, y⟩ ∶ p ∈ P′N ∧ ⟨x, y⟩ ∈ EXT′N(p)} and

Ê(x, p, y) = {ℰ(x, p, y)}, if ⟨x, p, y⟩ ∈ D(EXTM),

Ê(x, p, y) = ∅, otherwise.

We denote by (x, p, y, m) the dependence m ∈ Ê(x, p, y). Next, we enrich each of the sets Ê(x, p, y) using the inference rules from Table 4.1. These rules are based on the semantic conditions (see Subsection 4.1.1). In the table x, y and z represent properties, X, Y and Z represent classes, A, B and C represent individuals, and m, n are trust metrics.

Lastly, the partial function EXT′M is determined by D(EXT′M) = D(Ê) and EXT′M(x, p, y) = inf Ê(x, p, y), where ⟨x, p, y⟩ ∈ D(Ê) and inf Ê(x, p, y) is evaluated with respect to the relation ⪯. Finally, E′ = ⟨R′N, P′N, EXT′N, S′N, L′N, LV′N, EXT′M⟩ is the extended M-annotated interpretation.

Table 4.1: Inference rules (only the premises of rule R1 are preserved: (A, x, B, m) and (A, x, B, n))

Example 4.5. Figure 4.4 shows an example of a graph with incomplete trust metrics from a sample FOAF profile. As in the previous example, a real number is attached to every RDF triple. The calculations have the general form (a ⊗ b) ⊕ (c ⊗ d), and the calculations for the inferred RDF triple are performed by suitably instantiating the operators ⊗ and ⊕. The rules used are presented in Table 4.1. The annotation for the inferred triple is calculated as max(0.9 ∗ 0.6; 0.5 ∗ 0.7) = 0.54 (consider rules R1, R8 and R11 from Table 4.1).
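The general form (a ⊗ b) ⊕ (c ⊗ d) can be read as combining metrics with ⊗ along each derivation of the inferred triple and with ⊕ across alternative derivations. The sketch below follows that reading and assumes the instantiation used in the example (⊗ as product, ⊕ as max). The function `inferred_trust` is hypothetical and does not implement the individual rules of Table 4.1.

```python
from functools import reduce


def t_mul(a, b):
    """⊗ instantiated as the product (assumption taken from Example 4.5)."""
    return a * b


def t_add(a, b):
    """⊕ instantiated as the maximum (assumption taken from Example 4.5)."""
    return max(a, b)


def inferred_trust(derivations):
    """Combine metrics with ⊗ inside each derivation and with ⊕ across derivations."""
    per_derivation = [reduce(t_mul, metrics) for metrics in derivations]
    return reduce(t_add, per_derivation)


# Two alternative derivations of the inferred triple, with the metrics of Example 4.5.
trust = inferred_trust([[0.9, 0.6], [0.5, 0.7]])
print(round(trust, 2))
```

Each inner list holds the metrics of the triples used in one derivation, so adding a third derivation only means appending another list.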

[Figure labels: <#DT>, <http://dbpedia.org/>]

Figure 4.4: Example of using inference rules

4.1.3. Querying Trust with Extended SPARQL

In this subsection we propose a trust-aware extension of SPARQL. It adds new constructs to the query language that allow trust metrics to be used in queries and give access to the trust values.

We propose the TRUST VAR clause. It introduces a new variable, which becomes part of the query result. Like any SPARQL variable, it can be used for sorting the results, filtering them, etc.

We also introduce the TRUST VAL clause. The clause consists of a pair of numbers in parentheses, where the first value denotes a lower bound and the second an upper bound. This range limits the query result. The TRUST VAR and TRUST VAL clauses can be used together. Both clauses should be placed inside the WHERE clause. Listing 4.1 presents a grammar fragment of extended SPARQL in the EBNF variant as defined in [116]. The new terminals are placed in lines 6-10.

1 GraphPatternNotTriples ::= GroupOrUnionGraphPattern |
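The intended effect of the two clauses can be emulated outside SPARQL with a small Python sketch over quads. It is not an implementation of the grammar above. The function `match_with_trust` is hypothetical, the bounds of the range are assumed to be inclusive, and the bound trust variable is returned under the key `"trust"`.

```python
def match_with_trust(quads, predicate, trust_range=(0.0, 1.0)):
    """Return bindings for quads with the given predicate whose metric lies in the range.

    The metric is exposed in each binding (the role of TRUST VAR) and is restricted
    to the closed interval trust_range (the role of TRUST VAL; inclusive bounds are
    an assumption).
    """
    lower, upper = trust_range
    return [
        {"s": s, "o": o, "trust": m}
        for (s, p, o, m) in quads
        if p == predicate and lower <= m <= upper
    ]


# Quads of the sample FOAF profile from Example 4.2.
data = [
    ("<#js>", "rdf:type", "foaf:Person", 1.0),
    ("<#js>", "foaf:name", '"John Smith"', 0.9),
    ("<#js>", "foaf:workplaceHomepage", "<http://univ.com/>", 0.3),
]

names = match_with_trust(data, "foaf:name", (0.5, 1.0))
homepages = match_with_trust(data, "foaf:workplaceHomepage", (0.5, 1.0))
# Because the metric is part of each binding, results can be sorted by it.
ranked = sorted(
    (b for p in ("rdf:type", "foaf:name") for b in match_with_trust(data, p)),
    key=lambda b: b["trust"],
    reverse=True,
)
```

With the range (0.5, 1.0) the name is returned, while the workplace homepage with metric 0.3 is filtered out.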

Example 4.6. In this example we use a trust variable in a trust-aware SPARQL query over the RDF data presented in Example 4.2. This solution gives us one way in which the selected variables can be bound to RDF terms so that the query pattern matches the data. The result sets give all the possible solutions.

PREFIX foaf: <http://xmlns.com/foaf/0.1/>

Example 4.7. In this example we use a trust range in a trust-aware SPARQL query over the RDF data presented in Example 4.2. This solution gives us one way in which the selected variables can be bound to RDF terms so that the query pattern matches the data. The result sets give all the possible solutions.

PREFIX foaf: <http://xmlns.com/foaf/0.1/>

4.1.4. Mapping into the RDF model

In this subsection we present the mapping from our approach to the RDF model. This mapping matters for compatibility with existing systems.

We propose a simple ontology for the trust metric. We define the trustValue property, in which a trust metric can be stored. The range of the property is the set {x ∈ R ∶ 0 ≤ x ≤ 1}, represented as a typed literal. Listing 4.2 presents the Simple Trust Ontology (STO) in Terse RDF Triple Language [89].

We present two methods for mapping our proposal to an RDF model: reification of the statement and named graphs [31].

input : graph with trust GQ

output: graph GT

8 insert triple (US, PS, s);

9 insert triple (US, PP, p);

10 insert triple (US, PO, o);

11 get metric from q = (s, p, o, m);

12 insert triple (US, PV, m);

Algorithm 4.1: Mapping to reification statements

The first method is based on the built-in vocabulary intended for describing RDF statements. Our reification proposal consists of the type Statement and the properties subject, predicate, object and trustValue. Algorithm 4.1 presents the transformation process in our approach, which uses RDF reification.

The second method is based on named graphs [31]. Algorithm 4.2 presents the transformation process, which uses named graphs.

Example 4.8. In this example we map an RDF quad from Example 4.1 into a reification statement. The example is expressed in Terse RDF Triple Language [89].

_:st rdf:type rdf:Statement ;

rdf:subject :me ;

rdf:predicate rdf:type ;

rdf:object :Teacher ;

sto:trustValue "0.9"^^xsd:float .
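The reification mapping of Example 4.8 can be sketched in Python as follows. The steps of Algorithm 4.1 are only partly given above, so this is a sketch under assumptions: one blank node is created per quad, the blank node labels `_:st0`, `_:st1`, … are hypothetical, and triples are plain tuples of strings rather than objects of an RDF library.

```python
from itertools import count


def to_reification(quads):
    """Map quads (s, p, o, m) to reification triples carrying sto:trustValue."""
    triples = []
    ids = count()
    for s, p, o, m in quads:
        st = f"_:st{next(ids)}"  # hypothetical label for the statement node
        triples.extend([
            (st, "rdf:type", "rdf:Statement"),
            (st, "rdf:subject", s),
            (st, "rdf:predicate", p),
            (st, "rdf:object", o),
            (st, "sto:trustValue", m),
        ])
    return triples


# The quads of Example 4.1.
quads = [
    (":me", "rdf:type", ":Teacher", 0.9),
    (":me", "foaf:name", '"John Smith"', 1.0),
]
reified = to_reification(quads)
for triple in reified[:5]:
    print(triple)
```

Every quad yields five triples, which illustrates why reification is the more verbose of the two mappings.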

input : graph with trust GQ

output: named graph N G, default graph GT

1 foreach q ∈ Q do

2 create unique named graph UNG;

3 get triple from q = (s, p, o, m);

4 insert (s, p, o) into UNG;

5 create subject UNG;

6 create predicate "trustValue";

7 get metric from q = (s, p, o, m);

8 create object m;

Algorithm 4.2: Mapping to Named Graphs

Example 4.9. In this example we map an RDF quad from Example 4.1 into a named graph. The example is provided in TriG [20].

metric:tg {

:me rdf:type :Teacher }

metric:tg sto:trustValue "0.9"^^xsd:float .
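The named graph mapping can be sketched in the same style. The sketch assumes that each quad receives its own named graph and that the trust triple about that graph is placed in the default graph, as in Example 4.9. The graph names `metric:g0`, `metric:g1`, … and the function `to_named_graphs` are hypothetical.

```python
def to_named_graphs(quads, prefix="metric:g"):
    """Map quads (s, p, o, m) to (named graphs, default graph triples)."""
    named = {}    # graph name -> list of triples in that graph
    default = []  # trust annotations about the named graphs
    for i, (s, p, o, m) in enumerate(quads):
        name = f"{prefix}{i}"  # hypothetical unique graph name
        named[name] = [(s, p, o)]
        default.append((name, "sto:trustValue", m))
    return named, default


# The quads of Example 4.1.
quads = [
    (":me", "rdf:type", ":Teacher", 0.9),
    (":me", "foaf:name", '"John Smith"', 1.0),
]
named, default = to_named_graphs(quads)
print(named["metric:g0"])
print(default[0])
```

Here every quad produces one triple in a named graph and one annotation triple, compared with five triples per quad under reification.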

However, the second method has two drawbacks: many pure RDF providers would not be forward-compatible with such a quad-centric data representation, and none of the named graph syntaxes is a standard at the moment. On the other hand, the second approach allows a more compact serialization of annotations than the first method.
