5. Operations on RDF Graph Store

5.1. Matching Subgraphs and Generating Views

One of the major problems of the RDF graph store is the method of subgraph selection and generation of collection views. Taking into account that graph stores exist in various environments, we should ask how best to facilitate connecting data from heterogeneous sources and querying them. In this section we propose a new way of querying based on IRI references that is a solution to this problem.

This section presents graph store management operations, query operations on collections and the way to generate collection views. We also propose mapping algorithms to the SPARQL language.

5.1.1. RESTful Management

In this subsection we discuss the idea of using RESTful access to the graph store’s data, in particular the set of management operations supported by the graph store and HTTP response status codes.

The most essential property of REST, as noted in [46], is that the components of a REST-style architecture can be deployed independently and are scalable.

Following [47], the interfaces are general, and intermediary components can reduce latency and enforce security. Furthermore, the client-server separation of concerns reduces the complexity of connector semantics, simplifies component implementation, increases the scalability of pure server components, and improves the effectiveness of performance tuning.

Definition 5.1 (Representation). Let Rep be the finite set of all representations, each consisting of data Dt, headers H, and a media type identifier MT; then Rep ⊆ Dt × 2^H × MT.

Definition 5.2 (Resource message, request, response). A resource message is a request or a response. Let Req be the finite set of valid resource requests, each consisting of an HTTP method from Mth, a resource identifier from RI (defined in [43]), and a representation from Rep; then Req ⊆ Mth × RI × Rep, and the set of responses is Res ⊆ Rep.

The supported media types of the proposed graph store can be represented by RDFJD (see Section 4.2) or any other valid type. Requests to the graph store depend on the Accept HTTP headers and the responses depend on the Content-Type header. Responses can use vocabularies, such as HTTP in RDF defined in [69] and/or Representing Content in RDF defined in [70], to express details of success or failure (in particular, error messages).
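Definitions 5.1 and 5.2 can be read as plain data types. The following is a minimal sketch only: the class and field names are illustrative assumptions, and the media type string is the one used in Examples 5.2 and 5.4.

```python
from dataclasses import dataclass
from typing import FrozenSet, Tuple

Header = Tuple[str, str]  # (field name, field value); an assumed encoding of H


@dataclass(frozen=True)
class Representation:
    """One element of Rep ⊆ Dt × 2^H × MT."""
    data: bytes                  # Dt
    headers: FrozenSet[Header]   # an element of 2^H
    media_type: str              # MT


@dataclass(frozen=True)
class Request:
    """One element of Req ⊆ Mth × RI × Rep."""
    method: str                  # Mth
    resource: str                # RI
    representation: Representation


# A response is just a representation (Res ⊆ Rep).
Response = Representation

RDFJD = "application/x-rdfjd+json"  # media type as written in the examples

empty = Representation(b"", frozenset({("Accept", RDFJD)}), RDFJD)
drop_users = Request("DELETE", "http://example.org/users", empty)
```

Keeping both types immutable mirrors the set-theoretic reading, in which a request is a tuple rather than a mutable object.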

Such details can help a client decide which requests should or should not be sent.

We define two meta operations:

1. management operation,

2. query operation.

Definition 5.3 (Graph management operation). A graph management operation OpGraphManagement_GSD is a function that accepts the arguments A and transforms a document-oriented graph store GSD into another document-oriented graph store GSD′: OpGraphManagement_GSD(A) = GSD′.

Arguments should be IRIs or an empty set. The result is either GSD′ in the case of correct execution or GSD in the case of an error. These operations allow one to manipulate graphs:

1. OpList(). It selects all triples from a graph store.

2. OpCreate(u) with u ∈ I. It creates a graph in a graph store.

3. OpLoad(u1, u2) with u1 ∈ I and u2 ∈ I. It reads an RDF document, and inserts its triples into the graph in a graph store.

4. OpDrop(u) with u ∈ I. It removes the specified graph from a graph store.

5. OpInfo(). It returns information about a graph store. It may return a SPARQL Service Description [115].
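The management operations above can be sketched over an in-memory store. This is an illustration under stated assumptions, not the proposed implementation: the store layout (a dictionary from graph IRIs to sets of triples) and the function names are hypothetical, and OpLoad is simplified to take already-parsed triples instead of a document IRI. On error each function returns the store unchanged, matching the behaviour described for Definition 5.3.

```python
from typing import Dict, FrozenSet, Iterable, Tuple

Triple = Tuple[str, str, str]
# Hypothetical layout: graph IRI -> immutable set of triples.
Store = Dict[str, FrozenSet[Triple]]


def op_create(store: Store, u: str) -> Store:
    """OpCreate(u): add an empty graph; return the store unchanged if u exists."""
    if u in store:
        return store
    return {**store, u: frozenset()}


def op_load(store: Store, u: str, document: Iterable[Triple]) -> Store:
    """Simplified OpLoad: insert parsed triples into the existing graph u."""
    if u not in store:
        return store
    return {**store, u: store[u] | frozenset(document)}


def op_drop(store: Store, u: str) -> Store:
    """OpDrop(u): remove graph u; return the store unchanged if u is missing."""
    if u not in store:
        return store
    return {g: triples for g, triples in store.items() if g != u}


def op_list(store: Store) -> FrozenSet[Triple]:
    """OpList(): all triples from every graph in the store."""
    return frozenset(t for triples in store.values() for t in triples)
```

Returning a new dictionary rather than mutating the argument keeps the "GSD to GSD′" reading explicit.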

Example 5.1. In this example we present the removal of a collection in an abstract form.

OpDrop("http://example.org/users")

Example 5.2. In this example we present a request to remove the collection, which is equivalent to abstract Example 5.1. http://example.org/users is a possible IRI query.

The request uses:

— DELETE method,

— collection name, which is a concatenation of request-uri and Host header,

— content-types that are acceptable (RDFJD).

DELETE /users HTTP/1.1
Host: example.org
If-Match: "*"
Accept: application/x-rdfjd+json; charset=utf-8
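The translation from the abstract OpDrop(u) to the request above can be sketched as follows. The function name is hypothetical; the header values are copied from Example 5.2, and the split of the collection IRI into Host header and request-uri follows the description of that example.

```python
from urllib.parse import urlsplit


def drop_request(collection_iri: str) -> str:
    """Build the text of a DELETE request for the given collection IRI."""
    parts = urlsplit(collection_iri)
    lines = [
        f"DELETE {parts.path or '/'} HTTP/1.1",  # request-uri
        f"Host: {parts.netloc}",                 # Host header
        'If-Match: "*"',
        "Accept: application/x-rdfjd+json; charset=utf-8",
    ]
    # HTTP header lines end with CRLF; an empty line closes the header block.
    return "\r\n".join(lines) + "\r\n\r\n"


request_text = drop_request("http://example.org/users")
```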

5.1.2. RESTful Querying

In this subsection we propose the set of query operations supported by the graph store and syntax for the PATCH HTTP method [44].

Definition 5.4 (Quad pattern). Let V be the set of all variables, which is infinite and disjoint with I, B, L (see Definition 4.1) then QP ⊆ (S ∪ V) × (I ∪ V) × (O ∪ V) × (M ∪ V) is the set of all RDF quad patterns.

Definition 5.5 (Graph query operation). A graph query operation OpGraphQueries_GSD is a function that accepts the argument B and transforms a document-oriented graph store GSD into another document-oriented graph store GSD′: OpGraphQueries_GSD(B) = GSD′.

Arguments should be in the RDF triples form, triple patterns form or an empty set.

The operation either performs the described transformation of the graph store completely or leaves the graph store unchanged. These operations allow one to manipulate RDF quads:

1. OpSelect(qp) with {qp ∶ qp ∈ QP}. It combines the operations of projecting from the graph store.

2. OpInsert(q) with {q ∶ q ∈ Q}. It adds triples into the graph store.

3. OpDelete(q) with {q ∶ q ∈ Q}. It removes triples from the graph store.

4. OpAsk(qp) with {qp ∶ qp ∈ QP}. It tests whether a query has a solution or not.

5. OpDescribe(). It returns a result containing data about graph store resources.

Let QP1 and QP2 be quad patterns; then (QP1 AND QP2) and (QP1 UNION QP2) are quad patterns. Following Definition 4.1 and Definition 5.4, a mapping µ from V to Q is a partial function µ ∶ V → Q, and D(µ) denotes the domain of µ. The union of two mappings should comply with the rules D(µ1 ∪ µ2) = D(µ1) ∪ D(µ2), V ∈ D(µ1) ∖ D(µ2) ⇒ (µ1 ∪ µ2)(V) = µ1(V), and V ∈ D(µ2) ∖ D(µ1) ⇒ (µ1 ∪ µ2)(V) = µ2(V). Two mappings µ1 and µ2 are compatible, denoted by µ1 ∼ µ2, when for all V ∈ D(µ1) ∩ D(µ2) with µ1(V) = (s1, p1, o1, m1) and µ2(V) = (s2, p2, o2, m2) it holds that s1 = s2, p1 = p2 and o1 = o2; in that case (µ1 ∪ µ2)(V) = (s1, p1, o1, m1 ⊗ m2), where the operation ⊗ is defined in Definition 4.4. The mapping with an empty domain is denoted by µ∅. Note that this mapping is compatible with any other mapping. Let W be a set of variables, W ⊆ V; then the restriction of µ to W (denoted by µ∣W) is the mapping such that D(µ∣W) = D(µ) ∩ W and µ∣W(V) = µ(V) for every V ∈ D(µ) ∩ W. Note that if W = ∅, then µ∣W = µ∅.
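The compatibility test, the union of compatible mappings and the restriction can be sketched in a few lines. The operation ⊗ comes from Definition 4.4, which is outside this section, so the sketch takes it as a parameter and uses min purely as a placeholder assumption; all function names are illustrative.

```python
from typing import Callable, Dict, Optional, Set, Tuple

Quad = Tuple[str, str, str, float]   # (subject, predicate, object, trust metric)
Mapping = Dict[str, Quad]            # partial function from variables to quads


def compatible(mu1: Mapping, mu2: Mapping) -> bool:
    """mu1 ~ mu2: shared variables agree on subject, predicate and object."""
    return all(mu1[v][:3] == mu2[v][:3] for v in mu1.keys() & mu2.keys())


def merge(mu1: Mapping, mu2: Mapping,
          combine: Callable[[float, float], float] = min) -> Optional[Mapping]:
    """Union of two compatible mappings; `combine` stands in for the ⊗ operation."""
    if not compatible(mu1, mu2):
        return None
    merged = {**mu1, **mu2}
    for v in mu1.keys() & mu2.keys():
        s, p, o, m1 = mu1[v]
        merged[v] = (s, p, o, combine(m1, mu2[v][3]))
    return merged


def restrict(mu: Mapping, w: Set[str]) -> Mapping:
    """Restriction of mu to the variable set w."""
    return {v: q for v, q in mu.items() if v in w}


mu1 = {"?x": ("ex:me", "foaf:name", "John Smith", 0.5)}
mu2 = {"?x": ("ex:me", "foaf:name", "John Smith", 0.8),
       "?y": ("ex:me", "rdf:type", "foaf:Person", 0.9)}
merged = merge(mu1, mu2)
```

With the placeholder min, the shared variable ?x keeps the lower of the two trust values; a different ⊗ only requires passing another function as `combine`.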

Definition 5.6 (Evaluation of a graph pattern over a graph with a trust metric). The evaluation of a graph pattern QP over a graph with a trust metric GQ, denoted by ⟦QP⟧_GQ, is defined recursively as follows:

1. ⟦QP⟧_GQ = {µ ∶ V → Q ∣ ∀q ∈ GQ, D(µ) ⊆ V(q)}, where D(µ) denotes the domain of the function µ and V(q) is the set of variables mentioned in q,

2. ⟦QP1 AND QP2⟧_GQ = {µ1 ∪ µ2 ∣ µ1 ∈ ⟦QP1⟧_GQ ∧ µ2 ∈ ⟦QP2⟧_GQ ∧ µ1 ∼ µ2},

3. ⟦QP1 UNION QP2⟧_GQ = ⟦QP1⟧_GQ ∪ ⟦QP2⟧_GQ.
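To make the recursive structure concrete, here is a small evaluator for AND and UNION. It is a deliberate simplification, not the definition itself: variables bind to individual terms, as in the usual SPARQL-style reading, rather than to whole quads, and the trust metric is treated as an ordinary fourth term. Names and sample data are illustrative assumptions.

```python
from typing import Dict, List, Set, Tuple

Quad = Tuple[str, str, str, float]    # (subject, predicate, object, trust)
Pattern = Tuple[str, str, str, str]   # terms starting with "?" are variables
Binding = Dict[str, object]


def match(pattern: Pattern, quads: Set[Quad]) -> List[Binding]:
    """Bindings under which a single quad pattern equals some quad."""
    results: List[Binding] = []
    for quad in quads:
        binding: Binding = {}
        for term, value in zip(pattern, quad):
            if term.startswith("?"):
                # A repeated variable must bind to the same value.
                if binding.setdefault(term, value) != value:
                    break
            elif term != value:
                break
        else:  # no break: every position matched
            results.append(binding)
    return results


def evaluate_and(left: List[Binding], right: List[Binding]) -> List[Binding]:
    """Join: keep merged pairs that agree on all shared variables."""
    return [{**a, **b} for a in left for b in right
            if all(a[v] == b[v] for v in a.keys() & b.keys())]


def evaluate_union(left: List[Binding], right: List[Binding]) -> List[Binding]:
    """Union: solutions of either operand."""
    return left + right


quads = {
    ("ex:me", "rdf:type", "foaf:Person", 0.9),
    ("ex:me", "foaf:name", "John Smith", 0.5),
    ("ex:doc", "foaf:name", "Report", 0.7),
}
people = match(("?p", "rdf:type", "foaf:Person", "?t1"), quads)
named = match(("?p", "foaf:name", "?n", "?t2"), quads)
names = {b["?n"] for b in evaluate_and(people, named)}
```

The last three lines evaluate a conjunctive query that selects the names of persons: only the resource typed as foaf:Person contributes a name.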

Example 5.3. In this example we present a conjunctive query, which selects all names.

Each variable of the query header appears in at least one quad pattern. The quad patterns of the query make up the set QP.

?n <- (?p, rdf:type, foaf:Person, ?t1) (?p, foaf:name, ?n, ?t2)

Example 5.4. In this example we present a request selecting the RDF quads from the collection, which is equivalent to abstract Example 5.3. This example is provided in RDFJD.

http://example.org/users/-p,rdf.type,foaf.Person,-t1/-p,foaf.name,--n,-t2 is a possible IRI query. The request uses:

— GET method,

— collection name, which is a concatenation of the first part of request-uri and Host header (http://example.org/users),

— query, which is the second part of request-uri,

— content-types that are acceptable (i.e. RDFJD).

GET /users/-p,rdf.type,foaf.Person,-t1/-p,foaf.name,--n,-t2 HTTP/1.1
Host: example.org
Accept: application/x-rdfjd+json; charset=utf-8

Example 5.5. In this example we present the response of the query from Example 5.4.

The response includes:

— success status code (200),

— Content-Type header of RDFJD document,

— RDFJD document with the result.

HTTP/1.1 200 OK
Server: GraphStore/1.0
Content-Type: application/x-rdfjd+json; charset=utf-8

{
  "_subject": "http://example.com/voc#me",
  "foaf.name": {
    "_value": "John Smith",
    "_trust": 0.5
  }
}

We also propose two regular-expression operators over the vocabulary I, which together form an extended quad predicate: ∣, which is disjunction, and ∗, which is concatenation. An extended quad predicate is a tuple QPE of the form ⟨s, px, o, m⟩, where px is a predicate with the proposed extensions. If px = x1∣x2, then ⟦Q⟧_GQ is the result of the evaluation of the pattern (⟨s, x1, o, m1⟩ UNION ⟨s, x2, o, m2⟩) over GQ: ⟦Q⟧_GQ = {µ ∣ µ ∈ ⟦⟨s, x1, o, m1⟩⟧_GQ ∨ µ ∈ ⟦⟨s, x2, o, m2⟩⟧_GQ}, where Q is a set of quads ⟨s, px, o, ⋅⟩. If px = x1∗x2, then, assuming that ?V is a variable, ⟦Q⟧_GQ is the result of the evaluation of the pattern (⟨s, x1, ?V, m1⟩ AND ⟨?V, x2, o, m2⟩) over GQ: ⟦Q⟧_GQ = {(µ1 ∪ µ2) ∣ µ1 ∼ µ2 ∧ ∀?V, m1, m2 ∶ µ1 ∈ ⟦⟨s, x1, ?V, m1⟩⟧_GQ ∧ µ2 ∈ ⟦⟨?V, x2, o, m2⟩⟧_GQ}, where Q is a set of quads ⟨s, px, o, ⋅⟩. To translate an extended quad predicate we propose two functions:

1. disj(x1, x2, m1, m2) – an alternative path of x1 or x2 with metrics m1 and m2,

2. conc(x1, x2, m1, m2) – a sequence path of x1 followed by x2 with metrics m1 and m2.

Example 5.6. In this example we present an extended quad predicate with the conc function.

The query:

-a,foaf.name,John,-t1/
-a,foaf.knows*foaf.name,--name,-t2*-t3

is equivalent to the query:

-a,foaf.name,John,-t1/
-a,foaf.knows,-b,-t2/
-b,foaf.name,--name,-t3
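The translation of an extended quad predicate into plain quad patterns can be sketched as below, following the definitions of ∣ (disjunction, evaluated as UNION) and ∗ (concatenation, evaluated as AND through an intermediate variable). The function names, the tuple-based pattern encoding, the choice of -b as the fresh variable, and the convention that the trust element is split with the same operator are assumptions made for illustration.

```python
from typing import List, Tuple

Pattern = Tuple[str, str, str, str]   # (subject, predicate, object, trust)


def _pair(text: str, sep: str) -> Tuple[str, str]:
    """Split 'a<sep>b' into (a, b); reuse the whole text if sep is absent."""
    left, found, right = text.partition(sep)
    return (left, right) if found else (text, text)


def expand(pattern: Pattern, fresh: str = "-b") -> Tuple[str, List[Pattern]]:
    """Return the operator and the plain quad patterns it combines."""
    s, px, o, m = pattern
    if "*" in px:                      # concatenation: x1 followed by x2
        x1, x2 = _pair(px, "*")
        m1, m2 = _pair(m, "*")
        return ("AND", [(s, x1, fresh, m1), (fresh, x2, o, m2)])
    if "|" in px:                      # disjunction: x1 or x2
        x1, x2 = _pair(px, "|")
        m1, m2 = _pair(m, "|")
        return ("UNION", [(s, x1, o, m1), (s, x2, o, m2)])
    return ("AND", [pattern])          # ordinary predicate, nothing to expand


sequence = expand(("-a", "foaf.knows*foaf.name", "--name", "-t2*-t3"))
alternative = expand(("-a", "foaf.nick|foaf.name", "--name", "-t"))
```

A real translator would have to pick a fresh variable that does not clash with variables already used in the query; the fixed default here is only for readability.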

Since delete and insert operations are often combined, we introduce a new Update Ontology (UO) for OpUpdate(qp1, qp2), which consists of OpInsert and OpDelete. It uses the HTTP PATCH method, which applies partial modifications to an RDF graph store. In this approach we do not use the IRI to define which triples should be added and/or removed. Instead, we define them in the body of the request message (RDF payload), using a special RDF document based on UO. The main properties of UO are delete, which contains a remove pattern, and insert, which contains an add pattern. These patterns are based on the built-in vocabulary intended for describing RDF statements. There is also an Update class, which is the domain of delete and insert. Listing 5.1 presents UO in Terse RDF Triple Language [89].

HTTP method Management operation Query operation

Table 5.1: Mapping graph store operations to HTTP methods

<#delete> a owl:ObjectProperty ;

Example 5.7. In this example we present an update that sets the object of the matching RDF quads to James Johnson, where the subject is http://example.com/voc#me, the predicate is http://xmlns.com/foaf/0.1/name and the trust metric is unconstrained. The listing is given in Terse RDF Triple Language [89].

<> a u:Update ;
   u:delete ([ u:subject <http://example.com/voc#me> ;
               u:predicate foaf:name ;
               u:object _:old ;
               u:trust _:t ; ]) ;
   u:insert ([ u:subject <http://example.com/voc#me> ;
               u:predicate foaf:name ;
               u:object "James Johnson" ;
               u:trust _:t ; ]) .
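The effect of such an update on a set of quads can be sketched as a delete followed by an insert. The sketch assumes that blank nodes in the delete pattern act as wildcards and that a blank node shared by both patterns (here _:t) carries the matched value into the inserted quad; this reading is an assumption, as are the function names and the tuple encoding.

```python
from typing import Dict, Optional, Set, Tuple

Quad = Tuple[object, object, object, object]   # (subject, predicate, object, trust)


def is_wildcard(term: object) -> bool:
    """Blank-node labels such as '_:t' are treated as wildcards (assumption)."""
    return isinstance(term, str) and term.startswith("_:")


def bind(pattern: Quad, quad: Quad) -> Optional[Dict[str, object]]:
    """Match a pattern against a quad; return wildcard bindings or None."""
    binding: Dict[str, object] = {}
    for term, value in zip(pattern, quad):
        if is_wildcard(term):
            if binding.setdefault(term, value) != value:
                return None
        elif term != value:
            return None
    return binding


def op_update(store: Set[Quad], delete: Quad, insert: Quad) -> Set[Quad]:
    """Remove every quad matching `delete` and add the instantiated `insert`."""
    result = set(store)
    for quad in store:
        binding = bind(delete, quad)
        if binding is None:
            continue
        result.discard(quad)
        result.add(tuple(binding.get(t, t) if is_wildcard(t) else t for t in insert))
    return result


ME = "http://example.com/voc#me"
before = {(ME, "foaf:name", "John Smith", 0.5)}
after = op_update(before,
                  (ME, "foaf:name", "_:old", "_:t"),
                  (ME, "foaf:name", "James Johnson", "_:t"))
```

Because the function builds a new set, a failed or empty match leaves the original store untouched, in line with the all-or-nothing behaviour described for query operations.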

5.1.3. Collection Views and Query Syntax

In this subsection the collection view and query syntax are presented. We extend the concept of a data collection from Section 4.2 with the possibility of defining a collection by using views on other collections.

These views can be used to transform RDF from other collections and to include the transformed parts in a base collection. Collection views are generated by combining RDF data from disparate graph stores or other Semantic Web Services [77]. This extends the collection to support queries that merge data distributed across RDF graph stores. The aim is to combine the data in a value-adding manner in order to create useful information and knowledge. The content elements of these collections come in various RDF formats and ontologies. Our proposed graph store can act as a proxy for other graph stores and can be used as a resource in RDF documents. The proposal works in a federated, decentralized environment, with queries delegated through a graph store proxy.

Definition 5.7 (Collection view). A collection view is a tuple CV = ⟨Cb, [C1e, …, Cne], η⟩, where:

1. Cb is a base collection (see Section 4.2),

2. Ce is an external collection, and [C1e, …, Cne] is a list of external collections,

3. η is a mapping function from the list of external collections to the base collection, η ∶ [C1e, …, Cne] → Cb.

A collection view CV = ⟨Cb, [], ∅⟩ is equivalent to the collection Cb = ⟨r, v, GQT⟩, where r ∈ I is the provenance of a graph and v ∈ L is a value that can be interpreted as a trust metric (see Section 4.2). Generating the view is described by OpGen(qp1, qp2), where qp1 ∈ QP and qp2 ∈ QP.

Example 5.8. In this example we present a collection view proxy, which generates a modified RDF document in which the foaf.name predicate is replaced by vcard.FN. In this case there are six query string parameters allowed, which are presented in Table 5.4 and in Table 5.3. Serialization depends on the MIME type in the Accept header.

http://example.org/users/.gen/-x,foaf.name,-n,-t/-x,vcard.FN,-n,-t?sn=1&dn=1
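The rewriting performed by OpGen(qp1, qp2) in this example can be sketched as pattern matching followed by template instantiation. The sketch assumes that quads not matching the source pattern are passed through unchanged and it does not model the sn and dn parameters; the function name and the tuple encoding are illustrative.

```python
from typing import Dict, Set, Tuple

Quad = Tuple[object, ...]             # (subject, predicate, object, trust)
Pattern = Tuple[str, str, str, str]   # terms starting with "-" are variables


def op_gen(quads: Set[Quad], source: Pattern, target: Pattern) -> Set[Quad]:
    """Build a view: rewrite quads matching `source` into the shape of `target`."""
    view: Set[Quad] = set()
    for quad in quads:
        binding: Dict[str, object] = {}
        for term, value in zip(source, quad):
            if term.startswith("-"):
                # A repeated variable must bind to the same value.
                if binding.setdefault(term, value) != value:
                    break
            elif term != value:
                break
        else:
            # Every position matched: instantiate the target pattern.
            view.add(tuple(binding.get(term, term) for term in target))
            continue
        view.add(quad)  # assumption: non-matching quads are copied unchanged
    return view


ME = "http://example.com/voc#me"
base = {
    (ME, "foaf.name", "John Smith", 0.5),
    (ME, "foaf.mbox", "mailto:john@example.org", 0.9),
}
view = op_gen(base, ("-x", "foaf.name", "-n", "-t"), ("-x", "vcard.FN", "-n", "-t"))
```

The base collection is never modified, which matches the idea of a view generated on request by a proxy.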

Character | Description
 | simple literal without language tag
@ | simple literal with language tag
+ | typed literal

Table 5.2: Special characters in RESTful queries

We also propose the syntax of a RESTful query based on the IRI. A variable is prefixed by a minus sign, which is not part of the variable name. Quad patterns begin after the graph name. A variable that appears in the query's header is prefixed by a double minus. Multiple quad patterns are separated by a slash, and the elements of a quad pattern are separated by commas. Note that the fourth element of a quad pattern can be either a single variable (which serves as a new value) or a pair of numbers (where the first value denotes a lower bound and the second an upper bound). All prefixes of abbreviated IRIs use a dot and should be associated with a well-known IRI as defined in [85]. If a prefix is not associated with a well-known IRI, or the term does not include a dot, then the term is a literal. A typed literal is suffixed by a plus sign and the IRI of the data type. A blank node is prefixed by an underscore. The extended predicate uses a vertical bar (disjunction) and an asterisk (sequence).

All possible special characters in queries are presented in Table 5.2. The RESTful query grammar is defined in Appendix A.
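The lexical rules described above can be illustrated with a small tokenizer. It is only a sketch and is not the grammar from Appendix A: it covers variables, head variables, blank nodes, abbreviated IRIs and plain literals, treats every dotted term as an abbreviated IRI without checking the prefix against a well-known list, and does not handle typed literals, language tags, trust bounds or extended predicates. The category labels are hypothetical.

```python
from typing import List, Tuple

Token = Tuple[str, str]   # (category, text without its marker)


def classify(term: str) -> Token:
    """Assign a category to one element of a quad pattern."""
    if term.startswith("--"):
        return ("head-variable", term[2:])
    if term.startswith("-"):
        return ("variable", term[1:])
    if term.startswith("_"):
        return ("blank-node", term[1:])
    if "." in term:
        return ("abbreviated-iri", term)
    return ("literal", term)


def parse_query(query: str) -> List[List[Token]]:
    """Split on slashes into quad patterns, then on commas into elements."""
    return [[classify(term) for term in pattern.split(",")]
            for pattern in query.strip("/").split("/") if pattern]


parsed = parse_query("-p,rdf.type,foaf.Person,-t1/-p,foaf.name,--n,-t2")
```

The double-minus check has to come before the single-minus check, otherwise head variables would be reported as ordinary variables named "-n".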

The greatest benefit of this syntax is that it can use variable expansion as defined in [57]. In Table 5.4 and Table 5.3 we present the query string parameters that can be used with the GET method. The most complicated is the filter (f), which may include relational expressions (<, >, <=, >=), equality expressions (=, !=) and/or functions defined in [67].

The results of a query can be divided into discrete pages of data using pagination (p=a,b, where a is a page number and b is the number of statements on one page). All possible response codes are presented in Table 5.5.
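Pagination with p=a,b can be sketched as a slice over the ordered result list. The sketch assumes that page numbers start at 1, which the text does not state, and the helper names are hypothetical.

```python
from typing import List, Sequence, TypeVar
from urllib.parse import urlencode

T = TypeVar("T")


def paginate(statements: Sequence[T], page: int, size: int) -> List[T]:
    """Return page `page` (1-based, an assumption) with `size` statements per page."""
    start = (page - 1) * size
    return list(statements[start:start + size])


def page_url(query_iri: str, page: int, size: int) -> str:
    """Append the pagination parameter p=a,b, keeping the comma unescaped."""
    return query_iri + "?" + urlencode({"p": f"{page},{size}"}, safe=",")


second_page = paginate(["a", "b", "c", "d", "e"], 2, 2)
url = page_url("http://example.org/users/-p,foaf.name,--n,-t", 2, 2)
```

Requesting a page beyond the end of the results simply yields an empty list in this sketch.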

Parameter | Description
c[] | table of external collections
sn | number of source IRI fragment
dn | number of destination IRI fragment

Table 5.3: Query string parameters in collection views

Parameter | Name | Description
f | filter | restricts results to those for which the expression evaluates to true
l | limit | an upper bound on the number of results returned
s | skip | the number of results to skip before results are returned
p | pagination | divides content into discrete pages
c | count | aggregate function that returns the number of times a given expression has a bound value

Table 5.4: Query string parameters in projecting

5.1.4. Mapping from SPARQL

In this subsection the mapping from SPARQL to our approach is presented. The mapping matters for migration from existing endpoints and other SPARQL services [115].

Algorithm 5.1 maps SPARQL clauses to the operations presented in Subsection 5.1.1 and Subsection 5.1.2. Moreover, the algorithm can be extended with the equivalences shown in Table 5.6.

Status code | HTTP method for graph store request | Response
200 | GET, POST, PUT, DELETE, HEAD, OPTIONS, PATCH | data
204 | POST, PUT, DELETE, PATCH | empty
304 | GET | empty
400 | GET, POST, PUT, DELETE, HEAD, PATCH | error message
404 | GET, PUT, DELETE, HEAD, PATCH | empty
409 | POST, PUT, DELETE, PATCH | error message

Table 5.5: Relationship between HTTP status codes and methods
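Table 5.5 can be encoded directly as a lookup structure, for instance to check in a test suite that a response status is one the graph store may return for a given method. The dictionary and function names are illustrative; the data are copied from the table.

```python
from typing import Dict, FrozenSet, Optional, Tuple

# status code -> (methods that may receive it, kind of response body)
STATUS_TABLE: Dict[int, Tuple[FrozenSet[str], str]] = {
    200: (frozenset({"GET", "POST", "PUT", "DELETE", "HEAD", "OPTIONS", "PATCH"}), "data"),
    204: (frozenset({"POST", "PUT", "DELETE", "PATCH"}), "empty"),
    304: (frozenset({"GET"}), "empty"),
    400: (frozenset({"GET", "POST", "PUT", "DELETE", "HEAD", "PATCH"}), "error message"),
    404: (frozenset({"GET", "PUT", "DELETE", "HEAD", "PATCH"}), "empty"),
    409: (frozenset({"POST", "PUT", "DELETE", "PATCH"}), "error message"),
}


def expected_body(status: int, method: str) -> Optional[str]:
    """Kind of body listed for this status and method, or None if not listed."""
    entry = STATUS_TABLE.get(status)
    if entry is None or method not in entry[0]:
        return None
    return entry[1]
```

For example, `expected_body(204, "PATCH")` yields "empty", while `expected_body(304, "POST")` yields None because the table lists 304 only for GET.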

Update operation | Equivalent SPARQL clauses
OpUpdate(∅, qp2) | DELETE
OpUpdate(qp1, ∅) | INSERT
OpUpdate(∅, qp2) | CLEAR
OpUpdate(qp1, ∅) | ADD
OpDrop(u); OpUpdate(qp1, ∅) | COPY
OpDrop(u); OpUpdate(qp1, ∅); OpDrop(u) | MOVE

Table 5.6: Relationship between update operation and SPARQL clauses
