
AGH University of Science and Technology
Faculty of Electrical Engineering, Automatics, Computer Science and Electronics

Ph.D. Thesis
Michał Grega

Performance Analysis of a Query by Example Image Search Method in Peer to Peer Overlays

Supervisor: Prof. dr hab. inż. Zdzisław Papir

AGH University of Science and Technology
Faculty of Electrical Engineering, Automatics, Computer Science and Electronics
Department of Telecommunications
Al. Mickiewicza 30, 30-059 Kraków, Poland
tel. +48 12 6173937, fax +48 12 6342372
www.agh.edu.pl, www.kt.agh.edu.pl

Copyright © Michał Grega, 2011
All rights reserved
Printed in Poland

Acknowledgements

It is a pleasure to thank the many people who made this thesis possible. First of all I would like to show my gratitude to my supervisor, Prof. Zdzisław Papir. I am grateful for his guidance, his valuable remarks and his continuous support. Without the kind and warm atmosphere of knowledge exchange and support he created in our research team it would be much harder, if not impossible, for me to finish this work. I am grateful to Dr.-Ing. Nicolas Liebau and his team from the Technische Universität Darmstadt, who advised and supported me during and after the CONTENT project. I would also like to wholeheartedly thank my colleagues, Mikołaj Leszczuk and Lucjan Janowski, who devoted their knowledge and time to discussing my work. Special credit goes also to my friends from our room 309: Kasia Kosek-Szott, Szymon Szott and Piotrek Romaniak. We went through the PhD course together, supporting and helping each other not only in our professional careers but also on the private side of our lives. My warm thoughts go to my parents and their enormous effort, which brought me to this point. Last, but not least, I would like to thank my wife, Kate, for her patience and loving words of support, which allowed me to advance in times of doubt. Thank you!


Abstract

Nowadays the traditional division between content producers and consumers is becoming blurred. A new type of user is emerging: the prosumer, who is at the same time a producer and a consumer of multimedia content. This is the direct result of easy access to content creation tools, such as digital cameras and camcorders, the popularity of content manipulation tools, such as open source image editing tools, and open access to multimedia hosting services, such as Google Picasa, YouTube, Flickr and many others.

This mass production and availability of content creates a new challenge. It is not enough to create content and make it available on the Internet. It is necessary to allow the content to be easily searched for and accessed by other users. We are used to searching for items in the network with sets of keywords. Users, however, have no incentive to tag each of their movies and photographs with keywords. This is where advanced multimedia search methods, such as Query by Example, can be successfully utilized.

Another emerging problem is the economic effort required to set up a new multimedia service. The volume of multimedia data and the requirements on storage space, computation power and bandwidth allow only the largest market vendors to easily introduce new multimedia services. An answer to this problem is the departure from the traditional client-server architecture towards a Peer-to-Peer network, in which network upkeep costs are shared among the network users.

This dissertation presents work in the cross-domain area of Peer-to-Peer networking and advanced multimedia search methods. The author identifies and solves the problems encountered during the application of the Query by Example search technique in both structured and unstructured Peer-to-Peer overlays. The author proves that the implementation of a Query by Example service in Peer-to-Peer overlays is possible while maintaining the quality offered by a centralized solution and, at the same time, the benefits of Peer-to-Peer overlays.


Contents

Acknowledgements  iii
Abstract  v

1 Introduction  3
  1.1 Motivation and Goal of the Research  3
  1.2 The Concept of the P2P QbE System  6
  1.3 Major Research Problems  7
  1.4 Research Approach  10
  1.5 Thesis  12
  1.6 Cooperation and Publications  12
  1.7 Structure of the Dissertation  14

2 State of the Art  15
  2.1 P2P Architectures  15
    2.1.1 Client-Server Architecture  16
    2.1.2 Centralised P2P  16
    2.1.3 Unstructured P2P  17
    2.1.4 Hybrid P2P  17
    2.1.5 Structured P2P  18
    2.1.6 Summary of P2P Architectures  19
  2.2 The Metadata Management  20
    2.2.1 The Metadata Systems – Dublin Core  20
    2.2.2 The Metadata Systems – MPEG-7  21
  2.3 Search Methods and Architectures of Metadata  24
    2.3.1 Classification Based on Input Method  24
    2.3.2 Classification of Search Methods Based on Complexity  26
    2.3.3 Advanced Search Methods and Users  26
    2.3.4 Summary on Classification of Search Methods  27
    2.3.5 The Architectures of Metadata  27
  2.4 Search Benchmarking  28
    2.4.1 The Benchmarked Parameters  30
    2.4.2 Search Accuracy Benchmarking  31
  2.5 Simulation Tools  36
  2.6 Related Work  37
  2.7 Conclusions  38

3 Problem Approach  39
  3.1 Image Database  39
  3.2 Measurement of QbE Accuracy in Local Database  40
    3.2.1 Measurement Methodology  40
    3.2.2 QbE Methods  42
    3.2.3 Description of the Experiment  44
    3.2.4 Experiment Execution and Results  45
    3.2.5 Experiment Conclusions  46
  3.3 Application of QbE in an Unstructured P2P Overlay  47
    3.3.1 Implementation of the QbE Service  47
    3.3.2 File Distribution and Popularity  47
    3.3.3 Simulation Setup  49
    3.3.4 Analysis of the Results  50
    3.3.5 Conclusions  53
  3.4 Application of QbE in a Structured P2P Overlay  54
    3.4.1 Routing in CAN Overlay  54
    3.4.2 Implementation of the QbE Service in the CAN Overlay  56
    3.4.3 Simulation Setup  57
    3.4.4 Results  58
    3.4.5 Conclusions  59
  3.5 Comparison of QbE in Structured and Unstructured P2P Overlays  59
  3.6 Research Conclusions  60

4 Summary  63

List of Abbreviations

ANSI   American National Standards Institute
CAN    Content Addressable Network
DCEMS  Dublin Core Metadata Element Set
DCMI   Dublin Core Metadata Initiative
DHT    Distributed Hash Table
DS     Description Schemes
ITU    International Telecommunications Union
MP     Mega Pixels
MPEG   Moving Picture Experts Group
NISO   National Information Standards Organisation
OCR    Optical Character Recognition
P2P    Peer-to-Peer
PSTN   Public Switched Telephone Networks
QbE    Query by Example
QoE    Quality of Experience
SATIN  European Doctoral School on Advanced Topics In Networking
TTL    Time To Live
VoIP   Voice over IP
XM     MPEG-7 Experimentation Model


Chapter 1
Introduction

This introductory Chapter presents the goal of the research, the thesis of the dissertation and the methodological approach used to solve the problem stated in the thesis. A discussion of the author's relevant publications and a presentation of the overall dissertation structure is provided as well.

1.1 Motivation and Goal of the Research

The goal of the presented research is to analyse the possibility of developing a content-based Query by Example (QbE) search system for a Peer-to-Peer (P2P) overlay. System performance will be assessed in terms of search accuracy and bandwidth utilization.

A Peer-to-Peer system is a self-organising system consisting of end-systems (called "peers") which form an overlay network over the existing network architecture. The structure of the P2P network can significantly differ from the structure of the underlying physical network (Figure 1.1). Peers offer and consume services and resources. They are also characterised by a significant amount of autonomy. Services are exchanged between any participating peers. Such networks are gaining more and more popularity and attention both from users and researchers. On the one hand, this growing interest can be explained by numerous P2P-based applications, ranging from simple file sharing to more sophisticated services such as Voice over IP (VoIP) and online gaming. On the other hand, P2P networking is a challenging topic for researchers because of its distributed architecture, the networked cooperation of the peers and the lack of central authority (in most P2P network architectures).

This research has been focused on one of the most popular applications of P2P, which is file sharing. File sharing in P2P networks has become significantly popular since 2000. A study performed in 2004 by the CacheLogic company gave a

Figure 1.1: The concept of the overlay P2P network over an existing physical network

conclusion that "Traffic analysis conducted as a part of an European Tier 1 Service Provider field trial has shown that P2P traffic volumes are at least double that of http during the peak evening periods and as much as tenfold at other times" [59]. Figure 1.2 shows that in 2006 P2P services were responsible for 70% of the overall Internet traffic in Europe. A more recent study [61], performed by the Canadian company Sandvine, shows that P2P traffic is responsible for 35% of the downstream broadband traffic and over 75% of the upstream traffic in North America. A study performed by the Cisco company in 2010 [2] shows that the share of network traffic caused by P2P applications is being surpassed by traffic caused by video streaming (YouTube-like services), but it still is, and will remain, the second largest source of traffic in the networks. While the numbers vary between years and studies due to different test regions and methodologies, the bottom line remains the same: P2P is responsible for a large share of the current global network traffic.

Rapid global growth in multimedia data has created new, previously unknown business opportunities. Data search itself has grown into a business sector which has allowed companies such as Google and Yahoo to become brands recognised worldwide. Multimedia is now rapidly moving into everyday life. Users are not only able to access media via radio, television and the Internet but now also create their own media content. Digital cameras, handheld camcorders, media-enabled mobile

Figure 1.2: The percentage of Internet traffic generated by P2P applications [59]

phones and the widespread availability of content creation and editing software, previously available only to professionals, all contribute to this data volume growth. The traditional division of users has to be extended by a new category: the "prosumers", who are at the same time the "producers" and "consumers" of multimedia. Data is also being shared within user communities via Web 2.0 services such as YouTube, Google Video and Flickr, and is also used in the media industries. User-generated content is of growing importance to news agencies, as it is often the fastest way to come up with breaking news. Decentralised architectures are becoming more significant in sharing user-created content.

This growth of the amount of multimedia data causes a new problem of effective search to emerge. Published media are rarely accompanied by a comprehensive textual annotation. Most users have no incentive to add even a few keywords of metadata to their multimedia objects. The usual keyword-based search becomes ineffective, or even impossible, in many unannotated repositories. This is where Query by Example (QbE) search may become useful, as it requires no textual description of the multimedia object to work.

Another research area of growing importance is Quality of Experience (QoE), which is a measure of the subjective quality experienced by the user. In the early days of the Internet the satisfaction of users was of negligible importance, although the complex problem of delivering a high-quality service to the end-user was well known from the experience with Public Switched Telephone Networks (PSTN). Nowadays, when the user can choose between different service providers, the service with the highest QoE level is the most successful one. In the presented research the proposed QbE solution will be assessed by QoE measurements.

This research has been devoted to one type of multimedia content: still images. Nevertheless, the research approach and the conclusions drawn from the presented results form a firm foundation for research on advanced search methods for other types of multimedia content such as audio, video and 3D objects.

1.2 The Concept of the P2P QbE System

The basic concept of the P2P QbE system is to support content-based QbE retrieval of images stored in a distributed manner in a P2P file sharing network. A use case of such a system is presented in Figure 1.3. The mechanics of a generic QbE system are as follows: the user provides an example of a searched item to the system in order to retrieve similar objects from the repository. The system calculates, on the client side, the features which describe the example. The features extracted from the example are called "low-level metadata". The concept and definition of metadata is provided in Chapter 2.2. Low-level metadata is a numeric vector which represents signal properties of the image. It can be, for example, the dominant colour of the image or an edge histogram. These low-level features can be easily sent over the network, as their size in bytes is usually much smaller than the size of the original media.
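The extraction and comparison of such low-level metadata can be sketched in a few lines. The three-number "mean colour" descriptor below is a deliberately simplified stand-in for a real MPEG-7 descriptor (such as the dominant colour descriptor), and the pixel data and file names are illustrative only:

```python
import math

def colour_descriptor(pixels):
    """Toy low-level descriptor: the mean R, G, B of an image,
    a stand-in for a real MPEG-7 descriptor (a longer vector)."""
    n = len(pixels)
    return [sum(p[c] for p in pixels) / n for c in range(3)]

def distance(desc_a, desc_b):
    """Euclidean distance between two descriptor vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(desc_a, desc_b)))

# The querying side extracts the descriptor of the example image...
example = colour_descriptor([(200, 30, 30), (220, 40, 20)])   # reddish image

# ...and the queried side compares it against its pre-computed metadata.
local_metadata = {
    "sunset.jpg": colour_descriptor([(210, 50, 25), (230, 60, 30)]),
    "forest.jpg": colour_descriptor([(20, 180, 40), (30, 200, 50)]),
}
ranking = sorted(local_metadata, key=lambda f: distance(example, local_metadata[f]))
print(ranking)  # ['sunset.jpg', 'forest.jpg']: the reddish image ranks first
```

The descriptor vector, not the image, is what travels over the network, which is why its byte size matters for bandwidth.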
The features of different media objects can be numerically compared in order to calculate the similarity between the media items. Figure 1.4 depicts an example of a QbE query. In this case a "dominant colour" descriptor from the MPEG-7 descriptor set (Chapter 2.2.2) was used to retrieve similar images from a local database. As can be observed, all returned images have a dominant colour tone similar to the provided example, but the content is different. In the distributed scenario (as depicted in Figure 1.3) the low-level metadata is routed through the P2P overlay to queried nodes. This is the significant difference between the solution proposed in the dissertation and searching within a local database of images. The queried side is a set of P2P nodes selected by the routing algorithm of the P2P overlay.

Figure 1.3: The use case of the QbE search system

The queried side stores the pre-calculated low-level metadata of the possessed media. Upon receiving the low-level metadata of the example, the queried side calculates the distances of the metadata of the example from the stored metadata (under a predefined metric). The queried side returns the list of distances and thumbnails for the most similar objects to the querying side.

1.3 Major Research Problems

The presented research integrates two recognised techniques: the QbE method of search and the P2P content delivery method. This is the primary source of new research challenges.

The QbE search process, when implemented in a P2P overlay, will require more bandwidth than the traditional keyword query. The extracted features of media items are stored and transmitted in the form of a vector of numbers. While the

Figure 1.4: An example of the QbE search principle

volume of this vector is small when compared to the size of the media file, it is still significantly larger than a few-byte keyword. This may cause the network to become congested with queries, especially in the case of unstructured overlays. Another source of network congestion are the thumbnails of media files. In the case of images, the thumbnails are very low resolution versions of the images. They allow the user to decide whether the resulting image fulfills their search criteria.

It is critical to choose the proper overlay architecture for the search service. Different architectures have different properties and not every architecture may be able to provide the QbE service.

The main difference between a QbE retrieval system for a local media database and a QbE search system in a P2P network is that in the case of the local database search all media in the database are available during the search process. In the case of the P2P search the available set of media is only a subset of all media existing in the network. Another effect is caused by the dynamic nature of the network: the participants join and leave the P2P network during its operation. This property of a P2P network is called churn [67]. As churn adds another level of complexity to the presented research problems, and it is not the most significant factor influencing the performance of the search service, it was decided to analyse the performance of the QbE application in a P2P overlay in the absence of churn.

Content-based search is resource-consuming as it requires the analysis of the image itself. The extraction of the descriptor values can be done only once for each

image and stored for later use, but this can be a problem for large databases. The comparison of the descriptors requires large computing power, as the descriptors are typically one-dimensional vectors of as many as 200 values. A single comparison can be done almost instantly, but the problem does not scale in the case of large local repositories. Here, the great advantage of the distributed computing power of P2P overlays can be utilised. A single node is unlikely to be in possession of more than several thousand images. Search in such a small repository can be performed very quickly, and all the nodes can do it in parallel.

In conclusion, the following theoretical and practical problems arise and will be solved in order to introduce the QbE search technique into the P2P architecture:

1. It should be determined which metadata frameworks and QbE techniques are available and which are suitable for use in P2P overlays.

2. It should be recognised which P2P overlays are the best candidates for the introduction of such a service.

3. It should be decided how to compare the performance of the QbE search method in both centralised and P2P environments. Also, a sufficient number of practical experiments has to be conducted in order to analyse the performance of the QbE search in centralised and P2P environments. A comparison of the performance in the different environments has to be conducted.

Search time also has to be taken into account. It is very difficult to assess the search time in a simulation environment, and even if such an assessment is made it will lack precision. This is due to the nature of the simulation environment, which simplifies the protocol stack and the behaviour of peers. The protocol stack implemented within the peers in the simulator is simplified when compared to a real implementation in order to allow for resource-efficient simulations.
The peers are implemented in a uniform way in the simulator, which means that each peer has similar resources at its disposal. This is very different from the real networking environment, in which peers differ substantially from one another in terms of available resources and the way these resources are utilised. Also the network links, unlike in the simulator, differ from one another in terms of available bandwidth and latency. In the author's belief the only proper method of accurate search time assessment in this case is implementation and measurement in a real network environment. Such a task is out of the scope of the presented research and is taken into account as one of the possible future research directions.
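The parallel, per-node search described above (each peer comparing the example only against the descriptors of the files it holds itself, with the querying node merging the returned distances) can be sketched as follows. The peer names, the one-dimensional "descriptors" and the simple sort-based merge are illustrative assumptions, not the actual simulated protocol:

```python
import heapq

def local_search(repository, example, k=3):
    """A peer compares the example descriptor only against the
    pre-computed descriptors of its own files and returns its
    k best (distance, filename) pairs."""
    distances = [(abs(value - example), name) for name, value in repository.items()]
    return heapq.nsmallest(k, distances)

# Hypothetical peers; descriptors are single numbers for brevity,
# whereas real descriptors are vectors compared under a predefined metric.
peers = {
    "peer-A": {"a.jpg": 0.9, "b.jpg": 0.2},
    "peer-B": {"c.jpg": 0.25, "d.jpg": 0.7},
}
example = 0.3

# All peers search in parallel; the querying node merges the partial results.
partial = [hit for repo in peers.values() for hit in local_search(repo, example)]
ranking = [name for _, name in sorted(partial)]
print(ranking)  # ['c.jpg', 'b.jpg', 'd.jpg', 'a.jpg']
```

Because each repository is small, every per-peer ranking is cheap, and the querying node only has to sort the short lists of returned distances.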

1.4 Research Approach

The proposed methodological approach is depicted in Figure 1.5. In order to reach the research goal of designing the QbE search system for the P2P overlay, seven research tasks need to be accomplished.

Figure 1.5: The research approach

1. The first research task is a detailed analysis of the existing state of the art of both P2P network architectures and QbE search. In the topic of P2P networks the existing architectures (called overlays) will be analysed in order to specify the requirements for the QbE search system. The analysis of multimedia search methods will allow the author to become acquainted with the recent achievements in this research area. Special attention will be paid to the analysis of the methods of Query by Example of multimedia in local databases. It is foreseen to adapt the methods used in such databases to the needs of the designed system. In order to create a system based on the extraction of metadata it is required to analyse the standards of metadata extraction, comparison and storage. Special attention will also be paid to the analysis of existing, similar systems. The state of the art in both P2P systems and QbE systems is presented in Chapters 2.1, 2.2 and 2.3.

2. The second research task is to develop a method of measurement of the performance of a distributed QbE search service. Both objective and subjective (QoE) metrics will be taken into account. The main objective measures of search performance are accuracy and search time. These aspects have been chosen during the preliminary research as the most relevant.
The existing methods of analysis of the quality of search are dedicated to multimedia databases. According to the current state of the art there is no popular,

versatile and widely used measurement environment for search services in P2P networks. The methods of subjective quality assessment for video and audio multimedia services are well defined in the International Telecommunications Union (ITU) recommendations. For measurement of the QoE of other multimedia services, including search, there are neither standards nor recommendations. A set of proposed measurement methods is described in Chapter 2.4.

3. There are many techniques used for image QbE search. Some of these techniques are defined within the MPEG-7 standard [37] and some, such as the PictureFinder software, are proprietary solutions developed by companies and research institutions. These methods yield different results and offer different quality of search results. For the purpose of the research of the QbE technique in the P2P environment it is required to identify the method which offers the highest QoE to the service users and is applicable within the scope of the presented research. To identify such a method, a set of QbE methods will be implemented in a local database of images. The performance of these methods will be assessed and the best QbE method will be selected for implementation in the P2P overlay. The measurements of performance in the centralised database will also serve as a reference for the comparison of search performance between centralised and distributed QbE systems. This problem is addressed in Chapter 3.2.

4. It is planned to implement the QbE method selected in Task 3 in two architectures of P2P overlays: structured and unstructured overlays. This will allow for a comparison of the performance of the service in these two, significantly different, P2P architectures. The unstructured overlays are of simple construction. No complex query routing algorithms are implemented and the search process is often based on flooding the network with search queries.
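Flooding with a Time-To-Live (TTL) counter, the usual search mechanism in such unstructured overlays, can be illustrated with a minimal sketch; the toy adjacency-list overlay below is an assumption for illustration only:

```python
def flood(overlay, start, ttl):
    """Breadth-first propagation of a query with a Time-To-Live counter:
    every peer forwards the query to its neighbours until the TTL expires.
    Returns the set of peers the query reached."""
    reached = {start}
    frontier = [start]
    while ttl > 0 and frontier:
        frontier = [n for peer in frontier
                    for n in overlay[peer] if n not in reached]
        reached.update(frontier)
        ttl -= 1
    return reached

# Toy unstructured overlay as an adjacency list (peer id -> neighbours).
overlay = {1: [2, 3], 2: [1, 4], 3: [1], 4: [2, 5], 5: [4]}
print(sorted(flood(overlay, start=1, ttl=1)))  # [1, 2, 3]: direct neighbours only
print(sorted(flood(overlay, start=1, ttl=2)))  # [1, 2, 3, 4]
```

A small TTL keeps the query traffic bounded but also limits how much of the network, and hence how much of the shared content, a single query can reach.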
Such networks are characterised by low maintenance overhead at the cost of lower search performance. The experiment and its results are covered in Chapter 3.3.

5. Structured overlays are characterised by a more complex construction and implemented routing routines. Thanks to that, higher performance is delivered. Most of the structured P2P overlays are based on Distributed Hash Tables (DHT). An example of a structured P2P overlay is CAN (Content Addressable Network). For the research purposes the implementation of the QbE service will be done in a simulation environment. Simulation of P2P overlays is the only feasible way to research these network architectures. As scalability is of utmost importance, it is impossible to create a P2P overlay under laboratory conditions. Even such structures as PlanetLab offer at most

approximately a thousand nodes, whereas real P2P overlays are often composed of millions of nodes. The simulation environment will be chosen after an analysis of the existing P2P simulation tools. The experiment and results are described in Chapter 3.4.

6. During the sixth stage of the research the performance of the QbE service implementations in the structured P2P overlay and the unstructured P2P overlay will be analysed. It is foreseen to analyse the performance of the QbE service in terms of accuracy when compared to the centralized solution. Bandwidth consumption will also be covered.

7. The last stage of the research is to compare the performance of the QbE search service in the local image database (Task 3) and in the P2P overlays (Tasks 4 and 5). The comparison of the performance of the local and distributed QbE services will allow the author to prove the thesis of the dissertation.

1.5 Thesis

The thesis of the dissertation is as follows: It is possible to use a Query by Example mechanism for a search of images based on their low-level description in unstructured and structured P2P overlays with accuracy comparable to centralized solutions.

1.6 Cooperation and Publications

The research presented in this dissertation was partially funded by the CONTENT (Content Networks and Services for Home Users) Network of Excellence (No. 038423). Fruitful cooperation within the CONTENT NoE resulted in the author's participation in the European Doctoral School on Advanced Topics In Networking (SATIN). The results of the research were presented and discussed during the SATIN meetings, both with fellow PhD students and with senior researchers from European universities. The PhD thesis and research were supported by Dr.-Ing. Nicolas Liebau from the Technische Universität Darmstadt, Germany. Furthermore, the PhD research and preparation of the thesis were funded by a national PhD grant (no. N N517 175537).
The subjective tests of the MPEG-7 descriptors were performed for the needs of the eContentPlus project "GAMA".

During the research of the topics described in this dissertation the author prepared several publications, which are listed and commented on below in chronological order:

1. Implementation and Application of MPEG-7 Descriptors in Peer-to-Peer Networks for Search Quality Improvement - Introduction to Research [17] was presented during the CONTENT PhD workshop in Madrid, Spain. The paper presents the general idea of the system as well as the basic methodological approach to the problem.

2. Content-based Search for Peer-to-Peer Overlays [22] was presented during the PhD workshop of the MEDHOCNET conference in Corfu, Greece. It presented a detailed description of the methodology as well as a detailed approach to the problem of benchmarking and measurement of a distributed search service.

3. Quality of experience evaluation for multimedia services [21] was presented at a plenary session of the KKRRiT 2008 conference in Wrocław, Poland. The paper described the differences between objective and subjective evaluation methods of multimedia services.

4. Benchmarking of Media Search based on Peer-to-Peer Overlay Networks [25] was presented during the INFOSCALE 2008 conference workshop in Vico Equense, Italy. The publication presents the outcomes of the P2P benchmarking activity which was carried out in the CONTENT NoE.

5. Advanced Multimedia Search in P2P Overlays [18] was presented during the Students Workshop of the IEEE INFOCOM 2009 conference in Rio de Janeiro, Brazil. The paper presents the basic concept of the P2P QbE system along with a brief summary of the results of the user test which was performed in order to select the optimal QbE method for further research.

6. Ground-Truth-Less Comparison of Selected Content-Based Image Retrieval Measures [20] was presented during the UCMEDIA 2009 conference in Venice, Italy.
The paper presents the comparison of different QbE methods assessed in a series of subjective experiments. The results obtained in these experiments are presented in Chapter 3.2.

7. Wyszukiwanie przez podanie przykładu w sieci nakładkowej protokołu Gnutella (Query by Example in the Gnutella protocol overlay network) [19] was presented during the KKRRiT 2011 conference in Poznań, Poland. The paper was awarded the second-best young author paper award. It presents the results of the implementation of the QbE service in the Gnutella P2P overlay. The results are presented in Chapter 3.3.

1.7 Structure of the Dissertation

Chapter 2 presents the theoretical background of the dissertation. In this Chapter the existing P2P overlays are described and an overview of existing metadata management standards is provided. Afterwards, existing P2P benchmarking methods are briefly described. The Chapter is concluded with an overview of P2P simulation tools and a description of similar research projects.

Chapter 3 reports the results of the practical experiments. First, the performance of different QbE methods is assessed in subjective experiments. Afterwards, the implementation and simulation results of the QbE service in an unstructured and a structured P2P overlay are provided. A comparison of the two implementations is described.

Chapter 4 wraps up the results and the most significant conclusions. Suggestions for further research directions are provided as well.

Chapter 2

State of the Art

The investigated search service is created by applying content-based media retrieval techniques to a P2P file sharing overlay. Content-based media retrieval is a popular research topic; such services are typically created for centralised media databases. Much effort is also devoted by researchers to the area of P2P overlays. The combination of both techniques will, on the one hand, open new possibilities for content providers and consumers but, on the other hand, it remains an up-to-date research challenge. This Chapter describes the basic concepts and the current state of the art in both research areas – P2P overlays and content-based image retrieval. An in-depth understanding of these research areas is required in order to identify the new research challenges which result from the combination of the underlying technologies. The description of the architectures of P2P networks is provided in Section 2.1. The existing systems of metadata storage are discussed in Section 2.2. In Section 2.3 the existing content-based retrieval systems are presented. Section 2.4 describes the benchmarking framework used for the measurements of the system. It is followed by Section 2.5, which presents the existing software for the simulation of P2P overlays. Section 2.6 is devoted to similar research and related work in the area.

2.1 P2P Architectures

The problem of how to store data and how to find it has existed since the first databases were created. This Section describes the problems and solutions associated with data storage and retrieval. The division of P2P systems presented here is based on [65]. For the sake of simplicity this Section focuses on the file sharing service, but all the assumptions and techniques described

can be applied to any kind of service, such as instant messaging, VoIP or social networking.

2.1.1 Client-Server Architecture

The first approach to the problem is the client-server architecture. A file sharing service following the client-server paradigm consists of two kinds of nodes. A powerful server, whose job is to store the files, keep an up-to-date record of them, reply to queries and, finally, serve the clients with the file download functionality. The clients in this design are computers with very limited resources compared to the server. The sole role of the clients is to be an interface between the server and the user. The client-server architecture has numerous drawbacks. It is expensive to set up, as it requires a powerful computer. It is also expensive to operate, as it requires a lot of bandwidth to deal with both queries and file downloads. The server creates a single point of failure for the file sharing service, and the system does not scale without substantial hardware and bandwidth expenses. On the other hand, there are advantages of such a setup. The service, being operated from a single server, is easy to control. Due to the full control over the service it is also easy to create an economic model for it.

2.1.2 Centralised P2P

The costs and the scalability problems of the client-server architecture caused the first generation of the P2P architecture to emerge. Historically, the first solutions were centralised P2P networks such as Napster [55]. The main goal of the creators of centralised P2P was to remove the load caused by file downloads from the central server. In the centralised P2P architecture the service consists of a central server and nodes. The role of the central server is to maintain an index of the files available throughout the network. The files are kept in the nodes, and the nodes are responsible for providing the resources required for the download.
If a node wants to share a file in the network, it has to advertise the file and its metadata to the central server. If a node wishes to find and download a file, it sends a query to the central server. The server responds with the address (or addresses) of the node (or nodes) which have the requested asset. The querying node then downloads the file from the node that has it, bypassing the central server. This architecture has benefits over the client-server architecture. It distributes the most resource-consuming part of the service: the file download. The concept of the service is relatively simple, as no sophisticated routing mechanisms are required. This results in a simple implementation. Thanks to the

central indexing server the network can be controlled and an economic model for the service can be created. On the other hand, the indexing server in centralised P2P networks is still a single point of failure. Also the load balancing, while much better than in the client-server solution, is still far from perfect. Nodes holding files that are popular among users have to sustain a larger load than nodes with unpopular content.

2.1.3 Unstructured P2P

The next step in the evolution of P2P overlays is the unstructured P2P architecture. In this solution there is no need for any central authority. Both storage and search are distributed in the network. The network consists only of nodes, which are sometimes referred to as “servents” [10] (a word created from “server” and “client”), as in one of the implementations of the unstructured P2P architecture – the Gnutella protocol in version 0.4. The node does not advertise its files in any form. The sole responsibility of a node in the unstructured P2P architecture is to maintain connections with other nodes and to respond to queries for searched files. The search process is based on flooding the network with the query in the hope that the query propagated within the network will eventually reach the nodes which possess the searched file. This architecture was the first one to remove the single point of failure from the design. If a node disconnects from the network, only its content becomes inaccessible and the network as a whole does not cease to function. The drawback of unstructured P2P systems is the huge signalling overhead which is required to maintain the connectivity of the nodes. Also, the search process is very ineffective, as the query routing is based on the flooding principle.
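The flooding principle can be sketched as follows. This is a minimal, illustrative model, not the actual Gnutella message format: the Node structure, its attribute names and the search interface are assumptions made for the sketch, and the default TTL of 7 follows the value commonly used in Gnutella deployments.

```python
from collections import deque

class Node:
    """Toy overlay node: a name, a set of shared files and a neighbour list."""
    def __init__(self, name, shared_files=()):
        self.name = name
        self.shared_files = set(shared_files)
        self.neighbours = []

def flood_search(start, query, ttl=7):
    """Flood the query breadth-first, decrementing the TTL at each hop,
    and collect every node whose local index matches the query."""
    hits = []
    seen = {start}                      # duplicate suppression (message-id cache)
    frontier = deque([(start, ttl)])
    while frontier:
        node, ttl_left = frontier.popleft()
        if query in node.shared_files:  # each node checks only its own index
            hits.append(node)
        if ttl_left == 0:               # TTL exhausted: do not forward further
            continue
        for neighbour in node.neighbours:
            if neighbour not in seen:
                seen.add(neighbour)
                frontier.append((neighbour, ttl_left - 1))
    return hits
```

The TTL bounds the flood radius: a file held by a node further away than the TTL allows is simply not found, which illustrates why flooding gives no completeness guarantee.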
Due to the totally distributed nature of the unstructured P2P system the administrator of the network loses control over it, which, apart from creating legal problems with digital rights management (referred to also as “piracy”), renders the creation of an economic model for the service very challenging.

2.1.4 Hybrid P2P

Hybrid P2P networks emerged as a natural combination of the design paradigms of the centralised P2P systems and the unstructured P2P architectures. In hybrid P2P systems the nodes are formed into a two-tier topology. The nodes that have an abundance of resources (computing power and bandwidth) are elevated to the status of super peers (which can also be understood as a kind of local servers).

The super peers are responsible for maintaining the indexes of small chunks of the network (typically up to 100 nodes, as in Gnutella version 0.6 [47]). When a node connects to the network it has to establish a connection with the super peer responsible for the part of the network the node is in. After the connection has been established, the node advertises the metadata of the files it wants to share to the super peer. In this way the super peer holds a complete index of the files which are stored in “its” part of the network. If a node wants to find a file, it sends a query to its super peer. The super peer checks whether the file is available within the part of the network it has an index for. If so, it returns the address of the node which has the searched file in its possession. If no result can be found in the super peer’s index, the query is propagated to other super peers by flooding, just as in unstructured P2P systems. The benefit of the hybrid P2P architecture over an unstructured P2P is the reduced amount of signalling required to maintain the network. The benefit of the removal of a single point of failure is partly preserved. A disconnection of a peer does not influence the network, although the disconnection of a super peer requires either the selection of a new super peer or the other super peers taking over the functions of the disconnected one. The main drawback of the hybrid P2P architecture is the uneven load offered to the nodes forming the network. The super peers need to withstand much more traffic and have to commit more computing power than the regular nodes, without any substantial benefit.

2.1.5 Structured P2P

The structured P2P architecture is the most recent approach to the design of a P2P system. The novelty of the concept is the introduction of a logical link between the data (a file) and an address in the addressing space of the P2P overlay.
A single peer in the structured P2P overlay is responsible for a chunk of the addressing space. If a new file is to be placed in the network, it has to be mapped to the addressing space of the network. The mapping is done with a hashing algorithm such as SHA-1 [11]. After the file has been mapped to an address, the node responsible for the part of the addressing space the file address belongs to has to be informed. There are two general storage strategies [65]. One is direct storage: the file is uploaded to the node which is responsible for the address linked to the file. The second is indirect storage: the responsible node is informed only of the file metadata and location. Search in the structured P2P overlay is straightforward: if a node knows the hash of the searched file, it automatically knows the part of the addressing space it needs to query.
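The mapping of a file onto the addressing space can be sketched as follows. This is only an illustration: the function names are assumptions, and the ownership rule shown is a Chord-like successor rule over a one-dimensional identifier ring, whereas CAN instead maps keys into a d-dimensional coordinate space.

```python
import hashlib

def file_key(file_name, address_space_bits=160):
    """Map a file identifier onto the overlay addressing space using SHA-1."""
    digest = hashlib.sha1(file_name.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % (2 ** address_space_bits)

def responsible_node(key, node_ids):
    """Toy successor rule: the first node whose id is not smaller than
    the key owns it; node_ids is a sorted list of node identifiers."""
    for node_id in node_ids:
        if key <= node_id:
            return node_id
    return node_ids[0]  # wrap around the end of the identifier ring
```

Because the hash is deterministic, every peer computes the same key for the same file name, so any peer can locate the responsible node without flooding.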

The benefit of such an architecture is a low search overhead. The problem of load balancing is also solved by this design: if a given content is popular among the users, it is enough to increase the number of nodes responsible for that part of the addressing space to split the load between them. The main drawback of the structured P2P overlay is its complex construction, which makes the design and deployment of structured P2P networks a challenging task. Nevertheless, several structured P2P overlay designs have been proposed, such as Chord [66] or CAN [60].

2.1.6 Summary of P2P Architectures

P2P systems have been actively developed since at least 1999 as an answer to the challenge of distributing load from the server to the clients. Different architectural solutions offer a different trade-off between the communication overhead and the storage cost per node (Figure 2.1), with the client-server and unstructured P2P solutions on the opposite poles.

[Figure 2.1: Storage cost versus communication overhead (based on [65]) – a plot of communication overhead per node against storage cost per node, with client-server, centralised P2P, DHT, hybrid P2P and pure P2P arranged along the trade-off.]

In order to analyse the behaviour and performance of the QbE service in P2P overlays, two different P2P architectures were chosen for implementation. These are Gnutella v. 0.4, which represents the pure P2P paradigm, and the Content Addressable Network (CAN), which represents the DHT-based P2P architecture.
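The communication-overhead side of this trade-off can be made concrete with a back-of-envelope message count. The sketch below is illustrative only: it assumes every node forwards a query to all of its neighbours except the one it received it from, and it uses the O(log N) lookup cost of a Chord-style DHT (CAN's lookup cost is O(d·N^(1/d)) instead).

```python
import math

def flooding_messages(degree, ttl):
    """Upper bound on messages generated by flooding when the source sends
    to `degree` neighbours and each node forwards to (degree - 1) more."""
    return sum(degree * (degree - 1) ** hop for hop in range(ttl))

def dht_hops(n_nodes):
    """Typical lookup cost in a Chord-style structured overlay: O(log N)."""
    return math.ceil(math.log2(n_nodes))
```

For example, a flood with node degree 4 and TTL 7 can generate several thousand messages, while a structured overlay of 10,000 nodes resolves a lookup in about 14 routing hops.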

2.2 The Metadata Management

As the term metadata was introduced without any formal definition, there are several ways to define what metadata really is. The most common and informal understanding of metadata is “data about data” [68]. This definition provides the best description of the nature of metadata. To give a better understanding of the concept, two additional, complementary definitions can be introduced. A summary report presented by the “Committee on Cataloging: Description & Access” defines metadata as “structured, encoded data that describe characteristics of information-bearing entities to aid in the identification, discovery, assessment, and management of the described entities” [68]. Additionally, Bulterman gives the following definition: “[Metadata is a set of] optional structured descriptions that are publicly available to explicitly assist in locating objects” [8]. Several metadata systems were introduced as standards. The two most popular will be presented here in more detail – the Dublin Core and the MPEG-7 standard.

2.2.1 The Metadata Systems – Dublin Core

The Dublin Core1 metadata standard is a simple (referred to as the “metadata pidgin for digital tourists” [30]) and straightforward standard for information resource description. The preliminary works on the proposal of the standard started in 1995 and are continued by the Dublin Core Metadata Initiative (DCMI). The standard proposal was accepted by the National Information Standards Organisation (NISO), which is an association accredited by the American National Standards Institute (ANSI) [56]. The Dublin Core has also been proposed as a Request for Comments (RFC) [70]. The goals of the Dublin Core metadata standard are as follows [30]:
1. The set of metadata for a media file is supposed to be kept as small and simple as possible.
2. The semantics used by the standard are supposed to be well defined and understandable.
This makes the metadata human-readable.
3. The standard is supposed to be international, having numerous language versions.
4. The standard is supposed to be easily extensible. The standardising organisation is working on creating interfaces which will allow other metadata sets to be incorporated into the standard.
1 The name Dublin Core refers to Dublin, Ohio, USA, where the first initiative of a metadata standard emerged, not to Dublin, Ireland, which is a common misunderstanding.

There are two levels of the Dublin Core Metadata Element Set (DCMES): the Simple DCMES and the Qualified DCMES. The Simple DCMES consists of a set of 15 high-level elements, such as “Title”, “Subject” or “Creator”. The Qualified DCMES is an extension of the Simple DCMES and introduces three new elements – “Audience”, “Provenance” and “RightsHolder”. Additionally, the Qualified DCMES introduces a set of refinements, or qualifiers, which narrow the semantics of an element. There are four rules applied in the Dublin Core standard which define the relationship between metadata and its data [30]:
1. The standard defines a set of elements and element refinements, along with a formal registry. These should be used by content managers as a “best practice”.
2. The “one-to-one” principle. Dublin Core compliant metadata describes a single instance of a media item, and not a whole class of media. In other words, each media manifestation has its own Dublin Core metadata.
3. The dumb-down principle. A media manager (either a piece of software or a human) should be able to ignore a qualifier and use its value only. It may lead to a loss of accuracy, but such a case should be handled.
4. The metadata should be constructed in such a way that it can be parsed and used both by software and by human content managers.
The Dublin Core standard has many advantages – such as its overall user-friendliness, which allows the metadata to be accessed and understood by human content managers. It has, however, a disadvantage which eliminates it as a candidate for the described search system: it does not define methods for the extraction and comparison of low-level metadata. The standard focuses only on high-level metadata. It is worth mentioning that some of the existing P2P networks allow narrowing of a simple text-based query by the input of metadata. Such existing implementations (e.g.
the KAD implementation in the eMule client) allow, for example, the “author” or the “title” of the searched media to be defined. Moreover, it is also possible to narrow down the search by inputting some content-based metadata into the search query – such as the “bitrate”. These content-based metadata fields are, unfortunately, far too primitive to allow a significant improvement in the search quality.

2.2.2 The Metadata Systems – MPEG-7

The MPEG family of standards consists of five well-established groups of standards [53]. MPEG-1 [34] and MPEG-4 [36] are standards for multimedia compression,

storage, production and distribution. MPEG-2 [35] is a standard for audio and video transport for broadcast-quality TV. MPEG-21 [38] is defined as an open framework for multimedia content and is still under development. The MPEG-7 standard was proposed by the Moving Pictures Experts Group (MPEG) as an ISO/IEC standard in 2001 [37]. The formal name of the standard is “Multimedia Content Description Interface”. The main goal of the MPEG-7 standard can be described as “to standardise a core set of quantitative measures of audio-visual features called Descriptors (D), and structures of descriptors and their relationships, called Description Schemes (DS) in MPEG-7 parlance” [53]. The most important requirements set for MPEG-7 are as follows [50]:
– Applications – the standard can be applied in many environments and to numerous tasks, ranging from education and tourist information to biomedical applications and storage. One of the most important areas of application of the MPEG-7 standard are search and retrieval services [53].
– Media types – the MPEG-7 standard addresses many media types, including images, video, audio and three-dimensional (3D) items. This diversity makes the standard all-purpose and allows the development of complex systems which integrate many types of media content.
– Media independence – the MPEG-7 metadata can be separated from the media and stored in a different place, also in multiple copies. This feature of the standard makes it useful e.g. in P2P environments.
– Object-based approach – the MPEG-7 standard structure is object-oriented, which is a desired feature in the development of advanced metadata systems. In such systems the object-based approach allows the development of a hierarchy of descriptions in which one description inherits parts of the structure from another.
– Abstraction levels – the MPEG-7 standard covers all levels of abstraction of the description of the media.
The lowest, machine levels of the description of the media include signal-processing features, such as spectral features in the case of audio media or color structure descriptors in the case of images. These are utilised for QbE search. On the other hand, MPEG-7 supports the high, semantic level of description of the media. In the scope of this research the low-level descriptions of the media will be utilised.
– Extensibility – the structure of MPEG-7 is open to extensions of the basic set of descriptors. This feature of the standard is very valuable, as new types of media are emerging and cannot be covered at the moment of standardisation.
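The low-level descriptors used for QbE search yield feature vectors, and a query is answered by ranking candidate images by their distance to the example's vector. A minimal sketch is given below; it assumes the descriptor values have already been extracted, uses the L1 (city-block) distance, which is commonly applied to histogram-type descriptors, and all function names are illustrative rather than part of any MPEG-7 API.

```python
def l1_distance(desc_a, desc_b):
    """L1 (city-block) distance between two descriptor value vectors."""
    if len(desc_a) != len(desc_b):
        raise ValueError("descriptor values must have the same length")
    return sum(abs(a - b) for a, b in zip(desc_a, desc_b))

def rank_by_example(example, repository):
    """QbE: order repository items by ascending distance to the example.
    `repository` maps an item name to its extracted descriptor value."""
    return sorted(repository,
                  key=lambda name: l1_distance(example, repository[name]))
```

The choice of metric matters for retrieval quality; as discussed below, the MPEG-7 standard itself leaves the comparison metric largely unspecified.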

Several terms introduced by the MPEG-7 standard that will be used in the scope of this dissertation are presented below and in Figure 2.2 [54]:
– Data – all kinds of multimedia content which can be described with the use of the standardised metadata.
– Feature – a significant, extractable characteristic of the data.
– Descriptor – defines the syntax and semantics of the representation of a feature.
– Descriptor value – an instance of a descriptor for a particular feature and a particular piece of data.
– Description scheme – provides information on the structure and relations between its parts. The parts of a description scheme can be both descriptors and other description schemes.
– Description Definition Language, DDL – an interface which allows the existing descriptors to be extended and new ones to be created.

[Figure 2.2: The relations between the parts of the MPEG-7 standard [54] – descriptors and description schemes are defined in the standard, while the Description Definition Language allows further description schemes and descriptors to be defined outside the standard.]

To allow a better understanding of the above definitions, the following example can be given (based on [54]): the description scheme Shot Technical Aspects consists of two descriptors, Lens, which gives information on the focal length of the lens used, and Sensor size, which gives information on the size of the camera sensor in MP (megapixels). A descriptor value for the Lens descriptor can be, for example, 300 mm, and the descriptor value for the Sensor size descriptor can be e.g. 8 MP. In the case of the low-level descriptors the standard defines, in most cases, the way of extracting the descriptor value. The extracted descriptor has the form of

a vector. The standard does not, except for a few cases, define a metric in which the descriptor values can be compared. However, such metrics are known, and methods for the calculation of the distance between two low-level descriptor values are described in the literature [50]. The MPEG group, apart from providing the standard itself, provides reference software called the eXperimentation Model (XM). This software tool allows the extraction and comparison of descriptor values. It can serve as a reference for testing other MPEG-7 based tools. An interesting inconsistency in the standard can be observed: the MPEG-7 standard does not provide a method of descriptor value comparison, whereas the XM, being a part of the standard, does so. As the MPEG-7 standard is one of the most advanced achievements in metadata management, it is used in the presented research.

2.3 Search Methods and Architectures of Metadata

This Section presents a classification of the search methods used for the retrieval of media. An overview of the classification is depicted in Figure 2.3.

2.3.1 Classification Based on Input Method

Most of the existing search systems are based on textual search. The user inputs a text string which is then searched for in the repository. The search can be exact (where only identical hits are returned) or for similar hits (for example, single words from the query). For some applications it is useful to utilise a fuzzy textual search method. In this kind of search a distance is calculated from the query to the textual strings in the repository. This distance is calculated in a dedicated metric, e.g. the Levenshtein edit distance [44]. This approach allows for a successful search in the case when the user misspells a word. It can also be helpful if the words in the database are spelt incorrectly. The textual content in the repository may have different origins.
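The Levenshtein edit distance mentioned above can be computed with the standard dynamic-programming recurrence; a minimal sketch:

```python
def levenshtein(a, b):
    """Levenshtein edit distance: the minimum number of insertions,
    deletions and substitutions needed to turn string a into string b."""
    prev = list(range(len(b) + 1))      # distances from the empty prefix of a
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            curr.append(min(prev[j] + 1,                # deletion
                            curr[j - 1] + 1,            # insertion
                            prev[j - 1] + (ca != cb)))  # substitution
        prev = curr
    return prev[-1]
```

A fuzzy search would rank repository strings by this distance to the query, so a misspelt query such as "gnutela" still matches "gnutella" at distance 1.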
The repository can simply consist of textual documents, as in the case of Internet search. In the case of media repositories the textual search can be performed with the use of file names (the most common case in P2P networks) or the manual annotations assigned by the repository owners. Using this search method we can search for movies with a given actor by typing the name and hoping that the movies in the repository have been tagged with the actor’s name. Therefore, the effectiveness of this way of media search depends on the accuracy of the textual description of the media

and is independent from the media itself.

[Figure 2.3: The classification of search methods – search methods are classified by user input method (textual, Query by Example), matching method (exact, fuzzy), source data (low-level features, high-level features, semantic), source data origin (automatic annotation, manual annotation) and complexity (single-stage, multi-stage).]

To summarise, the textual search for media is the easiest but also the most primitive way of searching. Content-based search makes it possible to search for media based on their content. A textual description can be automatically extracted from the media. This can be done, for example, by means of Optical Character Recognition (OCR), face recognition or speech recognition. Such content-based, automatically generated metadata can be accessed via the previously described textual search. Now, a textual search by providing the name of an actor can yield better results if the repository has been processed by a face recognition algorithm. More advanced content-based search techniques utilise the low-level features of the video, such as the dominant colour, shape histogram or similar. This gives us an opportunity to perform a search by providing an example. If the repository provides tools for a low-level content-based search with a Query by Example (QbE) search method, it is possible to search for videos or scenes which depict an actor by providing a visual example of this actor. The most advanced search tools utilise the semantic information extracted from the media. If a repository provided tools for extraction of and search within the semantic data derived from the content, it would be possible to perform a textual query in the form of a sentence: “I am searching for love scenes with an actor talking to a woman”.
From the above example it is easy to notice that the most versatile and useful search system should incorporate all of the above-mentioned search methods. If such a system existed, it would be possible to perform the query: “I am looking

for scenes from movies with the actor from the provided example, who is wearing a blue suit and talking to a woman.”

2.3.2 Classification of Search Methods Based on Complexity

Search methods can be classified as single-stage and multi-stage methods. In single-stage search methods the user provides a query and the searching process is concluded when the user is provided with the list of relevant results. This method is simple and does not require much input from the user. In some systems the user, after receiving the results, has the possibility to apply filters or perform a further search within the results. This filtering and/or search is performed locally. The other category of search mechanisms are multi-stage search systems. In such systems the first part of the querying process is identical to the single-stage search. However, upon receiving the results the user may refine his query by utilising the received results. Such a refined query is executed again and is expected to yield better results than the initial query. Multi-stage search systems are supposed to allow for a more effective search. They, however, require more effort and expertise from the user.

2.3.3 Advanced Search Methods and Users

The natural question which arises here concerns the usefulness of search tools. We have got used to textual search both in the case of documents and media and may just not feel the need for more advanced tools. Such tools, however, may be useful both for professionals and prosumers. Two usage scenarios are provided below. These scenarios have been developed by the author in cooperation with content providers and specialists [24]. Alice, a 28-year-old history teacher, is preparing an ancient-history lesson for her class. She wishes to show pictures from Pompeii showing the Vesuvius volcano in the background. She has a problem, however – the only photos she finds on the Internet are thousands of family photos shot at Vesuvius.
She knows that there are many good pictures in the network, but she cannot find an appropriate one straight away – only after browsing hundreds of pictures. Juan is a 46-year-old violin player and composer. Today, on his way home on the train, he heard a fascinating Celtic tune from someone’s mobile ring tone. He remembered the tune perfectly, thanks to his musical talent. He would really like to know more about this tune he hums all the time . . .

Alice runs a Peer-to-Peer client on her notebook and uploads her photo as a query example. In the returned results she quickly finds an appropriate photo for her lesson. Juan runs the same client on his computer and hums the tune into the microphone to use it as a query over the content-aware, worldwide-spread, distributed multimedia P2P store. In the results he finds the full song and concert description, and the music file together with the DRM information needed to retrieve it. But he also gets a peer-cast online radio station that is just playing the whole song or did so within the last week. He realises that the peer-cast station plays the kind of Celtic music he likes all the time and finds out that it is run by a Celtic band that wishes to live-spread its music to a wide audience over the network. After a few days of listening, he joins the Celtic band community that is running the peer-cast station and shares his own works. Expert publications stress that users are becoming media producers and consumers at the same time (“prosumers”). Society is moving from a common, unified mainstream market to individual and fragmented tastes [23]. This trend can be depicted as the long-tail effect in Figure 2.4. On the left side, a vertical bar represents the standard model of multimedia production, where a few broadcast stations produce content for masses of consumers. On the right side of the image, the horizontal bar depicts the huge society of bloggers, photo-bloggers, video-bloggers and podcasters who offer their multimedia productions to a narrow group of focused consumers. These producers are at the same time consumers of the multimedia productions provided to them. This is the Web 2.0 community, which can be referred to as prosumers.

2.3.4 Summary on Classification of Search Methods

The search and retrieval mechanisms can be classified in several ways (Figure 2.3).
The most important conclusion is that the diversity of search mechanisms is similar to the diversity of content types. And, like the content, different retrieval methods are targeted at different end users. Moreover, thanks to easy access to the Internet and the growing popularity of media, the tastes and interests of consumers are becoming more diverse. Search and retrieval technology should keep up with this trend.

2.3.5 The Architectures of Metadata

The traditional approach to storing metadata is to keep it in a single repository along with the data itself. This approach guarantees quick access both to the metadata

and to the content, as both are kept in a single repository.

[Figure 2.4: Popularity of content versus the number of the content consumers [23] – the long tail, plotted as the number of content consumers against the number of content creators, ranging from one-to-many production (professional photographs, worldwide magazines, photo exhibitions) through regional, local and community newspaper photographs and information portals, to everybody-to-a-few production (blogging, photo sharing).]

This feature makes the centralised approach useful for local use, but when moving into the area of networking this approach becomes ineffective. A user searching for a piece of content has at his disposal only one local copy of the metadata accompanying the content itself. This is depicted in Figure 2.5, where {A, B, C . . .} represent pieces of content and {MA, MB, MC . . .} the corresponding metadata. The approach adopted in the presented research is to separate the metadata from the content while preserving the link between them. Moreover, metadata can be copied and distributed, which is not always possible with the content due to its high volume or copyright restraints. This leads to easier search and retrieval of the metadata and its linked content. The user has at his disposal a distributed and fuzzy repository of metadata stored in many P2P overlay nodes. The concept is depicted in Figure 2.6.

2.4 Search Benchmarking

According to the English dictionary [69], a benchmark is: “a standardized problem or test that serves as a basis for evaluation or comparison (as of computer system performance)”

Figure 2.5: The traditional approach to content and metadata storing

Figure 2.6: The distributed storage of media and metadata (the figure shows several P2P overlay nodes holding content items A, B, C, D together with replicated metadata records M_A, M_B, M_C, M_D, a P2P client and the user)

Search tools, like any other systems, need to be measured in order to assess their widely understood quality. Apart from assessing quality, the results of the benchmarking process allow the proposed or developed system to be compared against similar ones. The requirement for such a comparison is that all the compared systems are measured with the same measurement tools, called benchmarking frameworks.

In computer science, benchmarking is performed according to a standard. Such a standard may be official, approved by one of the standardisation bodies. On the other hand, there is a variety of unofficial benchmarking frameworks accepted by the community of users and developers.

The problem of measuring search quality emerged with the first search engines. The methods, mainly for the assessment of search accuracy, became more sophisticated with the introduction of computerised media. The methodology for accuracy benchmarking in media databases is well established. Unfortunately, there are neither standards nor widely recognised methods for benchmarking media search in P2P overlays. The presented research proposes a benchmarking framework which can be used to measure multimedia search accuracy and time in P2P overlays.

2.4.1. The Benchmarked Parameters

There are several aspects of a search system that can be measured. The most important is search accuracy, which can be defined as the ability of the system to find the desired results upon a query. The second measurable attribute of a search system is search time. For database-based systems search time was not an issue, as the search was instant thanks to the high performance and locality of database systems; according to works describing search benchmarking systems, "speed is not of central concern" [43]. Distributed P2P systems, however, are characterised by a considerable and varying delay in communications, and therefore search time will also be discussed. Although accuracy and time are not the only measurable characteristics of a search system, the author finds them the most important. The security aspects of such a system, including trust management, are out of the scope of the presented research.
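In a distributed overlay, search time can simply be measured as the wall-clock time elapsed between issuing a query and receiving the results. A hypothetical sketch follows; the overlay query below is a stand-in that only simulates the varying P2P communication delay, as the real client is outside the scope of this snippet.

```python
import random
import time

def simulated_overlay_query(query):
    """Stand-in for a P2P overlay query; the sleep models routing delay."""
    time.sleep(random.uniform(0.01, 0.05))
    return ["result-1", "result-2"]

def timed_search(query):
    """Return the result list together with the elapsed search time in seconds."""
    start = time.perf_counter()
    results = simulated_overlay_query(query)
    elapsed = time.perf_counter() - start
    return results, elapsed

results, elapsed = timed_search("example query")
print(len(results), elapsed > 0)  # 2 True
```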
A trust system, which can be easily adapted to ensure the security of a P2P system, was developed by the author as a master's thesis [16] and presented in [26]. The resource consumption of the search system is another measurable feature. Resources are understood here as memory, processing power, disk space and bandwidth. Memory is consumed mainly by the routing operations of the overlay and can be treated as service-independent. Processing power is consumed mainly by the operations of query preparation and processing (described in detail in Chapter 1.2) and contributes to the search delay; therefore it will not be analysed separately. Disk space is required, apart from the content storage, to contain the descriptors. Because the space consumed by the descriptors is minimal when compared with the disk area required for the media, it also will not be analysed in

this dissertation. The bandwidth consumed by the search service will be analysed during the experiments. In a QbE search application, bandwidth is consumed during the querying process due to the large (when compared to textual queries) size of the descriptors and due to the transmission of thumbnails of the retrieved media items.

2.4.2. Search Accuracy Benchmarking

Assessment methods for search accuracy are well explored in the case of centralised repositories of media [64]. In the case of distributed repositories these methods require adaptation.

Annotated Image Databases

In order to benchmark the accuracy of a search system it is required to have a ground truth. It may be defined as full knowledge of all the data stored in the system, and it serves as a reference level for the benchmarking of accuracy. In other words, to assess search system accuracy, defined as the ability of the system to find the desired results upon a query, it is required to be able to confront the results against what is actually available in the repository. In the case of multimedia benchmarking, a ground truth is a collection of annotated media files. Such a collection is usually annotated manually to make the annotations accurate. For images there are several requirements for the reference collection. First of all, although the search system may have access to an almost unlimited number of images, a very large benchmark database is not critical [43]. According to the same author, the actual number of pictures in the reference database should vary from 1,000 to 10,000 images. The second requirement concerns the content of the images. To make the measurements coherent with the real environment of the application, the content of the images (meaning, in the case of image search, photographs) should be natural and complex. "Complex" may here be defined as "semantically rich" and multi-object.
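In its simplest form, such a ground truth can be represented as a mapping from each benchmark query to the set of collection items judged relevant to it. A minimal sketch, with made-up queries and image identifiers:

```python
# Manually prepared ground truth: each benchmark query maps to the set of
# items in the collection that human annotators judged relevant to it.
ground_truth = {
    "sunset over sea": {"img-004", "img-017", "img-032"},
    "red sports car":  {"img-011"},
}

def relevance(query, returned_items):
    """Label each returned item as relevant (1) or irrelevant (0),
    i.e. produce the values V_n used later by the accuracy metrics."""
    relevant = ground_truth.get(query, set())
    return [1 if item in relevant else 0 for item in returned_items]

print(relevance("sunset over sea", ["img-004", "img-999", "img-032"]))
# -> [1, 0, 1]
```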
The format of the images also needs to reflect the typical format used by users, which in the case of image sharing is a low-compressed JPEG. Available sources of such annotated images can basically be divided into two groups. The first group contains annotated collections which are intended to be used in benchmarking. The number of files stored in such databases can vary from hundreds [42] to tens of thousands [28]. The content of the databases also varies, from natural images [4] to strongly artificial multi-angle shots of single objects on a uniform background [14]. Such databases are typically available to the research community free of charge. Another source of annotated images are Internet photo hosting services, in which users have the possibility to manually annotate images. Flickr and

Corbis are examples of such services. The greatest advantage of such a source of annotated images is the number of photographs hosted. By the end of 2010 Flickr hosted over 5 billion (5 × 10^9) photographs, growing by 1 billion (1 × 10^9) per year. The disadvantages are the uncontrollable quality of annotation and the difficulty of access to the repositories. A strong advantage is the natural character of such a source of images, meaning that the photographs were taken by users in natural, every-day circumstances.

Evaluation Metrics

Existing evaluation metrics for search accuracy can be divided into quantitative and qualitative metrics. The former are measured objectively and refer to the broadly understood performance of the system. The latter refer to the subjective quality of the system and are assessed by the users. An example of a qualitative metric is the Mean Opinion Score (MOS) scale, which was initially standardised by the International Telecommunication Union [31]. In this metric the quality of the system is subjectively assessed by the users on a five-grade scale, where 5 means the best quality and 1 means the worst quality. Another example of a qualitative measure is the R-factor, which may be utilised in a similar way as it is used for the subjective evaluation of speech quality in voice transmission systems [32]. One may also take into consideration that MOS may be derived from the R-factor (and vice versa) [33] if a proper mapping function is known. There are numerous quantitative metrics for the assessment of search accuracy in a single-stage retrieval process. An overview is given in [64]. These metrics are applicable in the case of binary classification problems, where each item can be classified either as relevant or irrelevant to the query.
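As an illustration of the R-factor/MOS relationship, one widely used mapping function is the one given in ITU-T Recommendation G.107 (the E-model); the sketch below implements that mapping only as an example, and it may differ from the mapping used in [33].

```python
def r_to_mos(r):
    """Map an R-factor to an estimated MOS using the E-model mapping
    of ITU-T G.107: MOS is clamped to [1.0, 4.5] and follows a cubic
    curve in between."""
    if r <= 0:
        return 1.0
    if r >= 100:
        return 4.5
    return 1 + 0.035 * r + r * (r - 60) * (100 - r) * 7e-6

print(round(r_to_mos(93.2), 2))  # -> 4.41 (the E-model default R maps near MOS 4.4)
```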
If a query is sent to a system containing N items, a cut-off value k is set and V_n is defined as the relevance (Boolean, 0 for irrelevant, 1 for relevant) of the returned n-th item. It has to be noted that the introduction of relevance (V_n) implies that a ground truth is known. The ground truth is a priori knowledge which allows the classification of the items in the database as relevant or irrelevant to the given query. Such a ground truth may not be accessible in all search scenarios (as will be shown later), in which case the metrics defined below become useless. However, for the sake of completeness of the discussion on benchmarking systems, the following metrics can be defined:

– Detections, the number of relevant items detected, defined as (2.1):

A_k = \sum_{n=0}^{k-1} V_n    (2.1)

– False Alarms, the number of irrelevant items detected, defined as (2.2):

B_k = \sum_{n=0}^{k-1} (1 - V_n)    (2.2)

– Misses, the number of undetected relevant items, defined as (2.3):

C_k = \sum_{n=0}^{N-1} V_n - A_k    (2.3)

– Correct Dismissals, the number of irrelevant items not detected, defined as (2.4):

D_k = \sum_{n=0}^{N-1} (1 - V_n) - B_k    (2.4)

– Primary Recall, the number of detections divided by the total number of relevant items, defined as (2.5):

R_k = \frac{A_k}{A_k + C_k}    (2.5)

– Primary Precision, the number of detections divided by the total number of returned items, defined as (2.6):

P_k = \frac{A_k}{A_k + B_k}    (2.6)

– Fallout, the number of false alarms divided by the sum of false alarms and correct dismissals, defined as (2.7):

F_k = \frac{B_k}{B_k + D_k}    (2.7)

Analogous metrics can be defined for multistage retrieval. As the designed benchmarking system is intended to be used for single-stage search techniques, metrics dedicated to multistage retrieval are out of the scope of this work.

Evaluation Methods

The calculation of a single metric defined by Formulas 2.1 to 2.7 does not by itself allow conclusions to be drawn about the search system performance. A trade-off between some

metrics is quite common, so selected metrics have to be compared against each other. In order to draw conclusions about the accuracy of the system, evaluation methods need to be defined. An overview of the recommended evaluation methods can be found in [64]:

– Retrieval Effectiveness, defined as the comparison of precision (2.6) versus recall (2.5). It is a recommended evaluation method.

– Receiver Operating Characteristics, defined as the comparison of detections (2.1) versus false alarms (2.2).

– Relative Operating Characteristics, defined as the comparison of detections (2.1) versus fallout (2.7).

– R-value, defined as the precision (2.6) at different cut-off values k.

– 3-point Average, defined as the average precision (2.6) at defined values of recall (2.5), typically R_k = 0.2; 0.5; 0.7.

– 11-point Average, defined as the average precision (2.6) at 11 defined values of recall (2.5).

Existing Benchmarking Systems

Existing visual information retrieval benchmarks focus mainly on preparing the annotated media [58]. They also target media stored locally, whereas the proposed benchmarking system focuses on distributed storage. The Information Access Division of the US National Institute of Standards and Technology is responsible for preparing a benchmarking system for the retrieval of video, called TRECVID [63]. It provides the video data as well as a set of topics (queries) to work with. The Technical Committee 12 (TC-12) of the International Association for Pattern Recognition provides a set of images and reference queries for testing the search and retrieval of images [27]. The TC-12 benchmark provides four core elements:

– a set of images,
– a set of queries,
– a collection of ground truths associated with the images and queries,
– a set of measures of the retrieval performance.
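The quantitative metrics of Formulas 2.1 to 2.7, together with the precision-recall pairs used by the Retrieval Effectiveness method, can be computed directly from the list of relevance values V_n. The following sketch is illustrative only and is not the thesis benchmarking framework itself:

```python
# Metrics from Formulas 2.1-2.7, computed from a list of relevance values
# V_n (1 = relevant, 0 = irrelevant) of the N returned items, at cut-off k.

def accuracy_metrics(v, k):
    detections = sum(v[:k])                         # A_k, Formula (2.1)
    false_alarms = k - detections                   # B_k, Formula (2.2)
    misses = sum(v) - detections                    # C_k, Formula (2.3)
    dismissals = (len(v) - sum(v)) - false_alarms   # D_k, Formula (2.4)
    return {
        "A": detections, "B": false_alarms, "C": misses, "D": dismissals,
        "recall": detections / (detections + misses),           # R_k, (2.5)
        "precision": detections / (detections + false_alarms),  # P_k, (2.6)
        "fallout": false_alarms / (false_alarms + dismissals),  # F_k, (2.7)
    }

# Example: 10 returned items, 4 of which the ground truth marks as relevant.
v = [1, 0, 1, 1, 0, 0, 1, 0, 0, 0]
m = accuracy_metrics(v, k=5)
print(m["A"], m["B"], m["recall"], m["precision"])  # 3 2 0.75 0.6

# Retrieval Effectiveness: precision-recall pairs over all cut-off values.
curve = [(accuracy_metrics(v, k)["recall"], accuracy_metrics(v, k)["precision"])
         for k in range(1, len(v) + 1)]
```

Note that A_k + C_k always equals the total number of relevant items and B_k + D_k the total number of irrelevant ones, so recall and fallout denominators are independent of k.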
