Multi-set DHT for Grid Resource management1

Georges Da Costa   Salvatore Orlando   Marios D. Dikaiakos
CNR-ISTI/HPCLab-UCY CNR-ISTI HPCLab-UCY
georgio.da.costa@gmail.com orlando@dsi.unive.it mdd@cs.ucy.ac.cy

Coregrid Integration Workshop 2006, Krakow

Abstract: The Grid resource manager is one of the fundamental Grid services. It has to manage the Grid state and to locate resources for users. With Grids becoming larger, this service needs to be efficient and scalable. Current centralized approach are unable to scale without large dedicated infrastructure. Thus, we propose the Multi-set approach, which aims to find the best tradeoff between DHT-based network and total replication. It is built over classical DHT P2P system. It improves most of current DHT P2P system by taking into account the dynamism of resources. Evaluation is done by simulations. This approach is more efficient than DHT P2P system and total replication whichever the dynamism of resources is.
Peer to Peer, Resource Management, Grids, Performance evaluation

1  Introduction

Grids are based on several basic but nevertheless necessary services. One of such services is the resource manager. This service has to keep track of the Grid state and has to be able to locate resources corresponding to user queries. The attributes associated with the diverse physical resources making up the Grid can be of various types. They can be static, such as the type of network card, or dynamic, such as network bandwidth. Some attributes can be characterized by an intermediate dynamism, such as the free processors number in a cluster.

Resource managers have thus to track the various Grid elements and characteristics. To this end, they have to manage large amounts of information. Information is constantly produced in every node of the Grid, where the physical resources are. Resource managers have to answer to queries which are initiated by users willing to run their application on the Grid. Users submit their queries at portals. Those portals are widely distributed as there are at least one per community. The burden and complexity of resource management should not increase too much with the scale of the Grid system. Moreover, such a system should be able to efficiently answer queries[12] such as: find clusters with at least 32 free processors and Ethernet network.

Difficulties arise due to the large scale of current Grids. The centralized approach shows its limits, since it sacrifices data freshness for efficiency. For example, a system using a tree structure (like the predominant Monitoring and Discovery System of Globus) has to increase timeouts to prevent overload of the root. Clusters are the leaves of the tree. Information on the leaves is updated every second. Leaves are aggregated in groups (sub trees), and so on until reaching the root which aggregates all resource information of the system. For large systems, the root is updated every ten minutes or more to reduce its workload[2].

P2P systems are known[8] for having solved the file-sharing scalability problem and for being fault-tolerant and easy to deploy and manage. However, classical P2P systems (Chord, Freenet) cannot directly be used to solve the problem of Grid resource management, as they lack key functionalities, like the ability to give several answers for one query, or to answer complex queries.

P2P systems for Grid resource management have been already introduced[1, 3]. Even links between databases and P2P have been explored[5]. However, most of these systems are not yet efficient enough for our purposes. They are not reactive and in some case show linear behavior in term of number of communications. Secondly, most of these studies do not take into account the variability of stored information.

Our goal is to improve current systems[1, 3] using a Peer to peer approach to provide dynamism aware resource management. Queries are often kept simple because the most versatile the query is, the less optimizations are possible. For this reason, resources in Grids are usually represented as (or can be transformed in) integers. Queries become boolean expression of atomic interval queries. Grid resource manager can processes the atomic queries independently and then merge the results. This model of queries is generic enough as most current systems queries can be translated in those. Thus a Grid resource manager can be build using a simple module: a system that answer interval queries[3]. This basic module shall share the qualities of the whole system: managing dynamic data, answering fully distributed queries and updates, and being scalable. Our proposal of such a system is the Multi-set DHT which is optimized for all type of information dynamism.

In the following, we will first evaluate the performance of other systems, then we will present our Multi-set approach, and finally we will evaluate its performance.

2  Multi-set DHT

In this section we discuss our proposal to store, index and discover Grid resources identified by specific attributes. Our Multi-Set DHT can be thought as a system that aims to find the best tradeoff between the two antithetic approaches discussed in the following: a single DHT-based P2P network, spanning all the nodes of our system, and an alternative system where each node contains a complete replicated index.

Distributed Hash Table (DHT)

DHT P2P networks are systems that link an integer key with a single value (hash table) in a decentralized way. Most DHTs efficiently answer simple queries. Their worst case cost to retrieve data in term of number of messages exchanged is logarithmic in the number of nodes.

For example, Chord[11] guarantees that queries need less than log(n) messages for n nodes. The mechanism is simple. Each node has a randomly chosen Id and each data has a key obtained by hashing its content. Those two values are between 0 and 2k-1. Each data is managed by one node, whose Id is the nearest but smaller or equal to the data's key. Each node only knows (Figure 1) a handful of other nodes which are the closest to : Id+1, Id+2, Id+4, ..., Id+2i, ..., Id+2k-1 (modulo 2k). Each node knows the values of the keys between its Id and the Id of its successor. To obtain a data, a node sends a query to the nearest (modulo 2k) neighbor (Figure 2). The request is forwarded by the same method to the node responsible for the data. Then at each forward, the distance is halved. So there is at most log(n) messages to contact the right node.


Figure 1: Each node in Chord has an random Id between 0 and 2k-1. Each of them knows only a small number of neighbor. Each node is responsible for the data whose Id is between its own Id and the Id of its successor.


Figure 2: To find data in a Chord system, one node contacts the node which is nearest to the data Id. The query is forwarded until finding the node responsible for the data.


But they are too limited to be directly used for resource management. They only can give one answer, and cannot answer interval requests. Still, some systems are trying to address the interval query problem by using a P2P approach, such as Probe[10], Baton[6], or others[7, 4] described in [12]. These systems use the sub-interval (Figure 3) technique to overcome those limitations.

They have the same performance as usual DHT for a single value. For small intervals, which are spread on a few nodes, the cost remains the same, i.e. logarithmic. However, the query cost of those system is linear in the size of the interval. As the sub-intervals are assigned to distinct nodes of the P2P system, the larger the interval is, the more nodes need to be contacted.

The second problem is related to the workload distribution. A query can aim at looking for resources clusters whose attribute "number of free processors" must be al least 32. Intervals of some resources queries have always the same bounds opened, for example the right one for queries asking for free processors. For open interval queries, the node responsible for one of the two extremities of the key space has to answer all the queries. For example, since typical queries looking for clusters with free processors are interval queries that are open on the right, the node responsible for the right extremity of the key space has to answer all these queries.


Figure 3: For P2P systems able to answer interval queries, the global interval is split in the number of nodes sub-intervals. To find an interval, one has to search for a node responsible for one bond by using the global DHT. Then this node contacts its neighbor (each node of a dht knows some other nodes, and one of them is the next one), and so on until the second bound is reached.


To summarize, the cost for open queries is linear with respect to the number of nodes, and the cost for updates is logarithmic, making this approach efficient when most of requests are updates.




Another extreme method would be to adopt a total replication of all the data to be indexed amongst the nodes. The query is not expensive in terms of messages exchanged, but the changes of resources attributes becomes very expensive since all the nodes must be contacted to update the indexed information. This method is perfect when there are a lot of queries and a very small number of updates as cost for updates is linear, and query cost is constant.




Several DHT systems[6, 7, 10, 4] have been proposed, each one providing a complete algorithm for managing resources. An other approach is to build a system which use those DHT as modules to obtain a more efficient system. An example is provided by Sword[3] which shows that using one of these modules for each Grid resource improves the overall performance.

Our approach

Grid resource management is based on information made available by each resource. In the following we consider the basic unit as the cluster and the resources which are linked to it. Usually, each cluster of the grid uses a local resource manager. Such node is usually a computer linked to the computing and storage elements. As we aim to obtain a fully decentralized system without the need of adding new servers, we will use those nodes as support of our system.

Each resource attribute has its own ratio of update/request (requests are updates and queries). Standard DHT and total replication are only efficient for some of the ratio. Total replication is efficient for a ratio close to 0, and one DHT for a ratio close to 1 for uniform open queries.


Figure 4: On the left side, all the nodes belong to the same DHT. On the right one, all the nodes have a copy of the same index, i.e., the replication is total. One of the possible intermediate case is shown in the center, where nodes are subdivided into several sets. Each node set supports a distinct DHT. In order to switch between a system configuration and another one, merge and split functions are used.


The goal of our Multi-Set DHT approach is to find the best tradeoff for each possible ratio. To this end, we synchronize several classical DHTs, each supported by a distinct set of nodes. All nodes are thus partitioned into distinct sets as shown on Figure 4. When an update is issued, it must be distributed to all sets. On the other hand, queries can be served by only one randomly chosen set. By increasing the number of sets, the query cost decreases, but the update cost increases.

To be efficient, this system adapts the number of sets in function of the ratio. For each set, a node is responsible for aggregating statistics of the usage of the set in order to chose if it is necessary to increase or decrease the set number. The node managing the value 0 of the set is chosen as the responsible for the set.

Once a responsible node choose to change the number of set, one of the two operations Merge or Split are executed.

Merge is based on the dynamicity capability of the underlying DHT system. One of the set is destroyed, and each node of it returns as new nodes in one of the other sets.

Split is based on the error recovery mechanism of the underlying DHT system. Nodes uniformly distributed amongst the sets are removed from them, and go to be part of a new set. Redundancy is used to reduce the cost of creating the new set.

Except during a split, all the nodes have links to at least one node of each other set.

3  Preliminary performances evaluation



Figure 5: Simulation of mean number of messages in a 1000 nodes system to answer Queries and Update. X-axis shows the percentages of the requests that are Updates.


Figure 6: Standard deviation of the node's workload in a 1000 nodes system. X-axis shows the percentages of the requests that are Updates one.


Multi-set is evaluated using a dedicated event-driven simulator implemented in Ocaml[9]. The generator of requests issued them uniformly among nodes. Requests stand for queries and updates. Queries were relative to intervals open on the right (like in at least 32 free processors) and with the left bound uniformly distributed over the values space. The underlying P2P architecture is Chord[11]-like. Metrics are Mean number of messages for requests (Query and Update) and Workload balance. Multi-set is compared with the two other limit approaches, i.e. a single DHT and total replication. Data for one DHT and total replication are obtained by simulation too. Models of the three approaches confirm simulation results.

Figure 5 compares the performance of the three methods for a network of 1000 nodes depending on the ratio Update/Request. The multi-set performance is always better than the others, particularly for non-extreme values of the ratio. The worst performance obtained for the Multi-Set is when the ratio is 1/2. Yet, at this point it is more than 4 times more efficient than the system based on a single DHT, and 8 times more than the one based on total replication.

Figure 6 compares standard deviation of the workload assigned to each node for the three methods for uniformly distributed requests. Standard deviation of the total-replication is null as all the nodes do always the same amount of work. Open queries for one DHT put a lot of weight on nodes responsible for the end of the interval. By using multi-set, the work is more distributed, even when there are mostly queries.

4  Conclusion

Most of the Peer to Peer systems able to answer interval queries directly try to solve the problem by optimizing the Query and the Update costs. We choose to address a more realistic case where the ratio between the queries and updates is important. Using this approach we can improve other systems able to deal with interval queries by providing a high-level algorithm that can be based on other low-level ones. It is possible to use our approach over other Peer to Peer interval systems and then to improve dramatically their performance. We can inherit some of their properties too, for example more complex queries.

The Multi-set method gives good results for each type of ratio Update/Request. Moreover it can adapt automatically to the way the requests are done. Thus it is efficient to manage several types of resources for Grids.

Experimental validation using realistic platforms like PlanetLab or Grid5000 will be done.

References

[1]
Ian Foster Adriana Iamnitchi. A Peer-to-Peer Approach to Resource Location in Grid Environments. Kluwer Publishing, 2003.

[2]
Ben Clifford. Globus monitoring and discovery. In GlobusWorld05, 2005.

[3]
David Patterson David Oppenheimer, Jeannie Albrecht and Amin Vahdat. Scalable wide-area resource discovery. Technical Report UCB/CSD-04-1334, EECS Department, University of California, Berkeley, 2004.

[4]
Prasanna Ganesan, Mayank Bawa, and Hector Garcia-Molina. Online balancing of range-partitioned data with applications to peer-to-peer systems. Technical report, Stanford U., 2004.

[5]
Steven D. Gribble, Alon Y. Halevy, Zachary G. Ives, Maya Rodrig, and Dan Suciu. What can database do for peer-to-peer? In Fourth International Workshop on the Web and Databases (WebDB '2001)., pages 31--36, 2001.

[6]
H. V. Jagadish, Beng Chin Ooi, and Quang Hieu Vu. Baton: a balanced tree structure for peer-to-peer networks. In VLDB '05: Proceedings of the 31st international conference on Very large data bases, pages 661--672. VLDB Endowment, 2005.

[7]
Nikos Ntarmos, Theoni Pitoura, and Peter Triantafillou. Range query optimization leveraging peer heterogeneity. In 3rd International Workshop on Databases, Information Systems and Peer-to-Peer Computing (DBISP2P 2005), August 2005.

[8]
Andy Oram. Peer-to-Peer : Harnessing the Power of Disruptive Technologies. O'Reilly, 2001.

[9]
Didier Rémy. Using, Understanding, and Unraveling the OCaml Language. From Practice to Theory and Vice Versa, chapter 2002, pages 413--536. Lecture Notes in Computer Science.

[10]
Ozgur D. Sahin, S. Antony, Divyakant Agrawal, and Amr El Abbadi. Probe: Multi-dimensional range queries in p2p networks. In WISE, pages 332--346, 2005.

[11]
Ion Stoica, Robert Morris, David Karger, Frans Kaashoek, and Hari Balakrishnan. Chord: A scalable peer-to-peer lookup service for internet applications. Technical Report TR-819, MIT, March 2001.

[12]
Paolo Trunfio, Domenico Talia, Paraskevi Fragopoulou, Charis Papadakis, Matteo Mordacchini, Mika Pennanen, Konstantin Popov, Vladimir Vlassov, and Seif Haridi. Peer-to-peer models for resource discovery on grids. In Proc. of the 2nd CoreGRID Workshop on Grid and Peer to Peer Systems Architecture, 2006.

1
Work funded by Coregrid/Ercim

This document was translated from LATEX by HEVEA.