Multi-set DHT for Grid Resource management1
Georges Da Costa Salvatore Orlando Marios D. Dikaiakos
CNR-ISTI/HPCLab-UCY CNR-ISTI HPCLab-UCY
georgio.da.costa@gmail.com orlando@dsi.unive.it mdd@cs.ucy.ac.cy
Coregrid Integration Workshop 2006, Krakow
Abstract:
The Grid resource manager is one of the fundamental Grid services. It has to
manage the Grid state and to locate resources for users. With Grids becoming
larger, this service needs to be efficient and scalable. Current centralized
approach are unable to scale without large dedicated infrastructure. Thus, we
propose the Multi-set approach, which aims to find the best tradeoff
between DHT-based network and total replication. It is built over classical DHT
P2P system. It improves most of current DHT P2P system by taking into
account the dynamism of resources. Evaluation is done by simulations.
This approach is more efficient than DHT P2P system and total replication whichever
the dynamism of resources is.
Peer to Peer, Resource Management, Grids, Performance
evaluation
1 Introduction
Grids are based on several basic but nevertheless necessary services. One of
such services is the resource manager. This service has to keep track of the
Grid state and has to be able to locate resources corresponding to user
queries. The attributes associated with the diverse physical resources making
up the Grid can be of various types. They can be static, such as the type of
network card, or dynamic, such as network bandwidth. Some attributes can be
characterized by an intermediate dynamism, such as the free processors number
in a cluster.
Resource managers have thus to track the various Grid elements and
characteristics. To this end, they have to manage large amounts of information.
Information is constantly produced in every node of the Grid, where the
physical resources are. Resource managers have to answer to queries which are
initiated by users willing to run their application on the Grid. Users submit
their queries at portals. Those portals are widely distributed as there are at
least one per community. The burden and complexity of resource management
should not increase too much with the scale of the Grid system.
Moreover, such a system should be able to efficiently answer
queries[
12] such as:
find clusters with at least 32 free
processors and Ethernet network.
Difficulties arise due to the large scale of current Grids. The centralized
approach shows its limits, since it sacrifices data freshness for
efficiency. For example, a system using a tree structure (like the predominant
Monitoring and Discovery System of Globus) has to increase timeouts to
prevent overload of the root. Clusters are the leaves of the tree. Information
on the leaves is updated every second. Leaves are aggregated in groups (sub
trees), and so on until reaching the root which aggregates all resource
information of the system. For large systems, the root is updated every ten
minutes or more to reduce its workload[
2].
P2P systems are known[
8] for having solved the file-sharing scalability problem
and for being fault-tolerant and easy to deploy and manage. However, classical P2P
systems (Chord, Freenet) cannot directly be used to solve the problem of Grid
resource management, as they lack key functionalities, like the ability
to give several answers for one query, or to answer complex queries.
P2P systems for Grid resource management have been already
introduced[
1, 3]. Even links between databases and P2P have
been explored[
5].
However, most of these systems are not yet
efficient enough for our purposes. They are not reactive and in some case show linear
behavior in term of number of communications. Secondly, most of these studies do not take into
account the variability of stored information.
Our goal is to improve current systems[
1, 3] using a Peer
to peer approach to provide dynamism aware resource management. Queries are
often kept simple because the most versatile the query is, the less
optimizations are possible. For this reason, resources in Grids are usually
represented as (or can be transformed in) integers. Queries become boolean
expression of atomic interval queries. Grid resource manager can processes the
atomic queries independently and then merge the results. This model of queries
is generic enough as most current systems queries can be translated in those.
Thus a Grid resource manager can be build using a simple module: a system that
answer interval queries[
3]. This basic module shall share the qualities of the whole
system: managing dynamic data, answering fully distributed queries and
updates, and being scalable. Our proposal of such a system is the
Multi-set DHT which is optimized for all type of information dynamism.
In the following, we will first evaluate the performance of other systems,
then we will present our Multi-set approach, and finally we will evaluate its
performance.
2 Multi-set DHT
In this section we discuss our proposal to store, index and discover Grid
resources identified by specific attributes. Our Multi-Set DHT can be thought
as a system that aims to find the best tradeoff between the two antithetic
approaches discussed in the following: a single DHT-based P2P network, spanning
all the nodes of our system, and an alternative system where each node contains
a complete replicated index.
Distributed Hash Table (DHT)
DHT P2P networks are systems that link an integer key with a single value (hash
table) in a decentralized way. Most DHTs efficiently answer simple queries. Their
worst case cost to retrieve data in term of number of messages exchanged is
logarithmic in the number of nodes.
For example, Chord[
11] guarantees that queries need less than
log(
n) messages for
n nodes. The mechanism is simple. Each node has a
randomly chosen
Id and each data has a key obtained by hashing its
content. Those two values are between 0 and 2
k-1. Each data is managed by
one node, whose
Id is the nearest but smaller or equal to the data's
key. Each
node only knows (Figure
1) a handful of other nodes which are
the closest to :
Id+1,
Id+2,
Id+4, ...,
Id+2
i, ...,
Id+2
k-1 (modulo 2
k). Each node knows the values
of the keys between its
Id and the
Id of its successor. To obtain a data, a
node sends a query to the nearest (modulo 2
k) neighbor (Figure
2). The request is forwarded by the same method to the node
responsible for the data. Then at each forward, the distance is halved. So there
is at most
log(
n) messages to contact the right node.
Figure 1: Each node in Chord has an random Id between 0 and 2k-1. Each of
them knows only a small number of neighbor. Each node is responsible for the
data whose Id is between its own Id and the Id of its
successor.
Figure 2: To find data in a Chord system, one node
contacts the node which is nearest to the data Id. The query is forwarded
until finding the node responsible for the data.
But they are too limited to be directly used for resource
management. They only can give one answer, and cannot answer interval requests.
Still, some systems are trying to address the interval query problem by using a P2P
approach, such as Probe[
10], Baton[
6], or
others[
7, 4] described in
[
12]. These systems use the sub-interval (Figure
3) technique to overcome those limitations.
They have the same performance as usual DHT for a single value. For small
intervals, which are spread on a few nodes, the cost remains the same, i.e.
logarithmic. However, the query cost of
those system is linear in the size of the interval. As the sub-intervals are
assigned to distinct nodes of the P2P system, the larger the interval is, the more
nodes need to be contacted.
The second problem is related to the workload distribution. A query can aim at
looking for resources
clusters whose attribute "number of free
processors" must be al least 32. Intervals of some resources queries have
always the same bounds opened, for example the right one for queries asking for
free processors. For
open interval queries, the node responsible for one
of the two extremities of the key space has to answer all the queries. For
example, since typical queries looking for clusters with free processors are
interval queries that are open on the right, the node responsible for the right
extremity of the key space has to answer all these queries.
Figure 3: For P2P systems able to answer interval queries, the
global interval is split in the number of nodes sub-intervals. To find an
interval, one has to search for a node responsible for one bond by using the
global DHT. Then this node contacts its neighbor (each node of a dht knows some
other nodes, and one of them is the next one), and so on until the second
bound is reached.
To summarize, the cost for open queries is linear with respect to the number of
nodes, and the cost for updates is logarithmic, making this approach efficient when most of requests are updates.
Another extreme method would be to adopt a total replication of all the data to
be indexed amongst the nodes. The query is not expensive in terms of messages
exchanged, but the changes of resources attributes becomes very expensive since
all the nodes must be contacted to update the indexed information. This method
is perfect when there are a lot of queries and a very small number of updates as
cost for updates is linear, and query cost is constant.
Several DHT systems[
6, 7, 10, 4] have been proposed, each
one providing a complete algorithm for managing resources. An other approach is
to build a system which use those DHT as modules to obtain a more efficient
system. An example is provided by Sword[
3] which shows that using one
of these modules for each Grid resource improves the overall performance.
Our approach
Grid resource management is based on information made available by each
resource. In the following we consider the basic unit as the cluster and the
resources which are linked to it. Usually, each cluster of the grid uses a local
resource manager. Such node is usually a computer linked to the computing and
storage elements. As we aim to obtain a fully decentralized system without the
need of adding new servers, we will use those nodes as support of our system.
Each resource attribute has its own
ratio of update/request (requests
are updates and queries). Standard DHT and total replication are only efficient
for some of the ratio. Total replication is efficient for a ratio close to 0,
and one DHT for a ratio close to 1 for uniform open queries.
Figure 4: On the left side, all the nodes belong to the same DHT. On the right
one, all the nodes have a copy of the same index, i.e., the replication is
total. One of the possible intermediate case is shown in the center, where
nodes are subdivided into several sets. Each node set supports a distinct DHT.
In order to switch between a system configuration and another one, merge
and split functions are used.
The goal of our Multi-Set DHT approach is to find the best tradeoff for each
possible ratio. To this end, we synchronize several classical DHTs, each
supported by a distinct
set of nodes. All nodes are thus partitioned into
distinct sets as shown on Figure
4. When an update is issued, it
must be distributed to all sets. On the other hand, queries can be served by
only one randomly chosen set. By increasing the number of sets, the query cost
decreases, but the update cost increases.
To be efficient, this system adapts the number of sets in function
of the ratio. For each set, a node is responsible for aggregating statistics
of the usage of the set in order to chose if it is necessary to increase
or decrease the set number. The node managing the value 0 of the set is
chosen as the responsible for the set.
Once a responsible node choose to change the number of set, one of the
two operations
Merge or
Split are executed.
Merge is based on the dynamicity capability of the underlying DHT
system. One of the set is destroyed, and each node of it returns as
new
nodes in one of the other sets.
Split is based on the error recovery mechanism of the underlying DHT
system. Nodes uniformly distributed amongst the sets are removed from them,
and go to be part of a new set. Redundancy is used to reduce the cost of creating
the new set.
Except during a split, all the nodes have links to at least one node of
each other set.
3 Preliminary performances evaluation
Figure 5: Simulation of mean number of messages in a 1000 nodes system to answer Queries
and Update. X-axis shows the percentages of the requests that are Updates.
Figure 6: Standard deviation of the node's workload in a 1000 nodes
system. X-axis shows the percentages of the requests that are Updates one.
Multi-set is evaluated using a dedicated event-driven simulator implemented in
Ocaml[
9]. The generator of requests issued them uniformly among nodes. Requests
stand for queries and updates. Queries were relative to intervals open on the
right (like in
at least 32 free processors) and with the left bound
uniformly distributed over the values space. The underlying P2P architecture is
Chord[
11]-like. Metrics are
Mean number of messages for
requests (Query and Update) and
Workload balance. Multi-set is compared
with the two other limit approaches, i.e. a single DHT and total
replication. Data for one DHT and total replication are obtained by simulation
too. Models of the three approaches confirm simulation results.
Figure
5 compares the performance of the three methods for a
network of 1000 nodes depending on the ratio Update/Request. The multi-set
performance is always better than the others, particularly for non-extreme
values of the ratio. The worst performance obtained for the Multi-Set is when
the ratio is 1/2. Yet, at this point it is more than 4 times more efficient
than the system based on a single DHT, and 8 times more than the one based on
total replication.
Figure
6 compares standard deviation of the workload assigned to
each node for the three methods for uniformly distributed requests. Standard
deviation of the total-replication is null as all the nodes do always the same
amount of work. Open queries for one DHT put a lot of weight on nodes
responsible for the end of the interval. By using multi-set, the work is more
distributed, even when there are mostly queries.
4 Conclusion
Most of the Peer to Peer systems able to answer interval queries directly try to
solve the problem by optimizing the Query and the Update costs. We choose to
address a more realistic case where the ratio between the queries and updates is
important. Using this approach we can improve other systems able to deal with interval queries by providing a
high-level algorithm that can be based on other low-level ones.
It is possible to use our approach over other Peer to Peer interval
systems and then to improve dramatically their performance. We can inherit some
of their properties too, for example more complex queries.
The Multi-set method gives good results for each type of ratio
Update/Request. Moreover it can adapt automatically to the way the requests are
done. Thus it is efficient to manage several types of resources for Grids.
Experimental validation using realistic platforms like
PlanetLab or Grid5000 will be done.
References
- [1]
-
Ian Foster Adriana Iamnitchi.
A Peer-to-Peer Approach to Resource Location in Grid
Environments.
Kluwer Publishing, 2003.
- [2]
-
Ben Clifford.
Globus monitoring and discovery.
In GlobusWorld05, 2005.
- [3]
-
David Patterson David Oppenheimer, Jeannie Albrecht and Amin Vahdat.
Scalable wide-area resource discovery.
Technical Report UCB/CSD-04-1334, EECS Department, University of
California, Berkeley, 2004.
- [4]
-
Prasanna Ganesan, Mayank Bawa, and Hector Garcia-Molina.
Online balancing of range-partitioned data with applications to
peer-to-peer systems.
Technical report, Stanford U., 2004.
- [5]
-
Steven D. Gribble, Alon Y. Halevy, Zachary G. Ives, Maya Rodrig, and Dan Suciu.
What can database do for peer-to-peer?
In Fourth International Workshop on the Web and Databases (WebDB
'2001)., pages 31--36, 2001.
- [6]
-
H. V. Jagadish, Beng Chin Ooi, and Quang Hieu Vu.
Baton: a balanced tree structure for peer-to-peer networks.
In VLDB '05: Proceedings of the 31st international conference on
Very large data bases, pages 661--672. VLDB Endowment, 2005.
- [7]
-
Nikos Ntarmos, Theoni Pitoura, and Peter Triantafillou.
Range query optimization leveraging peer heterogeneity.
In 3rd International Workshop on Databases, Information Systems
and Peer-to-Peer Computing (DBISP2P 2005), August 2005.
- [8]
-
Andy Oram.
Peer-to-Peer : Harnessing the Power of Disruptive Technologies.
O'Reilly, 2001.
- [9]
-
Didier Rémy.
Using, Understanding, and Unraveling the OCaml Language. From
Practice to Theory and Vice Versa, chapter 2002, pages 413--536.
Lecture Notes in Computer Science.
- [10]
-
Ozgur D. Sahin, S. Antony, Divyakant Agrawal, and Amr El Abbadi.
Probe: Multi-dimensional range queries in p2p networks.
In WISE, pages 332--346, 2005.
- [11]
-
Ion Stoica, Robert Morris, David Karger, Frans Kaashoek, and Hari Balakrishnan.
Chord: A scalable peer-to-peer lookup service for internet
applications.
Technical Report TR-819, MIT, March 2001.
- [12]
-
Paolo Trunfio, Domenico Talia, Paraskevi Fragopoulou, Charis Papadakis, Matteo
Mordacchini, Mika Pennanen, Konstantin Popov, Vladimir Vlassov, and Seif
Haridi.
Peer-to-peer models for resource discovery on grids.
In Proc. of the 2nd CoreGRID Workshop on Grid and Peer to Peer
Systems Architecture, 2006.
- 1
- Work funded by Coregrid/Ercim
This document was translated from LATEX by
HEVEA.