How to Select a Replication Protocol
According to Scalability, Availability, and Communication Overhead
R. Jiménez-Peris,
M. Patiño-Martínez, G. Alonso, B. Kemme
Abstract:
Data replication is playing an increasingly important role in the
design of parallel information systems. In particular, the widespread
use of cluster architectures in high-performance computing has created
many opportunities for applying data replication techniques in new
areas. For instance, as part of work related to cluster computing in
bioinformatics, we have been confronted with the problem of having to
choose an optimal replication strategy in terms of scalability,
availability, and communication overhead. Thus, we have evaluated
several representative replication protocols in order to better
understand their behavior in practice. The results obtained are
surprising in that they challenge many of the assumptions behind
existing protocols. Our evaluation indicates that the conventional
read-one/write-all approach is the best choice for a large range of
applications requiring data replication. We believe this is an
important result for anybody developing code for computing clusters as
the read-one/write-all strategy is much simpler to implement and more
flexible than quorum-based approaches. In this paper we show that it is
also the best choice according to a number of other selection criteria.
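
To make the two families of protocols named in the abstract concrete, the following minimal sketch counts how many replicas a single read or write must contact under read-one/write-all and under a simple majority quorum. It is an illustration only and not the evaluation model used in the paper: the function names, the majority quorum as the representative quorum scheme, and the "replicas contacted per operation" metric are all assumptions made here, and failures, load balancing, and message batching are ignored.

```python
def rowa(n):
    """Read-one/write-all: a read contacts 1 replica, a write contacts all n."""
    return {"read": 1, "write": n}


def majority_quorum(n):
    """Majority quorum (assumed here as the representative quorum scheme):
    reads and writes each contact floor(n/2) + 1 replicas."""
    q = n // 2 + 1
    return {"read": q, "write": q}


def replicas_contacted_per_op(sizes, write_fraction):
    """Average number of replicas contacted per operation for a workload
    in which `write_fraction` of the operations are writes (hypothetical metric)."""
    return (1 - write_fraction) * sizes["read"] + write_fraction * sizes["write"]


if __name__ == "__main__":
    n = 5
    for name, scheme in (("ROWA", rowa), ("majority", majority_quorum)):
        sizes = scheme(n)
        avg = replicas_contacted_per_op(sizes, write_fraction=0.2)
        print(name, sizes, round(avg, 2))
```

Under this simplified count, read-one/write-all contacts fewer replicas on average whenever reads dominate the workload, while the majority quorum contacts the same number of replicas regardless of the read/write mix.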
Proc. of the 20th IEEE Int. Conf. on Reliable
Distributed Systems, SRDS'01, New Orleans, Oct. 2001