%
% THE \simulan LOGO IS DEFINED HERE.
%
\def\simulan{{\rm s\kern-.06em\raise-.5ex\hbox{i}\kern-.1em\raise-.1ex
\hbox{m}\raise-.3ex\hbox{u}\kern-.10emL\kern-.1667em\lower-.6ex
\hbox{a}\kern-.10emn}}
%% the \PBeam Logo is defined here
\def\PBeam{{\sc\kern.15emP\kern-.9em\raise.125ex\hbox{$\leftarrow$}\sc\kern-.25emB\sc\kern-.1eme\kern-.1ema\kern-.1emm}}
You might also want to look at the companion thesis about modelling
with simuLan:
[Schmida91]
Ralf Schmidt-Dannert:
\simulan: Modellierung und Simulation lokaler Netzwerke.
Diploma thesis, TU Braunschweig, 1991 (in German).
However, they exhibit problems with unbalanced load, conflicts with interactive users, and an increased failure probability. Process migration and checkpoint/restart are adequate means to solve these problems. Several such mechanisms are described in the literature, but they all have certain weaknesses and impose restrictions on the applications they can handle.
In this thesis, a new concept for an application-transparent migration
and checkpointing mechanism is developed, which overcomes some of the
more substantial of those restrictions. It supports migration and fault
transparency for parallel and distributed applications, i.e. groups of
communicating processes, on clusters of workstations. Neither the
system kernel nor the application programs need to be modified, and
applications are not required to be written for a specific runtime
environment. However, for better performance the applications can be
linked with a modified system library. From this concept, the
architecture of the example implementation is derived, and first
measurements and experiences with the implementation are reported.
A BibTeX file from my dissertation.
The system uses a global
virtual name space to provide migration and rollback
transparency in user space for distributed groups of processes
on workstations. Applications always use the same virtual
names for the operating system objects, independent of their
current real location. The system calls are interposed and
their parameters translated between the name spaces. Unlike
other migration mechanisms, the system
does not require the applications to be written
for a specific programming model or communication library.
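
To make the translation step concrete, here is a minimal sketch in Python. It is not the actual implementation: it assumes a simple table from virtual process ids to (host, real id) pairs, and the names `VirtualNameSpace`, `interposed_kill`, and `real_kill` are hypothetical, introduced only for illustration.

```python
class VirtualNameSpace:
    """Toy table mapping stable virtual ids to current real locations (assumed design)."""

    def __init__(self):
        self._table = {}        # virtual id -> (host, real id)
        self._next_virtual = 1

    def register(self, host, real_id):
        """Assign a new virtual id to an object that currently lives at (host, real_id)."""
        virtual_id = self._next_virtual
        self._next_virtual += 1
        self._table[virtual_id] = (host, real_id)
        return virtual_id

    def to_real(self, virtual_id):
        """Translate a parameter from the virtual into the real name space."""
        return self._table[virtual_id]

    def to_virtual(self, host, real_id):
        """Translate a result from the real back into the virtual name space."""
        for virtual_id, location in self._table.items():
            if location == (host, real_id):
                return virtual_id
        raise KeyError((host, real_id))

    def migrate(self, virtual_id, new_host, new_real_id):
        """After a migration only the real location changes; the virtual id is kept."""
        self._table[virtual_id] = (new_host, new_real_id)


def interposed_kill(names, virtual_pid, signal, real_kill):
    """Hypothetical interposed call: translate the pid, then invoke the real call."""
    host, real_pid = names.to_real(virtual_pid)
    return real_kill(host, real_pid, signal)
```

In this sketch the application keeps using the same virtual id before and after `migrate`, which is the property the text describes as migration transparency.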
The first approach to executing applications in the virtual name space was to link the programs with a modified system library. In this paper, we describe the design and implementation of a separate system call interposition process that accesses the application via the debugging interface. The main advantage of this approach is that it can handle even unmodified (e.g. commercially bought) application programs. We compare measured performance figures with those of previous similar approaches and of the modified system library.
Some amount of data is kept distributed or replicated on some or all nodes of a distributed system. At every moment, each instance that accesses this data must see the same information. Updates must be delivered in order, reliably, and efficiently.
Our prototype software implements ordered, reliable multicasts on top of the unreliable IP broadcast or multicast with three different methods (Master-Slave, Token Exchange on Demand, Totem Single Ring). This paper presents measurements of the efficiency and scalability of the three methods in different topologies.
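
As an illustration of how ordering can be layered on an unreliable transport, the following Python sketch shows a generic central-sequencer scheme in the spirit of a Master-Slave method. It is an assumption-laden toy, not the prototype's protocol: the classes `Sequencer` and `Receiver` and their methods are hypothetical, and retransmission is reduced to reporting which sequence numbers are missing.

```python
class Sequencer:
    """Central master that stamps every update with a global sequence number."""

    def __init__(self):
        self._next_seq = 0

    def stamp(self, payload):
        seq = self._next_seq
        self._next_seq += 1
        return (seq, payload)


class Receiver:
    """Delivers packets strictly in sequence order, buffering early arrivals."""

    def __init__(self):
        self._expected = 0      # next sequence number to deliver
        self._pending = {}      # seq -> payload, received but not yet deliverable
        self.delivered = []

    def on_packet(self, packet):
        seq, payload = packet
        if seq < self._expected or seq in self._pending:
            return              # duplicate, ignore
        self._pending[seq] = payload
        # Deliver every consecutive packet that is now available.
        while self._expected in self._pending:
            self.delivered.append(self._pending.pop(self._expected))
            self._expected += 1

    def missing(self):
        """Sequence numbers that would have to be requested again (gap detection)."""
        if not self._pending:
            return []
        return [s for s in range(self._expected, max(self._pending))
                if s not in self._pending]
```

A receiver that sees packets out of order, or twice, still delivers them in the master's order; a gap shows up in `missing()` until the lost packet arrives.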
The measurements confirm earlier analytical results. Totem behaves well in large networks with many concurrent senders. The overhead of Token on Demand and of the Master-Slave algorithm is almost the same. We also found no evidence for the frequently stated opinion that the Master-Slave approach scales worse because of its central bottleneck.