@JohnKoepi @ g on software engineering

Distributed systems problem - random sequence

This opens a series of problems in a field of distributed systems to be solved to learn or for discussing at interviews. They are useful to reason about and understand fundamentals.


A cluster consists of a set of nodes. Nodes in a cluster communicate by means of messages. Each node sends a messages to all other nodes. Each node is preseeded with some number to setup deterministic sequence of pseudo-random numbers . An arbitrarily taken message contains a pseudo-random number .

When a cluster starts up, the nodes learn about each other. Then during the grace period they are sending messages. After a while cluster receives a signal to stop sending. When a stop signal is received by a node and it is ready it prints a tuple:

consisting of a number of all messages that were send in a grace period, a sequence of all messages that was sent in a grace period in order and a score, where is a set of all messages sent by all nodes in a grace period ordered by sending time, is a size of a set , a is a th message from sent by some node.

If a node did not finish before the time point T from cluster startup it is terminated and considered failed.

Implement a program that works correctly and maximizes output score. Correctness must be hold under different failure scenarios. Describe what can go wrong and how the solution handle failures.

What are the safety and liveness properties of that system would be? How would you test properties of such a program?

blog comments powered by Disqus