    Recovery & view change implementation + bug fixes. · 0cd92291
    Michael Whittaker authored
    This PR implements the view change and recovery portion of IR. It also
    fixes a couple of important bugs and introduces a new mechanism with
    which to unit test IR.
    What's New?
    This PR includes the following, in roughly descending order of
    importance:
    - The main contribution of this PR is the implementation of view changes
      and recovery. The implementation required the following changes:
        - I introduce DoViewChangeMessage and StartViewMessage protobufs as
          well as protobufs for serializing records.
        - IR replicas have logic to periodically initiate a view change or
          initiate a view change upon recovery. They also have logic to
          handle DoViewChangeMessage and StartViewMessage messages.
        - IR replicas persist their view information, which is needed for
          recovery, using PersistentRegisters, which persist data on disk.
        - When a client issues a ProposeInconsistentMessage or
          ProposeConsistentMessage, the reply indicates whether the request
          has already been finalized. If it has, the clients respond
        - For consensus requests, clients make sure that the majority of
          replies and majority of confirms come from the same view.
    - I fixed a slow path/fast path bug in IR clients. Previously, after a
      timeout, clients would transition from the fast path into the slow
      path. But, they would not wait for a majority of responses from
      replicas. Instead, they would call decide on whatever responses they
      had at the moment.
    - Previously, decide functions took in a set of replies, but they needed
      to take in a multiset of replies so that the decide function could do
      things like count the frequency of each reply.
    - I introduced a new simulated transport ReplTransport. It's pretty
      neat. It launches a command line REPL that you can use to manually
      control the execution of the system. You decide what timers to trigger
      and what messages to deliver. Then, you can save your execution as a
      unit test to run later. I used the ReplTransport a lot to debug
      the view change and recovery algorithms since a lot of the corner
      cases are hard to trigger with the existing transport layers. I added
      some ReplTransport unit tests for the lock server.
    - I implemented Sync and Merge for the lock server. The lock server is
      now fully functional.
    - I generalized timeout callbacks to error callbacks which are invoked
      whenever an error (not just a timeout) is encountered. This was
      necessary to nicely handle some of the failure scenarios introduced by
      view changes (e.g. a client gets majority replies and majority
      confirms in different views for a consensus request).
    - I performed some miscellaneous cleanup here and there, fixing
      whitespace, changing raw pointers to smart pointers, stuff like that.
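
    The client-side rules described in the list above (wait for a majority,
    require that the majority of replies and the majority of confirms come
    from the same view, and hand decide a multiset rather than a set) can be
    sketched as follows. This is a minimal illustration in Python, not the
    code in this PR: the names majority, same_view_quorum, decide, and
    finish_consensus are hypothetical, messages are modeled as plain tuples,
    and the "most frequent reply wins" policy is only an assumed example of
    what an application-defined decide function might do.

```python
from collections import Counter


def majority(n):
    """Smallest number of replicas that forms a majority of n."""
    return n // 2 + 1


def same_view_quorum(messages, n):
    """Return the highest view in which a majority of the n replicas
    responded, or None if no view has a majority yet.

    messages: iterable of (replica_id, view) pairs (assumed shape).
    """
    replicas_by_view = {}
    for replica_id, view in messages:
        # Use a set so that a retransmission from one replica counts once.
        replicas_by_view.setdefault(view, set()).add(replica_id)
    for view in sorted(replicas_by_view, reverse=True):
        if len(replicas_by_view[view]) >= majority(n):
            return view
    return None


def decide(replies):
    """Hypothetical decide function: return the most frequent reply.

    replies is a Counter used as a multiset, so frequencies survive.
    A plain set would collapse duplicate replies and lose the counts.
    """
    result, _count = replies.most_common(1)[0]
    return result


def finish_consensus(replies, confirms, n):
    """Decide the outcome of one consensus request on the client.

    replies:  list of (replica_id, view, result) tuples
    confirms: list of (replica_id, view) tuples
    Returns ("wait", None) until both majorities exist, ("error", None)
    if the two majorities are in different views (the case where the
    error callback would fire), and ("ok", result) otherwise.
    """
    reply_view = same_view_quorum([(r, v) for r, v, _ in replies], n)
    confirm_view = same_view_quorum(confirms, n)
    if reply_view is None or confirm_view is None:
        return ("wait", None)
    if reply_view != confirm_view:
        return ("error", None)
    results = Counter(res for _, v, res in replies if v == reply_view)
    return ("ok", decide(results))
```

    With three replicas, two "ok" replies and one "fail" reply in view 1
    plus two confirms in view 1 yield ("ok", "ok"); if the confirms arrive
    in view 2 instead, the sketch reports ("error", None) rather than
    deciding on mismatched views.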
    What's Left?
    There are still some things left that this PR doesn't implement:
    - Currently during a view change, replicas send their entire records to
      one another. I'm guessing there are more efficient ways to transfer
      records between replicas, similar to how VR has some optimization
      tricks for log shipping.
    - I have not implemented Sync or Merge for TAPIR, only for the lock
      server.
    - Clients do not notify replicas of view changes when they receive
      replies from older views. Similarly, a replica never detects that it
      is in a stale view or requests a master record from a replica in a
      higher view. It just eventually does a view change to stay up-to-date.
    - None of the code has been profiled or optimized or anything like that.
      This PR focuses only on correctness, not performance.
    - There could be more unit tests.