Lockstep Implementation in Unity3D – Part 2

Overview

In the previous implementation of the lockstep model the game frame rate and the length of the communication turn (referred to as the lockstep turn here) were set at a fixed interval. In real scenarios latency and performance will vary. This update to the original model will keep track of two metrics. The first is the amount of time it takes to communicate with the other players. The second is run-time performance of the game frame method.

Rolling Averages

To handle the fluctuations in latency we want to quickly increase the amount of time of a lockstep turn when the latency increases, while we want to slowly adjust for better latency. This will make the game play feel smoother as the pace at which the game updates will be more stable than constantly adjusting. We also do not want to have to keep track of all of the past measurements of latency for the past lockstep turns. We can sum up all of the past information in a “rolling average” and adjust it by some weight for future values.

When ever a new value is larger than the current average, we will set the average to the new value. This will give use the behavior of quickly raising to the increase in latency. When a value is smaller than the current average, we will trust it’s new value by some weight w, so we have the formula:
newAverage = currentAverage * (1 – w) + newValue * ( w)
Where 0 < w < 1 In my implementation I set w = 0.1. I also keep track of the average per player, and always use the maximum average among the players. Here is the method for adding a new value: [code lang="csharp"] public void Add(int newValue, int playerID) { if(newValue > playerAverages[playerID]) { //rise quickly playerAverages[playerID] = newValue; } else { //slowly fall down playerAverages[playerID] = (playerAverages[playerID] * (9) + newValue * (1)) / 10; } } [/code] In order to maintain determinism the calculation is done using only integers. So the formula is adjusted like the following: newAverage = (currentAverage * (10 - w) + newValue * ( w)) / 10 Where 0 < w < 10 And in my case, w = 1. Runtime Average

The time it takes to execute each game frame is tracked to be used for the runtime average. If the game frames start taking longer, then we need to decrease the number of game frames per lockstep turn. On the other hand if the game frames start executing faster, we can have more game frames per lockstep. For each lock step turn, the longest running game frame is used to be added to the average. The first game frame of each lockstep turn also includes the amount of time it takes to process the action for that lockstep turn. A Stopwatch is used to take the measure of lapsed time.

private void ProcessActions() {
    //process action should be considered in runtime performance
    gameTurnSW.Start ();

    ...

    //finished processing actions for this turn, stop the stopwatch
    gameTurnSW.Stop ();
}

private void GameFrameTurn() {
   ...
		
    //start the stop watch to determine game frame runtime performance
    gameTurnSW.Start();

    //update game
    ...

    GameFrame++;
    if(GameFrame == GameFramesPerLockstepTurn) {
        GameFrame = 0;
    }

    //stop the stop watch, the gameframe turn is over
    gameTurnSW.Stop ();
    //update only if it's larger - we will use the game frame that took the longest in this lockstep turn
    long runtime = Convert.ToInt32 ((Time.deltaTime * 1000))/*deltaTime is in secounds, convert to milliseconds*/ + gameTurnSW.ElapsedMilliseconds;
    if(runtime > currentGameFrameRuntime) {
        currentGameFrameRuntime = runtime;
    }

    //clear for the next frame
    gameTurnSW.Reset();
}

Note that we also include Time.deltaTime. Including this may have some overlap with the last frame if gameframe is running at the same rate as the Update() method. However we need to include it so that rendering and other things that Unity does for us is considered in the measurement. The potential overlap is acceptable as it would just give us a bigger buffer.

Network Average

What to use as the network average was not as clear to me. I ended up using a Stopwatch that ran from the time a player sends an action to the time they receive the final confirmation of the action. This lockstep model sends an action to be processed for two turns in the future. To increment the lockstep turn we ensure that all players have confirmed the action we are about to process. Due to this setup, we could potentially have two actions that are waiting for all of their confirmations. To deal with this, two Stopwatches are used. One for the current action and one for the prior action. This is encapsulated in the ConfirmActions class. When the lockstep turn is incremented, the prior Stopwatch becomes the current stop watch, and the old “current stop watch” is cleared and reused as the new “prior stop watch”.

public class ConfirmedActions
{
...
    public void NextTurn() {
        ...
        Stopwatch swapSW = priorSW;
    		
        //last turns actions is now this turns prior actions
        ...
        priorSW = currentSW;
    	
        //set this turns confirmation actions to the empty array
        ...
        currentSW = swapSW;
        currentSW.Reset ();
    }
}

Whenever a confirmation comes in, we check if we have received all confirmations, and if so stop the respected Stopwatch.

    public void ConfirmAction(int confirmingPlayerID, int currentLockStepTurn, int confirmedActionLockStepTurn) {
		if(confirmedActionLockStepTurn == currentLockStepTurn) {
			//if current turn, add to the current Turn Confirmation
			confirmedCurrent[confirmingPlayerID] = true;
			confirmedCurrentCount++;
			//if we recieved the last confirmation, stop timer
			//this gives us the length of the longest roundtrip message
			if(confirmedCurrentCount == lsm.numberOfPlayers) {
				currentSW.Stop ();
			}
		} else if(confirmedActionLockStepTurn == currentLockStepTurn -1) {
			//if confirmation for prior turn, add to the prior turn confirmation
			confirmedPrior[confirmingPlayerID] = true;
			confirmedPriorCount++;
			//if we recieved the last confirmation, stop timer
			//this gives us the length of the longest roundtrip message
			if(confirmedPriorCount == lsm.numberOfPlayers) {
				priorSW.Stop ();
			}
		} else {
			//TODO: Error Handling
			log.Debug ("WARNING!!!! Unexpected lockstepID Confirmed : " + confirmedActionLockStepTurn + " from player: " + confirmingPlayerID);
		}
	}

Sending the averages

To send the averages experienced on one client to the rest, the Action interface was changed to an abstract class that has two fields.

[Serializable]
public abstract class Action
{
	public int NetworkAverage { get; set; }
	public int RuntimeAverage { get; set; }

	public virtual void ProcessAction() {}
}

When processing the action, these numbers are added to the running average. Then the Lockstep turn and game frame turn is updated

    private void UpdateGameFrameRate() {
		//log.Debug ("Runtime Average is " + runtimeAverage.GetMax ());
		//log.Debug ("Network Average is " + networkAverage.GetMax ());
		LockstepTurnLength = (networkAverage.GetMax () * 2/*two round trips*/) + 1/*minimum of 1 ms*/;
		GameFrameTurnLength = runtimeAverage.GetMax ();

		//lockstep turn has to be at least as long as one game frame
		if(GameFrameTurnLength > LockstepTurnLength) {
			LockstepTurnLength = GameFrameTurnLength;
		}

		GameFramesPerLockstepTurn = LockstepTurnLength / GameFrameTurnLength;
		//if gameframe turn length does not evenly divide the lockstep turn, there is extra time left after the last
		//game frame. Add one to the game frame turn length so it will consume it and recalculate the Lockstep turn length
		if(LockstepTurnLength % GameFrameTurnLength > 0) {
			GameFrameTurnLength++;
			LockstepTurnLength = GameFramesPerLockstepTurn * GameFrameTurnLength;
		}

		LockstepsPerSecond = (1000 / LockstepTurnLength);
		if(LockstepsPerSecond == 0) { LockstepsPerSecond = 1; } //minimum per second

		GameFramesPerSecond = LockstepsPerSecond * GameFramesPerLockstepTurn;

		PerformanceLog.LogGameFrameRate(LockStepTurnID, networkAverage, runtimeAverage, GameFramesPerSecond, LockstepsPerSecond, GameFramesPerLockstepTurn);
	}

Update: Single Player Support

Since the posting of this article a small update has been made to support single player mode. Special thanks to Dan at redstinggames.com for figuring this out. You can see the modifications here:
Single Player Update diff

Source Code

Source code on bitbucket – Dynamic Lockstep Sample

Simple 2D Deterministic Physics Simulation

Motivation

When using a lockstep approach to networking we need to guarantee that each client will stay in sync with each other. One possible area of non-determinism can be the physics simulation. The physics simulation in Unity is not deterministic across multiple platforms. The problem is due to the fact that floating points can be handled differently on different machines depending on the compiler settings and optimizations. This gave me the motivation to implement my own custom physics engine using integer math only.

Unity3D uses NVIDIA’s PhysX engine. It mentions about the usage of Lockstep, Determinism and this engine in the following link:
PhysX Knowledge Base

In the Real Time Strategy game I am developing, all of the collisions will only occur on the x-z plane, so I only need a 2D physics simulation. Furthermore, the objects can all be represented with either a Circle or a Rectangle to keep the simulation simple. Note that the game will still be rendered in 3D and the physics will only be ran in 2D “behind the scenes”.

Overview

Collisions are resolved by using Impulse Resolution. My implementation of this is based on the following article:

How to Create a Custom 2D Physics Engine: The Basics and Impulse Resolution

I also used a QuadTree Data Structure to limit the number of collisions that have to be checked each frame. This is achieved in two phases. The first phase known as the “Broad Phase” will compare all objects within the same region in the QuadTree to each other using an axis aligned bounding box (AABB) test (for the rectangle objects it just uses the rectangle, for circles a rectangle is used that the circle would fit in). The AABB test is very quick compared to checking against a circle. This check gives us a list of potential collisions. Next we iterate through the potential collisions, determine if they did indeed collide, and if they did perform the impulse resolution to them as described in the above article. This phase is called the “Narrow Phase”.

The Game World Scale
In order to have more accuracy but maintain integers, the Physics Engine has a Game World Scale. This is used to scale up floating points. The Game World Scale is set at 100. This will give us accuracy up to two decimal points in Unity. This can be adjusted simply by changing the Game World Scale.

    public class PhysicsEngine
    {
        public const int GameWorldScale = 100;
        ...
    }

The suffix GWN (game world number) is added to some variables in the PhysicsObject to denote that it is in the Game World Scale.

The Physics Object

The physics object represents the object in our 2D world. It has a collider that is either a Rectangle or a Circle. The collider is used for the physics calculations. It also has a Bound object that is always a Rectangle and is the AABB of the object. This is used during the Broad Phase. The Physics object also has a Velocity, Mass, and Restitution.

The Physics object also allows you to attach one of any other object to it. This is useful to retrieve references to other objects in Unity when doing collision checks (for instance to determine if you ran into a building or a unit).

        object attached;

        public void Attach<T>(T t) where T:class
        {
            attached = t;
        }

        public T GetAttached<T>() where T:class
        {
            return attached as T;
        }

The Broad Phase

A QuadTree is a datastructure that can be used to recursively subdivide a region. Each region is a rectangle. The QuadTree will initially start out representing the entire area. As Objects are added to the QuadTree, it will divide itself into four sub QuadTrees when it reaches a certain limit and then move the objects to the sub QuadTree the object belongs in. The amount of objects to split on is a setting that is set when initializing the QuadTree. The QuadTree should also be limited to how many times it can split. This is because we want the smallest QuadTree region to still be larger than the area of our largest object. This is another setting that is set during initialization. Once the QuadTree reaches it’s maximum subdivisions, it will no longer split based on the maximum object setting. Instead it will just add it to the existing regions. In my implementation, if an objects region spans across multiple QuadTree regions, it will be added to each. So each object in the worst case can be in four QuadTrees (with the assumption made earlier that the smallest QuadTree region is still larger than our largest object region). When ever an object moves it is removed from the QuadTree, and then added back with it’s new location. Using this approach keeps us from having to rebuild the QuadTree every frame.

During the broad phase of the Physics Update, we iterate through each QuadTree region and find all potential collisions. Since an object can be in multiple QuadTree regions, it is possible that the collision will show up multiple times. To avoid duplicate collisions, I first defined a BraodPhaseCollision object that will hold the two objects that collided. I then override the GetHashCode() method and implemented it in a way that the hash code will be the same regardless of the order the objects are stored in the BroadPhaseCollision object. I also did the same for the Equals() method. Next I used a HashSet to hold all of the BroadPhaseCollision objects that where found. The HashSet data structure guarantees that an object will only be added once even if you call Add multiple times with the same object. Using this method a collision between [Object A and Object B] and a collision between [Object B and Object A] will be considered the same.

    internal class CollisionBroadPhase
    {
        public PhysicsObject A { get; private set; }
        public PhysicsObject B { get; private set; }

        private int hash;

        internal CollisionBroadPhase(PhysicsObject a, PhysicsObject b)
        {
            A = a;
            B = b;

            //make sure objects are always in the same order when calculating hash
            if (A.CompareTo(B) < 0)
            {
                hash = new Tuple<PhysicsObject, PhysicsObject>(A, B).GetHashCode();
            }
            else
            {
                hash = new Tuple<PhysicsObject, PhysicsObject>(B, A).GetHashCode();
            }
        }

        public override int GetHashCode()
        {
            return hash;
        }

        public override bool Equals(object obj)
        {
            if (obj is CollisionBroadPhase)
            {
                CollisionBroadPhase other = (CollisionBroadPhase)obj;
                if((A.Equals(other.A) && B.Equals(other.B)) || (A.Equals(other.B) && B.Equals(other.A))) {
                    return true;
                }
            }

            return false;
        }
    }

The Narrow Phase

At this point we will have a set of BroadPhaseCollision objects that are potentially colliding. Next we iterate through those objects and test if they have an actual collision. We have three different collision types:

  • Rectangle vs. Rectangle
  • Circle vs. Circle
  • Rectangle vs. Circle

Each collision will be handled slightly different. For more details please see the link mentioned in the overview. Each object has a Velocity (a 2D Vector) and a “restitution” to determines the “bouncy-ness” of the object. Additionally each object will also have a “Mass”. A Mass of 0 will indicate the object cannot be moved.

The main physics loop

After the narrow phase, each object’s Velocity will be updated based on the collision. The main physics loop will then iterate through all objects and update their location based on their Velocity. Velocity is in Units per Second, so the velocity will be updated based on the passed in Frames per Second rate.

Then each objects velocity will be reduced towards 0 based on a defined “friction” amount. The friction is applied after the collisions so that it does not have to overcome friction to separate the two objects. In this implementation we only have friction between the objects and the x-z plane. There is no friction calculations between each object.

        public void Update(int fixedFramesPerSecond)
        {

            OrderedSet<CollisionBroadPhase> broadCollisions = broadPhaseCollisionDetector.GetAllCollisions();

            foreach (CollisionBroadPhase broadCollision in broadCollisions)
            {
                CollisionResolver.ResolveCollision(broadCollision);
            }

            foreach (PhysicsObject physicsObject in PhysicsObjects)
            {
                if (physicsObject.MassGWN > 0 && physicsObject.Velocity != Vector2Int.Zero)
                {
                    //remove from quad tree as it's position is about to change
                    broadPhaseCollisionDetector.Remove(physicsObject);

                    physicsObject.Center = physicsObject.Center + (physicsObject.Velocity / fixedFramesPerSecond);

                    //apply friction
                    physicsObject.Velocity = physicsObject.Velocity.MoveTowardsZero(Friction / fixedFramesPerSecond);

                    broadPhaseCollisionDetector.Insert(physicsObject);
                }
                
            }

            
        }

Source Code
Source Code on Bit Bucket

Generic OrderedDictionary implemented in C# DotNet

The Dictionary class in .NET does not have a defined ordering when iterating through it. It can seem to maintain order until you start to remove objects from the collection. In order to maintain determinism, when ever we loop through a collection it needs to be in a deterministic order. There does already exist a OrderedDictionary, however there is not a Generic version.

This implementation uses two underlying collections. A Dictionary and a LinkedList. The Dictionary maintains a mapping from the Key to a LinkedListNode in the LinkedList. The LinkedList maintains a list of KeyValuePairs and keeps them in insertion order. A LinkedList is used instead of a List as it provides constant time removal (assuming you have a reference to a node).

private Dictionary<TKey, LinkedListNode<KeyValuePair<TKey, TValue>>> mDictionary;
private LinkedList<KeyValuePair<TKey, TValue>> mLinkedList;

This implementation was inspired by the OrderedHashSet found at:

HASHSET THAT PRESERVES INSERTION ORDER OR .NET IMPLEMENTATION OF LINKEDHASHSET

The source code can be found at:

Generic OrderedDictionary on BitBucket

Lockstep Implementation in Unity3D

In the lockstep network model each client application runs the entire game simulation. The benefits of this approach is that it reduces the amount of information that needs to be sent over the wire. Only the user inputs need to be sent to each other, as opposed to an authoritative server model where for example every units position will need to be sent as often as possible.

For example imagine you want to move a character in the game world. In an authoritative model, the physics simulation will be ran only on the server. The client will communicate to the server where the character should move to. The server will perform the path finding and start to move the character. The server will then send to each client the characters position as frequently as possible so that the clients will have the most updated location. This must be done for every character in the game world. In a real time strategy game you can have thousands of units so the authoritative model would not be feasible.

In a lockstep model, once the user decides to move a character, that is communicated to every client. Every client will then perform path finding and update the characters position. Only the first communication will need to be sent over the internet as every client will have their own physics simulation and can update the characters position themselves.

This model does present some challenges. Each client must run their simulation in sync with each other. This means for example, that the physics simulation must perform the exact same number of updates and in the same order for each action. If this was not the case, then one client could get ahead or behind of the others and when a new command is sent, the path would be different for the client that got ahead or behind. These differences would build up and very different games would be played for each client.

Another issue is determinism across different machines and platforms. Small differences in calculations can cause butterfly effects building up to very different games. This issue will be covered in more detail in a future article.

The implementation presented here is inspired by the 1500 Archers article. Each players command is processed two turns in the future. Having a delay between when the action is sent and when it is processed helps the game from seeming choppy and laggy. This implementation also leaves room for us to add dynamic controls to adjust the turn length based on latency and machine performance. This part is not covered here and will come in a future article.

For this implementation we have the following definitions:

Lockstep turn
A lockstep turn will be made up of multiple game turns. One action per player will be processed in one lockstep turn. The length of the lockstep turn will be adjusted based on performance. At this time it is just hard coded as 200ms.
Game turn
A game turn is when the game logic and physics simulation will be updated. The number of game turns per lockstep turn will be adjusted based on performance. At this time it is hard coded to 50ms, or 4 per lockstep. This means there would be 20 game turns per second.
Action
An action is a command issued by a player. For example select units in a specified area, or move the selected units to the target location.

Note: we will not be using unity3d’s physics engine. A custom engine that is deterministic will be used. It’s implementation will be covered in a future article.

The main game loop

Unity3d’s loop is ran in a single thread. There are two functions that can be implemented to insert our custom code:

  • Update()
  • FixedUpdate()

Unity3d’s main loop will call Update() once per iteration. This will happen as fast as possible, or will attempt to run at a specific fps rate depending on the settings. FixedUpdate() will be ran so that it is called a predictable number of times per second depending on the settings. During the main loops iteration, it can be called zero or multiple times depending on how long the last iteration took. FixedUpdate() has the behavior we want as we need our code to run an exact number of times per lockstep turn. However, FixedUpdate() rate can only be set before run time. We want to adjust our game frame rate based on performance.

The Game Frame Turn

This implementation has similar logic as the FixedUpdate() inside our Update() function. The main difference is that we can adjust it’s rate. This is achieved by having an “accumulative” time. The time of the last iteration will be added to it each Update() call. This is the Time.deltaTime. If the accumulative time is greater than our fixed game frame rate (50ms) then we will make a gameframe() call. We will continue to call gameframe() and subtract from accumulative time by 50ms until accumulative time is less than 50ms.

	private float AccumilatedTime = 0f;

	private float FrameLength = 0.05f; //50 miliseconds

	//called once per unity frame
	public void Update() {
		//Basically same logic as FixedUpdate, but we can scale it by adjusting FrameLength
		AccumilatedTime = AccumilatedTime + Time.deltaTime;

		//in case the FPS is too slow, we may need to update the game multiple times a frame
		while(AccumilatedTime > FrameLength) {
			GameFrameTurn ();
			AccumilatedTime = AccumilatedTime - FrameLength;
		}
	}

We keep track of the number of game frames for the current lockstep turn. When we reach the number of game frames per lockstep turn, we update the lockstep turn on the next game frame. If the lockstep is not ready to advance to the next turn then we will not increment the game frame, and we will perform the lockstep check again on the next frame.

	private void GameFrameTurn() {
		//first frame is used to process actions
		if(GameFrame == 0) {
			if(LockStepTurn()) {
				GameFrame++;
			}
		} else {
			//update game

			//...
			
			GameFrame++;
			if(GameFrame == GameFramesPerLocksetpTurn) {
				GameFrame = 0;
			}
		}
	}

During the game frame turn, the physics simulation is updated and our game logic is updated. Game logic is added by implementing an Interface (IHasGameFrame) and adding that object to a collection that we can iterate through.

	private void GameFrameTurn() {
		//first frame is used to process actions
		if(GameFrame == 0) {
			if(LockStepTurn()) {
				GameFrame++;
			}
		} else {
			//update game
			SceneManager.Manager.TwoDPhysics.Update (GameFramesPerSecond);
			
			List<IHasGameFrame> finished = new List<IHasGameFrame>();
			foreach(IHasGameFrame obj in SceneManager.Manager.GameFrameObjects) {
				obj.GameFrameTurn(GameFramesPerSecond);
				if(obj.Finished) {
					finished.Add (obj);
				}
			}
			
			foreach(IHasGameFrame obj in finished) {
				SceneManager.Manager.GameFrameObjects.Remove (obj);
			}
			
			GameFrame++;
			if(GameFrame == GameFramesPerLocksetpTurn) {
				GameFrame = 0;
			}
		}
	}

The IHasGameFrame interface has a method called GameFrameTurn that takes as an argument the current number of game frames per second. An implementing object that has game logic should base any calculations on the GameFramesPerSecond. For instance if one unit is attacking another and it has an attack rate of 10 damage per second, you would apply the damage by dividing it by GameFramesPerSecond. The GameFramesPerSecond will be adjusted based on performance.

The IHasGameFrame interface also has a property to indicate when it is finished. This allows the implementing object to inform the main game frame loop it is no longer needed. An example of this would be an object that follows a path and once it reached it’s destination, the object is no longer needed.

The Lockstep Turn

In order to stay in sync with the other clients, each lockstep turn we must ask the following questions:

  • Did we recieve every client’s action for the next turn?
  • Did every client confirm that they recieved our action?

We have two objects, ConfirmedActions and PendingActions. Each of these have a collection for each possible message they may receive. We check both of these objects if we are ready to advance to the next turn.

	private bool NextTurn() {		
		if(confirmedActions.ReadyForNextTurn() && pendingActions.ReadyForNextTurn()) {
			//increment the turn ID
			LockStepTurnID++;
			//move the confirmed actions to next turn
			confirmedActions.NextTurn();
			//move the pending actions to this turn
			pendingActions.NextTurn();
			
			return true;
		}
		
		return false;
	}

Actions

Actions or commands are communicated by implementing the IAction interface. This has one method with no arguments called ProcessAction(). The class must also be Serializable. This means any fields of the object should also be Serializable. When a user interacts with the UI, an instance of the action is created, and sent to our lockstep manager in a queue. The queue is used in case the game is too slow and the user is able to send more than one command within a single lockstep turn. Only one command will be sent at a time, but none of them will be ignored.

When sending the action to the other players, the action instance will be serialized to an array of bytes, and then deserialized by those players. A default “NoAction” object will be sent when the user did not perform any actions that turn. The other actions will be specific to the game logic. Here is an example of an action to create a new unit:

using System;
using UnityEngine;

[Serializable]
public class CreateUnit : IAction
{
	int owningPlayer;
	int buildingID;
	
	public CreateUnit (int owningPlayer, int buildingID) {
		this.owningPlayer = owningPlayer;
		this.buildingID = buildingID;
	}
	
	public void ProcessAction() {
		Building b = SceneManager.Manager.GamePieceManager.GetBuilding(owningPlayer, buildingID);
		b.SpawnUnit();
	}
}

This action depends on the SceneManager having a static reference. If you do not like this implementation, the IAction interface could be modified so that the ProcessAction() receives an instance of a SceneManager.

The Sample Source can be found at:
Bitbucket – Sample Lockstep

Part 2