Overview
Multiplayer games based on the Source
engine use a Client/Server networking
architecture. Usually a server is a
dedicated host that runs the game and is
authoritative about world simulation, game
rules, and player input processing. A client
is a player's computer connected to a game
server. The client and server communicate
with each other by sending small data
packets at a high frequency (usually 20 to
30 packets per second). A client receives
the current world state from the server and
generates video and audio output based on
these updates. The client also samples data
from input devices (keyboard, mouse,
microphone, etc.) and sends these input
samples back to the server for further
processing. Clients communicate only with
the game server, not with each other
(as they would in a peer-to-peer application). In
contrast with a single player game, a
multiplayer game has to deal with a variety
of new problems caused by packet-based
communication.
Network bandwidth is limited, so the
server can't send a new update packet to all
clients for every single world change.
Instead, the server takes snapshots of the
current world state at a constant rate and
broadcasts these snapshots to the clients.
Network packets take a certain amount of
time to travel between the client and the
server (i.e. the ping time). This means that
the client time is always a little bit
behind the server time. Furthermore, client
input packets are also delayed on their way
back, so the server is processing temporally
delayed user commands. In addition, each
client has a different network delay which
varies over time due to other background
traffic and the client's framerate. These
time differences between server and client
cause logical problems that grow worse as
network latency increases. In fast-paced
action games, even a delay of a few
milliseconds can cause a laggy gameplay
feeling and make it hard to hit other
players or interact with moving objects.
Besides bandwidth limitations and network
latencies, information can get lost due to
network packet loss.

To cope with the issues introduced
by network communication, the Source engine
uses several techniques that solve these
problems, or at least make them less visible
to the player. These techniques include data
compression, interpolation, prediction, and
lag compensation. These techniques are
tightly coupled, and changes made within one
system may affect other systems. This
document describes the general functionality
of these systems and how they work together.
Basic Networking
The server simulates the game in discrete
time steps called ticks. By
default, 66 ticks per second are simulated,
but mods can specify their own tickrate. For
example, Counter-Strike: Source uses a lower
tickrate of 33 ticks/second to reduce the
server CPU load. During each tick, the
server processes incoming user commands,
runs a physical simulation step, checks the
game rules, and updates all object states.
After simulating a tick, the server decides
if any client needs a world update and takes
a snapshot of the current world state if
necessary. A higher tickrate increases the
simulation precision, but also requires more
CPU power and available bandwidth on both
server and client. The server admin may
override the default tickrate with the
-tickrate command
line parameter, though tickrate changes done
this way are not recommended because the mod
may not work as designed if its tickrate is
changed.
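The tick model above can be sketched as a fixed-timestep loop. This is a minimal illustration under simplifying assumptions, not engine code: `tick_interval`, `server_time`, and `simulate` are hypothetical names, and the per-tick work is reduced to a counter.

```python
def tick_interval(tickrate: int) -> float:
    """Length of one simulation tick in seconds."""
    return 1.0 / tickrate


def server_time(tick: int, tickrate: int) -> float:
    """Simulated time reached after the given number of ticks."""
    return tick * tick_interval(tickrate)


def simulate(tickrate: int, seconds: int) -> int:
    """Run whole ticks for the given number of seconds; return the tick count."""
    tick = 0
    for _ in range(tickrate * seconds):
        # A real server would process incoming user commands, run a physics
        # step, check the game rules, and update object states here.
        tick += 1
    return tick
```

At 66 ticks per second each tick covers roughly 15 milliseconds; at 33 ticks per second, roughly 30 milliseconds.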
Clients usually have only a limited
amount of available bandwidth. In the worst
case, players with a modem connection can't
receive more than 5 to 7 KB/sec. If the
server tried to send them updates at
a higher data rate, packet loss would be
unavoidable. Therefore, the client has to
tell the server its incoming bandwidth
capacity by setting the console variable
rate (in
bytes/second). This is the most important
network variable for clients and it has to
be set correctly for an optimal gameplay
experience. The client can request a certain
snapshot rate by changing
cl_updaterate
(default 20), but the server will never send
more updates than simulated ticks or exceed
the requested client
rate limit. Server admins can limit
data rate values requested by clients with
sv_minrate and
sv_maxrate (both
in bytes/second). The snapshot rate can also
be restricted with
sv_minupdaterate and
sv_maxupdaterate
(both in snapshots/second).
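One way to read the limits above is as a series of clamps. The sketch below is an assumption about how the values combine, since the text does not spell out the exact order the engine applies them in. The function names are hypothetical, and special values a real server may treat differently are ignored.

```python
def clamp(value, low, high):
    """Restrict value to the inclusive range [low, high]."""
    return max(low, min(value, high))


def effective_snapshot_rate(cl_updaterate, sv_minupdaterate,
                            sv_maxupdaterate, tickrate):
    """Snapshots per second the server would send to this client."""
    requested = clamp(cl_updaterate, sv_minupdaterate, sv_maxupdaterate)
    # The server never sends more updates than it simulates ticks.
    return min(requested, tickrate)


def effective_data_rate(rate, sv_minrate, sv_maxrate):
    """Bytes per second after applying the server-side rate limits."""
    return clamp(rate, sv_minrate, sv_maxrate)


def snapshot_byte_budget(data_rate, snapshots_per_second):
    """Average bytes available per snapshot at the given rates."""
    return data_rate / snapshots_per_second
```

For instance, a client on a 33-tick server that requests 100 updates per second would get at most 33 under this reading, and a 10000 bytes/second rate at 20 snapshots per second leaves about 500 bytes per snapshot on average.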
The client creates user commands
by sampling input devices at the same
tick rate that the server is running at. A
user command is basically a snapshot of the
current keyboard and mouse state. But
instead of sending a new packet to the
server for each user command, the client
sends command packets at a certain rate of
packets per second (usually 30). This means
two or more user commands are transmitted
within the same packet. Clients can increase
the command rate with
cl_cmdrate. This will increase
responsiveness but requires more outgoing
bandwidth, too.
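The batching of user commands into packets can be illustrated as follows. This is a sketch only: `batch_commands` is a hypothetical helper, and the integer scheduling rule is an assumption chosen to spread packets evenly across ticks, not the engine's actual send logic.

```python
def batch_commands(commands, tickrate, cmdrate):
    """Group one-command-per-tick input into packets sent cmdrate times a second."""
    packets = []
    pending = []
    for tick, command in enumerate(commands):
        pending.append(command)
        # A packet is due whenever the number of packets owed so far increases.
        # Integer arithmetic avoids floating-point drift.
        if (tick + 1) * cmdrate // tickrate > tick * cmdrate // tickrate:
            packets.append(pending)
            pending = []
    if pending:
        # Flush commands still waiting when the input ends.
        packets.append(pending)
    return packets


# One second of input at 66 ticks, sent in 30 packets.
packets = batch_commands(list(range(66)), tickrate=66, cmdrate=30)
```

With 66 ticks and 30 packets per second, each packet carries two or three user commands.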
Game data is compressed using delta
compression to reduce network load.
That means the server doesn't send a full
world snapshot each time, but rather only
changes (a delta snapshot) that happened
since the last acknowledged update. With
each packet sent between the client and
server, acknowledge numbers are attached to
keep track of their data flow. Usually full
(non-delta) snapshots are only sent when a
game starts or a client suffers from heavy
packet loss for a couple of seconds. Clients
can request a full snapshot manually with
the cl_fullupdate
command.
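The idea behind delta snapshots can be shown with plain dictionaries. This is a conceptual sketch, assuming a world state keyed by a hypothetical entity id; the engine's actual wire encoding is not described here and is certainly more compact.

```python
def make_delta(baseline, current):
    """Return only what changed since the last acknowledged snapshot."""
    changed = {
        key: value
        for key, value in current.items()
        if key not in baseline or baseline[key] != value
    }
    removed = [key for key in baseline if key not in current]
    return {"changed": changed, "removed": removed}


def apply_delta(baseline, delta):
    """Rebuild the full state on the client from a baseline and a delta."""
    state = dict(baseline)
    state.update(delta["changed"])
    for key in delta["removed"]:
        state.pop(key, None)
    return state


# Hypothetical entity ids mapped to (x, y) positions.
acked = {1: (0, 0), 2: (5, 5)}
now = {1: (0, 0), 2: (6, 5), 3: (1, 1)}
delta = make_delta(acked, now)
```

Entity 1 did not move, so it does not appear in the delta at all. If the client never acknowledged the baseline, the delta cannot be applied, which is why a full snapshot is needed after heavy packet loss.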
Responsiveness, or the time between user
input and its visible feedback in the game
world, is determined by many factors,
including the server/client CPU load,
simulation tickrate, data rate and snapshot
update settings, but mostly by the network
packet traveling time. The time between the
client sending a user command, the server
responding to it, and the client receiving
the server's response is called the
latency or ping (or round trip
time). Low latency is a significant
advantage when playing a multiplayer online
game. Techniques like prediction and lag
compensation try to minimize that advantage
and allow a fair game for players with
slower connections. Tweaking networking
settings can improve the experience
if the necessary bandwidth and CPU power are
available. We recommend keeping the default
settings, since improper changes may cause
more negative side effects than actual
benefits.
Entity Interpolation
By default, the client receives about 20
snapshots per second. If the objects
(entities) in the world were only rendered
at the positions received from the server,
moving objects and animation would look
choppy and jittery. Dropped packets would
also cause noticeable glitches. The trick to
solve this problem is to go back in time for
rendering, so positions and animations can
be continuously interpolated between two
recently received snapshots. This technique
is called client side entity interpolation
and is enabled by default with
cl_interpolate 1.
With 20 snapshots per second, a new update
arrives about every 50 milliseconds. If the
client render time is shifted back by 50
milliseconds, entities can always be
interpolated between the last received
snapshot and the snapshot before that. The
Source engine does the entity interpolation
with a 100-millisecond delay (cl_interp
0.1). This way, even if one snapshot
is lost, there are always two valid
snapshots to interpolate between. Take a
look at the following figure showing the
arrival times of incoming world snapshots:

The last snapshot received on the client
was at tick 344 or 10.30 seconds. The client
time continues to increase based on this
snapshot and the client frame rate. If a new
video frame is rendered, the rendering time
is the current client time 10.32 minus the
view interpolation delay of 0.1 seconds.
This would be 10.22 in our example and all
entities and their animations are
interpolated using the correct fraction
between snapshots 340 and 342.
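The arithmetic of this example can be sketched in a few lines. The timestamps for snapshots 340 and 342 (10.20 and 10.25) are assumptions derived from the 50-millisecond snapshot spacing, and the positions are made-up values on a single axis. `lerp` and `interpolate` are hypothetical helper names.

```python
def lerp(a: float, b: float, fraction: float) -> float:
    """Linear interpolation between a and b."""
    return a + (b - a) * fraction


def interpolate(snapshots, render_time):
    """Interpolate a position from (time, position) pairs sorted by time.

    Returns None when render_time is not bracketed by two snapshots.
    """
    for (t0, p0), (t1, p1) in zip(snapshots, snapshots[1:]):
        if t0 <= render_time <= t1:
            fraction = (render_time - t0) / (t1 - t0)
            return lerp(p0, p1, fraction)
    return None


# Snapshots 340, 342, and 344 with hypothetical positions.
snapshots = [(10.20, 0.0), (10.25, 10.0), (10.30, 20.0)]
client_time = 10.32
cl_interp = 0.1
render_time = client_time - cl_interp  # about 10.22
position = interpolate(snapshots, render_time)
```

Here the render time falls about 40% of the way from snapshot 340 to snapshot 342, so the entity is drawn about 40% of the way between the two positions.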
Since we have an interpolation delay of
100 milliseconds, the interpolation would
even work if snapshot 342 were missing due
to packet loss. Then the interpolation could
use snapshots 340 and 344. If more than one
snapshot in a row is dropped, interpolation
can't work perfectly because it runs out of
snapshots in the history buffer. In that
case the renderer uses extrapolation (cl_extrapolate
1) and tries a simple linear
extrapolation of entities based on their
known history so far. The extrapolation is
done only for 0.25 seconds of packet loss (cl_extrapolate_amount),
since the prediction errors would become too
big after that.
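A simple linear extrapolation with a time cap might look like the sketch below. It assumes the entity keeps the velocity implied by its last two known snapshots and simply holds position once the cap is reached. The text only says that extrapolation stops after 0.25 seconds, so what happens beyond that point is an assumption, and `extrapolate` is a hypothetical name.

```python
def extrapolate(snapshots, render_time, max_extrapolation=0.25):
    """Project a position past the newest snapshot, for a limited time.

    snapshots is a list of at least two (time, position) pairs sorted by time.
    max_extrapolation mirrors the 0.25 second limit mentioned above.
    """
    (t0, p0), (t1, p1) = snapshots[-2], snapshots[-1]
    velocity = (p1 - p0) / (t1 - t0)
    # Never extrapolate backwards, and never further than the cap.
    ahead = max(0.0, min(render_time - t1, max_extrapolation))
    return p1 + velocity * ahead


# Two known snapshots with hypothetical positions, then packet loss.
history = [(10.20, 0.0), (10.25, 10.0)]
```

With these values the entity moves 10 units per 50 milliseconds, so 50 milliseconds past the last snapshot it is projected to about 20 units, and the projection stops growing once 0.25 seconds have passed.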
Entity interpolation causes a
constant view "lag" of 100 milliseconds,
even if you're playing on a listen server
(server and client on the same machine). So
if you turn on
sv_showhitboxes the player hitboxes
are drawn in server time, meaning they are
ahead of the rendered player model by 100
milliseconds. This doesn't mean you have to
lead your aim when shooting at other
players, since the server-side lag
compensation knows about client entity
interpolation and corrects this error. If
you turn off interpolation on a listen
server (cl_interpolate 0),
the drawn hitboxes will match the rendered
player model again, but the animations and
moving objects will become very jittery.
Input Prediction
Let's assume a player has a network
latency of 100 milliseconds and starts to
move forward. The information that the
+FORWARD key is
pressed is stored in a user command and sent
to the server. There the user command is
processed by the movement code and the
player's character is moved forward in the
game world. This world state change is
transmitted to all clients with the next
snapshot update. So the player would see his
own change of movement with a
100-millisecond delay after he started walking.
This delay applies to all player actions
like movement, shooting weapons, etc. and
becomes worse with higher latencies.
A delay between player input and
corresponding visual feedback creates a
strange, unnatural feeling and makes it hard
to move or aim precisely. Client-side input
prediction (cl_predict 1)
is a way to remove this delay and let the
player's actions feel more instant. Instead
of waiting for the server to update your own
position, the local client just predicts the
results of its own user commands. To do so,
the client runs exactly the same code and
rules the server will use to process the
user commands. After the prediction is
finished, the local player will move
instantly to the new location while the
server still sees him at the old place.
After 100 milliseconds, the client will
receive the server snapshot that contains
the changes based on the user command he
predicted earlier. Then the client compares
the server position with his predicted
position. If they are different, a
prediction error has occurred. This
indicates that the client didn't have the
correct information about other entities and
the environment when it processed the user
command. Then the client has to correct its
own position, since the server has final
authority over client-side prediction. If
cl_showerror 1 is
turned on, clients can see when prediction
errors happen. Corrections of prediction
errors can be quite noticeable and will
cause your view to jump a bit. To visually
smooth that effect, the prediction error is
corrected gradually over a short amount of
time (cl_smoothtime).
Prediction error smoothing can be turned off
with cl_smooth 0.
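The predict-then-correct cycle described above can be sketched as follows. This is a simplified, one-dimensional illustration with hypothetical names and constants. Replaying unacknowledged commands on top of the server state is a common reconciliation approach and is an assumption here; the text itself only states that the client corrects its position. Error smoothing is left out.

```python
SPEED = 250.0      # hypothetical movement speed in units per second
TICK = 1.0 / 66    # tick interval at the default tickrate


def apply_command(position, command):
    """Shared movement code: the same function runs on client and server."""
    return position + command["forward"] * SPEED * TICK


class PredictingClient:
    def __init__(self):
        self.position = 0.0
        self.pending = []   # (sequence number, command) not yet acknowledged
        self.next_seq = 0

    def send_command(self, command):
        """Record the command and predict its result immediately."""
        seq = self.next_seq
        self.next_seq += 1
        self.pending.append((seq, command))
        self.position = apply_command(self.position, command)
        return seq

    def on_snapshot(self, server_position, last_acked_seq):
        """Compare prediction with the server state and return the error."""
        predicted = self.position
        # Drop commands the server has already processed.
        self.pending = [(s, c) for s, c in self.pending if s > last_acked_seq]
        # Start from the authoritative position and replay what is in flight.
        corrected = server_position
        for _, command in self.pending:
            corrected = apply_command(corrected, command)
        self.position = corrected
        return corrected - predicted
```

When the server agrees with the client the returned error is zero and nothing visibly changes. A non-zero error is the jump that the smoothing described above would spread over a short time.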
Predicting an object's behavior only
works if the client knows the same rules
and object state as the server
does. That's usually not the case since the
server knows more internal information about
objects than the clients do. Clients see
only a small part of the world and just get
enough information to render objects.
Therefore, prediction works only for your
own player, and the weapons controlled by
you. Proper prediction of other players or
interactive objects is not possible on the
client at this point.
Lag Compensation
Let's say a player shoots at a target at
client time 10.5. The firing information is
packed into a user command and sent to the
server. While the packet is on its way
through the network, the server continues to
simulate the world, and the target might
have moved to a different position. The user
command arrives at server time 10.6 and the
server wouldn't detect the hit, even though
the player has aimed exactly at the target.
This error is corrected by the server-side
lag compensation (sv_unlag
1).
The lag compensation system keeps a
history of all recent player positions for a
time span of about one second (can be
changed with sv_maxunlag).
If a user command is executed, the server
estimates at what time the command was
created. This command execution time is
calculated as follows:
Command Execution Time =
Current Server Time - Client Latency -
Client View Interpolation
Then the server moves all other players
back to where they were at the command
execution time. The user command is executed
and the hit is detected correctly. After the
user command has been processed, the players
are moved back to their original positions.
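The formula and the position history can be sketched together. This is a minimal, one-dimensional illustration. `PositionHistory` is a hypothetical class, its default of 1.0 second stands in for sv_maxunlag, and interpolating between stored records is an assumption that the text does not confirm.

```python
import bisect


def command_execution_time(server_time, latency, interp):
    """Current Server Time - Client Latency - Client View Interpolation."""
    return server_time - latency - interp


class PositionHistory:
    """Recent positions of one player, kept for max_unlag seconds."""

    def __init__(self, max_unlag=1.0):
        self.max_unlag = max_unlag
        self.records = []   # (time, position) pairs, oldest first

    def record(self, time, position):
        self.records.append((time, position))
        # Forget records older than the history window.
        cutoff = time - self.max_unlag
        while self.records and self.records[0][0] < cutoff:
            self.records.pop(0)

    def position_at(self, time):
        """Position at the given time; assumes at least one record exists."""
        times = [t for t, _ in self.records]
        i = bisect.bisect_right(times, time)
        if i == 0:
            return self.records[0][1]
        if i == len(self.records):
            return self.records[-1][1]
        (t0, p0), (t1, p1) = self.records[i - 1], self.records[i]
        fraction = (time - t0) / (t1 - t0)
        return p0 + (p1 - p0) * fraction


# A target moving 10 units every 100 milliseconds (hypothetical values).
target = PositionHistory()
for i in range(7):
    target.record(10.0 + i * 0.1, i * 10.0)

# Command processed at server time 10.6, 100 ms latency, 100 ms interpolation.
shot_time = command_execution_time(10.6, latency=0.1, interp=0.1)
rewound_position = target.position_at(shot_time)
```

The server would trace the shot against `rewound_position` (about 40 units here) rather than the target's current position of 60 units, then restore the target afterwards.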
On a listen server you can enable
sv_showimpacts 1
to see the different server and client
hitboxes:

This screenshot was taken on a listen
server with 200 milliseconds of lag (using
net_fakelag),
right after the server confirmed the hit.
The red hitbox shows the target position on
the client where it was 100 milliseconds
ago. Since then, the target continued to
move to the left while the user command was
traveling to the server. After the user
command arrived, the server restored the
target position (blue hitbox) based on the
estimated command execution time. The server
traces the shot and confirms the hit (the
client sees blood effects). Client and
server hitboxes don't exactly match because
of small precision errors in time
measurement. Even a small difference of a
few milliseconds can cause an error of
several inches for fast-moving objects.
Multiplayer hit detection is not pixel
perfect and has known precision limitations
based on the tickrate and the speed of
moving objects. Increasing the tickrate does
improve the precision of hit detection, but
also requires more CPU, memory, and
bandwidth capacity for server and clients.
The question arises: why is hit detection
so complicated on the server? Back-tracking
player positions and dealing with precision
errors seems unnecessary when hit detection
could be done far more easily on the client,
with pixel precision. The client would just
tell the server with a "hit" message which
player has been hit and where. We can't
allow that simply because a game server
can't trust the clients on such important
decisions. Even if the client is "clean" and
protected by VAC (Valve Anti-Cheat), the
packets could still be modified on a third
machine while routed to the game server.
These "cheat proxies" could inject "hit"
messages into the network packet without
being detected by VAC (a "man-in-the-middle"
attack).
Network latencies and lag compensation
can create paradoxes that seem illogical
compared to the real world. For example, you
can be hit by an attacker you can't even see
anymore because you already took cover. What
happened is that the server moved your
player hitboxes back in time, where you were
still exposed to your attacker. This
inconsistency problem can't be solved in
general because of the relatively slow packet
speeds. In the real world, you don't notice
this problem because light (the packets)
travels so fast and you and everybody around
you see the same world as it is right now.
Net Graph
The Source engine offers a couple of
tools to check your client connection speed
and quality. The most popular one is the net
graph, which can be enabled with
net_graph 2.
Incoming packets are represented by small
lines moving from right to left. The height
of each line reflects the size of a packet. If a
gap appears between lines, a packet was lost
or arrived out of order. The lines are
color-coded depending on what kind of data
they contain.
Under the net graph, the first line shows
your current rendered frames per second,
your average latency, and the current value
of cl_updaterate. The second line shows the
size in bytes of the last incoming packet
(snapshots), the average incoming bandwidth,
and received packets per second. The third
line shows the same data just for outgoing
packets (user commands).