ReplayDIRECTOR
Technical Whitepaper

Recording and Replaying for Multithreaded
and Multiprocessor Games 

A ReplaySolutions White Paper

“Have you ever spent time reproducing bugs?
Would you like all that time back?”
 

Spending long hours, days and weeks bug-fixing can take focus away from creating a great game, result in delayed ship dates, and ultimately doom a project to failure.

An accurate recording and replay system that does not affect performance can save massive amounts of time and effort during the development cycle, especially during crunch time.


FACT: Real bug databases have shown that a good recording system can reduce the amount of time spent fixing bugs by 40%.

Recording and Replay Systems

An effective recording and replay system that does not affect performance can save massive amounts of time during the development cycle, especially during crunch time.

This will:

  •  Save time and money during development
  •  Allow you to ship your game on time
  •  Improve the quality of your game and its Metacritic™ score


The Problems…

Time is constantly wasted getting reproducible bugs from QA into developers’ hands.

Furthermore, some of the toughest challenges for developers moving onto next generation hardware involve dealing with multiprocessors and multithreaded programming.

Writing bug-free software on these platforms can be orders of magnitude more difficult than on previous generations. Much of the pain lies in obtaining a predictable, reproducible bug before the developer can even attempt a fix and then verify it.


FACT: 50% of bugs are fixed incorrectly the first time!

Source: NASA study

FACT: 67% of all games miss their initial target ship date!

The Past & Present…

Without a recording system, the typical workflow in bug fixing is often the following:

Figure 1 – Older Methods of Bug Fixing

Current solutions to these problems use tools such as VCRs and DVRs to record the video output, which developers then watch in the hope of reproducing the problem themselves in the game.

Controller, keyboard and mouse input recorders can capture some bugs, but are typically ineffective for most problems.

Finally, more advanced solutions include home-grown, proprietary recording systems that can require programmers to use certain code-bases, be difficult to maintain and affect performance significantly.


ReplayDIRECTOR™ records everything, including:
1. Context switches
2. All User Input
3. Timers
4. Uninitialized Stack & Heap Memory Access
5. Async File IO
6. Network IO, Xbox LIVE
7. Interrupts & Callbacks
8. Assembly Instructions (e.g. MFTB, RDTSC)

ReplayDIRECTOR™: A Deep Recording System

Replay Solutions has solved the major technical hurdles to provide a 100% accurate, fully transparent recording system that does not require any source code changes. This system is called ReplayDIRECTOR™.

ReplayDIRECTOR™ is applied after compiling and linking the game using binary instrumentation. It does not affect debugging symbols or performance in the game. This means the development team does not need to make any changes to the way they work, the code they write, or the tools they use.

With ReplayDIRECTOR™ applied, the game runs normally with only a small amount of recorded data being output and stored in the background. Recordings of bugs can be attached to bug reports, or FTP’d to developers. Most importantly, standard debuggers like Visual Studio can be used normally.


“We found the bug once, and our developer just pressed play and saw the crash in the debugger…”

John Chowanec
Lead Producer
Crystal Dynamics

The New Workflow

With an accurate recording system like ReplayDIRECTOR™ in place, the bug fixing workflow is completely optimized. The error-prone steps are eliminated.

Figure 2 – The ReplayDIRECTOR™ Workflow


“We recorded a 16 player game where we found the crash, sent the Replay to the developer, and they fixed it that night.”

Eric Masyk
Lead Tester
Crystal Dynamics

“ReplayDIRECTOR™ is one of the best tools I've found for our test department here at Crystal Dynamics.”

Chris Bruno
QA Manager
Crystal Dynamics

ReplayDIRECTOR™: The Internals

The ReplayDIRECTOR™ system sits between the operating system and the game, acting as a lightweight filter that records and replays all sources of random, or ‘non-deterministic’, data or events that might affect the game.

Figure 3 – ReplayDIRECTOR™ Architecture

The result of this is a system that will produce a 100% accurate, frame-by-frame, line-for-line replay of the recording. It is possible to attach a debugger, set breakpoints, single-step and inspect data, because the game is in fact running in real-time.
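The filter idea can be illustrated with a minimal sketch. This is not ReplayDIRECTOR™'s implementation, which works through binary instrumentation without source changes. The `ReplayFilter` class, its `intercept` method and the toy `game_frame` function are hypothetical names invented for illustration. The sketch only shows the principle: while recording, every non-deterministic value is logged as it passes through; on replay, the logged value is returned instead of asking the real source.

```python
import random
import time


class ReplayFilter:
    """Hypothetical filter that logs or replays non-deterministic values."""

    def __init__(self, mode, log=None):
        if mode not in ("record", "replay"):
            raise ValueError("mode must be 'record' or 'replay'")
        self.mode = mode
        self.log = [] if log is None else list(log)
        self._cursor = 0

    def intercept(self, name, source):
        """Return source() while recording, or the logged value on replay."""
        if self.mode == "record":
            value = source()
            self.log.append((name, value))
            return value
        logged_name, value = self.log[self._cursor]
        if logged_name != name:
            raise RuntimeError(f"replay diverged at {name}, expected {logged_name}")
        self._cursor += 1
        return value


def game_frame(filt):
    """Toy 'game' whose result depends on a timer and a random number."""
    now = filt.intercept("timer", time.perf_counter)
    roll = filt.intercept("rng", lambda: random.randint(1, 6))
    return (now, roll)


# Record three frames, then replay them from the log alone.
recorder = ReplayFilter("record")
recorded_frames = [game_frame(recorder) for _ in range(3)]

player = ReplayFilter("replay", log=recorder.log)
replayed_frames = [game_frame(player) for _ in range(3)]
```

In this sketch the replayed frames match the recorded ones exactly, because the game code never touches the real timer or random number generator during replay.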


Performance? ReplayDIRECTOR™ will not affect the frame rate of your game.

“With Replay, we had all the testers play the game; we recorded the bug and sent the Replay to our developer in the Netherlands. We had a fix the next day.”

Joe Quadara
Lead Multiplayer Tester
Eidos

ReplayDIRECTOR™: Performance & Memory Usage

ReplayDIRECTOR™ is designed to have as little impact as possible on system resources. Metrics for the currently supported platforms are as follows:

Platform    Memory Usage    Recorded Data
Xbox 360    12 – 16 MB*     600 – 1100 KB/min
Vista       6.5 MB          800 – 1400 KB/min
XP          4.5 MB          600 – 1200 KB/min

*ReplayDIRECTOR™ for Xbox 360 is currently being optimized for memory usage.

During recording and replaying, the run-time performance impact is not noticeable. The frame rate of the game should not drop as a result of ReplayDIRECTOR™.

This means that developers and testers can have ReplayDIRECTOR™ recording in the background at all times. This is a dramatic shift in terms of the usage of recording systems. With recording always on, anytime a bug is found, a perfect recording is available to share with any team member, whether that team member is in the building, or on the other side of the world.
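To put always-on recording in perspective, the recorded-data rates in the table can be converted into storage per session. The short calculation below is a sketch that assumes "kb" in the table means kilobytes and that 1 MB = 1024 KB. The function names are illustrative only.

```python
# Recorded-data rates in KB per minute, copied from the table above.
RATES_KB_PER_MIN = {
    "Xbox 360": (600, 1100),
    "Vista": (800, 1400),
    "XP": (600, 1200),
}


def recording_size_mb(kb_per_min, minutes):
    """Size of a recording in MB, assuming 1 MB = 1024 KB."""
    return kb_per_min * minutes / 1024


def session_estimates(minutes):
    """Map each platform to its (low, high) recording size in MB."""
    return {
        platform: (recording_size_mb(low, minutes), recording_size_mb(high, minutes))
        for platform, (low, high) in RATES_KB_PER_MIN.items()
    }


estimates = session_estimates(60)
for platform, (low, high) in estimates.items():
    print(f"{platform}: {low:.1f} to {high:.1f} MB per hour")
```

Under these assumptions, an hour of continuous recording comes to roughly 35 to 82 MB depending on the platform and the game.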


“Q: How do you see yourself using Replay on your upcoming project?
A: From Start to Finish.”

John Chowanec
Lead Producer
Crystal Dynamics

ReplayDIRECTOR™: Native Multi-core Support

Multi-threaded and multi-core game developers face greater challenges when bug fixing.

The ReplayDIRECTOR™ system records critical state data when a thread context switch occurs. During replay, ReplayDIRECTOR™ makes sure the application is scheduled in the same way as it was when recorded. This ensures that critical context switches occur in the exact same order.

This makes sure that execution will be 100% reproducible during replay, even if the code has any number of unidentified race conditions.
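The effect of replaying a recorded schedule can be shown with a toy model. The sketch below is an illustration only, not how ReplayDIRECTOR™ works internally: it simulates two threads as Python generators, with a deliberate race (a read, a possible context switch, then a write of a stale value). The `worker` and `run` functions are hypothetical. While recording, the scheduler picks threads at random and logs each choice; on replay it follows the log, so the race resolves the same way.

```python
import random


def worker(shared, name, rounds):
    """Racy read-modify-write: yields between the read and the write."""
    for _ in range(rounds):
        seen = shared["counter"]      # read
        yield name                    # possible context switch here
        shared["counter"] = seen + 1  # write back a possibly stale value


def run(rounds, schedule=None):
    """Run two workers; record the schedule, or replay a given one."""
    shared = {"counter": 0}
    threads = {n: worker(shared, n, rounds) for n in ("A", "B")}
    rng = random.Random()  # unseeded: each recording run may differ
    recorded = []
    replay = iter(schedule) if schedule is not None else None
    while threads:
        if replay is not None:
            name = next(replay)               # follow the recorded order
        else:
            name = rng.choice(sorted(threads))  # arbitrary scheduling
        recorded.append(name)
        try:
            next(threads[name])
        except StopIteration:
            del threads[name]                 # this worker has finished
    return shared["counter"], recorded


# Record one run, then replay it with the same context-switch order.
result, schedule = run(rounds=5)
replayed_result, replayed_schedule = run(rounds=5, schedule=schedule)
```

Separate recording runs may end with different counter values because of the race, but a replay driven by the recorded schedule always ends with the same value as the run it was recorded from.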


 

ReplayDIRECTOR™: Interrupt & Callback Support

The system is designed to handle all potential sources of random, or non-deterministic, input to the game. These of course include interrupts and callbacks which may access game code at any point.

Using a low-level system of handlers and trapping mechanisms, coupled with a specialized binary instrumentation technique, ReplayDIRECTOR™ records all occurrences of interrupts and callbacks that affect game execution. During replay, the game will receive the same sequence of interrupts and callbacks that were recorded, along with the same data payloads.
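As a rough illustration of the same idea for callbacks, the sketch below records the frame at which each simulated callback fires together with its payload, then delivers the same payloads at the same frames during replay. The real system traps interrupts and callbacks at a much lower level. The `run_game` and `on_callback` functions, and the use of a frame number as the delivery point, are assumptions made for this example.

```python
import random


def on_callback(state, payload):
    """Game-side handler; its effect depends on the frame it runs in."""
    return state ^ payload


def run_game(frames, callback_log=None):
    """Run a toy game loop, recording callbacks or replaying a given log."""
    recording = callback_log is None
    log = [] if recording else list(callback_log)
    pending = dict(log)    # frame number -> payload (empty while recording)
    rng = random.Random()  # unseeded: stands in for unpredictable hardware
    state = 0
    for frame in range(frames):
        if recording and rng.random() < 0.3:
            payload = rng.randint(1, 100)
            log.append((frame, payload))  # record when it fired and with what
            pending[frame] = payload
        if frame in pending:
            state = on_callback(state, pending[frame])
        state = (state * 31 + frame) % 2**32  # ordinary deterministic update
    return state, log


recorded_state, callback_log = run_game(100)
replayed_state, _ = run_game(100, callback_log=callback_log)
```

Because the replay delivers each payload at the frame where it was originally received, the final game state matches the recorded run.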


Using ReplayDIRECTOR™…

Using ReplayDIRECTOR™ is simple and requires virtually no changes to the development cycle, or the build process. Replay features are applied as a binary instrumentation step immediately after linking.

  • Write code
  • Compile & Link
  • Install ReplayDIRECTOR™
  • Package & distribute

Installing ReplayDIRECTOR™ takes between 4 and 8 seconds. This must be done each time a new executable binary is created.
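One way a build script might enforce the "once per new binary" rule is to keep a stamp file beside the executable and re-run the install step only when the binary has changed. The sketch below is hypothetical: it does not invoke ReplayDIRECTOR™ itself, and the `instrument` callable, `fake_instrument` and the stamp-file convention are placeholders invented for illustration.

```python
import hashlib
import tempfile
from pathlib import Path


def needs_instrumenting(binary, stamp):
    """True when the binary differs from the one last instrumented."""
    digest = hashlib.sha256(binary.read_bytes()).hexdigest()
    return not stamp.exists() or stamp.read_text() != digest


def post_link_step(binary, stamp, instrument):
    """Run the instrument callable only for a new binary; return True if run."""
    if not needs_instrumenting(binary, stamp):
        return False
    instrument(binary)
    # Remember the digest of the instrumented output.
    stamp.write_text(hashlib.sha256(binary.read_bytes()).hexdigest())
    return True


def fake_instrument(binary):
    """Placeholder for the real install step: appends a marker to the file."""
    with binary.open("ab") as f:
        f.write(b"<instrumented>")


with tempfile.TemporaryDirectory() as tmp:
    binary = Path(tmp) / "game.exe"
    stamp = Path(tmp) / "game.exe.stamp"
    binary.write_bytes(b"build 1")   # first link
    first = post_link_step(binary, stamp, fake_instrument)
    repeat = post_link_step(binary, stamp, fake_instrument)
    binary.write_bytes(b"build 2")   # relinking produces a new binary
    relinked = post_link_step(binary, stamp, fake_instrument)
```

In this sketch the step runs for the first build, is skipped when nothing has changed, and runs again after the relink.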

The ‘Replay-enabled’ game can be run normally, burned to CD or DVD and run by anyone. Users will not even know it’s there until they need it.