Developing a Performance Test Suite Using Replays
by Christopher Hazard on 08/06/10 04:25:00 pm

If you're making a AAA title, you may want to squeeze more detail into a scene.  If you're making a casual game for a mobile device, you might want to improve responsiveness on the slowest supported device or reduce your game's battery drain.

Normally, developers run the game engine with a profiler to see how the code is performing.  Profilers are common development tools that tell you which parts of your code are running what percentage of the time.  Profiling is an essential skill every developer should have, only a few pegs behind being able to debug effectively.  Part of the skill is knowing what to profile.
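
Even without a dedicated profiler, you can get a rough picture of where time goes with a scoped timer.  Here's a minimal C++ sketch of the idea; ScopedTimer and update_collision are hypothetical names, not part of any particular engine or profiler:

    #include <chrono>
    #include <cstdio>

    // Times a scope and prints the elapsed microseconds when the scope
    // exits.  A real profiler does this, plus call-graph attribution,
    // automatically across the whole program.
    struct ScopedTimer {
        const char* label;
        std::chrono::steady_clock::time_point start;
        explicit ScopedTimer(const char* l)
            : label(l), start(std::chrono::steady_clock::now()) {}
        ~ScopedTimer() {
            long long us = std::chrono::duration_cast<std::chrono::microseconds>(
                std::chrono::steady_clock::now() - start).count();
            std::printf("%s: %lld us\n", label, us);
        }
    };

    void update_collision() {
        ScopedTimer timer("collision");  // reports when the scope exits
        // ... collision detection work ...
    }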

The first pass at profiling is to simply sit down and play some of the game yourself with the profiler running in the background.  You take a look at the results, and they tell you that 30 percent of the time is spent on some task like collision detection.  You look at the code, figure out a way to make it better, and test it functionally.  It works!  Now, how do you know it performs better?

If you play the game again for more profiling, you can't exactly compare against your previous results.  Even if you go to the same area of the game, your results won't be the same.  Maybe you missed a jump the second time around, or your fingers were a little slower, or you played through faster.  It's difficult and time-consuming to ensure the same scenario by hand, so it's better to automate it.

Having a replay framework, either in your game engine or in your development environment, doesn't just offer major benefits for reproducing bugs; it also helps with profiling.  You can play through several areas of the game, ideally with different play styles by different people, and now you have a suite of performance tests.  Every time you run your regular testing for code or content changes, you can run your profiling suite as well.
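
To make that concrete, here's a minimal sketch of the recording half of a replay system, assuming a deterministic simulation; all the names (InputEvent, Replay, and so on) are hypothetical.  The essential trick is to tag each input event with the simulation frame it occurred on and feed events back on exactly the same frames during playback:

    #include <cstdint>
    #include <vector>

    struct InputEvent {
        uint64_t frame;   // simulation frame the event occurred on
        int      code;    // key/button identifier
        bool     pressed;
    };

    struct Replay {
        std::vector<InputEvent> events;
        size_t cursor = 0;

        // Called while a human plays: log each input with its frame.
        void record(uint64_t frame, int code, bool pressed) {
            events.push_back({frame, code, pressed});
        }

        // Called during playback: apply every stored event scheduled
        // for this frame, in the order it was recorded.
        template <typename Apply>
        void play(uint64_t frame, Apply apply) {
            while (cursor < events.size() && events[cursor].frame == frame)
                apply(events[cursor++]);
        }
    };

Run the same replay with the profiler attached before and after an optimization, and the two profiles are directly comparable.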

Another strategy is to pit your AI against itself.  Each random number seed you use to kick off the game is another test case.  The obvious risk here is that human players won't use the same play style as the AI, leaving your performance results biased.
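
A sketch of what that might look like, assuming the simulation is fully deterministic for a given seed (run_ai_match and the suite function are hypothetical stand-ins):

    #include <cstdint>
    #include <cstdio>

    // Hypothetical: runs one deterministic AI-vs-AI match for the seed.
    void run_ai_match(uint32_t seed) {
        std::printf("running AI match, seed %u\n", (unsigned)seed);
        // ... seed the game's RNG, spawn AI players, simulate to the end ...
    }

    // Each seed is one reproducible performance test case.
    void run_ai_performance_suite() {
        const uint32_t seeds[] = {1, 42, 1337, 99999};
        for (uint32_t seed : seeds) {
            // Profile each match separately so a regression can be traced
            // back to a specific seed/scenario.
            run_ai_match(seed);
        }
    }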

In software testing, there's the idea of code coverage: you make sure your tests run through every functional area of your code, so that every part that checks constraints, triggers an action, etc. actually gets executed.  It's impossible to test every case, but the goal is to make sure that all of the code is exercised at least once.

In performance testing, the analogous goal is to make sure you've gone through all the major use cases of the game.  Do your automated replays go through the area with the biggest number of bad guys, the biggest number of lights, and that mini-game you so cleverly designed by hacking around the game mechanics?

The game my company is currently working on, the time-travel RTS Achron, is very CPU-intensive.  The motivation for this blog post came from a recent discovery.  We had a small battery of performance tests that we'd always run, and we depended on its results.

Recently, one of our users made a custom map that was quite large and complained that performance on it was so bad that the game eventually became unplayable.  My initial reaction was to assume he was using a slow CPU, or that it was some issue we couldn't fix.  But because he'd posted his save game (which, in Achron, includes most of the game replay), I decided to profile it.

It turned out that he had hit upon a performance case that was not included in our suite: a large battle as it moves off the timeline.  Over 20% of the CPU was being spent checking whether a set of resources could be freed, in a function that had never even appeared on the list of the top 100 most time-consuming functions.

All I had to do was reduce the frequency at which the engine attempts to reclaim those resources.  By running the check only one-twentieth as often, its cost dropped from roughly 20% of the CPU to about 1%, reclaiming the other 19%.  The only drawback was that certain resources took, on average, a few hundred milliseconds longer to be reclaimed.
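
The fix amounts to a simple counter-based throttle, something like the following sketch (the names are hypothetical, not Achron's actual code):

    // Instead of scanning for reclaimable resources every tick, scan only
    // every 20th tick.  If the scan used ~20% of the CPU, running it
    // one-twentieth as often leaves ~1%, recovering roughly 19%.
    void maybe_reclaim_resources() {
        static unsigned tick = 0;
        if (++tick % 20 != 0)
            return;  // skip 19 out of every 20 ticks
        // ... walk the resource list and free anything no longer referenced ...
    }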

 
 
Comments

Robert Allen
When I was working on commercial desktop messaging software, we had a series of hundreds of regression tests in 3-4 separate levels of time consumption (i.e., coffee break, lunch, overnight, and then some).  It helped us a LOT in delivering quality software in a timely manner.

Sadly, with games, many of which are crafted from scratch each time, it's neither easy nor likely that regression tests will be developed and applied within a single game's life cycle.  But it probably is possible to generate a list of "must-have" tests which should be created and used for each new game.

