There are a few kinds of game metrics I can think of, when the subject comes up:
- Operational metrics
- Game state snapshots
- Event logs
Game state snapshots: this is the sort of data we provide with the character XML feeds - what your character looks like right now. If we save enough historical data (or generate and record reports on the data at regular intervals), then we can actually do a lot of analysis on trends over time. If we only save reports, and purge snapshots periodically, some information is lost, but we gain most of the benefit at a fraction of the data storage cost. Better, any report you know ahead of time that you will want can easily be generated along with each snapshot.
One downside of how we're currently doing snapshots (storing the generated XML) is that it's very costly to actually do analysis of the whole data set. Essentially, it relies on parsing all of the XML each time and traversing the in-memory document, rather than relying on the database server to keep the data collected by type and constantly organized. (With SQL Server 2005 we could conceivably work around some of these limitations by using the XML data type, which tells the database to do the parsing and organization... I have not fully explored this option yet.)
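To make that cost concrete, here is a rough Python sketch of what "parse everything each time" looks like for one simple question (average level per class). The XML layout, the attribute names, and the `average_level_by_class` helper are all made up for illustration; the real character feed may be shaped quite differently.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Hypothetical snapshot documents; the real character XML may look different.
SNAPSHOTS = [
    '<character name="Ayla" class="Ranger" level="12"/>',
    '<character name="Borin" class="Warrior" level="20"/>',
    '<character name="Cyd" class="Ranger" level="18"/>',
]

def average_level_by_class(xml_docs):
    """Parse every stored document to answer one aggregate question."""
    totals, counts = Counter(), Counter()
    for doc in xml_docs:
        root = ET.fromstring(doc)  # a full parse per snapshot, repeated per query
        cls = root.get("class")
        totals[cls] += int(root.get("level"))
        counts[cls] += 1
    return {cls: totals[cls] / counts[cls] for cls in totals}

averages = average_level_by_class(SNAPSHOTS)
print(averages)
```

With the same values stored in typed, indexed columns, this would be a single grouped query, and the database would not have to re-read every document to answer it.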
Event logs: you can infer events from changes in the state of snapshots (so far, even most of our internal metrics are snapshot-based, so we've done a lot of work in this regard), but you can also rebuild state based on events (again, it can be a real pain). In either case, though, you can't get one from the other quickly - if we were generating character XML based on replaying all the events relevant to a character, we could not keep the web data as up-to-date with the database view as we do.
One thing you can't get from event inference (starting with snapshots, and figuring out what happened between two snapshots) is the precise ordering of events... which is what lets you reconstruct overall game play at a later date. With real event logs, you can answer one-off, unexpected questions as they come up, which can be a boon to bug hunting and customer feedback. You can also do neat things like RSS feeds in real time (where 'real time' means "predictable, very short lags between the event and its presence in the feed").
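A small Python sketch of event inference shows where the ordering goes missing. The snapshot fields and the `infer_events` helper are hypothetical, and a real snapshot would carry far more than three values.

```python
def infer_events(before, after):
    """Compare two snapshots (plain dicts) and report each changed field."""
    events = []
    for key in sorted(set(before) | set(after)):
        old, new = before.get(key), after.get(key)
        if old != new:
            events.append({"field": key, "old": old, "new": new})
    return events

# Two hypothetical snapshots of the same character, taken an hour apart.
noon = {"level": 11, "gold": 250, "zone": "Harbor"}
one_pm = {"level": 12, "gold": 90, "zone": "Harbor"}

inferred = infer_events(noon, one_pm)
for event in inferred:
    print(event)
```

The diff says that gold dropped and a level was gained, but not which happened first, or whether the gold went in one purchase or ten. Only the net change between the two snapshots survives.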
It really comes down to that - what is the best use of your time? If you know you're going to be asking a question regularly, record the answer to the question, not information that lets you extrapolate the answer. Imagine calculating PvP ratings by wandering through each PvP match, and calculating who the winner is based on who died how often... clearly, the answer is to at least record "who won" each time. But if you're displaying the PvP rating in-game, then it's clearly calculated in-game... so just record the rating as data on the player and don't ask your reporting team to calculate it later.
One interesting thing I've gotten to see, being at NCsoft, is games that try to focus solely on event logs or solely on snapshots, and ignore or pish-posh the other. So far, it seems pretty clear that either one on its own fails to provide some of the information (or at least, easy access to information) that the other approach could easily provide.


It's only in practice that they aren't.
I think that snapshots are far more useful than event logs for external consumption. Event logs shine in after-the-fact analysis and data mining. As you say - you can frequently infer events from snapshots, but if you have comprehensive event logs you can always re-create the snapshots. It just won't be cost-effective to do it very often.
If you realize you need snapshots for something in your data-mining, you can write the process to play forward through the events and re-create the data in a batch process (this assumes offline data processing, rather than trying to do something even close to real-time). I've gotten a lot of mileage out of this technique when analyzing fraud patterns.
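As a rough illustration of that batch replay, here is a minimal Python sketch. The `Event` fields, the integer timestamps, and the `replay` helper are assumptions made up for the example, not anyone's actual log format.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Event:
    timestamp: int   # hypothetical: seconds since the start of the log
    character: str
    field: str
    value: object

def replay(events, snapshot_times):
    """Play events forward in time order, copying the state at each requested time."""
    state = {}
    snapshots = {}
    pending = sorted(snapshot_times)
    for ev in sorted(events, key=lambda e: e.timestamp):
        # Emit every snapshot that falls strictly before this event.
        while pending and pending[0] < ev.timestamp:
            snapshots[pending.pop(0)] = {c: dict(f) for c, f in state.items()}
        state.setdefault(ev.character, {})[ev.field] = ev.value
    for t in pending:  # snapshot times at or after the last event
        snapshots[t] = {c: dict(f) for c, f in state.items()}
    return snapshots

log = [
    Event(10, "Ayla", "level", 11),
    Event(20, "Ayla", "gold", 250),
    Event(35, "Ayla", "level", 12),
    Event(40, "Borin", "level", 1),
]
rebuilt = replay(log, snapshot_times=[30, 60])
print(rebuilt[30])
print(rebuilt[60])
```

Every snapshot costs a pass over all the events before it, which is fine offline and is exactly why you would not want to do it per request.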
But also, my takeaway from Joe Ludwig's presentation about the FLogging of Pirates of the Burning Sea was that even data that they knew was important to their analysis, like "time at level," was painful to recreate. Contrariwise, City of Heroes has clearly been happy answering the questions they already knew they had, that exact question included.
But snapshots can only moderately (at best) answer questions you develop after the fact... and they are almost useless to customer service. One of the things Joe delved into on FLogs is the way CS could view FLog events (including chat, trade events, etc.) to respond to complaints about griefing and the like.
Incidentally, if I can make it fast, I think event logs will also be valuable for external consumption. Dump events like "level gained," "rainbow item found," and so on into a web-accessible database, and you've got an RSS feed waiting to happen. And since "RSS feeds" are on my to-do list... :-)
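For what it's worth, turning event rows into a feed is not much code. This is only a Python sketch: the event fields, the `build_feed` helper, and the example.com link are placeholders, and a real feed would also want publication dates and per-item links.

```python
import xml.etree.ElementTree as ET

# Hypothetical rows, as they might come back from the event database.
EVENTS = [
    {"character": "Ayla", "type": "level gained", "detail": "reached level 12"},
    {"character": "Borin", "type": "rainbow item found", "detail": "found a rainbow item"},
]

def build_feed(events, title, link):
    """Render event rows as a minimal RSS 2.0 document (one item per event)."""
    rss = ET.Element("rss", version="2.0")
    channel = ET.SubElement(rss, "channel")
    ET.SubElement(channel, "title").text = title
    ET.SubElement(channel, "link").text = link
    ET.SubElement(channel, "description").text = "Recent in-game events"
    for ev in events:
        item = ET.SubElement(channel, "item")
        ET.SubElement(item, "title").text = f'{ev["character"]}: {ev["type"]}'
        ET.SubElement(item, "description").text = ev["detail"]
    return ET.tostring(rss, encoding="unicode")

feed_xml = build_feed(EVENTS, "Game events", "http://example.com/events")
print(feed_xml)
```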
Also, hi! I will start reading your blog now :)
Yeah, we are doing something similar internally - mostly because we haven't got much budget, and it's easier politically to build on existing internal solutions right now. :-/