DPS Bot takes parses from World of Logs and aggregates them to rank dps specs. To do this, DPS Bot uses the somewhat novel ranking measure of “spec score.” What is spec score? How should we use it to evaluate dps? After the cut we go over this ranking method and the rest of DPS Bot’s measures to develop an understanding of how to use and interpret the site.
DPS Bot’s particular method of standardizing parses and ranking specs has consequences, so it is important that we understand how DPS Bot’s “spec score” works before we start using DPS Bot to draw conclusions and make judgments about the state of DPS. Seriallos, the creator and administrator of DPS Bot, describes spec scores like this:
For each individual boss, the spec with the top DPS on a fight gets a score of 100 and then all other specs are scored according to that top spec. The average across all fights for a given spec is the Spec Score for the current tier.
Let’s break this down. The spec with the highest median dps across all examined parses of a given encounter is scored 100. All other specs are scored according to their median dps as a percentage of the top spec’s median dps. So, if the highest spec does 30k and a second spec does 27k, this second spec will be given a score of 90 because its dps was 90% of the top spec’s dps. To achieve overall rankings, a spec’s scores from all current tier encounters are averaged together into one score. So, if there are three bosses in the tier and MM’s median dps is 90% of the top spec on one boss, 83% of the top spec on another, and it is the top spec on the third, its overall spec score for the tier would be calculated like this: (90+83+100)/3 = 91.
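The arithmetic above can be sketched in a few lines of Python. This is only an illustration of the description, not DPS Bot’s actual code: the function names, the spec labels, and the parse numbers are all made up for the example.

```python
from statistics import median

def encounter_scores(parses_by_spec):
    """Score each spec as a percentage of the top spec's median dps on one encounter."""
    # Median dps per spec for this encounter.
    medians = {spec: median(parses) for spec, parses in parses_by_spec.items()}
    top = max(medians.values())
    # The top spec gets 100; every other spec gets its share of the top median.
    return {spec: 100 * m / top for spec, m in medians.items()}

def tier_spec_score(scores_per_encounter):
    """Average a spec's per-encounter scores into one score for the tier."""
    return sum(scores_per_encounter) / len(scores_per_encounter)

# Hypothetical parses (dps values) for a single boss.
one_boss = {
    "Top Spec": [29000, 30000, 31000],     # median 30k
    "Second Spec": [26000, 27000, 28000],  # median 27k
}
scores = encounter_scores(one_boss)   # Second Spec scores 90.0, Top Spec 100.0

# The MM example from the text: 90, 83 and 100 on three bosses.
mm_score = tier_spec_score([90, 83, 100])  # 91.0
```

Note that the 100 always belongs to whichever spec tops that particular boss, so the reference point changes from encounter to encounter.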
Why bother with all of this? There are a few reasons. The first is that it makes comparison easy. The score is the same in scale and meaning across encounters. The same cannot be said of DPS. The second is that it diminishes the influence of ‘outlier’ or biased medians (like fire mages on Cho’gall or combat rogues on Halfus). It does this by restraining the range of variation to 0-100, normalizing the scores, such that a gimmick fight for a spec won’t inflate the spec to the ridiculous degree it would with DPS. A third reason is that it makes averaging easy. Since the scores for the individual encounters all share the same scale and meaning, it is no leap to average them for an overall score for the whole tier.
Part of understanding what a spec score means is having a handle on the data that is getting turned into a spec score. On the righthand side of the DPS Bot page there are control settings that let you adjust the data that is used and the data that is displayed.
This setting lets you choose between using all parses within the Timespan or merely the top 100 parses. There are pros and cons to both settings. The top 100 is bounded by an implicit standard of quality (assuming that there are enough parses for the crappy ones to be left out), letting you look only at the skillful min-maxing players who presumably show what a spec is capable of at the high end. But you also have to keep in mind that the top 100 is not a random sample; it is not representative of the average hunter (or mage or warlock, etc.). “All parses” does a better job of reflecting the average raider, but it also includes outliers on the low end and introduces greater variation in gear, skill, etc., across the parses used.
This setting lets you choose a cutoff point for the age of the parses you include in the graph display. Do you only want to look at parses from the past month? The past six months? Larger timespans include past patches, making them not strictly representative of current-patch players. Note that changing the timespan to include six months will have no effect on the presentation of the most recent month; see Sample Period for the reason why.
On the topic of time, Seriallos notes that not all times within a tier are the same. Early data after new raid content is released is “noisy” in the sense that World of Logs’ detection of bosses is not yet refined and spec scores can end up shifting dramatically when new bosses are killed. So, if you notice plots and spec scores looking weird after the release of a new patch, well, there’s good reason for that.
This setting determines how each data point for a spec score is calculated, including the current-day data points presented in the tables. Do you want it to be the average of the preceding two weeks of parses or just one day of parses? You can see this reflected in the graph: the plotted lines change from smoother rolling averages (2 weeks) to jagged day-to-day variations (1 day).
You should probably leave the setting at two weeks because reducing the number of parses that go into a data point makes the sample sizes for some specs very, very low. Keep in mind, though, that by leaving the sample period at two weeks you will be including two samples from many individual raiders, particularly if you’re using the “All Parses” setting. This means that to get a rough sense of the number of raiders per spec, you should divide the number of samples by two.
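To make the effect of the sample period concrete, here is a small Python sketch of a trailing window over dated parses. It is an assumption-laden illustration rather than the site’s implementation: the function name, the dates, and the dps values are invented, and taking the median of everything inside the window is my guess at how a single data point might be formed.

```python
from datetime import date, timedelta
from statistics import median

def window_median(parses, end_day, window_days):
    """Median dps of the parses logged in the window_days ending on end_day."""
    start = end_day - timedelta(days=window_days)
    in_window = [dps for day, dps in parses if start < day <= end_day]
    # With no parses in the window there is nothing to plot.
    return median(in_window) if in_window else None

# Hypothetical (date, dps) parses for one spec on one boss.
parses = [
    (date(2011, 9, 1), 24000),
    (date(2011, 9, 8), 26000),
    (date(2011, 9, 14), 31000),
]

two_week = window_median(parses, date(2011, 9, 14), 14)  # uses all three parses
one_day = window_median(parses, date(2011, 9, 14), 1)    # uses only the last parse
```

With the two-week window the data point rests on three parses; with the one-day window it rests on a single parse, which is why the one-day lines jump around so much and why small specs can end up with almost no sample at all.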
This setting allows you to choose which statistic is used to rank specs in the graph and the tables. It does not change the data, just the value used in ordering it.
- Default Measure: This is the same as Average when viewing overall spec scores and the same as the median when looking at specific fights or overall dps.
- Median: Ranks by the median spec score.
- Average: It ranks and displays the specs by their average spec score across fights in the current tier.
- Standard Deviation: This is a measure of how much variation there is away from the average. The larger the standard deviation, the further the various scores are, on average, from the average of all the scores. Put another way, a standard deviation is a measure of how far away you have to go from the average before you encounter a certain percentage of the scores. The larger the standard deviation, the more spread out and flatter the scores tend to be when viewed as a distribution. What this means for the overall spec score is that a relatively low standard deviation means that a spec is performing about the same in terms of dps on all fights. A relatively high SD means that the spec varies a lot in its dps from fight to fight.
- Samples: This is the number of samples used in creating the spec score. The lower the number of samples used for a spec, the less confidence we can have in its representing a true average. Some specs, like BM of late, have had so few samples that raidbots simply excludes them from analysis. It cannot in good conscience use the 10 (or whatever number) parses to make claims about the spec as a whole.
- Max: This is the highest spec score that a spec achieved in any encounter in the current tier. Thus, more than one spec can have a Max of 100 since the top spec will vary from boss to boss. This statistic is good for quickly answering the question: “is this spec good on any fight?”
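The measures in this list can be computed from a spec’s per-encounter scores with Python’s standard `statistics` module. The sketch below is illustrative only: the two sets of scores are invented, and I am assuming the population standard deviation (`pstdev`); the site may well use the sample version instead.

```python
from statistics import mean, median, pstdev

def summarize(encounter_scores):
    """Summary measures for one spec's spec scores across a tier."""
    return {
        "average": mean(encounter_scores),
        "median": median(encounter_scores),
        "std_dev": pstdev(encounter_scores),  # assumption: population SD
        "max": max(encounter_scores),
    }

# Two hypothetical specs with the same average but very different consistency.
steady = summarize([90, 91, 92])    # about the same on every fight
swingy = summarize([73, 100, 100])  # top spec on two fights, weak on one
```

Both specs average 91, but the second has a far larger standard deviation and a Max of 100, which is exactly the “good on some fights, poor on others” pattern that the average alone hides.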
Below “Controls” you have options to select which content you are looking at (25H, 25N, etc.) and, within that, a choice between Spec Scores and Overall DPS. You can also choose to view individual bosses. These options are self-explanatory, but there are a few things worth noting about them.
- The default view is of 25H spec scores.
- Overall DPS for a tier is prone to being skewed by outlier fights. Just look at how much Fire Mage DPS on Alysrazor skews its overall ranking in dps on 25H.
- Overall DPS defaults to median dps rather than average dps, though, which limits the ability of outlier parses to skew rankings.
- DPS is used by default in the tables for individual fights because the ranking would be the same as with spec score and so the simplification of information present in spec scores is not really necessary.
- Note that, especially early on in a given tier, hardmode individual boss kills may have very few parses per spec and so are less reliable.
Big thanks to Seriallos for creating DPS Bot and for looking over this piece to make sure I didn’t misrepresent his site.