
Handicapping & Visualizing Aggregate Data

When handicapping a race (or an entire card), one of the first things I do is create a custom report that aggregates some past performance (pp) data. I like a quick way to quantify certain aspects of a race. However, there are “issues” with using aggregate data…especially when averaging…but that’s beyond the scope of this post. For the sake of discussion, we’ll focus on speed figures.

The chart below visualizes two aspects of averaging speed figures of the horses entered in the 8th race at Tampa Bay Downs on February 26th, 2014. The Color and Size of each horse’s box correspond to the Average Speed Figure and the number of Races respectively. The colors range from Red (lowest) to Green (highest) Average Speed Figure. The box sizes range from the smallest (fewest) to largest (most) Races considered. If you position your mouse cursor over a box, a little pop-up displays the data, e.g., for the #9, HOGUE, his Average Speed Figure is 60.125 over the 8 races considered. I guess I should mention, a race is “considered” when it is the same type (Route or Sprint) and over the same surface as today’s race. For this race, we’re considering all PPs that were Sprints on Dirt.
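For anyone who wants to build a similar report, here is a minimal sketch of the aggregation in Python. It is not the report behind the chart; the row layout, the function name `aggregate_speed_figures`, and every number in it are made up for illustration. It only shows the idea: keep the PPs that match the race type and surface, then compute the average speed figure and the count per horse.

```python
from collections import defaultdict

# Hypothetical past-performance rows: (horse, race_type, surface, speed_figure).
# All values are invented for illustration; none come from the race in this post.
PAST_PERFORMANCES = [
    ("HORSE A", "Sprint", "Dirt", 62),
    ("HORSE A", "Sprint", "Dirt", 58),
    ("HORSE A", "Route", "Turf", 70),  # different type and surface, not considered
    ("HORSE B", "Sprint", "Dirt", 40),
    ("HORSE B", "Sprint", "Dirt", 44),
    ("HORSE B", "Sprint", "Dirt", 36),
]


def aggregate_speed_figures(rows, race_type, surface):
    """Return {horse: (average_speed_figure, races_considered)}."""
    figures = defaultdict(list)
    for horse, row_type, row_surface, figure in rows:
        # A race is "considered" only if it matches both type and surface.
        if row_type == race_type and row_surface == surface:
            figures[horse].append(figure)
    return {
        horse: (sum(values) / len(values), len(values))
        for horse, values in figures.items()
    }


summary = aggregate_speed_figures(PAST_PERFORMANCES, "Sprint", "Dirt")
for horse, (average, count) in summary.items():
    print(f"{horse}: average {average:.2f} over {count} races")
```

The average drives the box color and the count drives the box size, so both numbers travel together for each horse.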



A quick look at the chart reveals that #3 COLLETON GARDENS has the highest average speed figure…as well as the fewest races considered, 65.33 and 3 respectively. We can also see that #2 IMVROS INDY has one of the lowest average speed figures and the most races considered, 37.20 and 24 respectively. This visualization, I believe, demonstrates why you shouldn’t (and I don’t) rely on aggregated data as a single point of reference.

In the chart below, I tightened the requirements for a race to be considered. Now, for a race to be considered, the horse must have finished in the Top 4, OR must not have been beaten by more than 5 lengths.
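The tightened rule can be written as a one-line test. This is just a sketch: the function name `is_competitive` and its two parameters are hypothetical, and I am reading “not beaten by more than 5 lengths” as a beaten margin of 5 lengths or less.

```python
def is_competitive(finish_position, beaten_lengths):
    """True if the horse finished in the Top 4 OR was beaten by 5 lengths or less.

    Both parameter names are assumptions made for this sketch.
    """
    return finish_position <= 4 or beaten_lengths <= 5.0


# Made-up examples of (finish_position, beaten_lengths):
print(is_competitive(3, 9.0))  # Top 4, even though well beaten
print(is_competitive(6, 3.5))  # out of the Top 4, but close up
print(is_competitive(7, 8.0))  # neither condition holds
```

A PP has to pass this test in addition to the type and surface match before its speed figure goes into the average.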



Ok, now let’s have some fun and kick up the Tree Map a notch or two! You know that feeling you get when you’re browsing the PPs…and there aren’t any PPs…you know, a Maiden race full of first-time starters. Usually, I’ll pass the race. But what if I’m trying to construct a pick-n ticket? That’s when I dive into trainer stats, jockey stats, sire and dam stats…AND, study the Works.

Okay, so looking at the 5th race at Tampa Bay Downs on February 27, 2014, there are several horses that yield goose eggs when I run my aggregate data reports. So, what I did was gather up all of the works over the last 60 days. I could’ve gone back further, or not as far…60 days seems to yield a decent number of works for each horse (note: I had to go back 60 days to get at least 1 work for some of these guys).

Here’s how this tree map works. The Size of the box depends on the total number of furlongs the horse worked, i.e., if a horse had 5 works at 3 furlongs, the “size” of the box is 15. I’ve also calculated the horse’s feet/second (pace) for each work. The Pace is used to determine the color of the box. The tree map calculates a value of each horse’s pace relative to the others…a percentile, if you will. The slower the pace, the redder the box. The faster the pace, the greener the box.
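Here is a rough Python sketch of the size and color inputs for the works tree map. A furlong is 660 feet, so pace in feet/second is furlongs × 660 ÷ seconds. Everything else here is an assumption on my part: the works are invented, a horse’s overall pace is taken as total feet over total seconds, and the “percentile” is simply the share of the other horses with a slower pace. The charting tool may well compute its color scale differently.

```python
FEET_PER_FURLONG = 660

# Hypothetical works: horse -> list of (furlongs, seconds). Invented values.
WORKS = {
    "HORSE A": [(3, 37.0), (3, 36.0), (4, 49.0)],
    "HORSE B": [(4, 50.0), (5, 63.0)],
    "HORSE C": [(3, 39.0)],
}


def pace_fps(furlongs, seconds):
    """Pace of a work in feet per second."""
    return furlongs * FEET_PER_FURLONG / seconds


def summarize_works(works):
    """Return {horse: (total_furlongs, overall_pace_fps)}."""
    summary = {}
    for horse, entries in works.items():
        total_furlongs = sum(furlongs for furlongs, _ in entries)
        total_seconds = sum(seconds for _, seconds in entries)
        # Assumption: overall pace = total distance / total time.
        summary[horse] = (total_furlongs, pace_fps(total_furlongs, total_seconds))
    return summary


def pace_percentiles(summary):
    """Rank each horse's pace from 0.0 (slowest, reddest) to 1.0 (fastest, greenest)."""
    paces = [pace for _, pace in summary.values()]
    others = len(paces) - 1
    result = {}
    for horse, (_, pace) in summary.items():
        slower = sum(1 for other in paces if other < pace)
        result[horse] = slower / others if others > 0 else 0.0
    return result


summary = summarize_works(WORKS)
percentiles = pace_percentiles(summary)
for horse, (size, pace) in summary.items():
    print(f"{horse}: size {size} furlongs, pace {pace:.2f} ft/s, rank {percentiles[horse]:.2f}")
```

The first number is the box size and the rank is what would be mapped onto the red-to-green scale.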



Now, what’s really cool about this particular tree map is that you can “drill-down” and see each of the horse’s works. The size of the box is based on the distance of the work and the color is based on the pace. Left-click on a horse to drill down…Right-click to drill up. There you have it…a relatively useful tool to evaluate “First-Timers” based on their works…but don’t forget to refer to those other stats…