Curiosity-Driven

September 29, 2007

Visualizing Build Data Pt.1: Simple Coverage Time-Series

Filed under: continuous_integration, web; posted by teropa @ 9:44 pm

Many continuous integration servers that have been running for any significant period of time have accumulated a large amount of data about project health. This is especially true for projects that use code coverage and analysis tools.

However, this data is often just sitting there, in XML files on the build server, seen by no one. Seems like an awful waste of perfectly good data to me. So this fall I’m going to be thinking about what can be done with it to make it more interesting to look at. Being a beginning student of information visualization, I’m going to concentrate especially on the visual aspect of things, but I expect to make some forays into statistical analysis and data mining as well.
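As a rough sketch of getting that data out of the XML files: the snippet below assumes Cobertura-style reports, where the root `coverage` element carries a `line-rate` attribute (a fraction between 0 and 1) and a `timestamp` in epoch milliseconds. Which coverage tool produced the reports is not stated here, so the format and the function names `read_coverage_point` and `build_series` are illustrative assumptions only.

```python
import xml.etree.ElementTree as ET
from datetime import datetime, timezone


def read_coverage_point(xml_text):
    """Return (timestamp, coverage fraction) for one report.

    Assumption: a Cobertura-style root element with `line-rate`
    (0..1) and `timestamp` (milliseconds since the epoch).
    """
    root = ET.fromstring(xml_text)
    rate = float(root.get("line-rate"))
    millis = int(root.get("timestamp"))
    when = datetime.fromtimestamp(millis / 1000, tz=timezone.utc)
    return when, rate


def build_series(reports):
    """Turn an iterable of report strings into a time-ordered series."""
    return sorted(read_coverage_point(text) for text in reports)


if __name__ == "__main__":
    # Two made-up reports, deliberately out of order.
    sample = [
        '<coverage line-rate="0.62" timestamp="1191000000000"/>',
        '<coverage line-rate="0.58" timestamp="1190000000000"/>',
    ]
    for when, rate in build_series(sample):
        print(when.date(), f"{rate:.0%}")
```

In practice the strings would come from the report files of each archived build rather than from an inline list.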

Let’s start with the simplest of things: a time-series showing the trend of code coverage. What is the least chart-junky way to display this variable?

A line chart is probably the best way to visualize a time-series:

coverage_timeseries_1.gif
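A plain line chart like this needs very little machinery. Here is a minimal sketch that maps a list of coverage fractions onto an SVG polyline using only the standard library. The chart dimensions, the stroke styling and the function names are assumptions made for illustration; this is not how the image above was produced.

```python
def to_points(series, width=400, height=100):
    """Map coverage fractions (0..1) to SVG pixel coordinates."""
    n = len(series)
    step = width / (n - 1) if n > 1 else 0
    # SVG's y axis grows downward, so 100% coverage maps to y = 0.
    return [
        (round(i * step, 1), round((1 - c) * height, 1))
        for i, c in enumerate(series)
    ]


def line_chart_svg(series, width=400, height=100):
    """Render the series as a single unfilled polyline."""
    pts = " ".join(f"{x},{y}" for x, y in to_points(series, width, height))
    return (
        f'<svg xmlns="http://www.w3.org/2000/svg" '
        f'width="{width}" height="{height}">'
        f'<polyline points="{pts}" fill="none" '
        f'stroke="black" stroke-width="1"/>'
        f"</svg>"
    )


if __name__ == "__main__":
    # Made-up coverage values, one per build.
    print(line_chart_svg([0.40, 0.45, 0.43, 0.55, 0.61, 0.60]))
```

Builds are spaced evenly along the x axis here, which sidesteps the question of the temporal scale entirely.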

Code coverage is distinctly a variable of “volume”, and always a fraction of some maximum volume (100%), so it might benefit from being displayed as such, by coloring the “filled” area:

coverage_timeseries_2.gif

The filled (covered) area is usually considered good and the uncovered area bad, so maybe the negative quality of the negative space could be highlighted by coloring it red?

coverage_timeseries_3.gif

The colored areas now outline the data, so the original line representing the graph becomes redundant. This means it must go. Let’s get rid of the bounding box too. Now we can also increase the value of the fill colors as they become the primary (only) element of the graph:

coverage_timeseries_4.gif
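The fills-only version can be sketched as two polygons that share the data line as one edge: the covered area is closed along the bottom of the chart and the uncovered area along the top. No line and no bounding box are drawn. The specific colors (a green for covered, a red for uncovered), the dimensions and the function names below are assumptions for illustration.

```python
def area_polygons(series, width=400, height=100):
    """Return (covered, uncovered) polygons for coverage fractions (0..1)."""
    if not series:
        raise ValueError("series must contain at least one value")
    n = len(series)
    step = width / (n - 1) if n > 1 else 0
    # The data line; SVG's y axis grows downward, so 100% maps to y = 0.
    top = [(i * step, (1 - c) * height) for i, c in enumerate(series)]
    # Covered area: the data line closed along the bottom edge.
    covered = top + [(top[-1][0], height), (top[0][0], height)]
    # Uncovered area: the same line closed along the top edge.
    uncovered = top + [(top[-1][0], 0), (top[0][0], 0)]
    return covered, uncovered


def _points(polygon):
    return " ".join(f"{x:.1f},{y:.1f}" for x, y in polygon)


def area_chart_svg(series, width=400, height=100,
                   covered_fill="#2e8b57", uncovered_fill="#c0392b"):
    """Render only the two fills; the fills themselves outline the data."""
    covered, uncovered = area_polygons(series, width, height)
    return (
        f'<svg xmlns="http://www.w3.org/2000/svg" '
        f'width="{width}" height="{height}">'
        f'<polygon points="{_points(uncovered)}" fill="{uncovered_fill}"/>'
        f'<polygon points="{_points(covered)}" fill="{covered_fill}"/>'
        f"</svg>"
    )


if __name__ == "__main__":
    # Made-up coverage values, one per build.
    print(area_chart_svg([0.40, 0.45, 0.43, 0.55, 0.61, 0.60]))
```

Because the two polygons tile the whole chart rectangle, the boundary between them carries the same information the line did.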

That’s better. However, although right now the graph shows the general trend pretty well, it isn’t very easy to make out what the actual coverage value is at any given time. Maybe this can be helped by introducing a horizontal grid line at every 20% interval?

coverage_timeseries_5.gif

It does help, but the grid is way too heavy. It actually takes over as the primary graphic element. That can’t be good. I don’t like things that are too heavy for their purpose. We’ll make the grid as thin as possible, and make it white so it fades into the background:

coverage_timeseries_6.gif
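A light grid of this kind can be sketched as a handful of thin white SVG lines drawn on top of the fills. The 0.5 pixel stroke width, the dimensions and the choice to skip the 0% and 100% edges are assumptions for illustration; only the 20% interval and the white color come from the text above.

```python
def grid_lines(width=400, height=100, interval=20):
    """Return SVG <line> elements at every `interval` percent.

    The 0% and 100% levels are skipped because they coincide with
    the chart edges (an assumption, not something stated in the post).
    """
    lines = []
    for pct in range(interval, 100, interval):
        # SVG's y axis grows downward, so higher coverage sits nearer y = 0.
        y = height * (100 - pct) / 100
        lines.append(
            f'<line x1="0" y1="{y:.1f}" x2="{width}" y2="{y:.1f}" '
            f'stroke="white" stroke-width="0.5"/>'
        )
    return lines


if __name__ == "__main__":
    for element in grid_lines():
        print(element)
```

These elements need to come after the filled areas in the SVG document, since later elements are painted on top of earlier ones.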

There. It is now easy to see the grid but it doesn’t get in your face.

I haven’t thought about the temporal scale here at all, nor about the different granularities of code coverage that we could examine (project, package, class, method, block, or line level). That is what I’ll look at next.
