Mar 17 2009

CS 519 Final Project

How do you debug a program that you can’t see? This is the problem faced by users of the growing number of applications that use machine-learning components to adapt to a user’s preferences. Each of these programs is an instance of a machine-learning classifier that has been trained on data from a particular user and either resides on that user’s machine or is designed solely for interaction with that user over a network. Common examples include email filtering software and recommendation systems, such as the one used by Amazon.com. The machine-learned program itself is accessed through a more traditional program, such as an email application, which uses the learned program to decide how incoming messages should be categorized.

Our goal is to create a visualization of the learned program that a user can interact with, allowing the user to understand both why the program makes each of its decisions and how the program can be corrected when it makes a faulty decision. Our audience consists of end users who neither know formal software debugging techniques nor understand how machine-learning systems operate. These are people who use their computers for work or leisure and are not interested in spending more than cursory time and effort to learn, for example, how to improve the accuracy of their spam filter.

Conditional Random Fields (CRFs) are among the most powerful machine-learning systems in use today. These systems excel at complex, sequential tasks such as natural language processing, and thus find themselves at the heart of many machine-learned programs. We used the logic of a CRF as the data set for our visualization. This logic consists of a set of features (words, phrases, and other identifiable aspects of the data being run through the learned program) and, for each feature, a set of numerical weights that determine its importance to each available category.
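To make the idea of features and per-category weights concrete, here is a minimal sketch in Python. It is a deliberate simplification and not our implementation: it scores each sentence independently with a linear model, whereas a real CRF also weighs transitions between the labels of neighboring sentences. The category names come from the transcript coding scheme described on this page; the feature names, weight values, and function names are hypothetical and chosen only for illustration.

```python
# Simplified, per-sentence view of CRF-style logic: every (feature, category)
# pair has a numerical weight, and a category's score is the sum of the
# weights of the features present in the sentence.
# NOTE: a real CRF also uses label-transition features; they are omitted here.

CATEGORIES = ["seeking information", "information gained",
              "information lost", "none"]

# Hypothetical weights, invented for illustration only.
WEIGHTS = {
    ("word:why", "seeking information"): 1.5,
    ("ends_with:?", "seeking information"): 2.0,
    ("word:oh", "information gained"): 1.2,
    ("word:confused", "information lost"): 1.4,
}


def extract_features(sentence):
    """Turn a sentence into a list of feature names (words plus one cue)."""
    tokens = sentence.lower().rstrip("?.!").split()
    features = ["word:" + token for token in tokens]
    if sentence.strip().endswith("?"):
        features.append("ends_with:?")
    return features


def score(sentence):
    """Sum the weights of the sentence's features for each category."""
    scores = {category: 0.0 for category in CATEGORIES}
    for feature in extract_features(sentence):
        for category in CATEGORIES:
            scores[category] += WEIGHTS.get((feature, category), 0.0)
    return scores


def classify(sentence):
    """Pick the highest-scoring category; fall back to 'none' with no evidence."""
    scores = score(sentence)
    best = max(CATEGORIES, key=lambda category: scores[category])
    return best if scores[best] > 0 else "none"


def explain(sentence, category):
    """List the features that contributed to a category, strongest first."""
    contributions = [
        (feature, WEIGHTS[(feature, category)])
        for feature in extract_features(sentence)
        if (feature, category) in WEIGHTS
    ]
    return sorted(contributions, key=lambda pair: -abs(pair[1]))
```

Under these made-up weights, `classify("Why is this cell wrong?")` selects "seeking information", and `explain` lists the question mark and the word "why" as the evidence. An explanation of this kind, naming the features that mattered and how much each one counted, is what a visualization of the learned program would need to show.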

The data used to create the visualization is a transcript of a user study. This transcript consists of the words and actions of an end user debugging a spreadsheet in Microsoft Excel. Each sentence of the transcript is assigned to one of four categories (seeking information, information gained, information lost, or none), which makes the transcript easier for researchers to analyze. Our visualization explains the logic the learned program might use to categorize each sentence of the transcript; future work would include allowing the user to adjust this logic when it results in poor classifications. This release is not connected to an actual classifier. The displayed explanations are randomly generated, but they give a useful idea of what a functional implementation would look like.
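For readers curious what a randomly generated explanation could look like, the following Python sketch produces a placeholder explanation for one sentence. This is an assumption about the general approach, not the code in the Transcript Viewer project: the function name, the dictionary layout, and the weight range of -1 to 1 are all hypothetical.

```python
import random

CATEGORIES = ["seeking information", "information gained",
              "information lost", "none"]


def random_explanation(sentence, rng, max_features=3):
    """Build a placeholder explanation with no classifier behind it.

    Picks a random category, then picks up to `max_features` words from the
    sentence and gives each one a random weight between -1 and 1.
    (Hypothetical layout, for illustration only.)
    """
    words = sentence.split()
    chosen = rng.sample(words, k=min(max_features, len(words)))
    return {
        "category": rng.choice(CATEGORIES),
        "evidence": [(word, round(rng.uniform(-1.0, 1.0), 2)) for word in chosen],
    }


# Example use: a seeded generator keeps the placeholder output repeatable.
rng = random.Random(0)
placeholder = random_explanation("I wonder why this total is wrong", rng)
```

Because the result has the same shape a real classifier's explanation would have (a category plus weighted evidence), a display built on placeholders like this could later be pointed at real output without changing how it draws.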

Transcript Viewer source code [Xcode 3.1 project, requires Mac OS X 10.5 or higher]

A screenshot of our prototype Transcript Viewer
