Overview
Signpost ended up as the name of the project that came out of my PhD work at the University of Kent. This site is meant as an overview of the project, together with some supplementary material that was included in the thesis. I make no guarantee that the work is complete; far from it. Several aspects of the project and thesis warrant further work (the debugging tools and knowledge base surveys in particular), but so far time just hasn't allowed it. If you are interested in the project, please email me (mike@bug-box.net) and I'll be pleased to discuss things further.
I'm assuming that most people who are visiting this page are doing so because they have already read the IEEE Software article on Signpost. If you have, you can probably skim this section and jump around the links to your left.
Signpost is a tool that programmers can use while they are developing code to find out about bugs affecting their code that are already known, and possibly fixed. The largest source of these bugs is vendor libraries, although they are certainly not the only one. Vendors try to inform their users by releasing knowledge bases of the bugs that they know about, but programmers often have difficulty finding relevant information because of three assumptions:
- Libraries are well tested so any bugs must be because of my code
- I must be using exactly the same keywords in searching for my bug as the person who wrote it up
- So many (or so few) hits, there's no point in trying to adjust the search terms
These assumptions frequently mislead the programmer because of the software pyramid, the keyword barrier and search term synonymy.
The software pyramid
Programs today are more reliant upon supporting code than ever before. Any program you write is inherently only as reliable as the code beneath it. This can be illustrated by the software pyramid.

If there are bugs in the lower levels of software, they may filter up and affect your program. The Intel Pentium floating point bug is a good example of this - errors in the calculation of floating point numbers caused CAD software to draw diagrams incorrectly.
The keyword barrier
Searching with a single keyword is inefficient because of the potential number of matches, so keywords are often combined using boolean operators (e.g. AND, OR, NOT). As the complexity of a query increases, the number of potential matches is reduced, but so is the probability that the results will contain relevant matches.
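The effect can be sketched with a toy example. This is only an illustration, not how Signpost or any vendor knowledge base works: the articles, the `ARTICLES` table and the `search_all` function are all made up for the purpose, and matching is a plain whole-word AND.

```python
# Toy knowledge base: article id -> summary text.
# These entries are hypothetical and exist only for illustration.
ARTICLES = {
    1: "file handle leak when opening a device",
    2: "crash when reading a text file with long lines",
    3: "document fails to load in the viewer",
    4: "floating point division returns an inaccurate result",
}


def search_all(terms):
    """Return the ids of articles containing every term (a boolean AND query)."""
    wanted = [term.lower() for term in terms]
    return sorted(
        art_id
        for art_id, text in ARTICLES.items()
        if all(term in text.lower().split() for term in wanted)
    )


# Each extra keyword narrows the result set.
print(search_all(["file"]))                       # articles 1 and 2
print(search_all(["file", "crash"]))              # article 2 only
print(search_all(["file", "crash", "document"]))  # nothing at all
```

If the article that actually describes the programmer's problem is number 3, none of these queries ever finds it: the broad query returns unrelated hits, and every refinement removes matches without bringing the relevant one any closer.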

Search term synonymy
The way that a knowledge base article is written affects the likelihood that it will be selected, because a defect can be interpreted and described in numerous ways. For example, the simple word “file” can have multiple connotations in a UNIX environment (e.g. a UNIX file may be a text file, a device or a process block). An alternative source of error is where the programmer’s perception of a problem is at fault. For example, a programmer might notice a problem when displaying certain documents and attribute this to the view controller. Making a change to the view controller fixes the problem, but the actual cause was that the document was loaded incorrectly. As long as the program continues to use a single view for this document it will appear that the defect has been fixed, but the defect’s root cause is still present and any further changes to the application could trigger the same, or an apparently different, problem.
Signpost attempts to address each of these issues by giving the programmer a context-aware mechanism for querying knowledge bases without the need for keywords, or in fact any input from the programmer.
