From Code to Design Document: A Play in Four Acts
Act One: The Code and The Motivation
Here’s the first player, our code.
% ls Repository/trunk/clementine/interfaces/gesture
CLGestureCue.cxx CLGestureCueAppearance.h CLGestureListener.cxx CLGestureParser.h
CLGestureCue.h CLGestureInterface.cxx CLGestureListener.h CLGestureTrainer.cxx
CLGestureCueAppearance.cxx CLGestureInterface.h CLGestureParser.cxx CLGestureTrainer.h
% cat CLGestureListener.h
#ifndef __CLEMENTINE_INTERFACES_GESTURE_CLGESTURELISTENER_H__
#define __CLEMENTINE_INTERFACES_GESTURE_CLGESTURELISTENER_H__
#include "CLGestureParser.h"
#include <string>
/**
* CLGestureListener is responsible for listening on the CLGestureInterface
* for a single gesture type.
*/
class CLGestureListener
{
public:
    /**
     * Instantiate the listener.
     * @param gestureName the name of the gesture
     * (and consequently, the filename to parse) that this object is created to listen for.
     */
    CLGestureListener(std::string gestureName);
... etc
It’s a whole bunch of header and implementation files for a C++ project I’m involved with. This project requires thorough documentation, but our design documents are hundreds of revisions behind our code. While it was important to go through initial designs, our code at this point is miles removed from them. It would be nice to automagically update our design document so that it stays current with what is actually in the code. The generated material will not make up our entire design document. It is meant to be used in conjunction with manual, hand-written and hand-proofed analysis of the larger architecture.
The second player is AsciiDoc, a wonderful formatting and markup system. You may know it from the Git User’s Manual. AsciiDoc affords us several advantages.
- It is in plain text, which means it can sit in source control, right alongside our code, and diffs are easily viewable.
- It is a templating language, which allows output into HTML, PDF, you name it. With enough hacking, you can produce really distinct output for webpages and PDF readers.
- It allows division of sections into different files, so authors can focus on the sections that concern them without being overwhelmed by the text they’re editing.
Act Two: The Tools
We want the comments in our code to be part of the documentation, but that’s not all. Since UML is the standard when it comes to describing software designs, it would be nice to have UML diagrams in our output as well. The ubiquitous free diagramming tool Dia accepts command line arguments that convert Dia diagrams into images. It goes like this, where image_name.png is the output file you want:
dia -t png -e image_name.png diagram_name.dia
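If many diagrams need converting, the invocation above can be wrapped in a little Ruby. This is only a sketch: the helper names dia_export_command and export_diagram are made up for illustration, and it assumes the dia executable is on your PATH.

```ruby
# Hypothetical helpers, not part of Dia, AutoDia, or Amorfus.

# Build the argument list for exporting a Dia diagram to an image,
# mirroring: dia -t png -e image_name.png diagram_name.dia
def dia_export_command(diagram_path, image_path, type = 'png')
  ['dia', '-t', type, '-e', image_path, diagram_path]
end

# Run the export. Assumes the `dia` executable is on the PATH.
# Returns true if dia exited successfully, false otherwise.
def export_diagram(diagram_path, image_path)
  system(*dia_export_command(diagram_path, image_path)) ? true : false
end
```

Passing the arguments to system as separate strings skips the shell entirely, so file names with spaces need no quoting.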
So this part will be simple. Generating the Dia diagrams from code is already taken care of for us. Aaron Trevena has written an excellent script called AutoDia which provides this functionality. All we need to do is put these pieces together, along with writing the functionality to extract the comments we want to become documentation.
Act Three: The Program
I am calling it Amorfus, because I think client-side applications should start getting into the Web 2.0 naming crazes.
How do you use it?
- Create a new CommentDocParser, with these parameters: a string pointing to the subdirectory of your trunk that you wish to document, a string containing the conceptual name of the code in that directory, and a string containing your trunk directory. You can leave this last one out, and it will default to '.'.
- Run #parse on that parser.
- Create a file handle, and output the return value of #to_asciidoc to that file.
- That’s it!
Here’s an example:
require 'amorfus'
DIRECTORY_HEADER_LEVEL = 1 # This will become the section depth, in AsciiDoc, for each individual class we find
r = CommentDocParser.new 'interfaces/gesture', 'Gesture Interface', 'Repository/trunk'
r.parse
t = File.new('output.txt', 'w')
t.write r.to_asciidoc
t.close
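To give a feel for what the section depth means in the output, here is a rough sketch of how one class might be rendered as an AsciiDoc section. The helper names asciidoc_title and class_section are assumptions for illustration, not Amorfus’s actual internals. The only things taken as given are AsciiDoc’s one-line title syntax and its image:: block macro.

```ruby
# Hypothetical sketch: these names are illustrative, not Amorfus's real internals.
DIRECTORY_HEADER_LEVEL = 1

# AsciiDoc one-line titles use (level + 1) equals signs:
# level 0 is "= Title", level 1 is "== Title", and so on.
def asciidoc_title(text, level)
  "#{'=' * (level + 1)} #{text}"
end

# Render one class as a section: title, description, optional diagram image.
def class_section(name, description, image_path = nil, level: DIRECTORY_HEADER_LEVEL)
  lines = [asciidoc_title(name, level), '', description, '']
  lines.push("image::#{image_path}[#{name} class diagram]", '') if image_path
  lines.join("\n")
end
```

Calling class_section('CLGestureListener', 'Listens for a single gesture type.', 'CLGestureListener.png') would give a level 1 section with the comment text followed by the diagram.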
What does it do in the background?
- It uses a dumb regex-based heuristic to determine if a comment has value.
- It classifies methods by what it can find of their name.
- It generates Dia diagrams whenever it can find an object.
- When generating images, it parses Dia diagrams using Nokogiri and removes irrelevant objects from that diagram, so that only the featured object shows up in the image.
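The first two steps can be sketched roughly as follows. This is not the actual Amorfus code: the regex, the three-word threshold, and the method categories are all assumptions picked to illustrate the idea, and the names extract_doc_comments, valuable_comment? and classify_method are made up.

```ruby
# Hypothetical sketch of the background steps; the real Amorfus rules may differ.

# Pair each /** ... */ block with the first line of the declaration that follows it.
DOC_COMMENT = %r{/\*\*(.*?)\*/\s*([^\n]*)}m

def extract_doc_comments(source)
  source.scan(DOC_COMMENT).map do |body, declaration|
    # Strip the leading "*" decoration from each comment line.
    text = body.lines.map { |l| l.sub(/^\s*\*?\s?/, '').strip }.reject(&:empty?).join(' ')
    { comment: text, declaration: declaration.strip }
  end
end

# A crude value heuristic: keep comments with at least three words
# that are not just TODO/FIXME notes.
def valuable_comment?(text)
  text.split.size >= 3 && text !~ /\A(TODO|FIXME|XXX)\b/i
end

# Classify a method from its name alone.
def classify_method(method_name, class_name)
  case method_name
  when class_name             then :constructor
  when "~#{class_name}"       then :destructor
  when /\A(get|is|has)[A-Z_]/ then :accessor
  when /\Aset[A-Z_]/          then :mutator
  else :other
  end
end
```

Run against CLGestureListener.h, extract_doc_comments would pair the class comment with the line class CLGestureListener and the constructor comment with its declaration.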
Finally, we’ll need to make a small patch to AutoDia to make it recognize structs as equally valid objects. This is extremely hackish (see the Epilogue), but it manages to work for now.
--- Autodia-2.03.orig/lib/Autodia/Handler/Cpp.pm 2005-04-15 08:02:49.000000000 -0400
+++ Autodia-2.03/lib/Autodia/Handler/Cpp.pm 2009-04-15 01:10:46.000000000 -0400
@@ -72,7 +72,7 @@
 $i++;
 # check for class declaration
- if ($line =~ m/^\s*class\s+(\w+)/)
+ if ($line =~ m/^\s*(?:class|struct)\s+(\w+)/)
 {
 # print "found class : $line \n";
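The effect of the pattern change can be checked in isolation. Below, the class-only pattern and the class-or-struct pattern are transcribed into Ruby (the regex syntax used here is the same in both languages). The helper declared_name and the struct name CLGesturePoint are made up for the demonstration.

```ruby
# The two patterns from the AutoDia C++ handler, transcribed into Ruby.
ORIGINAL_DECLARATION = /^\s*class\s+(\w+)/
PATCHED_DECLARATION  = /^\s*(?:class|struct)\s+(\w+)/

# Hypothetical helper: return the type name declared on a line,
# or nil if the line does not match the pattern.
def declared_name(line, pattern)
  match = line.match(pattern)
  match && match[1]
end
```

With the original pattern, declared_name('struct CLGesturePoint {', ORIGINAL_DECLARATION) is nil, while the patched pattern captures the name. Both patterns also match forward declarations such as class Foo; which is part of why this counts as a hack.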
Act Four: The Results
The resultant HTML looks like this:

From our previous example, this can be generated on the command line like so:
% asciidoc --unsafe -a data-uri output.txt
The --unsafe and -a data-uri flags allow AsciiDoc to embed the images you’ve created directly into the HTML, instead of referencing them externally. This makes the document self-contained, in a sense. You can omit these flags if you wish. In that case, the generated <img> tags will reference the image files externally.
Epilogue: Warnings
This functionality was hacked together in about two hours. It does horrible things to Dia’s XML documents, it guesses at what the current spec for AsciiDoc is, and it requires a hastily applied patch to a third-party tool, AutoDia, in order to accomplish its goals. Because of its nature, I can’t make any guarantees as to how effectively it will work, what circumstances it will work under, and the like. If you have any suggestions or improvements, I implore you to fork the code (currently hosted at this Gist) and see what you can come up with. For example:
- I don’t even think it recognizes variables correctly. You might want to fix this.
- It ignores public:, private:, and protected:. You might want to fix this, but it is meant for design documents, so all methods should be documented.
- If a class has more than one constructor, only one will be displayed in the documentation. You might want to fix this.