Practical Design Patterns in C# – State

The purpose of the state design pattern is to allow an object to alter its behaviour when its internal state changes.

The example of a logging function below is a model candidate for conversion into a state implementation.

The requirements of this function are as follows.

  1. Write the log entry to the end of a text file by default.
  2. The file size is unlimited by default, denoted by setting the value of the maximum file size to 0.
  3. If the value of the maximum file size is greater than 0 then the log file is archived when it reaches this size, and the entry is written to a new empty log file.
  4. If the recycle attribute is enabled and the maximum file size is set to a value greater than 0, then the oldest entry in the log file is erased to make space for the new entry instead of archiving the entire log file.

The implementation of the actual mechanism for writing to disk is not relevant in this example. Rather, the focus is on creating a framework that enables selecting the correct mechanism based on the preferences set.

public void Log(string message)
{
    if (LimitSize)
    {
        if (file.Length < MAX_SIZE)
        {
            // Write to the end of the log file
        }
        else
        {
            if (Recycle)
            {
                // Clear room at the top of the log file
            }
            else
            {
                // Create a new log file with current time stamp
            }

            // Write to the end of the log file
        }
    }
    else
    {
        // Write to the end of the log file
    }
}

The implementation shown above is very naïve, tightly coupled, rigid and brittle. However, it is also a fairly common occurrence in most code bases due to its straightforward approach to addressing the problem.

The complicated logical structure is difficult to decipher, debug and extend. The tasks of determining file size limits, recovering disk space, creating a new file and writing to the file are all discrete and should have no dependencies between them. Yet they are all interwoven in this approach, adding a huge cognitive overhead to understanding the execution path.

There are multiple conditions to evaluate before the log entry is made. The programmer has to simulate the computer’s execution path for each statement before even arriving at the important lines related to writing to disk. A couple of blocks are outright duplicates. It would be nice if the execution path could be streamlined and the duplicate code removed.

Restructured Code

An eager approach to selecting the preferred mechanism breaks up the conditional statements into manageable chunks. It is logically no different from having the same code in the Log method, but moving it into the mutator makes it easier to understand by adding context to the logic rather than handling it all in a single giant function.

public int MaxFileSize
{
    get
    {
        return _maxFileSize;
    }

    set
    {
        _maxFileSize = value;

        if (0 == MaxFileSize)
        {
            // Activate the unlimited logger
            return;
        }

        if (Recycle)
        {
            // Activate the recycling logger
        }
        else
        {
            // Activate the rotating logger
        }
    }
}

public bool Recycle
{
    get
    {
        return _recycle;
    }

    set
    {
        _recycle = value;

        if (0 == MaxFileSize)
        {
            // Recycling is not applicable when file sizes are unlimited
            return;
        }

        if (Recycle)
        {
            // Activate the recycling logger
        }
        else
        {
            // Activate the rotating logger
        }
    }
}

The comments indicate that the code that actually performs the operation is still missing. This is the spot where the behaviour is selected and applied based on the state of the object. We use C# delegates to alter the behaviour of the object at runtime.

public delegate void LogDelegate(Level level, string message);

public LogDelegate Log
{
    get;
    private set;
}

The Logger class contains private methods that match the signature of the LogDelegate.

private void AppendAndLog(Level level, string message)
{
    // Write the entry to the end of the current log file.
}

private void RotateAndLog(Level level, string message)
{
    // Archive the full log file, then write the entry to a new empty file.
}

private void RecycleAndLog(Level level, string message)
{
    // Erase the oldest entries to make room, then write the new entry.
}

When the condition evaluation is completed and the logging method has to be selected, a new instance of the LogDelegate is created, pointing to the correct logging method.

if (Recycle)
{
    Log = new LogDelegate(RecycleAndLog);
}
else
{
    Log = new LogDelegate(RotateAndLog);
}

This separates the selection of the correct logging technique to use from the implementation of the technique itself. The logging method implementations can be changed at will to fix bugs or extend their features without adding risk to breaking the rest of the code.
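
Putting the pieces together, a minimal sketch of the complete Logger class might look like the one below. The Level enum and the SelectLogger helper are illustrative assumptions – the article leaves them undefined – and SelectLogger simply consolidates the conditional logic shown earlier in the two mutators.

public enum Level { Debug, Info, Warning, Error }    // assumed for illustration

public delegate void LogDelegate(Level level, string message);

public class Logger
{
    private int _maxFileSize;
    private bool _recycle;

    public LogDelegate Log { get; private set; }

    public Logger()
    {
        // File size is unlimited by default, so start with the plain appender.
        Log = new LogDelegate(AppendAndLog);
    }

    public int MaxFileSize
    {
        get { return _maxFileSize; }
        set { _maxFileSize = value; SelectLogger(); }
    }

    public bool Recycle
    {
        get { return _recycle; }
        set { _recycle = value; SelectLogger(); }
    }

    // Selects the logging behaviour based on the current state of the object.
    private void SelectLogger()
    {
        if (0 == MaxFileSize)
        {
            // Recycling and rotation are not applicable to unlimited files.
            Log = new LogDelegate(AppendAndLog);
            return;
        }

        Log = Recycle
            ? new LogDelegate(RecycleAndLog)
            : new LogDelegate(RotateAndLog);
    }

    private void AppendAndLog(Level level, string message)
    {
        // Write the entry to the end of the current log file.
    }

    private void RotateAndLog(Level level, string message)
    {
        // Archive the full log file, then write the entry to a new empty file.
    }

    private void RecycleAndLog(Level level, string message)
    {
        // Erase the oldest entries to make room, then write the new entry.
    }
}

Client code sets the preferences and then invokes the delegate as though it were an ordinary method: logger.MaxFileSize = 1048576; logger.Recycle = true; logger.Log(Level.Info, "Started");. The conditions are evaluated once, when the preferences change, instead of on every call to Log.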

What’s in a Name?

Naming has been acknowledged in many texts as a difficult aspect of practical computer programming. The problem with assigning appropriate names to entities is that it is not quite as cut and dried as most other aspects of the craft. The rules are not very objective. Choosing a good name requires applying a certain amount of heuristics drawn from experience and intuition.

Good names are easy to overlook because they stay out of the way. But a smart programmer can learn much about the kind of naming schemes to avoid simply by having to maintain code with poor quality naming conventions. Badly named variables can make this task so difficult that there might be some merit in the idea of exposing programmers to such a maintenance exercise specifically to teach them this valuable skill.

We look over a simple example of how poor naming conventions inhibit the design, and conversely, how a well chosen name can make short work of extending the application in the future.

Wheel buildWheel(int spokeCount);

This signature declares a method that takes an integer spoke count as input and returns a Wheel instance. Fred calls the method in the following manner.

Wheel front = buildWheel(32);

This gets flagged by his pair programmer Dave for using a magic number. So Fred changes the code.

const int THIRTY_TWO = 32;

Wheel front = buildWheel(THIRTY_TWO);

It isn’t the most elegant snippet they have encountered. But both Fred & Dave agree that this code no longer uses a magic number. So they commit the code and go home.

Two years later, the specifications change. The client can now build lightweight wheels with just 16 spokes. Fred is no longer on the team. Susan, the maintenance programmer, takes up the task and inspects the code. She finds the line where the number of spokes is declared and edits it.

const int THIRTY_TWO = 16;

Wheel front = buildWheel(THIRTY_TWO);

Technically, the code is fixed. But it has deteriorated a bit more in quality and maintainability.

A few months later, the client sees resurgent demand for high-spoke-count wheels, prized for their durability and resilience on bad surfaces. So they wish to make the number of spokes a customisable option that can be either 32 or 16. Two weeks into the modification, they also release a new 12-spoke wheel. The code now looks like this.

const int THIRTY_TWO = 16;
const int THIRTY_TWO_TRUE = 32;
const int TWELVE = 12;

Wheel front;
switch (wheelType)
{
    case DURABLE:
        front = buildWheel(THIRTY_TWO_TRUE);
        break;
    case LIGHTWEIGHT:
        front = buildWheel(THIRTY_TWO);
        break;
    case ULTRA_LIGHTWEIGHT:
    default:
        front = buildWheel(TWELVE);
        break;
}

This is a very minor, localized problem in the code. It has relatively negligible impact on the performance of the application. Dozens of minor warts such as this fester in any sizeable code base. Programmers are human too, and make mistakes sometimes. Maybe the deadline inched too close and they had to roll out right that minute. The key is to identify and fix those code smells as soon as they’re encountered. Queuing them up for a grand refactoring project risks turning these minor issues into full-blown architectural challenges.

Things often get trickier when such names are used in giant mudballs with extremely wide scope, such as singleton objects or global buckets. Nobody wants to read the code surrounding all 28 references to the constant THIRTY_TWO, yet without doing so, making the change would be extremely hazardous. So THIRTY_TWO continues to live on in the code base, forever doomed to contain just half of what its name denotes.

This type of problem can be nipped in the bud very easily by simply spending a moment thinking about the names of objects. In this case, the method signature itself provides a hint for the correct name of the parameter – spokeCount. Then Fred’s iteration would have looked like this.

const int SPOKE_COUNT = 32;
 ...

When Susan comes in to add two new spoke count options, she realizes that the constant SPOKE_COUNT already exists. This nudges her into defining two more constants – LOW_SPOKE_COUNT and MID_SPOKE_COUNT. If Susan uses a reasonably modern IDE, it also allows her to rename the old constant to HIGH_SPOKE_COUNT.

const int HIGH_SPOKE_COUNT = 32;
const int MID_SPOKE_COUNT = 16;
const int LOW_SPOKE_COUNT = 12;
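
With these names in place, the selection logic from earlier becomes self-documenting. A sketch, reusing the hypothetical wheelType switch and buildWheel method from the previous snippets:

Wheel front;
switch (wheelType)
{
    case DURABLE:
        front = buildWheel(HIGH_SPOKE_COUNT);
        break;
    case LIGHTWEIGHT:
        front = buildWheel(MID_SPOKE_COUNT);
        break;
    case ULTRA_LIGHTWEIGHT:
    default:
        front = buildWheel(LOW_SPOKE_COUNT);
        break;
}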

Suddenly, by simply choosing a name that reflects the purpose of the entity rather than its value, the application programmer has made its intended usage clearer and nudged future maintenance programmers towards using similarly appropriate names in the new code that they write.

A Guide to Effective Version Control

We have come a long way since 2008, when I wrote this introductory article about version control systems. Their rising popularity in the past few years is a notable sign of maturity in the software development community. Once the preserve of only “serious” development operations, they are now used regularly even by app developers working alone.

In this article, I talk about how a team can use a version control system to its advantage by describing a preferred way to store the files in the repository and some processes to complement the structure.

There are myriad ways of organizing a version control repository. It is a sophisticated file system, after all. A well-structured repository can be a valuable asset to a software team: it creates clearly defined workspaces for programmers, opens up the opportunity to build a continuous integration pipeline, and standardizes deployment.

Repository Layout

In my preferred layout, the trunk is used for the main development line. In addition, branches are created for releases and feature sandboxes. This style of organization is loosely based on trunk-based development.

The Trunk

All the primary development of the project occurs in this directory. The trunk is further organized into sub-directories for every project that constitutes the final product. For example, a product might consist of separate web and desktop components. Or it might have a dependency on a separate static or dynamic library. The exact arrangement of this is determined by the programmers and release engineers working together. All these projects reside side-by-side in the trunk.

A feature or bug fix is considered open until its code has been committed to the trunk. Hence, all developers have access to this directory and must make it a regular part of their daily work.

(Figure: the trunk directory at the root of the repository)

Maintenance Branches

Once the product reaches a milestone and is deemed ready for release, the release engineer is responsible for generating a binary and deploying the product. But the first step is to make a maintenance branch. This branch is a snapshot of the trunk that can be used by developers to provide bug-fixes to an already deployed product without accidentally releasing new features.

It is named according to whatever convention the team decides (I prefer a numeric x.y scheme).

(Figure: a maintenance branch alongside the trunk)

If bugs are discovered in the product post release, they are patched in the trunk and merged into the active maintenance branch. Other developers, meanwhile, can safely continue adding new features on the trunk without inadvertently bringing them into the current deployed versions.

A bug-fix release is made out of the maintenance branch after a bug is patched.

It is important that bug fixes should originate in the trunk and later be merged into the branch rather than the other way around. If the fix is made in the branch and the developer forgets to merge it back into the trunk, it can cause a regression later down the line. By patching in the trunk, the developer uses the process to automatically ensure that patches exist in both places.

Tags

Release binaries should always be made out of a clean export from the repository. But which revision should one export? If you’re creating a maintenance branch, you could use its HEAD revision. But you’re out of luck if you need to generate a previously released build again after subsequent edits have been made to the branch. Unless you tagged the branch before generating a binary.

A tag is a branch that everyone agrees never to edit. It is given a name that is unique among its siblings. My preference is to go with an x.y.z scheme, where the x and y components of the name come from the maintenance branch from which the tag derives its lineage.

(Figure: a tag alongside the trunk and maintenance branches)

The z component of the tag name is set to 0 for the first tag made from a maintenance branch, and is incremented for every subsequent tag made out of that branch. For example, the first release from a 2.1 branch is tagged 2.1.0, and its first bug-fix release 2.1.1.

Using the Repository

Once the structure is ready, some processes need to be put in place in order to utilize it. The typical developer role has it easy – keep committing error-free code into the trunk, and make sure the working directory is updated from the trunk as often as possible.

Feature Releases

A new feature release begins when one of the managers green-lights it. The repository is locked down temporarily and a maintenance branch and a tag are created from the head revision of the trunk. These two copies are named according to the version convention in place for the project.

This maintenance branch supersedes all previously generated branches, which may even be deleted from the repository. The repository is then unlocked and developers can go back to making commits in the trunk.

Maintenance

When bugs are discovered in a released product, the developers investigate in the trunk and attempt to fix them there. Once a fix is made, they notify the release engineer, who cherry-picks the relevant commits and merges them into the current maintenance branch.

The maintenance branch is tagged after it is deemed ready for release. The binary generation and deployment process remains the same – clean export from the newly created tag, followed by compilation and deployment.

If a bug cannot be located in the trunk, the developer may have to look for it in the maintenance branch and fix it if it is found there. In this case, the bug fix may be merged back into the trunk if deemed necessary.

Release Generation

The release engineer’s work is not done yet. The release binaries are yet to be created from the newly created tag. For this, the tag is exported from the repository into an empty directory. This is important so that no unnecessary files are accidentally added into the binaries. This step cannot be emphasized enough.

All binaries must be generated from a clean and complete export from the repository.

Generating binaries from partially exported directories runs a high risk of including incorrect or outdated files in the binary. Such inaccuracies can be the source of impossible-to-reproduce bugs or regressions. At the minimum, it is a sign of a poorly managed process.

Once the binary is generated, it is packaged to its final consumption form – an archive or installer for distribution – then deployed to wherever it is needed (such as a file server or web server).

The binaries and installers may be added into the repository for posterity.

Nothing Is So Simple That It Cannot Be Difficult

Long years in the software industry have conditioned me to dread last-minute feature additions. Something as innocuous as a mailer subscription form can turn into a minefield of security and privacy concerns if done incorrectly. In Peopleware, Tom DeMarco and Timothy Lister describe how managing software projects is less a technical challenge and more a maze of social interactions. Keeping clients happy while convincing them of the intensive behind-the-scenes work behind the simplest of features takes a lot of people skills.

Incomplete or inaccurate specifications can complicate the problem even further. This can be a result of the specifications being created by inexperienced people, or people with little or no understanding of UX or software. This becomes particularly insidious because the presence of a specification (even if it is only superficial) lulls stakeholders into thinking that the project is under control. But all sorts of bad things begin to happen once the rubber meets the road. Designers work towards an incomplete layout. Engineers make an incomplete product, which results either in customer dissatisfaction or time and cost overruns.

A seemingly simple feature such as pagination requires quite a bit of programming just to make it work at all. More code has to be written to consider common cases such as representation of and navigation through large data sets. Usability issues have to be taken into account, such as device specifics to handle different input systems such as touch screens, keyboards and mouse clicks.

Product requirements often begin with a very vague idea of what is required. “A blog application needs comments”, says the product manager. So the engineering team gets to work and churns out a comment form and a display mechanism. But nobody expected Jeff Atwood to deploy the software on his website, which regularly draws 3,000 comments within the first 10 minutes of a post going up. Now nobody can open his website any more, because the database server chokes on retrieving that many comments for a million simultaneous visitors (yes, even if they are flat discussions).

After Jeff posts a rant on his website, the product manager eats his hat and gets back to writing a specification for comment pagination, as he should have done to begin with. This is his design from the first draft of the specification.

Previous | 1 | 2 | 3 | 4 | 5 | Next

The engineering lead looks at the design and notices that it is still incomplete. The Previous and Next buttons will not do anything useful if the visitor is already on the first or last page. So those links will have to be disabled as necessary. The revised design looks like this.

Previous | 1 | 2 | 3 | 4 | 5 | Next    (Previous disabled on the first page)
Previous | 1 | 2 | 3 | 4 | 5 | Next    (Next disabled on the last page)

But what happens when there are more than 5 pages of comments on the post? If they keep increasing the page count, eventually it will have to be wrapped to the next line or made to run off the screen, either of which looks very ugly.

Previous | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 | 13 | Next

Somebody suggests letting users type in the page number in a text field that also doubles up as the current page indicator.

First | [      ] | Last

While this is easy, it is not very user friendly. How many pages of comments do we have in all? What happens if the visitor enters too large a number or a non-numeric value? Visitors on mobile devices probably will not be very pleased to have to type very frequently.

After yet another round of discussion, the team finally decides that visitors are probably not interested in viewing every comment on the blog. Yet, showing the total number of comments would be desirable for Jeff because it is a measure of his popularity. Fitness models measure bicep circumference. Geeks measure blog comments.

Somebody suggests the following user interface, which seems to fit the bill.

1 | 2 | 3 | 4 | 5 ... 274
1 ... 65 | 66 | 67 | 68 ... 274
1 ... 270 | 271 | 272 | 273 | 274

The first and last page links are always displayed. The rest of the links are dedicated to displaying the current page number and a range of values around it. For the end user – the website visitor – the control is very simple and easy to learn at a glance.
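
As a rough illustration of how little code the final design needs, here is a C# sketch that renders such a pager. The Render method, its parameter names and the two-page window are all assumptions made for this example, not anything prescribed by the discussion above.

using System;
using System.Collections.Generic;

public static class Pagination
{
    // Renders a pager such as "1 ... 65 | 66 | 67 | 68 ... 274".
    // 'radius' is the number of page links shown on each side of the current page.
    public static string Render(int currentPage, int totalPages, int radius = 2)
    {
        if (totalPages <= 1)
        {
            return "1";
        }

        // The window of pages around the current page, clamped so that it
        // never overlaps the always-visible first and last page links.
        int windowStart = Math.Max(2, currentPage - radius);
        int windowEnd = Math.Min(totalPages - 1, currentPage + radius);

        var parts = new List<string> { "1" };

        if (windowStart > 2)
        {
            parts.Add("...");   // gap between the first page and the window
        }

        for (int page = windowStart; page <= windowEnd; page++)
        {
            parts.Add(page.ToString());
        }

        if (windowEnd < totalPages - 1)
        {
            parts.Add("...");   // gap between the window and the last page
        }

        parts.Add(totalPages.ToString());

        // Join with pipes, then drop the pipes around the ellipses to match
        // the "1 ... 65 | 66" style shown above.
        return string.Join(" | ", parts).Replace(" | ... | ", " ... ");
    }
}

Calling Pagination.Render(66, 274) produces "1 ... 64 | 65 | 66 | 67 | 68 ... 274", matching the shape of the mock-up above.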

Of course, this entire exercise is moot if it is done without proper usability testing with relevant target audiences in mind. Formal testing methods are often inaccessible to small teams, but hallway tests can still provide feedback and result in a much better product.

Director Done Right – Why Adobe Must Promote AIR

When I leapt headfirst into multimedia development a decade ago, there was just one go-to product for all interactive needs – Macromedia Director. With origins in the vastly successful MacroMind VideoWorks application, Director had a strong pedigree in multimedia production. Regular updates of the product over more than a decade had converted the original VideoWorks into a development environment with an object-oriented programming language of its own called Lingo. It also supported external plug-ins for added features, ran on Windows and Macintosh computers and supported all media formats of the day – plain text, images, sound and video. The greatest plus point of Director was its ability to create a projector out of the Director file, which was an executable file that bundled the player and the binary file along with any plugins that were used. The projector was guaranteed to work on any system without requiring any additional libraries (except maybe audio and video codecs).

A new age, a new champion

Then the whole web thing happened and Director could not keep pace any more. Macromedia shipped Xtras to support more media formats such as hypertext and Flash, and to make network requests. They also shipped the Shockwave browser plug-in to run a Director file in the browser. But it was no match for the Flash plug-in, which was a fraction of the download – a weight that mattered in the dial-up era. Chinks in the development environment began to take their toll. The biggest flaw of Director was that it bundled media and scripts in its working file into a single binary blob saved to disk. This meant that version control tools could not diff the scripts against previous revisions. Developers could not collaborate on the same project without jumping through significant hoops. The files would sometimes get damaged and become unreadable if a power failure or other system breakdown occurred while the file was being saved. There was no easy way to create and reuse libraries. And the eccentric syntax of the Lingo language was a black mark on its developers, who could never gain any repute among people who used “real” programming languages with curly braces.

When Flash began to gain a foothold in the market because of its reach and ease of use, multimedia developers moved to the new platform in a steady trickle over the years, until Director became a niche production environment. The situation gave Macromedia a chance to right the problems of Director on a clean slate. And they capitalized upon it in stellar fashion. ActionScript quickly morphed from a simple scripting language with a handful of constructs into a full-blown development platform that supported complex object hierarchies and most OOP principles, with a rich API for tasks ranging from media control to network access, all packaged inside a runtime that was tiny and already available on most internet-enabled computers. And it had curly braces.

A web of entanglement

Yet the biggest drawback of Flash was its sandbox, which prevented developers from doing anything beyond the boundaries of the browser. Local file system access was out. Native nested windows were out. Drag and drop was not available. Launching native processes was out. Projector files were slightly more forgiving, allowing read-only local file access and limited spawning of new processes. But this was nowhere close to the unrestricted freedom offered by Director projectors. In itself, this was not bad most of the time. But when more serious functionality was needed, ActionScript developers had to depend upon another programming language to get the job done. To be fair, Macromedia, and now Adobe, have always done their best to make cross-language interop as easy as possible. We have come a long way from the getURL() calls required in ActionScript 1 to today’s ExternalInterface API. But a dependency meant having to support yet another codebase.

Breaking free

It was heartening then to hear of the Adobe AIR announcement some years ago. Finally, Flash applications could be written to take advantage of local facilities of the system along with the existing media and internet APIs. It was now possible to read and write local files in a meaningful manner. Network utilities could be a lot more proactive and efficient. Applications could update themselves transparently and could support richer desktop paradigms such as taskbar icons and drag-and-drop interactions.

There still isn’t the complete, unfettered access to the system that Director projectors enjoyed, and this limitation will remain given today’s different security landscape. Being able to sell AIR as a secure replacement for web applications is highly desirable for Adobe in the face of HTML 5 and the advancing capabilities of browsers. Combined with all the other benefits that Flash offers, the reduced development time over other desktop development environments, and instant cross-platform compatibility, this is an insignificant bump.

The only drawback that AIR has compared to Director is that it requires a separate download of the AIR runtime. At 11 MB, that is not a significant cost these days. But it is a deal breaker for users without administrative rights. There really should be a way to package the runtime and the application into a single executable.

HTML is not the only future

There still are naysayers who predict that the Flash platform is dead, and that all that is needed is wider acceptance and support for HTML 5. But with support for HTML and JavaScript built into AIR (with WebKit, no less), Adobe already has that base covered. And the web is not the be-all and end-all of computing. While having a social networking application online makes sense, personal accounting or mail applications are better kept local for security and accessibility reasons. And desktop applications have the unique ability to morph into hybrids that work locally as well as with remote services, which in some cases is better than a web-only application. Most people already use a hybrid application: their mail client. New mail can be downloaded and read when connected, while previously downloaded messages remain accessible offline.

Sure, the W3C is working on specifications for similar functionality in HTML 5. But who said choice is bad? Given the sorry state of browser standardization even today, it is more likely that each browser publisher will implement the spec in some half-assed manner, causing subtle differences between them and requiring someone to invent a shim to fix the incompatibilities. And HTML 5 does not address all the features that AIR supports, including hardware accelerated 3D and filter effects, native menus, file extension registration, drag and drop, advanced sound and video APIs, and windowing support beyond the most rudimentary.

In spite of all its advantages, AIR has failed to make major inroads into desktop development. The blame lies squarely with Adobe for failing to promote it well enough. Outside ActionScript developer circles, there is rarely any mention or knowledge of the platform. Adobe really ought to pick up steam on that front and corner the market the way Director and Flash did in their time.