A Spreadsheet of Posts for WordPress

When Microsoft was working on Excel during the early nineteen nineties, their research indicated that many people did not use spreadsheets for the mathematical tasks (such as financial planning, bookkeeping or simulations) which were the original goals of the spreadsheet genre. Instead, they were used for making lists – grocery shopping, planning parties for their kids, logging project tasks, so on. Microsoft took this research into account and reoriented the functional specifications to make list-making much easier than the competition and previous versions of their own product. Joel Spolsky, who was program manager for Microsoft Excel during this period, states the following.

When we were designing Excel 5.0, the first major release to use serious activity-based planning, we only had to watch about five customers using the product before we realized that an enormous number of people just use Excel to keep lists. They are not entering any formulas or doing any calculation at all! We hadn’t even considered this before. Keeping lists turned out to be far more popular than any other activity with Excel. And this led us to invent a whole slew of features that make it easier to keep lists: easier sorting, automatic data entry, the AutoFilter feature which helps you see a slice of your list, and multi-user features which let several people work on the same list at the same time while Excel automatically reconciles everything.

The Process of Designing a Product, Joel on Software

If you abstract away the specifics of the application, you can easily see that the spreadsheet grid is a one-table database. Having multiple tables can be much more efficient, but a single table is easier to understand for people who do not use databases regularly.

The simplicity of spreadsheets as databases goes beyond just reading.

Adding columns? A database engine needs, at the least, the column name and its data type. In a spreadsheet all you need to do is place the cursor over the first empty cell in the first row and type its name. Filtering is one of the simpler tasks in a database, but spreadsheets even make that easier by providing a dropdown of valid values to choose from. Even inserting new records is much easier in a spreadsheet as compared to a database engine. And more importantly, the grid is a good way of capturing an aggregate view of the entire database and making wholesale changes to groups of records quickly.

A spreadsheet application can be the subject of some really grimy IT horror stories that can involve file shares, multiple users and overwritten records. But if put into a proper data validation framework, the spreadsheet interface is robust and easy to use. Many database management clients use the it to represent their table interface, even for multiple relationships.

Extending to multiple dimensions comes at the cost of some loss of elegance when representing one-to-many relationships. While this UI works, it likely marks the limits of a spreadsheet interface. Nested relationships quickly become difficult to understand. And each nested level can represent only one relationship. It comes as no surprise then that so many database-oriented applications (not the management clients themselves) eschew the spreadsheet interface in favour of a form-driven approach.

But I still think that the spreadsheet interface is vastly underused on the web. As I have mentioned above, it does work for simpler collections of records, such as articles in a content management system or user lists.

Spreadsheet of Posts is a WordPress plugin that aggregates all posts into a spreadsheet interface. While WordPress is a fairly simple to use product for even laypersons, the Post creation process leaves a bit to be desired. The sheer amount of shuffling back and forth between the All Posts and the Edit Posts screen takes away a lot of time. This is similar to the REP loop that interpreted languages provide. The visual difference between two screens is bad enough. To that, the delay caused by network activity, fetching the same data repeatedly from the database, regenerating almost identical markup each time and then rendering it again on the client side increases the effort of getting into the flow. Any boredom while performing such a task is justified.

Spreadsheet of Posts attempts to address these shortcomings by putting all the posts into a grid. Edits are made in-place on the grid itself. New records are inserted by filling in a blank row. Records are deleted by selecting any cell and hitting delete. The record that corresponds to the selected row is deleted. The plugin uses standard page-based navigation to cycle through records that do not fit into a single page. It also provides an alternative to display all the records in a single page. Cell heights can be varied to fit the content, or be set to a fixed constant.

The underlying grid is the Hands on Table jQuery grid editor. While there are more functionally complete alternatives, the interactions and appearance of Hands on Table come closest to replicating Excel. Even the bare bones demo on their home page evokes a lot of familiarity to Microsoft’s product. This is a big necessity to make users more comfortable with Spreadsheet of Posts. The rest of the back end of the plugin uses the standard WordPress framework.

Spreadsheet of Posts is still work in development. A current version can be downloaded from here.

How to Write Unmaintainable Code – PHP Redux

How to Write Unmaintainable Code is a satirical piece of programming advice written by Roedy Green a long time ago. That it remains his most popular essay till date speaks volumes about the programming skills of human society as a whole. Collectively, we are either a massive bunch of shit-chucking apes who take this essay seriously, or a small bunch of highly skilled developers seeking relief from daily disillusionment in our work behind its biting humour.

I hope to hell that it’s the latter.

Green’s essay is written with Java and C programmers in mind, but many of the techniques described can be applied to any programming language. Consider this essay as a ‘lite’ version of Green’s to keep us script kiddies from feeling left out.

Write code. Or markup. Or both. Together.

PHP has a very unique ability of allowing the programmer to intersperse program code with HTML markup in the same file by using special PHP tags to identify code blocks. And as with all things PHP, there is more than one way to skin a cat.

<? echo “Hello, World!”; > // What is referred to as the ‘old style’.

<? echo “Hello, World!”; ?> // What is referred to as the ‘first new style’. Note the added question mark in the end tag.

<?php echo “Hello, World!”; ?> // The ‘second new style’ is similar to the first new style and most recommended

<script language=”php”>
    echo “Hello, World!”; > // What else? The ‘third new style’ of tags
</script>

<% echo “Hello, World!”; %> // And finally, a fourth way that uses ASP-style tags

This in itself opens whole new avenues for abuse for a programmer used to the sharper separation of code and presentation enforced in a more serious language. What’s worse is that the tag rules can be changed on individual systems through the php.ini file. Develop with all style flags enabled. Tell your colleagues that you’re doing it in the interest of maximum compatibility with third-party libraries. For added fun, find (or write your own) library that liberally switches between all five types of tags and make it an indispensible part of your project. Or keep switching coding standards every few weeks until there’s a healthy mix of all types of PHP tags scattered throughout your code. Refuse to replace previously used styles with the current favourite because it’s better to leave well enough alone.

Do not document this anywhere.

This is in the interest of avoiding single-step deployments on a stock (or sane) PHP installation. Force system administrators to find the problem in your code and then comb through the php.ini file for the appropriate settings. Always say that you have security in mind, which is encouraged by having the system administrator read each line of the settings file.

But the ability of the language to intersperse HTML markup even inside if…else or loop blocks evidences the sheer the heinousness that must have enveloped the language designers when they wrote that bit of the specification.

There are so many ways to abuse this ‘feature’.

A tamer way would be –

<?php if (foo == null): ?>
    <p> Invalid foo</p>
<? elseif (foo == 1): ?>
    <p>Foo is too small.</p>
<? else ?>
    <p>Correct foo entered</p>
<?php endif ?>

A more advanced programmer might use something like this –

<div class=”query_output”>
    <?
        $sql=”select * from names”;
        if (db_connect()): ?>
            <p>Error!</p>
        <?php else: ?>
            <p><strong>Names</strong></p>
        <?
            $res = db_query($sql);
            $names = db_result($res);
            foreach ($names as $names): ?>
                <!--span><?php echo $name ?></span-->
                <li class=”name”><? echo $name ?></li>
        <? endforeach; ?>
    <? endif; ?>
</div>

By closely coupling the business logic with the presentation layer, your manager will have to assign a qualified programmer to make even innocuous changes to the front-end code rather than assign them to a design trainee. Male lions in the wild mark their territory by pissing on trees. Programmers should mark theirs by pissing all over the front-end markup.

Reinvent the wheel. Poorly

If there’s one thing that PHP is famous for, it’s the rich selection of extremely useful, but badly named functions in its libraries and standard extensions. Make things worse by ignoring the in-built functions entirely and rolling your own instead. Feign ignorance, which is a pretty valid excuse when talking about PHP because it has extensions for everything you might dream about, and some things you might have never even heard of (can you spell Swish-e?).

Write your functions in the most naïve and inefficient manner possible. You get bonus points if it is error prone due to subtle variations in the input.

For example, PHP has an in-built strtotime() function that can convert ‘about any English textual datetime description into a Unix timestamp’. Ignore it. Take a date/time string and split it into its component values by using hardcoded delimiters. Don’t take regional differences into account. Everybody should just be using ISO 8601 anyways.

Naming is the key

Begin programming before the domain model is fully ready or understood. That way you get to create your own colourful terminology for objects which haven’t been named yet. For example, suppose you’re building a web-based interface for a controller in a bakery. Look up the thesaurus for the term ‘mould’ and use an unrelated synonym such as ‘die’ in place of the mould. Repeated instances of the word ‘die’ will make the maintenance programmer wonder if you wanted to pass a subliminal message on to him through your code.

Take this to the next level by using different naming conventions for the same element. If your database field stores the machine operator’s name in a ‘username’ field, store the value in a variable called ‘user_machine_op_curr’, ‘machine_operator_curuser’, ‘machop_current’ and ‘machine_operator_current_user’ in different places where you need the value.

For maximum effect, instead of encapsulating usage of this variable in a single module, apply the RYE principle to litter the code across all the files. Suppose you need a widget to display the active username at the top of the page. Write the query and code to do this on every page with subtle variations in field names and joining random bits of data from unrelated or unneeded tables. The maintenance programmer will have no idea whether the query really needs the other fields, which will dissuade him from refactoring the code into a single module.

Tell everyone who asks that you are caching data for later use in order to improve performance.

Code formatting is hard. Let’s go shopping.

Randomize whitespace rules in your project. Use tabs in some places, spaces in another, or you can even mix and match both on the same line. Aggressively prevent any kind of structure from appearing in the code through indentation. Nested if…else blocks or loops are best for this kind of camouflage. The unwary programmer will not notice a nested loop if it is not indented further than its parent. Compound this complexity with the ability to enter and exit PHP code blocks at random locations within the file.

Repeat Yourself Everywhere (RYE)

Functions and objects are for weenies. Duplicating large chunks of badly formatted code all over the project puts hair on your chest. Even if you are a girl.

This is the ultimate abuse of a maintenance programmer in any language. But by effectively combining the previous principles, PHP gives this technique the potential to become a lifelong nightmare of bugs which are hard to find, fix or test.

Putting on the Pounds

By diligently following all the pointers given above you can be sure to reach file sizes of magnanimous proportions. The ultimate goal is to make a file large enough to timeout the version control server while a network operation is in progress. Make enough of those and the maintenance programmer will be caught between a hard place and a rock. On one hand, there is the fragility of the code which requires frequent commits to keep from making too big a change. On the other there are constant network errors while trying to commit changes to version control.

Database Abuse

PHP’s ultimate utility is its excellent data-processing pedigree. Numerous extensions have been written to enable connecting PHP with different database servers. And the one common thing that all database publishers extol is to consolidate query operations through the use of joins for fewer disk operations and lower network transfer.

Ignore them.

Instead, run multiple queries – one to retrieve the master table and the rest while cycling through the result set to retrieve values from the secondary table, using the current record as the selection criteria. If you have enough values in the master table, this will slow your server down enough to frustrate the end user. Use the techniques mentioned above to obfuscate and entangle your queries with unrelated code to make it resistant to change.

For added joy, create a new connection at random places in your code to the same database server. This will confuse the casual maintenance programmer into thinking that you’re actually retrieving information from two databases, and hence justify you not using a join.

Summary

Perform enough of these activities in a single project and you can be assured of months of job security. The best part about working in a language like PHP is that you don’t have to worry about making it not look maintainable. Legions of bad programmers have given the language such a bad rep that people almost expect PHP code to look bad.

Which kind of makes this whole essay moot, though.

The AMP Tango

I have recently moved to a smaller city that can’t quite boast of the public transport system that Mumbai provides to its citizens. It does exist on paper, and when you go out you do have the option of trying to locate a public utility bus or train. But that task is so difficult that most people cave in to the temptation of flagging an auto rickshaw. Even if it costs up to ten times the price of a bus.

Which brings us to another problem that the people of Mumbai never have to face – auto rickshaw drivers often quote a fixed price to take you to your destination. That figure doesn’t always correspond to what the meter might have indicated had it been used. So you can either yield to their demands while feeling frustrated at being robbed in broad daylight, or haggle with them to bring them down to what you might consider a fair price. And if you don’t like haggling too much, the experience isn’t going to be very pleasant.

I spent the afternoon haggling with some rather brilliant code, trying to coax it to work as it was advertised it would. I’m documenting the efforts I went through out here so that I can reduce some of the agony the next time I have to do this.

A Brief Background

Developers and software managers everywhere laud the power and elegance of the Apache, MySQL and PHP stack. Several commercial outfits have built their entire business model around them. I have been going through the ritual of installing these three on Windows for a long time now and yet, I stumble every time I try to integrate them.

Installing Apache is a cinch. The developers have packaged it into a very nice installer that sets it up on your system within minutes. The default settings are fine enough for almost anybody who does not want to play system administrator. And you don’t even need to restart your system in most cases.

Integrating PHP on top of Apache used to be a bit of a task, but lately it’s been becoming easier. I had to add 3 lines to the Apache httpd.conf file to get it to work. There is a huge dependency on the PHP initialization file, php.ini, but the risk is reduced because the default settings in that file too are sufficient for a typical developer machine.

MySQL too comes with its own installer that takes away most of the pain from the setup process.

The Problem

But this is where the ease of use disappears. Making MySQL work with the Apache and PHP stack always seems to be a pain. Since I don’t do this very often, I can’t remember the integration recipe very well. I’m usually not even sure if I faced the same problems the last time around – much less the resolution.

This article is a guideline for me to follow the next time I have to go through installing these applications. The problem I faced and its resolution is still fresh in my mind since I only stepped through it this afternoon.

It all started when I attempted to load the php_mysqli.dll extension to integrate MySQL into the stack. For extensions to load, PHP needs to be given the path to the folder where the .dll files are stored. The default location is the “ext” folder under the PHP root folder. This path is defined in the variable “extensions_dir” in php.ini.

For some reason the path was not interpreted correctly by PHP on start-up. The Apache error log kept showing the following error –

PHP Warning: PHP Startup: Unable to load dynamic library ‘E:\\apps\\php-5.2.3\\ext\\php_mysqli.dll’ – The specified module could not be found.\r\n in Unknown on line 0

The double backslashes in the path baffled me because the value in php.ini was definitely not typed that way. It is a common convention to escape certain characters by placing the backslash before it, so I felt that this was mangling the path.

Resolutions

I searched the web and found that many people have faced this issue. The solutions proposed were varied.

Some suggested that the library file libmysql.dll bundled with PHP 5.2.3 was buggy. I replaced it with the recommended earlier version (from PHP 5.2.1) but the error still persisted. Others suggested that the extensions that were needed could be moved to the System32 folder, so that they would be found by the PATH system variable. But I refused to corrupt my OS folders with such unnecessary files.

The obvious other solution was to add the extensions folder to my system path, which I duly did. But to my surprise the extension still didn’t load. I checked and rechecked that the folder was included in the system path.

By chance I happened to run a script in my browser that called the phpinfo() function. Browsing through the values, I came upon the Apache Environment section that described how Apache interacted within the OS. And that’s where I found the problem. For some reason, even after a restarting the service, Apache continued to maintain the old system variables in its memory. Because of this, PHP didn’t know where to search for extensions and kept returning the same error.

Restarting Windows solved the problem.

I still haven’t figured out why the “extensions_dir” variable doesn’t work as expected. I have tried various combinations of front- and backslashes, besides experimenting with several relative and absolute path names without any luck. Backslashes are always escaped in the error log, although front slashes are left as-is.

In all, I learnt that restarting your machine does often solve problems – not through magic but through simply realigning different portions of memory that may be connected in some obscure manner. Not bad for an afternoon.

WAMP Installation Checklist

For the benefit of those struggling with the kind of problems I faced, here’s the complete checklist to follow when installing the AMP stack on Windows.

  1. Download and install Apache
  2. Download the PHP archive and unzip into a folder of your choice
  3. Add the following line to httpd.conf depending upon the version of Apache installed -LoadModule php5_module “E:/apps/php-5.2.3/php5apache2.dll”
    or
    LoadModule php5_module “E:/apps/php-5.2.3/php5apache2_2.dll”
  4. Add the following line to httpd.conf -AddType application/x-httpd-php .php
  5. Add the following line to httpd.conf -PHPIniDir “E:/apps/php-5.2.3”
  6. Add the PHP folder to the Windows PATH variable.
  7. Restart your system after completing the last step to make sure that Apache refreshes the environment values. For some reason this does not happen simply by restarting Apache.
  8. Download and install MySQL.
  9. Change the extension_dir variable in php.ini to point to the complete path to the extensions directory.
  10. Add the following line to php.ini -extension=php_mysqli.dll
  11. In some situations phpMyAdmin also requires the php_mcrypt.dll and php_mbstring.dll extensions enabled. If needed, uncomment their loader lines in php.ini
  12. Copy libmcrypt.dll to %SYSTEM% folder if you need the php_mcrypt.dll extension to work.