How to Write Unmaintainable Code – PHP Redux

How to Write Unmaintainable Code is a satirical piece of programming advice written by Roedy Green a long time ago. That it remains his most popular essay till date speaks volumes about the programming skills of human society as a whole. Collectively, we are either a massive bunch of shit-chucking apes who take this essay seriously, or a small bunch of highly skilled developers seeking relief from daily disillusionment in our work behind its biting humour.

I hope to hell that it’s the latter.

Green’s essay is written with Java and C programmers in mind, but many of the techniques described can be applied to any programming language. Consider this essay as a “lite” version of Green’s to keep us script kiddies from feeling left out.

Write code. Or markup. Or both. Together.

PHP has a very unique ability of allowing the programmer to intersperse program code with HTML markup in the same file by using special PHP tags to identify code blocks. And as with all things PHP, there is more than one way to skin a cat.

<? echo "Hello, World!"; > // What is referred to as the "old style".

<? echo "Hello, World!"; ?> // What is referred to as the "first new style". Note the added question mark in the end tag.

<?php echo "Hello, World!"; ?> // The "second new style" is similar to the first new style and most recommended

<script language="php">
    echo "Hello, World!"; > // What else? The "third new style" of tags
</script>

<% echo "Hello, World!"; %> // And finally, a fourth way that uses ASP-style tags

This in itself opens whole new avenues for abuse for a programmer used to the sharper separation of code and presentation enforced in a more serious language. What’s worse is that the tag rules can be changed on individual systems through the php.ini file. Develop with all style flags enabled. Tell your colleagues that you’re doing it in the interest of maximum compatibility with third-party libraries. For added fun, find (or write your own) library that liberally switches between all five types of tags and make it an indispensable part of your project. Or keep switching coding standards every few weeks until there’s a healthy mix of all types of PHP tags scattered throughout your code. Refuse to replace previously used styles with the current favourite because it’s better to leave well enough alone.

Do not document this anywhere.

This is in the interest of avoiding single-step deployments on a stock (or sane) PHP installation. Force system administrators to find the problem in your code and then comb through the php.ini file for the appropriate settings. Always say that you have security in mind, which is encouraged by having the system administrator read each line of the settings file.

But the ability of the language to intersperse HTML markup even inside if…else or loop blocks evidences the sheer the heinousness that must have enveloped the language designers when they wrote that bit of the specification.

There are so many ways to abuse this “feature”.

A tamer way would be –

<?php if (foo == null): ?>
    <p> Invalid foo</p>
<? elseif (foo == 1): ?>
    <p>Foo is too small.</p>
<? else ?>
    <p>Correct foo entered</p>
<?php endif ?>

A more advanced programmer might use something like this –

<div class="query_output">
    <?
        $sql="select * from names";
        if (db_connect()): ?>
            <p>Error!</p>
        <?php else: ?>
            <p><strong>Names</strong></p>
        <?
            $res = db_query($sql);
            $names = db_result($res);
            foreach ($names as $names): ?>
                <!--span><?php echo $name ?></span-->
                <li class="name"><? echo $name ?></li>
        <? endforeach; ?>
    <? endif; ?>
</div>

By closely coupling the business logic with the presentation layer, your manager will have to assign a qualified programmer to make even innocuous changes to the front-end code rather than assign them to a design trainee. Male lions in the wild mark their territory by pissing on trees. Programmers should mark theirs by pissing all over the front-end markup.

Reinvent the wheel. Poorly

If there’s one thing that PHP is famous for, it’s the rich selection of extremely useful, but badly named functions in its libraries and standard extensions. Make things worse by ignoring the in-built functions entirely and rolling your own instead. Feign ignorance, which is a pretty valid excuse when talking about PHP because it has extensions for everything you might dream about, and some things you might have never even heard of (can you spell Swish-e?).

Write your functions in the most naïve and inefficient manner possible. You get bonus points if it is error prone due to subtle variations in the input.

For example, PHP has an in-built strtotime() function that can convert “about any English textual datetime description into a Unix timestamp”. Ignore it. Take a date/time string and split it into its component values by using hard-coded delimiters. Don’t take regional differences into account. Everybody should just be using ISO 8601 anyways.

Naming is the key

Begin programming before the domain model is fully ready or understood. That way you get to create your own colourful terminology for objects which haven’t been named yet. For example, suppose you’re building a web-based interface for a controller in a bakery. Look up the thesaurus for the term “mould” and use an unrelated synonym such as “die” in place of the mould. Repeated instances of the word “die” will make the maintenance programmer wonder if you wanted to pass a subliminal message on to him through your code.

Take this to the next level by using different naming conventions for the same element. If your database field stores the machine operator’s name in a “username” field, store the value in a variable called “user_machine_op_curr”, “machine_operator_curuser”, “machop_current” and “machine_operator_current_user” in different places where you need the value.

For maximum effect, instead of encapsulating usage of this variable in a single module, apply the RYE principle to litter the code across all the files. Suppose you need a widget to display the active username at the top of the page. Write the query and code to do this on every page with subtle variations in field names and joining random bits of data from unrelated or unneeded tables. The maintenance programmer will have no idea whether the query really needs the other fields, which will dissuade him from refactoring the code into a single module.

Tell everyone who asks that you are caching data for later use in order to improve performance.

Code formatting is hard. Let’s go shopping.

Randomize whitespace rules in your project. Use tabs in some places, spaces in another, or you can even mix and match both on the same line. Aggressively prevent any kind of structure from appearing in the code through indentation. Nested if…else blocks or loops are best for this kind of camouflage. The unwary programmer will not notice a nested loop if it is not indented further than its parent. Compound this complexity with the ability to enter and exit PHP code blocks at random locations within the file.

Repeat Yourself Everywhere (RYE)

Functions and objects are for weenies. Duplicating large chunks of badly formatted code all over the project puts hair on your chest. Even if you are a girl.

This is the ultimate abuse of a maintenance programmer in any language. But by effectively combining the previous principles, PHP gives this technique the potential to become a lifelong nightmare of bugs which are hard to find, fix or test.

Putting on the Pounds

By diligently following all the pointers given above you can be sure to reach file sizes of magnanimous proportions. The ultimate goal is to make a file large enough to timeout the version control server while a network operation is in progress. Make enough of those and the maintenance programmer will be caught between a hard place and a rock. On one hand, there is the fragility of the code which requires frequent commits to keep from making too big a change. On the other there are constant network errors while trying to commit changes to version control.

Database Abuse

PHP’s ultimate utility is its excellent data-processing pedigree. Numerous extensions have been written to enable connecting PHP with different database servers. And the one common thing that all database publishers extol is to consolidate query operations through the use of joins for fewer disk operations and lower network transfer.

Ignore them.

Instead, run multiple queries – one to retrieve the master table and the rest while cycling through the result set to retrieve values from the secondary table, using the current record as the selection criteria. If you have enough values in the master table, this will slow your server down enough to frustrate the end user. Use the techniques mentioned above to obfuscate and entangle your queries with unrelated code to make it resistant to change.

For added joy, create a new connection at random places in your code to the same database server. This will confuse the casual maintenance programmer into thinking that you’re actually retrieving information from two databases, and hence justify you not using a join.

Summary

Perform enough of these activities in a single project and you can be assured of months of job security. The best part about working in a language like PHP is that you don’t have to worry about making it not look maintainable. Legions of bad programmers have given the language such a bad rep that people almost expect PHP code to look bad.

Which kind of makes this whole essay moot, though.

Difference Between a Pointer and a Pointer-to-pointer in C

C passes variables by value. So if you pass 2 ints x and y to the function square(), it creates a copy of both variables, populates them with the same value as x and y and hands them over to the function to process.

int square(int x, int y)
{
  return x * y;
}

int main(void)
{
  square(a, b);
}

The same thing happens with pointers. A char *a points to location 0x000000 (or NULL) in memory. The variable is then passed to a function initString(). Now a copy of the pointer is created, which also points to the value 0x000000 and is processed by the initString() function. Within initString(), new memory is allocated using malloc() to the variable a.

void initString(char *x)
{
  x = malloc(6 * sizeof(char));
  sprintf(x, "%s", "Hello");
}

int main(void)
{
  char *a = NULL;
  initString(a);

  printf("%s\n", a);
}

Output

(null)

What happens in this case is that a fresh pointer variable is created, which happens to point at the same location as a, which is initially NULL. Then memory is allocated to this pointer in the initString() function, so it begins to point to the location returned by malloc(). When the function returns, the x variable is destroyed (and the memory allocated to it is now unreferenced, which creates a memory leak). In the meantime, a is still pointing to NULL.

Now let us change the signature of the initString() function.

void initString(char **x)
{
  *x = malloc(6 * sizeof(char));
  sprintf(*x, "%s", "Hello");
}

int main(void)
{
  char *a = NULL;

  initString(&a);
  printf("%s\n", a);
}

Output

Hello

In this case, a is still pointing to NULL. However, instead of passing a directly, initString() is handed over a reference to the pointer a – a pointer to a pointer. This causes a pointer variable to be created which points to the memory location of a. Now, the dereference operator (*) can be used to locate the address of a. Thus the memory address returned by malloc() can be assigned to a.

This method is not needed if you have already allocated memory to a char array in the calling function. This is because the pointer parameter in the callee function also refers to the same memory address allocated in the caller function.

Here’s a quick working example that demonstrates the differences between a pointer and a pointer-to-pointer when passed on to a function.

#include <stdio.h>
#include <stdlib.h>

void initString1(char *a)
{
  a = malloc(6 * sizeof(char));
  strcpy(a, "Hello");
  printf("initString1():\t%s\t%p\t%p\n", a, a, &a);
}

void initString2(char **a)
{
  *a = malloc(6 * sizeof(char));
  strcpy(*a, "World");
  printf("initString2():\t%s\t%p\t%p\n", *a, *a, &*a);
}

int main(int argc, int *argv[])
{
  char *b = NULL;

  printf("Function |\tValue\tPoints to\tAddress\n", b, b, &b);
  initString1(b);
  //strcpy(b, "World"); // Crashes
  printf("main() :\t%s\t%p\t%p\n", b, b, &b);

  initString2(&b);
  strcpy(b, "Hello");
  printf("main() :\t%s\t%p\t%p\n", b, b, &b);

  return 0;
}

The AMP Tango

I have recently moved to a smaller city that can’t quite boast of the public transport system that Mumbai provides to its citizens. It does exist on paper, and when you go out you do have the option of trying to locate a public utility bus or train. But that task is so difficult that most people cave in to the temptation of flagging an auto rickshaw. Even if it costs up to ten times the price of a bus.

Which brings us to another problem that the people of Mumbai never have to face – auto rickshaw drivers often quote a fixed price to take you to your destination. That figure doesn’t always correspond to what the meter might have indicated had it been used. So you can either yield to their demands while feeling frustrated at being robbed in broad daylight, or haggle with them to bring them down to what you might consider a fair price. And if you don’t like haggling too much, the experience isn’t going to be very pleasant.

I spent the afternoon haggling with some rather brilliant code, trying to coax it to work as it was advertised it would. I’m documenting the efforts I went through out here so that I can reduce some of the agony the next time I have to do this.

A Brief Background

Developers and software managers everywhere laud the power and elegance of the Apache, MySQL and PHP stack. Several commercial outfits have built their entire business model around them. I have been going through the ritual of installing these three on Windows for a long time now and yet, I stumble every time I try to integrate them.

Installing Apache is a cinch. The developers have packaged it into a very nice installer that sets it up on your system within minutes. The default settings are fine enough for almost anybody who does not want to play system administrator. And you don’t even need to restart your system in most cases.

Integrating PHP on top of Apache used to be a bit of a task, but lately it’s been becoming easier. I had to add 3 lines to the Apache httpd.conf file to get it to work. There is a huge dependency on the PHP initialization file, php.ini, but the risk is reduced because the default settings in that file too are sufficient for a typical developer machine.

MySQL too comes with its own installer that takes away most of the pain from the setup process.

The Problem

But this is where the ease of use disappears. Making MySQL work with the Apache and PHP stack always seems to be a pain. Since I don’t do this very often, I can’t remember the integration recipe very well. I’m usually not even sure if I faced the same problems the last time around – much less the resolution.

This article is a guideline for me to follow the next time I have to go through installing these applications. The problem I faced and its resolution is still fresh in my mind since I only stepped through it this afternoon.

It all started when I attempted to load the php_mysqli.dll extension to integrate MySQL into the stack. For extensions to load, PHP needs to be given the path to the folder where the .dll files are stored. The default location is the “ext” folder under the PHP root folder. This path is defined in the variable “extensions_dir” in php.ini.

For some reason the path was not interpreted correctly by PHP on start-up. The Apache error log kept showing the following error –

PHP Warning: PHP Startup: Unable to load dynamic library ‘E:\\apps\\php-5.2.3\\ext\\php_mysqli.dll’ – The specified module could not be found.\r\n in Unknown on line 0

The double backslashes in the path baffled me because the value in php.ini was definitely not typed that way. It is a common convention to escape certain characters by placing the backslash before it, so I felt that this was mangling the path.

Resolutions

I searched the web and found that many people have faced this issue. The solutions proposed were varied.

Some suggested that the library file libmysql.dll bundled with PHP 5.2.3 was buggy. I replaced it with the recommended earlier version (from PHP 5.2.1) but the error still persisted. Others suggested that the extensions that were needed could be moved to the System32 folder, so that they would be found by the PATH system variable. But I refused to corrupt my OS folders with such unnecessary files.

The obvious other solution was to add the extensions folder to my system path, which I duly did. But to my surprise the extension still didn’t load. I checked and rechecked that the folder was included in the system path.

By chance I happened to run a script in my browser that called the phpinfo() function. Browsing through the values, I came upon the Apache Environment section that described how Apache interacted within the OS. And that’s where I found the problem. For some reason, even after a restarting the service, Apache continued to maintain the old system variables in its memory. Because of this, PHP didn’t know where to search for extensions and kept returning the same error.

Restarting Windows solved the problem.

I still haven’t figured out why the “extensions_dir” variable doesn’t work as expected. I have tried various combinations of front- and backslashes, besides experimenting with several relative and absolute path names without any luck. Backslashes are always escaped in the error log, although front slashes are left as-is.

In all, I learnt that restarting your machine does often solve problems – not through magic but through simply realigning different portions of memory that may be connected in some obscure manner. Not bad for an afternoon.

WAMP Installation Checklist

For the benefit of those struggling with the kind of problems I faced, here’s the complete checklist to follow when installing the AMP stack on Windows.

  1. Download and install Apache
  2. Download the PHP archive and unzip into a folder of your choice
  3. Add the following line to httpd.conf depending upon the version of Apache installed -LoadModule php5_module “E:/apps/php-5.2.3/php5apache2.dll”
    or
    LoadModule php5_module “E:/apps/php-5.2.3/php5apache2_2.dll”
  4. Add the following line to httpd.conf -AddType application/x-httpd-php .php
  5. Add the following line to httpd.conf -PHPIniDir “E:/apps/php-5.2.3”
  6. Add the PHP folder to the Windows PATH variable.
  7. Restart your system after completing the last step to make sure that Apache refreshes the environment values. For some reason this does not happen simply by restarting Apache.
  8. Download and install MySQL.
  9. Change the extension_dir variable in php.ini to point to the complete path to the extensions directory.
  10. Add the following line to php.ini -extension=php_mysqli.dll
  11. In some situations phpMyAdmin also requires the php_mcrypt.dll and php_mbstring.dll extensions enabled. If needed, uncomment their loader lines in php.ini
  12. Copy libmcrypt.dll to %SYSTEM% folder if you need the php_mcrypt.dll extension to work.