Friday, March 23, 2007

Coding By Dogma

I learned lots of things this week, but the most important thing I learned is that coding by dogma is a widely followed practice. Who knew? I didn't. And it really took me by surprise.

It all started with my post, "Who Needs a Database?", which despite its provocative title was not anti-database in nature. Rather, it was a post about shirking convention and building something that works by using simple techniques and simple code. The end result was a two-page website that works exactly the way it was designed to. The sticking point for many people was that it used a few flat files instead of a database.

There were a few things about the reactions that took me by surprise. The first was that many readers left comments insinuating that my post was "anti-database". The second was that the subject of not using a database utterly, completely and totally (I know, those adverbs are redundant - I'm trying to make a point here) polarized people. People either thought it was the most stupid thing in the world or they thought it was great.

In fact, some of the backlash was so vehement that JimboJones took it upon himself to post a link to the article on http://www.thedailywtf.com/. For those of you who haven't been to thedailywtf, being linked there is like the kiss of death. To see a reference to your code on thedailywtf is one of the most shameful things that can happen to a developer.

A curious thing happened when I saw the link to my code on thedailywtf and the laugh fest that followed. I entered a state of deep calm when I saw that there are developers who code by dogma. I understood then what made people so angry.

Coding By Dogma

Even though I had clearly written that databases are standard pieces of any architecture for good reasons, nobody seemed to care. They couldn't see past the fact that I WASN'T USING A DATABASE!

Even though I clearly explained that based on the requirements for my site (requirements that I came up with, for a site that was all mine, not a client's), there was no need for a database and I could accomplish exactly what I set out to do with just a few flat files, they still shouted "BUT YOU AREN'T USING A DATABASE!".

Even after I wrote a follow-up post explaining that people were missing the point (essentially: if the site does exactly what it's designed to do, is lean and simple, and took me just a few hours to put together, then what's the problem with using a few flat files?), they still shouted, "THE PROBLEM IS THAT YOU DIDN'T USE A DATABASE AND BECAUSE YOU DIDN'T YOU MADE SOMETHING TERRIBLE!"

Ok, so nobody actually shouted at me, but it sure felt like it.

So why did my responses make those guys even angrier?

I think it's because many of us code by dogma. When presented with a problem (say, building a simple site exactly like mine) we automatically say: "ok, first, we know we're going to need a database, so let's do that".

In the vast majority of cases we'd be right. What I tried to do was challenge that convention. I didn't do it senselessly, however; I did it in a context where not using a database was a fine choice. But context doesn't matter to us when we code by dogma. We can't see past not using a database. In our minds, there is no situation that might warrant using a flat file over a database. That's coding by dogma.

This discussion is by no means new. I'd like to wrap up here with a few examples.

Example 1:

radar.oreilly.com



Gabe (of memeorandum.com) wrote: "I didn't bother with databases because I didn't need the added complexity... I maintain the full text and metadata for thousands of articles and blog posts in core. Tech.memeorandum occupies about 600M of core. Not huge."

Mark (of bloglines.com) wrote: "The 1.4 billion blog posts we've archived since we went on-line are stored in a data storage system that we wrote ourselves. This system is based on flat files that are replicated across multiple machines, somewhat like the system outlined in the Google File System paper."


My goodness, these are fairly well known companies. I mean, Google uses databases AND flat files...

Make sure you read some of the comments. You'll hear a lot of the same rhetoric there that you'll see in the comments left here and on thedailywtf.com.


Example 2:

Joel Spolsky got massive backlash when he wrote about Wasabi, the custom language his company developed. When I first heard about it I thought it was the most stupid thing in the world. The more I thought about it though, the more I realized why he did what he did.

Joel says:



In most deployed servers today, the lowest common denominators are VBScript (on Windows), PHP4, and PHP5 (on Unix). If we try to require anything fancier on the server, we increase our tech support costs dramatically. Even though PHP is available for Windows, it's not preinstalled, and I don't want to pay engineers to help all of our Windows customers install PHP. We could use .NET, but then I'd have to pay engineers to install Mono for all our Unix customers, and the .NET runtime isn't quite ubiquitous on Windows servers.

Since we don't want to program in VBScript or PHP4 or even PHP5 and we certainly don't want to have to port everything to three target platforms, the best solution for us is a custom language that compiles to our target platforms.



When we don't pay attention to the context - to the real problem we're trying to solve - we fail to see that what Joel did was smart. Hard, but smart.

What made some people angry was that Joel WROTE HIS OWN LANGUAGE! To most people, that's just a programming exercise, something you'd do in college, not something you'd ever want to do in "real life".

Well, that's just code dogma.

Thanks for reading!

12 comments:

Anonymous said...

I think this is an incredible post. From my own experience, it addresses a fundamental problem that is far too common in software development. For the sake of argument, let's divorce the idea of "code dogma" from John's previous post. Some of you think it was a bad idea for him to avoid using a database... I get it. Instead, let's discuss this idea more generally.

The failure to challenge conventions--and I mean conventions of all types, not just in the coding world--is quite literally the antithesis of innovation. This is true both in the development of necessary and functional projects and in terms of the exploration of ideas for purely academic purposes. I am in no way saying that we need to reinvent useful and advanced tools with every new project. What I want to ask is this: how are we better off by simply accepting that the way we have always done things--the way we initially learned to do them--is the *most* effective and efficient approach?

Charles Tilly--the prolific social thinker who has written several books about how change takes place in society--describes the process by which we form decisions and address the questions that arise in our lives. He says that we rely on four basic reason-giving devices for answering these questions: codes, conventions, stories, and technical accounts. All of these mechanisms serve the purpose of replicating what we already know about the world, rather than challenging the status quo.

We use "codes"--the written and unwritten rules that dictate how we are "supposed" to act--to justify our compliance with the way things have always been done. "Conventions" are the higher-level concepts in our society that help explain normative values through simple axioms ("why reinvent the wheel?"). "Stories" personalize and simplify broad changes in history (or, as in this case, within our specific field) and shape our understanding of these events in a way that is both accessible to and reinforcing of our common beliefs. Consider, for example, the Steve Jobs/Apple or the Bill Gates/Microsoft stories. They are parables of how the little guys challenged the behemoth that was IBM with simple start-up companies that they initially ran from their garages. How different are those stories from "David and Goliath" or "the Battle of Agincourt"? Finally--and this is a form of reason-giving that most of you know especially well--we use "technical accounts" in situations where an individual's specific knowledge of tools and processes allows "expert opinion" to be accepted without scrutiny, and the decisions of such individuals to be treated as obvious and unquestionably true. After all, these people are the "experts," right?

Any experienced software developer can point to a dozen examples where each of these reasons has been given for why a specific technology was chosen to build a solution:

CODES: "It says in this O'Reilly book that stored procedures are the best way to access complex queries in SQL Server."
CONVENTIONS: "A distributed thin-client solution is always better than relying on software applications that stand alone on specific machines."
STORIES: "There's no reason you can't finish this project by the end of the week. When Paul and I started this company we worked 18 hour days and sacrificed everything so we would never miss a deadline."
TECHNICAL ACCOUNTS: "By leveraging the language-independence of .Net our C# and VB developers will be able to continue writing code in their native development environments. This will save our company massive amounts of money in re-training costs, while simultaneously bridging the gap between our legacy software and our 2.0 implementation."

In many cases, these various reasons reflect what we've learned about the development field over the course of IS's relatively brief history. For a lot of standard applications, such solutions save time and make sense. There really is no reason to reinvent the wheel if all you ever want to do is roll your turnips around in your cart. That said, if you're the engineer tasked with developing safer future models of cars with better road-handling and greater gas mileage for Audi, you may very well want to consider reinventing the wheel.

Tilly's classification of reason-giving types isn't intended simply to show how conventional wisdom supports the status quo. He also wants to illustrate how all major innovations must, in some way, break free of or redefine these accepted answers and common solutions. If no one had challenged the status quo in the microprocessing field in the 1990s, symmetric multiprocessing would never have been a possibility, and there would have been a point where the conventional approach to miniaturization (excluding, of course, the other innovation of nanotechnology) ultimately hit a brick wall. The same can be said for the development of web-based information solutions. Without developers eager to challenge the existing system, Tim Berners-Lee's system of document sharing over distributed networks would have been little more than a networked extension of hypertext.

It's not hard to find examples from your own work as well... that is, unless you are too caught up in the supposed infallibility of your own technical expertise, the way things have always been done, or a fundamentalist belief in your own form of "code dogma."

My point is simple: don't criticize that which is different. Without people like the guy who runs this blog--people who are willing to ask "why have we always done this and is there a better way?"--you'd all still be typing your way through the next level of Blasteroids on your Commodore 64. While nostalgia for such simpler times may be attractive, 25 years of perspective makes it pretty clear just how much that would suck.

John said...

Holy cow - that's quite a comment!
While I appreciate your comparison to me and David and Goliath (or Agincourt!) - I don't want to place TOO much importance on my blog post.

My intention wasn't to provide a brilliant alternative to using databases. All I did was uncover a situation where people took sides.

That said, man, I really appreciated the time you took to comment!

Anonymous said...

Heh. Yeah, I do have a tendency to prattle on...

You'll notice that I said:

"For the sake of argument, let's divorce the idea of "code dogma" from John's previous post."

What I mean is that I think the idea that you've raised is what's important. This concept of dogmatically doing what we've always done and refusing to consider alternatives is a huge problem. Complacency breeds stagnation.

As someone who spent close to a decade in the development field, the concept of "code dogma" really rings true to me.

Anonymous said...

Dogma exists at many levels, too. From the smug comment "My code is self documenting, I don't need comments" to the mindless "Microsoft wrote it, so I won't use it."

I have become increasingly aware and frustrated by this reality myself lately.

Good post.

brongondwana said...

When my now wife and I were travelling through Europe I wanted to set up something we could use as an online diary so people could see what we were doing. Also, I wanted to have a calendar showing where we were planning to be, and be able to update these things.

Nothing out there looked like what I wanted, so I wrote my own. The main design goal was "easy to update from anywhere". So I chose emails.

Basically, send an email to a specific address and it would be delivered into a Maildir on my server. The delivery would also trigger a script to be run that opened up the directory, read each email and applied the sum of all the instructions, e.g.

Subject: diary 7 Oct

blah blah blah

Or
Subject: location 7 Oct

London

There was more than that - but the basic concept was so simple. We even sent in an email from a device on a London street designed to allow people to send a basic email using a touch screen, and it worked fine.

If I had access to ssh or even just webmail I could delete things that shouldn't be in there, clean up, whatever - the cron job would update the site (I rebuilt it every day just in case)...

if _anything_ failed, the output wouldn't replace the current site, and the current site was purely static files. Nothing that could break.
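
For readers who want to see the shape of that pipeline, here is a minimal Python sketch of the idea described above. It is not the original script: the function names (`collect_entries`, `render`, `rebuild`), the single-page text output, and the decision to reject any subject other than "diary" or "location" are all assumptions made for illustration. It reads every message in a Maildir, folds the instructions into one set of entries, and only swaps the new page into place if every step succeeded.

```python
import mailbox
import os
import tempfile


def collect_entries(maildir_path):
    """Read every message and return {(kind, date): body}.

    Assumes plain-text, single-part emails whose subject looks like
    'diary 7 Oct' or 'location 7 Oct' (an illustrative convention).
    """
    entries = {}
    box = mailbox.Maildir(maildir_path, create=False)
    for msg in box:
        subject = (msg["Subject"] or "").strip()
        kind, _, date = subject.partition(" ")
        if kind not in ("diary", "location") or not date:
            # Any bad message aborts the whole rebuild.
            raise ValueError(f"unrecognised subject: {subject!r}")
        entries[(kind, date)] = msg.get_payload().strip()
    return entries


def render(entries):
    """Turn the collected entries into one static text page."""
    lines = []
    for (kind, date), body in sorted(entries.items()):
        lines.append(f"{date} [{kind}] {body}")
    return "\n".join(lines) + "\n"


def rebuild(maildir_path, output_path):
    """Rebuild the page; the old page survives if any step raises."""
    page = render(collect_entries(maildir_path))
    out_dir = os.path.dirname(os.path.abspath(output_path))
    # Write to a temporary file in the same directory, then rename it
    # over the live page in a single step.
    fd, tmp_path = tempfile.mkstemp(dir=out_dir)
    try:
        with os.fdopen(fd, "w") as tmp:
            tmp.write(page)
        os.replace(tmp_path, output_path)
    except BaseException:
        os.unlink(tmp_path)
        raise
```

A cron job or a mail delivery hook could call `rebuild("/path/to/Maildir", "/path/to/site/index.txt")` (both paths hypothetical). Because the live file is only touched by the final rename, a malformed email or a full disk leaves the current static page exactly as it was.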


It worked flawlessly the entire time, which makes it a success in my book, strange unconventional design notwithstanding. Go the antidogma. Foow!

John said...

Haha bron:

Nice story. :) You know, who the heck needs "enterprise software" especially when it's something just for you?

Thanks for the comment - it was funny.

John said...

David, that's a great point, too. I interviewed someone recently who said that he won't use anything written by Microsoft because "I don't work well with Microsoft products".

That's just crazy and senseless. (Wait, is senseless the same thing as crazy?...)

Anonymous said...

Thank you.

I'm guilty of coding by dogma, and it's something I'd like to stop. It's good to be reminded occasionally that how we're doing things isn't necessarily the best solution.

Anonymous said...

I work as a consultant at a Fortune 15 company with a huge legacy mainframe (OS/390) installation. If you think flat-files are for light-weights, guess again - these guys run something like $70 billion per year of business through systems which use a mix of flat files and VSAM* files. It's not always pretty or flexible, and my role there is to agitate for migration to newer technologies, but...the stuff works!

I appreciate the post. I also appreciate the comment from the person who wrote their own "update my blog via email" application. From FogBugz to Salesforce.com to (closer to home) an enterprise-critical SAP integration process that I wrote, email is sometimes the best way to get data into your app, either because it's the only way to get data through a firewall, or because it's the most ubiquitous of protocols.

The fact that every language out there has support for file operations (even most commercial versions of SQL - hah!) should be a clue that sometimes a database is not the answer.

There are lots of other places you might want a file-based system, including embedded devices, client-based apps trying to keep a light footprint, etc. Case in point, Firefox uses flat-file storage for such things as Bookmarks.

*RTFM

Jordan Wilberding said...

Who is this guy, and who cares what he thinks or does?

John said...

Who are you referring to? One of the commenters, the author, the people who wrote on thedailywtf?

Matt said...

Great article! Also, many of the comments are excellent.

I had never really considered using flat files until 2002, when I had a DBA who made it so difficult to get anything done that I just bypassed him and did everything in flat-file systems. I found that I was able to develop the app incredibly fast since I didn't have to design the perfect schema. Since then I have always kept flat files as an option when I'm developing.