How to truthfully stay DRY

22 OCT 09

"An article about Don’t Repeat Yourself? Please don't repeat common knowledge! Heard that, seen that, done that!" Really? Even though the pragmatic programmers who coined that term were very explicit about what DRY means it seems that the common understanding of DRY often falls short of fully comprehending this paramount concept.

DRY is not about factoring out parts of your code into methods or classes which would otherwise be repetitive. This is sometimes described as Once and Only Once. I take the liberty to say that DRY never bothered about that obviousness, which has indeed been heard, seen and done ever since the age of punched cards by any programmer with a brain. Where does DRY apply then? Very simple: everywhere you deal with information. Look at your application from outside your Ruby, PHP, JavaScript, etc. and consider this piece of SQL:

CREATE TABLE users (
	id int identity(1,1),
	sPassword varchar(32),
	email varchar(64),
	bNewsletter char(1),
	primary key (id)
)

I bet your application entertains good relations with a database in one way or another. 'Yes', you say, 'but so what?' you ask, puzzled, 'how could I repeat myself here?' Well, you probably have some JavaScript validating the corresponding user input, and since we always have to validate on both ends (you didn't forget, did you?) you have to do the same validation on the server as well.

The same information hard coded in three places! What happens if we add a new field "URL" to our table or modify encoding from Latin1 to UTF-8? I tell you what happens: you will forget to update at least one of your information sources. If you are lucky you will catch any error with your tests. But I tell you: you won't be lucky and will have produced a juicy bug that is waiting to sprout legs and torch your application as soon as your customer gets his hands on the new version.

Now what can we do to protect ourselves from such a predictable disaster? DRY tells us what to do: no repetition of information, one source to rule them all. Any change to that master source has to propagate to all other sources.

In your case you might want to consider the SQL table creation script to be this source and derive the data needed for the two validation scripts from it: bad idea. SQL is ignorant of the data types you actually use, such as password, email, etc. In addition, you will have a hard time modelling dependencies such as: email field required if newsletter checkbox is ticked. The client and server side code are no better suited for that trick.

I recommend using a configuration file in JSON format as your master source. A shell script uses it to generate all three files (SQL, JS and Ruby/PHP/whatever) and then recreates the table in the database. If you don't want to drop an existing table you could make your script smart enough to alter the table instead.
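To make "alter instead of drop" a bit more tangible, here is a minimal Python sketch. Everything in it is an assumption for illustration: the function name `alter_statements`, the idea that the list of existing columns has been looked up elsewhere, and the `url` field. It only adds missing columns and does not attempt to change types or remove anything.

```python
def alter_statements(table, existing_columns, fields):
    """Return ALTER TABLE statements for fields missing from the table.

    existing_columns: column names currently in the database (assumed to be
    looked up elsewhere, e.g. from the database catalog).
    fields: mapping of column name -> {"sqlType": ..., "length": ...}.
    """
    present = {name.lower() for name in existing_columns}
    statements = []
    for name, spec in fields.items():
        if name.lower() in present:
            continue  # column already exists; type changes are not handled here
        statements.append(
            f"ALTER TABLE {table} ADD {name} {spec['sqlType']}({spec['length']})"
        )
    return statements


# Hypothetical example: the table already has id and email, url is new.
fields = {
    "email": {"sqlType": "varchar", "length": 64},
    "url": {"sqlType": "varchar", "length": 255},
}
for statement in alter_statements("users", ["id", "email"], fields):
    print(statement)
```

A real script would also have to decide what to do with columns that disappear from the master source or change their type; that is left out here on purpose.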

Concerning your validation you might be tempted to have the shell script incorporate the logic into your methods directly: again, bad idea, you would contaminate your dry and clean code with repetitive information. Instead, refactor your validation scripts into generic methods that work with external data objects which include all the information needed to validate your forms. Only those data objects are generated.
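Such a generic validation method could be sketched in Python as follows. This is not my actual validation class, just a minimal illustration: the function name `validate` and the error codes are made up, the email pattern is deliberately loose, and reading `dependsOn` as "if this field is filled in, the listed fields must be filled in too" is an assumption. The rule keys (`required`, `length`, `isEmail`, `dependsOn`) mirror the configuration file of this article.

```python
import re

# Deliberately loose e-mail pattern, for illustration only.
EMAIL_PATTERN = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")


def validate(form, rules):
    """Check submitted values against externally supplied rules.

    form:  mapping of field name -> submitted string ("" if left empty).
    rules: mapping of field name -> rule dict; this is the generated data.
    Returns a dict of field name -> list of error codes (empty dict if valid).
    """
    errors = {}

    def fail(name, code):
        codes = errors.setdefault(name, [])
        if code not in codes:  # report each problem only once per field
            codes.append(code)

    for name, rule in rules.items():
        value = form.get(name, "")
        if rule.get("required") and not value:
            fail(name, "required")
        if value and len(value) > rule.get("length", len(value)):
            fail(name, "too_long")
        if value and rule.get("isEmail") and not EMAIL_PATTERN.match(value):
            fail(name, "not_an_email")
        # Assumed meaning of dependsOn: a filled-in field requires these others.
        for other in rule.get("dependsOn", []):
            if value and not form.get(other, ""):
                fail(other, "required")
    return errors


# Hypothetical rules, as they might be generated from the master source.
rules = {
    "email": {"required": False, "length": 64, "isEmail": True},
    "newsletter": {"required": False, "dependsOn": ["email"]},
}
print(validate({"email": "", "newsletter": "1"}, rules))
```

The method itself knows nothing about users, emails or newsletters; swap in another rules object and it validates another form.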

Such a configuration file might look something like the JSON below:

oUserData = {
	"indexes" : [
		"email"
	],
	"encoding" : "UTF-8",
	"formId" : "registration",
	"fields" : {
		"sPassword" : {
			"sqlType" : "varchar",
			"formType" : "password",
			"length" : 20,
			"validates" : {
				"required" : true,
				"isPassword" : true
			}
		},
		"email" : {
			"sqlType" : "varchar",
			"formType" : "text",
			"length" : 20,
			"validates" : {
				"required" : true,
				"isEmail" : true
			}
		},
		"newsletter" : {
			"sqlType" : "char",
			"formType" : "checkbox",
			"length" : 1,
			"validates" : {
				"required" : false,
				"dependsOn" : [
					"email"
				]
			}
		}
	}
}
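How the SQL part could be generated from such a file is sketched below in Python rather than shell, simply because it is easier to show in a few lines. Take it as an illustration under assumptions: the `table` key does not exist in the configuration above and is added here so the generator knows the table name, the function name `create_table_sql` and the index naming scheme are made up, and the `id` column is copied from the table script at the top of this article.

```python
import json

# Trimmed-down copy of the configuration; "table" is an added, assumed key.
CONFIG_JSON = """
{
  "table": "users",
  "indexes": ["email"],
  "fields": {
    "sPassword": {"sqlType": "varchar", "length": 20},
    "email": {"sqlType": "varchar", "length": 20},
    "newsletter": {"sqlType": "char", "length": 1}
  }
}
"""


def create_table_sql(config):
    """Build the CREATE TABLE script (plus indexes) from the master source."""
    table = config["table"]
    lines = ["id int identity(1,1)"]
    for name, spec in config["fields"].items():
        lines.append(f"{name} {spec['sqlType']}({spec['length']})")
    lines.append("primary key (id)")
    body = ",\n".join("\t" + line for line in lines)
    statements = [f"CREATE TABLE {table} (\n{body}\n)"]
    for column in config.get("indexes", []):
        statements.append(
            f"CREATE INDEX idx_{table}_{column} ON {table} ({column})"
        )
    return ";\n".join(statements) + ";"


print(create_table_sql(json.loads(CONFIG_JSON)))
```

The same loop over `fields` can feed the generated data objects for the client and server side validation, which is the whole point: one source, several derived files.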

You might now argue that we still replicate information. Indeed, all four files (even one more, OMG!) hold - at least partly - the same information. However, generating those pieces of information is no breach of DRY at all, since you do not treat them as "normal" code. The generated files are neither maintained by hand nor put under version control. You will never touch them; they are only recreated when your data or validation model changes or when you build the application.

As you can see, the DRY principle has much more to it than you might have thought initially. It not only helped us to avoid broken code because of changes "outside" the coder's realm but also brought us to a situation where we refactored our code so that it can be reused in the next project!

We also see that we touch other areas such as configuration management or database administration. Most probably some smart guy keeps a jealous watch on those sacred grounds. Help him understand why he should stay DRY as well!

Curro said on 2009-11-06 17:36:08

Do you have the shell scripts written? I could spend
some time on this. Send me an email if you want to take
the discussion off these comments.

Juergen Riemer said on 2009-11-06 17:50:00

Curro!
My shell scripts are only creating tables, not altering them, hence they are rather simple. Since I use JSON as the configuration file I can use it "as is" in JavaScript/server side language. The more complex part is to be found in the validation classes.. those I can share with you for sure.. let's talk about it at the beer-review next week :)

Julian said on 2009-11-27 10:17:09

What about an "application generator" GUI (web-based) that will take care of the creation of the initial config file itself (JSON, XML, whatsoever), as well as the generation of any sql, client-side and server-side scripts? Have you tried something like that, Jürgen?
I remember I started working on a similar idea some time ago but never found the time to finalize it/make it production-ready :)

Juergen Riemer said on 2009-11-30 12:12:00

Julian,
I was indeed thinking of creating the HTML parts of the form, I did not go for that yet though. I would have to add information on how to render the form into the config file (at least horizontal or vertical alignment) and of course the CSS/JavaScript identifiers for styling and JS hooks. That would dilute the business logic with presentational information, which I try to avoid like the plague... though one could derive generic identifiers from the form element's type...
Still you are right in identifying those pieces of HTML being duplication of information thus constituting a plain breach of DRY. Hmmm, I now have to rack my brain on that issue :)
As for entire applications though I would be tempted to fall back on templating frameworks or simply use e.g. drupal as much as possible.
Anyway, thanks for your comment, it will keep me busy while commuting to work.. I will let you know about any result :) I'd also be happy about any of yours for sure!