In one of my computer science courses the instructor asked if computing resources have reached the point where efficiency can safely be ignored. I responded with a resounding “No!” but some of the younger pups in the class disagreed. I'm beginning to think I was showing my age, at least within certain contexts.
Something in my brain makes me love the idea of optimization. I actually police myself away from it, but I still find myself spending hours upon hours thinking about “better ways” to do things. I was recently debating with myself about storing telephone numbers in a database as integers vs. strings. I found the integer approach most appealing. It just felt right. But I quickly figured out that would require formatting the data to be human readable every time I needed to display it as well as a bit of trickery on the front end to get the form input (which would be in a human readable format) into a basic int structure. That consumes a lot of my time, and possibly a lot of time for the users who eventually put the collected data to use. In exchange, I save a little bit of hard drive space.
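To make that "trickery" concrete, here is a minimal sketch of the round trip the integer approach would need. It assumes 10-digit North American numbers and the display format used in this post; the function names `phone_to_int` and `int_to_phone` are hypothetical, not from any library.

```python
import re


def phone_to_int(text: str) -> int:
    """Strip everything but digits from form input and return an integer."""
    digits = re.sub(r"\D", "", text)
    if len(digits) != 10:
        # Assumption: only 10-digit numbers are accepted.
        raise ValueError(f"expected 10 digits, got {len(digits)}")
    return int(digits)


def int_to_phone(number: int) -> str:
    """Format a stored integer back into a human-readable string."""
    # Zero-pad, because an integer column drops any leading zeros.
    digits = f"{number:010d}"
    return f"({digits[:3]}) {digits[3:6]} - {digits[6:]}"


stored = phone_to_int("(888) 555 - 1234")
displayed = int_to_phone(stored)
```

Every form that reads a phone number and every page that shows one would need a call to one of these, which is where the time goes.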
The new server has 500 gigs of RAIDed space. A formatted telephone number string, such as “(888) 555 - 1234”, takes up 16 characters. That's a 17-byte VARCHAR. Even less if we default to something shorter for unrequired fields left blank by the user. But for the sake of argument let’s go with the full 17-byte VARCHAR. Storing “8885551234” as a BIGINT requires 8 bytes, saving us 9 bytes. That’s 9 out of over 500 billion available. We’ll end up with a few hundred form fields that will see a few hundred hits per year. For the sake of argument let’s say 400 squared, or 160,000 stored values per year. If my attempts at optimization save an average of 9 bytes per field per record, it would take about 350,000 years for those savings to add up to the capacity of the drive. I'm not sure how many clock cycles are involved in fetching or even comparing a 16-character string vs. a 10-digit number, but the difference is probably even more negligible than the storage space. Obviously, server resources are abundant when compared to my time and the time of my users.
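The back-of-the-envelope math above can be checked with a few lines of Python. This is only a sketch using the figures from this post; it assumes 500 gigs means 500 billion bytes and that a short VARCHAR carries one length byte.

```python
DISK_BYTES = 500 * 10**9                     # 500 gigs, taken as 500 billion bytes

formatted = "(888) 555 - 1234"
varchar_bytes = len(formatted) + 1           # 16 characters plus 1 length byte
bigint_bytes = 8                             # a BIGINT is 8 bytes

saved_per_value = varchar_bytes - bigint_bytes
values_per_year = 400 * 400                  # ~400 fields times ~400 hits per year

bytes_saved_per_year = saved_per_value * values_per_year
years_to_fill_disk = DISK_BYTES / bytes_saved_per_year

print(saved_per_value)                       # bytes saved per stored value
print(bytes_saved_per_year)                  # bytes saved per year
print(round(years_to_fill_disk))             # years of savings to equal the drive
```

That works out to roughly 1.4 megabytes saved per year, and a little under 350,000 years before the savings match the size of the drive.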
I’m about halfway done building the forms on my to-do list and I just convinced myself to change the way I handle things. Oy vey.
If data is collected for the purpose of being later presented to humans, I will store it as a string, optimization be damned. I’ll use numeric types for data that is collected to be crunched, which in all honesty is rare right now. In situations where it could conceivably be used for both, such as dates, I’m probably better off storing both versions and fetching (or sorting by) whichever is most appropriate rather than running a timestamp through the date() function as needed or converting user input into the MySQL DATE format (which is both human readable and easily sortable).
Two years into this redesign project and I’m still not done. In hindsight, the biggest setback has been my own perfectionism. I have a hand-crafted attitude towards my work. I take great pride in it. But at what cost? It feels great when I see something like Smashing Magazine’s list of current best practices in form validation and I realize I’m already doing the majority of those things simply because they “feel right”. But I spent all day yesterday working on a single form, stayed 45 minutes late, and still didn’t get it done. That felt anything but great. What’s the trade-off? Where do I draw the line?
I’m even doing it now. I’m encoding my apostrophes and quote marks. Can anyone out there notice the difference between “this” and "this"? It's 12 extra keystrokes for me to use the "proper" encoded characters. For all I know, Blogger auto-converts them for me anyway. (*EDIT* No, it doesn’t.) I've just developed the habit over the years of hand coding HTML. 12 keystrokes per quote pair times 5 quotes per page times 3,000 pages at 400 characters per minute is 7.5 hours. That's a full workday over the past 2 years. Is that too high a price to pay for typographic correctness?
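Here is the same keystroke estimate as a short Python sketch. The keystroke, page, and typing-speed numbers come from this post; the 2,000 working hours per year is an assumption added only to put the total in proportion.

```python
KEYSTROKES_PER_QUOTE_PAIR = 12
QUOTES_PER_PAGE = 5
PAGES = 3_000
CHARS_PER_MINUTE = 400

total_keystrokes = KEYSTROKES_PER_QUOTE_PAIR * QUOTES_PER_PAGE * PAGES
hours_spent = total_keystrokes / CHARS_PER_MINUTE / 60

# Assumption: about 2,000 working hours per year, over 2 years.
WORK_HOURS = 2 * 2_000
share_of_work_time = hours_spent / WORK_HOURS * 100

print(total_keystrokes)                  # extra keystrokes in total
print(hours_spent)                       # hours spent typing them
print(round(share_of_work_time, 1))      # percent of working time
```

Under that assumption the encoded quotes cost about 0.2% of two years of working time.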
What's the cost of XHTML validation? Of ADA compliance? That last one could end up saving us a mint if lawsuits start getting tossed around. I know where my personal comfort level lies on most of these issues and I'm willing to re-evaluate in light of new information and fresh perspectives. As the only web guy around here I guess I get to make those judgement calls for the institution. But in a freelance situation my time is the client's money, which is definitely a scarce resource. A 0.2% markup cost for things like typographic correctness may not sit well with some clients, but there are plenty of designers out there who also don't care. Maybe they can service those clients.