Thinking about development and related processes

In my mind, product development is broken down into three stages: design, development, and deployment. Project management and quality assurance are supplementary activities. Project management handles scheduling, budget, resources (people and things), and tasks. Quality assurance is responsible for testing, process, and requirements management. In general, PM interfaces with “Business”, and QA is concerned with “Customers.” Operations handles the product after deployment.

Model Hierarchies

Hierarchy complexity

Hierarchies should be 4-7 levels deep.

Three or fewer is not really a model.  Ten or more is too complex.

Two levels is really just a tuple.  Three levels is just a grouping of tuples, which doesn’t really require modeling.  Of course there are exceptions.

Artificial Hierarchies

You often see models (especially in XML, but also in OOP) where people sense this, and create artificial depth in hierarchies.  For example:

<product>
  <features>
    <feature/>
    <feature/>
  </features>
</product>

<features> is just a useless wrapper around features that tells you “one or more features” may follow.  That’s really just a crutch for the parser.  You could just count up the individual <feature> elements underneath <product>.

If a feature set has attributes that can’t be expressed in the individual features, then it’s okay. But if it’s only a grouping, then the grouping is really just an instance of a higher level element.
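To make the point about the wrapper concrete, here is a rough sketch in Python using the standard library's ElementTree. The element names come from the example above; the product name, the feature values, and the feature_names() helper are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical document without the <features> wrapper.
FLAT = """
<product name="widget">
  <feature>small</feature>
  <feature>blue</feature>
</product>
""".strip()

# The same data with the wrapper element added.
WRAPPED = """
<product name="widget">
  <features>
    <feature>small</feature>
    <feature>blue</feature>
  </features>
</product>
""".strip()


def feature_names(xml_text):
    """Return the text of every <feature>, with or without a wrapper."""
    product = ET.fromstring(xml_text)
    # iter() walks all descendants, so the wrapper makes no difference.
    return [feature.text for feature in product.iter("feature")]


flat_features = feature_names(FLAT)
wrapped_features = feature_names(WRAPPED)
print(len(flat_features), len(wrapped_features))
```

Both documents yield the same list, which is the argument in a nutshell: the wrapper carries no information unless it has attributes of its own.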

You also see a base element that is only there to have a base element.  That’s a flaw in XML, but a base element can be good for descriptive naming purposes, or to describe domain level attributes.  I think top level attributes are a bad practice though.

Guidelines for domain modeling

I think two common mistakes are made in modeling data & processes that can lead to unintended complexity.  The first is conflating two models that should be separate, and the second is oversimplifying models.

Conflated model: when what should be two separate models are intermingled, resulting in confusing (forced) associations.  If you have two elements in a hierarchy and you’re not sure which is higher, it could be a sign of conflated models.

Inadequate model: when the model is too simple, resulting in multiple aspects being crammed into one element — making comparison and isolation difficult.  If an element has too vague a meaning or has multiple contexts, it may be better modeled in separate elements.

Contexts are a good indicator of domain areas.  Requiring context to understand an element may hint at two or more models.
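A small illustration of the inadequate-model case, sketched in Python. The order example, the class names, and the status values are all hypothetical; the only point is that one field carrying two contexts is hard to compare or isolate, and two fields are not.

```python
from dataclasses import dataclass


@dataclass
class CrammedOrder:
    # One element, two contexts: payment and shipping squeezed together.
    status: str  # e.g. "paid-shipped", "paid-pending", "unpaid-pending"


@dataclass
class Order:
    # Each context gets its own element.
    payment_status: str
    shipping_status: str


crammed = [
    CrammedOrder("paid-shipped"),
    CrammedOrder("paid-pending"),
    CrammedOrder("unpaid-pending"),
]
# Isolating one aspect means parsing strings and hoping the format holds.
paid_crammed = [o for o in crammed if o.status.split("-")[0] == "paid"]

orders = [
    Order("paid", "shipped"),
    Order("paid", "pending"),
    Order("unpaid", "pending"),
]
# With separate elements the comparison is direct.
paid = [o for o in orders if o.payment_status == "paid"]
print(len(paid_crammed), len(paid))
```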

Leveraging existing knowledge and techniques — ideal framework requirement #3

Too many framework developers decide they’re going to be too clever by half. As a result, writing plugins, extensions, modules, templates, models, views, controllers, classes, procedures, whatever you call them is always a unique experience.

My rant about ORM (and templating) tools was one part of this. Writing records to a database is not rocket science. But learning the way tools want you to do it sometimes is. I realize there is sometimes complexity that a tool is trying to hide, but don’t hide it. That’s what I’m talking about with magic.

It’s okay to use an abstraction layer. It’s even okay to use an ORM if you really feel like it. But I should be able to quickly trace my code’s path and debug what’s happening. Even if (most likely) the bug isn’t in your (the framework designer’s) code, but in mine. At least I won’t spend all day pointing my finger at you.

But the truth is, chances are, I’m going to have to work around your framework, or customize it, or optimize it, or put in some ugly hack at some point. If you’re too clever by half, I’m going to (incorrectly) assume your framework can’t handle it, and throw it out. The cdbaby story on Oreillynet.com comes to mind.

Writing records to a database, or alternately using a cache, marking the cache dirty when appropriate, and using either a file-based or in-memory (local or remote) cache isn’t rocket science either. But if I don’t know how you’re doing it, I can’t poke around and figure out that it was my stupid configuration and not your brilliant framework at fault.
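For what it's worth, here is the kind of thing I mean, as a minimal Python sketch using the standard sqlite3 module. The RecordStore class, its method names, and the records table are invented for this example and are not any framework's API. It keeps an in-memory cache, marks entries dirty on write, and flushes them to the database on request.

```python
import sqlite3


class RecordStore:
    """Tiny persistence class: an in-memory dict cache over a SQLite table."""

    def __init__(self, conn):
        self.conn = conn
        self.conn.execute(
            "CREATE TABLE IF NOT EXISTS records (id INTEGER PRIMARY KEY, body TEXT)"
        )
        self.cache = {}     # id -> body
        self.dirty = set()  # ids whose cached value is not yet in the database
        self.db_reads = 0   # counter so you can see when the cache is bypassed

    def get(self, record_id):
        """Return the body for record_id, reading the database only on a miss."""
        if record_id not in self.cache:
            self.db_reads += 1
            row = self.conn.execute(
                "SELECT body FROM records WHERE id = ?", (record_id,)
            ).fetchone()
            if row is None:
                raise KeyError(record_id)
            self.cache[record_id] = row[0]
        return self.cache[record_id]

    def put(self, record_id, body):
        """Update the cache and mark the entry dirty; nothing is written yet."""
        self.cache[record_id] = body
        self.dirty.add(record_id)

    def flush(self):
        """Write every dirty entry to the database and clear the dirty set."""
        for record_id in sorted(self.dirty):
            self.conn.execute(
                "INSERT OR REPLACE INTO records (id, body) VALUES (?, ?)",
                (record_id, self.cache[record_id]),
            )
        self.conn.commit()
        self.dirty.clear()


store = RecordStore(sqlite3.connect(":memory:"))
store.put(1, "hello")
store.flush()
print(store.get(1), store.db_reads)
```

Nothing here is clever, which is the point: when something goes wrong I can read cache, dirty, and db_reads directly and see whose fault it is.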

In summary, if I look at (to continue my example) a persistence class, it should look like something I’m familiar with. I’m aware that there are more techniques and idioms than one man can be aware of, but it seems too many people go out of their way to create their own.

I understand the desire to “do it the way you’ve always wanted to” — that’s a huge part of the open source itch. I’m guilty of it myself. But so many people just have such bad style. Or maybe I just can’t recognise good style when I see it. You probably won’t when you look at my framework and the new way I code.

Rails created whole new idioms and even rewrote (by dynamic class overloading) some of Ruby’s semantics, but they could do that because there wasn’t really anyone using Ruby at the time, so the syntax, coding style, and language idioms were practically up for grabs, as far as the larger programming community was concerned.

People (including me) were tolerant of learning Rails idioms (including ActiveRecord) because we were learning a new language anyway. But while some things (like ActiveRecord) had pretty good style; others, like routes, had a smell from the start. It’s often dismissed by humble programmers as “just not getting it” — but lots of people “just didn’t get” EJBs for years before the switch was flipped and suddenly everyone admitted that it was just a bad design from the start.

Another part of leveraging existing knowledge and techniques is using existing tools. I deliberately left that out, because I don’t always think it’s best. If you use an existing crappy library, the flavor will spread through your code, but things like loggers, unit tests, etc. have such entrenched methods that your annotation-based AOP injected distributed transactional logger is just going to confuse people — oh wait, that sounds like the standard way of doing things in Javaland these days.

You might be nodding your head and thinking “design patterns” but that’s NOT what I’m talking about. Design patterns, for the most part are something people talk about who want to sound smarter than everyone else. If only people who think they’re smarter than everyone else are going to work on your framework, fine — sprinkle some decorators and anti-singletons, and whatnot around and be sure to use those big words in your documentation. But design patterns are really things like arches and dormer windows, and I don’t think that has anything to do with web frameworks, and come to think of it, I can’t think of anything that arches and dormer windows have in common, except they’re parts of buildings, and neither one of them is really a pattern.

Being obvious about what it’s doing – requirement #2 of the ideal framework

“What you don’t know can’t hurt you” — but it can’t help you either.

Like many others, I suppose, I fell in love with Rails because it “Gets out of your way” so well.  At least at first.

Thanks especially to Ruby’s dynamic method modification and anonymous blocks, Rails hides a lot of the details from the user.  I still can’t think in closures, because I’m an implementation fetishist, but most of the time you don’t need to know how something works.  And moving it out of the way only helps you concentrate on what you’re doing.  Until you need to do it some other way.  And then cleverness bites you.

Like it always does in Perl.  Sure, a solid grasp of *how* OOP works makes it possible to understand how OOP in Perl works, but you’re spending too much time showing how it works in your code to get the biggest benefit of OOP — simplification.  Which is probably the most underdelivered feature of OOP anyway.

But a good framework shouldn’t do too much magic.

For one, it makes people like me — people who like to poke their fingers in the black box — nervous.  There might be something sharp, or something you can break inside that box.  And learning that box’s implementation as well as its interface gets to be too big of a burden.

Well, I suppose I should discipline myself and leave the lid on the box and get my work done…but that wouldn’t be ideal, would it?

“But what if I need to pop the hood and fix something?” that voice in my head says. I probably overestimate my need (or capability) of fixing (or improving, or modifying) the engine inside that framework box, but there you have it.

I like CodeIgniter because it’s obvious about what it’s doing.  Maybe 91% of that is their really good documentation, but then maybe that’s 91% of this requirement.

The rule I take away from this requirement is that you should avoid unnecessary layers, like ORM and templating.  They shouldn’t be central features at least.  People have no trouble dismissing things like authentication and authorization as external, but consider their ORM tool or templating system or CMS interface to be the core of their framework.

Maybe I’m just wrong (or different) in my thinking that authn/authz are core, or maybe it’s just the type of apps I tend to develop.  But I’m willing to accept that they’re external if you’re willing to accept that your feature is.  Or I guess different frameworks have different “themes” based on their focus — content management, presentation, persistence, or authorization.

Another reason to be obvious about what you’re doing is that it makes it easier for us humble demi-hackers to work on.  Less magic means less powerful magicians can provide meaningful contributions.  That’s one area where CPAN shines and PEAR flounders.  PEAR’s coding conventions are a barrier to entry.  CPAN has tests that accomplish the same thing (better) and allow ranking without disallowing contributing.

Being able to understand how your framework works allows you to work within the framework, rather than trying to work around it.

When I first started in PHP, I really struggled with session caching, and db connection pooling because I sensed that I wanted something more, and when Java frameworks started to offer tools for this, I was easily swayed, swallowing the camel to get at the fools’ gold, or something like that.  But when I better understood PHP’s shared-nothing architecture, and Apache’s MPM process strategy, I realized that it was a waste of time to try and build that stuff into PHP.

The final reason for being obvious about what you’re doing is that sometimes that voice in my head is right and you do need to open the box, even if it has so far been a gift from the Gods.  And then you can only hope that you’ll be able to figure out what it’s doing.

Magic tools are fine when they increase productivity or abstract away implementation details, but for the suspicious and superstitious, being able to see how that magic is performed is important.

Which leads, unsurprisingly, to my next requirement for the ideal framework:

“Leveraging existing knowledge and techniques”

Getting out of the way – requirement #1 of the ideal framework

“The framework that manages best, manages least” or something like that.
Even in the original it didn’t necessarily mean the least amount of government, but that because of the structure, less intervention was required.

This doesn’t directly boil down to simplicity,  but simplicity is the easiest way to reduce management overhead.  That’s why things like wikis, agile programming, and lightweight project management tools like Basecamp are so popular.  But what do you do when complexity is needed?  Fine grained authorization, detailed workflow, complex views, localization, etc. (More on this later.)

A good framework should not have complex configuration, obscure conventions, or rigid patterns.

On the one hand, Rails is very good at this.  You can build sites with Rails without really understanding Ruby or Rails.  On the other, it’s very difficult to go outside the Rails box.  Rails has improved, or rather techniques have been learned that make this doable.  For the most common 80% of tasks, though, I’d say Rails sets the standard for getting out of the way.

Rails’ philosophy of “convention over configuration” works in Rails because for the most part, it gets the intuitive part of convention right.  It also benefits from being an early comer (in at least the current round) among frameworks.  Many imitators who diverge do so at their own peril, first because Rails did an exceptionally good job, and second, because “The Rails Way” (good book by the way, by Obie Fernandez.  Don’t think I’m plugging for him though, I don’t know him, but my uninformed personal opinion of him isn’t positive.) has become a de facto standard of comparison with most other frameworks that came after it.

ActiveRecord, however, does a great job at first — it was arguably the defining feature that made Rails popular — but then falls down when you want to go outside the class-is-a-table pattern.  Not saying it blocks you, but it gets in the way.  ActiveRecord is so valuable for prototyping new applications, however, that it’s almost a must for a good framework to support it.
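For anyone who hasn't met it, the class-is-a-table pattern can be sketched in a few lines. This is a Python illustration over the standard sqlite3 module, not Rails' ActiveRecord API; the base class, the columns attribute, the naive pluralizing rule, and the Post class are all assumptions made for the example.

```python
import sqlite3


class ActiveRecord:
    """Minimal class-is-a-table base: the table name comes from the class name."""

    conn = sqlite3.connect(":memory:")  # shared connection, for the sketch only
    columns = ()                        # subclasses list their column names

    def __init__(self, **values):
        self.id = None
        for column in self.columns:
            setattr(self, column, values.get(column))

    @classmethod
    def table(cls):
        # Convention: class Post maps to table "posts" (naive pluralization).
        return cls.__name__.lower() + "s"

    @classmethod
    def create_table(cls):
        cols = ", ".join(f"{c} TEXT" for c in cls.columns)
        cls.conn.execute(
            f"CREATE TABLE {cls.table()} (id INTEGER PRIMARY KEY, {cols})"
        )

    def save(self):
        """Insert this object as a new row (updates are left out of the sketch)."""
        names = ", ".join(self.columns)
        marks = ", ".join("?" for _ in self.columns)
        values = [getattr(self, c) for c in self.columns]
        cursor = self.conn.execute(
            f"INSERT INTO {self.table()} ({names}) VALUES ({marks})", values
        )
        self.id = cursor.lastrowid
        return self

    @classmethod
    def find(cls, record_id):
        """Load one row by primary key, or return None if it does not exist."""
        names = ", ".join(cls.columns)
        row = cls.conn.execute(
            f"SELECT {names} FROM {cls.table()} WHERE id = ?", (record_id,)
        ).fetchone()
        if row is None:
            return None
        record = cls(**dict(zip(cls.columns, row)))
        record.id = record_id
        return record


class Post(ActiveRecord):
    columns = ("title", "body")


Post.create_table()
saved = Post(title="Hello", body="First post").save()
found = Post.find(saved.id)
print(Post.table(), found.title)
```

The convenience and the limitation are the same thing: one class, one table. The moment an object needs to span two tables or come from a join, a base class like this has nothing to offer and you are working around it.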

I’d argue a good framework should allow things like ActiveRecord, not necessarily support them.  I haven’t yet seen the tool, method, or pattern that makes transitioning from ActiveRecord smooth.

When convention falls down and configuration is needed, then things can get in the way, and perhaps one of Rails’ stumbling blocks is that the advanced beginner has had it so easy so far with sensible defaults that transitioning to manual configuration is as much like hitting a wall as hitting the water at high speed can be.  If you weren’t going so fast to start with, you’d slip right into the configuration.  And after something like Struts or Spring IOC, Rails configuration is like slipping into a jacuzzi after a long traffic jam.

Similar story with URLs.  Rails’ page-controller default routing is a thing of beauty, and a true aid to site design all around.  But when the page-is-a-controller metaphor breaks down, it’s a pain.  Probably more because you tend to think that way and choose that pattern when you shouldn’t have.  Also, without an easily customizable front controller, page classes tend to have too much “baked” in, and the complexity is in the wrong place.
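Here is roughly what a convention-based default route amounts to, sketched in Python. It is not Rails' routing code; the /controller/action/argument URL shape, the PostsController class, and the dispatch() function are assumptions for illustration.

```python
class PostsController:
    """Hypothetical page controller: each public method is an action."""

    def index(self):
        return "listing posts"

    def show(self, post_id):
        return f"showing post {post_id}"


# The only "configuration": which URL segment maps to which controller class.
CONTROLLERS = {"posts": PostsController}


def dispatch(path):
    """Map /controller/action/arg... onto a method call by convention."""
    parts = [part for part in path.split("/") if part]
    if not parts or parts[0] not in CONTROLLERS:
        return "404 not found"
    controller = CONTROLLERS[parts[0]]()
    action = parts[1] if len(parts) > 1 else "index"  # default action
    handler = getattr(controller, action, None)
    # Refuse private/dunder names so the URL cannot reach internals.
    if action.startswith("_") or not callable(handler):
        return "404 not found"
    return handler(*parts[2:])


print(dispatch("/posts"))
print(dispatch("/posts/show/7"))
```

It works nicely as long as a URL really is a controller plus an action. Anything that doesn't fit that shape (and the sketch doesn't even check argument counts) has to be bolted onto dispatch() or crammed into the page classes, which is the "baked in" complexity I mean.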

The thing I like most about Rails is how well it does “get out of the way”, and the thing I like about CodeIgniter is that it accomplishes it almost as well,  but transparently, and could benefit from some of Rails’ magic.  Which leads to my next requirement for the ideal framework:

Being obvious about what it’s doing.

Framework fetish and rants on SolarPHP, Django, and Python

I’ve been looking at more frameworks — I know, it’s a problem. Including an in depth look at Joomla and a peek at SolarPHP, including a look at YAWP, Savant, PDO. I’m not going to use Solar, but a day spent looking at it is a day wasted. A quick criticism: It’s too “magical”, requires too much memorization, has tons of errors in its documentation.

I wish there was an easy way to submit errata or make changes. The SolarPHP manual is a wiki (which supposedly gives them an excuse to prevent page navigation), but normal users can’t edit it (which is fine), but there should be a workflow to propose changes — that’s a problem with wikis in general, no workflow, but it goes counter to the wiki “philosophy” which is silly.

Solar has a very obtuse system. Someone familiar with Rails-like applications can figure it out (with the help of the documentation and some guesswork), but the documentation includes things like mis-named files that cause opaque error messages. Every error you make will produce an exception with the exact same text:

“Exception encountered in the fetch() method. in Solar.php line 363”

and the stack trace doesn’t even mention your application code at all.

While Rails had an unusual configuration system, it had reasonable defaults, that were fairly easy to learn. Every framework that imitates Rails and adopts convention over configuration has the problem that it’s not Rails. The further you stray from Rails the more users will be frustrated when it diverges. Corollarily(?), the closer you hew to Rails, the more frustrated you’ll be when it (often unexpectedly — read: undocumentedly) diverges. I don’t want to learn a new “convention over configuration” for every framework.

SolarPHP’s ORM sucks, too. While I didn’t look into it enough to say how much it sucks, it’s a rule that anything that copies Hibernate sucks, and suffers the same problem as Rails imitators. The more it diverges from Hibernate’s syntax (even if it’s an improvement), the worse it is. And I’ve a hunch that Solar’s ORM isn’t nearly powerful or flexible enough to justify it. I categorically refuse to learn another ORM syntax.

SolarPHP’s templating is unimpressive, its MVC file layout is uninspired (and non-standard), its authentication is inadequate, its localization is simplistic (but works — on the other hand, how hard is a string substitution?). In short, it didn’t really bring anything that appealed to me, while giving enough frustrations (mainly with the documentation) and annoyances (mainly because of differences with my knowledge of existing frameworks) to put me off.

I can’t believe it, but it seems to not even have a scaffolding script to help you learn (or not need to learn) its file layout structure.

Anyway, I’ve distilled some requirements out of the experience. I think a good framework should:

  1. Get out of the way
  2. Be obvious about what it’s doing
  3. Leverage existing knowledge and techniques
  4. Help write better code by:
    1. Making code simpler, less verbose
    2. Helping me stay organized
    3. Helping me avoid shooting myself in the foot
  5. Allow hacks when needed

I’ll post more on that later.

I’m not trying to hate on SolarPHP. It’s bearing the brunt of my accumulated dissatisfaction with frameworks. I mentioned before that the only ones that have really excited me (since Rails) are Wicket and CodeIgniter. The former runs on Java, which in my book makes it only suitable for intranets or large (read: managed by a full time sysadmin) sites. The latter is too simplistic by just a little bit. It could actually benefit from a bit more magic, and things like authentication and ajax are left as an exercise for the user. Which is fine, especially if you were going to build everything from scratch (or from Zend / eZ) yourself anyway.

One framework I haven’t looked at, but (to spread the hate around) I’m sure I wouldn’t like is Django. That’s just from the general impression I’ve gotten by crowdsurfing the intermemeosphere. I’m sure Lawrence, Kansas is a nice town and all, but building a CMS for a small community newspaper isn’t exactly brain surgery. Chicago crime is way cool, but that’s just showing off Google web services with a neat idea. It could’ve been done in VBscript and it would have been just as cool.

My impression is that Django consists of the core team marketing it plus a bunch of people who think Python is cooler than Ruby. The one thing Python has over Ruby is a halfway decent Apache module. And I do mean halfway. At that end of the spectrum (near the very bottom) speed is not a deciding factor. I think Python’s syntax is silly, though like almost everyone else, I thought tuples as a type were cool. You can get the practical benefits of that in PHP nowadays with list() — which I think is prettier syntax.
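In case the tuple point is unclear, this is the practical benefit I have in mind, shown in Python. The min_max() function is just a made-up example; PHP's list() construct gives you the same kind of destructuring assignment from an array.

```python
def min_max(values):
    """Return two results at once as a tuple."""
    return min(values), max(values)


# Tuple unpacking: both results land in named variables in one assignment.
lowest, highest = min_max([3, 1, 4, 1, 5])
print(lowest, highest)
```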

To all the PHP haters out there whose only arguments are “it’s possible to write code that has a SQL injection vulnerability” and “I don’t like (or understand) some of the function names”, I don’t want to hear it. PHP isn’t the best language in the world, get over it. But it’s got a better syntax (and object model) than Python.

Do you know what “def” means? It means “define function”. Defining a function with part of the word “define” is empirically uglier than with the word “function” and less fun than abbreviating it “fun.” Mandatory underscores, bad keywords and yes that horrible white space delimiter are enough to make me not want to use python.

If you don’t like PHP function names, it’s probably because you never learned C. Hint: the 3 or so example functions that hurt people’s tender sensibilities in every “I hate PHP” rant are direct transcriptions of functions used to write your favorite scripting language, which was written in C and I almost guarantee it has the dreaded strpos(), strlen(), et al, buried in its source code.

The fact that every rant on PHP contains complaints about the same functions as a usenet post from 1997 shows that they don’t really have their own opinion on it.

And don’t get me started on “self”. Python’s bolted on objects may be better than Perl’s but that’s not saying anything at all. One thing that appealed to me about Python was that you could take more control of your request lifecycle — what I really liked about mod_perl, but it turns out that building it all into index.php is both faster and more stable.

But I’ll probably check out Django, and learn from it too. I’m sure I’ll find things I like and things I hate, and things I hadn’t thought of. Do I think it’ll have that “killer feature” that will make me want to switch? No, but I’ll still enjoy it. I enjoy Python, I just don’t like its syntax. And all other things being equal (which they practically are), I’d rather use PHP’s syntax, and put up with its shortcomings.

I did like my time studying SolarPHP, even if I don’t want to use it. I learned some interesting things with dependency injection in PHP that I could never pull off myself. And as much as I maligned its ORM, I could never build something like it — and even if I could, I’m sure someone would hate my API.

Whew! This turned out to be quite a rant, and with any luck, I’ll draw vultures, both for and against. Which is something I definitely don’t want. I’m well aware that your average script kiddie can kick techno-sand in my puny face, and the only reason I feel comfortable ranting is because I’m sure no one more nerd-savvy than my mom will read this post. But with the way search engines work these days, my little flicker of ill-informed self-commentary will probably draw moths and rangers with buckets of techno-water to douse my fragile ego.

Maybe I should get some adwords up before the bonfire starts.

I’ll post what I really wanted to, next.