What’s In Chris’ Brain: September 2007 Edition

Ah yes, it’s now the unofficial end of summer here in the Greater Toronto Area. Unofficial summer runs from Victoria Day until Labour Day, and is also commonly referred to as “construction season” due to all the road work that goes on along the highways surrounding Toronto during that time. Yes, we will still get some nice hot days (like today, where it’s above 30 Celsius) but you can almost smell fall coming. The nights get a little cooler, the days get a little shorter, and baseball season is almost over too. Which means I suddenly get a lot of my free time back. So, I’ve started thinking about what I’m going to work on in the fall.

  • *finally* get some work done on my baseball-roadtrip-planning site. Since I’m not immune to the Facebook phenomenon, I will be making it Facebook compatible as well.
  • add features to my interactive testing console for CakePHP, the big one being the ability to create routes and then pass in a URL to see what controller / action pair would handle the request. It might involve new functionality for the router, but hopefully I can do it without expanding the
  • I have my eyes on a few enhancement tickets for the 1.2 release that I think I can help with, the first one being some additional functionality for the lazy developer’s favourite tool, the CakePHP bake script.

On top of that I have a very small job to do today, which is to record a podcast on my earlier “Glue vs. Full-Stack” post for Cal Evans and his PHP Abstract podcast.

A Glimpse Inside CakePHP 1.2

One of my co-workers over at CDC (the mighty gwoo) gave a talk to the Orange County PHP group about CakePHP 1.2 and some of the features that it contains. One of the more interesting items (well, interesting to *me* anyway) is the addition of a convenience feature to "has and belongs to many" associations called "with". Stolen directly from gwoo's slides, here's an example of it:

PHP:
  <?php
  class Post extends AppModel {
      // "with" gives the join table a friendlier name to refer to it by
      var $hasAndBelongsToMany = array(
          'Tag' => array(
              'className' => 'Tag',
              'with' => 'TaggedPost',
          )
      );

      function beforeSave() {
          if (!empty($this->data['Tag'])) {
              $this->TaggedPost->save($this->data['Tag']);
          }
          // CakePHP aborts the save unless beforeSave() returns true
          return true;
      }
  }
  ?>

So what is the "with" parameter really for? It's nothing more than a convenience parameter that lets you apply a label to the name of your join table, so you don't have to call it by its ugly name, in this case PostTags. Want to see it in action?

PHP:
  <?php
  // Controller
  class PostsController extends AppController {
      var $name = 'Posts';

      function tags() {
          // Reach the join table through its "with" label
          $this->set('tags', $this->Post->TaggedPost->findAll());
      }
  }
  ?>

  <?php
  // View for the tags action: each row carries Post, Tag, and TaggedPost data
  foreach ($tags as $tag) :
      echo $tag['Post']['title'];
      echo $tag['Tag']['name'];
      echo $tag['TaggedPost']['date'];
  endforeach;
  ?>

It's little touches like that, unseen by a lot of developers, that make CakePHP just a little bit easier to use with each passing day. You can download gwoo's slides here.

Extraction vs. Design

For those not familiar with the history of the Ruby on Rails framework, it's interesting (well, at least to me) to note that the framework grew from extractions from working on Basecamp, an online project management / collaboration tool. Basically, they took functionality from Basecamp (the "extraction") and used it as components for Ruby on Rails. This is probably a gross oversimplification, but this is not a blog posting about how Ruby on Rails was created.

David Heinemeier Hansson (the driving force behind Ruby on Rails) is a huge believer in using extractions as a method to add functionality to your projects, typically in the context of a framework. You need some particular functionality, you write it, and if it's good enough you extract it from its original context and make it generic enough that it can be used by other people. You could also call this "building stuff you need" instead of "building stuff you think you might need". It's part of the whole YAGNI philosophy of programming, I guess.

So, the flipside to this idea of how you get new features is doing it by design. This is where you sit down and decide "I'm going to write a component to interface with Frammastat's Whoozinator API because I think some people would like to have that". So, you sit down and you bang out the code so that users of GrumpyFramework can now talk to Frammastat's Whoozinator via a component instead of writing their own. It's like what I did when I helped write code for Zend Framework to talk to Last.fm's Audioscrobbler service. I didn't need it, but a friend thought it was a good idea and it was a chance to learn how to use Zend Framework. I received a patch request for it the other day, so maybe I should go and take a look at it... :)

Anyway, where was I? Oh yeah.

In this case, a decision was made to build something specifically for a project, whether it was required as part of my current work or not. Did I need that Frammastat interface for GrumpyFramework? No, but somebody else might. And that, in my opinion, leads down the road to premature optimization and feature creep. Just because you *might* need something doesn't mean you should spend the time and effort to write some code for it. Design makes sense when you know that the vast majority of users of your project will need something. But for a lot of things, wait until you actually need them.

I have found that the best bits and pieces of code that I have used with CakePHP have come from people extracting ideas they were using for their own personal projects and then putting a good spin on it so others could use it. Of course, it helps that CakePHP lets you create components and behaviors (that talk to controllers and models respectively) so that you can extend functionality without rewriting any of the core, but I remain firmly convinced that the extraction of ideas from current projects is the best method of adding new functionality to an existing application.

Stupid CakePHP Controller Tricks

A big shout-out to my favourite typist for showing me some of these gems, which I will gladly share with you.

So, one of the things that often happens in CakePHP is that you will have multiple values to send to your view, so you might have code that looks like this:

PHP:
  $this->set('user', $this->User->read(null, $id));
  $this->set('foo', $foo);
  $this->set('bar', $bar);
  $this->set('baz', $baz);

Seems simple enough, yes? Once when I showed code similar to this to the above-mentioned typist, he said "ew" and showed me two neat little tricks to make the code (a) more readable and (b) a little more efficient.

Solution one? Use compact() to pass all the variables to your view.

PHP:
  $user = $this->User->read(null, $id);
  $this->set(compact('user', 'foo', 'bar', 'baz'));

What does the compact() function do? It takes the variable names you pass into it and looks in the current scope for variables with those names. It then spits out an array of name => value pairs. So, one little trick with compact() means you only have to use one set statement. This works because all those $this->set() statements simply add those values to an array. But you already knew that, right?
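If you want to see what compact() actually hands over, here is a minimal sketch in plain PHP, outside of CakePHP. The buildViewVars() function and the sample values are made up for illustration; it just stands in for a controller action.

```php
<?php
// Hypothetical helper (not part of CakePHP): plays the role of a controller
// action that gathers local variables and packs them up for the view.
function buildViewVars()
{
    $user = array('id' => 1, 'name' => 'chris');
    $foo = 'first';
    $bar = 'second';
    $scratch = 'not for the view'; // never named in compact(), so it stays behind

    // compact() looks up each name in the current scope and returns
    // an array of name => value pairs.
    return compact('user', 'foo', 'bar');
}

$viewVars = buildViewVars();
print_r(array_keys($viewVars));
```

Only the three names you asked for come back; $scratch never leaves the function.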

The second solution is similar to using the "chmod 777 firehose" to solve UNIX-based permissions problems.

PHP:
  $user = $this->User->read(null, $id);
  $this->set(get_defined_vars());

Now, while this looks neat it does come with one big caveat: if you use get_defined_vars() you will be passing a lot of stuff into your view that you might not want there. In a way it's like turning on register_globals in your CakePHP app. Every variable you've defined in the controller (and some other ones that CakePHP has defined for you) in your current scope will be available in the view.
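Here is a rough sketch of the firehose effect in plain PHP. The collectEverything() function and its variables are hypothetical, and it is a plain function rather than a real CakePHP action, so it only shows how local variables leak out.

```php
<?php
// Hypothetical stand-in for a controller action (plain function, not CakePHP).
function collectEverything($id)
{
    $user = array('id' => $id, 'name' => 'chris');
    $passwordHash = 'secret-hash'; // a temporary we never meant to expose

    // Returns every variable defined in this function's scope,
    // including the $id parameter and the temporary above.
    return get_defined_vars();
}

$leaked = collectEverything(42);
print_r(array_keys($leaked));
```

The $passwordHash temporary and the $id parameter both come along for the ride, which is exactly the sort of thing compact() lets you leave out.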

Database Shards and CakePHP

As I've mentioned in this blog before, I spent 4 years working for an adult dating web site. One of the biggest problems we ran into was a bottleneck involving database replication lag. In a normal replication setup you have your application do writes to a master database and then those changes get replicated to the slaves. That's okay...until you start dealing with huge amounts of updates.

So, the quick solution was to make sure that we had fast enough hardware on the slaves to handle the huge volume of updates. I remember replication lag on the order of 30 minutes on some of the machines (that's right, THIRTY MINUTES). But the bigger problem came down to trying to find a way to minimize replication problems. One of the things discussed (and I have no idea if it ever got implemented after I left) was partitioning the data into groups that made sense. Some of the suggestions were to partition based on geographical location of the users, or simply do grouping based on the record ID.

The other day I came across a great blog posting talking about database shards and instantly realized that they were talking about exactly what we were facing. The goal is to spread the data around, denormalize things so that you have all your necessary data in one spot and to try and minimize replication issues. So you'd be reading and writing to a shard depending on whatever criteria you are using.
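To make "grouping based on the record ID" concrete, here is a minimal sketch in plain PHP. The shard names and the shardForId() function are placeholders I made up, and the modulo scheme is just one simple option, not what the dating site or the blog posting actually used.

```php
<?php
// Hypothetical shard map: these connection names are placeholders, not real config.
$shards = array('shard_0', 'shard_1', 'shard_2');

// Pick a shard from the record ID. Reads and writes for the same ID
// have to go through this one function so they always agree.
// Assumes $recordId is a non-negative integer.
function shardForId($recordId, array $shards)
{
    return $shards[$recordId % count($shards)];
}

echo shardForId(7, $shards), "\n"; // 7 % 3 = 1, prints "shard_1"
echo shardForId(9, $shards), "\n"; // 9 % 3 = 0, prints "shard_0"
```

The catch with plain modulo is that adding a shard changes where most IDs land, so you would have to move data around whenever the shard count changes.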

Now, I know that you could put code into the beforeSave() method on a model to figure out what shard you would be writing to, but how to figure out what shard you would be *reading* from is something I am still mulling over. Suggestions from #cakephp-dev (where I hang out during the work day) seem to point towards using a behavior (go to the Bakery and search for "behavior" to see lots of examples) to make this work. I'll fool around with some code to see if I can come up with something that works.
