Who Is Responsible For Scaling? post

May 23rd, 2007

I question I see a lot on the CakePHP mailing list is "can CakePHP scale?". Now, other than this being an incredibly general statement it got me to thinking about scalability. In many ways the issue is like security: developers expect security to be "built into the application" when time and time again it has been pointed out that security is the responsibility of the developer, not the language itself. If you don't make sure to filter your input and escape your output, well, who's fault is that really? Yes, I know there are cool things like the filter extension for PHP and that many frameworks have helpers that can automagically filter your input. But that's still code written by someone else that HAS TO BE USED.

So, what does this have to do with scaling? Well, let's look at it from several different perspectives. At the language level, it has been proven that PHP's shared-nothing tendencies mean that it is suited to what I call horizontal scaling, where you add more web servers running PHP on them. No need for a central chokepoint means you can scale by simply adding more servers. It's really that simple.

Now, once you start adding in shared resources (like a database or centralized session storage) the problem ceases to be PHP's and becomes the problem of the developer. Why? Because now you have to write your application (or components of your application) with these shared resources in mind. You're now involving resources OTHER than PHP in the equation as well. Say you have a centralized database and multiple application servers. Is the ratio of reads to writes high enough that response time to the database server will be low enough to not impact application performance? If you use replication, is replication lag going to kill you? All these questions are beyond the scope of PHP.

During my talk at php|tek I got asked if I thought that adding a framework to your code meant unnecessary overhead. Unfortunately, the answer is "it depends, but I try not to worry about it". The reason for this wishy-washy answer is that I find that when you start using a framework, you can stop using custom code for low-level things and instead use whatever the framework is using (like DB access or form creation helpers). You gain in speed because, well, you don't have to write those things, you just have to learn how to use them. The trade-off is developer speed vs. application speed, but there are things you can to increase application speed: output caching, opcode caching, data partitioning, server tweaking. Again, all those things are not done in PHP-land.

So, looking at the above I think it's pretty obvious that scaling is not the responsibility of PHP (or a framework in this case) but the responsibility of the developer. When you are building an application, you need to be mindful of the actual design so that when the time comes you will be able to scale horizontally by adding more servers or you can scale vertically (moving functionality around to different levels of your application) with minimal headaches. Having been through this on an adult dating site, I can say without a doubt that trying to scale your application after the fact just doesn't work. Do your homework, understand the problem, and don't ask if a framework can scale. Ask yourself if you know how to build a scalable application in the first place.