AKF Partners wrote a good blog post on how to visualize IT scalability: http://akfpartners.com/techblog/2008/05/08/splitting-applications-or-services-for-scale/
An application that's monolithic and not scalable starts at the bottom-left corner (0,0,0) of the scale cube described in that post.
You have three choices as you scale your application:
1. Run copies of the application on more servers behind a load balancer and distribute load evenly across them (horizontal duplication on the X-axis). As Blogtalkradio got more visitors, we kept adding more web servers and databases.
2. Separate the application into components, each of which can run on different servers (a split by function on the Y-axis). At Blogtalkradio, our studio control board is hosted on one set of servers, RSS feeds are produced by another group, and regular visitors go to yet another cluster.
3. Finally, for websites with hypergrowth, you can do sharding (aka lookup-oriented splits on the Z-axis). Here, you take the user GUID or ID, hash it, map the hash to a server using either consistent hashing or a simple modulo N, and send the traffic to that server. This is nice to implement at the CDN or load balancer level if you can augment them with scripts.
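
To make the Z-axis lookup from choice 3 concrete, here is a minimal Python sketch of both options: modulo N and a consistent hash ring. It is an illustration only, not what Blogtalkradio actually runs. The shard names, the `stable_hash` helper, the `ConsistentHashRing` class, and the choice of SHA-256 with 100 virtual nodes per server are all assumptions made for the example.

```python
import bisect
import hashlib
from typing import Sequence


def stable_hash(key: str) -> int:
    """Deterministic 64-bit hash (hypothetical helper).

    Python's built-in hash() for strings is salted per process,
    so it cannot be used to route the same user to the same shard.
    """
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


def modulo_shard(user_id: str, servers: Sequence[str]) -> str:
    """Modulo N: trivial to implement, but most keys move when N changes."""
    return servers[stable_hash(user_id) % len(servers)]


class ConsistentHashRing:
    """Consistent hashing with virtual nodes (illustrative sketch)."""

    def __init__(self, servers: Sequence[str], vnodes: int = 100) -> None:
        # Each server is placed on the ring at `vnodes` pseudo-random points.
        self._ring = sorted(
            (stable_hash(f"{server}#{i}"), server)
            for server in servers
            for i in range(vnodes)
        )
        self._points = [point for point, _ in self._ring]

    def lookup(self, user_id: str) -> str:
        # Walk clockwise to the first point past the key, wrapping around.
        idx = bisect.bisect_right(self._points, stable_hash(user_id))
        return self._ring[idx % len(self._ring)][1]


# Hypothetical shard names, for illustration only.
servers = ["shard-0", "shard-1", "shard-2"]
ring = ConsistentHashRing(servers)
print(modulo_shard("user-42", servers))
print(ring.lookup("user-42"))
```

The trade-off: with modulo N, adding a fourth server remaps most users, while with the ring a user either stays put or moves to the new server, which makes it the gentler option when shards hold per-user state.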