What Is a Cluster?

A group of machines all serving an identical purpose is called a cluster. Similarly, an application or a service is clustered if any component of the application or service is served by more than one server.

The application in Figure 15.1 does not meet this definition of a clustered service, even though there are multiple machines, because each machine has a unique role that is not filled by any of the other machines.

Figure 15.1. An application that does not meet the cluster definition.

Figure 15.2 shows a simple clustered service. This example has two front-end machines that are load-balanced via round-robin DNS. Both Web servers actively serve identical content.

Figure 15.2. A simple clustered service.
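
As a rough illustration of the round-robin DNS behavior described above, the following Python sketch models a name server that hands out the same set of addresses in rotated order on each lookup, so that clients taking the first address alternate between the two Web servers. The class name, the addresses, and the assumption that every client uses the first address returned are all hypothetical; real DNS servers and resolvers also involve caching and TTLs, which this sketch ignores.

```python
from collections import deque


class RoundRobinRecordSet:
    """Toy model of round-robin DNS: one name, several addresses (hypothetical)."""

    def __init__(self, addresses):
        self._addresses = deque(addresses)

    def resolve(self):
        """Return the current address order, then rotate it for the next lookup."""
        answer = list(self._addresses)
        self._addresses.rotate(-1)  # move the first address to the end
        return answer


# Hypothetical addresses standing in for the two front-end Web servers.
records = RoundRobinRecordSet(["192.0.2.10", "192.0.2.11"])

for request_number in range(4):
    # Assumption: each client connects to the first address in the answer.
    chosen = records.resolve()[0]
    print(f"request {request_number} -> {chosen}")
```

Because the rotation is blind to server health and load, this only spreads requests evenly on average; it is a sketch of the idea, not of any particular DNS implementation.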

There are two major reasons to move a site past a single Web server:

Redundancy: If your Web site serves a critical purpose and you cannot afford even a brief outage, you need to use multiple Web servers for redundancy. No matter how expensive your hardware is, it will eventually fail, need to be replaced, or need physical maintenance. Murphy's Law applies to IT at least as much as to any other industry, so you can be assured that unexpected failures will occur at the least convenient time. If your service has particularly high uptime requirements, you might require not only separate servers but also multiple bandwidth providers and possibly even separate data center spaces in which to house redundant site facilities.
Capacity: On the flip side, sites are often moved to a clustered setup to meet increasing traffic demands. Scaling to meet traffic demands often entails one of two strategies:
- Splitting a collection of services into multiple small clusters
- Creating large clusters that can serve multiple roles

Load Balancing

This book is not about load balancing. Load balancing is a complex topic, and the scope of this book doesn't allow for the treatment it deserves. There are myriad software and hardware solutions available, varying in price, quality, and feature sets. This chapter focuses on how to build clusters intelligently and how to extend many of the techniques covered in earlier chapters to applications running in a clustered environment. At the end of the chapter I've listed some specific load-balancing solutions.

While both splitting a collection of services into multiple small clusters and creating large clusters that can serve multiple roles have merits, the former is the more prone to abuse. I've seen numerous clients crippled by "highly scalable" architectures (see Figure 15.3).

Figure 15.3. An overly complex application architecture.

The benefits of splitting services into multiple small clusters include the following:

- By separating services onto different clusters, you can scale each one independently if traffic does not increase uniformly across all services.
- Physical separation is consistent with, and reinforces, the logical separation in the design.

The drawbacks are a matter of scale. Many projects are overdivided into clusters. You have 10 logically separate services? Then you should have 10 clusters. Every service is business critical, so each should have at least two machines representing it (for redundancy). Very quickly, we have committed ourselves to 20 servers. In the worst cases, developers take advantage of the knowledge that the clusters are actually separate servers and write services that use mutually exclusive facilities. Sloppy reliance on the separation of the services can also include things as simple as using the same-named directory for storing data. Design mistakes like these can be hard or impossible to fix and can force you to keep all the servers physically separate.

Having 10 separate clusters handling different services is not necessarily a bad thing. If you are serving several million pages per day, you might be able to spread your traffic efficiently across such a set of clusters. The problem occurs when you have a system design that requires a huge amount of physical resources but is serving only 100,000 or 1,000,000 pages per day. Then you are stuck maintaining a large infrastructure that is highly underutilized.
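
To put rough numbers on that underutilization, the Python sketch below works out the average load per server for the 10-cluster, two-machines-per-cluster layout described above. It assumes traffic is spread evenly across all servers and over the whole day, which real traffic never is, and it uses 5,000,000 pages per day as a stand-in for "several million"; the function name and that figure are illustrative assumptions, not measurements.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def per_server_load(pages_per_day, clusters, servers_per_cluster):
    """Return (server count, pages/server/day, pages/server/second).

    Assumes traffic is spread uniformly over every server and every second.
    """
    servers = clusters * servers_per_cluster
    pages_per_server_per_day = pages_per_day / servers
    pages_per_server_per_second = pages_per_server_per_day / SECONDS_PER_DAY
    return servers, pages_per_server_per_day, pages_per_server_per_second


# 100,000 and 1,000,000 come from the text; 5,000,000 is an assumed value
# standing in for "several million pages per day".
for pages in (100_000, 1_000_000, 5_000_000):
    servers, per_day, per_second = per_server_load(
        pages, clusters=10, servers_per_cluster=2
    )
    print(
        f"{pages:,} pages/day over {servers} servers: "
        f"{per_day:,.0f} pages/server/day, {per_second:.2f} pages/server/second"
    )
```

Under these assumptions, a site serving 100,000 pages per day across 20 servers averages 5,000 pages per server per day, which is well under one page per second per machine.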

Dot-com lore is full of grossly "mis-specified" and underutilized architectures. Not only are they wasteful of hardware resources, they are expensive to build and maintain. Although it is easy to blame company failures on mismanagement and bad ideas, one should never forget that the $5 million data center setup does not help the bottom line. As a systems architect for dot-com companies, I've always felt my job was not only to design infrastructures that can scale easily but to build them to maximize the return on investment.

Now that the cautionary tale of over-clustering is out of the way, how do we break services into clusters that work?
