Mystery and magic behind Load Balancer


July 01, 2010 Bloggies by Administrator

The load balancer presents virtual servers to the outside world. Each virtual server points to a cluster of services that reside on one or more physical hosts.

Hey, don’t lose heart over the jargon in that statement. Instead, let’s examine each term one by one to make life simpler.

Host: The physical server that receives traffic from the load balancer. It is synonymous with the IP address of the physical server (for example, 192.168.57.10). When no load balancer (LB) is present, the host is reached by its server name, such as www.viaedge.com, which resolves to the IP address 192.168.57.10. Different vendors call it a node, server, or host.

Services: A service combines the TCP port of the running application with the host (the IP address of the physical server). A host (192.168.57.10) may offer more than one service (HTTP, FTP, DNS, and so on). Each application is identified uniquely by its IP:port pair: 192.168.57.10:80, 192.168.57.10:21, and 192.168.57.10:53. Different vendors call it a member or a service. Some vendors call it a node too (quite unfortunately!!)

Clusters: Collections of similar services available on any number of hosts. For instance, all services that offer the company’s human resources pages would be collected into a cluster called “company HR page,” and all services that offer ecommerce would be collected into a cluster called “eCommerce.” Different vendors call it a pool, cluster, or farm.

Virtual Server: Like a service, the virtual server usually includes the application port as well as the IP address. Since most vendors use the term “virtual server” and it is common in LB discussions, we will continue to use it here, although “virtual service” would be more in keeping with the IP:Port convention.
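To see how the four terms fit together, here is a minimal sketch in Python. The class names, the second host (192.168.57.11), and the virtual server address (203.0.113.10) are made up for illustration; no vendor models things exactly this way.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Service:
    """One application on one host, identified as IP:port."""
    host_ip: str
    port: int

    def __str__(self) -> str:
        return f"{self.host_ip}:{self.port}"


@dataclass
class Cluster:
    """A named collection of similar services on any number of hosts."""
    name: str
    services: list[Service] = field(default_factory=list)


@dataclass
class VirtualServer:
    """The IP:port the load balancer presents to the outside world."""
    ip: str
    port: int
    cluster: Cluster


# Two hosts each offer HTTP; together they form one cluster.
ecommerce = Cluster("eCommerce", [
    Service("192.168.57.10", 80),
    Service("192.168.57.11", 80),  # hypothetical second host
])

# Hypothetical public address for the virtual server.
vip = VirtualServer("203.0.113.10", 80, ecommerce)

# The distinct physical hosts behind this virtual server.
hosts = sorted({s.host_ip for s in vip.cluster.services})

print(f"{vip.ip}:{vip.port} -> {[str(s) for s in vip.cluster.services]}")
print("hosts:", hosts)
```

Read it bottom-up: the virtual server points to a cluster, the cluster holds services, and each service lives on a host.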

Now that we are equipped with the lexicon of load balancing, look again at the statement from the beginning: “The load balancer presents virtual servers to the outside world. Each virtual server points to a cluster of services that reside on one or more physical hosts.” Got the meaning but still don’t understand the concept? Not an issue, let’s dissect it further to understand it better.

The five magic steps of load balancing

(Assuming the load balancer sits in-line between the client and the hosts that provide the services the client wants to use.)

1. The client attempts to connect with the service on the load balancer.

2. The load balancer accepts the connection, and after deciding which host should receive the connection, changes the destination IP (and possibly port) to match the service of the selected host (note that the source IP of the client is not touched).

3. The host accepts the connection and responds back to the original source, the client, via its default route, the load balancer.

4. The load balancer intercepts the return packet from the host and now changes the source IP (and possibly port) to match the virtual server IP and port, and forwards the packet back to the client.

5. The client receives the return packet, believing that it came from the virtual server, and continues the process.
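The five steps can be sketched as a small Python simulation. This is only an illustration under assumptions: the `Packet` and `LoadBalancer` classes, the addresses, and the round-robin choice of host are all invented here, and a real load balancer rewrites actual network packets rather than Python objects.

```python
from dataclasses import dataclass, replace
from itertools import cycle


@dataclass(frozen=True)
class Packet:
    src: tuple[str, int]  # (IP, port) of the sender
    dst: tuple[str, int]  # (IP, port) of the receiver


class LoadBalancer:
    def __init__(self, vip, services):
        self.vip = vip
        # Round-robin selection is an assumption made for this sketch.
        self._next_service = cycle(services)
        self.connections = {}  # client (IP, port) -> chosen service

    def inbound(self, pkt):
        """Step 2: rewrite the destination; the client's source is untouched."""
        if pkt.dst != self.vip:
            raise ValueError("packet is not addressed to the virtual server")
        if pkt.src not in self.connections:
            self.connections[pkt.src] = next(self._next_service)
        return replace(pkt, dst=self.connections[pkt.src])

    def outbound(self, pkt):
        """Step 4: rewrite the source from the host's service to the VIP."""
        if self.connections.get(pkt.dst) != pkt.src:
            raise ValueError("no matching connection for this reply")
        return replace(pkt, src=self.vip)


VIP = ("203.0.113.10", 80)        # hypothetical virtual server
client = ("198.51.100.7", 51000)  # hypothetical client
lb = LoadBalancer(VIP, [("192.168.57.10", 80), ("192.168.57.11", 80)])

request = Packet(src=client, dst=VIP)              # step 1
to_host = lb.inbound(request)                      # step 2
reply = Packet(src=to_host.dst, dst=to_host.src)   # step 3
to_client = lb.outbound(reply)                     # step 4

print("host sees:  ", to_host)
print("client sees:", to_client)                   # step 5
```

Notice that the host sees the real client address as the source, while the client only ever sees the virtual server address.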

Let’s discuss a couple of points to finally crack the rest of the mystery of load balancing.

First, as far as the client knows, it sends packets to the virtual server and the virtual server responds. Simple.

Second, the network address translation (NAT) takes place. This is where the load balancer replaces the destination IP sent by the client (that of the virtual server) with the destination IP of the host to which it has chosen to load balance the request. The return path (steps three and four) is the second half of this process, the part that makes the NAT “bi-directional.” The source IP of the return packet from the host will be the IP of the host; if this address were not changed and the packet were simply forwarded to the client, the client would be receiving a packet from someone it never sent a request to, and would simply drop it. Instead, the load balancer, remembering the connection, rewrites the packet so that the source IP is that of the virtual server, thus solving this problem.
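The client's side of this can be sketched in a few lines of Python. This is a simplified model, not how a real TCP stack is implemented: the `client_accepts` function and all addresses are hypothetical, and a connection is reduced to a (local, remote) address pair.

```python
def client_accepts(open_connections, packet_src, packet_dst):
    """Return True if a reply matches a connection the client opened."""
    return (packet_dst, packet_src) in open_connections


client = ("198.51.100.7", 51000)  # hypothetical client
vip = ("203.0.113.10", 80)        # hypothetical virtual server
host = ("192.168.57.10", 80)      # physical host behind the load balancer

# The client only knows about its connection to the virtual server.
open_connections = {(client, vip)}

# Reply forwarded as-is: the source is the host, which the client never contacted.
raw_reply_ok = client_accepts(open_connections, host, client)

# Reply after the load balancer rewrites the source to the virtual server.
rewritten_ok = client_accepts(open_connections, vip, client)

print("unrewritten reply accepted:", raw_reply_ok)
print("rewritten reply accepted:  ", rewritten_ok)
```

Only the rewritten reply matches what the client is waiting for, which is why the load balancer has to remember the connection and translate in both directions.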

Hopefully by now you are comfortable with the basics of load balancing. In the next blog we will discuss topics such as “Load balancing decision,” “To load balance or not to load balance,” “Application delivery controller (ADC),” and many more.


