Availability means your web application is there for your users whenever they want to use it. We would all like our applications to be available 100% of the time, but for various reasons that does not happen. The goal of high availability (HA) is to keep the application available as much of the time as possible. Availability is generally expressed as the percentage of time per year that the application is available, for example 99% or 99.9%.
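To make these percentages concrete, here is a minimal sketch that converts an availability figure into the downtime it allows per year. The function name and the 365-day year are assumptions made for illustration.

```python
HOURS_PER_YEAR = 365 * 24  # 8760 hours, ignoring leap years (an assumption)


def downtime_hours_per_year(availability_percent: float) -> float:
    """Return the yearly downtime, in hours, implied by an availability percentage."""
    if not 0 <= availability_percent <= 100:
        raise ValueError("availability must be between 0 and 100")
    return HOURS_PER_YEAR * (1 - availability_percent / 100)


for pct in (99.0, 99.9, 99.99):
    hours = downtime_hours_per_year(pct)
    print(f"{pct}% available -> about {hours:.2f} hours of downtime per year")
```

At 99% the application may be down for roughly 87.6 hours a year; at 99.9% that shrinks to under 9 hours.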
Redundancy and failover are techniques used to achieve high availability. Redundancy is achieved by having multiple copies of your server. Instead of one Apache web server, you have two. One is the active server. The active server is monitored, and if it fails for some reason, you fail over to the second server, which becomes active. Another approach is to use a cluster of active servers, as is done in a Tomcat cluster. All servers are active, and a load balancer distributes load among the members of the cluster. If one or two members of the cluster go down, no users are affected because the other servers continue processing. Of course, the load balancer can itself become a point of failure, so it too needs redundancy and failover.
If you were launching a new web application to the cloud, you might start off with a basic architecture, as shown below, without any HA consideration.
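A rough way to see why redundancy helps is to estimate the availability of several copies together. The sketch below assumes that the copies fail independently and that failover is instantaneous, which real systems only approximate; the function name is made up for illustration.

```python
def combined_availability(single: float, copies: int) -> float:
    """Estimate the availability of `copies` redundant servers.

    Assumes independent failures: the service is down only when
    every copy is down at the same time.
    """
    if copies < 1:
        raise ValueError("need at least one server")
    if not 0 <= single <= 1:
        raise ValueError("availability must be a fraction between 0 and 1")
    return 1 - (1 - single) ** copies


for n in (1, 2, 3):
    print(f"{n} server(s) at 99% each -> {combined_availability(0.99, n):.6f}")
```

Under these assumptions, two servers that are each 99% available give about 99.99% together.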
Phase 1: 1 Tomcat web server
Phase 2: Tomcat cluster
You add redundancy and scalability by using a Tomcat cluster, as shown in the figure below. The cluster is fronted by an Apache web server with mod_proxy, which distributes requests to the individual servers. mod_proxy acts as the load balancer.
Now the application scales horizontally. A Tomcat or application failure on one server is not an issue because there are other servers in the cluster. But we have introduced a new point of failure: the load balancer. If Apache + mod_proxy goes down, the application is unavailable.
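The idea of spreading requests across cluster members while skipping failed ones can be sketched as a simple round-robin balancer. This is only an illustration of the concept, not how mod_proxy is implemented; the class and the member names (`tomcat1` and so on) are hypothetical.

```python
class RoundRobinBalancer:
    """Hand out cluster members in turn, skipping members marked as down."""

    def __init__(self, members):
        self.members = list(members)
        self.healthy = set(self.members)
        self._index = 0

    def mark_down(self, member):
        self.healthy.discard(member)

    def mark_up(self, member):
        if member in self.members:
            self.healthy.add(member)

    def next_member(self):
        # Try each member at most once per call, starting where we left off.
        for _ in range(len(self.members)):
            member = self.members[self._index]
            self._index = (self._index + 1) % len(self.members)
            if member in self.healthy:
                return member
        raise RuntimeError("no healthy cluster members")


lb = RoundRobinBalancer(["tomcat1", "tomcat2", "tomcat3"])
lb.mark_down("tomcat2")  # simulate one Tomcat instance failing
print([lb.next_member() for _ in range(4)])
# prints ['tomcat1', 'tomcat3', 'tomcat1', 'tomcat3']
```

Requests keep flowing to the remaining members, but the balancer object itself is a single instance, which mirrors the single point of failure described above.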
To read more about setting up a Tomcat cluster, see Tomcat clustering.
To learn how to use a load balancer with Tomcat, see Loadbalancing with Tomcat.
Phase 3: Highly available Tomcat cluster
The figure below shows how to eliminate the point of failure and make the load balancer highly available.
You add redundancy by adding a second Apache + mod_proxy server. However, only one of the two Apache servers is active. The second one does not handle any requests; it merely monitors the active server using a tool like heartbeat. If the active server goes down for some reason, the passive server detects this, takes over the IP address, and starts handling requests. How does this happen?
This is possible because the IP address that is advertised to the world for this application is shared by the two Apache servers. This is known as a virtual IP address. Although the two servers share the virtual IP, the network delivers packets only to the active server. When the active server goes down, the passive server claims the virtual IP so that packets intended for that address are routed to it instead. There are networking commands that let a server start and stop listening on the virtual IP address.
Tools like heartbeat and Ultramonkey enable each server to maintain a heartbeat with the other and to fail over when necessary. With heartbeat, a heartbeat process runs on each server. Config files hold information about the virtual IP address, the active server, and the passive server. There are several articles on the internet on how to set up heartbeat.
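The monitoring and takeover logic on the passive server can be sketched as follows. This is a simplified model and not the heartbeat tool's actual code or configuration: the class, the timeout value, and the example address (taken from the documentation range 192.0.2.0/24) are assumptions, and the takeover is only recorded in a flag rather than performed on a real network interface.

```python
class PassiveNode:
    """Model of a passive server that watches for heartbeats from the active one."""

    def __init__(self, virtual_ip: str, timeout_seconds: float):
        self.virtual_ip = virtual_ip
        self.timeout_seconds = timeout_seconds
        self.last_heartbeat = None  # time of the last heartbeat seen, if any
        self.owns_virtual_ip = False

    def on_heartbeat(self, now: float) -> None:
        """Record that the active server was alive at time `now`."""
        self.last_heartbeat = now

    def check(self, now: float) -> bool:
        """Take over the virtual IP if the active server has been silent too long."""
        if self.owns_virtual_ip:
            return True
        if self.last_heartbeat is None:
            return False  # never heard from the active server; do not guess
        if now - self.last_heartbeat > self.timeout_seconds:
            # A real setup would bring the virtual IP up on this machine here.
            self.owns_virtual_ip = True
        return self.owns_virtual_ip


node = PassiveNode("192.0.2.10", timeout_seconds=3.0)
node.on_heartbeat(now=0.0)
print(node.check(now=2.0))  # False: the last heartbeat is recent enough
print(node.check(now=5.0))  # True: silent for more than 3 seconds, so take over
```

Timestamps are passed in explicitly so the example is deterministic; a running monitor would read them from a clock instead.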
In summary, you can build highly available applications using open source tools. The key concepts of HA (redundancy, monitoring and failover, and virtual IP addresses) apply to any service, not just web servers. You can use the same concepts to make your database server highly available.