{"id":782,"date":"2014-12-11T07:50:30","date_gmt":"2014-12-11T07:50:30","guid":{"rendered":"http:\/\/www.migenius.com\/?p=782"},"modified":"2020-11-02T21:31:40","modified_gmt":"2020-11-02T21:31:40","slug":"how-many-servers-do-i-need","status":"publish","type":"post","link":"https:\/\/www.migenius.com\/articles\/how-many-servers-do-i-need","title":{"rendered":"How Many Servers Do I Need?"},"content":{"rendered":"

It sounds like such a simple question, and it is asked by almost every new customer we engage. In reality, answering it can get complex quickly; however, there are a few basic ways to get some quick estimates. What we are really talking about here is Capacity Planning<\/em>. For a more general reference on planning for scalable websites there are many books out there; one older but still great reference is Cal Henderson\u2019s Building Scalable Websites<\/a>.<\/p>\n

Before setting out we should clarify that at migenius we are typically working with websites and applications with very different compute requirements to your average website. Normally servers are used for serving pages, database operations, perhaps running search engines and other tasks that would occupy only a small portion of any server\u2019s resources when servicing a single user\u2019s requests.<\/p>\n

In stark contrast, the portion of a website using our technology is usually associated with Photorealistic Rendering<\/em>, an extremely complex and compute-intensive operation. To get the speed needed, migenius utilises NVIDIA Iray<\/a> and servers with NVIDIA GPU hardware, making these deployments quite atypical in the world of web application development.<\/p>\n

Our technology always targets the maximum performance possible for any given task, even when servicing only a single user. As a result it will quickly soak up any available resources it can in pursuing this goal. This means that most of the time we fully utilise any server which is running our software. Serving a webpage or doing database operations, on the other hand, usually utilises only a small fraction of the available server resources (allowing many users to use the same server).<\/p>\n

Because of these differences we will outline a process for making an estimate of the required servers which we use ourselves when working with customers who are adopting our RealityServer<\/a> technology. This process should be applicable to any application which also has a heavy compute requirement.<\/p>\n

Estimating Your Server Requirements<\/h2>\n

So, here is a fairly straightforward process in six steps to get an estimate of the number of servers you might need to support a compute-intensive (on the server-side) web application.<\/p>\n

Step 1 – Select a Base Hardware Configuration<\/h3>\n

While you might choose to completely reconfigure your hardware selection later on in the process, you need to get a baseline performance on a particular, known configuration. If you have relative performance data for other hardware configurations you can then estimate how using different hardware will affect your application.<\/p>\n

You should pick a hardware configuration that at least gets you your desired performance for a single user. This hardware configuration could be a single server or a cluster; it doesn\u2019t really matter as long as it\u2019s a known configuration and it delivers your basic performance.<\/p>\n

Step 2 – Select a Representative Dataset<\/h3>\n

Since we are focused here on compute-intensive applications, the amount of computational power needed typically depends entirely on the dataset chosen. There is no point choosing a simple dataset if your application will be processing something much more complex. Select something that is representative of what you expect to see when you deploy.<\/p>\n

If your datasets vary a lot in complexity, you might want to select multiple datasets. When you don\u2019t have a dataset to use you might need to create a synthetic dataset which approximates something of a similar complexity to the real dataset. Whatever you do, don\u2019t use something trivial since it will only mean you greatly underestimate your resource requirements.<\/p>\n

Step 3 – Select Minimum Required Performance<\/h3>\n

To get a realistic estimate of the required hardware, you need to decide on the minimum performance you are willing to accept. This might not be the average or typical performance, but it should represent the minimum level at which you consider your user is still getting an acceptable experience.<\/p>\n

For our example, let\u2019s say that we measure performance in thousands of operations per second (performance can be anything you can define and measure), so higher numbers are better in this case. We are not saying what these operations might be because we want to keep the example generic, and it really doesn\u2019t matter. Now, for argument\u2019s sake, we will say that our application must have at least the following performance:<\/p>\n

\n 10.00 thousand operations per second\n<\/p><\/blockquote>\n

Below this threshold we will assume the user experience suffers to the point of being unacceptable.<\/p>\n

Step 4 – Select the Maximum Number of Simultaneous Users<\/h3>\n

Now this part can be tricky, particularly since it really depends on whether you have any information from an existing application or website you can use, or if you need to try and make an educated guess based on what you know about your application. This number, however, is also one of the most important, so you should do whatever you can to make a good estimate.<\/p>\n

If you do have an existing website with a lot of traffic, your web analytics can help a lot in identifying usage patterns that can lead you to an accurate estimate. Usually your compute-intensive application will only be a small portion of what is done on your website, so you should be careful not to just look at things like monthly unique visitors, since this will lead to greatly overestimating the traffic that would reach your application.<\/p>\n

You will also need to decide at this stage whether to plan for the peak or the average. If you plan for the peak (maximum simultaneous users) then you may end up with a lot of unused hardware capacity during non-peak times. The flip side is that if you plan for the average, peaks will generally overwhelm your resources and some users are going to miss out.<\/p>\n

If you are deploying a system with the ability to scale the resource allocation up and down, you can potentially avoid being caught out without enough resources if you are able to detect when the peaks are coming before they arrive. Auto-scaling is a topic for another post; however, it is well-supported by most cloud providers.<\/p>\n

For our discussion here let\u2019s assume the following maximum number of simultaneous users that need to be supported.<\/p>\n

\n 25 simultaneous users\n<\/p><\/blockquote>\n

Keep in mind these are users that are actively using the compute-intensive portion of the application at exactly the same time. This will typically be a much smaller number than the number of overall visitors to your site (unless the compute-intensive application is the entire reason for people to visit your site).<\/p>\n
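
If your analytics can export the start and end time of each session that touched the compute-intensive part of the site, both the peak and the average number of simultaneous users can be estimated from that data. The following is a minimal Python sketch under that assumption. The function names and the sample sessions are hypothetical and only illustrate the idea.

```python
def peak_simultaneous_users(sessions):
    # sessions: iterable of (start, end) times in seconds, with end exclusive.
    events = []
    for start, end in sessions:
        events.append((start, 1))   # a user arrives
        events.append((end, -1))    # a user leaves
    # Tuples sort by time first; at equal times -1 sorts before +1,
    # so a session ending at t does not overlap one starting at t.
    events.sort()
    current = 0
    peak = 0
    for _, delta in events:
        current += delta
        peak = max(peak, current)
    return peak


def average_simultaneous_users(sessions, window_start, window_end):
    # Total user-seconds divided by the length of the observation window.
    # Assumes every session lies entirely inside the window.
    busy = sum(end - start for start, end in sessions)
    return busy / (window_end - window_start)


# Hypothetical sample: four sessions inside a 300 second window.
sample_sessions = [(0, 60), (30, 90), (45, 120), (200, 260)]
peak = peak_simultaneous_users(sample_sessions)
average = average_simultaneous_users(sample_sessions, 0, 300)
print(peak, average)
```

The gap between the two numbers gives a feel for how much capacity would sit idle when planning for the peak, and how badly an average-based plan would be exceeded.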

Step 5 – Run Tests on the Representative Data on the Base Hardware Configuration<\/h3>\n

Once you have your hardware, dataset, performance level and simultaneous users, you need to do some test runs and measure the performance and utilisation. Here is a typical sequence you might go through.<\/p>\n

    \n
  1. One User, measure performance<\/li>\n
  2. Two Simultaneous Users, measure performance per user<\/li>\n
  3. Three Simultaneous Users, measure performance per user<\/li>\n
  4. Four Simultaneous Users, measure performance per user<\/li>\n<\/ol>\n

     <\/p>\n

    At each step you should review the performance achieved and stop at the point where it goes below your defined Minimum Acceptable Performance<\/em>. Then take the count of simultaneous users from the previous run and that will be the number of users your base hardware configuration can support. We\u2019ll call that Users per Server<\/em>.<\/p>\n
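
The test sequence above can be sketched as a simple loop. This is an illustrative Python sketch, not a description of any migenius tooling: the measurement callback, the max_users safety limit and the idealised 36.0 figure in the usage line are all assumptions made for the example. In practice the callback would run your real load test and report the measured per-user performance.

```python
def users_per_server(measure_per_user_performance, minimum_performance, max_users=64):
    # measure_per_user_performance(n) is a hypothetical callback that runs a
    # test with n simultaneous users and returns the per-user performance
    # (higher is better, same units as minimum_performance).
    supported = 0
    for users in range(1, max_users + 1):
        performance = measure_per_user_performance(users)
        if performance < minimum_performance:
            # Below the acceptable threshold: keep the count from the previous run.
            break
        supported = users
    return supported


# Idealised stand-in for a real test: a server whose total throughput of
# 36.0 thousand operations per second is shared evenly between users.
def idealised_measurement(users):
    return 36.0 / users


print(users_per_server(idealised_measurement, 10.0))
```

A result of zero means the base configuration cannot meet the minimum even for a single user, which points back to Step 1. Real hardware rarely shares resources perfectly evenly, so measured values should always replace the idealised stand-in.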

    Step 6 – Determine Server Requirements<\/h3>\n

    So now we have all of the pieces we need to calculate how many servers we will need to provision. For our example we will say we obtained the following performance measurements from Step 5:<\/p>\n