# Queueing Theory Explained

Let’s start this post with an extract of Murphy’s Law.

If you change queue, the one that you left will start to move faster than the one you are in now.

Your queue always goes the slowest.

Whatever queue you join, no matter how short it looks, it will always take the longest to you to get served.

Well, such statements are not encouraging to reading further this blog post that, basically, aims at explaining in a simple voice the basics of the Queueing Theory; anyway, it was a well posed tribute to Murphy, and, at the same time, a challenging bootstrap!

Queueing Theory, my vision: a branch of the applied mathematics providing probabilistic modeling tools for Queueing Systems. The key aspect, to me, is around the Queueing Systems, something really simple and daily experienced by all of us; an example, the below picture.

A service desk (i.e. service offering) serves a line of people (i.e. service requestors), that’s what happens almost every day in banks, post offices, canteens, etc. Well, once in a line/queue, what do we care the most? Normally, how long it will take to get served, and what does the service desk care the most? Normally, how many clients will be served per timeframe-basis. Summarizing by this example, our system has a set of intuitive parameters that relate each other:

• Service Time. How long it takes to serve a customer, on average.
• Wait Time. How long a customer has to wait, on average.
• Queue Length. How many customers can be lined up, on average.

Intuitively, if the service desk starts serving slower, the queue length increases and the wait times increase too; otherwise, if the wait time increases, this means that the service time is increased and the queue length will grow up.  Another important aspect of a Queueing System is the interarrival rate:

• Interarrival Rate. The rate at which the customers arrive at the service desk line, on average.

Now, why Queueuing Theory is so important for Computer Scientists, Engineers and practitioners? The answer is: because the set of models can be applied in real-world scenarios to mainly i. size and ii. tune network-based systems (e.g. distributed systems).

NOTE Of course, the fields of application are far more vast, this blog post presents a perspective mostly suited for ICT Engineers and Computer Scientists.

Let’s switch over a more formal definition of the Queueing Theory, and let’ start by assessing a reference model: in a Network, each Node can be modeled as a Queueing System. As example, let’s assume the TCP System Model below presented.

Three nodes can be enlisted: a Client, a Server and the Network, all of them interconnected and, at some extent, making use of buffering. The ‘λ’ defines the interarrival rate and the ‘μ’ defines the service time. Such a complex system can be analyzed by involving the Queueing Theory modeling, as discussed earlier in the post. It would be interesting to simulate the performances of such a system before to deploy it, or in case the system is already in production, bottlenecks and performance analysis might be easily carried out.

It’s time to introduce a bit of formalism in this discussion. A Queueing System has to be modeled as a Stochastic Process, in which each variable/parameter is a Random Variable. According to the Kendall’s Notation, a system composed by one or more nodes can be represented using the following notation:

A / B / m / K / n / D

Where:

• A – distribution function of the interarrival times,
• B – distribution function of the service times,
• m – number of servers
• K – system’s capacity (e.g. maximum number of customers in the system, comprehensive of the one being served)
• n – population size (e.g. numbers of sources of customers),
• D – service discipline.

An example: M / M / m / infty / infty / FIFO, that’s a queueing system modeled by a Markov process for Arrival and Service times, composed by m Servers, having infinite (i.e. infty) Queue and Population capacities but FIFO service policing.

The most common types of random distributions are:

• M – Markovian (memoryless exponential distribution),
• D – Deterministic,
• G – Generic.

An example: M / M / 1 has Markovian interarrival and service time, but only 1 Server; on the other hand, M / G / r, has a Markovian interarrival, a Generic service time and r Servers.

Queueing Systems can be classified in:

• Finite-Source System. The interarrival rate depends on the system’s state (e.g. number of cusomers/requests already in the system),
• Infinite-Source System. The interarrival rate is independent on the system’s state (e.g. requests arrival rate is completely independent from the number of requests already in the system).

From the above perspective, a Queuing Model aids in identifying a set of statistics characterizing a system, like:

• Main Statistics
• Mean Number of Customers in a System and Variance.
• Mean Number of Waiting Customers, mean Queue Length and Variance.
• Server Utilization.
• Distribution of the Response Time of a Customer.
• Distribution of the Waiting Time.

The foundation is the Probability Theory that is the basic for the Statistics, and branches in i. Discrete Probability (finite and discrete sample space) and ii. Continuous Probability (indefinite and continuous sample space). In Queueing Theory, mostly continuous probability is used: events flow in a system with continuity. The Performance indicators that mostly concerns are i. Response Time and ii. Througput, intended, respectively, as measure of the RTT (Round-Trip Time) latencies and rate of requests served per second. The Little’s Law relates metrics like i. Response Time, ii. Throughput and iii. Queue Length in a law valid universally for G / G / m systems, and that can be used to analyse and solve Software Performance and Scalability problems. Little’s Law formulation is the following:

The long-term average number of customers in a stable system L is equal to the long-term average effective arrival rate, λ, multiplied by the (Palm‑)average time a customer spends in the system, W; or expressed algebraically: L = λW.

According to the formulation, the Little’s Law relates the i. average number of customers L in the system, ii. effective arrival rate λ, and iii. average time spent W in the system. An example of application: let’s assume to have 9 jobs in queue and only 1 served, knowing that the average throughput is 50 jobs/second, the average response time can be calculated simply by W=L/λ=10/50=0.2 seconds.

The Queueing Theory has many derived models that can be applied to real-worl/case scenarios. The theory ensures a level of abstraction that aids in both assessing and simulating metrics, this means it’s a very powerful tool in tuning and sizing the network-based systems. Statistical tools like R allows to easily define and simulate systems: here, an example of how easy is to simulate a M / M / 1 by using R.

Software Perfomance Engineering is one of the Computer Science’s branches that makes use of the Queueing Theory, for example, to analytically validate test campaign results, or to find bottlenecks, etc. Anyway, the Queueing Theory is largely used in telecommunications, the same sector in which the theory itself was born.

In conclusion, this post tried to introduce the basics of Queueing Theory in a simple voice: every Engineer should know about that theory, definitely the theory  should be properly a tool in the skill set of any Computer Scientists and ICT Engineers.

## 3 thoughts on “Queueing Theory Explained”

1. vladmihalcea says:

Very nice and clear explanation of the basic queueing theory principles. It reminds me of my Simple Scalability Equation article, I wrote some time ago.

Liked by 1 person

1. hailpam says:

Thanks! I was reading and appreciating your article. I tried in my blog post to express the Kendall’s notation in a simple voice: sometimes, it’s useful to have a general understanding too.

I’ll try to ding into the theory with another post.

Liked by 1 person