The Shortest Queue Conundrum

What algorithms lie behind the movement of multiple queues?

Overview

The Area of Enquiry & Design Intent

When there are multiple queues to, say, a ticket counter at the theatre, or the checkout counter at a department store, one is faced with the question of which queue would move the fastest, so as to join that one. But before making a decision of which queue to pick, can we determine which queue would move the fastest?

Objectives of the Study

What parameters decide the speed of movement of the queue, and how much do they affect the outcome?

Is there an algorithm that governs which line, among multiple lines, moves the fastest?
On what basis do we pick lines?

I aim to study the scope and dependency of the movement of the queue, as affected by the observed parameters.

Design Intent

I intend to be able to determine, upon observing a set of queues and studying all the relevant parameters, which queue would move the fastest, by coding an algorithm and testing it.

Index

Understanding & Analyzing Algorithms

What Is The Shortest Queue Conundrum?

Observing the Parameters Involved

Detailed Protocol for Coding the Algorithm

Coding the Algorithm

Conclusion: Inferences, Limitations & Future Scope

Index

Click on any of the topics to go directly to the relevant section

Understanding & Analyzing Algorithms

What Is A Good Algorithm?

In a team of five, we brainstormed on what we thought made an algorithm ‘good’. We exchanged ideas on why we felt good algorithms needed to have certain characteristics, and discussed examples of the same. After penning all our thoughts down, we further broke down each characteristic, discussed its significance in being included, and then added it to our final consolidated definition of what a good algorithm means to us.

Brainstorming On 'Good Algorithm'

Examples For Keywords In Brainstorming

(Click any of the images for a larger view)

Our Consolidated Definition of a Good Algorithm

Click the image to zoom into the details

Studying Basic Search & Sort Algorithms

We wrote the code for two basic algorithms, Merge Sort and Binary Search, then proceeded to formulate an experiment protocol to design test cases to study all the dependencies between,

Maximum Value of the Element
Size of Input Array
No. of Iterations

against the Total Runtime in these algorithms, and also find the best, average and worst runtimes for 1000 iterations. These dependencies were then plotted on a graph and analyzed.

We studied another method of analyzing algorithms, which was an arithmetic model of predicting runtimes, by calculating the arithmetic cost of each step. Totaling the cost of every step in the algorithm, we built an equation which was used to calculate the total predicted runtime. This calculated value was then compared with the real runtime value of the algorithm, to measure the accuracy of using the arithmetic model.

These methods of algorithm analysis prepared the foundation for the main task: identifying underlying algorithms around us to understand and analyze them using the tools we learnt.

Understanding & Analyzing Algorithms

What Is The Shortest Queue Conundrum?

An Introduction To The Chosen Topic

Have you ever stood in a grocery store with your shopping bags, about to checkout, faced with the decision of which checkout line to join, so as to get home in the shortest time possible?

The cutest and most accurate representation of the inspiration behind this study, found on Google during research

You are not alone. We all have, on some occasion, taken a few seconds to observe the customers standing in each line and then done a quick mental calculation to figure out which line would be the shortest, and therefore the best one to join. However, while we can justify our choice with significant reasoning based on our observations, we often never seem to join the right queue, despite all our efforts. As soon as we join one, triumphantly rejoicing at our smart decision, our queue is sure to slow down while the one right next to ours will seemingly speed up. This has often happened to many of us, including me, and seems to be a cruel joke played on us by the universe. This mystery and intrigue behind the movement of queues is the area I want to study further in this project on algorithms.

I intend to examine and understand the factors that affect the speed of movement of a set of queues, and build an algorithm incorporating these parameters to arrive at a solution, that will utilize the algorithm to determine a clear answer for an incoming customer of which line would be the most beneficial to join at that moment.

What Is The Shortest Queue Conundrum?

Observing The Parameters Involved

What factors could affect the speed of a line?

From my own experience, some significant parameters that seem to affect the speed of movement of a line are,

number of people in the queue
number of items each person is holding (in case of a department store checkout counter)
amount of luggage they are carrying (in case of an airport check-in counter)
type of person in the queue (for example, if it is a family with children, a group of youngsters, or a person standing alone)
method of payment of the person in the front of the line (credit/debit usually seems to take more time than cash)
how efficient the cashier behind the counter is at completing the task

These possibilities have been penned down solely from experience, but is a good place to start actually testing them to see if a change in the parameters results in a positive, negative, or no change at all in the queue speed.

Brief Study Plan

I plan to design test cases for one location with multiple queues using the key factors that upon observation, seem to affect the speed of movement of the queue, as parameters in the test case to find their relationship with the amount of time taken for a person at the end of the queue, to reach the front.

If there are three queues to the counter at the chosen location, the time taken for say, the fourth person in each queue to reach the counter, could be calculated using the chosen parameters, varying one parameter and keeping the rest constant, in order to compare the results and observe their relationship with the speed of the queue. These outcomes could be compared with values recorded at the location in real time, and observe the variance in graphs, and determine the accuracy of the test cases.

Upon finding and plotting these results, it could be determined which parameters are significant (or hyperparameters), and which parameters do not have a notable effect on the outcome. These significant parameters can then be further tested with more data points to observe and define their relationship with the speed of movement of the queue.

Field Visit to D-Mart for Observations

I chose D-Mart as the location to record real readings of the considered parameters. I stood near the checkout counters and with help from a friend, we observed 3 queues that were in movement at the time.

Images from Google, as I forgot to take pictures from my field visit

Observational Readings

We stood near the checkout lines and took down values of these parameters:

Type of customer
Approximate number of items
Mode of payment
Service time

The Effect of Unforeseen Elements

While taking these readings, I realized that the possibility of unforeseen factors also needed to be considered. In the third image above, the last reading shows that the total service time for the older couple was 12 minutes due to some issue with their credit card. They tried to make it work for several minutes while the other customers in line waited, but then the cashier ended up having to open a new counter while this issue was being resolved.

This kind of event is completely unforeseen, as the other customers who used cards did not have any issues. But this does bring in an element of chaos, that could be considered in the algorithm as a buffer time period, or could serve as a parameter that results in a complete do-over where the algorithm reassigns the customers in that affected line, to one of the other lines based on the new data.

Observing The Parameters Involved

Detailed Protocol For Coding The Algorithm

The Detailed Plan-Of-Action

I plan to collect data of variables in the scenario that might be affecting queue time from one location with multiple queues, use that data to randomize these parameters to simulate real world queues, and then calculate their queue times. All the possible factors that might affect queue time will be listed, and multiple real readings from queues will be collected for as many factors as possible. For the remaining, an average value as close to real data as possible will be chosen. I will build the algorithm, and have it draw the values of the variables from these readings at random, in order to simulate a real queue. The number of queues in the scenario will be randomly picked as well.

Building The Algorithm

The algorithm will work in steps, first going into a loop- the number of iterations being the number of queues (randomly generated), and then a random number of customers will be selected for that queue.

Here, it will call the function that calculates the customer-related parameters, i.e., whether or not they have a membership card of the store. If they do, a certain number of seconds are added as extra payment time, keeping in mind that the customer will be asked to produce that card, which will then be swiped, points will be added and then it will be returned to the customer. If the customer says they do not have a membership card, but would like one, again a certain number of seconds are added for the process that follows.

The other factor that relates to customers is mode of payment. Here, I can use the real data that was collected at D-Mart with multiple readings of the time duration it takes for cash and card payment to occur. In the algorithm, at this point, the customer will be assigned cash or card payment at random, and whichever payment is chosen, one value will be chosen from the list of real readings and added as payment time. The same is carried out for card payment.

Once these factors are added, the function to calculate item-related factors will be called, carrying forward the ‘payment time’ value as a parameter in the function, as it will be used in the calculation here. In the item function, factors relating to the item are assigned, the first being whether or not it is on discount. If it is on discount, a value will be added as extra scan time, as the cashier would take a few extra seconds to scan and deduct an amount from the item’s price.

The item category is also assigned at random, the considered choices being ‘clothing’ and ‘food’. The latter does not matter, it could be something other than food too, as we are concerned with the time period of the former. The cashier would take a few extra seconds to remove the anti theft device from the item of clothing, and then would have to scan and fold it.

The number of items this customer is purchasing, is chosen at random and the loop for checking these factors is run that many times.

Finally, the service time for this customer will be calculated in this function and returned. In the main function, this value will keep getting added to the other service times in the queue, and all the final queue times will be stored in a list and returned

Additionally, when an individual is making the decision of which queue to join, the first customer in every queue may be partially done, and so this would factor into their decision. If a person is making their payment, this means they are just about to leave the queue, which would be appealing to the individual standing outside the queues, as this queue would move up one person very soon. This will be included in the algorithm in a simplified manner, wherein the first customer will be assigned a number at random from 1, 2 and 3.

1 would mean they have just reached the counter, and have completed only 25% of their service time.
2 would mean they are halfway through (50%), and
3 would mean they are almost done, i.e. 75% of their service time is over.

In the algorithm, this percentage would be subtracted from their total service time, only for the first customer.

The final output will be a list of queue times, displayed in order of queue number- the first queue time corresponding to the first queue, and so on. The shortest queue time can be noted, and the corresponding queue can be joined.

Detailed Protocol for Coding the Algorithm

Coding The Algorithm

The Final Code

Using the data collected from the field visit to D-Mart, I added a few more possible factors that could affect the speed of the line, as outlined in the previous section. The following are parameters taken into consideration for the code, that involve both real readings taken from the store as well as others that were beyond the scope of collecting on the day of the visit, but do take place regularly in checkout queues.

Click the image to zoom into details, if text not legible

Snippets of the Code

Find the whole code here

Screen Mockups

Screen 1

Screen 2

Screen 3

Screen 1

1/3

A clearer view of the 3 screens, navigate using the arrows

Coding The Algorithm

Conclusion

Inferences From Study

The data used to test this algorithm is of course, very small, which is why any dependent or independent factors affecting the queue time tested with this algorithm would result in skewed outcomes. This is mainly due to the differences between grocery stores. Each store has different characteristics- one store might see a large cash-using customer base and so, may be very efficient in storing change for cash, thereby making cash transactions much quicker than card transactions. Another store might be the other way around, well-equipped for card transactions but may take longer to hunt for the accurate change for a cash payment. A strong factore could be the efficiency of the cashiers, one might even be more efficient than the other within the same grocery store. Other various factors that come into play is the technological strength of the store- its computer system, how quickly they can log in discount values, add membership points, or even scan items.

So, the algorithm would need to cater completely to a particular store before being deployed there in order to give customers the most accurate results.

Limitations

Due to the short timeline of this project, this study could only incorporate observations taken at one grocery store, for a few hours. Therefore, the values of some key parameters had to be assumed for the purpose of building the algorithm. Also, the 'type of customer' parameter (i.e. young woman, a couple, older man, family of 4, etc) that was recorded during observations, was not built into the algorithm as I could not conclude from the miniscule set of readings whether or not the type of customer makes a difference to the overall queue time. Perhaps with a larger study, it could be determined if this data factors in, or if it has no effect on the final prediction.

One of the main limitations in the current algorithm is that the 'chaos factor' (this term could include human error, or just unforeseen circumstances that are beyond accurate prediction, and how that could be incorporated into the algorithm's output) has not been built into it. Factoring that in would require a much more detailed study that employs extensive observations for several distinct grocery stores, at all hours including peak hours as well as slow hours, and recording when functions take place without a hitch, compared to when human errors, or factors beyond the employee's control occur.

This algorithm was built with the thought that it would not need any customer input at the time of use, but currently, it uses data such as customer's mode of payment and membership details to determine a more accurate output. This information may not be available without the customer inputting it themselves, so the question arises for the future scope of this algorithm- how much data (such as item count) can be accessed through scanners or other such modes, and how much information would need manual customer interaction with the device.

Lastly, an important thing to note is that in my observations taken at D-Mart, all methods of payment by customers were either cash or card. Now however, payment modes such as GooglePay and PayTM are gaining widespread popularity, especially with the younger generation, and even much smaller stores allow UPI payment these days. The payment time period in seconds for these modes were not accounted for in the algorithm, but for future possibilities, this addition of data would lead to more accurate predictions.

Future Scope: The Vision For This Algorithm

No one likes long checkout lines, and this study could go a long way towards increasing the efficiency of grocery stores. The algorithm, upon feeding data, can provide valuable real-time information to the grocery store's customers about which queue would be the best one to join. It could be deployed as a simple screen right before the checkout counters, where the algorithm would first need to record the data. This, I imagine, could be accomplished by scanners present near the shopping cart to determine an approximate number of items, as well as the category of items. An even simpler way could be a smart cart, where this data would already be recorded in the process of shopping. The algorithm could use the individual customer's data, along with the pre-recorded data of the grocery store- including their average scan times, efficiency of the specific cashier at each queue (this data would probably have to be updated in case cashiers switch counters or change shifts), and other time periods for every task they undertake.

The customer would only need to wait a few seconds while the program runs and then displays which queue should be joined, and why. The 'why' would be included to increase transparency of its working, but mainly so that the algorithm can continually be checked and improved. For example, if the algorithm calculated a slower queue time for one of the queues due to one customer's items including a large number of clothes, the algorithm could have come to this conclusion through the addition of several 'fold' times (time taken by the cashier to properly fold the clothing items), as well as a few discount times. But the items could a number of towels that are already folded in the cart, that could have been wrongly scanned as more complicated clothing items. Transparency of data during events like this, would enable us to improve the algorithm to account for mishaps and miscalculations such as these.

Another aspect, as mentioned in the limitations, would be to account for the mode of payment. The limitation is that the current algorithm takes into account the mode of payment, however, that information would not be known through a simple scan of the cart. That information would have to be entered on the screen by the customer before the algorithm runs and tells them which queue to join. It is important to note, however, that entering this information would not benefit the current customer in any way- it is for the algorithm to run accurately for the next few customers. Entering which mode of payment they plan to use is a simple enough task though, that customers would probably be willing to do in order to improve their own overall shopping experience at the store.

Line Switching

In one of the observations taken at D-Mart, one of the customers had some issue with their card, and while trying to fix the issue, the service time for the customer totaled over 12 minutes and the staff had to open another counter for the remaining customers standing behind him. This is an example of unforeseen circumstances that could occur at any moment. What if the algorithm could incorporate the resultant 'line switching' into its functioning?

When something unexpected happens and customers have to switch lines, or even if they want to switch lines because they feel the other lines are moving quicker, how could the algorithm aide the customers in such a scenario? In a situation where suppose a card machine fails, the whole queue of customers will have to be dispersed to the other queues, and the algorithm in this scenario will have to make instant decisions for multiple people, not just one. It could perhaps recalculate each queue's total time period compared with the to-be-distributed customers' predicted service time periods, and make a decision for each of them.

Not Just Grocery Stores

The study of queues is not limited to grocery stores- it could apply to movie theatres, service centers, banks, or really any place where standing in line is a hassle, and customers would value knowing how long their task would take in order to plan the rest of their day accordingly. This would also bring in much more transparency and efficiency in the functioning of these systems.

Conclusion: Inferences, Limitations & Future Scope

What algorithms lie behind the movement of multiple queues?