This paper discusses an optimal dynamic policy for a queueing system with M servers, no waiting room, and two types of customers. Customer types differ with respect to the reward that is paid on commencement of service, but service times are exponentially distributed with the same mean for both types of customers. The arrival stream of one customer type is generated by a Poisson process, and the other customer type arrives according to the overflow process of an M/M/m/m queue. The objective is to determine a policy for admitting customers to maximize the expected long-run average reward.
By posing the problem in the framework of Markov decision processes and exploiting properties of submodular functions, the optimal policy is shown to be a “generalized trunk reservation policy”; in other words, the optimal policy accepts higher-paying customers whenever possible and accepts lower-paying customers only if fewer than c1 servers are busy, where i is the number of busy servers in the overflow queue. Computational issues are also discussed. More specifically, approximations of the overflow process by an interrupted Poisson process and a Poisson process are investigated.