Suppose that π is a policy for resource allocation in a stochastic environment and π* is an optimal policy. Two existing procedures for policy evaluation are described and compared. Both evaluate π by means of upper bounds on R(π*) − R(π), the total reward lost by making resource allocations according to π rather than π*. The bounds produced by these two methods are called Type 1 and Type 2. We demonstrate by example that neither procedure dominates the other in the sense of always yielding tighter bounds. A modification to the Type 2 bounds is then proposed, resulting in an improved procedure that always dominates the Type 1 approach.
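In symbols, writing B₁(π) and B₂(π) for the Type 1 and Type 2 bounds on a given problem instance (these subscripted names are introduced here only for illustration; the abstract names the bounds only as "Type 1" and "Type 2"), each procedure certifies a guarantee of the form

    R(π*) − R(π) ≤ Bᵢ(π),   i = 1, 2,

so the tighter certificate available from the pair is min{B₁(π), B₂(π)}. Non-dominance then means that B₁(π) < B₂(π) on some instances while B₂(π) < B₁(π) on others; the proposed modification replaces the Type 2 bound with one that never exceeds B₁(π).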