Hostname: page-component-cd9895bd7-p9bg8 Total loading time: 0 Render date: 2024-12-23T11:08:01.141Z Has data issue: false hasContentIssue false

QUICK SIMULATION METHODS FOR ESTIMATING THE UNRELIABILITY OF REGENERATIVE MODELS OF LARGE, HIGHLY RELIABLE SYSTEMS

Published online by Cambridge University Press:  01 July 2004

Marvin K. Nakayama
Affiliation:
Department of Computer Science, New Jersey Institute of Technology, Newark, NJ 07102, E-mail: [email protected]
Perwez Shahabuddin
Affiliation:
Department of Industrial Engineering and Operations Research, Columbia University, New York, NY 10027, E-mail: [email protected]

Abstract

We investigate fast simulation techniques for estimating the unreliability in large Markovian models of highly reliable systems for which analytical/numerical techniques are difficult to apply. We first show mathematically that for “small” time horizons, the relative simulation error, when using the importance sampling techniques of failure biasing and forcing, remains bounded as component failure rates tend to zero. This is in contrast to naive simulation where the relative error tends to infinity. For “large” time horizons where these techniques are not efficient, we use the approach of first bounding the unreliability in terms of regenerative-cycle-based measures and then estimating the regenerative-cycle-based measures using importance sampling; the latter can be done very efficiently. We first use bounds developed in the literature for the asymptotic distribution of the time to hitting a rare set in regenerative systems. However, these bounds are “close” to the unreliability only for a certain range of time horizons. We develop new bounds that make use of the special structure of the systems that we consider and are “close” to the unreliability for a much wider range of time horizons. These techniques extend to non-Markovian, highly reliable systems as long as the regenerative structure is preserved.

Type
Research Article
Copyright
© 2004 Cambridge University Press

Access options

Get access to the full version of this content by using one of the access options below. (Log in options will check for institutional or personal access. Content may require purchase if you do not have access.)

References

REFERENCES

Alexopoulos, C. & Shultes, B.C. (2001). Estimating reliability measures of highly-dependable Markovian systems, using balanced likelihood ratios. IEEE Transactions on Reliability 50: 265280.Google Scholar
Blum, A., Goyal, A., Heidelberger, P., Lavenberg, S.S., Nakayama, M.K., & Shahabuddin, P. (1994). Modeling and analysis of system dependability using the System Availability Estimator. In M. Malek, D.S. Fussel, A. Dahbura, & T. Nanya (eds.), Digest of papers: The twenty-fourth annual international symposium on fault-tolerant computing. New York: IEEE Press, pp. 137141.
Blum, A., Heidelberger, P., Lavenberg, S.S., Nakayama, M.K., & Shahabuddin, P. (1993). System Availability Estimator (SAVE) language reference and user's manual version 4.0. Research Report RA 219 S, IBM T.J. Watson Research Center, Yorktown Heights, NY.
Brown, M. (1990). Error bounds for exponential approximations of geometric convolutions. The Annals of Probability 18(3): 13881402.Google Scholar
Carrasco, J.A. (1991). Efficient transient simulation of failure/repair Markovian models. In V.K. Agarwal & E. Cerny (eds.), Proceedings of the tenth symposium on reliable distributed systems. New York: IEEE Press, pp. 152161.CrossRef
Carrasco, J.A. (1992). Failure distance-based simulation of repairable fault-tolerant systems. In G. Balbo & G. Serazzi (eds.), Computer performance evaluation: Modelling techniques and tools, pp. 351366. Amsterdam: Elsevier.
Conway, A.E. & Goyal, A. (1987). Monte Carlo simulation of computer system availability/reliability models. In J.P. Shen, D.P. Siewiorek, F. Cristian, & J. Goldberg (eds.), Proceedings of the seventeenth symposium of fault-tolerant computing. New York: IEEE Press, pp. 230235.
Crane, M.A. & Iglehart, D.L. (1975). Simulating stable stochastic systems, III: Regenerative processes and discrete event simulations. Operations Research 23(1): 3345.Google Scholar
Geist, R.M. & Smotherman, M. (1989). Ultrahigh reliability estimates through simulation. In Proceedings of the annual reliability and maintainability symposium. New York: IEEE Press, pp. 350355.
Gertsbakh, I.B. (1984). Asymptotic methods in reliability theory: A review. Advances in Applied Probability 16: 147175.Google Scholar
Glynn, P.W. (1994). Importance sampling for Markov chains: Asymptotics for the variance. Stochastic Models 10: 701717.Google Scholar
Glynn, P.W. & Iglehart, D.L. (1989). Importance sampling for stochastic simulations. Management Science 35(11): 13671393.Google Scholar
Goyal, A., Carter, W.C., de Souza e Silva, E., Lavenberg, S.S., & Trivedi, K.S. (1986). The System Availability Estimator. In H. Kopetz & M. Dal Cin (eds.), Proceedings of the sixteenth symposium on fault tolerant computing, pp. 8489. New York: IEEE.
Goyal, A. & Lavenberg, S.S. (1987). Modelling and analysis of computer system availability. IBM Journal of Research and Development 31(6): 651664.Google Scholar
Goyal, A., Shahabuddin, P., Heidelberger, P., Nicola, V.F., & Glynn, P.W. (1992). A unified framework for simulating Markovian models of highly dependable systems. IEEE Transactions on Computers C-41(1): 3651.Google Scholar
Gross, D. & Miller, D.R. (1984). The randomization technique as a modeling tool and solution procedure for transient Markov processes. Operations Research 32: 343361.Google Scholar
Heidelberger, P. (1995). Fast simulation of rare events in queueing and reliability models. ACM Transactions on Modeling and Computer Simulation 5(1): 4385.Google Scholar
Heidelberger, P., Shahabuddin, P., & Nicola, V.F. (1994). Bounded relative error in estimating transient measures of highly dependable Markovian systems. ACM Transactions on Modeling and Computer Simulation 4(2): 137164.Google Scholar
Jensen, A. (1953). Markoff chains as an aid in the study of Markoff processes. Skandinavisk Aktuarietidskrift 36: 8791.Google Scholar
Juneja, S. & Shahabuddin, P. (1992). Fast simulation of Markovian reliability/availability models with general repair policies. In D. Pradhan, J. Stiffler, J. Lala, & I. Koren (eds.), Proceedings of the twenty-second annual international symposium on fault tolerant computing. New York: IEEE Computer Society Press, pp. 150159.
Juneja, S. & Shahabuddin, P. (2001). Efficient simulation of Markov chains with small transition probabilities. Management Science 47: 547562.Google Scholar
Juneja, S. & Shahabuddin, P. (2001). A splitting-based importance-sampling algorithm for fast simulation of Markov reliability models with general repair policies. IEEE Transactions on Reliability 50: 235245.Google Scholar
Kalashnikov, V.V. (1989). Analytical and simulation estimates of reliability for regenerative models. Systems Analysis Modelling Simulation 6: 833851.Google Scholar
Keilson, J. (1979). Markov chain models—rarity and exponentiality. New York: Springer-Verlag.CrossRef
Lewis, E.E. & Böhm, F. (1984). Monte Carlo simulation of Markov unreliability models. Nuclear Engineering and Design 77: 4962.Google Scholar
Muntz, R.R., de Souza e Silva, E., & Goyal, A. (1989). Bounding availability of repairable computer systems. IEEE Transactions on Computers 38: 17141723.Google Scholar
Nakayama, M.K. (1994). A characterization of the simple failure biasing method for simulations of highly reliable Markovian systems. ACM Transactions on Modeling and Computer Simulation 4(1): 5288.Google Scholar
Nakayama, M.K. (1994). Fast simulation methods for highly dependable systems. In D.A. Sadowski, A.F. Seila, J.D. Tew, & S. Manivannan (eds.), Proceedings of the 1994 winter simulation conference. New York: IEEE Press, pp. 221228.
Nakayama, M.K. (1996). General conditions for bounded relative error in simulation of highly reliable Markovian systems. Advances in Applied Probability 28: 687727.Google Scholar
Nicola, V.F., Nakayama, M.K., Heidelberger, P., & Goyal, A. (1993). Fast simulation of highly dependable systems with general failure and repair processes. IEEE Transactions on Computers 42(8): 14401452.Google Scholar
Nicola, V.F., Shahabuddin, P., Heidelberger, P., & Glynn, P.W. (1993). Fast simulation of steady-state availability in non-Markovian highly dependable systems. In J.C. Laprie & A. Goyal (eds.), Proceedings of the 23rd annual international symposium on fault tolerant computing. New York: IEEE Computer Society Press, pp. 491498.
Nicola, V.F., Shahabuddin, P., & Nakayama, M.K. (2001). Techniques for fast simulation of models of highly dependable systems. IEEE Transactions on Reliability 50: 246264.Google Scholar
Obal, W.D., II & Sanders, W.H. (1994). Importance sampling simulation in ULTRA-SAN. Simulation 62: 98111.Google Scholar
Serfling, R.J. (1980). Approximation theorems in mathematical statistics. New York: Wiley.CrossRef
Shahabuddin, P. (1994). Fast transient simulation of Markovian models of highly dependable systems. Performance Evaluation, Special Issue: Performance '93—16th IFIP Working Group 7.3 International Symposium on Computer Performance Modeling, Measurement and Evaluation 20, 1–3, 267286.Google Scholar
Shahabuddin, P. (1994). Importance sampling for the simulation of highly reliable Markovian systems. Management Science 40(3): 333352.Google Scholar
Shahabuddin, P. (2001). Rare event simulation in stochastic models. In B.R. Haverkort, R. Marie, K. Trivedi, & G. Rubino (eds.), Performability modeling techniques and tools. New York: Wiley, pp. 163178.
Shahabuddin, P. & Nakayama, M.K. (1993). Estimation of unreliability and its derivatives for large time horizons in Markovian systems. In E.C. Russell, W.E. Biles, G.W. Evans, & M. Mollaghasemi (eds.), Proceedings of the 1993 winter simulation conference. New York: IEEE Press, pp. 422429.
Shultes, B.C. (1997). Regenerative techniques for estimating performance measures of highly dependable systems with repair. Ph.D. thesis, Georgia Institute of Technology, Atlanta.
Solovyev, A.D. (1971). Asymptotic behavior of the time of first occurrence of a rare event. Engineering Cybernetics 9: 10381048.Google Scholar
Strickland, S.G. (1995). Optimal importance sampling for quick simulation of highly reliable systems. In E.C. Russell, W.E. Biles, G.W. Evans, & M. Mollaghasemi (eds.), Proceedings of the 1993 winter simulation conference. New York: IEEE Press, pp. 437444.
Young, D. (1971). Iterative solution of large linear systems. New York: Academic Press.