Book contents
- Frontmatter
- Contents
- Contributors
- Preface
- Acknowledgments
- 1 Pure Premium Modeling Using Generalized Linear Models
- 2 Applying Generalized Linear Models to Insurance Data: Frequency/Severity versus Pure Premium Modeling
- 3 Generalized Linear Models as Predictive Claim Models
- 4 Frameworks for General Insurance Ratemaking: Beyond the Generalized Linear Model
- 5 Using Multilevel Modeling for Group Health Insurance Ratemaking: A Case Study from the Egyptian Market
- 6 Clustering in General Insurance Pricing
- 7 Application of Two Unsupervised Learning Techniques to Questionable Claims: PRIDIT and Random Forest
- 8 The Predictive Distribution of Loss Reserve Estimates over a Finite Time Horizon
- 9 Finite Mixture Model and Workers’ Compensation Large-Loss Regression Analysis
- 10 A Framework for Managing Claim Escalation Using Predictive Modeling
- 11 Predictive Modeling for Usage-Based Auto Insurance
- Index
- References
7 - Application of Two Unsupervised Learning Techniques to Questionable Claims: PRIDIT and Random Forest
Published online by Cambridge University Press: 05 August 2016
Summary
Chapter Preview. Predictive modeling can be divided into two major kinds of modeling, supervised and unsupervised learning, distinguished primarily by whether the data used for modeling include a dependent (target) variable. Supervised learning approaches probably account for the majority of modeling analyses. The topic of unsupervised learning was introduced in Chapter 12 of Volume I of this book. This chapter follows up with an introduction to two advanced unsupervised learning techniques: PRIDIT (Principal Components of RIDITs) and Random Forest (a tree-based data-mining method that is most commonly used in supervised learning applications). The methods will be applied to an automobile insurance database to model questionable claims. A couple of additional unsupervised learning methods used for visualization, including multidimensional scaling, will also be briefly introduced.
Databases used for detecting questionable claims often do not contain a questionable claims indicator as a dependent variable. Unsupervised learning methods are often used to address this limitation. A simulated database, based on actual data and containing features observed in actual questionable claims data, was developed for this research. The methods in this chapter will be applied to these data. The database is available online at the book's website.
Introduction
An introduction to unsupervised learning techniques as applied to insurance problems is provided by Francis (2014) as part of Predictive Modeling Applications in Actuarial Science, Volume I, a text intended to introduce actuaries and insurance professionals to predictive modeling analytic techniques. As an introductory work, it focused on two classical approaches: principal components and clustering. Both are standard statistical methods that have been in use for many decades and are well known to statisticians. The classical approaches have been augmented by many other unsupervised learning methods, such as neural networks, association rules, and link analysis. While these are frequently cited methods for unsupervised learning, only the Kohonen neural network method will be briefly discussed in this chapter. The two methods featured here, PRIDIT and Random Forest clustering, are less well known and less widely used. Brockett et al. (2003) introduced the application of PRIDITs to the detection of questionable claims in insurance. Lieberthal (2008) has applied the PRIDIT method to hospital quality studies.
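To make the PRIDIT idea more concrete, the following is a minimal sketch in Python, not the chapter's own implementation. It assumes one common formulation of the RIDIT score for an ordinal category (the share of records in lower categories minus the share in higher categories) and takes the first principal component of the standardized RIDIT scores as the PRIDIT weights. The function names, the power-iteration shortcut used in place of a full eigendecomposition, and the toy `claims` table of binary red-flag indicators are all illustrative assumptions; higher codes are assumed to indicate more suspicion.

```python
import math

def ridit_scores(column):
    """RIDIT-style score per record: share of records in lower
    categories minus share in higher categories (assumed formulation)."""
    n = len(column)
    categories = sorted(set(column))
    share = {c: sum(1 for v in column if v == c) / n for c in categories}
    score, below = {}, 0.0
    for c in categories:
        above = 1.0 - below - share[c]
        score[c] = below - above
        below += share[c]
    return [score[v] for v in column]

def standardize(column):
    """Center and scale a column to unit variance (column must not be constant)."""
    n = len(column)
    mean = sum(column) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in column) / n)
    return [(v - mean) / sd for v in column]

def first_component(corr, iterations=500):
    """Dominant eigenvector of a correlation matrix by power iteration.
    Assumes the uniform starting vector is not orthogonal to it."""
    k = len(corr)
    w = [1.0 / math.sqrt(k)] * k
    for _ in range(iterations):
        w = [sum(corr[i][j] * w[j] for j in range(k)) for i in range(k)]
        norm = math.sqrt(sum(x * x for x in w))
        w = [x / norm for x in w]
    return w

def pridit(rows):
    """Return (weights, scores) for a table of ordinal indicator values."""
    n = len(rows)
    columns = [standardize(ridit_scores(list(col))) for col in zip(*rows)]
    k = len(columns)
    corr = [[sum(a * b for a, b in zip(columns[i], columns[j])) / n
             for j in range(k)] for i in range(k)]
    weights = first_component(corr)
    scores = [sum(w * col[r] for w, col in zip(weights, columns))
              for r in range(n)]
    return weights, scores

# Hypothetical toy data: 6 claims, 3 binary red-flag indicators (1 = flag present).
claims = [
    [0, 0, 0],
    [0, 0, 1],
    [0, 1, 0],
    [1, 1, 1],
    [1, 1, 1],
    [0, 0, 0],
]
weights, scores = pridit(claims)
```

In this sketch, claims with more red flags receive higher scores, so ranking records by `scores` gives an ordering by suspicion without any labeled target variable. On real data a library eigendecomposition would normally replace the power iteration.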
- Type: Chapter
- Information: Predictive Modeling Applications in Actuarial Science, pp. 180-207
- Publisher: Cambridge University Press
- Print publication year: 2016