A method for robust canonical discriminant analysis via two robust objective loss functions is discussed. These functions are useful for reducing the influence of outliers in the data. Majorization is used at several stages of the minimization procedure to obtain a monotonically convergent algorithm. An advantage of the proposed method is that it allows for optimal scaling of the variables. A simulation study shows that, in the presence of outliers, the robust functions outperform the ordinary least squares function, both when the underlying structure is linear in the variables and when it is nonlinear. Furthermore, the method is illustrated with empirical data.
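To make the majorization idea concrete, the sketch below (a rough illustration only, not the authors' algorithm, which handles the full canonical discriminant problem with optimal scaling) minimizes a Huber-type robust loss for a simple linear fit by iteratively reweighted least squares; each weighted step minimizes a quadratic function that majorizes the robust loss, so the loss decreases monotonically. All data are simulated.

```python
import numpy as np

rng = np.random.default_rng(1)
X = np.column_stack([np.ones(50), rng.normal(size=50)])
y = X @ np.array([1.0, 2.0]) + rng.normal(0, 0.3, 50)
y[:3] += 10.0                                    # a few gross outliers

beta = np.linalg.lstsq(X, y, rcond=None)[0]      # ordinary least squares start
c = 1.345                                        # Huber tuning constant
for _ in range(50):
    r = y - X @ beta
    w = np.where(np.abs(r) <= c, 1.0, c / np.maximum(np.abs(r), 1e-12))  # Huber weights
    beta = np.linalg.solve(X.T @ (w[:, None] * X), X.T @ (w * y))        # weighted LS step
print(beta)                                      # close to (1, 2) despite the outliers
```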
This paper presents an approach for determining unidimensional scale estimates that are relatively insensitive to limited inconsistencies in paired comparisons data. The solution procedure, shown to be a minimum-cost network-flow problem, is presented in conjunction with a sensitivity diagnostic that assesses the influence of a single pairwise comparison on traditional Thurstone (ordinary least squares) scale estimates. When the diagnostic indicates some source of distortion in the data, the network technique appears to be more successful than Thurstone scaling in preserving the interval scale properties of the estimates.
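For reference, the traditional Thurstone (Case V, ordinary least squares) scale estimates against which the network-flow solution is compared can be computed in a few lines; the paired-comparison proportions below are hypothetical, and the minimum-cost network-flow procedure itself is not shown.

```python
import numpy as np
from scipy.stats import norm

# P[i, j] = hypothetical proportion of judges preferring item j over item i
P = np.array([[0.50, 0.70, 0.80],
              [0.30, 0.50, 0.65],
              [0.20, 0.35, 0.50]])
Z = norm.ppf(P)                 # unit normal deviates
scale = Z.mean(axis=0)          # least squares (Case V) scale values
scale -= scale.min()            # anchor the lowest-scaled item at zero
print(scale)
```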
A method for multidimensional scaling that is highly resistant to the effects of outliers is described. To illustrate the efficacy of the procedure, some Monte Carlo simulation results are presented. The method is shown to perform well when outliers are present, even in relatively large numbers, and also to perform comparably to other approaches when no outliers are present.
Factor analysis is regularly used for analyzing survey data. Missing data, data with outliers and consequently nonnormal data are very common for data obtained through questionnaires. Based on covariance matrix estimates for such nonstandard samples, a unified approach for factor analysis is developed. By generalizing the approach of maximum likelihood under constraints, statistical properties of the estimates for factor loadings and error variances are obtained. A rescaled Bartlett-corrected statistic is proposed for evaluating the number of factors. Equivariance and invariance of parameter estimates and their standard errors for canonical, varimax, and normalized varimax rotations are discussed. Numerical results illustrate the sensitivity of classical methods and advantages of the proposed procedures.
We discuss measuring and detecting influential observations and outliers in the context of exponential family random graph (ERG) models for social networks. We focus on the level of the nodes of the network and consider those nodes whose removal would result in changes to the model as extreme or “central” with respect to the structural features that “matter”. We construe removal in terms of two case-deletion strategies: the tie-variables of an actor are assumed to be unobserved, or the node is removed resulting in the induced subgraph. We define the difference in inferred model resulting from case deletion from the perspective of information theory and difference in estimates, in both the natural and mean-value parameterisation, representing varying degrees of approximation. We arrive at several measures of influence and propose the use of two that do not require refitting of the model and lend themselves to routine application in the ERGM fitting procedure. MCMC p values are obtained for testing how extreme each node is with respect to the network structure. The influence measures are applied to two well-known data sets to illustrate the information they provide. From a network perspective, the proposed statistics offer an indication of which actors are most distinctive in the network structure, in terms of not abiding by the structural norms present across other actors.
By means of more than a dozen user friendly packages, structural equation models (SEMs) are widely used in behavioral, education, social, and psychological research. As the underlying theory and methods in these packages are vulnerable to outliers and distributions with longer-than-normal tails, a fundamental problem in the field is the development of robust methods to reduce the influence of outliers and the distributional deviation in the analysis. In this paper we develop a maximum likelihood (ML) approach that is robust to outliers and symmetrically heavy-tailed distributions for analyzing nonlinear SEMs with ignorable missing data. The analytic strategy is to incorporate a general class of distributions into the latent variables and the error measurements in the measurement and structural equations. A Monte Carlo EM (MCEM) algorithm is constructed to obtain the ML estimates, and a path sampling procedure is implemented to compute the observed-data log-likelihood and then the Bayesian information criterion for model comparison. The proposed methodologies are illustrated with simulation studies and an example.
Stochastic mortality models are important for a variety of actuarial tasks, from best-estimate forecasting to assessment of risk capital requirements. However, the mortality shock associated with the Covid-19 pandemic of 2020 distorts forecasts by (i) biasing parameter estimates, (ii) biasing starting points, and (iii) inflating variance. Stochastic mortality models therefore require outlier-robust methods for forecasting. Objective methods are required, as outliers are not always obvious on visual inspection. In this paper we look at the robustification of three broad classes of forecast: univariate time indices (such as in the Lee-Carter and APC models); multivariate time indices (such as in the Cairns-Blake-Dowd and newer Tang-Li-Tickle model families); and penalty projections (such as with the 2D P-spline model). In each case we identify outliers using quantitative methods, then co-estimate outlier effects along with other parameters. Doing so removes the bias and distortion to the forecast caused by a mortality shock, while providing a robust starting point for projections. Illustrations are given for various models in common use.
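As a minimal sketch of the co-estimation idea in the simplest case, a univariate time index modelled as a random walk with drift, the shock year can be given its own indicator so that it biases neither the drift estimate nor the residual variance. The series, the shock size, and the year are simulated assumptions here, not the paper's full procedure.

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)
years = np.arange(1990, 2021)
kappa = -0.3 * (years - 1990) + rng.normal(0, 0.5, years.size)  # simulated time index
kappa[-1] += 8.0                         # hypothetical pandemic-type shock in the last year

d_kappa = np.diff(kappa)                        # random walk with drift: model the differences
shock = (years[1:] == 2020).astype(float)       # indicator for the shock year
X = sm.add_constant(shock)                      # columns: [drift, outlier effect]
fit = sm.OLS(d_kappa, X).fit()
print("drift with shock co-estimated:", fit.params[0])
print("estimated shock effect:", fit.params[1])
```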
Central tendency describes the typical value of a variable. Measures of central tendency by level of measurement are covered, including the mean, median, and mode. Appropriate use of each measure by level of measurement is the central theme of the chapter. The chapter shows how to find these measures of central tendency by hand and in the R Commander, with detailed instructions and steps. Skewed distributions and outliers in the data are also covered, as is the relationship between the mean and median in these cases.
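A small Python analogue of the chapter's point (the chapter itself works by hand and in the R Commander): for a right-skewed sample containing an outlier, the mean is pulled above the median, while the median and mode are unaffected.

```python
import statistics

values = [2, 3, 3, 4, 5, 5, 5, 6, 7, 40]        # hypothetical right-skewed data
print(statistics.mean(values))                  # 8    -- pulled upward by the outlier
print(statistics.median(values))                # 5.0  -- resistant to the outlier
print(statistics.mode(values))                  # 5    -- the most frequent value
```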
Multiple linear regression generalizes straight-line regression to allow multiple explanatory (or predictor) variables, in this chapter under the normal errors assumption. The focus may be on accurate prediction, or it may, alternatively or additionally, be on the regression coefficients themselves. Simplistic interpretations of coefficients can be grossly misleading. Later chapters elaborate on the ideas and methods developed in this chapter, applying them in new contexts. The attaching of causal interpretations to model coefficients must be justified both by reference to subject-area knowledge and by careful checks to ensure that they are not artefacts of the correlation structure. Attention is given to regression diagnostics and to the assessment and comparison of models. Variable selection strategies can readily over-fit, hence the importance of training/test approaches and cross-validation. The potential is demonstrated for errors in x to seriously bias regression coefficients. Strong multicollinearity leads to large variance inflation factors.
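Two of the checks mentioned, variance inflation factors for multicollinearity and a training/test split to guard against over-fitting in variable selection, can be sketched as follows; the data are simulated and this is only an illustrative outline, assuming the statsmodels library.

```python
import numpy as np
import statsmodels.api as sm
from statsmodels.stats.outliers_influence import variance_inflation_factor

rng = np.random.default_rng(2)
n = 200
x1 = rng.normal(size=n)
x2 = x1 + rng.normal(0, 0.1, n)                 # nearly collinear with x1
y = 1 + 2 * x1 + rng.normal(0, 1, n)
X = sm.add_constant(np.column_stack([x1, x2]))

fit = sm.OLS(y[:150], X[:150]).fit()            # fit on a training portion
print(fit.params)                               # individual coefficients hard to interpret
print([variance_inflation_factor(X, i) for i in range(1, X.shape[1])])  # large VIFs
mse = np.mean((y[150:] - fit.predict(X[150:])) ** 2)   # held-out test error
print(mse)
```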
There are many applications of the low-rank signal-plus-noise model 𝒀 = 𝑿 + 𝒁 where 𝑿 is a low-rank matrix and 𝒁 is noise, such as denoising and dimensionality reduction. We are interested in the properties of the latent matrix 𝑿, such as its singular value decomposition (SVD), but all we are given is the noisy matrix 𝒀. It is important to understand how the SVD components of 𝒀 relate to those of 𝑿 in the presence of a random noise matrix 𝒁. The field of random matrix theory (RMT) provides insights into those relationships, and this chapter summarizes some key results from RMT that help explain how the noise in 𝒁 perturbs the SVD components, by analyzing limits as matrix dimensions increase. The perturbations considered include roundoff error, additive Gaussian noise, outliers, and missing data. This is the only chapter that requires familiarity with the distributions of continuous random variables, and it provides many pointers to the literature on this modern topic, along with several demos that illustrate remarkable agreement between the asymptotic predictions and the empirical performance even for modest matrix sizes.
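A short numerical illustration of this setting (not an RMT derivation): the leading singular values of the noisy 𝒀 appear as inflated "spikes" above a bulk of noise singular values, while the latent 𝑿 has only the signal singular values.

```python
import numpy as np

rng = np.random.default_rng(3)
m, n, r = 200, 100, 2
X = rng.normal(size=(m, r)) @ rng.normal(size=(r, n))   # rank-2 latent matrix
Z = rng.normal(size=(m, n))                             # additive Gaussian noise
Y = X + Z

print(np.linalg.svd(X, compute_uv=False)[:4])   # two signal singular values, then zeros
print(np.linalg.svd(Y, compute_uv=False)[:4])   # two inflated "spikes", then the noise bulk
```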
Medearis and his two cofounders of Silicon Valley Bank wished to tackle the antiquated banking practices that had led to a massive reduction in the number of banks, the disappearance of community banks, and the mergers of Big Banks. Bank regulations and culture prevented banks from embracing tech startups and entrepreneurs as lending clients. The SVB founders knew about Bank of America’s abandonment of its early tech lending, its missed opportunities, and banks’ failure to capture tech startups and entrepreneurs. The old, conservative banking environment during the early days of the tech sector presented the founders with an opportunity.
A core contention woven into the fabric of Sun Tzu’s thinking is that all situations faced by a strategic actor, even those that appear on their face to be losing ones, hold seeds of opportunity that, if grasped correctly, can be parlayed into strategic advantage.1 An illustrative statement starts off Passage #5.1 below.
This chapter discusses Feature Engineering techniques that look holistically at the feature set, replacing or enhancing features based on their relation to the whole set of instances and features. Techniques such as normalization, scaling, dealing with outliers, and generating descriptive features are covered. Scaling and normalization are the most common: they involve finding the maximum and minimum and changing the values so that they lie in a given interval (e.g., [0, 1] or [−1, 1]). Discretization and binning involve, for example, analyzing a feature that is an integer (any number from −1 trillion to +1 trillion), realizing that it only takes the values 0, 1, and 10, and simplifying it into a symbolic feature with three values (value0, value1, and value10). Descriptive features involve gathering information about the shape of the data; the discussion centres on tables of counts (histograms) and general descriptive statistics such as maximum, minimum, and averages. Outlier detection and treatment refers to looking at a feature's values across many instances and recognizing that some values lie very far from the rest.
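A minimal sketch of three of these operations on a hypothetical feature: min-max scaling into [0, 1], discretization of an integer-valued feature into a small set of symbols, and a crude z-score flag for values that lie far from the rest.

```python
import numpy as np

x = np.array([0, 1, 10, 1, 0, 10, 1, 0, 500.0])          # hypothetical feature values

scaled = (x - x.min()) / (x.max() - x.min())              # min-max scaling into [0, 1]

categories = {0: "value0", 1: "value1", 10: "value10"}
binned = [categories.get(v, "other") for v in x]          # discretization into symbols

z = (x - x.mean()) / x.std()
outliers = np.abs(z) > 2.5        # crude z-score flag; threshold chosen for this tiny sample
print(scaled, binned, outliers, sep="\n")
```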
Graphs are a powerful and concise way to communicate information. Representing data from an experiment in the form of an x-y graph allows relationships to be examined, scatter in the data to be assessed, and special or unusual features to be identified rapidly. A well laid out graph containing all the components discussed in this chapter can act as a 'one stop' summary of a whole experiment. Someone studying an account of an experiment will often examine the graph(s) first to gain an overall picture of the outcome. The importance of graphs, therefore, cannot be overstated, as they so often play a central role in communicating the key findings of an experiment. This chapter contains many examples of graphs and includes exercises and end-of-chapter problems that reinforce the graph-plotting principles.
A high-resolution 14C chronology for the Teopancazco archaeological site in the Teotihuacan urban center of Mesoamerica was generated by Bayesian analysis of 33 radiocarbon dates and detailed archaeological information related to occupation stratigraphy, pottery, and archaeomagnetic dates. The calibrated intervals obtained using the Bayesian model are up to ca. 70% shorter than those obtained with individual calibrations. For some samples, this is a consequence of plateaus in the part of the calibration curve covered by the sample dates (2500 to 1450 14C yr BP). The effect of outliers is explored by comparing a Bayesian model that incorporates radiocarbon data for two outlier samples with the same model excluding them; the effect was more significant than expected. Inclusion of radiocarbon dates from two altered contexts, 500 14C yr earlier than those for the first occupational phase, results in model ages earlier than the archaeological record supports. The Bayesian chronology excluding these outliers separates the first two Teopancazco occupational phases and suggests that the end of the Xolalpan phase was around cal AD 550, 100 yr earlier than previously estimated and in accordance with previously reported archaeomagnetic dates from lime plasters at the same site.
State politics researchers commonly employ ordinary least squares (OLS) regression or one of its variants to test linear hypotheses. However, OLS is easily influenced by outliers and thus can produce misleading results when the error term distribution has heavy tails. Here we demonstrate that median regression (MR), an alternative to OLS that conditions the median of the dependent variable (rather than the mean) on the independent variables, can be a solution to this problem. Then we propose and validate a hypothesis test that applied researchers can use to select between OLS and MR in a given sample of data. Finally, we present two examples from state politics research in which (1) the test selects MR over OLS and (2) differences in results between the two methods could lead to different substantive inferences. We conclude that MR and the test we propose can improve linear models in state politics research.
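A minimal sketch of the comparison (assuming statsmodels; this is not the authors' hypothesis test): ordinary least squares versus median regression, i.e. quantile regression at q = 0.5, on simulated data with a heavy-tailed error term.

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(4)
n = 500
x = rng.normal(size=n)
y = 1 + 2 * x + rng.standard_t(df=2, size=n)    # heavy-tailed error term
X = sm.add_constant(x)

ols = sm.OLS(y, X).fit()
mr = sm.QuantReg(y, X).fit(q=0.5)               # median regression
print(ols.params)                               # more variable under heavy tails
print(mr.params)                                # conditions the median of y on x
```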
Structural instability in economic time series is widely reported in the literature. It is most prevalent in series such as price indices and inflation-related data. Many methods have been developed for analysing and modelling structural changes in a univariate time series model. However, most of them assume that the data are generated by one fixed type (linear or non-linear) of time series process. This paper proposes a strategy for modelling different segments of an economic time series with different linear or non-linear models. A graphical procedure is suggested for detecting the model change points. The proposed procedure is illustrated by modelling the annual United Kingdom price inflation series over the period 1265 to 2005. Stochastic modelling of inflation rates is an important topic for actuaries dealing with long-term index-linked insurance business. The proposed method suggests dividing the U.K. inflation series into four segments for modelling. Inflation projections based on the latest segment of the data are obtained through simulations. To better understand the impact of structural changes on inflation projections, we also perform a forecasting study.
With more satellite systems becoming available, there is currently a need for Receiver Autonomous Integrity Monitoring (RAIM) to exclude multiple outliers. While the single outlier test can be applied iteratively, in the field of statistics robust methods are preferred when multiple outliers exist. This study compares the outlier test and numerous robust methods on simulated GPS measurements to identify which methods have the greatest ability to correctly exclude outliers. It was found that no method could correctly exclude outliers 100% of the time. However, for a single outlier the outlier test achieved the highest rate of correct exclusion, followed by the MM-estimator and the L1-norm. As the number of outliers increased, the MM-estimator and the L1-norm obtained the highest rates of correct exclusion, up to ten percent higher than the outlier test.
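A minimal sketch of the flavour of this comparison, not a RAIM implementation: a Huber M-estimator (a simpler relative of the MM-estimator and L1-norm methods mentioned) applied to a simulated linear measurement model with two gross outliers, with large residuals flagged as candidates for exclusion. The statsmodels library is assumed.

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(5)
A = np.column_stack([np.ones(30), rng.normal(size=(30, 2))])   # simulated design matrix
truth = np.array([0.0, 1.0, -1.0])
y = A @ truth + rng.normal(0, 0.1, 30)
y[[3, 17]] += 5.0                               # two simulated outliers

ols = sm.OLS(y, A).fit()
rob = sm.RLM(y, A, M=sm.robust.norms.HuberT()).fit()           # Huber M-estimator
resid = y - A @ rob.params
flagged = np.where(np.abs(resid) > 4 * np.median(np.abs(resid)))[0]
print(ols.params)                               # pulled towards the outliers
print(rob.params)                               # close to the truth
print(flagged)                                  # candidate observations to exclude
```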
In this paper we re-examine whether purchasing power parity holds in the long run in China, using a two-step procedure that corrects for outliers and then tests for unit roots. The efficient unit root tests developed by Elliott, Rothenberg and Stock (1996) and Ng and Perron (2001) are applied to the Renminbi bilateral (against the US dollar) real exchange rate, corrected for outliers, over the period 1970 to 2006 at monthly frequency. We highlight the effects on the real exchange rate of large but infrequent shocks due to changes in Chinese exchange rate policy undertaken since China’s foreign exchange reform, in particular several devaluations between 1984 and 1994. We also show that there is no tendency for purchasing power parity to hold in China in the long run over this period.
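The sketch below illustrates only the two-step logic on simulated data, using the augmented Dickey-Fuller test from statsmodels as a stand-in for the Elliott-Rothenberg-Stock and Ng-Perron tests (which statsmodels does not provide): level shifts at identified outlier dates are dummied out of the real exchange rate series before the unit root test is run. All dates and magnitudes are hypothetical.

```python
import numpy as np
import statsmodels.api as sm
from statsmodels.tsa.stattools import adfuller

rng = np.random.default_rng(6)
n = 444                                          # monthly observations, 1970-2006
q = np.cumsum(rng.normal(0, 0.02, n))            # simulated log real exchange rate
breaks = [180, 250]                              # hypothetical devaluation months
for t in breaks:
    q[t:] -= 0.3                                 # large but infrequent level shifts

D = np.zeros((n, len(breaks)))                   # level-shift dummies
for j, t in enumerate(breaks):
    D[t:, j] = 1.0
adjusted = sm.OLS(q, sm.add_constant(D)).fit().resid   # series corrected for the shifts

stat, pvalue = adfuller(adjusted)[:2]
print(stat, pvalue)     # failing to reject the unit root is consistent with no long-run PPP
```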