Book contents
- Frontmatter
- Contents
- Acknowledgments
- About This Book
- How to Use This Book
- Chapter 1 Navigation
- Chapter 2 Preliminary Data Exploration
- Chapter 3 Storing and Manipulating Data
- Chapter 4 Advanced Concepts in Dataset and Variable Manipulation
- Chapter 5 Introduction to Common Procedures
- Chapter 6 Procedures for Simple Statistics
- Chapter 7 More about Common Procedures
- Chapter 8 Data Visualization
- Chapter 9 JMP as an Alternative
- Index
Chapter 6 - Procedures for Simple Statistics
Published online by Cambridge University Press: 05 June 2016
Summary
Most of the time, regular data reporting consists of simple statistics. Answers to questions like “how many?”, “what proportion?”, “what's the highest value?”, and “are those variables related?” commonly form the basis of such reporting. SAS provides a selection of procedures that are best used to answer these most basic questions in a way that is accurate, succinct, and simple. In this chapter, we discuss four procedures that are commonly used to create user-friendly reports that answer these questions: FREQUENCY, MEANS, UNIVARIATE, and CORR.
THE FREQUENCY PROCEDURE
Perhaps the simplest way to answer the question “how many” is to use the FREQUENCY procedure, typically referred to as PROC FREQ. This procedure tells the user how many observations carry each value of a particular variable by producing tabular or list-style frequency counts for all variables listed in the TABLES statement.
Let's use the sashelp dataset ‘cars’ to explore the ins and outs of PROC FREQ. Example 6.1 shows simple syntax for executing PROC FREQ with one variable. In cases where more than one variable is included in the TABLES statement (Example 6.2), the output includes separate frequency listings for each variable, like the ones in Figure 6.1.
EXAMPLE 6.1. FREQUENCY Procedure Syntax.
proc freq data = sashelp.cars;
tables type;
run;
EXAMPLE 6.2. FREQUENCY Procedure Syntax with Two Variables.
proc freq data = sashelp.cars;
tables type drivetrain;
run;
The output shown in Figure 6.1 provides information about the variable ‘type.’ The first column tells us that there are six distinct values for this variable: ‘Hybrid,’ ‘SUV,’ and so on. The Frequency column tells us how many observations carry each value, while the Percent column gives the percentage of all observations that carry that value. For instance, the row for ‘SUV’ shows that there are 60 observations with this value and that those observations make up 14.02% of our data. The last two columns provide cumulative information: there are 63 observations with the values ‘Hybrid’ or ‘SUV’, which together make up 14.72% of our data. This cumulative information is especially useful when the categorical values are ordered in some way, such as levels of disease severity or income.
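One way to inspect these four columns directly is to write the frequency table to a dataset. The following is a minimal sketch and is not taken from the book: it assumes the standard OUT= and OUTCUM options of the TABLES statement, and the dataset name type_counts is hypothetical.

```sas
/* Write the one-way frequency table for 'type' to a dataset.   */
/* type_counts is a hypothetical name chosen for this sketch.   */
proc freq data = sashelp.cars;
  tables type / out = type_counts outcum;
run;

/* COUNT and PERCENT mirror the Frequency and Percent columns;  */
/* CUM_FREQ and CUM_PCT (added by OUTCUM) mirror the two        */
/* cumulative columns.                                          */
proc print data = type_counts noobs;
run;
```

In the row for ‘SUV’, CUM_FREQ is the ‘Hybrid’ count plus the ‘SUV’ count (3 + 60 = 63), which matches the cumulative figure quoted above.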
As with most procedures, there are a handful of options that allow the user to manipulate the appearance of the output.
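As an illustration of the kind of option meant here, the sketch below uses two standard PROC FREQ options. The choice of options is ours and may not match the ones the chapter goes on to cover: ORDER=FREQ on the PROC statement sorts the rows by descending frequency, and NOCUM on the TABLES statement suppresses the cumulative columns.

```sas
/* ORDER=FREQ: list the most common values of 'type' first.     */
/* NOCUM: drop the cumulative frequency and percent columns.    */
proc freq data = sashelp.cars order = freq;
  tables type / nocum;
run;
```

Options on the PROC statement apply to every table requested, while options after the slash in a TABLES statement apply only to the tables named in that statement.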
Type: Chapter
Information: Data Management Essentials Using SAS and JMP, pp. 79-93
Publisher: Cambridge University Press
Print publication year: 2016