Neural Network Modeling using SAS Enterprise Miner

Randall Matignon

Format                    ISBN            Price
Electronic Book (E-book)  9781418466756   $6.95
Paperback (8.25x11)       9781418423414   $34.95

This book is designed to make statisticians, researchers, and programmers aware of Enterprise Miner, a powerful new product now available in SAS. It also familiarizes readers with the neural network forecasting methodology in statistics. One goal of the book is to make Enterprise Miner easy to use, with step-by-step instructions for creating an Enterprise Miner process flow diagram in preparation for data mining analysis and neural network forecast modeling.

Topics discussed in this book

  • An overview of traditional regression modeling.
  • An overview of neural network modeling.
  • Numerical examples of various neural network designs and optimization techniques.
  • An overview of the powerful SAS product called Enterprise Miner.
  • An overview of the SAS neural network modeling procedure called PROC NEURAL.
  • Designing a SAS Enterprise Miner process flow diagram to perform neural network forecast modeling and traditional regression modeling, with an explanation of the various configuration settings of the Enterprise Miner nodes used in the analysis.
  • Comparing neural network forecast modeling estimates with traditional modeling estimates, based on various examples from SAS manuals and literature, with an overview of the various modeling designs and a brief explanation of the SAS modeling procedures, option statements, and corresponding SAS output listings.

Randall Matignon has been a SAS programmer analyst for over 15 years. He received his M.S. in Statistics and a B.S. in Mathematics and Statistics from California State University, Hayward. He has worked as a consultant at various marketing research, health care and pharmaceutical companies as a statistical programmer, application developer, and data analyst.

An Overview of Enterprise Miner

Enterprise Miner v.4.1 is a powerful new product that SAS recently introduced in version 8. It consists of a variety of analytical tools, such as neural networks, that support data mining and enhance traditional forecast modeling. Data mining is an analytical approach used to support critical business decisions by analyzing enormous amounts of data in order to discover relationships and previously unknown patterns in the data. The Enterprise Miner (EM) data mining methodology, SEMMA, is specifically designed to handle enormous data sets in preparation for subsequent data analysis. In SAS Enterprise Miner, the abbreviation SEMMA stands for Sampling, Exploring, Modifying, Modeling, and Assessing large amounts of data. Among the data mining tasks, neural network modeling falls under predictive modeling, i.e., regression or classification modeling. In regression modeling, the aim is to build a model that predicts the values of one variable from a set of known values of other variables. Classification modeling differs in that the variable to be predicted is categorical, again based on a set of known quantitative variables.
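The distinction between regression and classification modeling can be illustrated with a minimal Python sketch. It is not taken from the book and does not use SAS or Enterprise Miner; the toy data, the function names, and the choice of a nearest-class-mean classifier are all assumptions made only for illustration.

```python
# Illustrative sketch only: the data and helper names below are hypothetical
# and are not taken from the book or from SAS Enterprise Miner.

def fit_line(xs, ys):
    """Regression: least-squares fit of a numeric target on one input.

    Returns (intercept, slope).
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def predict_value(model, x):
    """Predict a numeric value from a fitted (intercept, slope) pair."""
    intercept, slope = model
    return intercept + slope * x


def fit_class_means(xs, labels):
    """Classification: store the mean input value of each category."""
    sums, counts = {}, {}
    for x, label in zip(xs, labels):
        sums[label] = sums.get(label, 0.0) + x
        counts[label] = counts.get(label, 0) + 1
    return {label: sums[label] / counts[label] for label in sums}


def predict_class(means, x):
    """Predict the category whose mean input value is closest to x."""
    return min(means, key=lambda label: abs(x - means[label]))


# Regression: the target is a number (here exactly y = 1 + 2x).
model = fit_line([1.0, 2.0, 3.0, 4.0], [3.0, 5.0, 7.0, 9.0])
print(predict_value(model, 5.0))

# Classification: the target is a category predicted from a quantitative input.
means = fit_class_means([1.0, 2.0, 8.0, 9.0], ["low", "low", "high", "high"])
print(predict_class(means, 3.0), predict_class(means, 7.0))
```

In both cases the model is estimated from known input values; only the type of the predicted variable changes, a number in the first case and a category in the second.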

Purpose of Writing this Book

One reason for writing this book is that there is not a great deal of written literature on neural network modeling using SAS Enterprise Miner. The book takes a step-by-step approach to designing a neural network process flow diagram in SAS Enterprise Miner and to using the SAS neural network procedure called PROC NEURAL, and it explains the various statements and options of the NEURAL procedure. Numerous examples explain the various complex neural network designs and optimization techniques used in network modeling. Further examples, taken from various SAS literature, compare the forecasting results of neural network and traditional regression forecasting techniques, with an explanation of the SAS modeling results. The book's introduction gives a brief overview of traditional regression modeling and the various statistical assumptions that must be satisfied.

Highlights of this Book

Chapter 2 discusses basic model building and the various modeling assumptions that need to be satisfied. These assumptions, in order of importance, are independence, equal variance, and normality in the modeling terms, and they must be satisfied in both traditional regression modeling and neural network designs. However, it should be noted that some neural network modelers ignore these important modeling assumptions. The chapter explains the various diagnostic statistics used to identify outliers and influential data points that have a profound effect on the modeling results. Finally, it explains the various goodness-of-fit statistics used to determine the best linear combination of input variables from the pool of all possible combinations of input variables to the regression model.

Chapter 3 explains the neural network design and its various configuration settings. The chapter first explains a simple perceptron design for a binary-valued target variable. Next, it discusses neural network designs and their various configurations, such as the layers, weights, combination functions, transfer functions, objective or error functions, and optimization techniques that are used. The optimization techniques covered include the various line search and grid search techniques. Numerical examples follow in order to simplify the many optimization techniques that are applied in calculating the neural network weight estimates and in finding the smallest error of the neural network model. A numerical example is presented of the backpropagation algorithm, which is typically used in a neural network MLP design. The chapter explains the similarity between multiple regression parameter estimates and neural network weight estimates. Pruning techniques used in pre-processing the model are discussed, leading to general strategies for interpreting the important input variables to the neural network model and for constructing a well-designed neural network model. The chapter concludes with a brief summary of the advantages and disadvantages of a neural network design.
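As a rough illustration of a simple perceptron design for a binary-valued target, the following Python sketch trains a single-layer perceptron with the classic perceptron learning rule. It is not the book's example and does not use PROC NEURAL; the logical-AND data, the step transfer function, the learning rate, and all function names are assumptions chosen for illustration.

```python
# Illustrative sketch only: a single-layer perceptron with a step transfer
# function, trained on hypothetical logical-AND data. Not the book's example.

def step(z):
    """Step transfer function: maps the combined input to 0 or 1."""
    return 1 if z >= 0 else 0


def predict(weights, bias, x):
    """Linear combination of the inputs followed by the step function."""
    return step(sum(w * xi for w, xi in zip(weights, x)) + bias)


def train_perceptron(samples, targets, rate=1.0, epochs=20):
    """Perceptron learning rule: adjust weights only on misclassified cases."""
    weights = [0.0] * len(samples[0])
    bias = 0.0
    for _ in range(epochs):
        errors = 0
        for x, target in zip(samples, targets):
            delta = target - predict(weights, bias, x)
            if delta != 0:
                errors += 1
                weights = [w + rate * delta * xi for w, xi in zip(weights, x)]
                bias += rate * delta
        if errors == 0:  # every training case is classified correctly
            break
    return weights, bias


# Binary-valued target: the logical AND of two binary inputs.
samples = [(0, 0), (0, 1), (1, 0), (1, 1)]
targets = [0, 0, 0, 1]

weights, bias = train_perceptron(samples, targets)
predictions = [predict(weights, bias, x) for x in samples]
print(predictions)
```

Because the AND data are linearly separable, the learning rule stops updating once every training case is classified correctly. A multilayer perceptron (MLP) replaces the step function with a differentiable transfer function so that backpropagation can compute the weight updates.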

Chapter 4 first explains the SAS neural network procedure, called the NEURAL procedure, and the SAS data mining regression procedure, called the DMREG procedure, along with their various option statements. The chapter then displays diagrams of the neural network architecture for a couple of the modeling comparison examples presented later in the book. These diagrams let the reader understand graphically how the neural network is configured across the various layers and which weight estimates are associated with those layers. Followed by SAS output listings from the En