Relative importance of regressors

Dominance Analysis

Readings

Budescu, D. V. (1993). Dominance analysis: A new approach to the problem of relative importance of predictors in multiple regression. Psychological Bulletin, 114, 542-551.
Azen, R., and Budescu, D. V. (2003). The dominance analysis approach for comparing predictors in multiple regression. Psychological Methods, 8, 129-148.

SAS Macros

These macros were developed under SAS 8.

This program outputs the additional contributions table as discussed in Budescu (1993).
The OUTPUT of this program includes the following information:

Regression analysis using the input data file;
Dominance analysis results for each dominance level:
- Additional contributions table;
- Average contributions within model size;
- Overall averaged contributions.
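The three tables listed above can be illustrated with a short Python sketch. It is not the SAS macro: it assumes the input is a correlation matrix (rxx among predictors, rxy with the response), uses R² as the fit measure, and all function names are hypothetical. For each predictor it computes the additional contribution to every subset model that excludes it, averages those contributions within each model size, and then averages across sizes.

```python
from itertools import combinations

def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x

def r_squared(rxx, rxy, subset):
    """R^2 of the regression of y on the predictors indexed by `subset`."""
    if not subset:
        return 0.0
    a = [[rxx[i][j] for j in subset] for i in subset]
    b = [rxy[i] for i in subset]
    beta = solve(a, b)  # standardized regression coefficients
    return sum(bi * ri for bi, ri in zip(beta, b))

def dominance_table(rxx, rxy):
    """Return (conditional, general) dominance statistics.

    conditional[i][k] is the average additional contribution of predictor i
    to all subset models of size k that do not contain i;
    general[i] is the mean of conditional[i] over k = 0 .. p-1.
    """
    p = len(rxy)
    conditional = []
    for i in range(p):
        others = [j for j in range(p) if j != i]
        row = []
        for k in range(p):
            gains = [r_squared(rxx, rxy, s + (i,)) - r_squared(rxx, rxy, s)
                     for s in combinations(others, k)]
            row.append(sum(gains) / len(gains))
        conditional.append(row)
    general = [sum(row) / p for row in conditional]
    return conditional, general

# Illustrative (made-up) correlations for three predictors.
rxx = [[1.0, 0.3, 0.2],
       [0.3, 1.0, 0.4],
       [0.2, 0.4, 1.0]]
rxy = [0.5, 0.4, 0.3]
conditional, general = dominance_table(rxx, rxy)
for i, (row, g) in enumerate(zip(conditional, general)):
    print(f"X{i + 1}: by model size {[round(v, 4) for v in row]}, overall {g:.4f}")
```

The overall averaged contributions sum to the R² of the full model, which is a convenient check on any implementation.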

DA macro

This program outputs the additional contributions table as discussed in Budescu (1993) as well as D_ij values, means, standard errors and reproducibility results as discussed in Azen & Budescu (2003).
The OUTPUT of this program includes the following information:

Regression analysis using the input data file;
Dominance analysis results for each dominance level:
- Additional contributions table (Complete dominance);
- Average contributions within model size (Conditional dominance);
- Overall averaged contributions (General dominance).
For each of the above dominance levels:
- Frequencies and probabilities of dominance over repeated sampling;
- Mean and SE of dominance indicator over repeated sampling;
- Reproducibility of sample results over repeated sampling.
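The repeated-sampling quantities above can be sketched in Python as follows. This is an illustration under stated assumptions, not the macro itself: the dominance indicator D_ij is assumed to be coded 1 when X_i dominates X_j, 0 when X_j dominates X_i and 0.5 when dominance is undetermined; the SE is taken as the standard deviation of D_ij over bootstrap samples; and dominance is decided by comparing one scalar importance value per predictor (as in general dominance). The function names, and the stand-in importance measure zero_order_r2, are hypothetical; any function that returns one importance value per predictor could be passed in instead.

```python
import random
from statistics import mean, stdev

def dominance_indicator(a, b, tol=1e-12):
    """D_ij: 1 if i dominates j, 0 if j dominates i, 0.5 if undetermined."""
    if a - b > tol:
        return 1.0
    if b - a > tol:
        return 0.0
    return 0.5

def bootstrap_dominance(rows, importance, i, j, n_boot=200, seed=0):
    """Bootstrap the dominance indicator D_ij by resampling cases with replacement."""
    rng = random.Random(seed)
    full = importance(rows)
    d_sample = dominance_indicator(full[i], full[j])
    d_values = []
    for _ in range(n_boot):
        resample = [rng.choice(rows) for _ in rows]
        imp = importance(resample)
        d_values.append(dominance_indicator(imp[i], imp[j]))
    return {
        "D_sample": d_sample,                      # result in the original sample
        "P_ij": d_values.count(1.0) / n_boot,      # i dominates j
        "P_ji": d_values.count(0.0) / n_boot,      # j dominates i
        "P_none": d_values.count(0.5) / n_boot,    # undetermined
        "mean_D": mean(d_values),
        "SE_D": stdev(d_values),
        # share of bootstrap samples that agree with the original sample
        "reproducibility": d_values.count(d_sample) / n_boot,
    }

def zero_order_r2(rows):
    """Stand-in importance: squared correlation of each predictor with y (last column)."""
    cols = list(zip(*rows))
    y = cols[-1]
    my = mean(y)
    syy = sum((b - my) ** 2 for b in y)
    out = []
    for x in cols[:-1]:
        mx = mean(x)
        sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
        sxx = sum((a - mx) ** 2 for a in x)
        out.append(sxy * sxy / (sxx * syy) if sxx and syy else 0.0)
    return out

# Simulated data: y depends strongly on x1 and only weakly on x2.
rng = random.Random(1)
rows = []
for _ in range(100):
    x1, x2 = rng.gauss(0, 1), rng.gauss(0, 1)
    rows.append((x1, x2, x1 + 0.1 * x2 + rng.gauss(0, 0.5)))

result = bootstrap_dominance(rows, zero_order_r2, 0, 1)
print(result)
```

Passing the importance measure as a function keeps the resampling logic separate from the choice of dominance level.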

Multivariate DA macro

This program outputs the additional contributions table as discussed in Azen & Budescu (2006).
This program will compute DA based on (1) each of q response variables (Y) separately; (2) a function of the q response variables taken together using multivariate R²; and (3) a function of the q response variables taken together using multivariate P².
WARNING: This program is quite SLOW! Each bootstrap run may take about 30 seconds.

Logistic DA main program | Logistic DA macros

This program provides output for the analysis of empirical data sets as discussed in Azen & Traxel (2009).
This program will compute DA for logistic regression using one binary response variable (Y) and p predictor variables.

References on relative importance

The most important references for the R-package relaimpo are marked with an asterisk.

Azen, R. and Budescu, D.V. (2003). The dominance analysis approach for comparing predictors in multiple regression. Psychological Methods 8, 129-148.

Azen, R. (2003). Dominance Analysis SAS Macros. URL: www.uwm.edu/~azen/damacro.html.

Bring, J. (1996). A geometric approach to compare variables in a regression model. The American Statistician 50, 57-62.

Budescu, D.V. (1993). Dominance Analysis: A new approach to the problem of relative importance of predictors in multiple regression. Psychological Bulletin 114, 542-551.

Budescu, D.V. and Azen, R. (2004). Beyond Global Measures of Relative Importance: Some Insights from Dominance Analysis. Organizational Research Methods 7, 341-350.

Chevan, A. and Sutherland, M. (1991). Hierarchical Partitioning. The American Statistician 45, 90-96.

Conklin, M., Powaga, K. and Lipovetsky, S. (2004). Customer satisfaction analysis: Identification of key drivers. European Journal of Operational Research 154, 819–827.

*Darlington, R.B. (1968). Multiple regression in psychological research and practice. Psychological Bulletin 69, 161-182. (last, first, betasq, pratt)

Feldman, B. (1999). The proportional value of a cooperative game. Manuscript for a contributed paper at the Econometric Society World Congress 2000. Downloadable at http://fmwww.bc.edu/RePEc/es2000/1140.pdf.

Feldman, B. (2002). A Dual Model of Cooperative Value. Manuscript, downloadable from http://ssrn.com/abstract=317284.

*Feldman, B. (2005). Relative Importance and Value. Manuscript (latest version), downloadable at http://www.prismanalytics.com/docs/RelativeImportance.pdf. (pmvd)

Feldman, B. (2005b). Statistics and variance decomposition add-in for Excel. http://www.prismanalytics.com/gsb/addin.htm

Feldman, B. (2007). A theory of attribution. MPRA Paper No. 3349. Downloadable at http://mpra.ub.uni-muenchen.de/3349/01/MPRA_paper_3349.pdf.

Fickel, N. (2001). Sequenzialregression: Eine neodeskriptive Lösung des Multikollinearitätsproblems mittels stufenweise bereinigter und synchronisierter Variablen. Habilitationsschrift, University of Erlangen-Nuremberg. VWF, Berlin.

Fickel, N. (2003). Measuring Supplementary Influence by Using Sequential Linear Regression. Downloadable from Mathematics Preprint Server.

Firth, D. (1998). Relative importance of explanatory variables. Conference "Statistical Issues in the Social Sciences", Stockholm, October 1998. URL: http://www.nuff.ox.ac.uk/sociology/alcd/relimp.pdf.

Fox, J. (2002). Bootstrapping regression models. In: An R and S-PLUS Companion to Applied Regression: A web appendix to the book. Sage, Thousand Oaks, CA. URL: http://cran.r-project.org/doc/contrib/Fox-Companion/appendix-bootstrapping.pdf. (appropriate bootstrapping in regression models)

*Grömping, U. (2006). Relative Importance for Linear Regression in R: The Package relaimpo. Journal of Statistical Software 17, Issue 1.

*Grömping, U. (2007). Estimators of Relative Importance in Linear Regression Based on Variance Decomposition. The American Statistician 61, 139-147.

Grömping, U. (2007). Response to comment by Scott Menard, re: Estimators of Relative Importance in Linear Regression Based on Variance Decomposition. In: Letters to the Editor, The American Statistician 61, 280-284.

Grömping, U. (2008). The ic.infer Package (Inequality constrained inference in linear normal situations). R package version 1.0-6. In: R Development Core Team (2008). R: A Language and Environment for Statistical Computing, R Foundation for Statistical Computing, Vienna, Austria. (lmg for order-restricted linear model, e.g. with all coefficients restricted to non-negative values).

Grömping, U. (2009). Variable Importance Assessment in Regression: Linear Regression Versus Random Forest. The American Statistician 63, 308-319.

Grömping, U. and Landau, S. (2009). Do not adjust coefficients in Shapley value regression. To appear in Applied Stochastic Models in Business and Industry. Online early view: DOI: 10.1002/asmb.773.

Hart, S. and Mas-Colell, A. (1989). Potential, value and consistency. Econometrica 57, 589-614. (game-theoretic background for lmg)

Healy, M.J.R. (1990). Measuring importance. Statistics in Medicine 9, 633-637.

Hoffman, P.J. (1960). The paramorphic representation of clinical judgment. Psychological Bulletin 57, 116-131.

Hoffman, P.J. (1962). Assessment of the independent contributions of predictors. Psychological Bulletin 59, 77-80.

Johnson, J.W. (2000). A heuristic method for estimating the relative weight of predictor variables in multiple regression. Multivariate Behavioral Research 35, 1-19.

Johnson, J.W. (2004). Factors affecting relative weights: the influence of sampling and measurement error. Organizational Research Methods 7, 283-299.

Johnson, J.W. and Lebreton, J.M. (2004). History and Use of Relative Importance Indices in Organizational Research. Organizational Research Methods 7, 238-257.

Kruskal, W. (1987). Relative importance by averaging over orderings. The American Statistician 41, 6-10.

Kruskal, W. (1987b). Correction to "Relative importance by averaging over orderings". The American Statistician 41, 341.

Kruskal, W. and Majors, R. (1989). Concepts of relative importance in recent scientific literature. The American Statistician 43, 2-6.

Lebreton, J.M., Ployhart, R.E. and Ladd, R.T. (2004). A Monte Carlo Comparison of Relative Importance Methodologies. Organizational Research Methods 7, 258-282.

*Lindeman, R.H., Merenda, P.F. and Gold, R.Z. (1980). Introduction to Bivariate and Multivariate Analysis, Scott, Foresman, Glenview IL. (lmg, p.119ff)

Lipovetsky, S. and Conklin, M. (2001). Analysis of Regression in Game Theory Approach. Applied Stochastic Models in Business and Industry 17, 319-330.

MacNally, R. (2000). Regression and model building in conservation biology, biogeography and ecology: the distinction between and reconciliation of 'predictive' and 'explanatory' models. Biodiversity and Conservation 9, 655-671.

MacNally, R. (2002). Multiple regression and inference in conservation biology and ecology: further comments on identifying important predictor variables. Biodiversity and Conservation 11, 1397-1401.

MacNally, R. & Walsh, C. (2004). Hierarchical partitioning public-domain software. Biodiversity and Conservation 13, 659-660.

Nimon, K. and Roberts, J.K. (2009). yhat: Interpreting Regression Effects. R package version 1.0-2. http://cran.r-project.org/web/packages/yhat/yhat.pdf, http://cran.r-project.org/package=yhat

Ortmann, K.M. (2000). The proportional value of a positive cooperative game. Mathematical Methods of Operations Research 51, 235-248. (game-theoretic background for pmvd)

Pedhazur, E.J. (1982, 2nd ed.). Multiple regression in behavioral research: explanation and prediction. Holt, Rinehart and Winston, New York.

*Pratt, J.W. (1987). Dividing the indivisible: Using simple symmetry to partition variance explained. In: Pukkila, T. and Puntanen, S. (Eds.): Proceedings of second Tampere conference in statistics, University of Tampere, Finland, 245-260. (pratt)

Shapley, L. (1953). A value for n-person games. Reprinted in: Roth, A. (1988, ed.): The Shapley Value: Essays in Honor of Lloyd S. Shapley. Cambridge University Press, Cambridge. (game-theoretic background for lmg)

Soofi, E.S., Retzer, J.J. and Yasai-Ardekani, M. (2000). A Framework for Measuring the Importance of Variables with Applications to Management Research and Decision Models. Decision Sciences 31, 1-31.

Theil, H. (1971). Principles of Econometrics. Wiley, New York.

Theil, H. (1987). How many bits of information does an independent variable yield in a multiple regression? Statistics and Probability Letters 6, 107-108.

Theil, H. and Chung, C.-F. (1988a). Relations between two sets of variates: the bits of information provided by each variate in each set. Statistics and Probability Letters 6, 137-139.

Theil, H. and Chung, C.-F. (1988). Information-theoretic measures of fit for univariate and multivariate linear regressions. The American Statistician 42, 249-252.

Thomas, D.R., Hughes, E. and Zumbo, B.D. (1998). On variable importance in linear regression. Social Indicators Research 45, 253-275.

Thomas, D.R., Zhu, P.C. and Decady, Y.J. (2007). Point estimates and confidence intervals for variable importance in multiple linear regression. J. Educational and Behavioral Statistics 32, 61-91.

Ward, J.H. (1962). Comments on "The paramorphic representation of clinical judgment". Psychological Bulletin 59, 74-76.

Walsh, C. and Mac Nally, R. (2003). The hier.part Package: Hierarchical Partitioning. (Part of: Documentation for R: A language and environment for statistical computing.) R Foundation for Statistical Computing, Vienna, Austria. URL: http://cran.r-project.org/web/packages/hier.part/hier.part.pdf.

Whittaker, T.A., Fouladi, R.T. and Williams, N.J. (2002). Determining Predictor Importance in Multiple Regression Under Varied Correlational And Distributional Conditions. J. Modern Applied Statistical Methods 1, 354-366.