Cambridge Encyclopedia :: Cambridge Encyclopedia Vol. 25

factor analysis - Mathematical model of the same example, Factor analysis in psychometrics, Factor analysis in marketing

A set of techniques popular in psychometric research to reduce data to manageable form. Given a set of correlations between various measures (eg responses to items on a questionnaire), factor analysis identifies a small number of factors (weighted combinations of the observed measures) which best account for the correlations. Such factors are statistical: giving them a psychological interpretation is sometimes difficult and contentious.

Factor analysis is a statistical technique used to explain variability among observed random variables in terms of fewer unobserved random variables called factors. The observed variables are modeled as linear combinations of the factors, plus "error" terms. Factor analysis originated in psychometrics, and is used in social sciences, marketing, product management, operations research, and other applied sciences that deal with large quantities of data.

Suppose a psychologist proposes a theory that there are two kinds of intelligence, "verbal intelligence" and "mathematical intelligence". The theory may say that, for each of 10 academic subjects, the score averaged over the group of all students who share a common pair of values for verbal and mathematical intelligence is some constant times their level of verbal intelligence plus another constant times their level of mathematical intelligence, i.e., it is a linear combination of those two "factors". The numbers by which the two kinds of intelligence are multiplied to obtain the expected score for a particular subject are posited by the theory to be the same for all pairs of intelligence levels, and are called the "factor loadings" for that subject. For example, the theory may hold that the average student's aptitude in the field of amphibology is

{ 10 × the student's verbal intelligence } + { 6 × the student's mathematical intelligence }.

The numbers 10 and 6 are the factor loadings associated with amphibology.

Two students having identical degrees of verbal intelligence and identical degrees of mathematical intelligence may have different aptitudes in amphibology because individual aptitudes differ from average aptitudes.
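The difference between the expected aptitude and an individual's aptitude can be sketched numerically. This is only an illustration: the loadings 10 and 6 come from the text, while the intelligence levels (1.5 and 0.5), the normal noise, and its standard deviation of 2 are assumptions made for the example.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed so the sketch is reproducible

# Factor loadings for amphibology, taken from the text.
LOADING_VERBAL, LOADING_MATH = 10.0, 6.0

# Hypothetical intelligence levels shared by two students (assumed values).
verbal, mathematical = 1.5, 0.5

# Expected (group-average) aptitude: a linear combination of the two factors.
expected = LOADING_VERBAL * verbal + LOADING_MATH * mathematical

# Individual aptitudes deviate from the average; the deviation is modeled
# here as normal noise with an assumed standard deviation of 2.
student_a = expected + rng.normal(scale=2.0)
student_b = expected + rng.normal(scale=2.0)

print(expected)              # 18.0
print(student_a, student_b)  # two different values scattered around 18
```

Both students share the same expected aptitude (10 × 1.5 + 6 × 0.5 = 18), yet their individual values differ because each receives its own deviation.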

The observable data that go into factor analysis would be the 10 scores of each of 1,000 students, a total of 10,000 numbers. The factor loadings and the levels of the two kinds of intelligence of each student must be inferred from the data.


Mathematical model of the same example

In the example above, for i = 1, ..., 1,000 the ith student's scores are

$$x_{k,i} = \mu_k + \ell_{k,1} v_i + \ell_{k,2} m_i + \varepsilon_{k,i}, \qquad k = 1, \ldots, 10,$$

where

$x_{k,i}$ is the ith student's score for the kth subject,
$\mu_k$ is the mean of the students' scores for the kth subject (assumed to be zero, for simplicity, in the example as described above, which would amount to a simple shift of the scale used),
$v_i$ is the ith student's "verbal intelligence",
$m_i$ is the ith student's "mathematical intelligence",
$\ell_{k,j}$ are the factor loadings for the kth subject, for j = 1, 2,
$\varepsilon_{k,i}$ is the difference between the ith student's score in the kth subject and the average score in the kth subject of all students whose levels of verbal and mathematical intelligence are the same as those of the ith student.

In matrix notation, we have

X = μ + LF + ε

where

X is a 10 × 1,000 matrix of observable random variables,
μ is a 10 × 1 column vector of unobservable constants (in this case "constants" are quantities not differing from one individual student to the next; the randomness arises from the random way in which the students are chosen),
L is a 10 × 2 matrix of factor loadings (unobservable constants),
F is a 2 × 1,000 matrix of unobservable random variables,
ε is a 10 × 1,000 matrix of unobservable random variables.
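The shapes in the matrix form can be checked with a short numerical sketch. It assumes NumPy and draws hypothetical values for μ, L, F and ε (none of these numbers come from the text). It only illustrates that the 10 × 1 vector μ is added to every student's column, and that each entry of X is the subject mean plus the loading-weighted factor levels plus the error.

```python
import numpy as np

rng = np.random.default_rng(1)  # fixed seed for reproducibility

n_subjects, n_factors, n_students = 10, 2, 1000

# All values below are made up for illustration.
mu = rng.normal(loc=50.0, scale=5.0, size=(n_subjects, 1))   # 10 x 1 subject means
L = rng.uniform(1.0, 10.0, size=(n_subjects, n_factors))     # 10 x 2 factor loadings
F = rng.normal(size=(n_factors, n_students))                 # 2 x 1000 factor levels
eps = rng.normal(scale=2.0, size=(n_subjects, n_students))   # 10 x 1000 errors

# X = mu + L F + eps; NumPy broadcasts the column vector mu across all students.
X = mu + L @ F + eps

# Any single entry agrees with the element-wise form of the model.
k, i = 3, 42
entry = mu[k, 0] + L[k, 0] * F[0, i] + L[k, 1] * F[1, i] + eps[k, i]

print(X.shape)                     # (10, 1000)
print(np.isclose(X[k, i], entry))  # True
```

In a real analysis only X would be available; μ, L, F and ε are generated here solely so that the structure of the model is visible.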

Observe that doubling the scale on which "verbal intelligence" (the first component in each column of F) is measured, while simultaneously halving the factor loadings for verbal intelligence, makes no difference to the model. Even if the two factors are uncorrelated, we cannot tell which factor corresponds to verbal intelligence and which corresponds to mathematical intelligence without an outside argument. The "errors" ε are taken to be independent of each other.
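Both indeterminacies can be verified directly on the product LF. The sketch below assumes NumPy and uses randomly drawn, hypothetical values for L and F; it shows that rescaling a factor, or swapping the order of the two factors, leaves LF and therefore the modeled scores unchanged.

```python
import numpy as np

rng = np.random.default_rng(2)  # fixed seed for reproducibility

# Hypothetical loadings (10 x 2) and factor levels (2 x 1000).
L = rng.uniform(1.0, 10.0, size=(10, 2))
F = rng.normal(size=(2, 1000))

# Double the scale of the first factor and halve its loadings.
F_rescaled = F.copy()
F_rescaled[0] *= 2.0
L_rescaled = L.copy()
L_rescaled[:, 0] /= 2.0
same_after_rescaling = np.allclose(L @ F, L_rescaled @ F_rescaled)

# Swap which factor is listed first (columns of L, rows of F).
swap = [1, 0]
same_after_swap = np.allclose(L @ F, L[:, swap] @ F[swap])

print(same_after_rescaling, same_after_swap)  # True True
```

Because the data cannot distinguish these alternatives, fixing the scale of each factor and deciding which one is called "verbal" are conventions imposed from outside the model.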

Factor analysis in psychometrics

History

Charles Spearman pioneered the use of factor analysis in the field of psychology and is sometimes credited with the invention of factor analysis.

Raymond Cattell expanded on Spearman’s idea of a two-factor theory of intelligence after performing his own tests and factor analysis. His research led to the development of his theory of fluid and crystallized intelligence, as well as his 16 Personality Factors theory of personality. Cattell was a strong advocate of factor analysis and psychometrics.

Applications in psychology

Factor analysis has been used in the study of human intelligence and human personality to compare the outcomes of (ideally) objective tests, to construct matrices of the correlations between those outcomes, and to find the factors that underlie them.

Advantages

Offers a much more objective method of testing traits such as intelligence in humans
Allows for a satisfactory comparison between the results of intelligence tests
Provides support for theories that would be difficult to prove otherwise

Disadvantages

"...each orientation is equally acceptable mathematically. But different factorial theories proved to differ as much in terms of the orientations of factorial axes for a given solution as in terms of anything else, so that model fitting did not prove to be useful in distinguishing among theories." This means all rotations represent different underlying processes, but all rotations are equally valid outcomes of standard factor analysis optimization. Therefore, it is impossible to pick the proper rotation using factor analysis alone. "[Raymond Cattell] believed that factor analysis was 'a tool that could be applied to the study of behavior and ... Interpreting factor analysis is based on using a “heuristic”, which is a solution that is "convenient even if not absolutely true" (Richard B.

Factor analysis in marketing

The basic steps are:

Identify the salient attributes consumers use to evaluate products in this category.
Input the data into a statistical program and run the factor analysis procedure.

Analysis

The analysis will isolate the underlying factors that explain the data. Factor analysis is an interdependence technique. Factor analysis assumes that all the rating data on different attributes can be reduced down to a few important dimensions. The statistical algorithm deconstructs the rating (called a raw score) into its various components, and reconstructs the partial scores into underlying factor scores. The degree of correlation between the initial raw score and the final factor score is called a factor loading. There are two approaches to factor analysis: "principal component analysis" (the total variance in the data is considered); and "common factor analysis" (the common variance is considered).
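The description of a loading as the correlation between a raw score and a factor score can be checked numerically for the principal component approach. The sketch below assumes NumPy and simulated ratings (500 hypothetical respondents, six attributes, two underlying dimensions; all values are made up). It extracts two components from the correlation matrix and compares the usual loading formula (eigenvector times the square root of its eigenvalue) with directly computed correlations.

```python
import numpy as np

rng = np.random.default_rng(5)  # fixed seed for reproducibility

# Hypothetical ratings: 500 respondents x 6 attributes driven by 2 dimensions.
n, p = 500, 6
latent = rng.normal(size=(n, 2))
pattern = np.array([[0.8, 0.0], [0.8, 0.0], [0.8, 0.0],
                    [0.0, 0.8], [0.0, 0.8], [0.0, 0.8]])
ratings = latent @ pattern.T + rng.normal(scale=0.6, size=(n, p))

# Standardize the raw scores, then eigen-decompose their correlation matrix.
z = (ratings - ratings.mean(axis=0)) / ratings.std(axis=0)
corr = np.corrcoef(z, rowvar=False)
eigvals, eigvecs = np.linalg.eigh(corr)   # eigenvalues in ascending order
order = np.argsort(eigvals)[::-1][:2]     # indices of the two largest
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

# Principal-component loadings and the corresponding component scores.
loadings = eigvecs * np.sqrt(eigvals)     # shape (6, 2)
scores = z @ eigvecs                      # shape (500, 2)

# Correlation between each raw (standardized) score and each component score.
raw_vs_score = np.array([[np.corrcoef(z[:, j], scores[:, c])[0, 1]
                          for c in range(2)] for j in range(p)])

print(np.allclose(loadings, raw_vs_score))  # True
```

This identity holds for components extracted from the correlation matrix; common factor analysis estimates its loadings differently, so the sketch covers only the first of the two approaches named above.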

Note that there are very important conceptual differences between the two approaches, an important one being that the common factor model involves a testable model whereas principal component analysis does not. This is because in the common factor model the unique variables are required to be uncorrelated, whereas the residuals in principal component analysis are correlated. If you want to build a testable model to explain the intercorrelations among input variables, you should carry out a factor analysis. If the observed variables are completely unrelated, factor analysis is unable to produce a meaningful pattern (though the eigenvalues will highlight this, suggesting that each variable should be given a factor in its own right). If sets of observed variables are highly similar to each other but distinct from other items, factor analysis will assign a factor to them, even though this factor will essentially capture the true variance of a single item.
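The remark about eigenvalues can be made concrete by comparing the eigenvalues of two correlation matrices. The sketch below assumes NumPy and simulated data (1,000 hypothetical observations of six variables; the pattern of 0.8 loadings and the noise scale are invented for the example). For unrelated variables every eigenvalue stays close to 1, so no small set of factors stands out; for variables driven by two factors, two eigenvalues dominate.

```python
import numpy as np

rng = np.random.default_rng(4)  # fixed seed for reproducibility
n = 1000

# Case 1: six completely unrelated variables.
unrelated = rng.normal(size=(n, 6))
eig_unrelated = np.linalg.eigvalsh(np.corrcoef(unrelated, rowvar=False))[::-1]

# Case 2: six variables driven by two hypothetical factors (three each).
factors = rng.normal(size=(n, 2))
pattern = np.array([[0.8, 0.0]] * 3 + [[0.0, 0.8]] * 3)
structured = factors @ pattern.T + rng.normal(scale=0.6, size=(n, 6))
eig_structured = np.linalg.eigvalsh(np.corrcoef(structured, rowvar=False))[::-1]

# Eigenvalues of a 6 x 6 correlation matrix always sum to 6.
print(np.round(eig_unrelated, 2))   # all near 1
print(np.round(eig_structured, 2))  # two large values, four small ones
```

In the second case the population eigenvalues are 2.28 (twice) and 0.36 (four times), so the sample values fall near those numbers.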

