Cambridge Encyclopedia :: Cambridge Encyclopedia Vol. 60

psychometrics - Origins and background, Definition of measurement in the social sciences, Instruments and procedures, Theoretical approaches

A branch of psychology concerned with the measurement of psychological characteristics, especially intelligence, abilities, personality, and mood states. Psychometric tests are carefully constructed and standardized to provide measures of the highest possible reliability and validity.

Portions of the summary below have been contributed by Wikipedia.

Psychometrics is the field of study concerned with the theory and technique of psychological measurement, which includes the measurement of knowledge, abilities, attitudes, and personality traits.

Origins and background

Much of the early theoretical and applied work in psychometrics was undertaken in an attempt to measure intelligence. Charles Spearman, a pioneer in psychometrics who developed approaches to the measurement of intelligence, studied under Wilhelm Wundt and was trained in psychophysics. Thurstone later developed and applied a theoretical approach to measurement known as the law of comparative judgment, which has close connections to the psychophysical theory developed by Ernst Heinrich Weber and Gustav Fechner.

More recently, psychometric theory has been applied in the measurement of personality, attitudes and beliefs, academic achievement, and in health-related fields.

Definition of measurement in the social sciences

The definition of measurement in the social sciences has a long history. A currently widespread definition, proposed by Stanley Smith Stevens (1946), is that measurement is "the assignment of numerals to objects or events according to some rule". Although widely adopted, this definition differs in important respects from the more classical definition of measurement adopted throughout the physical sciences, which is that measurement is the numerical estimation and expression of the magnitude of one quantity relative to another (Michell, 1997).

Divergent views on the definition of measurement are reflected to a large extent in alternative approaches to measurement. When measurement models such as the Rasch model are employed, numbers are not simply assigned based on a rule. Instead, measurements are estimated based on the models, and tests are conducted to ascertain whether the relevant criteria have been met.

Instruments and procedures

The first psychometric instruments were designed to measure the concept of intelligence. The best-known historical approach involves the Stanford-Binet IQ test, which grew out of a test developed originally by the French psychologist Alfred Binet. IQ tests are useful tools for various purposes.

Psychometrics is applied widely in educational assessment to measure abilities in domains such as reading, writing, and mathematics. The main approaches in applying tests in these domains have been Classical Test Theory and the more modern Item Response Theory and Rasch measurement models.

Another major focus in psychometrics has been personality testing. An alternative approach involves the application of unfolding measurement models, the most general being the Hyperbolic Cosine Model (Andrich &

Theoretical approaches

Psychometric theory involves several distinct areas of study. First, psychometricians have developed a large body of theory used in the development of mental tests and analysis of data collected from these tests. This work can be roughly divided into classical test theory (CTT) and the more recent item response theory (IRT). An approach which is similar to IRT but also quite distinctive, in terms of its origins and features, is represented by the Rasch model for measurement. The development of the Rasch model, and the broader class of models to which it belongs, was explicitly founded on requirements of measurement in the physical sciences (Rasch, 1960).


Second, psychometricians have developed methods for working with large matrices of correlations and covariances. These methods allow statistically sophisticated models to be fitted to data and tested to determine if they are adequate fits.

Key concepts

The key traditional concepts in classical test theory are reliability and validity. A reliable measure measures something consistently, while a valid measure measures what it is supposed to measure. A measure may be reliable without being valid: a broken ruler, for example, may under-measure a quantity by the same amount every time (consistently), but the resulting value is still wrong, that is, invalid.
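The broken-ruler case can be made concrete with a small sketch. The true length, the ruler offset, and the function names below are hypothetical values chosen only for illustration; consistency is summarized here by the spread of repeated readings and correctness by their average error.

```python
from statistics import mean, pstdev

def spread(readings):
    """Population standard deviation of repeated readings (0 means perfectly consistent)."""
    return pstdev(readings)

def bias(readings, true_value):
    """Average error of the readings relative to the true value (0 means unbiased)."""
    return mean(readings) - true_value

# Hypothetical example: a 30 cm object measured five times with a ruler
# that always reads 2 cm short.
TRUE_LENGTH_CM = 30.0
RULER_OFFSET_CM = -2.0
broken_ruler_readings = [TRUE_LENGTH_CM + RULER_OFFSET_CM for _ in range(5)]

consistency = spread(broken_ruler_readings)             # no variation between readings
error = bias(broken_ruler_readings, TRUE_LENGTH_CM)     # every reading is off by the offset
```

The readings do not vary at all, so the instrument is perfectly consistent, yet each one is off by the full offset, so it does not measure the intended quantity correctly.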

Both reliability and validity may be assessed mathematically. Internal consistency may be assessed by correlating performance on two halves of a test (split-half reliability); the value of the Pearson product-moment correlation coefficient is adjusted with the Spearman-Brown prediction formula to correspond to the correlation between two full-length tests. Stability over repeated measures is assessed with the Pearson coefficient, as is the equivalence of different versions of the same measure (different forms of an intelligence test, for example).
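As a minimal sketch of the split-half procedure, the following code correlates scores on the odd- and even-numbered items and then applies the Spearman-Brown prediction formula. The data layout (one list of item scores per test-taker), the odd/even split, and the sample scores are assumptions made for illustration; other ways of halving a test are equally possible.

```python
from math import sqrt

def pearson(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)

def spearman_brown(r_half, factor=2.0):
    """Predicted reliability when the test length is multiplied by `factor`."""
    return factor * r_half / (1.0 + (factor - 1.0) * r_half)

def split_half_reliability(item_scores):
    """Split-half reliability from per-person item scores (assumed layout).

    Items in odd and even positions form the two halves; the half-test
    correlation is then stepped up to full test length.
    """
    first_half = [sum(row[0::2]) for row in item_scores]
    second_half = [sum(row[1::2]) for row in item_scores]
    return spearman_brown(pearson(first_half, second_half))

# Made-up responses of five test-takers to six right/wrong items.
scores = [
    [1, 1, 1, 1, 1, 0],
    [1, 1, 0, 1, 0, 0],
    [1, 0, 1, 0, 0, 0],
    [0, 0, 0, 1, 0, 0],
    [1, 1, 1, 1, 1, 1],
]
reliability = split_half_reliability(scores)
```

The adjustment is needed because each half is only half as long as the full test, and shorter tests tend to be less reliable; the formula estimates what the correlation would be between two full-length tests.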

Validity may be assessed by correlating measures with a criterion measure known to be valid. Content validity is simply a demonstration that the items of a test are drawn from the domain being measured. In a personnel selection example, test content is based on a defined statement or set of statements of knowledge, skill, ability, or other characteristics obtained from a job analysis.

Predictive or concurrent validity cannot exceed the square root of the reliability, that is, of the correlation between two versions of the same measure.
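Under classical test theory this ceiling is usually stated as the attenuation inequality: the observed correlation between a test and a criterion is at most the square root of the product of their reliabilities, and therefore at most the square root of the test's own reliability. The sketch below computes that bound; the function name and the default of a perfectly reliable criterion are assumptions for illustration.

```python
from math import sqrt

def max_validity(test_reliability, criterion_reliability=1.0):
    """Upper bound on the observed test-criterion correlation.

    Both arguments are reliability coefficients in [0, 1]. With the default
    criterion reliability of 1.0 the bound reduces to sqrt(test_reliability).
    """
    for value in (test_reliability, criterion_reliability):
        if not 0.0 <= value <= 1.0:
            raise ValueError("reliabilities must lie in [0, 1]")
    return sqrt(test_reliability * criterion_reliability)

# Hypothetical test with reliability 0.81 and a criterion with reliability 0.64.
ceiling_perfect_criterion = max_validity(0.81)
ceiling_imperfect_criterion = max_validity(0.81, 0.64)
```

An unreliable criterion lowers the ceiling further, which is one reason observed validity coefficients tend to understate the relationship between the underlying traits.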

Item response theory models the relationship between latent traits and responses to test items. Because item difficulty is modeled explicitly, ability estimates obtained from different tests can, in principle, be placed on a common scale. For example, a university student's knowledge of history can be deduced from a score on a university test and then be compared reliably with a high school student's knowledge deduced from a less difficult test. Scores derived by classical test theory do not have this characteristic, and actual ability (rather than ability relative to other test-takers) must be assessed by comparing scores to those of a norm group randomly selected from the population. In fact, all measures derived from classical test theory are dependent on the sample tested, while, in principle, those derived from item response theory are not.
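The idea can be illustrated with the Rasch model, one of the simplest item response models, in which the probability of a correct response depends only on the difference between a person's ability and the item's difficulty. The sketch below is an illustration under stated assumptions, not a full calibration procedure: item difficulties are taken as already known and expressed on a common scale, the difficulty values and response patterns are made up, and ability is estimated by a plain Newton-Raphson maximum-likelihood search.

```python
from math import exp

def rasch_probability(theta, difficulty):
    """Probability of a correct response under the Rasch model."""
    return 1.0 / (1.0 + exp(-(theta - difficulty)))

def estimate_ability(responses, difficulties, iterations=25):
    """Maximum-likelihood ability estimate for one test-taker.

    Assumes the item difficulties are known and that the response pattern
    is mixed, since all-correct or all-incorrect patterns have no finite
    maximum-likelihood estimate.
    """
    raw_score = sum(responses)
    if raw_score == 0 or raw_score == len(responses):
        raise ValueError("no finite estimate for all-0 or all-1 patterns")
    theta = 0.0
    for _ in range(iterations):
        probs = [rasch_probability(theta, b) for b in difficulties]
        gradient = raw_score - sum(probs)              # slope of the log-likelihood
        information = sum(p * (1.0 - p) for p in probs)  # its negative curvature
        theta += gradient / information                # Newton-Raphson step
    return theta

# Two hypothetical test-takers sit tests of different difficulty.
easy_items = [-2.0, -1.5, -1.0, -0.5, 0.0]
hard_items = [0.0, 0.5, 1.0, 1.5, 2.0]
ability_on_easy_test = estimate_ability([1, 1, 1, 1, 0], easy_items)  # 4 of 5 correct
ability_on_hard_test = estimate_ability([1, 1, 0, 0, 0], hard_items)  # 2 of 5 correct
```

Although the raw scores differ (four correct on the easier test, two on the harder one), the two ability estimates come out close to each other, because each score is interpreted relative to the difficulty of the items that produced it.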

Standards of quality

The considerations of validity and reliability typically are viewed as essential elements for determining the quality of any test. However, professional and practitioner associations frequently have placed these concerns within broader contexts when developing standards and making overall judgments about the quality of any test as a whole within a given context.

Testing standards

In the field of psychometrics, the Standards for Educational and Psychological Testing place standards about validity and reliability, along with errors of measurement and related considerations, under the first general topic of test construction, evaluation and documentation. The second major topic covers standards related to fairness in testing, including fairness in testing and test use, the rights and responsibilities of test takers, testing individuals of diverse linguistic backgrounds, and testing individuals with disabilities. The third and final major topic covers standards related to testing applications, including the responsibilities of test users, psychological testing and assessment, educational testing and assessment, testing in employment and credentialing, and testing in program evaluation and public policy.

Evaluation standards

In the field of evaluation, and in particular educational evaluation, the Joint Committee on Standards for Educational Evaluation has published three sets of standards for evaluations.
