Psychometrics


 

For information regarding the parapsychology phenomenon of distance knowledge, see psychometry.

Theoretical approaches

Psychometric theory involves several distinct areas of study. First, psychometricians have developed a large body of theory used in the development of mental tests and analysis of data collected from these tests. This work can be roughly divided into classical test theory (CTT) and the more recent item response theory (IRT). An approach which is similar to IRT but also quite distinctive, in terms of its origins and features, is represented by the Rasch model for measurement. The development of the Rasch model, and the broader class of models to which it belongs, was explicitly founded on requirements of measurement in the physical sciences (Rasch, 1960).

Related Topics:
Classical test theory - Item response theory - Rasch model

~ ~ ~ ~ ~ ~ ~ ~ ~ ~

Second, psychometricians have developed methods for working with large matrices of correlations and covariances. Techniques in this general tradition include factor analysis (finding important underlying dimensions in the data), multidimensional scaling (finding a simple representation for high-dimensional data) and data clustering (finding objects which are like each other). In these multivariate descriptive methods, users try to simplify large amounts of data. More recently, structural equation modeling and path analysis represent more sophisticated approaches to solving this problem of large covariance matrices. These methods allow statistically sophisticated models to be fitted to data and tested to determine if they are adequate fits.
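As a rough illustration of finding a dominant underlying dimension in a correlation matrix, the sketch below extracts the largest eigenvalue and its eigenvector by power iteration. This is principal-component extraction, a simpler relative of factor analysis, not factor analysis itself. The function name and the correlation values are hypothetical and chosen only for illustration.

```python
import math

def dominant_component(corr, iterations=200):
    """Power iteration: largest eigenvalue and unit eigenvector of a symmetric matrix.

    Assumes the start vector is not orthogonal to the dominant eigenvector,
    which holds for a correlation matrix with all-positive entries.
    """
    n = len(corr)
    vector = [1.0 / math.sqrt(n)] * n
    for _ in range(iterations):
        product = [sum(corr[i][j] * vector[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(v * v for v in product))
        vector = [v / norm for v in product]
    product = [sum(corr[i][j] * vector[j] for j in range(n)) for i in range(n)]
    eigenvalue = sum(v * p for v, p in zip(vector, product))  # Rayleigh quotient
    return eigenvalue, vector

# Hypothetical correlations among four mental tests (invented numbers).
corr = [
    [1.0, 0.6, 0.5, 0.4],
    [0.6, 1.0, 0.5, 0.4],
    [0.5, 0.5, 1.0, 0.4],
    [0.4, 0.4, 0.4, 1.0],
]

eigenvalue, loadings = dominant_component(corr)
share = eigenvalue / len(corr)  # proportion of total variance on this dimension
```

Here `share` is the fraction of the total variance carried by the first dimension; a large value suggests that a single dimension summarizes much of the pattern in the matrix.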

Related Topics:
Factor analysis - Multidimensional scaling - Data clustering - Structural equation modeling - Path analysis

~ ~ ~ ~ ~ ~ ~ ~ ~ ~

Key concepts

The key traditional concepts in classical test theory are reliability and validity. A reliable measure measures something consistently, while a valid measure measures what it is supposed to measure. A measure may be reliable without being valid: a broken ruler, for example, may under-measure a quantity by the same amount every time (consistently), yet the resulting quantity is still wrong, that is, invalid. As another example, a reliable rifle will put a tight cluster of bullets in the target, while a valid one will center that cluster on the center of the target.
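The ruler analogy can be sketched numerically. The readings below are invented: one hypothetical ruler reads about 2 units short every time, while another is centered on the true length but scattered. Spread stands in for consistency and bias for being on target; this mirrors the analogy only, since the validity of a psychological test is not reducible to a constant offset.

```python
from statistics import mean, pstdev

def describe(readings, true_value):
    """Return (spread, bias) for a set of repeated measurements."""
    spread = pstdev(readings)            # small spread -> consistent readings
    bias = mean(readings) - true_value   # bias near zero -> centered on target
    return spread, bias

TRUE_LENGTH = 100.0  # hypothetical true length of the object

broken_ruler = [98.0, 98.1, 97.9, 98.0, 98.0]    # consistent but 2 units short
noisy_ruler = [95.0, 104.0, 99.0, 103.0, 99.0]   # centered but scattered

ruler_spread, ruler_bias = describe(broken_ruler, TRUE_LENGTH)
noisy_spread, noisy_bias = describe(noisy_ruler, TRUE_LENGTH)
```

The broken ruler shows a small spread with a large bias (consistent but wrong), and the noisy ruler shows the reverse.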

Related Topics:
Reliability - Validity

~ ~ ~ ~ ~ ~ ~ ~ ~ ~

Both reliability and validity may be assessed mathematically. Internal consistency may be assessed by correlating performance on two halves of a test (split-half reliability); the value of the Pearson product-moment correlation coefficient is then adjusted with the Spearman-Brown prediction formula to correspond to the correlation between two full-length tests. Other approaches include the intra-class correlation (the ratio of the variance of measurements of a given target to the variance of all targets). A commonly used measure is Cronbach's α, which is equivalent to the mean of all possible split-half coefficients. Stability over repeated measures (test-retest reliability) is assessed with the Pearson coefficient, as is the equivalence of different versions of the same measure (different forms of an intelligence test, for example). Other measures are also used.
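A minimal sketch of these reliability calculations follows, assuming a small invented table of right/wrong item scores (rows are test-takers, columns are items). The function names are hypothetical, and the odd-even split is only one of many possible ways to halve a test.

```python
from statistics import mean, pvariance

def pearson(x, y):
    """Pearson product-moment correlation of two equal-length lists."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5

def spearman_brown(r_half):
    """Step a half-test correlation up to the full test length."""
    return 2 * r_half / (1 + r_half)

def split_half_reliability(scores):
    """Correlate odd-item and even-item half scores, then adjust."""
    odd = [sum(row[0::2]) for row in scores]
    even = [sum(row[1::2]) for row in scores]
    return spearman_brown(pearson(odd, even))

def cronbach_alpha(scores):
    """Cronbach's alpha from item variances and total-score variance."""
    k = len(scores[0])
    item_vars = [pvariance([row[j] for row in scores]) for j in range(k)]
    total_var = pvariance([sum(row) for row in scores])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)

# Invented data: six test-takers, four items scored 1 (right) or 0 (wrong).
scores = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
    [1, 1, 1, 1],
]

split_half = split_half_reliability(scores)
alpha = cronbach_alpha(scores)
```

The two coefficients need not agree on a given data set, because a split-half value depends on which particular split is chosen.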

Related Topics:
Pearson product-moment correlation coefficient - Spearman-Brown prediction formula - Cronbach's α

~ ~ ~ ~ ~ ~ ~ ~ ~ ~

Validity may be assessed by correlating measures with a criterion measure known to be valid. When the criterion measure is collected at the same time as the measure being validated, the goal is to establish concurrent validity; when the criterion is collected later, the goal is to establish predictive validity. A measure has construct validity if it is related to other variables as required by theory. Content validity, or face validity, is simply a demonstration that the items of a test are drawn from the domain being measured; it does not guarantee that the test actually measures phenomena in that domain.

Related Topics:
Concurrent validity - Predictive validity - Construct validity - Content validity

~ ~ ~ ~ ~ ~ ~ ~ ~ ~

Predictive or concurrent validity cannot exceed the square root of the correlation between two versions of the same measure, that is, the square root of the measure's reliability.

~ ~ ~ ~ ~ ~ ~ ~ ~ ~

Item response theory models the relationship between latent traits and responses to test items. Among other advantages, IRT provides a basis for obtaining an estimate of the location of a test-taker on a given latent trait as well as the standard error of measurement of that location. For example, a university student's knowledge of history can be deduced from the student's score on a university test and then be compared reliably with a high school student's knowledge deduced from a less difficult test. Scores derived by classical test theory do not have this characteristic, and actual ability (rather than ability relative to other test-takers) must be assessed by comparing scores to those of a norm group randomly selected from the population. In fact, all measures derived from classical test theory are dependent on the sample tested, while, in principle, those derived from item response theory are not.
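The comparison across tests of different difficulty can be sketched with the Rasch model, one of the simplest models in this family. The sketch assumes the item difficulties are already known and lie on a common scale; the difficulty values, response patterns, and function names are invented for illustration. Ability is estimated by maximum likelihood with Newton-Raphson steps, and the standard error is taken as the reciprocal square root of the test information.

```python
import math

def rasch_probability(theta, difficulty):
    """Probability of a correct response under the Rasch model."""
    return 1.0 / (1.0 + math.exp(-(theta - difficulty)))

def estimate_ability(responses, difficulties, iterations=25):
    """Maximum-likelihood ability estimate and its standard error.

    responses: 0/1 item scores; difficulties: assumed-known item difficulties.
    """
    if sum(responses) in (0, len(responses)):
        raise ValueError("all-correct or all-wrong patterns have no finite estimate")
    theta = 0.0
    for _ in range(iterations):
        probs = [rasch_probability(theta, b) for b in difficulties]
        gradient = sum(x - p for x, p in zip(responses, probs))
        information = sum(p * (1 - p) for p in probs)
        theta += gradient / information  # Newton-Raphson step
    probs = [rasch_probability(theta, b) for b in difficulties]
    information = sum(p * (1 - p) for p in probs)
    return theta, 1.0 / math.sqrt(information)

# Invented item difficulties on a common scale.
easy_test = [-2.0, -1.0, 0.0, 1.0]   # hypothetical high school test
hard_test = [0.0, 1.0, 2.0, 3.0]     # hypothetical university test

school_theta, school_se = estimate_ability([1, 1, 1, 0], easy_test)
university_theta, university_se = estimate_ability([1, 1, 0, 0], hard_test)
```

In this invented example the university student answers fewer items correctly yet receives the higher ability estimate, because the items answered were harder. With only four items, both standard errors are large.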

Related Topics:
Latent traits - Norm group

~ ~ ~ ~ ~ ~ ~ ~ ~ ~

For some, the field of psychometrics has controversial aspects relating to the human implications of applied measurement. In part, the controversy involves the very notion of standardized tests. For others, the problematic aspects of psychometrics lie in the history of the field, which involves aspects of eugenics.

~ ~ ~ ~ ~ ~ ~ ~ ~ ~