Some researchers have written to me, concerned about the low coefficient alphas they have obtained for the TIPI scales or the poor factor structures they have obtained in factor analyses. However, the TIPI was not designed with these criteria in mind; in fact, it was designed using criteria that almost guarantee it will perform poorly in terms of alpha and Confirmatory Factor Analysis (CFA) or Exploratory Factor Analysis (EFA) fit indices. It is almost impossible to get high alphas and good fit indices with instruments like the TIPI, which are designed to measure very broad domains with only two items per dimension and with items at both the positive and negative poles. Indeed, some researchers have pointed out that alphas are misleading when calculated on scales with small numbers of items (Kline, 2000; Wood & Hampson, 2005).
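The dependence of alpha on scale length can be illustrated with the standardized form of coefficient alpha, which is a function of only the number of items k and the mean inter-item correlation. The sketch below is a minimal illustration under that assumption; the function name `standardized_alpha` and the correlation value of .30 are hypothetical choices for the example, not TIPI data.

```python
def standardized_alpha(k: int, mean_r: float) -> float:
    """Standardized coefficient alpha for k items whose mean
    inter-item correlation is mean_r: k*r / (1 + (k - 1)*r)."""
    if k < 2:
        raise ValueError("alpha needs at least two items")
    return (k * mean_r) / (1 + (k - 1) * mean_r)


# Illustrative inputs only (not TIPI data): the same mean inter-item
# correlation, evaluated for a two-item and a ten-item scale.
for k in (2, 10):
    print(k, round(standardized_alpha(k, 0.30), 2))
```

With a mean inter-item correlation of .30, two items yield an alpha of about .46, whereas ten items with the same correlation yield about .81. Broad, heterogeneous item content tends to keep inter-item correlations modest, so a two-item scale has little room to reach conventional alpha thresholds.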
As noted in the original TIPI manuscript (Gosling et al., 2003), the goal of the TIPI was to create a very short instrument that optimized validity (including content validity). The goal was NOT to create an instrument with high alphas and good CFA fits. It would have been easy to design scales that optimized alpha and CFA fits. For example, we could have created an “Extraversion” scale from the items “Talkative, verbal” and “Untalkative, quiet.” But had we done this, we would essentially have developed a scale measuring just one facet (talkativeness) of Extraversion; the high alphas and impressive fit indices would have come at the expense of more important properties like content validity and, in all likelihood, criterion validity. Criteria like alpha and clean factor structures are meaningful only to the extent that they reflect improved validity. In cases like the TIPI (using a few items to measure broad domains), they don’t. If reliability estimates are needed, test-retest reliability is a more appropriate index.
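Test-retest reliability is commonly estimated as the Pearson correlation between scale scores from the same respondents at two administrations. The following is a minimal sketch under that assumption; the helper `pearson_r` and the scores in `time1` and `time2` are made up for illustration and do not come from any TIPI sample.

```python
import math


def pearson_r(x: list[float], y: list[float]) -> float:
    """Pearson correlation between two equal-length lists of scores."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length lists with at least two scores")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Sum of cross-products and sums of squares of the deviations.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical scale scores for five respondents at two time points.
time1 = [5.5, 3.0, 6.5, 4.0, 2.5]
time2 = [5.0, 3.5, 6.0, 4.5, 3.0]
print(round(pearson_r(time1, time2), 2))
```

Unlike alpha, this index does not depend on how strongly the two items within a scale correlate with each other, only on how stable each respondent's scale score is over time.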