
The quadratic relationship between difficulty of intelligence test items and their correlations with working memory


Frontiers in Psychology

OPEN ACCESS

Edited by:

Gabriel Radvansky, University of Notre Dame, USA

Reviewed by:

Michael J. Kane, University of North Carolina at Greensboro, USA
Jason F. Reimer, California State University, San Bernardino, USA

*Correspondence:

Tomasz Smolen, Neurocognitive Psychology Unit, Department of Psychology, Pedagogical University of Krakow, ul. Podchorazych 2, 30-084 Krakow, Poland tsmolen@up.krakow.pl

Specialty section:

This article was submitted to Cognition, a section of the journal Frontiers in Psychology

Received: 17 June 2015 Accepted: 07 August 2015 Published: 24 August 2015

Citation:

Smolen T and Chuderski A (2015) The quadratic relationship between difficulty of intelligence test items and their correlations with working memory. Front. Psychol. 6:1270.

doi: 10.3389/fpsyg.2015.01270


The quadratic relationship between difficulty of intelligence test items and their correlations with working memory

Tomasz Smolen 1* and Adam Chuderski 2

1 Neurocognitive Psychology Unit, Department of Psychology, Pedagogical University of Krakow, Krakow, Poland, 2 Cognitive Science Department, Institute of Philosophy, Jagiellonian University, Krakow, Poland

Fluid intelligence (Gf) is a crucial cognitive ability that involves abstract reasoning in order to solve novel problems. Recent research demonstrated that Gf strongly depends on the individual effectiveness of working memory (WM). We investigated a popular claim that if storage capacity underlay the WM-Gf correlation, then such a correlation should increase with an increasing number of items or rules (load) in a Gf-test. As often no such link is observed, on that basis the storage-capacity account is rejected, and alternative accounts of Gf (e.g., related to executive control or processing speed) are proposed. Using both analytical inference and numerical simulations, we demonstrated that the load-dependent change in correlation is primarily a function of the amount of floor/ceiling effect for particular items. Thus, the item-wise WM correlation of a Gf-test depends on its overall difficulty, and the difficulty distribution across its items. When the early test items yield a huge ceiling effect, but the late items do not approach floor, that correlation will increase throughout the test. If the early items locate themselves between ceiling and floor, but the late items approach floor, the respective correlation will decrease. For a hallmark Gf-test, the Raven-test, whose items span from ceiling to floor, the quadratic relationship is expected, and it was shown empirically using a large sample and two types of WMC tasks. In consequence, no changes in correlation due to varying WM/Gf load, or lack of them, can yield an argument for or against any theory of WM/Gf. Moreover, as the mathematical properties of the correlation formula make it relatively immune to ceiling/floor effects for overall moderate correlations, only minor changes (if any) in the WM-Gf correlation should be expected for many psychological tests.

Keywords: fluid intelligence, working memory, Raven-test, correlation, floor/ceiling effects

1. Introduction

Fluid intelligence (Gf) is an important cognitive ability that constitutes the main component of human general intellectual aptitude (Gustaffson, 1984). Gf consists of using reasoning (inductive, deductive, spatial, etc.) in order to solve novel abstract problems unsolvable by solely using existing knowledge. Fluid intelligence explains a large part of individual differences in the diverse types of human cognition and behavior. For instance, more intelligent people are better in knowledge acquisition, language comprehension, and spatial navigation, they achieve on average a higher

Frontiers in Psychology | www.frontiersin.org 1 August 2015 | Volume 6 | Article 1270


socioeconomic status (including academic, professional, and financial status) than do less intelligent people, and they also manage better in daily life (e.g., they less often meet with accidents, go through medical treatments more effectively, etc.; Deary, 2012). A hallmark test of fluid intelligence is Raven's Advanced Progressive Matrices test, which requires discovering one or more abstract rules hidden within a geometrical pattern that is missing one fragment, and applying those rules in order to choose, from several alternatives, the one correct solution that best matches the pattern.

An important theme in fluid intelligence research consists of identification of its underlying cognitive mechanisms. The last 20 years of research have produced convincing evidence that the strongest known predictor of Gf is the capacity (WMC) of working memory (WM), a neurocognitive mechanism responsible for active maintenance and transformation of task-relevant information in the mind. Numerous studies have demonstrated that, when properly measured (see below), WMC can explain between half (Kane et al., 2005) and all variance in Gf (Oberauer et al., 2008; Chuderski, 2013). Unfortunately, this observation did not lead automatically to the identification of what makes both WM and Gf correlate strongly, because WM tasks are themselves quite complex; usually more than one WM process/resource is involved in performance in these tasks (Shipstead et al., 2014). Thus, one of the most exciting debates in Gf research concerns the identification of the mechanisms responsible for the strong association between WMC and Gf.

One influential theory assumes that shared variance in both WM tasks and Gf-tests depends on attention control exerted over cognitive processes, which includes goal-driven directing of attention and filtering out distraction (Kane and Engle, 2002; Shipstead et al., 2014). Evidence for this theory comes from significant correlations between Gf and the indices of executive control obtained from various tasks, for example involving memory updating (Burges et al., 2011), the inhibition of unwanted thoughts or prepotent responses (Dempster and Corkill, 1999; Unsworth et al., 2004), and dual-tasking (Ben-Shakhar and Sheffer, 2001). Moreover, research on high-WMC individuals (usually assessed with the complex span task that strongly correlates with Gf), compared to low-WMC people, demonstrated that the former were faster and more accurate on antisaccades (Unsworth et al., 2004), produced smaller error rates in incongruent trials using a high-congruent version of the Stroop test (Kane and Engle, 2003) and the flanker task (Heits and Engle, 2007), as well as more effectively suppressed distractors in a dichotic listening task (Conway et al., 2001). The attention-control theory of fluid reasoning holds that people with low attention control are poor reasoners because they find it difficult to maintain reasoning goals, and their cognitive processing is prone to frequent capture by irrelevant stimuli.

Alternatively, performance on simple storage capacity (short-term memory; STM) tasks that involve little attention control but require the active maintenance of a few items in parallel was at least as good a predictor of Gf as performance on tasks requiring executive control, when rehearsal and chunking were blocked in the former tasks (e.g., Cowan et al., 2006; Colom et al., 2008; Chuderski et al., 2012). These results suggest that storage capacity may be the main determinant of fluid intelligence. One explanation (Carpenter et al., 1990) predicts that a more capacious WM allows one to keep the sub-products of reasoning (induced rules, elements of a solution, etc.) in the most active and accessible part of WM, called primary memory (Cowan et al., 2006). WM may also play an important role in fluid reasoning because it affects what relationships can be constructed among WM items (e.g., Hummel and Holyoak, 2003). Notably, Oberauer et al. (2008) proposed that relational integration (the construction of flexible, temporary bindings between a number of chunks held in WM in order to develop novel, more complex structures) is crucial to reasoning.

Although current theorizing tends to acknowledge that both executive control and storage capacity mechanisms in some way contribute to Gf (e.g., Cowan et al., 2006; Chuderski and Necka, 2012; Shipstead et al., 2014; Unsworth et al., 2014), the mutual relationships between these two mechanisms have not yet been understood satisfactorily (are they interacting or independent? does one underlie the other, or vice versa?), and it is still argued that either executive control (e.g., Burges et al., 2011; Shipstead et al., 2014) or storage capacity (Martinez et al., 2011; Chuderski et al., 2012) is a more fundamental factor for explaining fluid intelligence (whereas the other factor just explains some minor variance in Gf). One important set of arguments in favor of each theory came from the analysis of WM-Gf correlations as a function of the increasing difficulty of Gf-test items. Such studies empirically tested the hypothesis, originally put forward by the seminal capacity-based model of processing in the Raven-test (Carpenter et al., 1990), which assumed that more difficult items of the Gf-tests should involve more information being stored in WM, and thus such items should yield stronger correlations between Gf and WM than do easier items, whose WM loads are unlikely to surpass the WMC of most participants. The logic of such tests was the following: if the storage-capacity account is right, then a positive correlation between the Raven item difficulty and WMC should be observed; otherwise, if no or even a negative correlation is noted, then the storage-capacity account should be rejected, and there is room for some alternative explanations of the neurocognitive basis of Gf (most possibly, the executive control or processing speed accounts).

Some researchers have indeed found evidence that Pearson's r increases for more difficult items (Little et al., 2014), and on this basis advocated the plausibility of the storage-capacity account; whereas others found such correlations to be fairly constant (Salthouse, 1993, 2014; Unsworth and Engle, 2005; Salthouse and Pink, 2008; Wiley et al., 2011), and thus rejected this account, instead opting for the attention-control account. Moreover, a similar argument has been used outside the WM domain, for example in studies of relationships between intelligence and aging (Salthouse, 1993; Babcock, 2002) or learning (Carlstedt et al., 2000; Verguts and DeBoeck, 2002).

Our goal is to show that the above line of reasoning (if more difficult Gf items lead to stronger WM-Gf correlations, then storage capacity likely underlies Gf; if not, then some other mechanisms must underpin Gf), although intuitively attractive, is nevertheless fundamentally flawed. To outline our reasoning, the change of correlation for a particular item primarily depends on the amount of floor/ceiling effect for that item, and the maximal strength of correlation exists when no such effects are present. Thus, correlations drop for both very easy and very difficult items (i.e., a lot of floor/ceiling), and are the highest for items of average difficulty (little or no floor/ceiling). Moreover, the mathematical formula for the Pearson correlation is highly immune even to relatively large amounts of ceiling or floor effects if the overall WM-Gf correlation is moderate or weak, and thus only minor differences in correlation between easy/difficult and medium items can be expected for most psychological tests. In consequence, no change in WM-Gf correlation with increasing Gf-test difficulty, or lack of it, can be used as an argument in favor of or against either the executive-control or storage-capacity accounts of WM/Gf. This matter is not only a statistical issue, but is a key methodological and theoretical problem, because the argument in question has been raised by numerous notable researchers in their theorizing about the WM and intelligence relationship (see below). Thus, it is crucial to systematically evaluate the validity of this argument.
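The outlined reasoning can be illustrated with a small simulation. The sketch below is not taken from the article: it assumes a standard-normal WM score, a latent reasoning ability correlated with it at a hypothetical value `rho`, and items that are solved whenever ability exceeds a difficulty threshold. The function name, thresholds, and sample size are all illustrative choices.

```python
import numpy as np

def itemwise_correlations(rho=0.6, thresholds=(-2.5, -1.0, 0.0, 1.0, 2.5),
                          n=200_000, seed=0):
    """Correlate a WM score with pass/fail on items of increasing difficulty.

    Assumption (not from the article): an item is solved when a latent
    ability, correlated with WM at `rho`, exceeds the item's threshold.
    """
    rng = np.random.default_rng(seed)
    wm = rng.standard_normal(n)
    # Latent ability shares correlation `rho` with the WM score.
    ability = rho * wm + np.sqrt(1.0 - rho**2) * rng.standard_normal(n)
    result = {}
    for t in thresholds:
        solved = (ability > t).astype(float)  # 1 = correct, 0 = incorrect
        result[t] = float(np.corrcoef(wm, solved)[0, 1])
    return result

if __name__ == "__main__":
    for t, r in itemwise_correlations().items():
        print(f"difficulty threshold {t:+.1f}: item-wise r = {r:.3f}")
```

Under these assumptions the item-wise correlation is largest for the item of medium difficulty and falls off toward both the very easy (ceiling) and the very hard (floor) items, although the same mechanism generates every item. Because each item-wise correlation scales with `rho`, lowering `rho` shrinks the absolute gap between medium and extreme items.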

The remaining text has been divided into four sections. First, we present the most notable examples in the literature of the arguments related to the increasing difficulty of Gf-tests. Second, using analytical inference we show that the correlation between two variables (slowly) decreases with increasing ceiling or floor effects. Third, with numerical simulations we demonstrate when the correlation formula is insensitive to such effects. Finally, we confirm our prediction empirically with the use of a large sample of participants, the Raven-test, and two methods of WMC measurement.

2. Examples of Relevant Studies

There are at least two cases in which the correlation between two variables may potentially change between two conditions (e.g., easy and difficult). First, in Condition 1, the mechanism generating variance in variable x may also generate a substantial variance in variable y, but it may yield less or no such variance in Condition 2 (we will develop this argument formally in the subsequent section). Thus, in Condition 1, the variance of variable y consists of a relatively large amount of variance shared with variable x, and a relatively small amount of error variance (reflecting measurement errors and the effects of all other variables affecting variable y not accounted for). In Condition 2, however, the variance of variable y consists mainly of error variance, whereas the amount of variance shared with variable x is relatively small. As the correlation coefficient reflects the square root of shared variance, such a coefficient will be substantially larger in Condition 1 than in Condition 2. For example, in our recent study (Chuderski and Necka, 2012), we found that scores on the well-known WM task, namely the n-back task, depend on either primary memory, when the WM load is small, or activated long-term memory (LTM), when the load is large (and the to-be-detected target falls out of the primary memory). Then, we found that the primary-memory condition substantially correlated with scores on intelligence tests (r ≈ 0.5), whereas in the LTM condition this correlation was close to zero. Thus, we concluded that primary memory underlies intelligence, whereas LTM is not an important mechanism for Gf. In general, this type of argument is quite obvious, and a substantial part of psychological knowledge has been inferred with such logic.
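The first case (a genuinely different share of common variance in the two conditions) can be sketched numerically. The example below is an illustration under stated assumptions, not the authors' analysis: both variables are standard normal, and `shared_share` is a hypothetical parameter giving the proportion of the variance of y that is shared with x.

```python
import numpy as np

def correlation_for_shared_share(shared_share, n=200_000, seed=1):
    """Pearson r between x and y when `shared_share` of var(y) comes from x.

    The remaining 1 - shared_share of var(y) is independent error variance.
    """
    rng = np.random.default_rng(seed)
    x = rng.standard_normal(n)
    error = rng.standard_normal(n)
    y = np.sqrt(shared_share) * x + np.sqrt(1.0 - shared_share) * error
    return float(np.corrcoef(x, y)[0, 1])

if __name__ == "__main__":
    # Condition 1: a quarter of the variance is shared; Condition 2: almost none.
    for share in (0.25, 0.01):
        r = correlation_for_shared_share(share)
        print(f"shared variance {share:.2f}: r = {r:.3f}")
```

With a quarter of the variance shared the simulated r is close to 0.5, the square root of 0.25, which matches the order of magnitude reported for the primary-memory condition; with almost no shared variance r approaches zero.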

However, the studies that raised the argument related to the increasing WM-Gf correlation with increasing difficulty of Gf-tests rely on quite a different logic. These studies assume that the same mechanism drives variable y in both conditions. However, it is expected that in an easier condition the ceiling effect will arise, and because this will yield less variation in data, it will also yield a lower correlation between variables than will a more difficult condition in which ceiling effects will be reduced or absent (and the floor effect will still be low).

For example, investigating relations between intelligence, aging, and WM, Salthouse (1993, see also Salthouse and Pink, 2008; Salthouse, 2014) found WM-Gf correlations fairly stable for the Raven items that differed in difficulty:

Average solution accuracy varied considerably across the items examined, and it seems reasonable to hypothesize that at least some of the item variation might have been due to increased working memory demands. [...] The configuration of results [...] presents a challenge for interpretation. On the one hand, there is evidence of moderate to large relations between the measures of working memory and matrix reasoning performance, but on the other hand, the data indicate that these relations are no greater for difficult (low accuracy) than for easy (high accuracy) problems.

A [...] potential explanation is that much of the variation in item difficulty may be attributable to factors unrelated to working memory (pp. 181-182).

Similar arguments can be found in Unsworth and Engle (2005), who divided scores on the Raven-test items into four quartiles according to decreasing mean accuracy, and found constant correlations of the first three quartiles with a variant of the complex span task:

[...] the correlation between solution accuracy and a measure of working memory capacity should increase as the number of rules, goals, and/or sub-results on a given problem increases (given that there is enough systematic variability present). That is, items with low memory loads will not exceed even the capacity of low WM span participants and thus most individuals should get these problems right and there should be little systematic variability present. However, as memory load increases so will item discriminability and thus the item-WM span correlations will increase (p. 70).

[...] the results suggest that, for the most part, the [WM-Gf] correlations are fairly constant and do not vary systematically with variations in memory load [...] Taken together, the results of the present study strongly suggest that the number of goals or sub-results that can be held in memory does not account for the shared variance between working memory span measures and fluid intelligence. Thus, the results do not support the hypothesis [...] that the link between individual differences in working memory capacity and intelligence is due to differences in the ability to hold a certain number of items in working memory (p. 78).


Furthermore, Wiley et al. (2011) used constant point-biserial correlations between progressively more difficult items of the Raven-test and a variant of the complex span to conclude:

Our results [...] are consistent with previous findings in that neither the normative difficulty of RAPM items nor the number of rule tokens required for solution showed the positive relation with WMC that would be predicted by a rule/capacity explanation.

[...] these factors do not seem to be what drives the relation with WMC. [...] Thus, differences in the quality of executive function, and not capacity per se, may be responsible for the relationship between WMC and RAPM (p. 261).

All these arguments against the storage-capacity account of Gf have often been referred to in the literature. To give only one example from Vigneau et al. (2006):

A recent study by Unsworth and Engle (2005) showed that the relation between working memory capacity and the Raven seems to be rather constant across levels of difficulty and memory load.

This result is incompatible with the view that the number of rules or rule instances required to solve an item is central to the expression of individual differences on the Raven (p. 262).

Proponents of the storage-capacity account (Little et al., 2014) defended their approach by showing that the constant WM-Raven correlation in previous studies resulted from measurement errors and the generally low level of observed correlations between the Raven-test items/quartiles and WM scores. Having obtained more pronounced correlations of this kind, they showed that a relatively small but significant rise in the WM-Raven correlation can be observed when the Raven items become more difficult:

High overall accuracy lowers the item-wise correlation for the early items (i.e., the point-biserial correlation must be near zero if nearly everyone gets the item correct) resulting in an increasing slope across the entire test. For the later more difficult items, the participants who respond correctly have to come from the pool of participants who have higher Raven's scores and higher WMC.

Consequently, with a high overall correlation between WMC and Raven's, the point-biserial correlation between WMC and the most difficult Raven's items that have the lowest accuracy, must be higher than the point-biserial correlation between WMC and the easiest Raven's items (p. 6).

When there is a moderate to strong overall correlation between WMC and performance on the Raven's-test of fluid abilities, then the role of WMC becomes increasingly more important as item difficulty increases. [...] Our results are compatible with theoretical analyses of Raven's performance that appeal to working memory as a repository for rules and intermediate results (pp. 10-11).

From the above citations we can clearly see that whether or not the increase in difficulty of Gf-test items led to an increase in their correlation with WM has served as an important argument in the debate between attention-control and storage-capacity (and other) accounts. However, can the presence or lack of such an increase serve as an argument for or against the role of storage capacity in Gf? With a simple formal analysis, we aim to show that the answer to this question is definitely "no."

3. Analytical Inference of a Relationship Between Test Difficulty and Its Correlation with Another Test

3.1. General Assumptions

In general, the issue of the analysis of correlations between different types of data is quite complex. However, we can make a few assumptions that effectively simplify the reasoning. Some of the assumptions have no effect on the generalizability of our reasoning. Other assumptions yield a small effect, but are reasonable on the grounds of the empirical domain to which we refer.

1. All analyzed variables (independent variable, true test score and test outcome) display a mean equal to zero and a variance equal to one. This assumption is justified by the fact that the correlation estimator is scale independent and thus has no effect on reasoning.

2. All analyzed variables have the normal distribution.

3. A participant's ability does not depend on test difficulty.

4. Covariance of dependent and independent variables equals one. We assume so for the sake of simplicity of our analytical argument, but once our point is made, we show via numerical simulation that the same results still hold when this assumption is relaxed.

5. A test result is a linear function of ability and test difficulty unless a floor or ceiling effect arises (more details below).

6. The error term of the linear dependency between dependent and independent variables and the error term of the test result are independent.

We will refer to the following random variables:

1. The true test score Z, as it is defined on the ground of classical test theory (Guilford, 1954).

2. The independent variable X: the observable variable whose possible influence on the dependent variable is examined. We suggest that the true independent variable is not observable, and what is observable is the sum of the true independent variable V and some random noise $\zeta$ (see Figure 1). However, since the path from X to Z is open, that is, the causal impact flows from X to Z (Pearl, 2009), we can treat variable X as possibly influencing the dependent variable.

3. The observable test result Y, which is the dependent variable.

Whenever we use an uppercase Latin letter (e.g., X) we refer to a random variable, whereas the same lowercase letter (e.g., x) refers to a particular observation derived from that variable.

3.2. Impact of Test Difficulty

FIGURE 1 | Relation between random variables discussed. Arrows indicate possible causal impact. White circles denote latent variables and gray circles denote observable variables. Lowercase Greek letters denote random noise.

The observable test result Y is a sum of the true score Z and error $\gamma$. We examine the relation between two variables: the independent variable X and the dependent test score Z. We assume there is a linear relation between X and Z:

$$Z = \alpha + \beta X + \delta,$$

where $\alpha$ and $\beta$ are linear coefficients and $\delta$ is error. Variables X and Y are observable while the true test score Z is latent.

The observed test result Y is the sum of the true test score Z and random noise $\gamma$:

$$Y = Z + \gamma.$$

Therefore, the relation between Y and X is:

$$Y = \alpha + \beta X + \delta + \gamma.$$

FIGURE 2 | Observed data Y for different values of floor f. See description in text. (Scatterplot panels of Y against X; the first panel is labeled "No floor".)

The linear coefficients $\alpha$ and $\beta$ are scale dependent. As the scale of the variables is of no interest here, we can assign to them the values zero and one, respectively. The error terms $\delta$ and $\gamma$ are independent and normally distributed and have a mean equal to zero. Thus, the sum $\varepsilon = \delta + \gamma$ of the error terms is also normally distributed and has a mean equal to zero. If the variance of $\delta$ equals $\sigma_\delta^2$ and the variance of $\gamma$ equals $\sigma_\gamma^2$, then the variance of $\varepsilon$ equals $\sigma_\delta^2 + \sigma_\gamma^2$. Thus, the relation between the observed test results and the independent variable can be rewritten as follows:

$$Y = X + \varepsilon. \tag{1}$$

The correlation between the set of observed results Y and the independent variable X influencing them is then negatively related to the error variance: for unit-variance X, $r_{X,Y} = 1/\sqrt{1 + \sigma_\varepsilon^2}$. It can be easily seen that when $\varepsilon = 0$ then $Y = X$ and $r_{X,Y} = 1$, and when $\sigma_\varepsilon^2 \to \infty$ then $r_{X,Y} \to 0$.
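The dependence of the correlation on the error variance in Equation (1) can be checked numerically. The sketch below is an illustration rather than the authors' code; it assumes a unit-variance X and normally distributed error, and the names `simulated_r` and `expected_r` are hypothetical helpers. The closed form used for comparison follows from cov(X, Y) = 1 and var(Y) = 1 + error variance.

```python
import numpy as np

def simulated_r(error_sd, n=200_000, seed=2):
    """Pearson r between X and Y = X + error, as in Equation (1)."""
    rng = np.random.default_rng(seed)
    x = rng.standard_normal(n)
    y = x + error_sd * rng.standard_normal(n)
    return float(np.corrcoef(x, y)[0, 1])

def expected_r(error_sd):
    """Closed form for unit-variance X: r = 1 / sqrt(1 + error variance)."""
    return 1.0 / np.sqrt(1.0 + error_sd**2)

if __name__ == "__main__":
    for sd in (0.0, 0.5, 1.0, 3.0):
        print(f"error SD {sd:.1f}: simulated r = {simulated_r(sd):.3f}, "
              f"expected r = {expected_r(sd):.3f}")
```

The correlation equals one without error and decays toward zero as the error variance grows, but the decay is not a straight line in the error variance.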

We adopted the following definition of test difficulty: when test difficulty increases, all results Y decrease proportionally to the difficulty increase. We also assume that there exists a floor value f of the test result, which is the minimal possible test score (see Figure 2). Thus, the extended relation between observed results and true scores takes the form

$$Y = \max(X + \varepsilon, f). \tag{2}$$

That implies that if any observation y had a value lower than f, then the value f would be assigned to this observation. Depending on the amount of floor effect, a bigger or smaller part of the data will equal f. The value of floor can be expressed in one of the two most convenient scales: as a standard score (in units of the distribution of X, where f is the minimal possible value of scores, e.g., zero correct responses) or as a proportion of values on floor. As the former scale is more suitable for further inferences we will apply that scale, but one can easily transform f values to the other scale using the normal distribution function. Importantly, although our argument refers to the floor effect and the increasing difficulty of a test, the very same argument symmetrically applies to the ceiling effect and the decreasing difficulty of the test.
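Equation (2) and the two scales for expressing the floor can be sketched as follows. This is a minimal simulation under stated assumptions, not the authors' procedure: X and the error are normal, the function and parameter names are hypothetical, and the conversion to a proportion uses the normal distribution function with the standard deviation of X plus error, which is an assumption made here for the case with non-zero error.

```python
import math
import numpy as np

def floored_correlation(floor, error_sd=0.0, n=200_000, seed=3):
    """Return (Pearson r between X and floored Y, share of data on floor)."""
    rng = np.random.default_rng(seed)
    x = rng.standard_normal(n)
    raw = x + error_sd * rng.standard_normal(n)
    y = np.maximum(raw, floor)  # Equation (2): values below f are set to f
    share_on_floor = float(np.mean(raw <= floor))
    return float(np.corrcoef(x, y)[0, 1]), share_on_floor

def floor_to_proportion(floor, error_sd=0.0):
    """Convert a floor in standard-score units to the expected share on floor."""
    sd = math.sqrt(1.0 + error_sd**2)  # SD of X + error (assumed normal)
    return 0.5 * (1.0 + math.erf(floor / (sd * math.sqrt(2.0))))

if __name__ == "__main__":
    for f in (-3.0, -1.0, 0.0, 1.0):
        r, share = floored_correlation(f)
        print(f"floor {f:+.1f}: r = {r:.3f}, on floor = {share:.3f}, "
              f"expected share = {floor_to_proportion(f):.3f}")
```

In this sketch the correlation stays close to its no-floor value while only a small share of observations sits on the floor, and drops more visibly once a large share of the data has been replaced by f. Replacing `np.maximum` with `np.minimum` gives the symmetric ceiling case.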

3.3. Analytical Inference

The Pearson product-moment correlation coefficient r between two random variables X and Y is the quotient of the covariance of the variables and the product of their standard deviations:

$$r_{X,Y} = \frac{\mathrm{cov}(X, Y)}{\sigma_X \sigma_Y}$$

or alternatively:

$$r_{X,Y} = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_{i=1}^{n} (x_i - \bar{x})^2}\,\sqrt{\sum_{i=1}^{n} (y_i - \bar{y})^2}} \tag{3}$$

where $x \in X$, $y \in Y$, and $n = |X| = |Y|$ is the number of observations. The covariance cov(X, Y) between two jointly distributed random variables X and Y is a measure of how strongly a value (relative to the expected value) of one variable is linked to the value (also relative) of the second variable, and it is defined as follows:

$$\mathrm{cov}(X, Y) = E\bigl[(x - E(x))(y - E(y))\bigr],$$

where $E(x)$ means the expected value of x. So the covariance is the stronger the more the observations vary from their respective means jointly on both scales, and the weaker the more the observations take a high value on one scale and a low value on the other one. As a change in test difficulty does not influence the participants' ability (Assumption 3), and the standard deviation of this ability equals one, the factor $1/\sqrt{\sum_{i=1}^{n}(x_i - \bar{x})^2}$ in Formula (3) remains constant and equal to one, so we can ignore it.
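Formula (3) translates directly into code. The sketch below is illustrative only; the function name `pearson_r` is an assumption, and NumPy's built-in `np.corrcoef` would normally be used instead of a hand-written version.

```python
import numpy as np

def pearson_r(x, y):
    """Pearson correlation computed term by term as in Formula (3)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    dx = x - x.mean()  # deviations of x from its mean
    dy = y - y.mean()  # deviations of y from its mean
    numerator = np.sum(dx * dy)
    denominator = np.sqrt(np.sum(dx**2)) * np.sqrt(np.sum(dy**2))
    return float(numerator / denominator)

if __name__ == "__main__":
    rng = np.random.default_rng(4)
    x = rng.standard_normal(1_000)
    y = x + rng.standard_normal(1_000)
    print("Formula (3):      ", round(pearson_r(x, y), 6))
    print("np.corrcoef:      ", round(float(np.corrcoef(x, y)[0, 1]), 6))
    print("after shifting Y: ", round(pearson_r(x, y - 5.0), 6))
```

Shifting all values of Y by a constant leaves the result unchanged, because only deviations from the means enter the formula.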

C o rrelatio n does n o t d e p en d o n d a ta in tercep t; so as lo n g as th ere is n o floor, values Y can be in creased o r decreased w ith o u t an y change in c o rrelatio n (in fact th ey can be m an ip u la te d in any lin ear way). Such a decrease in value Y w o u ld be w h at w e have defined as th e effect o f an in crease in difficulty. N evertheless, the increase in difficulty w o u ld change th e d istrib u tio n o f Y w hen this change causes som e o f th e d ata p o in ts to re a c h floor. W h e n som e o f th e test results are on floor, w hereas th e in d e p e n d e n t variable rem ain s u n ch an g ed , th e ir co rrelatio n m ay alter in a n o n -lin e a r way. So, le t us exam ine w h a t im p act th e relative change o f f w ould have on F o rm u la (3).

When floor is introduced, the distribution of test results Y becomes a mixture of two distributions: a truncated normal distribution (a normalized normal distribution with removed values below the threshold) with threshold f, and a degenerate (deterministic) distribution that includes only value f. From now on, the test difficulty will be defined as raising floor (and leaving Y unchanged) instead of decreasing Y (and leaving floor unchanged). These two approaches are fully equivalent, but the former is more convenient.

Proportion p of results belonging to the truncated normal distribution is a fraction of the normal distribution for values not lower than f:

$$p(f) = \int_{f}^{\infty} \varphi(y)\, dy,$$

or alternatively:

$$p(f) = \frac{1}{2}\left[1 - \operatorname{erf}\!\left(\frac{f}{\sqrt{2}}\right)\right],$$

where φ(x) is the density of the standard normal distribution at x and erf is the error function, which is a non-elementary function related to the cumulative normal distribution.

The mean value $\bar{y}$ of observed test result Y is an expected value of the mixture of these two distributions with proportions p(f) and 1 − p(f),

$$\bar{y}(f) = \left(\int_{f}^{\infty} \varphi(y)\, y\, dy\right) + f\,\big(1 - p(f)\big).$$

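Whereas the standard deviation $\sigma_Y$ of observed test result Y equals the square root of the variance of the same mixture:

$$\sigma_Y(f) = \sqrt{\left(\int_{f}^{\infty} \varphi(y)\,\big(y - \bar{y}(f)\big)^2\, dy\right) + \big(1 - p(f)\big)\big(f - \bar{y}(f)\big)^2}.$$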

Let us now consider covariance. As we assumed earlier, covariance of the variables analyzed equals one (as long as there is no floor effect). Therefore, the covariance cov(X, Y, f) of the joint distribution of Y and X for a given value of f is the sum of the weighted expected values of the products of differences between the values of the two distributions included in Y and their common mean $\bar{y}$:

$$\operatorname{cov}(X, Y, f) = \int_{-\infty}^{f} \varphi(y)\, y\, \big(f - \bar{y}(f)\big)\, dy + \int_{f}^{\infty} \varphi(y)\, y\, \big(y - \bar{y}(f)\big)\, dy.$$

Note that because of the unit covariance X is entirely known given Y, so there is no need to include x values in the formula [in fact the expression cov(X, Y, f) could be replaced by cov(Y, f)].

Figure 3 presents a plot of three functions [cov(X, Y, f), $\sigma_Y(f)$, and $r_{X,Y}(f)$] over floor. Values of floor are in standard deviations of X. Floor f = −3 means that all values of Y below −3 were replaced by the value of f; analogously, f = 0 means that half of the values of Y were replaced by 0.

In consequence, it can be easily seen that (a) the strength of correlation between two variables is a function of the amount of floor/ceiling effect for the dependent variable. However, (b) for any reasonable proportion of results on the floor/ceiling the correlation decreases relatively little in comparison to the case when no floor/ceiling effect is present. For example, for half of the results on floor/ceiling, which in psychology can be considered quite a strong floor/ceiling effect, $r_{X,Y}$ drops from 1.0 to 0.86. The remaining part of the paper includes empirical tests of these two analytically derived predictions using both numerically generated and actually observed WMC and Gf variables.
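The three functions of floor discussed above can also be evaluated numerically. The sketch below is an illustration under stated assumptions, not the authors' code: it covers only the error-free case (X = Y, standard normal), replaces the integrals with closed forms obtained by integrating them for the standard normal density, and uses hypothetical function names.

```python
import math

def phi(z):
    """Density of the standard normal distribution."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

def p_above(f):
    """Proportion of results not lower than floor f, written with erf."""
    return 0.5 * (1.0 - math.erf(f / math.sqrt(2.0)))

def floor_effect(f):
    """Return (covariance, SD of floored Y, correlation) for floor f."""
    p = p_above(f)
    # The integral of phi(y) * y over [f, inf) equals phi(f).
    mean_y = phi(f) + f * (1.0 - p)
    # The integral of phi(y) * y^2 over [f, inf) equals f * phi(f) + p.
    second_moment = f * phi(f) + p + f * f * (1.0 - p)
    sd_y = math.sqrt(second_moment - mean_y ** 2)
    # With E(X) = 0, the two covariance integrals add up to p(f).
    cov = p
    return cov, sd_y, cov / sd_y

cov0, sd0, r0 = floor_effect(0.0)  # half of the results on floor
```

For f = 0 this gives a covariance of 0.5 and a correlation of about 0.86, in line with the example in the text; for f = −3 the correlation stays very close to 1.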

4. Numerical Simulations

Until now, we have considered only an idealized case of error term equal to zero, that is, the case when X = Y. Let us focus on a more general case in which the amount of variance in the error term influences correlation. Below, we analyze the change in correlation due to an increasing floor value in data generated by a numerical simulation.

Obviously, the correlation between X and Y is lower than unity when the error (ε) is larger than zero (see Formula 1). The larger the error, the less values Y become determined by values X, hence the covariance and correlation are lower. So, let us examine the influence of floor (f) on correlation ($r_{X,Y}$) for different values of error.

In the present simulations, values Y were determined by a relationship depicted in Formula (1). Values X were drawn from the standard normal distribution. Values ε were drawn from normal distributions with the mean equal to zero, and variances systematically differing between 0 and 7. When floor is virtually absent (f = −3), these values yield correlations within the range between 1 and 0.18. Every sample comprised 100,000 observations. To determine the influence of floor on r, values of Y in every sample were transformed according to Formula (2) (one hundred values of floor were used). Figure 4 shows the changes in correlation over floor f and error ε.
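A simulation of this kind can be sketched in a few lines. The code below is a hedged illustration, not the original simulation script: Formulas (1) and (2) are not restated in this section, so the sketch assumes that Formula (1) has the form Y = X + ε and that Formula (2) replaces every result below the floor with the floor value. The function names, the seed, and the coarse floor grid are hypothetical choices.

```python
import math
import random

def pearson_r(xs, ys):
    """Pearson correlation computed from deviation sums, as in Formula (3)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cross = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return cross / math.sqrt(ss_x * ss_y)

def simulate(error_sd, floors, n=100_000, seed=1):
    """Correlations between X and floored Y, one per floor value."""
    rng = random.Random(seed)
    x = [rng.gauss(0.0, 1.0) for _ in range(n)]
    # Assumed form of Formula (1): Y = X + error.
    y = [xi + rng.gauss(0.0, error_sd) for xi in x]
    # Assumed form of Formula (2): results below the floor are set to the floor.
    return [pearson_r(x, [max(yi, f) for yi in y]) for f in floors]

# Coarse grid of 13 floors from -3 to 3 (the text reports one hundred values).
floors = [-3.0 + 0.5 * k for k in range(13)]
# Error variance 3 gives a no-floor correlation near 0.5.
curve = simulate(math.sqrt(3.0), floors, n=20_000)
```

Calling `simulate` with several values of `error_sd` yields one curve per error level, which is the kind of family of curves that Figure 4 displays.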

FIGURE 3 | Covariance cov(X, Y, f), standard deviation $\sigma_Y(f)$, and correlation $r_{X,Y}(f)$ as functions of floor f (in units of standard deviation of X, from −3 to 3; 0 = $\bar{x}$). See description in text.

FIGURE 4 | Change in correlation over floor f and error ε (legend: ε = 0, ε = 3.5, ε = 7). The error value differs from zero (the darkest highest line) to five standard deviations of X (the lightest lowest line).

We can see that the lower the base correlation (i.e., when floor is absent), the less it decreases with an increasing floor level. For example, for the initial correlation equal to 0.5, which in psychology is a moderately strong relationship between variables, and for the floor level reaching 25 percent of cases (f = −0.67), which should also be considered a moderate floor effect, the correlation decreases from 0.5 to only 0.45. The same initial correlation (0.5) does not even reach 0.25 (less than a 50% decrease) until virtually all Y values reach floor (f = 2.61). Thus, it can be concluded that the correlation formula is highly immune to even large floor/ceiling effects, and moderate differences in floor/ceiling between task conditions need not result in any significant differences in correlation between these conditions, given that an overall correlation between WM- and Gf-tests is relatively low, or the sample is relatively small.

5. Empirical Verification of the Simulation Results

The existing evidence on changes in the WM-Gf correlations when the floor or ceiling effects occur is mixed. Some studies did not find any such changes (Salthouse, 1993, 2014); some noted a slight decrease in correlations with increasing test difficulty (Unsworth and Engle, 2005; Wiley et al., 2011), whereas one study (Little et al., 2014) suggested a moderate increase. In this section, we aim to resolve this discrepancy by means of a theoretically driven reanalysis of the results of our two recently published studies (Chuderski, 2013, 2015), administered to a large sample of participants (N = 939 in total). First, we analyzed data for N = 347 (from Chuderski, 2015) regarding the standard measure of WMC, namely the three variants of the complex span task (unfortunately, they were not used in the other study), as well as data from the Raven Advanced Progressive Matrices. Second, we used combined samples of N = 347 and N = 592 (the latter from Chuderski, 2013) and looked into data from the Raven as well as two alternative measures of WMC (strongly correlating with complex spans; see Chuderski, 2015): the short-term memory task and the relational integration task (which were used in both studies). Most participants were allowed 60 min to complete the Raven (except for 288 people who were given 40 min). Due to relatively untimed testing, the participants had a chance to attempt most items of the Raven. The overall correlation between the Raven and the mean from z-scores in three complex spans was r = 0.51 (p < 0.001; N = 347). Similar was the correlation between the Raven and the mean z-score of the short-term memory and relational integration tasks (r = 0.43, p < 0.001; N = 939). For participant data, procedure, descriptive statistics, reliabilities, and the correlations between tasks, see Chuderski (2013, 2015).

5.1. Raven-test

Each of the 36 items of Raven's Advanced Progressive Matrices (Raven et al., 1983, Section 4: Advanced Progressive Matrices) consists of a three-by-three matrix of figural patterns in which the bottom-right pattern is missing; subjects must choose a potential match for the missing pattern from eight response options (one option is correct). The task is to discover the rules governing the configuration of the patterns and apply them to select the single correct response option.

5.2. Complex Span Tasks

Adapted versions of three complex span tasks: the operation span, reading span, and symmetry span tasks were applied. Each task required participants to memorize a sequence of three to seven (i.e., set size) stimuli. Each stimulus, out of nine possible stimuli for that task, was presented for 1.2 s. Each stimulus was followed by a simple decision task, presented until a response was given, but for a maximum of 9 s. After two two-stimuli training trials, three trials for each set size (in increasing order) were presented in each complex span task. The operation span task analog required the memorization of letters, whilst deciding with a mouse button if an intermittent simple arithmetical equation (e.g., "2 × 3 − 1 = 5") was correct. The modified reading span task consisted of memorizing digits, whilst checking if letter strings (e.g., "EWZTE," "KTAN") began and ended with the same letter. The spatial span task involved memorizing locations of a red square in the 3 × 3 matrix, whilst deciding which of two presented bars was larger. The response procedure in each task consisted of a presentation of as many 3 × 3 matrices as was a particular set size, in the center of the computer screen, from left to right. Each matrix contained the same set of all nine possible stimuli for a given task. A participant was required to point with the mouse at those stimuli that had been presented in a sequence, in the correct order. Only a choice that matched both the identity and ordinal position of a given stimulus was taken as the correct answer. The dependent variable for each complex span task was the proportion of correctly pointed stimuli to all stimuli in the task.
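The scoring rule described above (a response counts only if both identity and ordinal position match) can be sketched as follows. This is an illustrative reading of the rule, not the original task software; the function name and the example trials are hypothetical.

```python
def span_score(presented_trials, recalled_trials):
    """Proportion of stimuli recalled with correct identity and serial position."""
    correct = 0
    total = 0
    for presented, recalled in zip(presented_trials, recalled_trials):
        total += len(presented)
        # A choice is correct only if the same stimulus sits at the same position.
        correct += sum(1 for p, r in zip(presented, recalled) if p == r)
    return correct / total

# Hypothetical responses for two trials of the operation span analog.
presented = [["K", "T", "R"], ["B", "L", "M", "S"]]
recalled = [["K", "R", "T"], ["B", "L", "M", "S"]]
score = span_score(presented, recalled)  # 5 of 7 stimuli in the right position
```

In the first trial the letters T and R are recalled but swapped, so only one of its three stimuli is credited.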

5.3. Short-term Memory Task

A variant of an array-comparison task was used that consisted of 90 trials. On each trial a virtual 4 × 4 array was filled with five to nine stimuli, picked from a set of ten Greek symbols (e.g., α, β, γ, and so on), then followed by a black square mask of the same size as the array, presented for 1.2 s, and then another array was shown. In a random 50% of trials, the second array was identical to the first; in the remaining trials the second array differed from the first by exactly one item in one position, which was always a new item (not a duplicate of an item from another position). The task was to press one of two response keys to indicate whether the highlighted item was the same or different in the two arrays. The task was self-paced.

5.4. Relation Integration Task

A no-memory version of the alphanumeric monitoring task, originally devised by Oberauer et al. (2008), was used. The stimulus for each trial on the task consisted of a 3 × 3 array of syllables. Participants were asked to detect whether any of the rows or columns consisted of three syllables ending with the same letter. The array could either include one of the specified configurations, in which case participants were required to press the space key to indicate that they had detected it, or contain none of them. Trials lasted 5.5 s and were followed by a 0.1 s blink separating subsequent arrays. There were 80 test trials.
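The target configuration in this task is easy to state programmatically. The sketch below only illustrates the detection rule; the function name and the syllable arrays are hypothetical and are not the study's stimuli.

```python
def has_relation(array):
    """True if any row or column of a 3 x 3 array holds three syllables
    that end with the same letter."""
    rows = [list(row) for row in array]
    cols = [list(col) for col in zip(*array)]
    return any(len({syllable[-1] for syllable in line}) == 1
               for line in rows + cols)

# Hypothetical arrays: the first column of `target` ends in K three times.
target = [["BAK", "TOS", "NIM"],
          ["LEK", "RUD", "PAF"],
          ["SOK", "MIG", "DEL"]]
lure = [["BAK", "TOS", "NIM"],
        ["LER", "RUD", "PAF"],
        ["SOK", "MIG", "DEL"]]
```

Here `has_relation(target)` is true and `has_relation(lure)` is false, so only the first array would call for a key press.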

5.5. Results and Discussion

The mean scores for consecutive Raven items spanned from M = 0.92 to M = 0.09 (the floor defined by the theoretical random level was 0.125). All WM tasks yielded normal distribution and virtually no floor/ceiling effects.

First, for the sake of comparison with previous studies (Salthouse, 1993; Wiley et al., 2011; Little et al., 2014), we calculated the point-biserial correlations between each item of the Raven-test (ordered according to decreasing accuracy) and WMC (the mean of z scores in three complex tasks), for N = 347. As for the easiest Raven items a substantial ceiling effect existed, which disappeared for the medium items, whereas for the most difficult items a visible floor effect showed up (see Table 1), the goal of the analysis was to show that, consistently with our theoretical conclusions, the item-wise correlation between Raven and WMC would be increasing from the easy up to medium items, but it would start decreasing from the medium down to difficult items (as defined by error rate on particular items). However, in line with our above analyses, relatively slight (though significant) increases and decreases in correlation were expected.
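For readers who wish to reproduce this kind of item-wise analysis on their own data, a point-biserial correlation can be computed as sketched below. This is an illustration with hypothetical names and toy data, not the analysis script used here. The point-biserial coefficient is numerically identical to the Pearson correlation between the 0/1 item score and the continuous score.

```python
import math

def point_biserial(item_correct, wmc):
    """Point-biserial r between a 0/1 item score and a continuous WMC score.
    Undefined when every participant passes or every participant fails the item."""
    n = len(wmc)
    passed = [w for c, w in zip(item_correct, wmc) if c == 1]
    failed = [w for c, w in zip(item_correct, wmc) if c == 0]
    mean_all = sum(wmc) / n
    # Population SD (divisor n) is the form that matches the Pearson coefficient.
    sd_all = math.sqrt(sum((w - mean_all) ** 2 for w in wmc) / n)
    p = len(passed) / n
    mean_diff = sum(passed) / len(passed) - sum(failed) / len(failed)
    return mean_diff / sd_all * math.sqrt(p * (1.0 - p))

# Hypothetical data for one item: accuracy (1 = correct) and WMC z-scores.
item = [1, 1, 0, 1, 0, 0]
wmc = [1.2, 0.4, -0.3, 0.8, -1.1, -1.0]
r_pb = point_biserial(item, wmc)
error_rate = 1.0 - sum(item) / len(item)  # item difficulty used for ordering
```

Repeating the computation for every item and pairing each `r_pb` with its `error_rate` gives the item-level data to which the regression models discussed next are fitted.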

There was no significant linear dependency between error rate on a Raven item and its correlation with WMC score, F(1, 34) = 0.4, p = 0.53 (see Figure 5). However, the segmented regression including the breakpoint (i.e., different linear coefficients before and after the breakpoint) that optimized the fit revealed a significant non-linear relation between difficulty of the Raven-test item and WMC score, adjusted R² = 0.17 (see Table 2 for parameters, and Figure 5 for the fitted values). An ANOVA test indicated that the better fit of segmented regression over linear regression compensated for the larger complexity of the former, F(2) = 4.73, p = 0.02. Correlation between error rate and the Gf-WMC correlation equaled r = 0.46 for error rate not higher than the breakpoint of 0.44 (n = 21), while it equaled r = −0.52 for error rate above this breakpoint (n = 15). Also, a second degree polynomial model fitted better than a first degree polynomial model, F(1) = 134.68, p < 0.001 (adjusted R² = 0.14 and −0.02, respectively), and its predictions closely matched the predictions of the segmented regression (to the extent of difference in their shapes; see Figure 5).
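The model comparison described above can be illustrated with a small self-contained sketch. It is not the procedure used for the reported analyses: the breakpoint is found here by a plain grid search over a continuous hinge model, which is a swapped-in technique, the data are synthetic, and all names are hypothetical.

```python
import math

def ols_rss(columns, y):
    """Residual sum of squares of a least-squares fit of y on the given
    predictor columns plus an intercept (normal equations, Gauss-Jordan)."""
    design = [[1.0] + [col[i] for col in columns] for i in range(len(y))]
    k = len(design[0])
    a = [[sum(row[i] * row[j] for row in design) for j in range(k)]
         + [sum(row[i] * yi for row, yi in zip(design, y))] for i in range(k)]
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        for r in range(k):
            if r != col:
                factor = a[r][col] / a[col][col]
                a[r] = [v - factor * w for v, w in zip(a[r], a[col])]
    beta = [a[i][k] / a[i][i] for i in range(k)]
    return sum((yi - sum(b * v for b, v in zip(beta, row))) ** 2
               for row, yi in zip(design, y))

def best_breakpoint(x, y, candidates):
    """Grid search for c in the model y = b0 + b1*x + b2*max(0, x - c)."""
    fits = [(ols_rss([x, [max(0.0, xi - c) for xi in x]], y), c)
            for c in candidates]
    rss, c = min(fits)
    return c, rss

# Synthetic item-level data: 36 error rates and an inverted-V pattern.
x = [i / 35 for i in range(36)]
y = [0.15 + 0.25 * xi - 0.70 * max(0.0, xi - 0.45) + 0.002 * math.sin(7 * i)
     for i, xi in enumerate(x)]

candidates = [0.10 + 0.01 * k for k in range(81)]  # breakpoints 0.10 to 0.90
c_hat, rss_seg = best_breakpoint(x, y, candidates)
rss_lin = ols_rss([x], y)
rss_quad = ols_rss([x, [xi ** 2 for xi in x]], y)

n = len(x)
# Nested-model F statistics: extra parameters above, residual df below.
f_seg_vs_lin = ((rss_lin - rss_seg) / 2) / (rss_seg / (n - 4))
f_quad_vs_lin = ((rss_lin - rss_quad) / 1) / (rss_quad / (n - 3))
```

The segmented model spends two parameters more than the linear one (the breakpoint and the change in slope), which is why its F statistic has two numerator degrees of freedom, whereas the quadratic model adds only one. Candidate breakpoints must lie strictly inside the range of the error rates, otherwise the hinge column becomes redundant and the fit is not defined.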

However, because each participant gave only one response to each Raven's item, we were not able to directly estimate the amount of floor/ceiling effects for single items. Instead, we computed ceiling/floor effect for nine bins of items (four items in each) constructed according to increasing difficulty of items.

As can be seen in Figure 6, the observed correlations with WMC differed between the bins. The weakest correlation for the first bin was significantly weaker than the strongest correlation for the sixth bin, z = −1.86, p = 0.031, whereas the correlation for the latter bin was marginally stronger than the correlation for the last bin, z = 1.55, p = 0.06.

TABLE 1 | Accuracy on the consecutive Raven-test items [with the 95% confidence intervals].

Item  Accuracy             Item  Accuracy             Item  Accuracy
1     0.87 [0.85, 0.89]    13    0.64 [0.60, 0.67]    25    0.50 [0.47, 0.54]
2     0.92 [0.90, 0.93]    14    0.80 [0.77, 0.83]    26    0.43 [0.40, 0.46]
3     0.90 [0.88, 0.92]    15    0.76 [0.73, 0.79]    27    0.39 [0.36, 0.43]
4     0.86 [0.84, 0.88]    16    0.78 [0.76, 0.81]    28    0.27 [0.25, 0.30]
5     0.86 [0.84, 0.88]    17    0.73 [0.70, 0.76]    29    0.24 [0.21, 0.27]
6     0.91 [0.89, 0.93]    18    0.67 [0.64, 0.70]    30    0.39 [0.36, 0.42]
7     0.89 [0.87, 0.91]    19    0.73 [0.70, 0.76]    31    0.34 [0.31, 0.37]
8     0.83 [0.80, 0.85]    20    0.69 [0.66, 0.72]    32    0.30 [0.27, 0.33]
9     0.91 [0.89, 0.93]    21    0.58 [0.55, 0.62]    33    0.37 [0.34, 0.40]
10    0.82 [0.79, 0.84]    22    0.55 [0.52, 0.58]    34    0.26 [0.23, 0.29]
11    0.91 [0.89, 0.93]    23    0.58 [0.55, 0.61]    35    0.35 [0.32, 0.38]
12    0.86 [0.83, 0.88]    24    0.41 [0.38, 0.44]    36    0.09 [0.07, 0.11]

FIGURE 5 | Relation between the difficulty of a Raven-test item (error rate) and the correlation between accuracy on that item and WMC (measured with the complex span tasks). The solid line represents the segmented regression line. The dashed line reflects the quadratic regression line. N = 347.

Furthermore, for each bin we used the proportion of scores with either four or zero correct responses as the proportion of either ceiling or floor effect for that bin. We used the proportion of the dominant effect in each bin as the measure of floor/ceiling effect level. Using our formal model, with such a measure we were able to predict specific values of correlation between accuracy in the consecutive Raven bins and WMC (as a reminder: in the model the correlation with WMC in each bin depends on both the amount of floor/ceiling for that bin and the overall correlation between the variables in the no-effect case; see Figure 4). As we did not know the no-effect correlation, we fitted its value (it was the only parameter fitted). As a result, the match between observed and predicted correlation was very good, RMSD = 0.028, r = 0.82, χ²(8) = 3.31, p = 0.91 (see Figure 6).
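The bin-level measure of floor/ceiling effect can be sketched as follows. The code is an illustration of the rule stated above (scores of four or zero correct responses within a four-item bin), with a hypothetical function name and toy response data; it does not include the model-based prediction of correlations.

```python
def bin_floor_ceiling(responses, bin_size=4):
    """responses: one list of 0/1 item scores per participant, with items
    ordered by increasing difficulty. Returns, for each bin, a tuple
    (proportion at ceiling, proportion at floor, dominant proportion)."""
    n_items = len(responses[0])
    result = []
    for start in range(0, n_items, bin_size):
        sums = [sum(person[start:start + bin_size]) for person in responses]
        ceiling = sum(1 for s in sums if s == bin_size) / len(sums)
        floor = sum(1 for s in sums if s == 0) / len(sums)
        result.append((ceiling, floor, max(ceiling, floor)))
    return result

# Hypothetical data: four participants, eight items (two bins of four).
responses = [
    [1, 1, 1, 1, 1, 0, 0, 0],
    [1, 1, 1, 1, 0, 0, 0, 0],
    [1, 1, 0, 1, 0, 0, 0, 0],
    [1, 1, 1, 1, 0, 1, 0, 0],
]
levels = bin_floor_ceiling(responses)
```

In this toy example the easy bin is dominated by ceiling (three of four participants solve all four items) and the hard bin by floor (two of four participants solve none).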

We also repeated the above described single-item and bin analyses for the 939-people data, and the short-term memory and relation integration tasks (i.e., for the mean z-score on these two latter tasks). For the single items, exactly as in the previous analysis, the segmented regression fitted the data much better than linear regression, F(2) = 6.10, p = 0.006 (adjusted R² = 0.22 and −0.013, respectively; see Table 3 for parameters, and Figure 7 for the fitted values). The breakpoint was detected at error rate equaling 0.42. Error rates before that breakpoint yielded a positive correlation with the Gf-WMC correlation coefficients (r = 0.53), whereas error rates after the breakpoint correlated negatively with the Gf-WMC correlation coefficients (r = −0.50). Also, similarly as for the complex tasks, a second degree polynomial model (adjusted R² = 0.23) fitted better than a first degree polynomial model, F(1) = 11.71, p = 0.002, and it gave the overall predictions quite compatible with the segmented regression output.

Finally, the analysis of the bins, constructed in the same way as for the N = 347 sample, but now for the N = 939 sample, yielded a non-linear relation between bin difficulty and its correlation with WMC, now with the latter variable measured in a different way. We predicted these data using our formal model, and received only a slightly worse match to the observed 939-people data than in the case of the N = 347 sample, RMSD = 0.026, r = 0.57, χ²(8) = 6.87, p = 0.55 (see Figure 8), which might have resulted from the overall weaker correlation between Gf and WMC, when the latter was measured with the short-term memory and relational integration tasks.

Additionally, we were interested in testing the validity of another claim supposed to reject the storage capacity theory, regarding differences between item-wise Raven and WMC correlations. Specifically, Wiley et al. (2011) reported much stronger item-wise correlations with the complex span task if the set of rules for a particular Raven item was used for the first time throughout the test (and thus possibly placed larger demands on executive control or rule abstraction), than when it was repeated from one of the previous items. Consequently, we used item coding from Wiley et al., resulting in 18 items coded as new-rule items, and 18 as old-rule ones (as we only used Raven's Set II). We used the 347-people dataset for the sake of compatibility of studies (i.e., both analyses pertained to Raven correlations with the complex span tasks). In our data, the two correlations did not differ significantly, z(34) = 1.41, p = 0.17 (mean r = 0.223 vs. r = 0.197, for the new- vs. old-rule items, respectively). Our data match the results of Little et al. (2014), who also failed to find any differences in WM predictive power between the new- and old-rule items.

TABLE 2 | The 95% confidence intervals for parameters of the regression analyses for the correlation of the Raven items with WMC (calculated from the three complex span tasks; N = 347) over difficulty of the items.

Regression                     Parameter                    Point estimation value   95% confidence interval
Linear [F(1, 34) = 0.4]        Intercept                    0.21                     [0.17, 0.24]
                               Slope                        0.02                     [−0.05, 0.1]
Quadratic [F(2, 33) = 3.78]    Intercept                    0.16                     [0.11, 0.21]
                               Linear term coefficient      0.39                     [0.1, 0.68]
                               Quadratic term coefficient   −0.43                    [−0.76, −0.1]
Segmented [F(3, 32) = 9.95]    Intercept                    0.17                     N/A
                               Breakpoint                   0.44                     [0.14, 0.61]
                               Slope before breakpoint      0.23                     [0.06, 1.79]
                               Slope after breakpoint       −0.46                    [−0.68, −0.04]

Point estimation values are values of parameters fitted by regression which do not take into account the error of estimation.

FIGURE 6 | Comparison of three measures as a function of the consecutive bins increasing in difficulty. Left: the observed (solid line) and predicted (dotted line) bin correlation with WMC (measured with the complex span tasks). Right: the proportion of floor/ceiling effect (dashed line). N = 347.

Moreover, in a recent design that prevented several confounds found in Wiley et al. and Little et al. studies (as well as in our study), Harrison et al. (2014) found a lower WMC-Gf correlation for the new-rule items than for old-rule items. Thus, Wiley et al.'s (2011) results seem to be an artifact, likely resulting from their use of only one variant of the complex span task (and, thus, a large amount of task-specific variance), as well as an overall low WMC-Gf correlation observed.

6. Conclusion

This paper aimed to investigate the commonly adopted assumption within the working memory and fluid intelligence research, which holds that if the storage capacity underlay the WM-Gf correlation, then such a correlation should increase with an increasing difficulty of a Gf-test, because more difficult test items are more sensitive to individual differences in WM. As often no such link is observed, on that basis the storage-capacity account is rejected, and other accounts (e.g., ones referring to executive control or processing speed) are favored. In contrast, when this link is found, it is used to support the storage-capacity account. Therefore, the above claim yields important implications for the current theorizing on the cognitive basis of intelligence. Our formal analysis demonstrated that the above assumption is incorrect, and reasoning that is derived from this assumption can speak neither for nor against the storage capacity account (nor any other account).

Specifically, using both analytical inference and numerical simulations, we have shown that the WMC-Gf correlation primarily depends on the amount of floor/ceiling effect for particular items/bins. Thus, whether the item-wise WMC correlation of a given Gf-test increases or decreases (or remains unchanged) with an increasing difficulty (error rate) of its items depends on the overall difficulty of the test, as well as the distribution of difficulty across its items. For easy progressive tests, in which the early items yield huge ceiling, but the late items do not approach floor, the item-wise WMC-Gf correlation will indeed increase throughout the test. In contrast, for difficult tests, in which the early items locate between ceiling and floor, but the late items approach floor, the respective correlation will decrease. For tests such as Raven, whose items span from substantial ceiling to substantial floor, the quadratic relationship will be observed. Fully confirming the predictions of our theoretical model, we demonstrated this latter relationship empirically, using large samples and two alternative methods of WMC measurement. Finally, for tests whose items vary in difficulty, but neither the easier items approach ceiling nor the harder ones approach floor, no significant differences in item-wise correlation with WMC will be observed. Thus, the investigated claim which holds that if the storage-capacity account was true, then the WM-Gf test correlations should increase simply as the function of items difficulty, does not seem justified. Vice versa, even observing exactly such an increase does not automatically support the storage-capacity account. Therefore, no changes in correlations due to differences in WM/Gf load (and resulting floor/ceiling effects), or lack of them, can be used as an argument for or against any theory of WM and/or Gf.

TABLE 3 | Confidence intervals for parameters of the regression analyses for the correlation of the Raven items with WMC (calculated from the short-term memory and relation integration tasks; N = 939) over difficulty of the items.

Regression                     Parameter                    Point estimation value   95% confidence interval
Linear [F(1, 34) = 0.55]       Intercept                    0.19                     [0.16, 0.21]
                               Slope                        −0.019                   [−0.07, 0.032]
Quadratic [F(2, 33) = 6.22]    Intercept                    0.14                     [0.11, 0.17]
                               Linear term coefficient      0.31                     [0.11, 0.5]
                               Quadratic term coefficient   −0.38                    [−0.6, −0.15]
Segmented [F(3, 32) = 13.5]    Intercept                    0.15                     N/A
                               Breakpoint                   0.42                     [0.17, 0.66]
                               Slope before breakpoint      0.15                     [0.0036, 0.65]
                               Slope after breakpoint       −0.35                    [−0.46, −0.065]

Point estimation values are values of parameters fitted by regression which do not take into account the error of estimation.

FIGURE 7 | Relation between the difficulty of a Raven-test item (error rate; x-axis: error rates for consecutive RAPM items) and the correlation between accuracy on that item and WMC (measured by the short-term memory and relational integration task). The solid line represents the segmented regression line. The dashed line reflects the quadratic regression line. N = 939.

FIGURE 8 | Comparison of three measures as a function of the consecutive bins increasing in difficulty. Left: the observed (solid line) and predicted (dotted line) bin correlation with WMC (measured by the short-term memory and relational integration task). Right: the proportion of floor/ceiling effect (dashed line). N = 939.

Another problem, cogently noted by a reviewer, pertaining to derivations of theoretical conclusions from the item-wise analyses of Gf-tests, is related to the fact that to date all such analyses relied on tests which include a very limited pool of items, presented in a fixed order. Thus, a given relationship between WMC and the order/difficulty of a Gf-test item may be obscured by unknown peculiarities concerning that item (e.g., an awesome rule), or the fact that the item was presented at a particular position within a sequence (e.g., a late one, when a participant already became tired or subject to time pressure). Thus, a more correct way of testing the model presented here would be to use a test for which load-varied items are generated dynamically (i.e., their pool is large), and in a random order concerning their difficulty. However, as the use of such tests in literature is still rare (for some exceptions see Embertson, 1995; Primi, 2002; Arendasy et al., 2008), and the present study was primarily devoted to the critical evaluation of existing studies on the item-wise WMC/Gf analyses (which most widely used Raven APM), here we also focused on the Raven. However, the future testing of our model against data from a dynamically generated Gf-test (data
