
Sławomir Pasikowski

Pomeranian University in Słupsk

The estimation theory as a basis of Educational Value Added methodology in assessing the quality of school education

ABSTRACT

The widespread adoption of the idea of Educational Value Added (EVE) meets in Poland with many voices of both support and opposition within the same environment, i.e. among teachers and academics. A dispassionate evaluation of the issues makes it possible to look in a less biased way at the arguments presented by the proponents and the opponents of assessing and predicting learning outcomes by means of EVE. Much of the criticism against EVE and the failures in its application, but also excessive confidence in it, is rooted in unfamiliarity with its fundamentals, primarily based on the Estimation Theory.

Keywords: estimation theory, EVE, Educational Value Added

Contrary to what it may appear, this article will not be devoted to EVE itself. Its focus will be on one essential element on which this concept is based, i.e. its operational sense. In the era of the exploration of measurable learning outcomes and of the ability to monitor the contribution of school education, solutions which make use of the apparatus of mathematical analysis are becoming increasingly popular. EVE and the methodology of assessing it are an example of such a practice (Żółtak 2013). This method has gained many supporters, but it is not devoid of opponents. Although capable of arousing controversy,

International Forum for Education, 2014, No.7


it is widely applied in the education system at the lower and upper secondary level. Its popularity may be rooted in its aptitude for capturing global processes and for making comparisons and generalisations on a national scale. However, such potential is only accessible when large data sets are observed and when the features of interest to researchers are measured by means of the same tool, characterised by a set of satisfactory psychometric properties established before the measurement. This article looks at the conditions which must be fulfilled in order to estimate the effects of education, as well as at the role played by the Estimation Theory.

There is a frequent question of what pupils, classes or schools look like within a wider set. What is the average level of achievement that may serve as a reference point for particular pupils or for the learning outcomes of particular schools? What percentage of pupils reach a certain threshold? What proportion of pupils or schools exceeds this threshold, and how many perform below the target level? And finally, how can these questions be answered in the absence of access to all units within a given population? Observation of only a fragment of it, labelled as a sample, makes it possible for researchers to translate findings at the micro level into generalised conclusions. A credible answer to these questions may be formulated on the basis of the Estimation Theory.

The Estimation Theory, next to the Theory of Testing Statistical Hypotheses, is included within the discipline of mathematical statistics, whose main aim is to generalise from samples onto populations. The object of the Estimation Theory is to estimate population parameters, such as the arithmetic mean, fraction and standard deviation, but also correlation coefficients or parameters of regression equations. These estimates are based on data obtained from a fragment of the total population, i.e. a sample. Estimation has a point or an interval nature: the former when a specific unknown value of the parameter is estimated, the latter when the interval which covers the unknown value is subject to estimation.

Like any other inference based on incomplete data, this method is burdened with a margin of error. However, the Estimation Theory has means of controlling the error in order to minimise its impact on the final assessment of a given phenomenon, so as to make the burden as insignificant as possible.

Application of the estimation theory in research practice must be preceded by knowledge of statistical description. This means the users of estimation theory should be oriented in the definition of an empirical distribution¹, in the types of distribution there are, and should also know the characteristics of a distribution expressed by central tendency, dispersion, skewness and kurtosis.

Estimation is a form of reductive reasoning, which means that a reason is concluded on the basis of a consequence (Krzykowski 2010). In this situation, the reason denotes the distribution of variables within the population or the parameters of that distribution, the consequence being measurement data obtained by observing a sample. However, if the data from the sample are to be regarded as a consequence, the sample must be simple. This means that all elements must be selected from the population according to the scheme of unlimited individual random sampling, with replacement. Only such a sample is considered representative. In practice, however, most populations are finite and selection with replacement may be difficult. Therefore, a dependent draw is more often used, namely without replacement of an element which has already been selected, but taking into account the relevant corrections in the calculation formulas (Kołodziej 2013, Ostasiewicz et al. 2011).

Random sampling is a necessary precondition for the application of the Estimation Theory (Kot et al. 2007). The theory is based on the central limit theorems underlying mathematical inference. Such inference makes it possible to formulate conclusions about the general population based on data obtained from a sample.

However, not every sample is suitable for such a procedure. In general, the sample must accurately reflect the characteristics of the population. For this to happen, each element of the population needs to have an equal chance of being included in the sample. Sometimes researchers select a solution with differing probabilities of inclusion in the sample, not haphazard or biased, but designed to reflect the frequency of a particular type of element within the population (Cohen et al. 2010). The relations between unbiased sample organisation and the population are described by the central limit theorems.

1. The distribution of a feature is a number (frequency) assigned to each value which characterises this feature. For example, the distribution of the variable "gender" in a class can be as follows: girls - 15 people, boys - 12 people. The distribution of the variable "height": 156 cm - 3 people, 157 cm - 0 people, 158 cm - 1 person, etc.


They are a group of laws concerning the increasing convergence of random variables² to a specific probability distribution (Kot et al. 2007). To simplify, these laws say that the frequency of results increasingly coincides with the probability of the occurrence of those results. Assuming that the frequency of some results within a given population is known, it should be expected that the frequency of the results in a research sample will differ less from the frequency distribution in the population as the sample size increases. Hence, a sufficiently large number of observations reflects the regularity occurring in the general population.

A special case of these laws applies to the situation in which the distribution of a random variable converges to the normal distribution³. The Lindeberg-Lévy central limit theorem may serve as the ultimate model. Overall, it says that with an increase in the number of units, the distribution of a statistic in a sample approaches the normal distribution. As an example, exam results were observed at one school; at the end, the points obtained by all the examined pupils were added up to yield a sum, or an arithmetic mean of points, for the school.

If we perform a similar operation for more schools and treat the sums or the arithmetic means as separate results, the distribution of these sums and the distribution of the arithmetic means will approach the normal distribution with an increase in the number of surveyed schools.
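The convergence described above can be illustrated with a short simulation (a sketch, not from the article; the class size and score range are invented for illustration). Although each pupil's raw score is drawn from a flat, uniform distribution, the per-school means pile up symmetrically around the population mean, as the central limit theorem predicts:

```python
import random
import statistics

random.seed(0)  # reproducible illustration

def school_mean(pupils=30):
    """Mean exam score of one school; raw scores drawn uniformly from 0-50."""
    return statistics.mean(random.uniform(0, 50) for _ in range(pupils))

# Treat each school's mean as a separate observation, as in the text.
means = [school_mean() for _ in range(2000)]

# The raw scores are uniform (flat), yet the school means cluster
# symmetrically around the population mean of 25 with a small spread,
# just as the central limit theorem predicts.
print(round(statistics.mean(means), 1))   # close to 25
print(round(statistics.stdev(means), 2))  # close to (50/sqrt(12))/sqrt(30)
```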

If we accept the limit theorems, we will be able to assess the probability of occurrence of the variable's values located in the field under study.

For example, suppose that we want to determine the chance, or probability, of attaining a result higher by at least 1 point in this year's national secondary school examination in the subjects of Mathematics and Natural Sciences. The study comprises 500 schools in one province. In each of the schools, the arithmetic mean of scored points is calculated, and then all the means are recalculated into the general arithmetic mean.

The mean of points scored by the pupils of the province so far was 59 points, with a standard deviation of 15 points. Assuming that the distribution of the mean of exam results has a normal shape, we can determine the probability using the following formula:

P(Y_n > (59 + 1)) = Z_n

2. Random variable - a variable which assumes specific numerical values with a set probability.

3. Normal distribution - a distribution in the shape of a bell curve.


where Y_n stands for the random variable⁴, and Z_n symbolises the value of the standardised random variable. Standardisation allows us to refer the value of the variable to the normal distribution table, so that the probability of the calculated values can be read off.

This probability is defined by the idea of the normal distribution. The graphical illustration of the distribution is the Gauss curve, in which the area under the curve corresponds to 100% probability, whilst the Z values placed on the X axis specify the cutoff points of areas corresponding to given levels of probability. This thread is expanded below.

The following formula is necessary to obtain the information which allows us to determine the probability of attaining a result higher by at least 1 point in this year's national upper-secondary level leaving examinations in the subjects of Mathematics and Natural Sciences:

Z_n = \frac{\sum_{i=1}^{n} Y_i - n\mu}{\sigma\sqrt{n}}

ΣYi - sum of the individual results, μ - arithmetic mean in the population, σ - standard deviation in the population, n - sample size

Although we are interested in the arithmetic mean, to be able to use the formula we need information about the sum of points; based on this formula, the sum may be restored. After inserting the data into the formula (μ = 59, σ = 15, n = 500, ΣYi = (59 + 1) · 500), the value Z_n = 1.49 is obtained. Next, the corresponding value is read from the normal distribution table; in this case, it equals 0.9319. This is the probability that the result will not be higher than the set value. The next step is the calculation: 1 - 0.9319 = 0.0681. This outcome means that the probability of obtaining a result higher by at least 1 point during this year's exam is less than 7%.
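The computation above can be reproduced in a few lines of code (a sketch, not part of the original article); the printed normal distribution table is replaced by the error function:

```python
import math

mu, sigma, n = 59, 15, 500
total = (mu + 1) * n                      # sum of points if the mean rose by 1

# Z_n = (sum of results - n*mu) / (sigma * sqrt(n)), as in the formula above.
z = (total - n * mu) / (sigma * math.sqrt(n))
print(round(z, 2))                        # 1.49

# Phi(z): cumulative normal probability, in place of the table lookup.
phi = 0.5 * (1 + math.erf(z / math.sqrt(2)))
print(round(1 - phi, 3))                  # 0.068
```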

The range of Z values is associated with the three-sigma rule, and its presentation will clarify the discussed issues. If we treat the arithmetic mean of the population as the point 0 (the middle position on the X axis), the deviations of values to the right and left of this point can be described by equal distances. These distances are defined by the value of the standard deviation.

4. In this case, the random variable is the arithmetic mean of points reached at this year's national examination.

Normal distribution is symmetrical, and may almost entirely be described by the three sigma rule: three standard deviations to the right and three to the left from the zero point (arithmetic mean). Within one standard deviation from the zero point to the right is 34.14% of the area under the curve, which is interpreted as 34.14% of the population.

In the case of symmetrical distribution, it is easy to calculate that approximately 68% of the units of the general population are located within the area described by one standard deviation to the right and to the left of the arithmetic mean. Within two standard deviations there is already about 95%, and three standard deviations include 99.7% of the units of the population. According to the common rule, when the distribution is unknown, at least 75% of the units fall within the frame of two standard deviations (Field 2009).
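The shares quoted above can be computed directly from the normal distribution rather than read off a table (an illustrative sketch: the share of a normal population lying within k standard deviations of the mean equals erf(k/√2)):

```python
import math

# Share of a normal population within k standard deviations of the mean.
for k in (1, 2, 3):
    share = math.erf(k / math.sqrt(2))
    print(k, round(100 * share, 2))   # 1 -> 68.27, 2 -> 95.45, 3 -> 99.73
```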

The normal distribution and the three-sigma rule are also used in the interpretation of probability defined as frequency⁵. 100% of the area under the curve corresponds to the probability of a certain occurrence; according to the axioms of probability theory, it is expressed by the number 1, which symbolises the whole. The standardised values placed on the X axis cut off areas under the curve and, as such, determine the level of probability, also defined as the confidence level. The concept of the confidence level makes it possible to describe the reliability of the results of statistical analyses. The standard levels of confidence are 0.9, 0.95 and 0.99; these levels correspond to the Z values 1.64, 1.96 and 2.58 on the X axis.

Thus, the Z value of 1.64 cuts off, on its left side, a field which represents nearly 95% of the area under the curve and corresponds to 95% probability. If test results are standardised according to the formula:

Z = \frac{x_i - \bar{x}}{s}

it will also be easy to read the result of an individual pupil or school on the bell curve graph. For example, a score of 1 means that 84% of the results do not exceed this value, which corresponds to the area under the curve to the left of Z = 1. In other words, a standardised exam score of 1 Z says that 84% of candidates did not reach a higher result. Moreover, such standardisation also allows for comparisons between variables independently of the nature of the measurement units: for example, Maths test results with the results of a language test, or test results from 2005 with test results from 2010, differing in length and in the range of scores.

5. Probability is defined as the frequency of occurrence of an event (e.g. a specific outcome) with respect to all events of its class, that is, the ratio of favourable events to all possible events.
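A sketch of such a comparison (the scores, means and deviations below are invented for illustration):

```python
def z_score(x, mean, sd):
    """Standardise a raw score against its own test's mean and deviation."""
    return (x - mean) / sd

# A pupil scored 45/60 on a 2005 test and 70/100 on a 2010 test;
# the two scales differ, so raw scores are not directly comparable.
z_2005 = z_score(45, mean=38, sd=7)
z_2010 = z_score(70, mean=62, sd=10)

print(round(z_2005, 2))   # 1.0
print(round(z_2010, 2))   # 0.8
# The 2005 result, one standard deviation above its mean, is relatively
# better even though the raw 2010 score is higher.
```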

Suppose we want to determine the average level of performance in the 2014 national test. Having only the results of a sample, it is possible to estimate the unknown values in the population. In the case of point estimation, it is sufficient to calculate the standard error by dividing the standard deviation by the square root of the sample size. The standard error indicates how much we may be mistaken if the parameters in the population are estimated on the basis of the statistic's value within the sample. However, this procedure carries considerable risks. Therefore, interval estimation is much more often used. This kind of estimation relies on the construction of a confidence interval.
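A one-line illustration of the standard error, using for concreteness the sample values s = 8 and n = 100 that appear in the interval example further on:

```python
import math

s, n = 8, 100
# Standard error: standard deviation divided by the square root of n.
standard_error = s / math.sqrt(n)
print(standard_error)   # 0.8
```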

Besides the values calculated on the basis of the sample, we need information on the expected level of confidence in the estimated value of the arithmetic mean; in other words, information on the likelihood that the obtained result corresponds to the unknown value in the population. Here again, we come back to the normal distribution as an illustration of probability.

In the estimation procedure, Z values are considered as corresponding to the set level of confidence. The aim is to build a confidence interval which determines a range of values that can cover the unknown value occurring in the population. The confidence interval is built on the basis of data from the sample: the arithmetic mean value, the standard deviation and the sample size. The last element is the Z value corresponding to the adopted level of probability. These elements are components of the following formula:

P\left( \bar{x} - z_{\alpha}\frac{s}{\sqrt{n}} < m < \bar{x} + z_{\alpha}\frac{s}{\sqrt{n}} \right) = 1 - \alpha

The conglomerate subtracted from and added to the arithmetic mean is referred to as the absolute error (Kołodziej 2013, Ostasiewicz et al. 2011), and it determines the width of the confidence interval. The smaller the standard deviation and the greater the sample size, the narrower the interval at a given confidence level 1 - α (α is the tolerable risk of error in the estimation), and thus the more precise the estimate.

Suppose that in a sample of 100 randomly selected upper-secondary level pupils from all schools in Kołobrzeg, the arithmetic mean of a language test was 32 points and the standard deviation was 8 points. If we want to estimate the arithmetic mean of this part of the national test in the Kołobrzeg population with 95% confidence (1 - 0.05), we need to insert these outcomes and z_α = 1.96⁶ into the formula. In consequence, the confidence interval will range from 30.4 to 33.6 points.

If the sample size is 200, the interval will be shortened: 30.9-33.1. After reducing the dispersion of the sample results from 8 to 4, the interval will be even shorter: 31.4-32.5. When the initial values of the deviation and the sample size remain unchanged (s = 8, n = 100) but the confidence level rises from 95% to 99%, the opposite effect will result and the interval will be wider: 29.9-34.0. An increase in the reliability of the estimation thus reduces the accuracy of the result. However, we are more confident that the interval will cover the unknown value of the arithmetic mean in the population.
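The calculations above can be verified with a short sketch (not part of the original article):

```python
import math

def conf_interval(mean, s, n, z):
    """Confidence interval: mean plus/minus z * s / sqrt(n)."""
    err = z * s / math.sqrt(n)
    return round(mean - err, 1), round(mean + err, 1)

print(conf_interval(32, 8, 100, 1.96))   # (30.4, 33.6)
print(conf_interval(32, 8, 200, 1.96))   # (30.9, 33.1) - narrower with larger n
print(conf_interval(32, 8, 100, 2.58))   # wider at the 99% confidence level
```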

Notwithstanding, the proposed example ignores a fundamental issue: it omits the fact that the population of pupils in Kołobrzeg is not infinite. In the case of such populations, sampling is usually conducted without replacement of the drawn elements. Then, a special correction should be applied within the above statistical formula (Lehmann, Casella 1998; Kołodziej 2013; Ostasiewicz et al. 2011). The correction is called the finite population correction (fpc) and has the following formula:

fpc = \sqrt{\frac{N - n}{N - 1}}

N - population size, n - sample size

The absolute error of the estimate should be multiplied by this correction. As a result, the confidence interval is reduced. Consider a population of 1600 pupils in Kołobrzeg. The same initial values of the arithmetic mean, standard deviation and sample size, together with the finite population correction for 1600 elements, will give a confidence interval of 30.5-33.5 points. The correction (fpc) is omitted when the sample size is less than 5% of the population, since its value then converges to 1.

6. The interval has two ends; therefore, 95% of the area must be symmetrically distributed under the curve. Such an area is cut off by the value 1.96 on the left and right sides of the distribution. Thus, on each side there is 2.5% of the area not covered by the prescribed range.
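A sketch of the corrected calculation, with the values used in the example above:

```python
import math

N, n = 1600, 100          # finite population and sample size
mean, s, z = 32, 8, 1.96  # sample statistics and 95% Z value

# Finite population correction: sqrt((N - n) / (N - 1)).
fpc = math.sqrt((N - n) / (N - 1))

# The absolute error shrinks by the fpc factor before the interval is built.
err = z * s / math.sqrt(n) * fpc
print(round(mean - err, 1), round(mean + err, 1))   # 30.5 33.5
```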

The theory of estimation also allows estimating the structure ratio (fraction, proportion) in the population. The procedure is the same, jointly with consideration of the finite population correction (fpc). Other parameters of the population, such as measures of variability, skewness and kurtosis, may also be estimated, but these analyses are rarely implemented.
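Estimating a fraction can be sketched in the same way; the counts below are invented for illustration, and the usual standard error of a proportion, √(p(1 − p)/n), takes the place of s/√n:

```python
import math

successes, n, z = 140, 400, 1.96   # 140 of 400 sampled pupils pass a threshold

p = successes / n                  # sample proportion (structure ratio)
err = z * math.sqrt(p * (1 - p) / n)
print(round(p - err, 3), round(p + err, 3))   # 0.303 0.397
```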

The theory of estimation creates great opportunities only if it is applied in accordance with the assumptions on which it is based and under the conditions provided by this theory. If we remember that the theory is used to assess regularities occurring in large data sets (often exceeding the possibility of direct observation), the derived conclusions will be characterised by high levels of accuracy. A disadvantage in using the theory may be its considerable requirements for sampling. This, together with the doubts towards the EVE methodology, is probably one of the key sources of criticism directed at the theory. If this criticism is not rooted in questioning the legitimacy of probability theory in the modelling of social phenomena, it is in all likelihood caused by a lack of familiarity with the estimation theory and its foundations. A general reluctance towards the 'mathematisation' of social research may also account for some of the criticism.

REFERENCES:

COHEN, L., MANION, L., MORRISON, K. (2010), Research Methods in Education, London, New York.

FIELD, A. (2009), Discovering Statistics Using SPSS, SAGE, London.

LEHMANN, E.L., CASELLA, G. (1998), Theory of Point Estimation, Springer Verlag, New York.

KOŁODZIEJ, A. (2013), Teoria estymacji w praktyce badań społecznych, Difin, Warszawa.

KOT, S.M., JAKUBOWSKI, J., SOKOŁOWSKI, A. (eds.) (2007), Statystyka. Podręcznik dla studiów ekonomicznych, Difin, Warszawa.

KRZYKOWSKI, G. (2010), Filozoficzne koncepcje estymacji statystycznej, in: Przegląd Statystyczny, no. 4.

OSTASIEWICZ, S., RUSNAK, Z., SIEDLECKA, U. (2011), Statystyka. Elementy teorii i zadania, UE, Wrocław.

ŻÓŁTAK, T. (2013), EWD jako sposób badania efektywności szkół, in: Ścieżki rozwoju edukacyjnego młodzieży - szkoły pogimnazjalne, M. Karwowski (ed.), IFiS PAN, Warszawa.
