
Andrzejczak K. The quality management system of translation technical texts.


Academic year: 2021


THE QUALITY MANAGEMENT SYSTEM OF

TRANSLATION TECHNICAL TEXTS

Andrzejczak K. J.

Politechnika Poznańska, Instytut Matematyki, ul. Piotrowo 3a, 60-965 Poznań

Abstract: The purpose of this report is to present an indicator of the translation quality of technical texts and its use in designing a translation quality management system. The translators use Computer-Aided Translation (CAT) tools and a dynamically developed knowledge base. The quality management system of translation is based on point and interval estimation of the quality indicator. The procedures for sampling inspection take the ISO 2859 standards into consideration.

1. Introduction

Let us consider the process of translating a technical text in which technical terms recur frequently. In the field of translation, a translation unit (TU) is a segment of a text which the translator treats as a single cognitive unit for the purposes of establishing an equivalence. The translation unit may be a single word, or it may be a phrase, one or more sentences, or even a larger unit. Translation units must meet certain requirements. If a translated TU meets those requirements, we call it a correct translation, and the computer-aided translation (CAT) system adds it to a dynamically evolving knowledge base of translated TUs for the given branch of knowledge.

Repeating TUs from the knowledge base are translated automatically by the system and need only the translator's acceptance. A set of L translation units is called a copybook. Each copybook is translated by a single translator or by a group of translators. Translated copybooks undergo internal inspection before being returned to the recipient. The result of such an inspection is either acceptance of the copybook or its return for another translation. The decision is based on full or partial inspection of the translated TUs. Each inspected unit is classified either as meeting the requirements or as not meeting them. The natural quality measure of the translation of a copybook is the index of translation conformity d, defined as the quotient d = D/L, where L is the number of all TUs in the copybook and D is the number of TUs translated properly. The value of d lies in the range [0, 1] or, when expressed in per mille (‰), in the range [0, 1000].

According to the number of TUs L, we can divide copybooks into groups: very thin, if L ≤ 100; thin, if 100 < L ≤ 1000; thick, if 1000 < L ≤ 5000; very thick, if L > 5000.
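The definition of the index d and the thickness classes can be written down directly. The following is a minimal sketch; the function names are illustrative and do not come from the paper.

```python
def conformity_index(correct: int, total: int) -> float:
    """Return d = D / L, where `total` is L and `correct` is D."""
    if total <= 0 or not 0 <= correct <= total:
        raise ValueError("need L > 0 and 0 <= D <= L")
    return correct / total


def thickness_class(total: int) -> str:
    """Classify a copybook by its number of TUs L (thresholds from the text)."""
    if total <= 100:
        return "very thin"
    if total <= 1000:
        return "thin"
    if total <= 5000:
        return "thick"
    return "very thick"


d = conformity_index(correct=985, total=1000)
print(d, round(d * 1000), thickness_class(1000))
```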


The thickness of the copybook is important for choosing the method of quality control of its translation. Obviously, the best method of determining the d index for a given copybook and a given set of translators is to check the copybook entirely. Only then do we obtain the real value of the d index. This method is expensive and, for economic reasons, is recommended only for thin copybooks.

This paper is devoted to methods of acceptance inspection for a large number of copybooks that are at least thick. Thorough inspection of the translation quality of such copybooks is expensive and would be inadvisable for economic reasons. The costs of quality inspection can be substantially decreased by applying techniques of probability theory and mathematical statistics.

It should be stressed that the d index depends not only on the difficulty of the copybook but also on the individual predispositions of the translators. We can therefore distinguish two main factors influencing the quality of translation: an internal factor connected with the copybook content and an external factor connected with the translators' skills. The quality management system must take both factors into account.

2. Elements of the quality management system

The basic elements of the quality management system for thick copybooks follow from the fact that the copybooks are inspected at random. Therefore, the key factors in translation quality management are the selection of appropriate statistical methods and the minimization of decision errors. The quality management system of translation consists of the following elements of statistical research:

1. Establishing the size of the random sample for determining the unknown d index.

2. Defining the method of choosing the simple random sample.

3. Estimation of the index d for a given set of translators.

4. Determination of the confidence level for the index d.

5. Statistical inspection of the translation quality.

Other elements of the translation quality management system arise from specific needs of translation quality control. Those needs lead to modifying, narrowing or expanding the range of statistical analysis. Because the considered factors combine, comparative analysis could also be taken into account.

2.1. Sample size of the inspected translation units

The question is: how many translated TUs should be inspected to estimate with high precision the d index for texts translated by a certain group of translators?

To determine the number n of translation units to be inspected for estimating the unknown index d, we have to assume a priori a small estimation error b (e.g. 0,005) and a sufficiently high confidence level 1 − α (e.g. 0,95). For a new copybook and a new translator the index d is usually unknown. An experienced translator has an individual index d0 assigned on the basis of previously inspected translations. This index is dynamic in nature and is brought up to date from time to time. Individual translators' indexes serve as evidence of the translators' quality.

To determine the number n we have two models to choose from. The choice depends on whether the individual index d0 is known. Prior information about the individual index d0 allows us to decrease the number of TUs inspected.

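Table 1. Models of the determination of the sample size of the TUs

model | individual index d0 | number of the TUs
I | known | $n = \left\lceil \dfrac{u_{1-\alpha/2}^{2}\, d_0 (1 - d_0)}{b^{2}} \right\rceil$
II | unknown | $n = \left\lceil \dfrac{u_{1-\alpha/2}^{2}}{4 b^{2}} \right\rceil$

Remarks: $u_{1-\alpha/2}$ is the quantile of the standard normal distribution; $\lceil \cdot \rceil$ is the ceiling function.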
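The two models of Table 1 can be evaluated with a few lines of code. The sketch below assumes that the quantile is of order 1 − α/2, a reading that reproduces the values 52 and 1286 listed in Table 2 for α = 0,10; the function name `sample_size` is illustrative.

```python
from math import ceil
from statistics import NormalDist
from typing import Optional


def sample_size(b: float, alpha: float, d0: Optional[float] = None) -> int:
    """Number of TUs to inspect for estimation error b and confidence 1 - alpha.

    Model I is used when the individual index d0 is known; Model II
    (d0 unknown) replaces d0 * (1 - d0) by its largest possible value 1/4.
    """
    u = NormalDist().inv_cdf(1 - alpha / 2)  # standard normal quantile
    spread = 0.25 if d0 is None else d0 * (1 - d0)
    return ceil(u * u * spread / (b * b))


print(sample_size(0.050, 0.10, d0=0.950))  # Model I
print(sample_size(0.010, 0.10, d0=0.950))  # Model I
print(sample_size(0.010, 0.10))            # Model II
```

With an unrounded quantile, Model II for b = 0,010 yields 6764 rather than the 6765 given in Table 2; the tabulated value is consistent with a quantile rounded to 1,6449.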

Translation agencies specializing in the translation of technical texts follow a strategy of employing translators characterized by a high value of the d0 index and high translation efficiency.

In order to choose appropriate translators, the agency might apply a minimum value dmin of the individual index. During the initial phase of cooperation dmin might be lower, to let new translators adapt to the company's criteria and to allow their positive selection. Since the final texts should be error-free, the value of dmin must be sufficiently high (dmin ≥ 0,960).

The quality management system (QMS) is optimized with respect to the costs of employing translators and the costs of inspecting translations. The individual d0 index has a key influence on the control sample size and on inspection costs. Table 2 gives example calculations of the control sample size necessary for estimating the index d0.

The numbers of TUs that need to be inspected for the first two variants are relatively small. This results from the relatively high assumed estimation error b = 0,050. If we decrease this error to b = 0,010, the sample size rises dramatically; larger samples are the price of the higher assumed precision. Conversely, increasing the maximum acceptable estimation error of d, or decreasing the confidence level 1 − α, decreases the minimum number of TUs taken for inspection.

Table 2. Number of necessary TUs inspected

data | individual index d0 | number of the TUs
b = 0,050, α = 0,10 | 0,950 | 52
b = 0,010, α = 0,10 | unknown | 6765
b = 0,010, α = 0,10 | 0,950 | 1286

This example confirms the key role of information about a translator's individual index d0. Once the sample size is known for a particular group of translators, the sample has to be chosen so that the conditions of randomness and independence are fulfilled. For this purpose, from the numbered TUs of a given copybook we choose a simple random sample of n elements, usually according to the discrete uniform distribution. The process of choosing TUs in the translation quality control system is automated.

2.2. First method of index d estimation

To estimate the unknown translation conformity index d for a copybook, we choose a simple random sample of n TUs and inspect the conformity of their translation. Assume that M of the n TUs are translated correctly. On the basis of these data we can estimate the d index for a given translator or group of translators. Two procedures are presented in this paper. In the first one we assume that the copybooks are so thick that the total number of TUs can be neglected, so that approximate estimation methods based on limit theorems apply. This method is covered in this section. The other method is described in section 2.3. In the first method the point estimator of the index d, calculated using the maximum likelihood method, can be expressed as:

$$\hat{d} = \frac{M}{n} \qquad (1)$$

where n is the number of inspected TUs and M is the number of TUs translated correctly. The estimator has good properties and is recommended for estimating the index d of a copybook that is at least thick. Unfortunately, the point estimate alone does not tell us how large the estimation error is. Therefore we should use interval estimation of the unknown d index in parallel.

The length of the confidence interval is a measure of the precision with which an unknown parameter is estimated. The confidence limits depend on the sample size. Consequently, for a fixed confidence level 1 − α the confidence interval can be shortened by increasing the sample size. To construct the confidence interval we employ the de Moivre-Laplace theorem.

It follows from this theorem that the frequency of correctly translated TUs among n independent TUs is asymptotically normally distributed:

$$N\!\left(d,\ \sqrt{d(1-d)/n}\right) \qquad (2)$$

The lower bound of the right side confidence interval for the unknown index d for the thick copybooks can be found in Table 3.

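Table 3. Lower bound of the right side confidence interval for the index d

model | lower bound of the right side interval
I | $\dfrac{M}{n} - u_{1-\alpha}\sqrt{\dfrac{1}{n}\cdot\dfrac{M}{n}\left(1-\dfrac{M}{n}\right)}$
II | $\dfrac{n}{n+u_{1-\alpha}^{2}}\left[\dfrac{M}{n} + \dfrac{u_{1-\alpha}^{2}}{2n} - \dfrac{u_{1-\alpha}}{n}\sqrt{\dfrac{M(n-M)}{n} + \dfrac{u_{1-\alpha}^{2}}{4}}\right]$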

Calculating the 95% confidence intervals for the index d, assuming that the inspection of 200 TUs showed 3 TUs translated incorrectly, gives the interval (970, 1000] ‰ for model I and the interval (963, 1000] ‰ for model II.
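The numerical example can be reproduced with a short script. This is a sketch under one assumption: the quantile in both bounds is taken of order 1 − α (one-sided interval), which is the reading that yields the quoted 970 ‰ and 963 ‰ for n = 200 and M = 197. Function names are illustrative.

```python
from math import sqrt
from statistics import NormalDist


def lower_bound_model_1(m: int, n: int, confidence: float = 0.95) -> float:
    """Normal-approximation lower bound with the variance estimated from M/n."""
    u = NormalDist().inv_cdf(confidence)
    p = m / n
    return p - u * sqrt(p * (1 - p) / n)


def lower_bound_model_2(m: int, n: int, confidence: float = 0.95) -> float:
    """Lower bound obtained by solving the normal approximation for d."""
    u = NormalDist().inv_cdf(confidence)
    centre = m + u * u / 2
    spread = u * sqrt(m * (n - m) / n + u * u / 4)
    return (centre - spread) / (n + u * u)


# 200 TUs inspected, 3 translated incorrectly, so M = 197
print(int(lower_bound_model_1(197, 200) * 1000))  # per mille, truncated
print(int(lower_bound_model_2(197, 200) * 1000))
```

Model II gives the more cautious bound here because it does not plug the point estimate into the variance.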

2.3. Second method of estimation of index d

If the copybooks are not thick, or the d index is high (d > 0,96), special estimation methods should be applied. Because TUs translated incorrectly are sporadic events, we use the so-called estimation of rare events. When estimating the index d in a finite population, we know more about the distribution of the estimator than in the method discussed in section 2.2. Namely, the random number X of TUs translated correctly follows the hypergeometric distribution. This distribution is a probabilistic model of the number of correctly translated TUs.

The hypergeometric distribution models the total number of successes in a fixed-size sample drawn without replacement from a finite population. Sampling without replacement means that once a particular item is chosen, it is removed from the relevant population for all subsequent selections. The hypergeometric distribution has three parameters that have direct physical interpretations. L is the size of the population, that is, the number of TUs in the copybook. D is the number of items with the desired characteristic in the population (correctly translated TUs). n is the number of items drawn, that is, the number of TUs inspected.

If x is the number of TUs translated correctly in the sample drawn, then the probability function of this distribution takes positive values for x fulfilling the condition $\max\{0,\ n-(L-D)\} \le x \le \min\{n,\ D\}$:

$$HG(x;\, D, L, n) = \frac{1}{\binom{L}{n}}\binom{D}{x}\binom{L-D}{n-x} \qquad (3)$$

The number D is unknown for a given copybook, and therefore it must be estimated. Because d = D/L, we can derive the probability of the event X = x and then build the confidence interval for this index. The index d can be interpreted as the probability that a TU drawn at random from the inspected copybook is translated correctly.
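A minimal sketch of the probability function (3) follows, together with one possible way of turning it into a lower confidence bound for D. The bound construction (the smallest D for which the upper tail P(X ≥ x) exceeds α) is a standard exact approach and is an assumption here; the paper does not spell out its own construction. All function names are illustrative.

```python
from math import comb


def hg_pmf(x: int, D: int, L: int, n: int) -> float:
    """P(X = x): x correct TUs in a sample of n drawn from L TUs, D of them correct."""
    lo, hi = max(0, n - (L - D)), min(n, D)
    if not lo <= x <= hi:
        return 0.0
    return comb(D, x) * comb(L - D, n - x) / comb(L, n)


def upper_tail(x: int, D: int, L: int, n: int) -> float:
    """P(X >= x) for the hypergeometric distribution."""
    return sum(hg_pmf(k, D, L, n) for k in range(x, min(n, D) + 1))


def lower_bound_D(x: int, L: int, n: int, alpha: float = 0.05) -> int:
    """Smallest D that is not rejected by the observed x at level alpha (assumed rule)."""
    for D in range(L + 1):
        if upper_tail(x, D, L, n) > alpha:
            return D
    return L


# Hypothetical thin copybook: L = 50 TUs, n = 10 inspected, x = 9 correct
D_low = lower_bound_D(9, L=50, n=10)
print(D_low, D_low / 50)
```

The linear search over D is adequate for thin copybooks; for larger L a bisection over D would be the natural refinement.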


The hypergeometric distribution is available in statistical programs like Statgraphics and Statistica, in mathematical programs like Derive, Maple and MatLab, and in spreadsheets like the popular Excel. The calculation of event probabilities for the hypergeometric distribution can easily be automated. In particular, because of analytical difficulties, computer-assisted methods are recommended for deriving the quantile of order α of this distribution. This problem is a special case of the interpolation problem for the inverse cumulative distribution function of a discrete distribution.

Table 4 presents the expected value, mode and variance of the hypergeometric distribution. The symbol $\lfloor \cdot \rfloor$ denotes the so-called floor function.

Table 4. Expected value, mode and variance for the hypergeometric distribution

expected value | $E(X) = \dfrac{nD}{L}$
mode | $Mo(X) = \left\lfloor \dfrac{(D+1)(n+1)}{L+2} \right\rfloor$
variance | $D^{2}(X) = \dfrac{nD(L-D)(L-n)}{L^{2}(L-1)}$

In certain cases the hypergeometric distribution can be approximated by a binomial distribution. Calculating the inverse binomial distribution is also a complicated procedure, especially for large values of n. In section 2.2 we went a step further and used the normal distribution as an approximation of the hypergeometric distribution. We may assume that the distribution of the estimator of parameter d is approximately normal only when the condition min{nD/L, n(L − D)/L} > 30 is fulfilled.
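The closed forms of Table 4 and the normal-approximation condition can be checked numerically. The sketch below compares the formulas with direct enumeration of the probability function for a hypothetical copybook (L = 50, D = 30, n = 10); all names and numbers are illustrative.

```python
from math import comb


def hg_moments(D: int, L: int, n: int):
    """Expected value, mode and variance according to Table 4."""
    mean = n * D / L
    mode = (D + 1) * (n + 1) // (L + 2)  # floor via integer division
    variance = n * D * (L - D) * (L - n) / (L * L * (L - 1))
    return mean, mode, variance


def hg_moments_by_enumeration(D: int, L: int, n: int):
    """The same three quantities computed directly from the probabilities."""
    xs = range(max(0, n - (L - D)), min(n, D) + 1)
    pmf = {x: comb(D, x) * comb(L - D, n - x) / comb(L, n) for x in xs}
    mean = sum(x * p for x, p in pmf.items())
    variance = sum((x - mean) ** 2 * p for x, p in pmf.items())
    mode = max(pmf, key=pmf.get)
    return mean, mode, variance


def normal_approximation_ok(D: int, L: int, n: int, threshold: float = 30) -> bool:
    """Condition min{nD/L, n(L - D)/L} > 30 from the text."""
    return min(n * D / L, n * (L - D) / L) > threshold


print(hg_moments(30, 50, 10))
print(hg_moments_by_enumeration(30, 50, 10))
print(normal_approximation_ok(D=4750, L=5000, n=1286))
```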

3. Statistical quality control of the translated copybooks

During the acceptance inspection of a translated copybook, methods of statistical quality control (SQC) are employed. The d index presented earlier is the measure of the quality of the translated copybook. Sections 2.2 and 2.3 presented estimation methods for this index. The result of inspecting the drawn TUs is the basis for the decision about accepting the copybook. In general, there are three possible decisions:

• acceptance of the copybook,

• rejection of the copybook,

• double sampling inspection of the copybook.

In the case of acceptance, there is no further inspection of the TUs from this copybook. In the case of rejection, various actions can be taken, for example a completely new translation of the whole copybook. Double sampling means that we draw a new random sample from the inspected copybook and this sample undergoes another inspection.


The description of the sampling procedure is called the translation quality sampling scheme. In the simplest case we use single sampling inspection. The statistical methods presented in this paper let us design a scheme for the inspection of copybooks. In case of doubt, double sampling inspection is used, and the final decision is made on the basis of the results of both samples of TUs.

In the case of double sampling inspection, after inspecting the first sample of TUs, of size n1, there are three possible decisions:

copybook acceptance when m1 ≥ a1,

copybook rejection when m1 ≤ b1 (where b1 < a1),

drawing another sample, of size n2, when b1 < m1 < a1,

where m1 is the number of correctly translated TUs, a1 is the acceptance value and b1 is the rejection value. When the number of correctly translated TUs in both samples exceeds a certain value a2, the copybook is accepted; otherwise it is rejected.
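The decision rule of the double sampling scheme can be sketched as two small functions. The comparison directions are read as m1 ≥ a1 for acceptance, m1 ≤ b1 for rejection and a strict m1 + m2 > a2 in the second stage ("exceeds"); the threshold values in the usage lines are hypothetical and serve only as an illustration.

```python
def first_stage_decision(m1: int, a1: int, b1: int) -> str:
    """Decision after the first sample; m1 is the number of correct TUs in it."""
    if not b1 < a1:
        raise ValueError("the rejection value b1 must be below the acceptance value a1")
    if m1 >= a1:
        return "accept"
    if m1 <= b1:
        return "reject"
    return "second sample"


def second_stage_decision(m1: int, m2: int, a2: int) -> str:
    """Final decision from the correct TUs found in both samples together."""
    return "accept" if m1 + m2 > a2 else "reject"


# Hypothetical plan: n1 = n2 = 100, a1 = 97, b1 = 92, a2 = 190
print(first_stage_decision(98, a1=97, b1=92))
print(first_stage_decision(95, a1=97, b1=92))
print(second_stage_decision(95, 96, a2=190))
```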

In the double sampling scheme the size of the first sample is smaller than the sample size of the equivalent single sampling scheme. If the index d of the inspected copybook is very high, the decision to accept the copybook can be made after inspecting the first sample. Similarly, when the index d is small, the copybook is rejected after the first sample. Only from time to time is the decision made after inspecting both samples. Multiple sampling inspection is statistically cheaper and gives better results than single sampling inspection. However, the procedure of such an inspection is more complex. When the individual index d0 is significantly higher than the dmin set by the translation agency, a multiple sampling scheme should be applied.

When carrying out sampling inspection according to SQC we are dealing with random variables: the numbers of correctly translated TUs in the copybook and in the inspected sample are both random variables. Therefore, methods of probability and mathematical statistics were necessary to describe the inspection schemes.

In this paper we have used the d index, which may be called a positive index. However, statistical terminology is usually formulated for negative quality control. Therefore, SQC will be described for negative quality control, i.e. for the index w = 1 − d. From this point the concepts of the inspection schemes can easily be transferred to the index d. Numerous literature sources concerning the index w can be found in the book by Olgierd Hryniewicz [2].

In the single sampling scheme the acceptance of the copybook is equivalent to the case in which we cannot reject the null hypothesis H0: w ≤ w0, where w0 is the producer's risk quality (PRQ). The rejection of the copybook is then equivalent to rejecting the null hypothesis and accepting the alternative hypothesis H1: w ≥ w1, where w1 is the consumer's risk quality (CRQ).

The probability α of a type 1 error is called the producer's risk (translator's risk, PR). In our case this is the probability of rejecting a copybook whose index of incorrectly translated TUs equals w0. The probability β of a type 2 error is called the consumer's risk (CR); it is the probability of accepting a copybook whose index of incorrectly translated TUs equals w1. To design a quality management system for the translation of technical texts we can use, as a standard, the research plans, inspection schemes and sampling inspection procedures known from ISO 2859 and used for quality control in production and services.
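The two risks can be illustrated for a single sampling plan. The sketch below assumes a plan (n, c) that accepts the copybook when at most c of the n inspected TUs are translated incorrectly, and it uses the binomial approximation of the hypergeometric distribution; the plan parameters and the quality levels w0 and w1 are hypothetical and are not taken from ISO 2859.

```python
from math import comb


def acceptance_probability(n: int, c: int, w: float) -> float:
    """P(at most c incorrect TUs among n) when each TU is incorrect with probability w."""
    return sum(comb(n, k) * w**k * (1 - w) ** (n - k) for k in range(c + 1))


def producer_risk(n: int, c: int, w0: float) -> float:
    """alpha: probability of rejecting a copybook whose incorrectness index is w0."""
    return 1 - acceptance_probability(n, c, w0)


def consumer_risk(n: int, c: int, w1: float) -> float:
    """beta: probability of accepting a copybook whose incorrectness index is w1."""
    return acceptance_probability(n, c, w1)


# Hypothetical plan and quality levels
n, c, w0, w1 = 200, 5, 0.01, 0.06
print(producer_risk(n, c, w0), consumer_risk(n, c, w1))
```

Evaluating both functions over a grid of (n, c) is one simple way to search for a plan that keeps α and β below chosen limits.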

Research schemes set in official standards do not always meet users' demands. The aim of this paper was to present universal mathematical methods that serve as building blocks for designing a dedicated quality management system for the translation of technical texts. The methods described herein might be a starting point for further studies concerning not only the inspection process but also the optimization of its costs.

References

1. Andrzejczak K.: Statystyka elementarna z wykorzystaniem systemu Statgraphics. Wydawnictwo PP, Poznań 1997.

2. Hryniewicz O.: Nowoczesne metody statystycznego sterowania jakością. Omnitech Press, Warszawa 1996.
