(1)

SOCCEsRs: Open Challenge for Correcting Errors of Speech Recognition Systems

2019-04-09

(2)

Outline

1 Task description

2 Dataset

3 Evaluation

(3)

Task description

"Develop a method that improves the result of speech recognition process on the basis of the (erroneous) output of the ASR system and the correct human-made transcription"

(4)

Task description

[Diagram: the speech corpus provides a Recording and a Transcription. Recording → ASR → ASR response → Error correction system → Corrected ASR response. The Transcription serves as the Reference.]

(5)

Rationale

Why is such a transformation needed? It can be used as a post-processing stage of an ASR system for:

• re-scoring of hypotheses produced by the ASR using information not available at earlier stages of processing (e.g. when those stages use only a one-directional LM)

• adaptation of a black-box ASR system to a specific domain

(6)

Dataset

• 9000 sentences from Polish Wikinews

• Recorded in a studio

• Transcribed by human annotators

• Recognized by a state-of-the-art ASR system

• 8000:1000 train/test split

(7)

Dataset normalization

• all words are UPPERCASED

• punctuation marks are removed

• numbers and special characters are replaced by their spoken forms
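The first two normalization steps can be sketched in a few lines of Python. This is only an illustration, not the organizers' script: the function name `normalize`, the set of extra typographic marks, and the choice to replace punctuation with a space (rather than delete it) are all assumptions. Replacing numbers and special characters with their spoken forms needs a language-specific verbalizer and is deliberately left out.

```python
import string

# Assumption: ASCII punctuation plus a few typographic marks seen in Polish text.
PUNCTUATION = string.punctuation + "„”«»…"


def normalize(text: str) -> str:
    """Uppercase the text and strip punctuation (numbers are NOT verbalized here)."""
    # Map every punctuation character to a space so that hyphenated
    # words are split instead of being glued together (an assumption).
    table = str.maketrans({ch: " " for ch in PUNCTUATION})
    # split()/join collapses the runs of whitespace left behind.
    return " ".join(text.translate(table).upper().split())


print(normalize("Premier powiedział: „Tak, zgadzam się”."))
# PREMIER POWIEDZIAŁ TAK ZGADZAM SIĘ
```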

(8)

Example lines

Example line from the TSV file containing the training dataset:

(9)

Evaluation

• average relative WER change

• relative SER change

• CharMatch

(10)

Average relative WER change

• relative difference between:

  - Word Error Rate of the ASR hypothesis, averaged over all test sentences

  - Word Error Rate of the hypothesis corrected by the proposed system, averaged over all test sentences

(11)

Word Error Rate

$$\mathrm{WER} = \frac{S + D + I}{N}, \qquad N = H + S + D$$

where S = number of substitutions, D = number of deletions, I = number of insertions, H = number of hits, N = length of the reference sentence. See e.g. [?] for an in-depth explanation.
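A minimal sketch of how WER could be computed for one sentence pair: the minimal value of S + D + I equals the word-level Levenshtein distance between reference and hypothesis, so it is enough to divide that distance by the reference length. The function names `edit_distance` and `wer` are illustrative, and whitespace tokenization is an assumption.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (row-by-row dynamic programming)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,           # deletion
                            curr[j - 1] + 1,       # insertion
                            prev[j - 1] + cost))   # hit or substitution
        prev = curr
    return prev[-1]


def wer(reference: str, hypothesis: str) -> float:
    """WER = (S + D + I) / N, with N the number of words in the reference."""
    ref_words = reference.split()
    hyp_words = hypothesis.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")
    return edit_distance(ref_words, hyp_words) / len(ref_words)


print(wer("ALA MA KOTA", "ALA MA PSA"))  # one substitution out of three words
print(wer("ALA MA KOTA", "ALA KOTA"))    # one deletion out of three words
```

Note that WER can exceed 1 when the hypothesis contains many insertions, since N counts only reference words.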

(12)

Relative WER change

$$\mathrm{WER}^{r}_{\Delta} = 1 - \frac{1}{|C|} \sum_{i=1}^{|C|} \frac{\mathrm{WER}_i}{\mathrm{WER}_{0i}}$$

where: $|C|$ = number of sentences in the corpus, $\mathrm{WER}_i$ = WER of the i-th sentence in the corpus after processing by the system, $\mathrm{WER}_{0i}$ = WER of the i-th sentence in the corpus as returned by the ASR.

For example: if the average $\mathrm{WER}_0$ of the raw ASR system is 8 and the WER after processing the sentences through the error correction system is 6, then $\mathrm{WER}^{r}_{\Delta} = 25\%$.
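The averaging of per-sentence ratios can be sketched as follows. The function name `relative_wer_change` is illustrative. The slides do not say how sentences that the ASR already recognized perfectly (ASR WER equal to 0) are treated, so this sketch simply refuses them; that behavior is an assumption, not part of the task definition.

```python
def relative_wer_change(wer_system, wer_asr):
    """1 minus the mean of per-sentence ratios WER_i / WER_0i."""
    if len(wer_system) != len(wer_asr) or not wer_system:
        raise ValueError("expected two non-empty lists of equal length")
    ratios = []
    for system, asr in zip(wer_system, wer_asr):
        if asr == 0:
            # Undefined ratio; the handling of this case is not specified
            # in the slides, so the sketch raises instead of guessing.
            raise ValueError("ASR WER of 0 makes the ratio undefined")
        ratios.append(system / asr)
    return 1 - sum(ratios) / len(ratios)


# Every sentence improves by the same factor 6/8, so the change is 25%.
print(relative_wer_change([6, 3], [8, 4]))  # 0.25
```

Because the ratios are averaged per sentence, the result generally differs from the relative change of the corpus-level average WER; the two agree only in simple cases such as the one above.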

(13)

Relative SER change

• relative difference between the Sentence Error Rate (SER) of the ASR hypothesis and that of the hypothesis corrected by the proposed system

• SER: ratio of the number of sentences with WER > 0 (incorrectly recognized sentences) to the number of all sentences in the corpus.

• Relative SER reduction is defined similarly to $\mathrm{WER}^{r}_{\Delta}$:

$$\mathrm{SER}^{r}_{\Delta} = 1 - \frac{\mathrm{SER}}{\mathrm{SER}_0}$$
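A small sketch of the relative SER change, under the assumption that SER is the fraction of sentences containing at least one error (WER greater than 0), which is the usual meaning of sentence error rate. A sentence has WER equal to 0 exactly when its word sequence matches the reference, so no alignment is needed. The function names are illustrative.

```python
def sentence_error_rate(references, hypotheses):
    """Fraction of sentences whose word sequence differs from the reference."""
    if len(references) != len(hypotheses) or not references:
        raise ValueError("expected two non-empty lists of equal length")
    errors = sum(1 for ref, hyp in zip(references, hypotheses)
                 if ref.split() != hyp.split())
    return errors / len(references)


def relative_ser_change(references, asr_hypotheses, system_outputs):
    """1 - SER / SER_0, where SER_0 is the SER of the raw ASR output."""
    ser_0 = sentence_error_rate(references, asr_hypotheses)
    ser = sentence_error_rate(references, system_outputs)
    if ser_0 == 0:
        raise ValueError("ASR output has no sentence errors; ratio undefined")
    return 1 - ser / ser_0


refs = ["ALA MA KOTA", "TO JEST TEST", "DOBRY WIECZÓR", "DO WIDZENIA"]
asr = ["ALA MA PSA", "TO JEST TEKST", "DOBRY WIECZÓR", "DO WIDZENIA"]
fixed = ["ALA MA KOTA", "TO JEST TEKST", "DOBRY WIECZÓR", "DO WIDZENIA"]

# ASR gets 2 of 4 sentences wrong, the corrected output 1 of 4.
print(relative_ser_change(refs, asr, fixed))  # 0.5
```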

(14)

CharMatch

Introduced in [?].

The $F_{0.5}$-measure is defined as follows:

$$F_{0.5} = \frac{(1 + 0.5^{2}) \times P \times R}{0.5^{2} P + R}$$

where P is precision and R is recall:

$$P = \frac{\sum_i T_i}{\sum_i d_L(h_i, s_i)}, \qquad R = \frac{\sum_i T_i}{\sum_i d_L(h_i, r_i)}$$

where: $r_i$ = i-th reference utterance, $h_i$ = i-th ASR hypothesis, $s_i$ = i-th system output, $d_L(a, b)$ = Levenshtein distance between sequences a and b, $T_i$ = number of correct changes performed by the system.

(15)

CharMatch

$T_i$ = number of correct changes performed by the system, calculated as:

$$T_i = \frac{d_L(h_i, r_i) + d_L(h_i, s_i) - d_L(s_i, r_i)}{2}$$
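The CharMatch formulas can be put together in a short sketch. It assumes that the Levenshtein distance is computed over characters (suggested by the metric's name, but not stated on the slides) and that the score is 0 when the system makes no changes or no correct changes; both are assumptions, and the function names are illustrative.

```python
def levenshtein(a, b):
    """Levenshtein distance between two sequences (here: strings of characters)."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        curr = [i]
        for j, y in enumerate(b, start=1):
            cost = 0 if x == y else 1
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1]


def char_match(references, asr_hypotheses, system_outputs, beta=0.5):
    """F_beta over correct changes T_i, summed across all utterances."""
    t_sum = d_hs_sum = d_hr_sum = 0.0
    for r, h, s in zip(references, asr_hypotheses, system_outputs):
        d_hr = levenshtein(h, r)  # changes needed to fix the ASR hypothesis
        d_hs = levenshtein(h, s)  # changes actually made by the system
        d_sr = levenshtein(s, r)  # errors remaining after correction
        t_sum += (d_hr + d_hs - d_sr) / 2  # T_i: correct changes
        d_hs_sum += d_hs
        d_hr_sum += d_hr
    # Assumption: empty denominators yield 0 rather than an error.
    precision = t_sum / d_hs_sum if d_hs_sum else 0.0
    recall = t_sum / d_hr_sum if d_hr_sum else 0.0
    denominator = beta ** 2 * precision + recall
    if denominator == 0:
        return 0.0
    return (1 + beta ** 2) * precision * recall / denominator


refs = ["ALA MA KOTA", "TO JEST TEST"]
asr = ["ALA MA KOTE", "TO JEST TEKST"]
out = ["ALA MA KOTA", "TO JEST TEKSTY"]

# First utterance: one correct change (T = 1).
# Second utterance: one change that fixes nothing (T = 0).
print(char_match(refs, asr, out))  # P = R = 0.5
```

With beta set to 0.5, precision is weighted more heavily than recall, so a system is penalized more for making wrong changes than for leaving errors uncorrected.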

(16)

Thank you

Thank you for your attention!
