Index of /rozprawy2/11638

Academic year: 2021

Contents

1 Introduction 1

1.1 Problem Statement . . . 1

1.2 Objectives of the Research . . . 1

1.3 Research Contribution . . . 1

1.4 Motivation . . . 2

1.4.1 Relevance of the Research . . . 2

1.4.2 Applications . . . 3

1.4.3 Author’s Research in the Topic . . . 3

1.5 Outline of the Thesis . . . 4

2 Spoken Language Recognition 5

2.1 Spoken Language Recognition . . . 5

2.2 What is Language? . . . 6

2.2.1 Human Ability to Recognize Languages . . . 6

2.2.2 Differentiating Between Languages . . . 7

2.3 Approaches to Language Recognition . . . 7

2.3.1 Syntax . . . 7

2.3.2 Words . . . 8

2.3.3 Prosody . . . 8

2.3.4 Phonotactics . . . 8

2.3.5 Acoustic Phonetics . . . 8

2.4 Review of Language Recognition Research . . . 8

2.5 I-vector Based Language Recognition System . . . 9

2.5.1 Features Extraction . . . 9

2.5.1.1 MFCC . . . 10

2.5.1.2 Shifted Delta Cepstra . . . 11

2.5.1.3 Voice Activity Detection . . . 12

2.5.1.4 Feature Normalisation Methods . . . 13

2.5.1.5 Summary of Feature Extraction . . . 13

2.5.2 I-vector Subspace Modeling . . . 13

2.5.2.1 Gaussian Mixture Model . . . 14

2.5.2.2 Universal Background Model . . . 15

2.5.2.3 Joint Factor Analysis . . . 15

2.5.2.4 I-vectors . . . 16

2.6 Language Recognition in the I-vector Space . . . 19

2.6.1 Cosine Distance Scoring . . . 19

2.6.2 Generative Gaussian Classifier . . . 19

2.6.3 Mixtures of von Mises-Fisher Distributions . . . 20

2.6.4 Support Vector Machines . . . 20

2.6.5 Logistic Regression . . . 22

2.6.6 Probabilistic Linear Discriminant Analysis . . . 22

2.6.7 Intersession Compensation Techniques . . . 22

2.6.7.2 Whitening . . . 22

2.6.7.3 Within-Class Covariance Normalization . . . 23

2.6.7.4 Linear Discriminant Analysis . . . 24

2.6.8 Score Level and System Fusions . . . 24

2.7 Summary . . . 25

3 Spoken Language Clustering 27

3.1 Problem Definition . . . 27

3.1.1 Relation to Language Diarization . . . 27

3.2 Overview and Applications . . . 28

3.3 Language Clustering as an Unsupervised Learning Task . . . 28

3.4 Clustering Algorithms . . . 29

3.4.1 Spherical K-means . . . 30

3.4.2 Von Mises-Fisher Mixtures . . . 30

3.4.3 Mean Shift . . . 30

3.4.4 Agglomerative Hierarchical Clustering . . . 31

3.4.5 HDBSCAN . . . 32

3.5 Evaluation of Clustering . . . 33

3.5.1 External Quality Measures . . . 33

3.5.1.1 Impurity Measures . . . 33

3.5.1.2 Adjusted Rand Index . . . 34

3.5.1.3 BBN Metric and Clustering Efficiency . . . 35

3.5.2 Internal Quality Measures . . . 35

3.5.2.1 Nearest Neighbor Purity Estimator and Estimated Clustering Efficiency . . . 35

3.5.2.2 Silhouette . . . 36

3.6 Summary . . . 36

4 Language Database Description and Analysis 37

4.1 NIST 2015 Language Recognition I-vector Machine Learning Challenge Database . . . 37

4.1.1 Utterance Duration . . . 39

4.1.2 Data Analysis . . . 39

4.1.2.1 Silhouette Plots . . . 39

4.1.2.2 2D Visualization of the Data . . . 39

4.2 Summary . . . 40

5 Language Clustering Experiments 45

5.1 I-vector Preprocessing . . . 46

5.2 Spherical K-means Clustering Experiments . . . 46

5.3 Von Mises-Fisher Clustering Experiments . . . 49

5.4 Agglomerative Hierarchical Clustering Experiments . . . 52

5.5 Mean Shift Clustering Experiments . . . 53

5.5.1 Pruning with Noise Redistribution . . . 54

5.5.2 Pruning with Noise Removal . . . 56

5.6 HDBSCAN Clustering Experiments . . . 59

5.7 Results Analysis . . . 61

6 Language Recognition with Clustering-based Modeling Experiments 67

6.1 Cluster-based Modeling . . . 69

6.2 Language Recognition Experiments and Results . . . 71

6.2.1 Performance Evaluation . . . 71

6.2.1.1 NIST Cost Function . . . 72

6.2.1.2 Average Decision Cost Function Cavg . . . 72

6.2.1.3 I-vector Preprocessing . . . 73

6.2.2 Experiments and Results from Baseline Systems . . . 74

6.2.2.1 Cosine Distance Scoring . . . 74

6.2.2.2 Gaussian Classifier . . . 74

6.2.2.3 Von Mises-Fisher Classifier . . . 75

6.2.2.4 Support Vector Machine . . . 76

6.2.2.5 Logistic Regression . . . 77

6.2.3 Experiments and Results from Systems with Cluster-based Modeling . . . 78

6.2.3.1 Cosine Distance Scoring with Cluster-based Modeling . . . 79

6.2.3.2 Logistic Regression with Cluster-based Modeling . . . 80

6.2.4 Effect of Randomness . . . 80

6.2.5 Impact of Training Data Size and the Number of Languages . . . 81

6.2.5.1 Impact of the Training Size . . . 81

6.2.5.2 Impact of the Number of Languages . . . 82

6.2.6 Score and System Level Fusion . . . 83

6.3 Results Analysis and Discussion . . . 84

6.3.0.1 Statistical Significance of the Results . . . 87

6.3.0.2 Impact of Cluster-based Modeling on Computational Complexity . . . 88

6.4 Conclusion . . . 88

7 Conclusions 89

7.1 Future Work . . . 90
