TOWARD A PARSING METHOD FOR FREE WORD ORDER LANGUAGES x)

Janusz S. Bień, Stanisław Szpakowicz
Institute of Informatics, Warsaw University
P.O.B. 1210, 00-901 Warszawa, Poland

"Free word order" is a traditional term that should not be taken literally. However, we shall retain the term for its conciseness.

Formal descriptions of syntax have usually been based either on the immediate constituents (IC) or on the dependency philosophy. Neither of them seems directly applicable to free word order languages. The intertwining phrases cannot be described naturally by IC rules. Some coordinate constructions are difficult to describe by means of dependency relations.

In our opinion, parsers for free word order languages should not be based on the methods developed within the IC framework. The few experiments with parsers based on the dependency formalism, e.g. /5/, do not seem promising. Therefore, we decided to make a fresh start and to attack the problem by reanalysing the basic notions of syntax and parsing. We focus our attention on those formal aspects of a language system which might be most useful for automatic text processing. We assume that the morphological level is described along the lines of /2/.

x) This paper is an extended abstract of /3/.


2. The Notion of Syntax

In this paper, we understand syntax as the domain of formal relations between words, i.e. roughly as so-called surface syntax. We define the notion using a morphology-based criterion, described below.

The outcome of morphological analysis can be ambiguous for an isolated word. In most situations, however, the morphological features of a word are uniquely determined by some formal properties of its context.

Sometimes the ambiguity remains, as in the following sentence:

Opóźnienie brygad piecowych spowodowało potępienie wuja Jana.

[Diagram under the sentence: first row of labels NPgen, V, NPgen; second row of labels nom./acc., nom./acc.]

There are five independent ambiguities in this sentence, yielding 32 coherent readings. Two of them are due to the neutralization of the agent/patient function during nominalisation. For example, "potępienie x" means "disapproval of x" (either "x disapproves y" or "y disapproves x"); such an ambiguity can be resolved only by examining the meaning of a given phrase, so we call it a semantic one.

The next ambiguity occurs in the phrase "wuja Jana", which means either "uncle John" (gen.) or "John's uncle" (gen.). Here we can see two kinds of syntactic relations: case agreement (the former interpretation) or government (the latter one), both of which require "Jana" to be in the genitive case. We consider such an ambiguity a purely syntactic one.

In the phrase "brygad piecowych" we can discern either case agreement ("piecowych" (gen.) is then an adjective) or government ("piecowych" (gen.) is then a noun). Here, the elimination of morphological homonymy gives rise to alternative constructions, thus increasing the syntactic ambiguity.


The last ambiguity stems from the nominative/accusative neutralization of both a virtual subject and a virtual object of the sentence. It suffices to assign a syntactic function to one of them; the function of the other and the morphological characteristics of both will then be fully determined.
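The count of 32 readings follows from five independent binary choices (2^5 = 32). A minimal sketch of that enumeration is given here; the dictionary keys, the option labels, and the helper names `readings` and `cases_for` are illustrative assumptions, not the authors' notation.

```python
from itertools import product

# Illustrative labels (an assumption, not the authors' notation)
# for the five independent binary choices discussed in the text.
CHOICES = {
    "opóźnienie brygad": ("brigades = agent", "brigades = patient"),
    "potępienie wuja": ("uncle = agent", "uncle = patient"),
    "wuja Jana": ("agreement", "government"),
    "brygad piecowych": ("agreement", "government"),
    "subject": ("opóźnienie", "potępienie"),
}


def cases_for(subject):
    """Fixing the subject fixes the case of both neutralized nouns."""
    nouns = ("opóźnienie", "potępienie")
    return {n: ("nom" if n == subject else "acc") for n in nouns}


def readings(choices):
    """Return one dict per combination of the independent choices."""
    names = list(choices)
    return [dict(zip(names, combo)) for combo in product(*choices.values())]


all_readings = readings(CHOICES)
print(len(all_readings))  # 32, i.e. 2 ** 5
```

Calling `cases_for("potępienie")` in this sketch marks "potępienie" as nominative and "opóźnienie" as accusative, mirroring the remark that one functional assignment determines the rest.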

The example demonstrates how certain relations between sentence components make it possible to disambiguate the morphological properties of individual words without resorting to their meanings. In our approach, these relations constitute the level of syntax /3/.

Syntactic relations (e.g. agreement, government) consist in matching syntactic properties (e.g. case, gender) of the respective units. The basic unit is a morphological word /2/.
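The idea of relations as property matching can be sketched as follows. This is a simplified illustration under stated assumptions: the `Word` record, the feature names, and the functions `agrees` and `governs` are hypothetical, and real Polish agreement and government rules are considerably richer.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Word:
    """A hypothetical, much simplified morphological word."""
    form: str
    pos: str      # part of speech, e.g. "noun" or "adj"
    case: str     # e.g. "nom", "gen", "acc"
    number: str   # "sg" or "pl"


def agrees(head, dep, features=("case", "number")):
    """Agreement: the dependent shares the listed features with the head."""
    return all(getattr(head, f) == getattr(dep, f) for f in features)


def governs(head, dep, required_case="gen"):
    """Government: a noun head demands a fixed case of a noun dependent."""
    return head.pos == "noun" and dep.pos == "noun" and dep.case == required_case


wuja = Word("wuja", "noun", "gen", "sg")
wuj = Word("wuj", "noun", "nom", "sg")
jana = Word("Jana", "noun", "gen", "sg")

# With the head in the genitive both relations match: a syntactic ambiguity.
print(agrees(wuja, jana), governs(wuja, jana))  # True True
# With the head in the nominative only government matches.
print(agrees(wuj, jana), governs(wuj, jana))    # False True
```

The first pair corresponds to the "wuja Jana" case discussed earlier, where both relations require "Jana" to be genitive.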

By the syntactic structure of a sentence we understand some explicit representation of all the syntactic relations between its components, usually a graph. Such a graph need not be connected. For example, some modifiers are linked to their heads only by semantic relations and not by syntactic ones. Similarly, some elliptic sentences may have disconnected syntactic representations.
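A graph of this kind can be sketched as a list of labelled edges between words. The example is an assumption made for illustration only: it encodes one possible reading of the sample sentence, and the extra adverb "wczoraj" ("yesterday") is a hypothetical modifier added to show a node with no syntactic link.

```python
from collections import defaultdict


def components(nodes, edges):
    """Group nodes into connected components, ignoring edge labels."""
    adjacency = defaultdict(set)
    for a, b, _label in edges:
        adjacency[a].add(b)
        adjacency[b].add(a)
    seen, result = set(), []
    for start in nodes:
        if start in seen:
            continue
        stack, group = [start], []
        seen.add(start)
        while stack:
            node = stack.pop()
            group.append(node)
            for nxt in adjacency[node] - seen:
                seen.add(nxt)
                stack.append(nxt)
        result.append(sorted(group))
    return result


# One hypothetical reading of the example sentence, plus an added adverb.
nodes = ["Opóźnienie", "brygad", "piecowych", "spowodowało",
         "potępienie", "wuja", "Jana", "wczoraj"]
edges = [
    ("Opóźnienie", "brygad", "government"),
    ("brygad", "piecowych", "agreement"),
    ("spowodowało", "Opóźnienie", "agreement"),
    ("spowodowało", "potępienie", "government"),
    ("potępienie", "wuja", "government"),
    ("wuja", "Jana", "agreement"),
]

parts = components(nodes, edges)
print(len(parts))  # 2: the seven linked words, and "wczoraj" on its own
```

Keeping the unlinked word as its own component, rather than forcing an attachment, reflects the remark that the graph need not be connected.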

We understand parsing as a process of establishing all syntactic structures of a given text. Although such structures are rather unsophisticated, they are practically very important for low-level text processing.

In search of an adequate parsing method, we found the idea of Marcus /4/ most appealing. He claims that natural languages are designed to be deterministically parsed from left to right and that writing a grammar should consist in finding out local clues which enable the parser to select properly what to do next. This idea seems even more advantageous for free word order languages. Rich inflection makes the local clues much more explicit and the parser's expectations more precise. Besides, such an organisation of the parsing process is compatible with the resource control hypothesis /1/, which is hoped to account for the semantic implications of free word order.

Conclusion

As a practical consequence of the considerations given above, we adopt the following research program. As a starting point we take the existing IC-based syntactic description of Polish sentences with neutral word order /6/, consisting of about 500 rules (some parts of it have been rewritten in greater detail /7/, with the number of rules increasing 5-10 times). We are going to restructure the description to obtain an index of expectations related to each syntactic unit. We shall incorporate the clues thus obtained into some Marcus-style parsing strategy. We expect that this will lead to an efficient and linguistically sound parser for Polish.

References

/1/ Bien J.S.: A Preliminary Study on Linguistic Implications of Resource Control in Natural Language Understanding. ISSCO Working Paper 44, Geneve 1980.

/2/ Bien J.S., Saloni Z.: The notion of morphological word and its application to the description of Polish inflection (preliminary version) /in Polish/. Prace Filologiczne XXXI, to appear.

/3/ Bien J.S., Szpakowicz S.: Toward a Parsing Method for Free Word Order Languages. In: Papers in Computational Linguistics II. IInf UW Reports, to appear.

/4/ Marcus M.P.: A Theory of Syntactic Recognition for Natural Language. MIT Press 1980.

/5/ Panevová J., Sgall P.: On Some Issues of Syntactic Analysis of Czech. In: The Prague Bulletin of Mathematical Linguistics 34, 1980, 21-32.


/6/ Szpakowicz S.: Formal syntactic description of Polish sentences /in Polish/. Wydawnictwa Uniwersytetu Warszawskiego, in press.

/7/ Szpakowicz S., Świdziński M.: An outline of sentence schemes classification in contemporary written Polish /in Polish/. Studia Gramatyczne V, Wrocław, to appear.
