2 Malaga's Grammar Formalism

A formal grammar for a natural language can be used to check whether a sentence or a word form is grammatically well-formed. (A word form is a particular inflectional form of a word, so "book" and "books" are two different word forms of the word "book".) Furthermore, a grammar can describe the structure and meaning of a sentence or a word form by means of a data structure that is constructed during the analysis process.

Malaga uses a formalism derived from Left-Associative Grammar (LAG), which was developed by Roland Hausser. An LAG analyses a sentence (or a word form) step by step: its parts are concatenated from left to right, hence the name "Left-Associative Grammar". A single LAG rule can only join two parts into a bigger one: it concatenates the state part (the beginning of the sentence or word form that has already been analysed) and the link part (the next word form or the next allomorph). In contrast to LAG, Malaga's formalism applies a rule even to read in the first part of a word form or of a sentence. Consider the following sentence:

Shakespeare liked writing comedies.

The sentence is analysed in five rule applications:

"" + "Shakespeare"
"Shakespeare" + "liked"
"Shakespeare liked" + "writing"
"Shakespeare liked writing" + "comedies"
"Shakespeare liked writing comedies" + "."
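
The five steps above can be reproduced with a minimal Python sketch. It only tracks surfaces, not feature structures, and the function name `rule_applications` is a hypothetical helper chosen for this illustration; it is not part of Malaga.

```python
def rule_applications(word_forms):
    """Return the (state surface, link surface) pair of each rule application."""
    steps = []
    state = ""  # the initial state has an empty surface
    for link in word_forms:
        steps.append((state, link))
        # Concatenate the link onto the state; the full stop attaches
        # without a space, as does the very first word form.
        if state and link != ".":
            state = state + " " + link
        else:
            state = state + link
    return steps


steps = rule_applications(["Shakespeare", "liked", "writing", "comedies", "."])
for state, link in steps:
    print(f'"{state}" + "{link}"')
```

Each iteration corresponds to one rule application: the state grows by exactly one link, always on the right.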

To apply a rule, it is not sufficient to know the spelling of a word or an allomorph. A rule also requires morphological and syntactic information, such as word class, gender, or the meaning of a suffix. This information, which is associated with an element of an utterance (a sentence, a word form, or an allomorph), is called its feature structure. The analysis of a sentence or a word form returns such a feature structure as its result.
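
As a rough illustration, a feature structure can be pictured as a set of attribute-value pairs. The sketch below models it as a Python dictionary; this is not Malaga's own notation, and the attribute names and values (`class`, `tense`, `number`) are invented for the example.

```python
# Hypothetical feature structures; attribute names and values are
# illustrative assumptions, not taken from any real Malaga lexicon.
liked = {"surface": "liked", "class": "verb", "tense": "past"}
comedies = {"surface": "comedies", "class": "noun", "number": "plural"}


def has_class(feature_structure, word_class):
    """Logical test of the kind a rule might apply to a feature structure."""
    return feature_structure.get("class") == word_class
```

A rule would use tests like `has_class(liked, "verb")` to decide whether two parts may be combined, rather than looking at the spelling alone.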

Now let us take a closer look at how a sentence is analysed.

  1. Before we can start to analyse a sentence, the analysis automaton must be in an initial state. The initial state includes:
  2. The next word form that is going to be added is read and analysed morphologically. If there is no valid word form, the analysis process aborts.
  3. The feature structure that morphology assigns to this word form is called the link's feature structure. The feature structure of the input that has been analysed syntactically so far is called the state's feature structure.
  4. The active combination rule checks whether the state's surface (which may be empty if the rule is operating on the initial state) may be combined with the link, i.e., the next word form. The combination rule takes the feature structures of the state and of the link as parameters. They can be compared by logical tests, and finally the rule constructs the feature structure of the successor state (whose surface includes the word form that has been read). The rule also specifies which successor rule is active in the successor state. Execution then continues at step 2.

    Instead of specifying a successor rule, a rule can also accept the analysed sentence. In that case, the feature structure of the successor state will be used as the feature structure of the complete analysed sentence.
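
The steps above can be sketched as a small loop in Python. This is a simplified illustration under several assumptions: the lexicon, the three rules, and all names (`LEXICON`, `RULES`, `analyse`, `ACCEPT`) are hypothetical, the initial state is taken to be an empty feature structure plus a start rule, and exactly one rule is active at a time.

```python
ACCEPT = "accept"  # marker: the rule accepts instead of naming a successor

# Toy lexicon standing in for morphological analysis (hypothetical entries).
LEXICON = {
    "Shakespeare": {"class": "name"},
    "liked": {"class": "verb"},
    ".": {"class": "period"},
}


# Each rule takes the state's and the link's feature structures and returns
# (successor feature structure, successor rule), or None if it does not apply.
def start(state_fs, link_fs):
    if link_fs["class"] == "name":
        return {"subject": link_fs}, "after_subject"
    return None


def after_subject(state_fs, link_fs):
    if link_fs["class"] == "verb":
        return {**state_fs, "verb": link_fs}, "after_verb"
    return None


def after_verb(state_fs, link_fs):
    if link_fs["class"] == "period":
        return state_fs, ACCEPT
    return None


RULES = {"start": start, "after_subject": after_subject, "after_verb": after_verb}


def analyse(word_forms):
    state_fs, rule = {}, "start"            # step 1: initial state (assumed)
    for word in word_forms:
        if rule == ACCEPT:
            return None                     # input continues after acceptance
        link_fs = LEXICON.get(word)         # steps 2 and 3: look up the link
        if link_fs is None:
            return None                     # no valid word form: abort
        result = RULES[rule](state_fs, link_fs)  # step 4: apply active rule
        if result is None:
            return None                     # combination not allowed
        state_fs, rule = result
    return state_fs if rule == ACCEPT else None
```

Calling `analyse(["Shakespeare", "liked", "."])` returns the feature structure built up along the way, while an unknown word form or a disallowed combination makes the analysis return `None`.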

Morphological analysis operates analogously, except that it analyses a word form composed of allomorphs. The link (step 2) is looked up in the allomorph lexicon.

This sketch is of course simplified. There can be ambiguities in an analysis, induced by several causes:

These ambiguities are handled by splitting the analysis into several subanalyses. If there are two lexicon entries for a word form, for example, the analysis continues with the first entry (and its feature structure) as well as with the second one, like a path that branches. The analyses continue independently of each other, so one analysis path can accept the input while the other fails. Each analysis path can split again whenever further ambiguities are met. If several analysis paths continue until they accept, the analysis process returns more than one result.
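
The branching behaviour can be sketched in Python by keeping a list of analysis paths instead of a single state. The sketch covers only lexical ambiguity (several lexicon entries for one word form); the toy lexicon, the rules, and the names `analyse_all`, `LEXICON`, and `RULES` are assumptions made for this illustration.

```python
ACCEPT = "accept"

# Hypothetical lexicon: "books" has two entries, so the analysis branches.
LEXICON = {
    "the": [{"class": "det"}],
    "books": [{"class": "noun"}, {"class": "verb"}],
    ".": [{"class": "period"}],
}


# Each rule returns a list of (feature structure, successor rule) pairs;
# an empty list means the path fails at this point.
def start(state_fs, link_fs):
    if link_fs["class"] == "det":
        return [({"det": True}, "after_det")]
    if link_fs["class"] in ("noun", "verb"):
        return [({"head": link_fs["class"]}, "end")]
    return []


def after_det(state_fs, link_fs):
    if link_fs["class"] == "noun":
        return [({"head": "noun"}, "end")]
    return []


def end(state_fs, link_fs):
    if link_fs["class"] == "period":
        return [(state_fs, ACCEPT)]
    return []


RULES = {"start": start, "after_det": after_det, "end": end}


def analyse_all(word_forms):
    paths = [({}, "start")]  # each path: (feature structure, active rule)
    for word in word_forms:
        next_paths = []
        for state_fs, rule in paths:
            if rule == ACCEPT:
                continue  # accepted too early: this path fails
            # One branch per lexicon entry of the next word form.
            for link_fs in LEXICON.get(word, []):
                next_paths.extend(RULES[rule](state_fs, link_fs))
        paths = next_paths
    return [fs for fs, rule in paths if rule == ACCEPT]
```

With this toy grammar, `analyse_all(["books", "."])` yields two results, one per reading of "books", whereas `analyse_all(["the", "books", "."])` yields only one, because the path that reads "books" as a verb fails after the determiner.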