Edinburgh Speech Tools  2.1-release
Example code for Linguistic Classes

Some examples of usage of linguistic classes

Adding basic information to an EST_Item

An item such as:

\[ \left [ \begin{array}{ll} \mbox{POS} & \mbox{\emph{Noun}} \\ \mbox{NAME} & \mbox{\emph{example}} \\ \mbox{FOCUS} & \mbox{+} \\ \end{array} \right ] \]

is constructed as follows. (Note that the attributes are in capitals by linguistic convention only: attribute names are case-sensitive and can be upper or lower case.)

//@ code
EST_Item p;
p.set("POS", "Noun");
p.set("NAME", "example");
p.set("FOCUS", "+");
p.set("DURATION", 2.76);
p.set("STRESS", 2);
//@ endcode

The values in features are of type EST_Val, a union-like class that can store ints, floats, EST_Strings, void pointers, and EST_Features. Because C++ supports function overloading, EST_Item::set() can be used for all of these.
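
The overloading idea can be illustrated outside the library. The following is a minimal standalone sketch, not the EST implementation: Value and FeatureSet are hypothetical stand-ins for EST_Val and EST_Item, and only three of the value types are modelled.

```cpp
#include <map>
#include <string>

// Hypothetical stand-in for a tagged value; this is not the real EST_Val.
struct Value {
    enum Kind { Int, Float, String };
    Kind kind;
    int i;          // meaningful when kind == Int
    float f;        // meaningful when kind == Float
    std::string s;  // meaningful when kind == String
};

// Hypothetical stand-in for an item; this is not the real EST_Item.
class FeatureSet {
public:
    // One overload per value type: the compiler selects the match from the
    // argument, so callers always write set(name, value).
    void set(const std::string &name, int v) {
        Value val = {Value::Int, v, 0.0f, ""};
        values_[name] = val;
    }
    void set(const std::string &name, double v) {
        Value val = {Value::Float, 0, static_cast<float>(v), ""};
        values_[name] = val;
    }
    void set(const std::string &name, const char *v) {
        Value val = {Value::String, 0, 0.0f, v};
        values_[name] = val;
    }

    // Report which type was stored (throws std::out_of_range if unset).
    Value::Kind kind(const std::string &name) const {
        return values_.at(name).kind;
    }

    // Read back the stored value (throws std::out_of_range if unset).
    const Value &get(const std::string &name) const {
        return values_.at(name);
    }

private:
    std::map<std::string, Value> values_;
};
```

In this sketch the type tag is recorded at the moment of the set() call, which is why a reader of the value later has to say which type is wanted.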

Accessing basic information in an Item

When accessing the features, the type must be specified. This is done most easily by using a series of functions whose return type is encoded by a capital letter:

//@ code
cout << "Part of speech for p is " << p.S("POS") << endl;
cout << "Duration for p is " << p.F("DURATION") << endl;
cout << "Stress value for p is " << p.I("STRESS") << endl;
//@ endcode

  Output:
    Part of speech for p is Noun
    Duration for p is 2.76
    Stress value for p is 2

An optional default value can be given if a result is always desired:

//@ code
cout << "Part of speech for p is "
<< p.S("POS") << endl;
cout << "Syntactic Category for p is "
     << p.S("CAT", "Noun") << endl; // no error even if "CAT" is unset: the default is returned
//@ endcode
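
The difference between the two lookup forms can be sketched in isolation. This is a standalone illustration under stated assumptions, not the EST code: StringFeatures is a hypothetical class holding only string values, and the strict form is assumed to signal a missing feature by throwing.

```cpp
#include <map>
#include <stdexcept>
#include <string>

// Hypothetical string-only feature store; this is not the real EST_Item.
class StringFeatures {
public:
    void set(const std::string &name, const std::string &value) {
        values_[name] = value;
    }

    // Strict lookup: a missing feature is treated as an error.
    const std::string &S(const std::string &name) const {
        auto it = values_.find(name);
        if (it == values_.end())
            throw std::out_of_range("no feature named " + name);
        return it->second;
    }

    // Lenient lookup: a missing feature yields the caller's default.
    std::string S(const std::string &name, const std::string &def) const {
        auto it = values_.find(name);
        return it == values_.end() ? def : it->second;
    }

private:
    std::map<std::string, std::string> values_;
};
```

The default is only used when the feature is absent; a feature that is present always wins over the default.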

Nested feature structures in items

Nested feature structures such as

\[ \left [ \begin{array}{ll} \mbox{NAME} & \mbox{\emph{d}} \\ \mbox{PLACE OF ARTICULATION \boxed{1} } & \left [ \begin{array}{ll} \mbox{CORONAL} & \mbox{\emph{+}} \\ \mbox{ANTERIOR} & \mbox{\emph{+}} \\ \end{array} \right ] \\ \mbox{VOICE} & \mbox{\emph{+}} \\ \mbox{CONTINUANT} & \mbox{\emph{--}} \\ \mbox{SONORANT} & \mbox{\emph{--}} \\ \end{array} \right ] \]

can be created in a number of ways:

//@ code
p.set("NAME", "d");
p.set("VOICE", "+");
p.set("CONTINUANT", "-");
p.set("SONORANT", "-");
EST_Features f; // empty feature set
p.set("PLACE OF ARTICULATION", f); // copy in empty feature set here
p.A("PLACE OF ARTICULATION").set("CORONAL", "+");
p.A("PLACE OF ARTICULATION").set("ANTERIOR", "+");
//@ endcode

or by filling the values in an EST_Features object and copying it in:

//@ code
EST_Features f2;
f2.set("CORONAL", "+");
f2.set("ANTERIOR", "+");
p.set("PLACE OF ARTICULATION", f2);
//@ endcode

Nested features can be accessed by multiple calls to the accessing commands:

//@ code
cout << "Anterior value is: " << p.A("PLACE OF ARTICULATION").S("ANTERIOR") << endl;
cout << "Coronal value is: " << p.A("PLACE OF ARTICULATION").S("CORONAL") << endl;
//@ endcode

The first command is EST_Item::A() because PLACE OF ARTICULATION is a feature structure, and the second command is EST_Item::S() because it returns a string (the value of ANTERIOR or CORONAL). A shorthand is provided to extract the value in a single statement:

//@ code
cout << "Anterior value is: " << p.S("PLACE OF ARTICULATION.ANTERIOR") << endl;
cout << "Coronal value is: " << p.S("PLACE OF ARTICULATION.CORONAL") << endl;
//@ endcode

Again, as the last value to be returned is a string, EST_Item::S() must be used. This shorthand can also be used to set the features:

//@ code
p.set("PLACE OF ARTICULATION.CORONAL", "+");
p.set("PLACE OF ARTICULATION.ANTERIOR", "+");
//@ endcode

This is the easiest and most commonly used method.
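
How a dotted name can be resolved against a nested structure is sketched below. This is a standalone illustration, not the EST implementation: Features, set_path() and get_path() are hypothetical names, values are strings only, and a '.' is assumed to be the only separator.

```cpp
#include <map>
#include <memory>
#include <string>

// Hypothetical nested feature structure; this is not the real EST_Features.
struct Features {
    std::map<std::string, std::string> values;                // leaf values
    std::map<std::string, std::unique_ptr<Features>> nested;  // sub-structures
};

// Set the value at a dotted path, creating intermediate structures as needed.
void set_path(Features &f, const std::string &path, const std::string &value) {
    std::string::size_type dot = path.find('.');
    if (dot == std::string::npos) {  // last component: store the value
        f.values[path] = value;
        return;
    }
    std::unique_ptr<Features> &child = f.nested[path.substr(0, dot)];
    if (!child)
        child.reset(new Features());
    set_path(*child, path.substr(dot + 1), value);
}

// Return the value at a dotted path, or def if any component is missing.
std::string get_path(const Features &f, const std::string &path,
                     const std::string &def) {
    std::string::size_type dot = path.find('.');
    if (dot == std::string::npos) {  // last component: read the value
        auto it = f.values.find(path);
        return it == f.values.end() ? def : it->second;
    }
    auto it = f.nested.find(path.substr(0, dot));
    if (it == f.nested.end())
        return def;
    return get_path(*it->second, path.substr(dot + 1), def);
}
```

Each recursive call consumes one path component, so a name with two dots descends through two nested structures before the final value is read or written.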

Utility functions for items

The presence of an attribute can be checked using EST_Item::f_present(), which returns true if the attribute is in the item:

//@ code
cout << "This is true: " << p.f_present("PLACE OF ARTICULATION") << endl;
cout << "This is false: " << p.f_present("MANNER") << endl;
//@ endcode

An attribute can be removed with EST_Item::f_remove():

//@ code
p.f_remove("PLACE OF ARTICULATION");
//@ endcode

Building a linear list relation

It is standard to store the phones for an utterance as a linear list in an EST_Relation object. Each phone is represented by one EST_Item, whereas the complete list is stored as an EST_Relation.

The easiest way to build a linear list is with EST_Relation::append(), which, when called without arguments, makes a new empty EST_Item, adds it to the end of the relation, and returns a pointer to it. The information relevant to that phone can then be added to the returned item.

//@ code
EST_Relation phones;
EST_Item *a = phones.append();
a->set("NAME", "f");
a->set("TYPE", "consonant");
a = phones.append();
a->set("NAME", "o");
a->set("TYPE", "vowel");
a = phones.append();
a->set("NAME", "r");
a->set("TYPE", "consonant");
//@ endcode

Note that the -> operator is used because a is a pointer to an EST_Item here. The same pointer variable can be used multiple times because every time EST_Relation::append() is called it allocates a new item and returns a pointer to it.

If you already have an EST_Item pointer and want to add it to a relation, you can pass it as an argument to EST_Relation::append(). This is generally inadvisable, as it involves some unnecessary copying, and you have to allocate the memory for the next EST_Item yourself every time (if you don't, you will overwrite the previous one):

//@ code
a = new EST_Item;
a->set("NAME", "m");
a->set("TYPE", "consonant");
phones.append(a);
a = new EST_Item;
a->set("NAME", "ei");
a->set("TYPE", "vowel");
//@ endcode

Items can be prepended in exactly the same way:

//@ code
a = phones.prepend();
a->set("NAME", "n");
a->set("TYPE", "consonant");
a = phones.prepend();
a->set("NAME", "i");
a->set("TYPE", "vowel");
//@ endcode
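
The reason one pointer variable can be reused across append() and prepend() calls is that the container, not the caller, owns each new element. The following standalone sketch illustrates that ownership pattern; Node and List are hypothetical stand-ins, not the real EST_Item and EST_Relation.

```cpp
#include <memory>
#include <string>
#include <vector>

// Hypothetical list node; this is not the real EST_Item.
struct Node {
    std::string name;
    Node *next;
    Node *prev;
    Node() : next(nullptr), prev(nullptr) {}
};

// Hypothetical list container; this is not the real EST_Relation.
class List {
public:
    List() : head_(nullptr), tail_(nullptr) {}

    // Allocate a fresh node, link it at the end, and return a pointer to it.
    Node *append() {
        Node *n = make_node();
        n->prev = tail_;
        if (tail_) tail_->next = n; else head_ = n;
        tail_ = n;
        return n;
    }

    // Allocate a fresh node, link it at the start, and return a pointer to it.
    Node *prepend() {
        Node *n = make_node();
        n->next = head_;
        if (head_) head_->prev = n; else tail_ = n;
        head_ = n;
        return n;
    }

    Node *head() const { return head_; }
    Node *tail() const { return tail_; }

private:
    Node *make_node() {
        storage_.push_back(std::unique_ptr<Node>(new Node()));
        return storage_.back().get();
    }

    std::vector<std::unique_ptr<Node>> storage_;  // the list owns every node
    Node *head_;
    Node *tail_;
};

// Concatenate the names from head to tail, stopping at the null pointer.
std::string forward_names(const List &l) {
    std::string out;
    for (Node *n = l.head(); n; n = n->next)
        out += n->name;
    return out;
}
```

Overwriting the caller's pointer loses nothing in this sketch, because the list still holds every node it has handed out.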

Iterating through a linear list relation

Iteration in lists is performed with EST_Relation::next() and EST_Relation::prev(), using an EST_Item pointer as the iteration variable.

//@ code
EST_Item *s;
for (s = phones.head(); s != 0; s = s->next())
    cout << "name:" << s->S("NAME") << "    type:" << s->S("TYPE") << endl;
//@ endcode

Output:

  name:i    type:vowel
  name:n    type:consonant
  name:f    type:consonant
  name:o    type:vowel
  name:r    type:consonant
  name:m    type:consonant

//@ code
for (s = phones.tail(); s != 0; s = s->prev())
    cout << "name:" << s->S("NAME") << "    type:" << s->S("TYPE") << endl;
//@ endcode

Output:

  name:m    type:consonant
  name:r    type:consonant
  name:o    type:vowel
  name:f    type:consonant
  name:n    type:consonant
  name:i    type:vowel

EST_Relation::head() and EST_Relation::tail() return EST_Item pointers to the start and end of the list. EST_Relation::next() and EST_Relation::prev() return the next or previous item in the list, and return 0 when the end or start of the list is reached. Hence checking for 0 is a useful termination condition for the iteration. Taking advantage of C shorthand allows us to write:

//@ code
for (s = phones.head(); s; s = s->next())
    cout << s->S("NAME") << endl;
//@ endcode

Building a tree relation

It is standard to store information such as syntax as a tree in an EST_Relation object. Each tree node is represented by one EST_Item, whereas the complete tree is stored as an EST_Relation.

The easiest way to build a tree is with EST_Relation::append_daughter(), which, when called without arguments, makes a new empty EST_Item, adds it as a daughter to an existing item, and returns a pointer to it. The information relevant to that node can then be added to the returned item. The root node of the tree must be added directly to the EST_Relation.

//@ code
//@example prog01
EST_Relation tree;
EST_Item *r, *np, *vp, *n;
r = tree.append();
r->set("CAT", "S");
np = append_daughter(r);
np->set("CAT", "NP");
n = append_daughter(np);
n->set("CAT", "PRO");
n->set("NAME", "John");
vp = append_daughter(r);
vp->set("CAT", "VP");
n = append_daughter(vp);
n->set("CAT", "VERB");
n->set("NAME", "loves");
np = append_daughter(vp);
np->set("CAT", "NP");
n = append_daughter(np);
n->set("CAT", "DET");
n->set("NAME", "the");
n = append_daughter(np);
n->set("CAT", "NOUN");
n->set("NAME", "woman");
cout << tree;
//@ endcode

Output:

(S
   (NP
      (PRO John)
   )
   (VP
      (VERB loves)
      (NP
         (DET the)
         (NOUN woman))
   )
)

Obviously, the use of recursive functions in building trees is more efficient and would eliminate the need for the large number of temporary variables used in the above example.
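
One way recursion removes the temporary variables is to build the tree from a bracketed description. The sketch below is a standalone illustration, not EST code: TreeNode, parse_node() and collect_leaves() are hypothetical names, and the input format "(CAT word)" or "(CAT (CAT ...) ...)" is an assumption made for the example.

```cpp
#include <cctype>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical tree node; this is not the real EST_Item.
struct TreeNode {
    std::string cat;   // category label, e.g. "NP"
    std::string name;  // word, set on leaves only
    std::vector<std::unique_ptr<TreeNode>> daughters;
};

static void skip_spaces(const std::string &s, std::size_t &pos) {
    while (pos < s.size() && std::isspace(static_cast<unsigned char>(s[pos])))
        ++pos;
}

// Read characters up to the next space or bracket.
static std::string read_token(const std::string &s, std::size_t &pos) {
    skip_spaces(s, pos);
    std::size_t start = pos;
    while (pos < s.size() && s[pos] != '(' && s[pos] != ')' &&
           !std::isspace(static_cast<unsigned char>(s[pos])))
        ++pos;
    return s.substr(start, pos - start);
}

// Parse one "(CAT ...)" group starting at pos; returns null if malformed.
std::unique_ptr<TreeNode> parse_node(const std::string &s, std::size_t &pos) {
    skip_spaces(s, pos);
    if (pos >= s.size() || s[pos] != '(')
        return nullptr;
    ++pos;  // consume '('
    std::unique_ptr<TreeNode> node(new TreeNode());
    node->cat = read_token(s, pos);
    skip_spaces(s, pos);
    while (pos < s.size() && s[pos] == '(') {  // each '(' opens a daughter
        std::unique_ptr<TreeNode> d = parse_node(s, pos);
        if (!d)
            return nullptr;
        node->daughters.push_back(std::move(d));
        skip_spaces(s, pos);
    }
    if (node->daughters.empty())  // no daughters: the rest is the word
        node->name = read_token(s, pos);
    skip_spaces(s, pos);
    if (pos >= s.size() || s[pos] != ')')
        return nullptr;
    ++pos;  // consume ')'
    return node;
}

// Gather the words at the leaves, left to right.
void collect_leaves(const TreeNode &n, std::vector<std::string> &out) {
    if (n.daughters.empty()) {
        out.push_back(n.name);
        return;
    }
    for (const auto &d : n.daughters)
        collect_leaves(*d, out);
}
```

The call stack takes the place of the r, np, vp and n variables: each level of recursion holds exactly one node under construction.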

Iterating through a tree relation

Iteration in trees is done with EST_Relation::daughter1(), EST_Relation::daughter2(), EST_Relation::daughtern() and EST_Relation::parent(). Pre-order traversal can be achieved iteratively as follows:

//@ code
n = tree.head(); // initialise iteration variable to head of tree
while (n)
{
    cout << *n; // visit the current node before moving on
    if (daughter1(n) != 0) // if daughter exists, make n its daughter
        n = daughter1(n);
    else if (n->next() != 0) // otherwise visit its sisters
        n = n->next();
    else // if no sisters are left, go back up the tree
    {    // until a sister to a parent is found
        bool found = false;
        for (EST_Item *pp = parent(n); pp != 0; pp = parent(pp))
            if (pp->next())
            {
                n = pp->next();
                found = true;
                break;
            }
        if (!found)
            n = 0; // nothing left to visit: the while condition ends the loop
    }
}
//@ endcode
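
The same traversal logic can be tried without the library. The following standalone sketch is an illustration only: TNode, add_daughter(), preorder_next() and preorder() are hypothetical names, and the node stores just the three links the algorithm needs.

```cpp
#include <string>
#include <vector>

// Hypothetical tree node with parent, first-daughter and sister links.
struct TNode {
    std::string label;
    TNode *parent;
    TNode *first_daughter;
    TNode *next;  // next sister
    explicit TNode(const std::string &l)
        : label(l), parent(nullptr), first_daughter(nullptr), next(nullptr) {}
};

// Link child as the last daughter of parent.
void add_daughter(TNode &parent, TNode &child) {
    child.parent = &parent;
    if (!parent.first_daughter) {
        parent.first_daughter = &child;
        return;
    }
    TNode *last = parent.first_daughter;
    while (last->next)
        last = last->next;
    last->next = &child;
}

// Pre-order successor of n, or null when the traversal is finished.
// Intended to be started from the root of the tree.
TNode *preorder_next(TNode *n) {
    if (n->first_daughter)        // go down first
        return n->first_daughter;
    for (; n; n = n->parent)      // otherwise find the nearest sister,
        if (n->next)              // climbing towards the root as needed
            return n->next;
    return nullptr;
}

// Labels of all nodes in pre-order, starting at root.
std::vector<std::string> preorder(TNode *root) {
    std::vector<std::string> out;
    for (TNode *n = root; n; n = preorder_next(n))
        out.push_back(n->label);
    return out;
}
```

Because each node is recorded before its successor is computed, the root appears first and every mother precedes her daughters.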

A special set of iterators is available for traversal of the leaf (terminal) nodes of a tree:

//@ code
//@ example prog02
//@ title Leaf iteration
for (s = first_leaf(tree.head()); s != last_leaf(tree.head());
     s = next_leaf(s))
    cout << s->S("NAME") << endl;
cout << s->S("NAME") << endl; // the loop stops on reaching the last leaf, so print it here
//@ endcode
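
Leaf iteration can be sketched on a plain linked tree as well. This is a standalone illustration, not the EST implementation: LNode and the local helpers first_leaf(), next_leaf() and leaf_names() are hypothetical, and here next_leaf() is assumed to return null after the last leaf so that the loop can visit every leaf.

```cpp
#include <string>
#include <vector>

// Hypothetical tree node; this is not the real EST_Item.
struct LNode {
    std::string name;
    LNode *parent;
    LNode *first_daughter;
    LNode *next;  // next sister
    explicit LNode(const std::string &n)
        : name(n), parent(nullptr), first_daughter(nullptr), next(nullptr) {}
};

// Link child as the last daughter of parent.
void add_daughter(LNode &parent, LNode &child) {
    child.parent = &parent;
    if (!parent.first_daughter) {
        parent.first_daughter = &child;
        return;
    }
    LNode *last = parent.first_daughter;
    while (last->next)
        last = last->next;
    last->next = &child;
}

// Leftmost leaf below n (n itself if it has no daughters).
LNode *first_leaf(LNode *n) {
    while (n->first_daughter)
        n = n->first_daughter;
    return n;
}

// Leaf following n in left-to-right order, or null after the last leaf.
LNode *next_leaf(LNode *n) {
    for (; n; n = n->parent)
        if (n->next)
            return first_leaf(n->next);
    return nullptr;
}

// Names of all leaves below root, left to right.
std::vector<std::string> leaf_names(LNode *root) {
    std::vector<std::string> out;
    for (LNode *s = first_leaf(root); s; s = next_leaf(s))
        out.push_back(s->name);
    return out;
}
```

Terminating on the null pointer rather than on a comparison with the last leaf means the final leaf is handled inside the loop body like all the others.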

Building a multi-linear relation

This is not yet fully implemented?

Iterating through a multi-linear relation

This is not yet fully implemented?

Relations in Utterances

The EST_Utterance class is used to store all the items and relations relevant to a single utterance. (Here "utterance" is used as a general linguistic entity; it doesn't have to correspond to a well-formed, complete linguistic unit such as a sentence or phrase.)

Instead of storing relations separately, they are stored in utterances:

//@ code
EST_Utterance utt;
utt.create_relation("Word");
utt.create_relation("Syntax");
//@ endcode

EST_Relations can be accessed through the utterance object either directly or by use of a temporary EST_Relation pointer:

//@ code
EST_Relation *word, *syntax;
word = utt.relation("Word");
syntax = utt.relation("Syntax");
//@ endcode

The contents of the relation can be filled by the methods described above.

Adding items into multiple relations

A major aspect of this system is that an item can be in two relations at once, as shown in Figure 6-2.

The following example, using the syntax relation already created in prog01, shows how to put the terminal nodes of this tree into a word relation:

//@ code
//@example prog03
//@title adding existing items to a new relation
word = utt.relation("Word");
syntax = utt.relation("Syntax");
for (s = first_leaf(syntax->head()); s != last_leaf(syntax->head());
     s = next_leaf(s))
    word->append(s);
word->append(s); // the loop stops on reaching the last leaf, so append it here
//@ endcode

Thus the terminal nodes in the syntax relation are now stored as a linear list in the word relation.

Hence

//@ code
cout << *utt.relation("Syntax") << "\n";
//@ endcode

produces

Output:

(S
   (NP
      (PRO John)
   )
   (VP
      (VERB loves)
      (NP
         (DET the)
         (NOUN woman))
   )
)

whereas

//@ code
cout << *utt.relation("Word") << "\n";
//@ endcode

produces

Output:

John
loves
the
woman

Changing the relation an item is in

Even if an item is in more than one relation, it always has the notion of a "current" relation. If the traversal functions (next, prev, parent, etc.) are called, traversal always occurs with respect to the current relation. An item's current relation can be changed as follows:

//@ code
s = utt.relation("Word")->head(); // set s to first word
s = next(s); // get next word: s = parent(s) would throw an error as there
             // is no parent to s in the word relation.
s = prev(s); // get previous word
s = s->as_relation("Syntax"); // change relation.
s = parent(s); // get parent of s in syntax relation
s = daughter1(s); // get first daughter of s: s = next(s) would throw an
                  // error as there is no next to s in the syntax relation.
//@ endcode

Although s is still the same item, its current relation is now "Syntax". The current relation is returned by the EST_Item::relation() function:

//@ code
cout << "Name of current relation: " << s->relation()->name() << endl;
//@ endcode

If you aren't sure whether an item is in a relation, you can check with EST_Item::in_relation(). This will return true if an item is in the requested relation regardless of what the current relation is.

//@ code
cout << "s is in the word relation: " << s->in_relation("Word") << endl;
cout << "Relations: " << s->relations() << endl;
//@ endcode
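
One way to picture an item with a "current" relation is as a lightweight view onto contents that are shared between relations. The sketch below is a standalone illustration built on that assumption; it is not taken from the EST sources, and Contents, View and the free functions as_relation() and in_relation() are hypothetical names.

```cpp
#include <map>
#include <string>

struct Contents;  // features shared by every view of one item

// Hypothetical handle: one view of an item inside one named relation.
struct View {
    std::string relation;  // the "current" relation of this handle
    Contents *contents;    // shared by all views of the same item
    View *next;            // traversal links exist per relation
    View *parent;
    View(const std::string &rel, Contents *c);
};

// Hypothetical shared part: the features plus an index of the views.
struct Contents {
    std::string name;
    std::map<std::string, View *> views;  // relation name -> view
};

// Creating a view registers it with the shared contents.
inline View::View(const std::string &rel, Contents *c)
    : relation(rel), contents(c), next(nullptr), parent(nullptr) {
    c->views[rel] = this;
}

// The same item seen through another relation, or null if it is not in it.
View *as_relation(View *v, const std::string &rel) {
    auto it = v->contents->views.find(rel);
    return it == v->contents->views.end() ? nullptr : it->second;
}

// True if the item behind v is in rel, whatever v's current relation is.
bool in_relation(const View *v, const std::string &rel) {
    return v->contents->views.count(rel) != 0;
}
```

Under this model, changing relation swaps the traversal links but leaves the features untouched, which is why the item remains "the same item" after the switch.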

Feature functions

evaluate functions

setting functions