My research interests are Machine Learning in Natural Language Processing, Automated Reasoning, and Knowledge Representation. More specifically, I mainly work on modelling human-human and human-machine spoken dialogue, and on grammatical inference, a branch of unsupervised machine learning:
  • Dialogue Modeling

    I am working on the modelling of human-human and human-machine spoken dialogue.

    • Dialogue act recognition

      A topic I am working on is the recognition of dialogue acts (with a focus on those defined in the DIT++ taxonomy), mainly in task-oriented human-human and human-machine spoken dialogue.

    • Dialogue management

      I am looking at dialogue management strategies and testing my models in a dialogue system that I am developing in parallel.

    To support tagset annotation and evaluation with multiple annotators, I have developed a flexible web-based tool, DitAT, that distributes and collects data over a LAN or the Internet and calculates statistics such as inter-annotator agreement (see: Cohen's Kappa web demo). It has been used successfully in experimental setups and in classroom settings.
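As an illustration of the kind of agreement statistic mentioned above, here is a minimal sketch of Cohen's kappa for two annotators. It is not DitAT's code; the function name `cohen_kappa` and the example labels are hypothetical and chosen only for illustration. Kappa compares the observed agreement with the agreement expected by chance, given each annotator's label distribution.

```python
from collections import Counter


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators who labelled the same items."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("need two non-empty label lists of equal length")
    n = len(labels_a)

    # Observed agreement: fraction of items given the same label by both.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n

    # Chance agreement: for each label, the product of the two annotators'
    # marginal probabilities, summed over all labels.
    counts_a = Counter(labels_a)
    counts_b = Counter(labels_b)
    expected = sum(counts_a[label] * counts_b[label] for label in counts_a) / (n * n)

    if expected == 1.0:
        # Both annotators used one identical label throughout.
        return 1.0
    return (observed - expected) / (1.0 - expected)


# Hypothetical dialogue act labels for four utterances.
annotator_1 = ["inform", "question", "inform", "answer"]
annotator_2 = ["inform", "question", "answer", "answer"]
kappa = cohen_kappa(annotator_1, annotator_2)
print(round(kappa, 3))  # 0.636
```

In this example the annotators agree on 3 of 4 items (0.75), chance agreement is 5/16, and kappa is therefore 7/11.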

  • Grammatical Inference

    Another topic I am working on is the induction of structure in symbolic sequential data, with a focus on structure in natural language and in music.

    • Alignment-based learning

      I have been working on extending, optimizing, and applying the Alignment-Based Learning (ABL) framework, a grammar induction framework that induces structure from symbolic sequences (see: ABL web demo and info). One major extension was the introduction of a new and efficient alignment learning algorithm based on generalized suffix trees (GSTs) (see: GST web demo).

      In 2006 I worked on preparing a first public release of an ABL implementation, combining prior work of Menno van Zaanen with my algorithms and extensions.

    • Machine translation

      With Menno van Zaanen I worked on applying grammar induction in machine translation (MT). We have developed TABL (Translation using ABL), a system that induces syntactic and syntactic-semantic structure in two languages and learns a mapping between these structures in order to automatically learn MT systems from multi-lingual corpora.

    • Composer classification

      A classifier is presented and evaluated that uses structure in musical pieces that has been found using alignment-based learning. The system is able to learn composer characteristics from example pieces and uses this information to classify unseen musical pieces into classes representing composers.
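The alignment idea underlying the items above can be sketched roughly as follows. This is a simplified illustration, not the ABL implementation and not the GST-based algorithm: it swaps in Python's standard `difflib.SequenceMatcher` for the alignment step, and the function name `hypothesize_constituents` is hypothetical. Two sequences are aligned, and the parts that differ between shared stretches are proposed as interchangeable candidate constituents.

```python
from difflib import SequenceMatcher


def hypothesize_constituents(sent_a, sent_b):
    """Align two sentences and return the pairs of differing word spans."""
    words_a, words_b = sent_a.split(), sent_b.split()
    # autojunk=False keeps every token in play, whatever its frequency.
    matcher = SequenceMatcher(a=words_a, b=words_b, autojunk=False)

    hypotheses = []
    for tag, a_start, a_end, b_start, b_end in matcher.get_opcodes():
        # Spans that are not shared are candidate constituents
        # that can substitute for each other in the shared context.
        if tag != "equal":
            hypotheses.append((words_a[a_start:a_end], words_b[b_start:b_end]))
    return hypotheses


pairs = hypothesize_constituents(
    "show me flights from Boston to Denver",
    "show me flights from Dallas to Denver",
)
print(pairs)  # [(['Boston'], ['Dallas'])]
```

The same procedure applies to any symbolic sequence, for example note sequences, as long as the items can be compared for equality.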