Natural Language Generation

"Don't talk unless you can improve the silence." -- Jorge Luis Borges (1899-1986)

Borges set a very high bar with the quotation cited above. While computers nowadays employ a substantial amount of text as part of their means of communicating with their users, most of it is pre-written by people (see Dr. Reiter's Templates vs. NLG INLG'95 paper). Why is that? Well, as argued by Sean M. Burke in his Perl Journal article, people consider that computers are "smart" and that language is "easy". These are both major misconceptions, but, like the quotation from Borges, they set very high expectations for NLG. In fact, end users' expectations are so high that most of the text is human-authored, and it therefore leaves very little room for flexibility (but is of very high quality).

My interest in NLG started at the end of my undergraduate studies, when I came into contact with Definite Clause Grammars. I went to CU with the intention of doing a PhD in the field, and I had the opportunity to work with Prof. Kathy McKeown as my advisor.

If you are interested in a gentle introduction to NLG, I gave a colorful tutorial at BarCampNYC4, which is available for download from my talks page.

My interests

Of the different sub-problems of NLG, I am interested in the more conceptual areas: the ways to organize the ideas that make up the text. In NLG parlance this is known as Content Planning, Text Planning, Document Planning or Microplanning. Content Planning was thus the subject of my Candidacy Exam (December 11th, 2001) and of my PhD Dissertation.

Again, because the NLG problem is harder than it seems a priori, my research focuses on the (semi)automatic construction of NLG systems. My thesis was precisely on the automatic acquisition, via machine learning, of Content Planning logic. You can see more about my work on my publications page.

The NLG community is small in size (but not in ideas ;-). As an effort to reach out to a larger audience, I started jFUF, a project to migrate FUF/SURGE, a GPL-licensed surface realizer, from Lisp to Java. This project gained some momentum at the beginning but is currently stalled for lack of time on my part. Please contact me if you are interested in coordinating efforts on it.