OpenSchema

OpenSchema provides a declarative implementation of McKeown's (1985) document structuring schemata. Schemata are a widespread strategic generation solution, targeting texts whose structure has become fixed over time (so no reasoning process can re-create it).

OpenSchema is a SourceForge project. The project's unreleased files are available through Git.

[ SourceForge project ] [ Javadoc ] [ GIT ] [ Roadmap ]


What is strategic generation?

Strategic generation is the stage of a generation system that decides what to say, in contrast to the tactical generation component, which decides how to say it. It involves selecting the relevant content to include in the output (Content Selection) and imposing an order on it (Document Structuring).
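The two steps can be pictured with a toy sketch. Everything in it is an assumption made for illustration: the class StrategicSketch, the Fact type, and its relevant and rank fields are hypothetical and are not part of OpenSchema. A real system would decide relevance and order by far richer means than a flag and a number.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Toy illustration only: none of these names come from OpenSchema.
final class StrategicSketch {

    /** Hypothetical knowledge item: its text, a relevance flag, and a position hint. */
    static final class Fact {
        final String text;
        final boolean relevant;
        final int rank;

        Fact(String text, boolean relevant, int rank) {
            this.text = text;
            this.relevant = relevant;
            this.rank = rank;
        }
    }

    /** Content Selection: keep only the facts judged relevant. */
    static List<Fact> selectContent(List<Fact> knowledge) {
        List<Fact> selected = new ArrayList<>();
        for (Fact fact : knowledge) {
            if (fact.relevant) {
                selected.add(fact);
            }
        }
        return selected;
    }

    /** Document Structuring: impose an order on the selected facts. */
    static List<Fact> structureDocument(List<Fact> selected) {
        List<Fact> ordered = new ArrayList<>(selected);
        ordered.sort(Comparator.comparingInt((Fact fact) -> fact.rank));
        return ordered;
    }

    /** Strategic generation as a whole: select, then order, then hand the texts on. */
    static List<String> plan(List<Fact> knowledge) {
        List<String> texts = new ArrayList<>();
        for (Fact fact : structureDocument(selectContent(knowledge))) {
            texts.add(fact.text);
        }
        return texts;
    }
}
```

The list returned by plan is what a tactical component would then turn into sentences.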

The term strategic generation, and the division into two modules that it implies, comes from one of the oldest architectures in NLG. Current generation architectures divide the system into at least three modules: text planning (the strategic generation component), and sentence planning and surface realization (the tactical generation component). The RAGS project is a recent example of a consensus architecture.



What are schemata?

Document Structuring Schemata are a means of doing strategic generation. A schema is a finite state machine whose terminals belong to a language of rhetorical predicates; it operates on a pool of coarsely relevant knowledge (the relevant knowledge pool).

In her work, McKeown analyzed a number of texts in different genres (all expository in nature) and found four organizational strategies that captured a large number of them. She termed these four strategies "schemata" and defined them as

(...) a representation of a standard pattern of discourse structure which efficiently encodes a set of communicative techniques that a speaker can use for a particular discourse purpose. It defines a particular organizing principle for text and is used to structure the information that will be included in the answer. It is used to guide the generation process, controlling decisions of what to say when in the text.

(McKeown, 1985, p.20)

OpenSchema extends McKeown's work by providing a declarative definition of her predicates, which need no longer be rhetorical in nature (communicative predicates is a more appropriate name for them).
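The view of a schema as a finite state machine over predicates can be made concrete with a minimal sketch. It is not the OpenSchema API: the class SchemaSketch, the Arc type, the three-arc schema, and the representation of the relevant knowledge pool as a map from predicate name to ready-made sentences are all assumptions chosen to keep the example short.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Toy illustration only: none of these names come from OpenSchema.
final class SchemaSketch {

    /** One arc: leaving state `from`, emit a fact matching `predicate`, then enter state `to`. */
    static final class Arc {
        final int from;
        final String predicate;
        final int to;

        Arc(int from, String predicate, int to) {
            this.from = from;
            this.predicate = predicate;
            this.to = to;
        }
    }

    // Hypothetical schema: one identification, any number of attributive facts, an optional example.
    static final List<Arc> ARCS = List.of(
            new Arc(0, "identification", 1),
            new Arc(1, "attributive", 1),
            new Arc(1, "example", 2));
    static final int START = 0;
    static final Set<Integer> FINAL_STATES = Set.of(1, 2);

    /**
     * Walks the machine over the pool (predicate name -> facts that instantiate it).
     * At each state the first arc whose predicate still has an unused fact is taken.
     */
    static List<String> run(Map<String, List<String>> pool) {
        // Copy the pool so that each fact is used at most once.
        Map<String, Deque<String>> remaining = new HashMap<>();
        for (Map.Entry<String, List<String>> entry : pool.entrySet()) {
            remaining.put(entry.getKey(), new ArrayDeque<>(entry.getValue()));
        }
        List<String> plan = new ArrayList<>();
        int state = START;
        boolean moved = true;
        while (moved) {
            moved = false;
            for (Arc arc : ARCS) {
                Deque<String> facts = remaining.get(arc.predicate);
                if (arc.from == state && facts != null && !facts.isEmpty()) {
                    plan.add(facts.removeFirst());
                    state = arc.to;
                    moved = true;
                    break;
                }
            }
        }
        if (!FINAL_STATES.contains(state)) {
            throw new IllegalStateException("the pool cannot satisfy the schema");
        }
        return plan;
    }

    public static void main(String[] args) {
        Map<String, List<String>> pool = Map.of(
                "identification", List.of("A ship is a water-going vehicle."),
                "attributive", List.of("It has a hull.", "It has a deck."),
                "example", List.of("The Titanic was a ship."));
        run(pool).forEach(System.out::println);
    }
}
```

Every transition consumes one fact from the pool, so the walk always terminates. The machine fixes the order of the text, while the pool decides which arcs can actually fire.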





By: Pablo A. Duboue - <pablo.duboue@gmail.com>.

Hosted by: SourceForge.net