lemon2gf is a Python script that transforms an ontology and one or more attached lemon lexica into a Grammatical Framework (GF) grammar.
lemon2gf can be downloaded from GitHub.
The conversion proceeds in two steps: mapping the ontology to an abstract syntax, and mapping a corresponding lexicon to a concrete syntax.
The generated GF grammar builds on a core that specifies the following categories:
cat
Class;
Individual Class;
Statement;
In addition, the core comprises domain-independent expressions and constructions, such as determiners, coordination and negation, which a domain lexicon does not provide.
Classes are mapped to constants of category Class,
e.g. the class dbpedia:Mountain becomes:
Mountain : Class;
Individuals are mapped to constants with their RDF type as type
parameter, e.g. resource:Nanga_Parbat becomes:
Nanga_Parbat : Individual Mountain;
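The two mappings above can be sketched in a few lines of Python. This is only an illustration, not lemon2gf's actual code: the helper names local_name, class_judgement and individual_judgement are hypothetical, and the sketch assumes that the GF identifier is simply the local part of the prefixed name or URI.

```python
def local_name(uri: str) -> str:
    """Return the part of a prefixed name or URI after the last '#', '/' or ':'."""
    for separator in ("#", "/", ":"):
        uri = uri.rsplit(separator, 1)[-1]
    return uri


def class_judgement(class_uri: str) -> str:
    """Render a class as a GF constant of category Class (hypothetical helper)."""
    return f"{local_name(class_uri)} : Class;"


def individual_judgement(individual_uri: str, type_uri: str) -> str:
    """Render an individual as a GF constant typed by its RDF type (hypothetical helper)."""
    return f"{local_name(individual_uri)} : Individual {local_name(type_uri)};"


print(class_judgement("dbpedia:Mountain"))
print(individual_judgement("resource:Nanga_Parbat", "dbpedia:Mountain"))
```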
Object properties are mapped to functions that take an individual of the
domain type and an individual of the range type and return a
Statement, e.g. dbpedia:firstAscentPerson with
domain dbpedia:Mountain and range
dbpedia:Person becomes:
firstAscentPerson : Individual Mountain -> Individual Person -> Statement;
If a property does not specify a domain or range,
owl:Thing is assumed as default.
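As a rough sketch of this rule, the following Python function builds the judgement for an object property and falls back to a default when domain or range is missing. The function name is hypothetical, and rendering owl:Thing as the GF identifier Thing is an assumption made for this example; the identifier that lemon2gf actually emits may differ.

```python
from typing import Optional

# Assumption: owl:Thing is rendered as the GF identifier "Thing".
DEFAULT_CLASS = "Thing"


def object_property_judgement(
    name: str,
    domain: Optional[str] = None,
    range_: Optional[str] = None,
) -> str:
    """Render an object property as a GF function returning Statement (hypothetical helper)."""
    domain = domain or DEFAULT_CLASS
    range_ = range_ or DEFAULT_CLASS
    return f"{name} : Individual {domain} -> Individual {range_} -> Statement;"


print(object_property_judgement("firstAscentPerson", "Mountain", "Person"))
# A property without a declared range falls back to the default class.
print(object_property_judgement("firstAscentPerson", "Mountain"))
```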
Datatypes are mapped to categories (keeping the prefix),
e.g. xsd:double becomes:
cat xsddouble;
Datatype properties are mapped to functions from individuals of
the domain type to the range category, again with return type
Statement, e.g. dbpedia:elevation with domain
dbpedia:Place and range xsd:double
becomes:
elevation : Individual Place -> xsddouble -> Statement;
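The datatype rules can be illustrated in the same way. The sketch below is not taken from lemon2gf; the helper names are hypothetical, and it assumes that "keeping the prefix" means concatenating prefix and local name, as in the xsddouble example.

```python
def datatype_category(prefixed_name: str) -> str:
    """Turn a prefixed datatype name such as 'xsd:double' into a GF category name."""
    prefix, _, local = prefixed_name.partition(":")
    return prefix + local


def datatype_cat_judgement(prefixed_name: str) -> str:
    """Render the category declaration for a datatype (hypothetical helper)."""
    return f"cat {datatype_category(prefixed_name)};"


def datatype_property_judgement(name: str, domain: str, datatype: str) -> str:
    """Render a datatype property as a GF function returning Statement (hypothetical helper)."""
    return f"{name} : Individual {domain} -> {datatype_category(datatype)} -> Statement;"


print(datatype_cat_judgement("xsd:double"))
print(datatype_property_judgement("elevation", "Place", "xsd:double"))
```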
Additionally, for each class a list of all its superclasses is collected
and mapped to corresponding GF judgements that allow for simple type
coercions, e.g. coercing Individual Mountain to
Individual Place.
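Collecting all superclasses amounts to a transitive closure over the direct subclass relation. The following Python sketch shows one way to compute it; the function name and the small hierarchy are illustrative assumptions, and the form of the resulting GF coercion judgements is deliberately left out because it is not specified here.

```python
from typing import Dict, List


def all_superclasses(cls: str, direct_superclasses: Dict[str, List[str]]) -> List[str]:
    """Return every direct and indirect superclass of cls, sorted by name."""
    seen = set()
    stack = list(direct_superclasses.get(cls, []))
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            # Continue upwards from the superclass just visited.
            stack.extend(direct_superclasses.get(current, []))
    return sorted(seen)


# Illustrative hierarchy (an assumption for this example).
hierarchy = {
    "Mountain": ["NaturalPlace"],
    "NaturalPlace": ["Place"],
}

# Each (class, superclass) pair is a candidate for a coercion judgement.
for superclass in all_superclasses("Mountain", hierarchy):
    print("Mountain", "->", superclass)
```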
First, all senses are collected. If a sense does not refer to an
ontology URI, for example because it refers to a URI defined in the lexicon
itself or is composed of several subsenses, the abstract syntax is
extended with corresponding GF functions or constants. Currently, lemon2gf
covers basic OWL constructs such as owl:unionOf,
owl:intersectionOf, owl:complementOf,
owl:propertyChainAxiom and owl:inverseOf.
Subsequently, for each sense all lexical entries denoting this sense are retrieved from the lexicon, together with all relevant morphosyntactic information. On the basis of its syntactic frame or, if no frame is specified, its part of speech, templates for GF linearization judgements are instantiated. For example, the following two lexical entries…
:mountain a lemon:LexicalEntry ;
lexinfo:partOfSpeech lexinfo:noun ;
lemon:canonicalForm [ lemon:writtenRep "mountain"@en ] ;
lemon:sense [ lemon:reference dbpedia:Mountain ] .
:peak a lemon:LexicalEntry ;
lexinfo:partOfSpeech lexinfo:noun ;
lemon:canonicalForm [ lemon:writtenRep "peak"@en ;
lexinfo:number lexinfo:singular ] ;
lemon:otherForm [ lemon:writtenRep "peaks"@en ;
lexinfo:number lexinfo:plural ];
lemon:sense [ lemon:reference dbpedia:Mountain ] .
…are converted into the following linearization judgements:
lin Mountain = variants { mkCN mountain_N; mkCN peak_N };
oper
mountain_N = mkN "mountain";
peak_N = mkN "peak" "peaks";
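To make the template step concrete, here is a minimal Python sketch that reproduces the judgements above for noun entries. It is not lemon2gf's implementation: the function names are hypothetical, and the sketch assumes that each entry has already been reduced to its list of written forms, canonical form first.

```python
from typing import List


def noun_oper(written_forms: List[str]) -> str:
    """Render the oper definition for a noun from its written forms (canonical form first)."""
    lemma = written_forms[0]
    arguments = " ".join(f'"{form}"' for form in written_forms)
    return f"{lemma}_N = mkN {arguments};"


def noun_lin_judgement(reference: str, lemmas: List[str]) -> str:
    """Render the linearization judgement for all nouns that denote the same reference."""
    alternatives = "; ".join(f"mkCN {lemma}_N" for lemma in lemmas)
    return f"lin {reference} = variants {{ {alternatives} }};"


# Written forms of the two lexical entries that denote dbpedia:Mountain.
entries = [["mountain"], ["peak", "peaks"]]

print(noun_lin_judgement("Mountain", [forms[0] for forms in entries]))
print("oper")
for forms in entries:
    print(noun_oper(forms))
```

Entries with other parts of speech or syntactic frames would need their own templates; only the noun case shown above is sketched here.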
lemon2gf covers the most common LexInfo parts of speech and frames and can easily be extended to cover others.