TBX

TBX (TermBase eXchange, ISO 30042) is a standard for exchanging terminology databases, in particular those used in translation. A certain degree of interoperability with lemon is therefore useful. As with LMF, TBX contains much header information, most of which can either be discarded or mapped onto the descriptions given in 4.7.2. The body of a TBX document consists primarily of a set of termEntry elements, each of which represents a concept; these roughly correspond to sense/reference pairs. The conversion is trivial if the term base references a known ontology (ideally indicated by a ref tag). If it does not, either the id attribute of the termEntry can be used as a URI or a URI must be invented. Doing so effectively raises the concepts expressed in the TBX document into a pseudo-ontology, so that they can be used with lemon.

In TBX each termEntry is split first into languages and then into terms, so each langSet should be mapped to a lexicon in lemon. Finally, the terms should be mapped to lexical entries with a given form. As TBX has no simple method for indicating the standard form, the forms should either be guessed from properties such as lemma or all be marked with the general form.
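The mapping described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not a reference converter: the sample document, the base URI http://example.org/tbx/, the URI layout, and the function name tbx_to_triples are all invented for the example, and the lemon namespace is assumed to be http://www.monnet-project.eu/lemon#. It further assumes one lexicon per language, un-namespaced TBX elements, no ref to an external ontology (so the termEntry id is turned into a concept URI), and it marks every term with the general lexicalForm property rather than guessing a canonical form.

```python
import xml.etree.ElementTree as ET
from urllib.parse import quote

XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
LEMON = "http://www.monnet-project.eu/lemon#"  # assumed lemon namespace
BASE = "http://example.org/tbx/"               # hypothetical base URI

# Hypothetical, heavily reduced TBX document (header omitted).
SAMPLE_TBX = """<martif type="TBX" xml:lang="en">
  <text><body>
    <termEntry id="c1">
      <langSet xml:lang="en"><tig><term>hard disk</term></tig></langSet>
      <langSet xml:lang="de"><tig><term>Festplatte</term></tig></langSet>
    </termEntry>
  </body></text>
</martif>"""


def tbx_to_triples(tbx_xml, base=BASE):
    """Return a list of (subject, predicate, object) string triples."""
    root = ET.fromstring(tbx_xml)
    triples = {}  # dict used as an insertion-ordered set

    def add(s, p, o):
        triples[(s, p, o)] = None

    for index, term_entry in enumerate(root.iter("termEntry")):
        # No ontology reference: use the id, or invent one, as the concept URI.
        concept_id = term_entry.get("id") or "entry%d" % index
        concept = base + "concept/" + quote(concept_id)

        for lang_set in term_entry.iter("langSet"):
            # Each langSet feeds the lexicon for its language.
            lang = lang_set.get(XML_LANG, "und")
            lexicon = base + "lexicon/" + lang
            add(lexicon, RDF_TYPE, LEMON + "Lexicon")
            add(lexicon, LEMON + "language", '"%s"' % lang)

            for term in lang_set.iter("term"):
                text = "".join(term.itertext()).strip()
                entry = lexicon + "/" + quote(text)
                sense = entry + "#sense"
                form = entry + "#form"
                add(lexicon, LEMON + "entry", entry)
                add(entry, RDF_TYPE, LEMON + "LexicalEntry")
                # The sense/reference pair points at the pseudo-ontology concept.
                add(entry, LEMON + "sense", sense)
                add(sense, LEMON + "reference", concept)
                # General form only; no attempt to pick a canonical form.
                add(entry, LEMON + "lexicalForm", form)
                add(form, LEMON + "writtenRep", '"%s"@%s' % (text, lang))

    return list(triples)


if __name__ == "__main__":
    for triple in tbx_to_triples(SAMPLE_TBX):
        print(" ".join(triple))
```

A real converter would also need to handle the TBX header, ntig structures, ref elements pointing at an existing ontology, and properties such as lemma; these are left out here to keep the correspondence between termEntry, langSet, and term on the one hand and concept, lexicon, and lexical entry on the other easy to follow.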

John McCrae 2012-07-31