
AbiWord barbarism support

Jordi Mas <jmas@softcatala.org> 19/01/2003

What is a barbarism

Barbarisms are a problem that mainly concerns minority languages, i.e. languages that compete, in the same territory, with a more powerful one, called the "roof language". Examples of such minority languages are Welsh, Catalan and Occitan.

When two languages compete in the same territory, interferences arise, but they are not symmetric. The roof language is only weakly affected, whereas the minority language can be strongly affected and can even disappear (glottophagy). Barbarisms are one of these interferences.

For example, the correct Catalan word for 'size' is 'mida', but due to Spanish influence some people use 'tamany' (from the Spanish 'tamaño'), which is wrong.

AbiWord's barbarism support


I strongly believe that barbarism support should be part of a good spell checker. Since ispell and aspell do not support barbarisms, we have implemented a quick workaround. Ideally, barbarism support integrated into the spell checker would accept rules, which are much needed for verbs if you do not want to list all of their inflected forms.

Our quick workaround is a lightweight XML file that defines barbarisms and their possible replacements. The file should be placed in the same directory as the spell checker dictionary and have the same name as the dictionary, but ending in '-barbarism.xml' instead of '.hash'. For the Catalan language, the file is called

catala-barbarism.xml
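
The naming rule above can be sketched in a few lines of Python. This is only an illustration, not AbiWord's actual code: the function name barbarism_file_for and the 'dictionary/' directory are hypothetical, and the sketch assumes the Catalan dictionary is named 'catala.hash', as the file name above implies.

```python
from pathlib import Path


def barbarism_file_for(hash_path):
    """Return the barbarism file path that matches a '.hash' dictionary file."""
    hash_path = Path(hash_path)
    if hash_path.suffix != ".hash":
        raise ValueError(f"expected a '.hash' dictionary file, got {hash_path.name!r}")
    # Same directory, same base name, '-barbarism.xml' in place of '.hash'.
    return hash_path.with_name(hash_path.stem + "-barbarism.xml")


# 'dictionary/' is a made-up directory used only for this example.
print(barbarism_file_for("dictionary/catala.hash").name)  # catala-barbarism.xml
```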


When the spell checker runs over a text, every barbarism is marked as an error and the suggestions defined in the XML file are offered to fix it. They always appear at the top of the spell checker's list of suggested words.
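
The ordering of suggestions can be illustrated with a small Python sketch. It is not AbiWord's implementation: the function name merge_suggestions is hypothetical, the spell checker output is invented for the example, and dropping duplicates is an assumption of this sketch rather than documented behaviour.

```python
def merge_suggestions(barbarism_suggestions, speller_suggestions):
    """Put barbarism suggestions first, then the spell checker's own suggestions."""
    merged = []
    for word in list(barbarism_suggestions) + list(speller_suggestions):
        # Assumption: a word offered by both sources is listed only once.
        if word not in merged:
            merged.append(word)
    return merged


# Suggestions for the barbarism 'tamany'; the second list is invented spell checker output.
print(merge_suggestions(["mida", "grandària"], ["tany", "mida"]))
# ['mida', 'grandària', 'tany']
```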

File format


- The file starts with the 'AbiBarbarism' tag, like this:

<AbiBarbarism app="AbiWord" ver="1.0" language="ca-ES">

The language attribute contains the locale of the language that the file covers.

- The 'barbarism' tag defines a barbarism entry; its word attribute contains the word considered a barbarism. For example:

<barbarism word="tiro">


where 'tiro' is the barbarism.

- The 'suggestion' tag defines a suggested replacement for the barbarism that we have just defined. The word attribute contains the suggested word. For example:

<suggestion word="tret" />
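
As an illustration of how the three tags fit together, here is a minimal Python sketch that reads such a file into a lookup table using the standard xml.etree.ElementTree module. It is not the code AbiWord uses: the names SAMPLE and load_barbarisms are hypothetical, and the nesting of 'suggestion' inside 'barbarism' follows the sample file shown in this document.

```python
import xml.etree.ElementTree as ET

# A tiny file in the format described above (no XML declaration, for brevity).
SAMPLE = """<AbiBarbarism app="AbiWord" ver="1.0" language="ca-ES">
  <barbarism word="tiro">
    <suggestion word="tret" />
  </barbarism>
</AbiBarbarism>"""


def load_barbarisms(xml_text):
    """Return (language, {barbarism: [suggestions]}) parsed from the XML text."""
    root = ET.fromstring(xml_text)
    if root.tag != "AbiBarbarism":
        raise ValueError(f"unexpected root element {root.tag!r}")
    table = {}
    for entry in root.findall("barbarism"):
        # Each 'barbarism' element holds zero or more 'suggestion' children.
        table[entry.get("word")] = [s.get("word") for s in entry.findall("suggestion")]
    return root.get("language"), table


language, table = load_barbarisms(SAMPLE)
print(language, table)  # ca-ES {'tiro': ['tret']}
```

For a file on disk that starts with an XML declaration and an encoding, ET.parse(path).getroot() is the safer entry point, since it lets the parser honour the declared encoding.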

Sample barbarism file

<?xml version="1.0" encoding="iso-8859-1"?>

<!--
Barbarism file for AbiWord
(c) 2002 Softcatalà Jordi Mas i Hernàndez <jmas@softcatala.org>

1. Introduction

This file is a compilation, in XML format, of Catalan-language barbarisms
and their corresponding corrections. It was originally designed to improve
the capabilities of the AbiWord spell checker (www.abiword.com), but the
list can be used to improve any other spell checker.

2. License

This document is (c) 1998-2002 Softcatalà. Use, distribution and/or
modification of this document is permitted under the terms of the GNU Free
Documentation License, version 1.1 or later, published by the Free Software
Foundation; with the invariant section "copyright" and no cover texts.
-->

<AbiBarbarism app="AbiWord" ver="1.0" language="ca-ES">
  <barbarism word="tamany">
    <suggestion word="mida" />
    <suggestion word="grandària" />
  </barbarism>
  <barbarism word="tamanys">
    <suggestion word="mides" />
    <suggestion word="grandàries" />
  </barbarism>
  <barbarism word="boleto">
    <suggestion word="billet" />
  </barbarism>
  <barbarism word="tiro">
    <suggestion word="tret" />
  </barbarism>
  <barbarism word="tanteig">
    <suggestion word="tempteig" />
  </barbarism>
  <barbarism word="aconteixement">
    <suggestion word="esdeveniment" />
  </barbarism>
  <barbarism word="àngul">
    <suggestion word="angle" />
  </barbarism>
  <barbarism word="atràs">
    <suggestion word="endarreriment" />
  </barbarism>
  <barbarism word="búsqueda">
    <suggestion word="cerca" />
  </barbarism>
  <barbarism word="cantitat">
    <suggestion word="quantitat" />
  </barbarism>
  <barbarism word="cotidià">
    <suggestion word="quotidià" />
  </barbarism>
  <barbarism word="despreci">
    <suggestion word="menyspreu" />
  </barbarism>
  <barbarism word="enfermetat">
    <suggestion word="malaltia" />
  </barbarism>
  <barbarism word="fetxa">
    <suggestion word="data" />
  </barbarism>
  <barbarism word="impar">
    <suggestion word="imparell" />
  </barbarism>
  <barbarism word="monasteri">
    <suggestion word="monestir" />
  </barbarism>
  <barbarism word="promedi">
    <suggestion word="promig" />
  </barbarism>
  <barbarism word="testic">
    <suggestion word="testimoni" />
  </barbarism>
  <barbarism word="tregua">
    <suggestion word="treva" />
  </barbarism>
  <barbarism word="títul">
    <suggestion word="títol" />
  </barbarism>
  <barbarism word="cumpleanys">
    <suggestion word="aniversari" />
  </barbarism>
</AbiBarbarism>