AbiWord barbarism support
Jordi Mas <jmas@softcatala.org> 19/01/2003
What is a barbarism
Barbarisms are a problem that mainly concerns minority languages, i.e. languages that compete, in the same territory, with a more powerful one, called the "roof language". Examples of such minority languages are Welsh, Catalan, and Occitan.
When two languages compete in the same territory, interference arises, but it is not symmetric. The roof language is only weakly affected, whereas the minority language can be strongly affected and can even disappear (glottophagy). Barbarisms are one form of this interference.
For example, the correct Catalan word for 'size' is 'mida', but due to Spanish influence some people use 'tamany' (from Spanish 'tamaño'), which is wrong.
AbiWord's barbarism support
I strongly believe that barbarism support should be part of a good spell checker. Since ispell and aspell do not support barbarisms, we have implemented a quick workaround. Ideally, barbarism support integrated into the spell checker would accept rules, which are much needed for verbs if you do not want to list every conjugated form.
Our quick workaround is a lightweight XML file that defines barbarisms and their possible replacements. The file should be placed in the same directory as the spell checker dictionary and has the same name as the dictionary, but ending in '-barbarism.xml' instead of '.hash'. For Catalan, the file is called
catala-barbarism.xml
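The naming rule above can be sketched in a few lines of Python. This is only an illustration, not AbiWord's own code: the function name barbarism_file_for and the directory in the example call are assumptions made for this sketch.

```python
from pathlib import PurePosixPath


def barbarism_file_for(hash_path: str) -> str:
    """Return the barbarism file expected next to a '.hash' dictionary.

    Hypothetical helper: it only mirrors the naming rule described in the text.
    """
    path = PurePosixPath(hash_path)
    if path.suffix != ".hash":
        raise ValueError(f"expected a .hash dictionary, got {path.name!r}")
    # Same directory, same base name, '-barbarism.xml' instead of '.hash'.
    return str(path.with_name(path.stem + "-barbarism.xml"))


# The directory below is a placeholder, not a path AbiWord is known to use.
print(barbarism_file_for("dictionary/catala.hash"))
```

With the Catalan dictionary from the text, the call yields a path ending in catala-barbarism.xml.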
When we run the spell checker over a text, every barbarism is marked as an error and the suggestions defined in the XML file are offered to fix it. They always appear at the top of the spell checker's list of suggested words.
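The ordering of the suggestions can be illustrated with a small Python sketch. It is not taken from AbiWord: the function name rank_suggestions is hypothetical, the speller suggestions in the example are invented, and dropping duplicates is an assumption that the text does not confirm.

```python
def rank_suggestions(barbarism_fixes, speller_suggestions):
    """Put the barbarism replacements first, then the speller's own suggestions.

    Assumption: a word offered by both sources is listed once, at its first position.
    """
    ranked = []
    for word in list(barbarism_fixes) + list(speller_suggestions):
        if word not in ranked:
            ranked.append(word)
    return ranked


# 'mida' and 'grandària' come from the barbarism file;
# the second list is an invented stand-in for what a speller might return.
print(rank_suggestions(["mida", "grandària"], ["tamís", "mida"]))
```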
File format
- The file starts with the 'AbiBarbarism' tag, like this:
<AbiBarbarism app="AbiWord" ver="1.0" language="ca-ES">
The language attribute contains the locale of the language defined in the file.
- The 'barbarism' tag defines a barbarism entry; its word attribute contains the word that is considered a barbarism. For example:
<barbarism word="tiro">
where 'tiro' is the barbarism.
- The 'suggestion' tag defines a suggested replacement for the barbarism that we have just defined. The word attribute contains the suggested word. For example:
<suggestion word="tret" />
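To show how the three tags fit together, here is a minimal Python sketch that reads such a file with the standard xml.etree.ElementTree module. It is an illustration only and says nothing about how AbiWord itself loads the file; the function name load_barbarisms and the SAMPLE string are made up for this example.

```python
import xml.etree.ElementTree as ET

# Small in-memory sample that follows the format described above.
SAMPLE = """<AbiBarbarism app="AbiWord" ver="1.0" language="ca-ES">
<barbarism word="tiro">
<suggestion word="tret" />
</barbarism>
<barbarism word="tamany">
<suggestion word="mida" />
<suggestion word="grandària" />
</barbarism>
</AbiBarbarism>"""


def load_barbarisms(xml_text):
    """Return (language, {barbarism: [suggestions in file order]})."""
    root = ET.fromstring(xml_text)
    if root.tag != "AbiBarbarism":
        raise ValueError("root element must be AbiBarbarism")
    table = {}
    for entry in root.findall("barbarism"):
        # Each 'barbarism' element carries the word and nests its suggestions.
        table[entry.get("word")] = [s.get("word") for s in entry.findall("suggestion")]
    return root.get("language"), table


language, table = load_barbarisms(SAMPLE)
print(language, table["tamany"])
```

A plain dictionary keyed by the barbarism keeps the lookup simple and preserves the order in which the suggestions appear in the file.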
Sample barbarism file
<?xml version="1.0" encoding="iso-8859-1"?>
<!--
Barbarism file for AbiWord
(c) 2002 Softcatalà Jordi Mas i Hernàndez <jmas@softcatala.org>
1. Introduction
This file is a compilation, in XML format, of barbarisms in the Catalan
language and their corresponding corrections. It was originally
designed to improve the capabilities of the AbiWord spell checker
(www.abiword.com), but the list can be used to improve
any other spell checker.
2. License
This document is (c)1998-2002 Softcatalà. Use,
distribution, and/or modification of this document is permitted under the
GNU Free Documentation License, version 1.1 or later, published by the
Free Software Foundation; with the invariant section "copyright" and
no cover texts.
-->
<AbiBarbarism app="AbiWord" ver="1.0" language="ca-ES">
<barbarism word="tamany">
<suggestion word="mida" />
<suggestion word="grandària" />
</barbarism>
<barbarism word="tamanys">
<suggestion word="mides" />
<suggestion word="grandàries" />
</barbarism>
<barbarism word="boleto">
<suggestion word="bitllet" />
</barbarism>
<barbarism word="tiro">
<suggestion word="tret" />
</barbarism>
<barbarism word="tanteig">
<suggestion word="tempteig" />
</barbarism>
<barbarism word="aconteixement">
<suggestion word="esdeveniment" />
</barbarism>
<barbarism word="àngul">
<suggestion word="angle" />
</barbarism>
<barbarism word="atràs">
<suggestion word="endarreriment" />
</barbarism>
<barbarism word="búsqueda">
<suggestion word="cerca" />
</barbarism>
<barbarism word="cantitat">
<suggestion word="quantitat" />
</barbarism>
<barbarism word="cotidià">
<suggestion word="quotidià" />
</barbarism>
<barbarism word="despreci">
<suggestion word="menyspreu" />
</barbarism>
<barbarism word="enfermetat">
<suggestion word="malaltia" />
</barbarism>
<barbarism word="fetxa">
<suggestion word="data" />
</barbarism>
<barbarism word="impar">
<suggestion word="imparell" />
</barbarism>
<barbarism word="monasteri">
<suggestion word="monestir" />
</barbarism>
<barbarism word="promedi">
<suggestion word="promig" />
</barbarism>
<barbarism word="testic">
<suggestion word="testimoni" />
</barbarism>
<barbarism word="tregua">
<suggestion word="treva" />
</barbarism>
<barbarism word="títul">
<suggestion word="títol" />
</barbarism>
<barbarism word="cumpleanys">
<suggestion word="aniversari" />
</barbarism>
</AbiBarbarism>