Neurale netwerke as moontlike woordafkappingstegniek vir Afrikaans

Text in Afrikaans === Summaries in Afrikaans and English === In Afrikaans, soos in NederJands en Duits, word saamgestelde woorde aanmekaar geskryf. Nuwe woorde word dus voortdurend geskep deur woorde aanmekaar te haak Dit bemoeilik die proses van woordafkapping tydens teksprosessering, wat deesdae...

Full description

Bibliographic Details
Main Author: Fick, Machteld
Other Authors: Swanepoel, Carel Johannes
Format: Others
Language:af
Published: 2009
Subjects:
Online Access:Fick, Machteld (2002) Neurale netwerke as moontlike woordafkappingstegniek vir Afrikaans, University of South Africa, Pretoria, <http://hdl.handle.net/10500/584>
http://hdl.handle.net/10500/584
Description
Summary:Text in Afrikaans === Summaries in Afrikaans and English === In Afrikaans, soos in NederJands en Duits, word saamgestelde woorde aanmekaar geskryf. Nuwe woorde word dus voortdurend geskep deur woorde aanmekaar te haak Dit bemoeilik die proses van woordafkapping tydens teksprosessering, wat deesdae deur rekenaars gedoen word, aangesien die verwysingsbron gedurig verander. Daar bestaan verskeie afkappingsalgoritmes en tegnieke, maar die resultate is onbevredigend. Afrikaanse woorde met korrekte lettergreepverdeling is net die elektroniese weergawe van die handwoordeboek van die Afrikaanse Taal (HAT) onttrek. 'n Neutrale netwerk ( vorentoevoer-terugpropagering) is met sowat. 5 000 van hierdie woorde afgerig. Die neurale netwerk is verfyn deur 'n gcskikte afrigtingsalgoritme en oorfragfunksie vir die probleem asook die optimale aantal verborge lae en aantal neurone in elke laag te bepaal. Die neurale netwerk is met 5 000 nuwe woorde getoets en dit het 97,56% van moontlike posisies korrek as of geldige of ongeldige afkappingsposisies geklassifiseer. Verder is 510 woorde uit tydskrifartikels met die neurale netwerk getoets en 98,75% van moontlike posisies is korrek geklassifiseer. === In Afrikaans, like in Dutch and German, compound words are written as one word. New words are therefore created by simply joining words. Word hyphenation during typesetting by computer is a problem, because the source of reference changes all the time. Several algorithms and techniques for hyphenation exist, but results are not satisfactory. Afrikaans words with correct syllabification were extracted from the electronic version of the Handwoordeboek van die Afrikaans Taal (HAT). A neural network (feedforward backpropagation) was trained with about 5 000 of these words. The neural network was refined by heuristically finding a suitable training algorithm and transfer function for the problem as well as determining the optimal number of layers and number of neurons in each layer. The neural network was tested with 5 000 words not the training data. It classified 97,56% of possible points in these words correctly as either valid or invalid hyphenation points. Furthermore, 510 words from articles in a magazine were tested with the neural network and 98,75% of possible positions were classified correctly. === Computing === M.Sc. (Operasionele Navorsing)