![buchspektrum Internet-Buchhandlung](../buchspektrumlogo.gif) New releases 2018 | As of: 2020-02-01
Herderstraße 10, 10625 Berlin | Tel.: 030 315 714 16 | Fax: 030 315 714 14 | info@buchspektrum.de
![](https://multimedia.knv.de/cover/68/21/51/6821512400001n.jpg)
Abdulwahed Almarimi
Dissimilarities Detections in Arabic and English Texts
Using n-grams, Histograms and Self Organizing Maps
2018. 128 pp. 220 mm
Publisher/Year: SCHOLAR'S PRESS 2018
ISBN: 6-202-30271-2 (6202302712)
New ISBN: 978-6-202-30271-5 (9786202302715)
The main goal of our research is to apply mathematical methods to uncover anomalies and discrepancies in texts. English and Arabic texts were analyzed with respect to many statistical characteristics. We found some basic statistical differences between the lengths of the words used in the two languages, and the results were applied in heuristics for measuring the dissimilarity of text parts. In the research we prepared three methods for the analysis of texts:

(1) Element n-gram profiles method: The method is based on the similarity or dissimilarity of n-gram occurrences in text parts compared with the full text.

(2) Histogram method: Histograms of text sequences are analyzed from a clustering point of view. If the cluster dispersion is not large, the text was probably written by a single author. If the cluster dispersion is large, the text is considered critical; it is split into two or more parts, and the same analysis is repeated for each part.

(3) Neural networks (systems of Self-Organizing Maps): The systems are trained on input sequences, and after training they identify text parts with anomalies using a cumulative error and some more complex analysis.
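The idea behind the n-gram profile method (1) can be illustrated with a minimal Python sketch. It is not the author's implementation: the function names, the character-level trigrams, the window and step sizes, and the normalized squared-difference measure are all assumptions chosen for illustration. Each sliding text part is profiled and compared with the profile of the full text; parts with unusually high scores would be candidates for anomalies.

```python
from collections import Counter


def ngram_profile(text, n=3):
    """Return relative frequencies of character n-grams in `text`."""
    grams = [text[i:i + n] for i in range(len(text) - n + 1)]
    total = len(grams)
    if total == 0:
        return {}
    return {g: c / total for g, c in Counter(grams).items()}


def dissimilarity(part_profile, full_profile):
    """Assumed measure: normalized squared frequency difference.

    Sums over the n-grams of the part only and scales the result
    to the range [0, 1] (0 = identical frequencies, 1 = no overlap).
    """
    if not part_profile:
        return 0.0
    score = 0.0
    for gram, p in part_profile.items():
        q = full_profile.get(gram, 0.0)
        # p > 0 for every gram in the part, so the mean is never zero.
        score += ((p - q) / ((p + q) / 2)) ** 2
    return score / (4 * len(part_profile))


def score_parts(text, window=200, step=100, n=3):
    """Score each sliding window of `text` against the full-text profile."""
    full = ngram_profile(text, n)
    scores = []
    for start in range(0, max(len(text) - window, 0) + 1, step):
        part = text[start:start + window]
        scores.append((start, dissimilarity(ngram_profile(part, n), full)))
    return scores
```

Calling `score_parts(document_text)` returns a list of `(offset, score)` pairs. How a score is turned into a decision (for example, a threshold relative to the mean score) is left open here, since the abstract does not specify it.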
Dr. Abdulwahed Almarimi, born 12.12.1985 in Bani Waleed, Libya. PhD from Pavol Jozef Šafárik University in Košice, Slovakia.