A Similarity Detection Method Based on Distance Matrix Model with Row-Column Order Penalty Factor

Jun Li, Yaqing Han, Yan Niu

Abstract


Paper detection spans multiple disciplines, and evaluating academic misconduct comprehensively and correctly is a complex and sensitive task. The main existing detection models have several problems, including incompletely specified segmentation preprocessing, sensitivity of detection to semantic order, weak handling of near-synonyms, and slow paper backtracking. This paper presents a sentence-level paper similarity comparison model whose segmentation preprocessing is based on special identifiers. The model integrates the characteristics of vector detection, Hamming distance, and the longest common substring. By redefining the distance matrix and adding ordinal measures, it specifically detects near-synonym substitution, word deletion, and changes in word order, which makes sentence similarity detection more effective in terms of both semantics and backbone word segmentation. Compared with traditional paper similarity retrieval, the method uses modulo-2 arithmetic and therefore has a low computational cost. A reliable and efficient paper detection method is of academic significance for word segmentation, similarity detection, and document summarization.
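As a rough illustration of the ingredients named in the abstract, the sketch below compares two sentences using a Hamming distance computed with modulo-2 addition over word-presence vectors, plus a longest-common-substring measure over word sequences that drops when words are reordered. It is not the authors' distance matrix model or their row-column order penalty factor: the function names, the whitespace tokenization, and the `order_weight` blend are all assumptions made for this example.

```python
def hamming_mod2(a, b):
    """Hamming distance between two equal-length 0/1 vectors via modulo-2 addition."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    return sum((x + y) % 2 for x, y in zip(a, b))


def presence_vector(words, vocabulary):
    """Return a 0/1 vector marking which vocabulary words occur in `words`."""
    word_set = set(words)
    return [1 if w in word_set else 0 for w in vocabulary]


def longest_common_run(a, b):
    """Length of the longest contiguous run of words shared by sequences a and b."""
    best = 0
    prev = [0] * (len(b) + 1)
    for i in range(1, len(a) + 1):
        cur = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            if a[i - 1] == b[j - 1]:
                cur[j] = prev[j - 1] + 1
                best = max(best, cur[j])
        prev = cur
    return best


def sentence_similarity(s1, s2, order_weight=0.5):
    """Blend word overlap with an order-sensitive term (hypothetical weighting)."""
    w1, w2 = s1.lower().split(), s2.lower().split()
    if not w1 or not w2:
        return 0.0
    vocabulary = sorted(set(w1) | set(w2))
    v1 = presence_vector(w1, vocabulary)
    v2 = presence_vector(w2, vocabulary)
    # Overlap term: 1 when both sentences use exactly the same set of words.
    overlap = 1 - hamming_mod2(v1, v2) / len(vocabulary)
    # Order term: shrinks when shared words no longer appear in the same sequence.
    order = longest_common_run(w1, w2) / max(len(w1), len(w2))
    return (1 - order_weight) * overlap + order_weight * order


if __name__ == "__main__":
    print(sentence_similarity("the cat sat", "the cat sat"))
    print(sentence_similarity("the cat sat", "sat cat the"))
```

In this sketch, two sentences with the same words in a different order keep a full overlap term but lose part of the order term, which is the kind of effect an order penalty is meant to capture. Near-synonym handling is not covered here.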



Creative Commons License
This work is licensed under a Creative Commons Attribution 3.0 License.

Bulletin of EEI Stats