Research and Implementation of Intelligent Multilingual Consistency Judgment for the WPS Program
Abstract
This paper studies the problems that arise in multi-language conformance testing of WPS and defines consistency judgment at three levels: character, word, and semantic. The character level is implemented using Unicode encoding rules and the character set of the given language. The word level is implemented by segmenting each sentence into words and comparing those words against a standard dictionary. For the semantic level, the paper builds a segmentation-based N-gram language model and uses it to judge semantic consistency.
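The character and word levels can be illustrated with a minimal Python sketch. It is not the tool described in this paper: the code-point ranges in `ALLOWED_RANGES`, the function names, and the regular-expression tokenizer are all assumptions chosen for illustration, and a real check would use the full character set and dictionary of each target language.

```python
import re

# Hypothetical per-language character sets, given as inclusive code-point ranges.
# These ranges are illustrative assumptions, not the sets used by the paper.
ALLOWED_RANGES = {
    "en": [(0x0020, 0x007E)],          # printable ASCII
    "zh": [
        (0x4E00, 0x9FFF),              # CJK Unified Ideographs
        (0x3000, 0x303F),              # CJK Symbols and Punctuation
        (0xFF00, 0xFFEF),              # Halfwidth and Fullwidth Forms
        (0x0020, 0x0020),              # space
        (0x0030, 0x0039),              # ASCII digits
    ],
}


def out_of_set_chars(text, lang):
    """Character level: return (index, char) for characters outside the language's set."""
    ranges = ALLOWED_RANGES[lang]
    return [
        (i, ch)
        for i, ch in enumerate(text)
        if not any(lo <= ord(ch) <= hi for lo, hi in ranges)
    ]


def unknown_words(sentence, dictionary):
    """Word level: split an English sentence into words and return those not in the dictionary."""
    words = re.findall(r"[A-Za-z]+(?:'[A-Za-z]+)?", sentence)
    return [w for w in words if w.lower() not in dictionary]
```

For example, `out_of_set_chars("Save 文件", "en")` reports the two Chinese characters left in an English string, and `unknown_words("Save the documnet", {"save", "the", "document"})` reports the misspelled word.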
Finally, we developed a tool in WPS for English and Chinese consistency judgment based on this method. For English, word consistency within sentences is checked by dictionary-based spelling lookup. For Chinese, semantic consistency within sentences is checked with a statistical language model that captures the frequency of word pairs. Three WPS projects were examined with the tool, which found 33 errors in the English edition and 15 errors in the Chinese edition. The results indicate that the method is feasible and that the statistical language model is useful for multilingual consistency judgment.
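A word-pair (bigram) model of the kind mentioned above can be sketched as follows. This is a minimal illustration under stated assumptions, not the model built in this paper: the input is assumed to be already segmented into words, add-one smoothing is an assumption made here to avoid zero probabilities, and the function names and threshold are hypothetical.

```python
from collections import Counter


def train_bigram(corpus):
    """Count words and adjacent word pairs over a corpus of pre-segmented sentences."""
    unigrams, bigrams = Counter(), Counter()
    for words in corpus:
        padded = ["<s>"] + list(words) + ["</s>"]   # sentence boundary markers
        unigrams.update(padded)
        bigrams.update(zip(padded, padded[1:]))
    return unigrams, bigrams


def bigram_prob(prev, word, unigrams, bigrams):
    """Estimate P(word | prev) with add-one smoothing (an assumption of this sketch)."""
    vocab_size = len(unigrams)
    return (bigrams[(prev, word)] + 1) / (unigrams[prev] + vocab_size)


def suspicious_pairs(words, unigrams, bigrams, threshold):
    """Return adjacent word pairs whose estimated probability falls below the threshold."""
    padded = ["<s>"] + list(words) + ["</s>"]
    return [
        (p, w)
        for p, w in zip(padded, padded[1:])
        if bigram_prob(p, w, unigrams, bigrams) < threshold
    ]


# Tiny hypothetical training corpus of segmented Chinese UI strings.
corpus = [["保存", "文件"], ["打开", "文件"], ["保存", "文档"]]
unigrams, bigrams = train_bigram(corpus)
flagged = suspicious_pairs(["文件", "保存"], unigrams, bigrams, threshold=0.2)
```

In this toy corpus the pair ("保存", "文件") is observed and scores above the threshold, while the reversed order ("文件", "保存") is never observed and is flagged. A usable threshold would have to be tuned on a much larger corpus.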