The reason is simple: there is no need.

Chinese is unusual in that readers do not need spaces to tell where one word ends and the next begins.

Interestingly, even if the order of the words in a sentence is slightly scrambled, a Chinese reader can still read it, and may finish without even noticing that the order was wrong.

Chinese does, however, need sentence breaks: one sentence must be clearly separated from the next.
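As a rough illustration of sentence breaking, here is a minimal sketch in Python. It assumes that sentences end with one of the full-width marks 。！？, which is a simplification; the function name `split_sentences` and the sample text are made up for this example.

```python
import re

# Assumption: a sentence is a run of characters that ends with 。, ！ or ？.
# Real text also contains quotes, ellipses and so on, which this ignores.
SENTENCE_PATTERN = re.compile(r"[^。！？]+[。！？]?")


def split_sentences(text):
    """Split Chinese text into sentences, keeping the closing punctuation."""
    return [s.strip() for s in SENTENCE_PATTERN.findall(text) if s.strip()]


for sentence in split_sentences("今天天气很好。你吃饭了吗？我们走吧！"):
    print(sentence)
```

Each sentence is printed on its own line, with no spaces added inside it.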

This used to cause problems when computers translated Chinese into other languages, because English and similar languages mark word boundaries explicitly, whereas Chinese does not. So a special technique called “word segmentation” (sometimes rendered as “word separation”) emerged. It splits a sentence into individual words, as in English, so that the sentence can be translated into languages such as English.
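To make the idea concrete, here is a minimal sketch of one classic dictionary-based approach, forward maximum matching, written in Python. It is only an illustration and not how any particular translation system works: the tiny vocabulary `VOCAB`, the function name `forward_max_match` and the `max_len` limit are all assumptions made for this example.

```python
def forward_max_match(text, vocab, max_len=4):
    """Greedily take the longest dictionary word starting at each position."""
    tokens = []
    i = 0
    while i < len(text):
        # Try the longest candidate first, then shrink one character at a time.
        for size in range(min(max_len, len(text) - i), 0, -1):
            candidate = text[i:i + size]
            # A single character is always accepted as a fallback.
            if size == 1 or candidate in vocab:
                tokens.append(candidate)
                i += size
                break
    return tokens


# Hypothetical toy dictionary; a real one would hold many thousands of words.
VOCAB = {"我", "喜欢", "学习", "中文"}

print(" ".join(forward_max_match("我喜欢学习中文", VOCAB)))
```

With this toy dictionary the sentence comes out as 我 喜欢 学习 中文, that is, spaced out by word the way an English sentence would be. A greedy match like this can pick the wrong split when a sentence is ambiguous, which is one reason statistical methods were developed.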

Rule-based natural language processing for Chinese needs this step in order to break a sentence down and understand the meaning behind it.

Statistical natural language processing for Chinese, including approaches built on deep networks, is essentially a search for the most probable result. In theory it works better without splitting the text into words when the sample set is large enough, but this is only theory, and it is impossible to find that volume of samples for any language.

However, as computer technology has developed, applications such as automatic translation, which are now built on deep networks, no longer need word separation.

By admin
