NATURAL LANGUAGE PROCESSING
Natural language processing (NLP) is a branch of artificial intelligence and computer science that handles the interface between human languages and computers. It involves the computational modelling of different aspects of language and the deployment of a variety of systems, including spoken language systems that combine natural language with speech. NLP works with computational linguistic features: it uses computers to comprehend and handle natural language speech and text in order to accomplish useful tasks. NLP can be applied in several fields, including speech recognition, expert systems, artificial intelligence, cross-language information retrieval (CLIR), text processing, language translation, and user interfaces. The technology is tasked with getting computers to communicate in and process human languages, and to perform closer to a human level of language understanding. Computers have yet to reach the same instinctive comprehension of natural language that humans have. There is a clear difference between the way humans communicate with one another and the way they communicate with computers. When a program is written, its structure and syntax are carefully selected to suit the intended task, whereas conversation between people allows a great deal of freedom, ranging from sentence length, sarcasm, and jokes to the many ways of expressing the same thing.
Recent advances in technology have enabled computers to do a range of things with human, or natural, language. Deep learning supports programs that perform tasks such as text summarization, language translation, and semantic understanding. The growing application of artificial intelligence to daily activities has made it ubiquitous. It is therefore important for humans to be able to communicate with computers in the language we are familiar and comfortable with, speaking to computers in our own natural language. Natural Language Processing (NLP) is the umbrella term that covers other natural language technologies, including Natural Language Understanding (NLU), Natural Language Generation (NLG), and Natural Language Interaction (NLI).
COMPLEXITIES OF UNDERSTANDING DIFFERENT LANGUAGES USING NATURAL LANGUAGE PROCESSING
Significant progress has recently been made in enabling computers to comprehend human language using Natural Language Processing (NLP). Nevertheless, the variety and high dimensionality of the data sets make execution a problem in some cases. For NLP in Asia, with the main focus on Southeast Asia, voice and text-based data and their practical applications will vary. In order to capture the whole process, NLP needs to include several diverse procedures for interpreting local Asian languages. These could involve machine learning, statistical, algorithmic, or rule-based approaches. Ambiguity is an aspect of the cognitive sciences without a definite resolution, and the range of language ambiguity differs greatly depending on the speaker. Technically, any sentence in a language with a rich grammar can yield more than one meaning. Since humans sometimes find it challenging to deal with vagueness in conversation, the same difficulty is inevitable for natural language understanding systems.
TYPES OF AMBIGUITY
Defining ambiguity can itself seem vague. There are different forms of ambiguity relevant to natural language processing (NLP) and artificial intelligence (AI) systems.
- Lexical Ambiguity: This is the ambiguity of a single word. A word can be ambiguous with respect to its syntactic category; for example, "book" can be a noun or a verb. Lexical ambiguity can be resolved by lexical category disambiguation, such as part-of-speech tagging. The lexical level also stores knowledge about words and complementary information.
- Syntax: This is the part of grammar that defines how words are assembled and linked with one another to make a sentence. Syntactic processing transforms a linear sequence of tokens (a token being a word or punctuation mark in natural language) into a hierarchical syntax tree. The main issues at the syntax level are sentence assembly, part-of-speech tagging, and identifying syntactic categories.
- Semantics: This type of ambiguity is characteristically associated with sentence interpretation. It involves tasks such as translating from one natural language to another, searching for synonyms, building question-answering systems, and word sense disambiguation.
- Morphological Ambiguity: This ambiguity arises from the processing applied to root words so that they can be used in a specific sentence. It involves the processing of word forms.
- Discourse: Discourse-level processing needs shared knowledge, and interpretation is carried out using this context. Anaphoric ambiguity falls under the discourse level. This is one of the most demanding tasks in Natural Language Processing (NLP); some of the problems are processing belief, sentiment, and user intention. It also involves processing connected text.
- Pragmatic Ambiguity: This refers to the situation in which the context of a phrase gives it multiple meanings. It involves user modelling and intention processing.
- Referential Ambiguity: When a phrase or a word in a sentence could refer to two or more properties or things, the ambiguity is referential. It is often, but not always, clear from the context which meaning is intended.
- Phonology: This covers words that sound the same but have different meanings. This type of ambiguity forces the NLP model to interpret the context of the sentence in order to select the right word. It relates to the processing of sound.
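To make the idea of resolving word sense from context concrete, the sketch below follows the spirit of the simplified Lesk approach: it picks the sense whose dictionary gloss shares the most words with the surrounding sentence. The sense inventory, the glosses, the ignored-word list, and the function names are all hypothetical and chosen only for illustration; a real system would draw on a full lexical resource.

```python
# Hypothetical two-sense inventory for the ambiguous word "bank".
SENSES = {
    "bank": {
        "financial institution": "an institution that accepts deposits and lends money",
        "river side": "the sloping land beside a river or lake",
    }
}

# Small, assumed list of words that carry little meaning on their own.
IGNORED = {"a", "an", "the", "that", "and", "or", "of", "to", "on", "in", "at"}


def content_words(text):
    """Lower-case the text, split on whitespace, and drop ignored words."""
    return {w for w in text.lower().split() if w not in IGNORED}


def disambiguate(word, sentence, senses=SENSES):
    """Return the sense whose gloss overlaps most with the sentence.

    On a tie, the sense listed first in the inventory is kept.
    """
    context = content_words(sentence)
    best_sense, best_overlap = None, -1
    for sense, gloss in senses[word].items():
        overlap = len(context & content_words(gloss))
        if overlap > best_overlap:
            best_sense, best_overlap = sense, overlap
    return best_sense


print(disambiguate("bank", "she sat on the bank of the river"))     # river side
print(disambiguate("bank", "he went to the bank to deposit money"))  # financial institution
```

Word overlap is a weak signal, so this approach fails whenever the context shares no words with the right gloss; it is shown here only to illustrate how context can narrow down an ambiguous word.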
STAGES IN NATURAL LANGUAGE PROCESSING (NLP)
The basic steps to follow in building a Natural Language Processing (NLP) model are as follows:
Stage 1: Segmentation of Sentence
The first stage in building an NLP model is breaking a given paragraph into individual sentences. This is done so that meaning can be processed sentence by sentence.
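A minimal sketch of sentence segmentation is given below, assuming that sentences end with a full stop, question mark, or exclamation mark followed by whitespace. The function name is an assumption made for this example, and the rule is deliberately simple: it will wrongly split after abbreviations such as "Dr." and does not cover scripts that use other sentence-ending marks.

```python
import re


def split_sentences(paragraph):
    """Split a paragraph at whitespace that follows '.', '!' or '?'."""
    # The lookbehind keeps the punctuation attached to its sentence.
    parts = re.split(r"(?<=[.!?])\s+", paragraph.strip())
    # Drop the empty string produced when the input is empty.
    return [p for p in parts if p]


text = "NLP is useful. Is it easy? Not always!"
for sentence in split_sentences(text):
    print(sentence)
```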
Stage 2: Word Tokenization
Sentence segmentation is followed by extracting the words from each sentence, one sentence at a time. The tokenization algorithm can be programmed to identify a word boundary whenever a space is encountered. All of this would be done in accordance with the Asian language being processed.
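A space-based tokenizer of the kind described above can be sketched in a few lines. The sketch is illustrative only: the function names are assumptions, and the second function adds a regular expression so that punctuation is separated from words. Some Southeast Asian scripts, Thai for example, do not put spaces between words, so for those languages a space-based rule would have to be replaced by a dictionary-based or learned segmenter.

```python
import re


def tokenize_whitespace(sentence):
    """Treat every run of whitespace as a word boundary."""
    return sentence.split()


def tokenize_words(sentence):
    """Return words and punctuation marks as separate tokens."""
    # \w+ matches a run of letters, digits, or underscores;
    # [^\w\s] matches a single punctuation character.
    return re.findall(r"\w+|[^\w\s]", sentence)


print(tokenize_whitespace("Hello, world!"))  # ['Hello,', 'world!']
print(tokenize_words("Hello, world!"))       # ['Hello', ',', 'world', '!']
```

The first function leaves punctuation glued to the neighbouring word, which is why later stages usually work from a tokenizer closer to the second.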
Stage 3: Prediction of Parts of Speech
This stage involves classifying words into their respective parts of speech as defined in the Asian language. Part-of-speech classification helps the machine learning model to understand the role of each word in the sentence. A machine learning model does not actually know the meaning of each word in a sentence the way a human being does. A lot of data has to be fed into the model, along with a precise label for each word's meaning and part of speech.
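The dependence on labelled data can be illustrated with a very simple baseline: a tagger that remembers the most frequent tag seen for each word in a labelled corpus. This is a sketch under stated assumptions, not a recommended model. The three training sentences, the tag names, the function names, and the default tag for unseen words are all made up for the example.

```python
from collections import Counter, defaultdict

# Tiny hand-labelled corpus (hypothetical data for illustration only).
TRAINING_DATA = [
    [("the", "DET"), ("dog", "NOUN"), ("barks", "VERB")],
    [("a", "DET"), ("cat", "NOUN"), ("sleeps", "VERB")],
    [("the", "DET"), ("cat", "NOUN"), ("barks", "VERB")],
]


def train_tagger(tagged_sentences):
    """Map each word to the tag it carries most often in the corpus."""
    counts = defaultdict(Counter)
    for sentence in tagged_sentences:
        for word, tag in sentence:
            counts[word.lower()][tag] += 1
    return {word: tags.most_common(1)[0][0] for word, tags in counts.items()}


def tag(words, model, default="NOUN"):
    """Tag each word; unseen words receive the assumed default tag."""
    return [(w, model.get(w.lower(), default)) for w in words]


model = train_tagger(TRAINING_DATA)
print(tag(["The", "cat", "barks"], model))
# [('The', 'DET'), ('cat', 'NOUN'), ('barks', 'VERB')]
```

Because the baseline looks at each word in isolation, it cannot tell apart the different uses of an ambiguous word; models that also consider neighbouring words need correspondingly more labelled data.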
Stage 4: Text Lemmatization
The machine learning model learns to identify the most basic form (the lemma) of each word in a sentence by differentiating between closely related word forms.
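As a rough illustration of reducing related word forms to one base form, the sketch below combines a lookup table for irregular English forms with a few suffix rules. Everything in it is an assumption made for the example: the table entries, the rules, the minimum stem length, and the function name. The suffix rules are closer to crude stemming than to true lemmatization, which relies on a vocabulary and morphological analysis.

```python
# Hypothetical lookup table for irregular forms.
IRREGULAR = {
    "is": "be", "are": "be", "was": "be", "were": "be",
    "went": "go", "children": "child", "mice": "mouse",
}

# Suffix rules tried in order: (suffix to strip, text to append).
SUFFIX_RULES = [("ies", "y"), ("ing", ""), ("ed", ""), ("s", "")]


def lemmatize(word):
    """Return an approximate base form of an English word."""
    w = word.lower()
    if w in IRREGULAR:
        return IRREGULAR[w]
    for suffix, replacement in SUFFIX_RULES:
        # Require at least three remaining letters so short words stay intact.
        if w.endswith(suffix) and len(w) - len(suffix) >= 3:
            return w[: len(w) - len(suffix)] + replacement
    return w


for form in ["ponies", "walked", "walking", "cats", "Were", "bus"]:
    print(form, "->", lemmatize(form))
```

Rules of this kind make mistakes on many ordinary words (for instance, they strip the final "s" of "class"), which is one reason the text describes lemmatization as something the model learns from data.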
Stage 5: Pinpointing Stop Words
This stage is concerned with identifying the importance of each word in a sentence. There are many filler words that appear frequently in English, and Asian languages can likewise be expected to have commonly used filler words that introduce a lot of noise into a sentence. The machine learning model needs to identify them and flag them as stop words, i.e. words that can be filtered out before statistical analysis.
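Filtering stop words can be sketched as a simple set lookup. The stop-word list below is a small, assumed English sample chosen for the example; a list for any other language would have to be compiled for that language, and the function name is likewise illustrative.

```python
# Small assumed sample of English stop words.
STOP_WORDS = {"a", "an", "and", "the", "is", "are", "of", "in", "to"}


def remove_stop_words(tokens, stop_words=STOP_WORDS):
    """Keep only the tokens that are not in the stop-word set."""
    return [t for t in tokens if t.lower() not in stop_words]


tokens = ["The", "capital", "of", "Thailand", "is", "Bangkok"]
print(remove_stop_words(tokens))  # ['capital', 'Thailand', 'Bangkok']
```

Whether a word counts as noise depends on the task, so the list is normally adjusted per application rather than fixed once.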
Stage 6: Dependency Parsing
This is the stage where the grammatical rules of the Asian language are used to identify how the words in a sentence are related to one another.
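The sketch below does not implement a parser. It only shows one common way to represent the result of dependency parsing: each token stores the position of its head word and the name of the relation. The parse of the English sentence is written by hand for illustration, the class and function names are assumptions, and the relation labels borrow their names from the Universal Dependencies scheme.

```python
from dataclasses import dataclass


@dataclass
class Token:
    index: int     # 1-based position in the sentence
    text: str      # the word itself
    head: int      # index of the head token; 0 marks the root
    relation: str  # label of the relation to the head


# Hand-annotated parse of "The cat chased a mouse" (not produced by a parser).
PARSE = [
    Token(1, "The", 2, "det"),
    Token(2, "cat", 3, "nsubj"),
    Token(3, "chased", 0, "root"),
    Token(4, "a", 5, "det"),
    Token(5, "mouse", 3, "obj"),
]


def describe(parse):
    """Render each token with its relation and the word it depends on."""
    by_index = {t.index: t.text for t in parse}
    return [f"{t.text} --{t.relation}--> {by_index.get(t.head, 'ROOT')}" for t in parse]


def children(parse, head_index):
    """List the words that depend directly on the token at head_index."""
    return [t.text for t in parse if t.head == head_index]


for line in describe(PARSE):
    print(line)
print(children(PARSE, 3))  # ['cat', 'mouse']
```

Producing such a structure automatically is the job of a trained parser for the language in question; the representation itself is largely language independent.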
Stage 7: Entity Analysis
This is achieved by going through the entire sentence in the Asian language and identifying all the important words in the text. The words in the sentence are then categorized according to the categories the system has been programmed to recognize.
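One simple way to categorize important words is to look each token up in a list of known names, often called a gazetteer. The sketch below assumes a tiny hand-made gazetteer and illustrative category names; it matches single tokens only, so multi-word names and names missing from the list are not found.

```python
# Hypothetical gazetteer mapping lower-cased names to categories.
GAZETTEER = {
    "bangkok": "LOCATION",
    "thailand": "LOCATION",
    "jakarta": "LOCATION",
    "maria": "PERSON",
}


def find_entities(tokens, gazetteer=GAZETTEER):
    """Return (token, category) pairs for tokens found in the gazetteer."""
    return [(t, gazetteer[t.lower()]) for t in tokens if t.lower() in gazetteer]


tokens = ["Maria", "flew", "from", "Bangkok", "to", "Jakarta"]
print(find_entities(tokens))
# [('Maria', 'PERSON'), ('Bangkok', 'LOCATION'), ('Jakarta', 'LOCATION')]
```

Statistical entity recognizers go further by using the surrounding words to label names they have never seen before.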
Stage 8: Pronouns Parsing
This is the last stage in building an NLP model and one of the hardest. It uses machine learning to keep track of pronouns with respect to the context of the sentence. Unlike computers, humans find it very easy to work out what a pronoun refers to from the context of the sentence. A machine learning model therefore has to be fed a lot of data with correct tags before it can identify what each pronoun in a sentence refers to.
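The difficulty can be seen with a naive heuristic that links every pronoun to the most recent candidate mentioned before it. The sketch is an assumption-laden illustration, not a recommended method: the pronoun list, the candidate set, and the function name are made up for the example.

```python
# Small assumed list of English pronouns.
PRONOUNS = {"he", "she", "it", "they", "him", "her", "them"}


def resolve_pronouns(tokens, candidates):
    """Link each pronoun position to the most recent preceding candidate."""
    links = {}
    last_seen = None
    for i, tok in enumerate(tokens):
        if tok in candidates:
            last_seen = tok
        elif tok.lower() in PRONOUNS and last_seen is not None:
            links[i] = last_seen
    return links


tokens = "Maria bought a laptop because she needed it".split()
print(resolve_pronouns(tokens, {"Maria", "laptop"}))
# {5: 'laptop', 7: 'laptop'}
```

The heuristic links "it" to "laptop" correctly but also links "she" to "laptop", because it knows nothing about what each pronoun can refer to. Mistakes of this kind are what a model trained on correctly tagged data is meant to avoid.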