Morphological Analysis for Hindi Language. by Prity Goyal

By:

Goyal, Prity

Contributor(s):

Harit, Gaurav

Material type: Text

Publication details: IIT Jodhpur, Department of Computer Science & Engineering, 2020

Description: ix, 38 p. HB

DDC classification:

006.304 914 G724M

Summary: "Morphological analysis is the process of providing grammatical information about a word on the basis of the properties of the morphemes it contains. It plays a vital role in Natural Language Processing (NLP) and eases the job of machine translation. In this work, we have developed a morphological analyzer for the Hindi language. The analyzer takes Hindi words as input and divides each word into its prefixes, suffixes, and roots/bases. The analyzer also provides details of grammatical features/categories such as number (singular/plural) and gender (masculine/feminine). It is a rule-based analyzer and works well with both inflectional and derivational morphemes. Stemming is the process of trimming the suffix and prefix from the input word to get the corresponding root word. Merely trimming the affixes does not always yield a correct stem. Lemmatizers are commonly used to overcome this challenge. A typical lemmatizer extracts the lemma from the given word and adds special rules to make the trimmed word a correct stem. In this work, we have designed an inflectional lemmatizer that creates rules for extracting the suffixes and prefixes, and additional rules for making a correct root word. We also present an approach to identify gender from the first name of a person. Gender classification is done by identifying similarities among masculine or feminine names. We created a dataset containing masculine and feminine names. A decision tree is used to categorize names into masculine and feminine classes. The same approach has been used for number classification. Building a derivational analyzer requires information about the derivational variants. To extract the essential features of a derived word, the derivational variants in the language should be known, and then they must be analyzed. Therefore, we have trained a model for classifying Hindi derivational variants using a supervised approach. To this end, a total of 11 derivational suffixes have been used in verb-to-noun derivations. We have used a support vector machine (SVM) classifier with seven features and training data of 400 word pairs. This classifier is used to determine whether a derivational relationship exists between the words of a pair."
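
As a rough illustration of the rule-based inflectional lemmatization the summary describes (strip a suffix, then apply an extra rule to restore a correct root), here is a minimal Python sketch. It is not taken from the thesis: the names `SUFFIX_RULES`, `Analysis`, and `analyze`, and the two suffix rules themselves, are assumptions chosen only to show the idea on two common feminine plural noun forms.

```python
from typing import NamedTuple, Optional


class Analysis(NamedTuple):
    root: str
    suffix: str
    number: Optional[str]
    gender: Optional[str]


# Hypothetical rules (not the thesis's rule set):
# (suffix to strip, text to append to the stem, number, gender)
SUFFIX_RULES = [
    ("ियाँ", "ी", "plural", "feminine"),  # लड़कियाँ -> लड़की
    ("ें", "", "plural", "feminine"),     # किताबें -> किताब
]


def analyze(word: str) -> Analysis:
    """Strip the longest matching suffix and repair the stem into a root."""
    # Try longer suffixes first so a short rule cannot shadow a longer one.
    for suffix, repair, number, gender in sorted(
        SUFFIX_RULES, key=lambda rule: len(rule[0]), reverse=True
    ):
        if word.endswith(suffix) and len(word) > len(suffix):
            stem = word[: -len(suffix)]
            # Plain stemming would stop at `stem`; the repair string is the
            # "additional rule" that turns the trimmed form into a valid root.
            return Analysis(stem + repair, suffix, number, gender)
    # No rule matched: return the word unchanged with unknown features.
    return Analysis(word, "", None, None)


if __name__ == "__main__":
    for w in ["लड़कियाँ", "किताबें", "घर"]:
        print(w, analyze(w))
```

A real analyzer would need many more rules plus a lexicon to resolve ambiguous suffixes; this sketch only shows why trimming alone is insufficient, since लड़कियाँ minus its suffix gives लड़क, which is not the root लड़की.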


Holdings
Item type	Home library	Collection	Call number	Status	Date due	Barcode	Item holds
Thesis	S. R. Ranganathan Learning Hub Course Reserve	Reference	006.304 914 G724M	Not For Loan		TM00202

Total holds: 0

