ALU-map: A Natural Language Processing-Based Alu Feature Annotation on the Human Genome (Record no. 16604)
000 -LEADER | |
---|---|
fixed length control field | 02072nam a22001697a 4500 |
082 ## - DEWEY DECIMAL CLASSIFICATION NUMBER | |
Classification number | 572.8 |
Item number | S531A |
100 ## - MAIN ENTRY--AUTHOR NAME | |
Personal name | Sharma, Shreya |
245 ## - TITLE STATEMENT | |
Title | ALU-map: A Natural Language Processing-Based Alu Feature Annotation on the Human Genome |
Statement of responsibility, etc | by Shreya Sharma |
260 ## - PUBLICATION, DISTRIBUTION, ETC. (IMPRINT) | |
Place of publication | IIT Jodhpur |
Name of publisher | Department of Bioscience and Bioengineering |
Year of publication | 2023 |
300 ## - PHYSICAL DESCRIPTION | |
Number of Pages | vii,39p. |
Other physical details | HB |
500 ## - GENERAL NOTE | |
General note | The present study aims to develop a database, "ALU-map", that structures information on the various roles of Alu elements using Natural Language Processing (NLP) models such as BERT and BioBERT. We explored the performance of these models by training them on literature abstracts retrieved from the PubMed database. Each abstract was annotated against 10 biological labels; because a given abstract can carry information related to any of these labels, the task is one of multilabel classification. The study also aims to develop a fine-tuned BERT model that classifies Alu abstracts into these categories. Although the fine-tuned models perform well, they have key limitations, which we also discuss. Finally, we constructed a database in which all Alu abstracts are annotated into the 10 categories: if an abstract belongs to a category, 1 is assigned; otherwise, 0. The database provides information on the involvement of Alu elements at different levels of biology, such as genetic, transcriptomic, proteomic, pathways, and as biomarkers, where biological functions of Alu elements have been reported. We believe this database can serve as a resource for researchers and scientists working in the field. |
650 ## - SUBJECT ADDED ENTRY--TOPICAL TERM | |
Topical Term | Department of Bioscience and Bioengineering |
Topical Term | BERT and BioBERT models |
Topical Term | MTech Theses |
700 ## - ADDED ENTRY--PERSONAL NAME | |
Personal name | Mukerji, Mitali |
942 ## - ADDED ENTRY ELEMENTS (KOHA) | |
Koha item type | Thesis |
Withdrawn status | Lost status | Damaged status | Not for loan | Collection code | Permanent Location | Current Location | Shelving location | Date acquired | Source of acquisition | Full call number | Accession Number | Price effective from | Koha item type |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| | | | | Theses | S. R. Ranganathan Learning Hub | S. R. Ranganathan Learning Hub | Reference | 01/04/2024 | Office of Academics | 572.8 S531A | TM00540 | 01/07/2024 | Thesis |