| |
| | BioMed Central | Full text | Machine learning and word sense disambiguation in the biomedical domain: design and ... |
 | | Word sense disambiguation (WSD) is critical in the biomedical domain for improving the precision of natural language processing (NLP), text mining, and information retrieval systems, because ambiguous words hinder accurate access to literature about biomolecular entities such as genes, proteins, cells, and diseases. |
 | | However, Schuemie [12] analyzed 3,902 biomedical full-text articles and found that only 30% of the gene symbols in the abstracts were accompanied by their corresponding full names, and only 18% of the gene symbols in the full text were accompanied by their gene names. |
 | | Podowski [28] built a two-step classification system to disambiguate gene symbols: the first classifier determined whether a symbol referred to a gene or a non-gene, and the second determined the appropriate gene for each symbol that the first classifier had labeled as a gene. |
| www.biomedcentral.com /1471-2105/7/334 (8545 words) |
|
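
The two-step scheme attributed to Podowski [28] above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the `OverlapClassifier` is a toy word-overlap scorer standing in for whatever classifiers [28] actually used, the training sentences are invented, and `gene_A` and `gene_B` are hypothetical labels rather than real database identifiers.

```python
from collections import Counter


def tokenize(text):
    """Lowercase, whitespace-split, and strip simple punctuation."""
    return [t.lower().strip(".,;:()") for t in text.split()]


class OverlapClassifier:
    """Toy stand-in classifier (not the method of [28]).

    Each label gets a word-count profile built from its training contexts;
    prediction picks the label whose profile overlaps the query most,
    normalized by profile size.
    """

    def __init__(self):
        self.profiles = {}

    def fit(self, examples):
        for text, label in examples:
            self.profiles.setdefault(label, Counter()).update(tokenize(text))
        return self

    def predict(self, text):
        tokens = tokenize(text)

        def score(label):
            profile = self.profiles[label]
            return sum(profile[t] for t in tokens) / sum(profile.values())

        # sorted() makes tie-breaking deterministic
        return max(sorted(self.profiles), key=score)


def disambiguate(context, gene_detector, gene_selector):
    """Step 1: gene vs. non-gene. Step 2: which gene, only if step 1 says gene."""
    if gene_detector.predict(context) != "gene":
        return None
    return gene_selector.predict(context)


# Invented toy training data for the ambiguous symbol "CAT".
detector = OverlapClassifier().fit([
    ("expression of the CAT protein was reduced in mutant cells", "gene"),
    ("the CAT promoter drives transcription in liver tissue", "gene"),
    ("the cat was housed in a cage and fed daily", "non-gene"),
    ("a CAT scan of the patient showed no lesion", "non-gene"),
])
selector = OverlapClassifier().fit([
    ("CAT enzyme activity breaks down hydrogen peroxide in liver cells", "gene_A"),
    ("CAT reporter assay measured promoter activity after transfection", "gene_B"),
])

print(disambiguate("CAT protein expression in liver cells", detector, selector))
print(disambiguate("the patient had a CAT scan", detector, selector))
```

The second classifier is trained only on gene contexts and is consulted only after the first one accepts the symbol as a gene, so non-gene senses never compete with individual gene senses.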