Large scale experiments on correction of confused words

Huang, Jin Hu
Powers, David Martin
Institute of Electrical and Electronics Engineers Computer Society (IEEE Publishing)
This paper describes a new approach to automatically learning contextual knowledge for spelling and grammar correction. We aim particularly to deal with cases where every word is in the dictionary, so it is not obvious that there is an error. Traditional approaches are dictionary based, or use elementary tagging or partial parsing of the sentence to obtain contextual knowledge. Our approach uses affix information and only the most frequent words, which reduces the training time and running time of context-sensitive spelling correction. We build large-scale confused word sets based on keyboard adjacency and apply our new approach to learn the contextual knowledge needed to detect and correct them. We explore the performance of auto-correction under conditions where the significance and probability thresholds are set by the user.
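To make the idea of keyboard-adjacency confused word sets concrete, here is a minimal sketch in Python. It is not the authors' implementation: the abstract does not specify how the sets were built, so the simplified QWERTY grid, the restriction to single-letter substitutions, and the names `build_adjacency` and `confused_sets` are all assumptions made for illustration.

```python
from collections import defaultdict

# Assumption: a simplified QWERTY layout with letter keys only, treated as a grid.
QWERTY_ROWS = ["qwertyuiop", "asdfghjkl", "zxcvbnm"]


def build_adjacency(rows=QWERTY_ROWS):
    """Map each key to the set of keys that sit next to it on the grid."""
    adjacent = defaultdict(set)
    for r, row in enumerate(rows):
        for c, key in enumerate(row):
            # Left and right neighbours on the same row.
            for dc in (-1, 1):
                if 0 <= c + dc < len(row):
                    adjacent[key].add(row[c + dc])
            # Neighbours on the rows above and below (columns c-1, c, c+1).
            for dr in (-1, 1):
                if 0 <= r + dr < len(rows):
                    other = rows[r + dr]
                    for cc in (c - 1, c, c + 1):
                        if 0 <= cc < len(other):
                            adjacent[key].add(other[cc])
    return adjacent


def confused_sets(vocabulary, adjacent):
    """For each word, collect the other dictionary words reachable by
    replacing exactly one letter with a key adjacent to it."""
    vocab = set(vocabulary)
    sets = {}
    for word in vocab:
        variants = set()
        for i, ch in enumerate(word):
            for sub in adjacent.get(ch, ()):
                candidate = word[:i] + sub + word[i + 1:]
                if candidate in vocab:
                    variants.add(candidate)
        if variants:
            sets[word] = variants
    return sets


if __name__ == "__main__":
    adjacency = build_adjacency()
    # Hypothetical toy vocabulary; a real system would use a full dictionary.
    print(confused_sets(["hit", "hot", "hut", "hat"], adjacency))
```

In this toy vocabulary, "hit" is grouped with "hot" and "hut" because "i" neighbours both "o" and "u" on the keyboard, while "hat" stays out of every set. Each such pair is a real-word error that a dictionary lookup alone cannot catch, which is why contextual knowledge is needed to choose between the candidates.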
Grammars, Linguistics, Spelling aids, Text analysis
Huang, J.H. and Powers, D.M. 2001. Proceedings of the 24th Australasian Computer Science Conference (ACSC), 77-82.