IndicNotes
From WorkOutWiki2008
Sorting
- Canonical equivalence of ୋ and େ + ା
- The Telugu order on the Indlinux wiki is correct; the one on the FOSS.IN/2008 workout is not.
- Differences in Hindi and Marathi sorting: in Marathi, ksha is treated like a consonant.
- Look at XYZSort on Indlinux wiki.
- Bengali/Assamese: Separate sorting. Check the sorting already built into the as_IN locale.
- Links:
- Santhosh notes:
- In Malayalam, ka + halant sorts before ka, unlike Hindi, Oriya, etc.
- In Malayalam, anusvara is always equivalent to half-ma, so it should always sort the same as half-ma.
- Suggestions on paper from Dr. Pavanaja.
- Final tasklist
- Pravin and Rahul to complete Indic collation in iso14651_t1_common and submit it to glibc. They will put sorting examples into http://www.indlinux.org/wiki/index.php/XYZSort for checking by the community. Timescale: 4 months.
- ICU: Gopal
- Unicode CLDR: Pavanaja, Karunakar
- Chhattisgarhi locale: Gora to connect Pravin and Rahul to Ravishankar Shrivastava.
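
The canonical-equivalence point noted at the top of this list can be checked with a small sketch. It is an illustration only, not the glibc collation work itself: it uses Python's standard `unicodedata` module, and the helper name `canonical_key` is hypothetical. The idea is that any sort key should be computed on normalized text, so that ୋ (U+0B4B) and େ + ା (U+0B47 U+0B3E) are treated as the same string.

```python
import unicodedata

# Oriya vowel sign O, and its canonically equivalent two-part spelling.
COMPOSED = "\u0B4B"          # ORIYA VOWEL SIGN O
DECOMPOSED = "\u0B47\u0B3E"  # ORIYA VOWEL SIGN E + ORIYA VOWEL SIGN AA


def canonical_key(text):
    """Hypothetical helper: normalize to NFD before any comparison.

    This only removes the composed/decomposed difference; it does not
    implement language-specific collation order.
    """
    return unicodedata.normalize("NFD", text)


KA = "\u0B15"  # ORIYA LETTER KA
word_a = KA + COMPOSED
word_b = KA + DECOMPOSED

# The raw strings differ code point by code point...
print(word_a == word_b)
# ...but their normalized keys are identical.
print(canonical_key(word_a) == canonical_key(word_b))
```

A collation table that assigns different weights to the two spellings would sort identical-looking words apart, which is why the equivalence matters for the sorting work listed here.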
Spell-checking
- Copying aspell phonetic tables to Indlinux wiki, e.g., http://www.indlinux.org/wiki/index.php/XYZPhonetic
- Final tasklist
- Gora: Incorporate changes into aspell dictionaries.
- Santhosh: Look at how to incorporate phonetic rules into Hunspell. Post on the indlinux-group list. If it is easy to do, he will do it himself.
- Web interface for dictionary review, including aspell-like affix rules. Need a volunteer.
- Agglutinative languages in aspell and Hunspell. Hunspell apparently has the capability to handle such languages. Bangla, Malayalam, Tamil. Need a volunteer to first write a problem definition and list possible approaches under existing spell-checking frameworks. Malayalam needs a run-length of up to 10.
- Merge Hunspell and aspell
- Spell-checking middleware: Sonnet, enchant, gtkspell. Need a common one for all applications. Need volunteers.
- Plugins for applications: Scribus, OpenOffice, Mozilla, IM, chat.
Documentation for the Indic desktop user
- Please contribute at http://code.indlinux.net/cgi-bin/twiki/view/Main/IndicDocs

