How to detect which language a text is written in?

Something we may want to do in Flock someday, so that history gets indexed with the right tokenizer for each language…

How to detect which language a text is written in? Or when science meets human!
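For a rough idea of what such a detector could look like, here is a minimal sketch of the character n-gram "rank profile" approach that TextCat (mentioned in the comment below) is based on. It is not taken from the linked article or from Flock: the function names, the toy training sentences, and the `top_k` and `max_n` values are all assumptions made for illustration, and a real detector would train on far more text per language.

```python
from collections import Counter

def ngram_profile(text, max_n=3, top_k=300):
    """Return the top_k most frequent character n-grams (n = 1..max_n), most frequent first."""
    # Normalize case and whitespace, and pad so word boundaries become n-grams too.
    padded = " " + " ".join(text.lower().split()) + " "
    counts = Counter()
    for n in range(1, max_n + 1):
        for i in range(len(padded) - n + 1):
            counts[padded[i:i + n]] += 1
    # Sort by descending count, then alphabetically so the ranking is deterministic.
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [gram for gram, _ in ranked[:top_k]]

def out_of_place(doc_profile, lang_profile):
    """Sum of rank differences; n-grams missing from lang_profile get a fixed maximum penalty."""
    ranks = {gram: rank for rank, gram in enumerate(lang_profile)}
    max_penalty = len(lang_profile)
    return sum(
        abs(rank - ranks[gram]) if gram in ranks else max_penalty
        for rank, gram in enumerate(doc_profile)
    )

def detect_language(text, profiles):
    """Return the language whose profile is closest to the text's profile."""
    doc = ngram_profile(text)
    return min(profiles, key=lambda lang: out_of_place(doc, profiles[lang]))

# Toy training samples (hypothetical, far too small for real use).
SAMPLES = {
    "en": "the quick brown fox jumps over the lazy dog and then the dog "
          "sleeps in the sun while the fox runs away",
    "fr": "le renard brun saute par-dessus le chien paresseux et puis le chien "
          "dort au soleil pendant que le renard s'enfuit",
}
PROFILES = {lang: ngram_profile(sample) for lang, sample in SAMPLES.items()}

if __name__ == "__main__":
    print(detect_language("the dog sleeps in the sun", PROFILES))
    print(detect_language("le chien dort au soleil", PROFILES))
```

The detected language could then be used to pick a tokenizer before indexing a page. Very short inputs give unreliable profiles, so a fallback tokenizer would probably still be needed.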

Categories: flock
  1. May 14th, 2007 at 04:29 | #1

    An older algorithm, used by SpamAssassin, and apparently also used by Maciej Ceglowski back when he was working on the NITLE blog census (http://www.hirank.com/semantic-indexing-project/census/lang.html)

    http://odur.let.rug.nl/~vannoord/TextCat/