Natural Language Processing with Java - Second Edition
MALLET is a well-known library for topic modeling. It also supports document classification and sequence tagging. More about MALLET can be found at http://mallet.cs.umass.edu/index.php. To download MALLET, visit http://mallet.cs.umass.edu/download.php (the latest version at the time of writing is 2.0.6). Once downloaded, extract the archive into a directory of your choice. Sample data in .txt format is included under the sample-data/web/en path of the MALLET directory.
The first step is to import the files into MALLET's internal format. To do this, open the Command Prompt or Terminal, move to the mallet directory, and execute the following command:
mallet-2.0.6$ bin/mallet import-dir --input sample-data/web/en --output tutorial.mallet --keep-sequence --remove-stopwords
This command will generate the tutorial.mallet file.
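Since this book works in Java, the same import step can also be launched from a Java program. The following is only a minimal sketch, not MALLET's own API: it starts the command-line tool as an external process using the standard ProcessBuilder class. The class name MalletImportSketch and the helper buildImportCommand are hypothetical names chosen for illustration, and the sketch assumes a Unix-like system where the launcher script is bin/mallet inside the extracted directory. The flags are exactly those of the import-dir command above.

```java
import java.io.File;
import java.io.IOException;
import java.util.List;

public class MalletImportSketch {

    // Hypothetical helper: assembles the import-dir arguments used in the text.
    static List<String> buildImportCommand(String malletExecutable,
                                           String inputDir,
                                           String outputFile) {
        return List.of(
            malletExecutable, "import-dir",
            "--input", inputDir,
            "--output", outputFile,
            "--keep-sequence",
            "--remove-stopwords");
    }

    public static void main(String[] args) throws IOException, InterruptedException {
        // Assumption: the first argument is the extracted MALLET directory;
        // without it, the current directory is used.
        File malletHome = new File(args.length > 0 ? args[0] : ".");
        File launcher = new File(malletHome, "bin/mallet");

        List<String> command = buildImportCommand(
            launcher.getAbsolutePath(), "sample-data/web/en", "tutorial.mallet");
        System.out.println(String.join(" ", command));

        if (!launcher.isFile()) {
            System.out.println("bin/mallet not found; skipping the import.");
            return;
        }

        // Run inside the MALLET directory so the relative paths resolve there.
        Process process = new ProcessBuilder(command)
            .directory(malletHome)
            .inheritIO()
            .start();
        System.out.println("import-dir exit code: " + process.waitFor());
    }
}
```

Running the class with the MALLET directory as its only argument prints the assembled command and, if the launcher script is present, writes tutorial.mallet into that directory, just as the terminal command does.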
The next step is to build a topic model with the train-topics command and save the output-state, topic-keys, and topics:
mallet-2.0.6$ bin/mallet train-topics -...