apertium-sv-da/README

TRANSLATOR

You need apertium-3.0 and lttoolbox-3.0 to use this translator. To compile
the linguistic data, simply run:

make

inside this directory.

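Before running make, it can help to confirm that the required tools are
on the PATH. The following is a minimal sketch, not part of this package:
the helper name check_tools is hypothetical, and it only tests that the
apertium and lt-comp commands (installed by apertium and lttoolbox
respectively) can be found. It does not check their versions.

```shell
# Hypothetical helper: print the name of each command that is missing
# from PATH, and return non-zero if any of them is missing.
check_tools() {
    missing=0
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "missing: $tool"
            missing=1
        fi
    done
    return "$missing"
}

# Assumed usage, from inside this directory:
#   check_tools apertium lt-comp && make
```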
TAGGER

To use this language-pair package with apertium YOU DO NOT NEED TO
RETRAIN THE TAGGER. Probabilities and auxiliary data are provided for
both the ca-es and the es-ca translation directions, which should be
acceptable for most applications and should work even if you change
the dictionaries in a reasonable way.

If for some reason you need to retrain the tagger (for example, you
have made really extensive changes to the dictionaries such as
creating new lexical categories), you have two alternatives:

* To perform a supervised training:

  To this end tagged corpora are provided, but the tagged corpora
  (es-tagger-data/es.tagged and ca-tagger-data/ca.tagged) could be
  obsolete for some words. If this is the case, the tagger training
  program will show you where the problems are and you will need
  to solve them by hand. Be sure to solve the problems by modifying
  ONLY the .tagged file, NEVER the .untagged file, which is
  automatically generated.

  The supervised training is done by typing: make tagger_supervised

* To perform an unsupervised training:

  For this purpose you will need to assemble a large (hundreds of
  thousands of words) plain-text corpus for each language (for example,
  using a robot to harvest text from online newspapers) and put them in
  the proper place, for instance es-tagger-data/es.crp.txt and
  ca-tagger-data/ca.crp.txt. This type of training does not need human
  intervention but, as expected, results will be less accurate than
  those obtained with the supervised training.

  The unsupervised training is done through the iterative Baum-Welch
  algorithm. By default the number of iterations is set to 8, but you
  can change this value by editing the Makefile and changing the
  value of TAGGER_UNSUPERVISED_ITERATIONS.

  The unsupervised training is done by typing: make tagger_unsupervised

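The preparation for unsupervised training described above can be sketched
in shell. This is only an illustration under stated assumptions: the
helper names corpus_words and set_iterations are hypothetical, the
100000-word threshold in the usage comment is an arbitrary stand-in for
"hundreds of thousands of words", and set_iterations assumes the Makefile
assigns TAGGER_UNSUPERVISED_ITERATIONS on a single line that starts with
the variable name, which may not match the actual Makefile.

```shell
# Hypothetical helper: print the number of words in a plain-text corpus.
corpus_words() {
    wc -w < "$1" | tr -d ' '
}

# Hypothetical helper: rewrite the TAGGER_UNSUPERVISED_ITERATIONS
# assignment in a Makefile. Assumes a single-line assignment of the
# form NAME=VALUE or NAME = VALUE at the start of a line.
set_iterations() {
    makefile=$1
    count=$2
    sed "s/^\(TAGGER_UNSUPERVISED_ITERATIONS[[:space:]]*=[[:space:]]*\).*/\1$count/" \
        "$makefile" > "$makefile.tmp" && mv "$makefile.tmp" "$makefile"
}

# Assumed usage, from inside this directory:
#   [ "$(corpus_words es-tagger-data/es.crp.txt)" -ge 100000 ] \
#       || echo "corpus may be too small"
#   set_iterations Makefile 12
#   make tagger_unsupervised
```

As an alternative to editing the file, make generally lets a variable
given on the command line take precedence over an ordinary assignment
in the Makefile, so "make tagger_unsupervised
TAGGER_UNSUPERVISED_ITERATIONS=12" may have the same effect, assuming
the Makefile does not protect the variable with an override directive.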