DOCS/tech/encoding-tips.txt


Some important URLs:
~~~~~~~~~~~~~~~~~~~~
http://mplayerhq.hu/~michael/codec-features.html               <- lavc vs. divx5 vs. xvid
http://www.ee.oulu.fi/~tuukkat/mplayer/tests/rguyom.ath.cx/    <- lavc benchmarks, options
http://ffdshow.sourceforge.net/tikiwiki/tiki-view_articles.php <- lavc for win32 :)
http://www.bunkus.org/dvdripping4linux/index.html              <- a nice tutorial
http://forum.zhentarim.net/viewtopic.php?p=237                 <- lavc option comparison
http://www.ee.oulu.fi/~tuukkat/mplayer/tests/readme.html       <- series of benchmarks
http://thread.gmane.org/gmane.comp.video.mencoder.user/1196    <- free codecs shoutout and recommended encoding settings


================================================================================


FIXING A/V SYNC WHEN ENCODING

I know this is a popular topic on the list, so I thought I'd share a
few comments on my experience fixing a/v sync. As everyone seems to
know, mencoder unfortunately doesn't have a -delay option. But that
doesn't mean you can't fix a/v sync. There are a couple of ways to
still do it.

In example 1, we'll suppose you want to re-encode the audio anyway.
This will be essential if your source audio isn't mp3, e.g. for DVDs
or nasty avi files with divx/wma audio. This approach makes things
much easier.

Step 1: Dump the audio with mplayer -ao pcm -nowaveheader. There are
various options that can be used to speed this up, most notably -vo
null, -vc null, and/or -hardframedrop. -benchmark also seemed to help
in the past. :)

Step 2: Figure out what -delay value syncs the audio right in mplayer.
If this number is positive, use a command like the following:

dd if=audiodump.wav bs=1764 skip=[delay] | lame -x - out.mp3

where [delay] is replaced by your delay amount in hundredths of a
second (1/10 the value you use with mplayer). Otherwise, if delay is
negative, use a command like this:

( dd if=/dev/zero bs=1764 count=[delay] ; cat audiodump.wav ) | lame -x - out.mp3

Don't include the minus (-) sign in delay. Also, keep in mind you'll
have to change the 1764 number (the size in bytes of one hundredth of
a second of audio) and provide additional options to lame
if your audio stream isn't 44100/16bit/little-endian/stereo.
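
For other audio formats the dd block size has to be recomputed. A small
Python sketch of that arithmetic follows; the function name is made up for
this illustration, and it assumes raw PCM with a whole number of bytes per
sample.

```python
def dd_block_size(sample_rate_hz, bits_per_sample, channels):
    """Bytes of raw PCM audio in one hundredth of a second."""
    bytes_per_second = sample_rate_hz * (bits_per_sample // 8) * channels
    if bytes_per_second % 100:
        # dd needs a whole number of bytes per block
        raise ValueError("1/100 s is not a whole number of bytes")
    return bytes_per_second // 100

print(dd_block_size(44100, 16, 2))  # 1764, the value used in the dd commands
print(dd_block_size(48000, 16, 2))  # 1920, for 48 kHz 16-bit stereo
```

The result is what goes into bs=, while skip= or count= stays the delay in
hundredths of a second.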

Step 3: Use mencoder to remux your new mp3 file with the movie:

mencoder -audiofile out.mp3 -oac copy ...

You can either copy the video as-is (with -ovc copy) or re-encode it
at the same time as you merge in the audio.

Finally, as a variation on this method (it makes things a good bit
faster and doesn't use tons of temporary disk space), you can merge
steps 1 and 2 by making a named pipe called "audiodump.wav" (type
mkfifo audiodump.wav) and having mplayer write the audio to it at the
same time you're running lame to encode.

Now for example 2. This time we won't re-encode audio at all. Just
dump the mp3 stream from the avi file with mplayer -dumpaudio. Then,
you have to cut and paste the raw mp3 stream a bit...

If delay is negative, things are easier. Just use lame to encode
silence for the duration of delay, at the same samplerate and
samplesize used in your avi file. Then, do something like:

cat silence.mp3 stream.dump > out.mp3
mencoder -audiofile out.mp3 -oac copy ...

On the other hand, if delay is positive, you'll need to crop off part
of the mp3 from the beginning. If it's (at least roughly) CBR this is
easy -- just take off the first (bitrate*delay/8) bytes of the file,
with bitrate in bits per second and delay in seconds.
You can use the excellent dd tool, or just your favorite
binary-friendly text editor to do this. Otherwise, you'll have to
experiment with cutting off different amounts. You can test with
mplayer -audiofile before actually spending time remuxing/encoding
with mencoder to make sure you cut the right amount.
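
As a worked example of the CBR case, here is the (bitrate*delay/8) rule in
Python. The function name and the sample numbers (128 kbit/s, half a second)
are assumptions chosen only for illustration.

```python
def cbr_bytes_to_trim(bitrate_bps, delay_s):
    """Bytes to cut from the start of a CBR stream for a positive delay."""
    return int(round(bitrate_bps * delay_s / 8))

# Hypothetical stream: 128 kbit/s CBR mp3, audio has to start 0.5 s later.
print(cbr_bytes_to_trim(128000, 0.5))  # 8000
```

The resulting number can be used as the amount to skip with dd.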

I hope this has all been informative. If anyone would like to clean
this message up a bit and make it into part of the docs, feel free. Of
course mencoder should eventually just get -delay. :)

Rich


================================================================================


ENCODING QUALITY - OR WHY AUTOMATISM IS BAD.

Hi everyone.

Some days ago someone suggested adding some preset options to mencoder.
At that time I replied 'don't do that', and now I decided to elaborate
on that.

Warning: this is rather long, and it involves mathematics. But if you
don't want to bother with either then why are you encoding in the
first place? Go do something different!

The good news is: it's all about the bpp (bits per pixel).

The bad news is: it's not THAT easy ;)

This mail is about encoding a DVD to MPEG4. It's about the video
quality, not (primarily) about the audio quality or some other fancy
things like subtitles.

The first step is to encode the audio. Why? Well, if you encode the
audio prior to the video, only the video has to be made to fit onto one
(or two) CD(s). That way you can use oggenc's quality based encoding
mode, which is much more sophisticated than its ABR based mode.

After encoding the audio you have a certain amount of space left to
fill with video. Let's assume the audio takes 60M (no problem with
Vorbis), and you aim at a 700M CD. This leaves you 640M for the video.
Let's further assume that the video is 100 minutes or 6000 seconds
long, encoded at 25fps (those nasty NTSC fps values give me
headaches. Adjust to your needs, of course!). This leaves you with
a video bitrate of:

                $videosize * 8
$videobitrate = --------------
                $length * 1000

$videosize in bytes, $length in seconds, $videobitrate in kbit/s.
In my example I end up with $videobitrate = 895.
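
The same calculation as a short Python sketch. The function name is invented
for this example, and it assumes that "M" above means 1024*1024 bytes, which
is the reading that reproduces the 895 kbit/s figure.

```python
def video_bitrate_kbps(videosize_bytes, length_s):
    """Video bitrate in kbit/s for a given video size and running time."""
    return videosize_bytes * 8 / (length_s * 1000)

MIB = 1024 * 1024                  # assumption: "M" means 2**20 bytes
video_bytes = (700 - 60) * MIB     # CD size minus the encoded audio
print(round(video_bitrate_kbps(video_bytes, 100 * 60)))  # 895
```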

And now comes the question: how do I choose my encoding parameters
so that the results will be good? First let's take a look at a
typical mencoder line:

mencoder dvd://1 -o /dev/null -oac copy -ovc lavc \
  -lavcopts vcodec=mpeg4:vbitrate=1000:vhq:vqmin=2:\
vlelim=-4:vcelim=9:lumi_mask=0.05:dark_mask=0.01:vpass=1 \
  -vf crop=716:572:2:2,scale=640:480

Phew, all those parameters! Which ones should I change? NEVER leave
out 'vhq'. Never ever. 'vqmin=2' is always good if you aim for sane
settings - like 'normal length' movies on one CD, 'very long movies'
on two CDs and so on. vcodec=mpeg4 is mandatory.

The 'vlelim=-4:vcelim=9:lumi_mask=0.05:dark_mask=0.01' are parameters
suggested by D Richard Felker for non-animated movies, and they
improve quality a bit.

But the two things that have the most influence on quality are
vbitrate and scale. Why? Because both together tell the codec how
many bits it may spend on each frame for each pixel: and this is
the 'bpp' value (bits per pixel). It's simply defined as

         $videobitrate * 1000
$bpp = -----------------------
       $width * $height * $fps

I've attached a small Perl script that calculates the $bpp for
a movie. You'll have to give it four parameters:
a) the cropped but unscaled resolution (use '-vf cropdetect'),
b) the encoded aspect ratio. All (PAL) DVDs come at 720x576 but contain
a flag that tells the player whether it should display the DVD at
an aspect ratio of 4/3 (1.333) or at 16/9 (1.777). Have a look
at mplayer's output - there's something about 'prescaling'. That's
what you are looking for.
c) the video bitrate in kbit/s and
d) the fps.

In my example the command line and calcbpp.pl's output would look
like this (warning - long lines ahead):

mosu@anakin:~$ ./calcbpp.pl 720x440 16/9 896 25
Prescaled picture: 1023x440, AR 2.33
720x304, diff   5, new AR 2.37, AR error 1.74% scale=720:304 bpp: 0.164
704x304, diff  -1, new AR 2.32, AR error 0.50% scale=704:304 bpp: 0.167
688x288, diff   8, new AR 2.39, AR error 2.58% scale=688:288 bpp: 0.181
672x288, diff   1, new AR 2.33, AR error 0.26% scale=672:288 bpp: 0.185
656x288, diff  -6, new AR 2.28, AR error 2.17% scale=656:288 bpp: 0.190
640x272, diff   3, new AR 2.35, AR error 1.09% scale=640:272 bpp: 0.206
624x272, diff  -4, new AR 2.29, AR error 1.45% scale=624:272 bpp: 0.211
608x256, diff   5, new AR 2.38, AR error 2.01% scale=608:256 bpp: 0.230
592x256, diff  -2, new AR 2.31, AR error 0.64% scale=592:256 bpp: 0.236
576x240, diff   8, new AR 2.40, AR error 3.03% scale=576:240 bpp: 0.259
560x240, diff   1, new AR 2.33, AR error 0.26% scale=560:240 bpp: 0.267
544x240, diff  -6, new AR 2.27, AR error 2.67% scale=544:240 bpp: 0.275
528x224, diff   3, new AR 2.36, AR error 1.27% scale=528:224 bpp: 0.303
512x224, diff  -4, new AR 2.29, AR error 1.82% scale=512:224 bpp: 0.312
496x208, diff   5, new AR 2.38, AR error 2.40% scale=496:208 bpp: 0.347
480x208, diff  -2, new AR 2.31, AR error 0.85% scale=480:208 bpp: 0.359
464x192, diff   7, new AR 2.42, AR error 3.70% scale=464:192 bpp: 0.402
448x192, diff   1, new AR 2.33, AR error 0.26% scale=448:192 bpp: 0.417
432x192, diff  -6, new AR 2.25, AR error 3.43% scale=432:192 bpp: 0.432
416x176, diff   3, new AR 2.36, AR error 1.54% scale=416:176 bpp: 0.490
400x176, diff  -4, new AR 2.27, AR error 2.40% scale=400:176 bpp: 0.509
384x160, diff   5, new AR 2.40, AR error 3.03% scale=384:160 bpp: 0.583
368x160, diff  -2, new AR 2.30, AR error 1.19% scale=368:160 bpp: 0.609
352x144, diff   7, new AR 2.44, AR error 4.79% scale=352:144 bpp: 0.707
336x144, diff   0, new AR 2.33, AR error 0.26% scale=336:144 bpp: 0.741
320x144, diff  -6, new AR 2.22, AR error 4.73% scale=320:144 bpp: 0.778
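
The bpp column can be checked by hand with the formula given earlier. This
Python sketch is not calcbpp.pl itself (the function name is made up); it
only recomputes the bpp for three of the listed resolutions at 896 kbit/s
and 25 fps.

```python
def bits_per_pixel(videobitrate_kbps, width, height, fps):
    """bpp = bitrate in bit/s divided by the pixels encoded per second."""
    return videobitrate_kbps * 1000 / (width * height * fps)

for width, height in [(720, 304), (640, 272), (320, 144)]:
    bpp = bits_per_pixel(896, width, height, 25)
    print(f"scale={width}:{height} bpp: {bpp:.3f}")
# scale=720:304 bpp: 0.164
# scale=640:272 bpp: 0.206
# scale=320:144 bpp: 0.778
```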

A word for the $bpp. For a fictional movie which is only black and
white (one bit per pixel): if you have a $bpp of 1 then the movie
would be stored uncompressed :) For a real life movie with 24bit color
depth you need compression of course. And the $bpp can be used to make
the decision easier.

As you can see the resolutions suggested by the script are all
divisible by 16. This will make the aspect ratio slightly wrong,
but no one will notice.
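
To illustrate the rounding to multiples of 16, here is a small Python sketch.
It is not the logic of calcbpp.pl: the function names, the rounding rule
(nearest multiple) and the way the aspect ratio error is defined are all
assumptions made for this example, so the numbers need not match the
script's output in every row.

```python
def nearest_multiple_of_16(value):
    """Round a dimension to the nearest multiple of 16 (at least 16)."""
    return max(16, ((int(round(value)) + 8) // 16) * 16)

def scaled_size(width, aspect_ratio):
    """Pick a height for the given width and report the AR error in percent."""
    height = nearest_multiple_of_16(width / aspect_ratio)
    ar_error_percent = abs(width / height - aspect_ratio) / aspect_ratio * 100
    return width, height, ar_error_percent

# Prescaled picture from the example above: 1023x440
width, height, error = scaled_size(640, 1023 / 440)
print(f"{width}x{height}, AR error {error:.2f}%")
```

For a width of 640 this picks a height of 272, the same pair the script lists.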

Now if you want to decide which resolution (and scaling parameters)
to choose you can do that by looking at the $bpp:

< 0.10: don't do it. Please. I beg you!
< 0.15: It will look bad.
< 0.20: You will notice blocks, but it will look ok.
< 0.25: It will look really good.
> 0.25: It won't really improve visually.
> 0.30: Don't do that either - try a bigger resolution instead.
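
The list above can be turned into a small lookup, sketched here in Python.
The function name and the short verdict strings are made up for this
example, and treating exactly 0.25 to 0.30 as one band is an assumption,
since the list leaves the boundaries open.

```python
def judge_bpp(bpp):
    """Map a bpp value to the rough verdicts from the list above."""
    if bpp < 0.10:
        return "don't do it"
    if bpp < 0.15:
        return "will look bad"
    if bpp < 0.20:
        return "blocks noticeable, but ok"
    if bpp < 0.25:
        return "really good"
    if bpp <= 0.30:
        return "no real visual improvement"
    return "try a bigger resolution instead"

# bpp values taken from the calcbpp.pl listing for 720x304, 640x272, 320x144
for bpp in (0.164, 0.206, 0.778):
    print(bpp, "->", judge_bpp(bpp))
```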

Of course these values are not absolutes! For movies with really lots
of black areas 0.15 may look very good. Action movies with only high
motion scenes on the other hand may not look perfect at 0.25. But these
values give you a great idea about which resolution to choose.

I see a lot of people always using 512 for the width and scaling
the height accordingly. For my (real-world-)example this would be
simply a waste of bandwidth. The encoder would probably not even
need the full bitrate, and the resulting file would be smaller
than my targeted 700M.

After encoding you'll do your 'quality check'. First fire up the movie
and see whether it looks good to you or not. But you can also do a
more 'scientific' analysis. The second Perl script I attached counts
the quantizers used for the encoding. Simply call it with

countquant.pl < divx2pass.log

It will print out which quantizer was used how often. If you see that
e.g. the lowest quantizer (vqmin=2) gets used for > 95% of the frames
then you can safely increase your picture size.
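
The kind of statistic described here can be sketched in a few lines of
Python. This is not countquant.pl: reading the quantizers out of
divx2pass.log is left out because the log format is not described in this
text, so the input is simply a list of per-frame quantizers, and the sample
data is invented.

```python
from collections import Counter

def quantizer_stats(quantizers):
    """Return the share of frames per quantizer and the average quantizer."""
    counts = Counter(quantizers)
    total = len(quantizers)
    shares = {q: counts[q] / total for q in sorted(counts)}
    average = sum(quantizers) / total
    return shares, average

# Invented example: 100 frames, almost all encoded at the minimum quantizer.
frames = [2] * 96 + [3] * 3 + [4]
shares, average = quantizer_stats(frames)
for q, share in shares.items():
    print(f"q={q}: {share:.0%} of frames")
print(f"average quantizer: {average:.2f}")
```

With q=2 used for more than 95% of the frames, as in this made-up data, the
rule of thumb above says the picture size can safely be increased.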

> The "counting the quantesizer"-thing could improve the quality of
> full automated scripts, as I understand ?

Yes, the log file analysis can be used by tools to automatically adjust
the scaling parameters (if you'd do that you'd end up with a three-pass
encoding for the video only ;)), but it can also provide answers for
you as a human. From time to time there's a question like 'hey,
mencoder creates files that are too small! I specified this bitrate and
the resulting file is 50megs short of the target file size!'. The
reason is probably that the codec already uses the minimum quantizer
for nearly all frames so it simply does not need more bits. A quick
glance at the distribution of the quantizers can be enlightening.

Another thing is that q=2 and q=3 look really good while the 'bigger'
quantizers really lose quality. So if your distribution shows the
majority of quantizers at 4 and above then you should probably decrease
the resolution (you'll definitely see block artefacts).


Well... Several people will probably disagree with me on certain
points here, especially when it comes down to hard values (like the
$bpp categories and the percentage of the quantizers used). But
the idea is still valid.

And that's why I think that there should NOT be presets in mencoder
like the presets lame knows. 'Good quality' or 'perfect quality' are
ALWAYS relative. They always depend on a person's personal preferences.
If you want good quality then spend some time reading and - more
important - understanding what steps are involved in video encoding.
You cannot do it without mathematics. Oh well, you can, but you'll
end up with movies that could certainly look better.

Now please shoot me if you have any complaints ;)

--
 ==> Ciao, Mosu (Moritz Bunkus)

===========
ANOTHER APPROACH: BITS PER BLOCK:

>          $videobitrate * 1000
> $bpp = -----------------------
>        $width * $height * $fps

Well, I came to a similar equation going through a different route. Only I
didn't use bits per pixel, in my case it was bits per block (BPB). The block
is 16x16 because lots of software depends on video width/height being
divisible by 16. And because I didn't like this 0.2 bit per pixel, when
a bit is quite atomic ;)

So the equation was something like:

                     bitrate
bpb = ------------------------------------
      fps * ((width * height) / (16 * 16))

(width and height are from destination video size, and bitrate is in
bits per second (i.e. 900kbps is 900000))
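
A quick Python sketch of this equation. The function name is made up, and
apart from the 900 kbps figure the sample values (640x272 at 25 fps) are
assumptions picked for illustration.

```python
def bits_per_block(bitrate_bps, width, height, fps):
    """Bits available per 16x16 block per frame."""
    blocks_per_frame = (width * height) / (16 * 16)
    return bitrate_bps / (fps * blocks_per_frame)

print(f"{bits_per_block(900000, 640, 272, 25):.1f}")  # 52.9
```

Since a block holds 16 * 16 = 256 pixels, the bpb is simply 256 times the
bpp from the quoted formula.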

This way it appeared that the minimum bits per block is ~40, very
good results are with ~50, and everything above 60 is a waste of bandwidth.
And what's actually funny is that it was independent of the codec used. The
results were exactly the same, whether I used DIV3 (with tricky nandub's
magick), ffmpeg odivx, DivX5 on Windows or Xvid.

Surprisingly there is one advantage of using nandub-DIV3 for bitrate
starved encoding: ringing almost never appears this way.

But I also found out that the quality/BPB isn't constant for
drastically different resolutions. Smaller pictures (like MPEG1 sizes)
need more BPB to look good than, say, typical MPEG2 resolutions.

Robert


===========
DON'T SCALE DOWN TOO MUCH

Sometimes I found that encoding to y-scaled only DVD quality (i.e. 704 x
288 for a 2.85 film) gives better visual quality than a scaled-down
version even if the quantizers are significantly higher than for the
scaled-down version.
Keep in mind that blocks, fuzzy parts and generally mpeg artefacts in a
704x288 image will be harder to spot in full-screen mode than on a
512x208 image. In fact I've seen examples where the same movie looks
better compressed to 704x288 with an average weighted quantizer of
~3 than the same movie scaled to 576x240 with an average weighted
quantizer of 2.4.
Btw, a print of the weighted average quantizer would be nice in
countquant.pl :)

Another point in favor of not trying to scale down too much: on hard
scaled-down movies, the MPEG codec will need to compress relatively
high frequencies rather than low frequencies and it doesn't like that
at all. You will see diminishing returns while you scale down and
scale down again in desperate need of some bandwidth :)

In my experience, don't try to go below a width of 576 without closely
watching what's going on.

--
Rémi

===========
TIPS FOR ENCODING

That being said, with video you have some tradeoffs you can make. Most
people seem to encode with really basic options, but if you play with
single coefficient elimination and luma masking settings, you can save lots
of bits, resulting in lower quantizers, which means less blockiness and
less ugly noise (ringing) around sharp borders. The tradeoff, however, is
that you'll get some "muddiness" in some parts of the image. Play around
with the settings and see for yourself. The options I typically use for
(non-animated) movies are:

vlelim=-4
vcelim=9
lumi_mask=0.05
dark_mask=0.01

If things look too muddy, try making the numbers closer to 0. For anime and
other animation, the above recommendations may not be so good.

Another option that may be useful is allowing four motion vectors per
macroblock (v4mv). This will increase encoding time quite a bit, and
last I checked it wasn't compatible with B frames. AFAIK, specifying
v4mv should never reduce quality, but it may prevent some old junky
versions of DivX from decoding it (can anyone confirm?). Another issue
might be increased cpu time needed for decoding (again, can anyone
confirm?).

To get a fairer distribution of bits between low-detail and
high-detail scenes, you should probably try increasing vqcomp from the
default (0.5) to something in the range 0.6-0.8.

Of course you also want to make sure you crop ALL of the black border and
any half-black pixels at the edge of the image, and make sure the final
image dimensions after cropping and scaling are multiples of 16. Failing to
do so will drastically reduce quality.

Finally, if you can't seem to get good results, you can try scaling the
movie down a bit smaller or applying a weak gaussian blur to reduce the
amount of detail.

Now, my personal success story! I just recently managed to fit a beautiful
encode of Kundun (well over 2 hours long, but not too many high-motion
scenes) on one cd at 640x304, with 66 kbit/sec abr ogg audio, using the
options I described above. So, IMHO it's definitely possible to get very
good results with libavcodec (certainly MUCH better than all the idiot
"release groups" using DivX3 make), as long as you take some time to play
around with the options.


Rich

============
ABOUT VLELIM, VCELIM, LUMI_MASK AND DARK_MASK PART I: LUMA & CHROMA


The l/c in vlelim and vcelim stands for luma (brightness plane) and chroma
(color planes). These are encoded separately in all mpeg-like algorithms.
Anyway, the idea behind these options is (at least from what I understand)
to use some good heuristics to determine when the change in a block is less
than the threshold you specify, and in such a case, to just encode the
block as "no change". This saves bits and perhaps speeds up encoding. Using
a negative value for either one means the same thing as the corresponding
positive value, but the DC coefficient is also considered. Unfortunately
I'm not familiar enough with the mpeg terminology to know what this means
(my first guess would be that it's the constant term from the DCT), but it
probably makes the encoder less likely to apply single coefficient
elimination in cases where it would look bad. It's presumably recommended
to use negative values for luma (which is more noticeable) and positive for
chroma.

The other options -- lumi_mask and dark_mask -- control how the quantizer
is adjusted for really dark or bright regions of the picture. You're
probably already at least a bit familiar with the concept of quantizers
(qscale, lower = more precision, higher quality, but more bits needed to
encode). What not everyone seems to know is that the quantizer you see
(e.g. in the 2pass logs) is just an average for the whole frame, and lower
or higher quantizers may in fact be used in parts of the picture with more
or less detail. Increasing the values of lumi_mask and dark_mask will cause
lavc to aggressively increase the quantizer in very dark or very bright
regions of the picture (which are presumably not as noticeable to the human
eye) in order to save bits for use elsewhere.

Rich

===================
ABOUT VLELIM, VCELIM, LUMI_MASK AND DARK_MASK PART II: VQSCALE

OK, a quick explanation. The quantizer you set with vqscale=N is the
per-frame quantizer parameter (aka qp). However, with mpeg4 it's
allowed (and recommended!) for the encoder to vary the quantizer on a
per-macroblock (mb) basis (as I understand it, macroblocks are 16x16
regions composed of 4 8x8 luma blocks and 2 8x8 chroma blocks, u and
v). To do this, lavc scores each mb with a complexity value and
weights the quantizer accordingly. However, you can control this
behavior somewhat with scplx_mask, tcplx_mask, dark_mask, and
lumi_mask.

scplx_mask -- raise quantizer on mb's with lots of spatial complexity.
Spatial complexity is measured by variance of the texture (this is
just the actual image for I blocks and the difference from the
previous coded frame for P blocks).

tcplx_mask -- raise quantizer on mb's with lots of temporal
complexity. Temporal complexity is measured according to motion
vectors.

dark_mask -- raise quantizer on very dark mb's.

lumi_mask -- raise quantizer on very bright mb's.

Somewhere around 0-0.15 is a safe range for these values, IMHO. You
might try as high as 0.25 or 0.3. You should probably never go over
0.5 or so.

Now, about naq. When you adjust the quantizers on a per-mb basis like
this (called adaptive quantization), you might decrease or (more
likely) increase the average quantizer used, so that it no longer
matches the requested average quantizer (qp) for the frame. This will
result in weird things happening with the bitrate, at least from my
experience. What naq does is "normalize adaptive quantization". That
is, after the above masking parameters are applied on a per-mb basis,
the quantizers of all the blocks are rescaled so that the average
stays fixed at the desired qp.
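
The idea of rescaling so that the average stays at qp can be sketched in
Python as below. This is only an illustration of the principle, not lavc's
actual code: the function name, the simple proportional rescale and the
sample quantizers are all assumptions, and a real encoder would also have to
round and clip the values to valid quantizers.

```python
def normalize_quantizers(mb_quantizers, frame_qp):
    """Rescale per-macroblock quantizers so their mean equals frame_qp."""
    average = sum(mb_quantizers) / len(mb_quantizers)
    scale = frame_qp / average
    return [q * scale for q in mb_quantizers]

# Invented per-mb quantizers after masking; their average has drifted to 6.
masked = [4, 4, 6, 8, 5, 9]
normalized = normalize_quantizers(masked, frame_qp=4)
print([round(q, 2) for q in normalized])
print(round(sum(normalized) / len(normalized), 6))  # back at the requested qp
```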

So, if I used vqscale=4 with naq and fairly large values for the
masking parameters, I might be likely to see lots of frames using
qscale 2,3,4,5,6,7 across different macroblocks as needed, but with
the average sticking around 4. However, I haven't actually tested such
a setup yet, so it's just speculation right now.

Have fun playing around with it.

Rich


================================================================================


TIPS FOR ENCODING OLD BLACK & WHITE MOVIES:

I found myself that 4:3 B&W old movies are very hard to compress well. In
addition to the 4:3 aspect ratio which eats lots of bits, those movies are
typically very "noisy", which doesn't help at all. Anyway:

> After a few tries I am
> still a little bit disappointed with the video quality. Since it is a
> "dark" movies, there is a lot of black on the pictures, and on the
> encoded avi I can see a lot of annoying "mpeg squares". I am using
> avifile codec, but the best I think is to give you the command line I
> used to encode a preview of the result:

>
> First pass:
> mencoder TITLE01-ANGLE1.VOB -oac copy -ovc lavc -lavcopts
> vcodec=mpeg4:vhq:vpass=1:vbitrate=800:keyint=48 -ofps 23.976 -npp lb
> -ss 2:00 -endpos 0:30 -vf scale -zoom -xy 640 -o movie.avi

1) keyint=48 is way too low. The default value is 250, and this is in
*frames*, not seconds. Keyframes are significantly larger than P or B
frames, so the fewer keyframes you have, the better the overall movie will
be. (huh, like Yoda I speak ;). Try keyint=300 or 350. Don't go beyond that
if you want relatively precise seeking.
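
To get a feeling for what these keyint values mean in seconds, here is a
tiny Python sketch. The function name is made up; the fps value is the
-ofps 23.976 from the quoted command line.

```python
def keyint_seconds(keyint_frames, fps):
    """Longest gap between keyframes, in seconds, for a keyint in frames."""
    return keyint_frames / fps

for keyint in (48, 250, 300, 350):
    gap = keyint_seconds(keyint, 23.976)
    print(f"keyint={keyint}: at most {gap:.1f} s between keyframes")
# keyint=48: at most 2.0 s between keyframes
# keyint=250: at most 10.4 s between keyframes
# keyint=300: at most 12.5 s between keyframes
# keyint=350: at most 14.6 s between keyframes
```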

2) you may want to play with the vlelim and vcelim options. This can give
you a significant "quality" boost. Try one of these pairs:

vlelim=-2:vcelim=3
vlelim=-3:vcelim=5
vlelim=-4:vcelim=7
(and yes, there's a minus)

3) crop & rescale the movie before passing it to the codec. First crop the
movie so as not to encode black bars, if there are any. For a 1h40mn movie
compressed to a 700 MB file, I would try something between 512x384 and
480x320. Don't go below that if you want something relatively sharp when
viewed fullscreen.

4) I would recommend using the Ogg Vorbis audio codec with the .ogm
container format. Ogg Vorbis compresses audio better than MP3. On a typical
old, mono-only audio stream, a 45 kbit/s Vorbis stream is ok. How to
extract & compress an audio stream from a ripped DVD (mplayer dvd://1
-dumpstream):

rm -f audiodump.pcm ; mkfifo -m 600 audiodump.pcm
mplayer -quiet -vc null -vo null -aid 128 -ao pcm -nowaveheader stream.dump &
oggenc --raw --raw-bits=16 --raw-chan=2 --raw-rate=48000 -q 1 -o audio-us.ogg \
  audiodump.pcm &
wait

For a nice set of utilities to manage the .ogm format, see Moritz Bunkus'
ogmtools (google is your friend).

5) use the "v4mv" option. This could gain you a few more bits at the
expense of a slightly longer encoding. This is a "lossless" option, I mean
with this option you don't throw away any video information, it just
selects a more precise motion estimation method. Be warned that on some
very atypical scenes this option may give you a longer file than
without, although it's very rare and on a whole film I think it's always a
win.

6) you can try the new luminance & darkness masking code. Play
with the "lumi_mask" and "dark_mask" options. I would recommend using
something like:
lumi_mask=0.07:dark_mask=0.10:naq
lumi_mask=0.10:dark_mask=0.12:naq
lumi_mask=0.12:dark_mask=0.15:naq
lumi_mask=0.13:dark_mask=0.16:naq
Be warned that these options are really experimental and the result
could be very good or very bad depending on your display device
(computer CRT, TV or TFT screen). Don't push these options too hard.

> Second pass:
> the same with vpass=2

7) I've found that lavc gives better results when the first pass is done
with "vqscale=2" instead of a target bitrate. The statistics collected
seem to be more precise. YMMV.

> I am new to mencoder, so please tell me any idea you have even if it
> obvious. I also tried the "gray" option of lavc, to encode B&W only,
> but strangely it gives me "pink" squares from time to time.

Yes, I've seen that too. Playing the resulting file with "-lavdopts gray"
fixes the problem but it's not very nice ...

> So if you could tell me what option of mencoder or lavc I should be
> looking at to lower the number of "squares" on the image, it would be
> great. The version of mencoder i use is 0.90pre8 on a macos x PPC
> platform. I guess I would have the same problem by encoding anime
> movies, where there are a lot of region of the image with the same
> color. So if you managed to solve this problem...

You could also try the "mpeg_quant" flag. It selects a different set of
quantizers and produces somewhat sharper pictures and fewer blocks on large
zones with the same or similar luminance, at the expense of some bits.

> This is completely off topic, but do you know how I can create good
> subtitles from vobsub subtitles ? I checked the -dumpmpsub option of
> mplayer, but is there a way to do it really fast (ie without having to
> play the whole movie) ?

I didn't find a way under *nix to produce reasonably good text subtitles
from vobsubs. OCR software for *nix seems either not suited to the task, not
powerful enough or both. I'm extracting the vobsub subtitles and simply use
them with the .ogm / .avi:
1) rip the DVD to harddisk with "mplayer dvd://1 -dumpstream"
2) mount the DVD and copy the .ifo file
3) extract all vobsubs to one single file with something like:

for f in 0 1 2 3 4 5 6 7 8 9 10 11 ; do
    mencoder -ovc copy -oac copy -o /dev/null -sid $f -vobsubout sous-titres \
        -vobsuboutindex $f -ifo vts_01_0.ifo stream.dump
done

(and yes, I've a DVD with 12 subtitles)
--
Rémi


================================================================================


 611 TIPS FOR SMOKE & CLOUDS
 612
 613 Q: I'm trying  to encode Dante's Peak and I'm  having problems with clouds,
 614 fog and  smoke: They don't  look fine  (they look very  bad if I  watch the
 615 movie  in TV-out).  There are  some artifacts,  white clouds  looks as  snow
 616 mountains, there are things likes hip in the colors so one can see frontier
 617 curves between white and light gray and  dark gray ... (I don't know if you
 618 can understand me, I want to mean that the colors don't change smoothly)
 619 In particular I'm using vqscale=2:vhq:v4mv
 620
 621 A: Try adding "vqcomp=0.7:vqblur=0.2:mpeg_quant" to lavcopts.
 622
 623 Q: I tried your suggestion and it  improved the image a little ... but not
 624 enough. I was playing with different  options and I couldn't find the way.
 625 I  suppose that  the vob  is not  so good  (watching it  in TV  trough the
 626 computer looks better than my encoding, but it isn't a lot of better).
 627
 628 A: Yes, those scenes with qscale=2 looks terrible :-(
 629
 630 Try with  vqmin=1 in addition to  mpeg_quant:vlelim=-4:vcelim=-7 (and maybe
 631 with "-sws 10 -ssf ls=1" to sharpen a bit the image) and read about vqmin=1
 632 in DOCS/tech/libavc-options.txt.
 633
 634 If after the whole movie is encoded you still see the same problem, it will
 635 means that the  second pass didn't picked-up q=1 for  this scene. Force q=1
 636 with the "vrc_override" option.
 637
 638 Q: By the way, is there a special difficult in encode clouds or smoke?
 639
 640 A: I would say it depends on the sharpness of the clouds or smoke and on
 641 whether they are mostly black/white/grey or colored. The codec will do
 642 the right thing with vqmin=2, for example, on cigarette smoke (sharp) or
 643 on a red/yellow cloud (explosion, cloud of fire). But it may not with a
 644 grey and very fuzzy cloud like in the chocolat scene. Note that I don't
 645 know exactly why ;)
 646
 647 A = Rémi
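
As a rough illustration of the thread above, the suggested options can be
joined into a single -lavcopts string. This is only a sketch: the helper
name build_lavcopts and the frame range are made up, pass and bitrate
options are left out, and the vrc_override syntax
(start_frame,end_frame,quantizer, several ranges separated by "/") is
assumed from the MPlayer man page, so check it against your version.

```python
def build_lavcopts(options, overrides=()):
    """Join lavc options into the colon-separated string given to -lavcopts."""
    parts = list(options)
    if overrides:
        # Assumed syntax: start_frame,end_frame,quantizer, ranges joined by "/".
        spans = "/".join(f"{start},{end},{q}" for start, end, q in overrides)
        parts.append(f"vrc_override={spans}")
    return ":".join(parts)


# Options named in the answers above; the frame range is hypothetical.
opts = build_lavcopts(
    ["vhq", "v4mv", "vqcomp=0.7", "vqblur=0.2", "mpeg_quant",
     "vqmin=1", "vlelim=-4", "vcelim=-7"],
    overrides=[(1200, 1450, 1)],  # force q=1 on the (made-up) cloud scene
)
print("-lavcopts", opts)
```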
 648
 649
 650 ================================================================================
 651
 652
 653 TIPS FOR TWEAKING RATECONTROL
 654
 655 (For the purpose of this explanation, consider "2nd pass" to be any beyond
 656 the 1st. The algorithm is run only on P-frames; I- and B-frames use QPs
 657 based on the adjacent P. While x264's 2pass ratecontrol is based on lavc's,
 658 it has diverged somewhat and not all of this is valid for x264.)
 659
 660 Consider the default ratecontrol equation in lavc: "tex^qComp".
 661 At the beginning of the 2nd pass, rc_eq is evaluated for each frame, and
 662 the result is the number of bits allocated to that frame (multiplied by
 663 some constant as needed to match the total requested bitrate).
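
A minimal sketch of that allocation step, assuming the per-frame "tex"
values are already known. The function name allocate_bits and the sample
numbers are illustrative, not taken from lavc:

```python
def allocate_bits(tex, qcomp, total_bits):
    """Give each frame a share of total_bits proportional to tex**qcomp."""
    weights = [t ** qcomp for t in tex]   # rc_eq evaluated for each frame
    scale = total_bits / sum(weights)     # constant that matches the budget
    return [scale * w for w in weights]


tex = [100.0, 400.0, 900.0]               # hypothetical frame complexities
bits = allocate_bits(tex, qcomp=0.5, total_bits=6000.0)
print(bits)                               # roughly [1000, 2000, 3000]
```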
 664
 665 "tex" is the complexity of a frame, i.e. the estimated number of bits it
 666 would take to encode at a given quantizer. (If the 1st pass was CQP, i.e.
 667 constant QP, and not turbo, then we know tex exactly. Otherwise it is
 668 calculated by multiplying the 1st pass's bits by the QP of that frame. But
 669 that's not why CQP is potentially good; more on that later.)
 670 "qComp" is just a constant. It has no effect outside the rc_eq, and is
 671 directly set by the vqcomp parameter.
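
The "bits times QP" estimate can be written down directly. The statistics
below are invented, and estimate_tex is a hypothetical helper rather than
a lavc function:

```python
def estimate_tex(first_pass_stats):
    """first_pass_stats: one (bits, qp) pair per frame from the 1st pass."""
    return [bits * qp for bits, qp in first_pass_stats]


stats = [(12000, 4), (3000, 2), (8000, 6)]  # hypothetical (bits, QP) pairs
tex = estimate_tex(stats)
print(tex)                                  # [48000, 6000, 48000]
```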
 672
 673 If vqcomp=1, then rc_eq=tex^1=tex, so 2pass allocates to each frame the
 674 number of bits needed to encode them all at the same QP.
 675 If vqcomp=0, then rc_eq=tex^0=1, so 2pass allocates the same number of
 676 bits to each frame, i.e. CBR. (Actually, this is worse than 1pass CBR in
 677 terms of quality; CBR can vary within its allowed buffer size, while
 678 vqcomp=0 tries to make each frame exactly the same size.)
 679 If vqcomp=0.5, then rc_eq=sqrt(tex), so the allocation is somewhere
 680 between CBR and CQP. High-complexity frames get somewhat lower quality
 681 than low-complexity ones, but still more bits.
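
The three cases can be compared numerically. This sketch assumes that bits
are exactly inversely proportional to QP, which is only an approximation,
so the "QP" printed here is a relative figure; all names and numbers are
illustrative:

```python
def allocate_bits(tex, qcomp, total_bits):
    weights = [t ** qcomp for t in tex]
    scale = total_bits / sum(weights)
    return [scale * w for w in weights]


def implied_qp(tex, bits):
    # Simplifying assumption: bits = tex / QP, so QP = tex / bits.
    return [t / b for t, b in zip(tex, bits)]


tex = [100.0, 400.0, 1600.0]              # hypothetical frame complexities
for qcomp in (1.0, 0.5, 0.0):
    bits = allocate_bits(tex, qcomp, total_bits=525.0)
    qps = implied_qp(tex, bits)
    print(qcomp, [round(b) for b in bits], [round(q, 2) for q in qps])
```

With vqcomp=1 every frame ends up at the same relative QP, with vqcomp=0
every frame gets the same number of bits, and with vqcomp=0.5 both bits
and QP rise with complexity.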
 682
 683 While the actual selection of a good value of vqcomp is experimental, the
 684 following underlying factors determine the result:
 685 Arguing towards CQP: You want the movie to be somewhere approaching
 686 constant quality; oscillating quality is even more annoying than constant
 687 low quality. (However, constant quality does not mean constant PSNR nor
 688 constant QP. Details are less noticeable in high-motion scenes, so you can
 689 get away with somewhat higher QP in high-complexity frames for the same
 690 perceived quality.)
 691 Arguing towards CBR: You get more quality per bit if you spend those bits
 692 in frames where motion compensation works well (which tends to be
 693 correlated with "tex"): A given artifact may stick around several seconds
 694 in a low-motion scene, and you only have to fix it in one frame to improve
 695 the quality of the whole sequence.
 696
 697 Now for why the 1st pass ratecontrol method matters:
 698 The number of bits at constant quant is as good a measure of complexity as
 699 any other simple formula for the purpose of allocating bits. But it's not
 700 perfect for predicting which QP will produce the desired number of bits.
 701 Bits are approximately inversely proportional to QP, but the exact scaling
 702 is non-linear, and depends on the amount of detail/noise, the complexity of
 703 motion, the quality of previous frames, and other stuff not measured by the
 704 ratecontrol. So it predicts best when the QP used for a given frame in the
 705 2nd pass is close to the QP used in the 1st pass. When the prediction is
 706 wrong, lavc needs to distribute the surplus or deficit of bits among future
 707 frames, which means that they too deviate from the planned distribution.
 708 Obviously, with vqcomp=1 you can get the 1st pass QPs very close by using
 709 CQP, and with vqcomp=0 a CBR 1st pass is very close. But with vqcomp=0.5
 710 it's more ambiguous. The accepted wisdom is that CBR is better for
 711 vqcomp=0.5, mostly because you then don't have to guess a QP in advance.
 712 But with vqcomp=0.6 or 0.7, the 2nd pass QPs vary less, so a CQP 1st pass
 713 (with the right QP) will be a better predictor than CBR.
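
To see why the 2nd pass QPs vary less as vqcomp approaches 1, one can look
at the ratio between the highest and lowest QP. The sketch again assumes
bits are inversely proportional to QP; qp_spread and the tex values are
made up for illustration:

```python
def qp_spread(tex, qcomp):
    """Ratio of highest to lowest 2nd-pass QP, assuming bits = tex / QP."""
    # bits ~ tex**qcomp, so QP ~ tex / bits ~ tex**(1 - qcomp);
    # the global scaling constant cancels out of the ratio.
    rel_qp = [t ** (1.0 - qcomp) for t in tex]
    return max(rel_qp) / min(rel_qp)


tex = [100.0, 400.0, 1600.0]              # hypothetical frame complexities
for qcomp in (0.5, 0.6, 0.7, 1.0):
    print(qcomp, round(qp_spread(tex, qcomp), 2))
```

The smaller the spread, the closer a single well-chosen 1st pass QP is to
what the 2nd pass will actually use.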
 714
 715 To make it more confusing, 1pass CBR uses the same rc_eq with a different
 716 meaning. In CBR, we don't have a real encode to estimate from, so "tex" is
 717 calculated from the full-pixel precision motion-compensation residual.
 718 While the number of bits allocated to a given frame is decided by the rc_eq
 719 just like in 2nd pass, the target bitrate is constant (instead of being the
 720 sum of per-frame rc_eq values). So the scaling factor (which is constant in
 721 2nd pass) now varies in order to keep the local average bitrate near the
 722 CBR target. So vqcomp does affect CBR, though it only determines the local
 723 allocation of bits, not the long-term allocation.
 724
 725 --Loren Merritt
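
The difference between a constant and a varying scaling factor, as
described for 1pass CBR above, can be shown with a toy model. This is not
lavc's algorithm: the trailing-window rule, the function names and the
numbers are all assumptions chosen only to show that vqcomp then shapes
the local allocation but not the long-term one.

```python
def two_pass_bits(tex, qcomp, target_per_frame):
    weights = [t ** qcomp for t in tex]
    scale = target_per_frame * len(tex) / sum(weights)  # one global constant
    return [scale * w for w in weights]


def toy_cbr_bits(tex, qcomp, target_per_frame, window=10):
    out = []
    for i in range(len(tex)):
        # Toy rule: rescale so the trailing window averages the target rate.
        recent = [t ** qcomp for t in tex[max(0, i - window + 1): i + 1]]
        scale = target_per_frame * len(recent) / sum(recent)  # varies locally
        out.append(scale * tex[i] ** qcomp)
    return out


mean = lambda xs: sum(xs) / len(xs)
tex = [100.0] * 50 + [400.0] * 50   # hypothetical: easy scene, then hard scene
two_pass = two_pass_bits(tex, 0.5, 1000.0)
cbr = toy_cbr_bits(tex, 0.5, 1000.0)
print(round(mean(two_pass[50:])), round(mean(cbr[50:])))
```

In the 2pass model the hard scene keeps a clearly larger share of the
bits, while in the toy CBR model it drifts back to the target rate a few
frames after the scene change.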