libmpeg3/docs/index.html

<TITLE>LibMPEG3</TITLE>

<CENTER>

<FONT FACE=HELVETICA SIZE=+4><B>Using LibMPEG3 for advanced MPEG
decoding</B></FONT><P>

<TABLE>
<TR>
<TD>
<CODE>
Author: Heroine Virtual Ltd. (Motion picture solutions for Linux without a warranty)<BR>
Harassment: broadcast@earthling.net<BR>
Homepage: heroinewarrior.com<P>

libmpeg3 is free software; you can redistribute it and/or modify it
under the terms of the GNU General Public License as published by the
Free Software Foundation; either version 2, or (at your option) any
later version.<P>

libmpeg3 is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
GNU General Public License for more details.<P>

In addition to the GPL's warranty stipulation, libmpeg3 is distributed
WITHOUT GUARANTEED SUPPORT.  Support that is not guaranteed includes
technical support, compiler troubleshooting, debugging, version
matching, updating, among other additional labor which may or may not
be required to meet a user's requirements.<P>

</CODE> </TD> </TR> </TABLE> </CENTER>

<H1>Table of contents</H1><P>
<A HREF="#INTRO">Intro</A><BR>
<A HREF="#BUILDING">Building the library</A><BR>
<A HREF="#USAGE">Usage</A><BR>
<A HREF="#TOC">Using tables of contents for editing</A><BR>

<P>

<A NAME=INTRO></A>

LibMPEG3 decodes several MPEG standards into uncompressed data suitable
for editing and playback.<P>

libmpeg3 currently decodes:<P>

<BLOCKQUOTE>MPEG-2 video<BR>
MPEG-1 video<BR>
mp3 audio<BR>
mp2 audio<BR>
ac3 audio<BR>
MPEG-2 transport streams<BR>
MPEG-2 program streams<BR>
MPEG-1 program streams<BR>
IFO files<BR>
</BLOCKQUOTE><P>

The video output can be in many different color models and frame
sizes.  The audio output can be in two's complement integers or floating
point.  Frame-accurate seeking, normally impossible in transport streams, is
possible in libmpeg3 through the use of a <B>table of contents</B>.
MPEG-2 video in the YUV 4:2:2 colorspace is decodable.  Digital TV broadcasts
and DVDs can be edited using libmpeg3.  libmpeg3 takes what is
normally a last-mile distribution format and makes it editable.<P>

Because of these and other features, libmpeg3 is not intended for
consumer applications but serves users who are interested in high
quality editing and footage acquisition.<P>

<A NAME=BUILDING></A>
<FONT FACE=HELVETICA SIZE=+4><B>Building the library</B></FONT><P>

libmpeg3 takes its optimization flags from the CFLAGS environment
variable.  You should set it to <P>

<TT>-O3 -march=i686 -fmessage-length=0 -funroll-all-loops
-fomit-frame-pointer -malign-loops=2 -malign-jumps=2
-malign-functions=2</TT><P>

Run <B>make</B> to build the library.  You should be using
kernel 2.4.9 or later.  The makefile automatically determines
appropriate parameters and puts the library in i686/libmpeg3.a.
Several utilities are also built.  Install the utilities by running
<B>make install</B>.<P>

Unfortunately, libmpeg3 exercises the
system more aggressively than a consumer library, and this brings out
different bugs in each kernel version:<P>

2.4.9: ext3 filesystem failure<BR>
2.4.10: memory management failure when running mpeg3toc<BR>
2.4.17: memory management failure after 5 hours of decoding video<P>

As libmpeg3 is not one of the standard MPEG decoding
libraries, these utilities are unlike any you've ever seen before.
Remember a utility is only as illegal or legal as the guy who runs
it.<P>

<A NAME=USAGE></A>
<H1>Usage</H1><P>


<FONT FACE=HELVETICA SIZE=+4><B>STEP 1: Verifying file compatibility</B></FONT><P>

Programs using libmpeg3 must <CODE>#include "libmpeg3.h"</CODE>.<P>

Call <CODE>mpeg3_check_sig</CODE> to verify whether the file can be read by
libmpeg3.  It returns 1 if the file is compatible and 0 if it isn't.<P>

<FONT FACE=HELVETICA SIZE=+4><B>STEP 2: Open the file</B></FONT><P>

You need an <CODE>mpeg3_t*</CODE> file descriptor:<P>
<CODE>
mpeg3_t* file;
</CODE>
<P>

Then you need to open the file:<P>

<CODE>file = mpeg3_open(char *path);</CODE><P>

<CODE>mpeg3_open</CODE> returns NULL if the file couldn't be opened
for any reason.  Be sure to check this.  Everything you do with
libmpeg3 requires passing the <CODE>file</CODE> pointer.<P>

Another way of opening a file is <P>

<CODE>mpeg3_open_copy(char *path, mpeg3_t *old_file)</CODE><P>

You need to open multiple copies of a file in realtime situations
because only one thread can access an <CODE>mpeg3_t</CODE> structure at a
time, so audio and video can't be read simultaneously through the same
handle.  The solution is not to call <CODE>mpeg3_open</CODE> repeatedly
but to call <CODE>mpeg3_open_copy</CODE> for every file
handle after the first one.  This copies tables from the first file to
speed up opening.<P>

<FONT FACE=HELVETICA SIZE=+4><B>STEP 3: Set optimization strategies</B></FONT><P>

Call <CODE>mpeg3_set_cpus(mpeg3_t *file, int cpus)</CODE> to set how
many CPUs should be devoted to video decompression.  LibMPEG3 can use
any number.<P>

Call <CODE>mpeg3_set_mmx(mpeg3_t *file, int use_mmx)</CODE> to set whether
MMX is used for video.  Disabling MMX is mandatory for low bitrate
streams since the MMX code is very lossy.  Note also that the compiled MMX
output has lately been producing corrupted video.  This comes from a change
in the way modern compilers and CPUs handle MMX compared with the way it
was done 4 years ago, but since modern CPUs are so fast, you're better off
not using MMX at all.<P>

<FONT FACE=HELVETICA SIZE=+4><B>STEP 4: Query the file.</B></FONT><P>

There are a number of queries for the audio components of the stream:<P>

<CODE><PRE>
int mpeg3_has_audio(mpeg3_t *file);
int mpeg3_total_astreams(mpeg3_t *file);             // Number of multiplexed audio streams
int mpeg3_audio_channels(mpeg3_t *file, int stream);
int mpeg3_sample_rate(mpeg3_t *file, int stream);
long mpeg3_audio_samples(mpeg3_t *file, int stream); // Total length
</PRE></CODE>

The audio is presented as a number of <B>streams</B> numbered from 0
through <CODE>mpeg3_total_astreams</CODE> - 1.  Each stream contains a
certain number of <B>channels</B> numbered from 0 through
<CODE>mpeg3_audio_channels</CODE> - 1.<P>

The procedure is to first determine whether the file has audio, then get
the number of streams in the file, then for each stream get the number
of channels, sample rate, and length.<P>

There are also queries for the video components:<P>

<CODE><PRE>
int mpeg3_has_video(mpeg3_t *file);
int mpeg3_total_vstreams(mpeg3_t *file);            // Number of multiplexed video streams
int mpeg3_video_width(mpeg3_t *file, int stream);
int mpeg3_video_height(mpeg3_t *file, int stream);
float mpeg3_frame_rate(mpeg3_t *file, int stream);  // Frames/sec
long mpeg3_video_frames(mpeg3_t *file, int stream); // Total length
int mpeg3_colormodel(mpeg3_t *file, int stream);
</PRE></CODE>

The video behavior is the same as with audio, except that video has no
subdivision under <B>streams</B>.  Frame rate is a floating point
number of frames per second.<P>
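
The length and rate queries above combine into a stream duration.  The
following is a minimal sketch, not part of libmpeg3: the helper names are
hypothetical, and the arguments are assumed to be values already returned by
<CODE>mpeg3_audio_samples</CODE>, <CODE>mpeg3_sample_rate</CODE>,
<CODE>mpeg3_video_frames</CODE> and <CODE>mpeg3_frame_rate</CODE>.<P>

```c
/* Hypothetical helpers: turn the totals reported by the query
 * functions into a duration in seconds. */

/* Audio length in seconds, or -1.0 if the inputs are unusable. */
double audio_duration_seconds(long total_samples, int sample_rate)
{
    if (sample_rate <= 0 || total_samples < 0) return -1.0;
    return (double)total_samples / (double)sample_rate;
}

/* Video length in seconds, or -1.0 if the inputs are unusable. */
double video_duration_seconds(long total_frames, float frame_rate)
{
    if (frame_rate <= 0.0f || total_frames < 0) return -1.0;
    return (double)total_frames / (double)frame_rate;
}
```

A caller would typically compute one duration per stream and treat a negative
result as "length unknown".<P>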

<TT>mpeg3_colormodel</TT> returns either MPEG3_YUV420P or
MPEG3_YUV422P.  MPEG3_YUV422P is only encountered in high quality video
not available in any consumer distribution medium.<P>

<FONT FACE=HELVETICA SIZE=+4><B>STEP 5: Seeking to a point in the file</B></FONT><P>

Each audio stream and each video stream has a position in the file
independent of every other stream.  A variety of methods are available
for specifying the position of a stream: <B>byte offset, frame,
sample</B>.  Which method you use depends on whether you're seeking
audio or video and whether you have a table of contents for the
stream.<P>

The preferred seeking method if you're writing a player is:<P>

<CODE><PRE>
int mpeg3_seek_byte(mpeg3_t *file, int64_t byte);
int64_t mpeg3_tell_byte(mpeg3_t *file);
</PRE></CODE>

This seeks all tracks to an absolute byte offset in the file.  The
byte offset is from 0 to the result of:<P>

<CODE><PRE>
mpeg3_get_bytes(mpeg3_t *file)
</PRE></CODE>

The alternative to byte seeking is <B>frame or sample seeking</B>.
Frame seeking is only possible if a <B>table of contents</B> exists.
The <B>mpeg3toc</B> utility that comes with libmpeg3 creates tables of contents
from MPEG 1 &amp; 2 streams.  Sample seeking is only possible if the stream
is fixed bitrate audio.  The audio seeking is handled by:<P>

<CODE><PRE>
int mpeg3_set_sample(mpeg3_t *file, long sample, int stream);    // Seek
long mpeg3_get_sample(mpeg3_t *file, int stream);    // Tell current position
</PRE></CODE>

and the video seeking is handled by:<P>

<CODE><PRE>
int mpeg3_set_frame(mpeg3_t *file, long frame, int stream); // Seek
long mpeg3_get_frame(mpeg3_t *file, int stream);            // Tell current position
</PRE></CODE>


You can perform either byte seeking or frame/sample seeking, but
not both on the same file handle.  Once you perform either method, the
file handle becomes configured for that method.<P>

If you're in byte seeking mode and you want the current time stamp in
the file, you can't use mpeg3_get_frame or mpeg3_get_sample because you
don't know the total length in the desired units.  The
<CODE>mpeg3_audio_samples</CODE> and <CODE>mpeg3_video_frames</CODE>
commands don't work in byte seeking mode either.  Instead use

<CODE><PRE>
double mpeg3_get_time(mpeg3_t *file);
</PRE></CODE>

which gives you the last timecode read, in seconds.  The MPEG standard
specifies that timecodes are placed in the streams.  Now you know the
absolute byte position in the file and the current time stamp, which is
enough to update a progress bar or a text box.<P>
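
As an illustration of that last point, here is a minimal sketch of turning a
byte position and a timecode into display values.  The helper names are
hypothetical and not part of libmpeg3; the inputs are assumed to come from
<CODE>mpeg3_tell_byte</CODE>, <CODE>mpeg3_get_bytes</CODE> and
<CODE>mpeg3_get_time</CODE>.<P>

```c
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper: fraction of the file consumed, clamped to [0, 1]. */
double progress_fraction(int64_t current_byte, int64_t total_bytes)
{
    if (total_bytes <= 0 || current_byte <= 0) return 0.0;
    if (current_byte >= total_bytes) return 1.0;
    return (double)current_byte / (double)total_bytes;
}

/* Hypothetical helper: format a time in seconds as H:MM:SS into buf.
 * Fractions of a second are truncated; negative times print as 0:00:00. */
char *format_timecode(double seconds, char *buf, size_t size)
{
    long total = seconds < 0.0 ? 0 : (long)seconds;
    snprintf(buf, size, "%ld:%02ld:%02ld",
             total / 3600, (total / 60) % 60, total % 60);
    return buf;
}
```

The fraction can drive a progress bar directly, and the formatted string suits
a text box.<P>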

Finally, there is a way to seek to the previous frame of video:<P>

<CODE><PRE>
int mpeg3_previous_frame(mpeg3_t *file, int stream);
</PRE></CODE>

Because MPEG 1 &amp; 2 are really hairy, the set commands won't do much
good for playing backwards.  mpeg3_previous_frame does some tricks to
seek to the previous frame.  You then have to call a read_frame
command to read it.
<P>

<FONT FACE=HELVETICA SIZE=+4><B>STEP 6: Read the data</B></FONT><P>

<I>To read <B>audio</B> data use:</I><P>

<CODE><PRE>
int mpeg3_read_audio(mpeg3_t *file,
                float *output_f,      // Pointer to pre-allocated buffer of floats
                short *output_i,      // Pointer to pre-allocated buffer of int16's
                int channel,          // Channel to decode
                long samples,         // Number of samples to decode
                int stream);          // Stream containing the channel
</PRE></CODE>

This decodes a buffer of sequential floats or int16's for a single
channel, depending on which output parameter is non-NULL.
To get a floating point buffer pass a pre-allocated buffer
to <CODE>output_f</CODE> and NULL to <CODE>output_i</CODE>.  To get an
int16 buffer pass NULL to <CODE>output_f</CODE> and a pre-allocated
buffer to <CODE>output_i</CODE>.  Alternatively you can pass NULL to
both buffer arguments and the decoder won't render anything.<P>

After reading an audio buffer, the current position in that one stream
is advanced.  Remember that if you're using byte seeking you
can't call <TT>mpeg3_set_sample</TT> to rewind and read every channel.
How, then, do you read more than one channel of audio data?  Use

<CODE><PRE>
int mpeg3_reread_audio(mpeg3_t *file,
                float *output_f,      /* Pointer to pre-allocated buffer of floats */
                short *output_i,      /* Pointer to pre-allocated buffer of int16's */
                int channel,          /* Channel to decode */
                long samples,         /* Number of samples to decode */
                int stream);
</PRE></CODE>

to read each remaining channel after the first channel.<P>

<I>To read <B>video</B> data there are two methods: RGB frames or YUV
frames.  To get an RGB frame use:</I> <BR>

<CODE><PRE>
int mpeg3_read_frame(mpeg3_t *file,
                unsigned char **output_rows, // Array of pointers to the start of each output row
                int in_x,                    // Location in input frame to take picture
                int in_y,
                int in_w,
                int in_h,
                int out_w,                   // Dimensions of output_rows
                int out_h,
                int color_model,             // One of the color model #defines in libmpeg3.h
                int stream);
</PRE></CODE>

The video decoding works like a camcorder taking copies of a movie
screen.  The decoder "sees" a region of the movie screen defined by
<CODE>in_x, in_y, in_w, in_h</CODE> and transfers it to the frame
buffer defined by <CODE>**output_rows</CODE>.  The input values must be
within the boundaries given by <CODE>mpeg3_video_width</CODE> and
<CODE>mpeg3_video_height</CODE>.  The size of the frame buffer is
defined by <CODE>out_w, out_h</CODE>.  Although the input dimensions
are constrained, the frame buffer can be any size.<P>
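
One way to honor the constraint on the input region is to clamp it before
decoding.  This is a minimal sketch, not part of libmpeg3: the
<CODE>crop_rect</CODE> type and <CODE>clamp_crop</CODE> helper are
hypothetical, and the picture size is assumed to come from
<CODE>mpeg3_video_width</CODE> and <CODE>mpeg3_video_height</CODE>.<P>

```c
/* Hypothetical type: a source rectangle inside the decoded picture. */
typedef struct { int x, y, w, h; } crop_rect;

/* Clamp r so it lies inside a picture of video_w x video_h pixels.
 * If nothing of the rectangle is left, w and h are both set to 0. */
crop_rect clamp_crop(crop_rect r, int video_w, int video_h)
{
    if (r.x < 0) { r.w += r.x; r.x = 0; }   /* trim what hangs off the left */
    if (r.y < 0) { r.h += r.y; r.y = 0; }   /* trim what hangs off the top */
    if (r.x + r.w > video_w) r.w = video_w - r.x;
    if (r.y + r.h > video_h) r.h = video_h - r.y;
    if (r.w <= 0 || r.h <= 0) { r.w = 0; r.h = 0; }
    return r;
}
```

The clamped values would then be passed as <CODE>in_x, in_y, in_w, in_h</CODE>;
a zero-sized result means there is nothing to decode.<P>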

<CODE>color_model</CODE> defines which RGB color model the picture
should be decoded to and the possible values are given in
<B>libmpeg3.h</B>.  The frame buffer pointed to by
<CODE>output_rows</CODE> must have enough memory allocated to store the
color model you select.<P>

<B>You must allocate 4 extra bytes in the last output_row.</B>  This is
scratch area for the MMX routines.<P>
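
A minimal sketch of one way to build such a frame buffer follows.  It is not
taken from libmpeg3: the helper names are hypothetical, and
<CODE>bytes_per_pixel</CODE> is an assumption the caller must derive from the
chosen color model.  Using one contiguous block with 4 spare bytes at the end
leaves the extra room directly after the last row.<P>

```c
#include <stdlib.h>

/* Hypothetical helper: allocate out_h row pointers into one contiguous
 * pixel block, with 4 spare bytes after the last row.
 * Returns NULL on bad arguments or allocation failure. */
unsigned char **alloc_output_rows(int out_w, int out_h, int bytes_per_pixel)
{
    if (out_w <= 0 || out_h <= 0 || bytes_per_pixel <= 0) return NULL;

    size_t stride = (size_t)out_w * (size_t)bytes_per_pixel;
    unsigned char **rows = malloc((size_t)out_h * sizeof *rows);
    unsigned char *pixels = malloc(stride * (size_t)out_h + 4);
    if (!rows || !pixels) {
        free(rows);
        free(pixels);
        return NULL;
    }
    for (int i = 0; i < out_h; i++)
        rows[i] = pixels + (size_t)i * stride;
    return rows;
}

/* Release a buffer created by alloc_output_rows. */
void free_output_rows(unsigned char **rows)
{
    if (rows) {
        free(rows[0]);   /* the contiguous pixel block */
        free(rows);      /* the row pointer table */
    }
}
```

The returned table can be passed as <CODE>output_rows</CODE> with the same
<CODE>out_w</CODE> and <CODE>out_h</CODE> used for the allocation.<P>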

<CODE>mpeg3_read_frame</CODE> advances the position in that one stream by 1 frame.<P>

<I>To read YUV frames use one of two methods:</I><BR>

<CODE><PRE>
int mpeg3_read_yuvframe(mpeg3_t *file,
                char *y_output,
                char *u_output,
                char *v_output,
                int in_x,
                int in_y,
                int in_w,
                int in_h,
                int stream);
</PRE></CODE>

The behavior of in_x, in_y, in_w, in_h is identical to mpeg3_read_frame
except here you have no control over the output frame size.  <B>You
must allocate in_w * in_h bytes for the y_output, and in_w * in_h / 4
bytes each for the u_output and v_output.</B><P>

While <B>mpeg3_read_yuvframe</B> allows cropping of letterboxing, it still
requires one memcpy.  A faster alternative is:<P>

<CODE><PRE>
int mpeg3_read_yuvframe_ptr(mpeg3_t *file,
                char **y_output,
                char **u_output,
                char **v_output,
                int stream);
</PRE></CODE>

This points *y_output, *u_output, and *v_output at the
scratch buffer in which decoding took place.  Since MPEG is temporal
compression, there is always a buffer containing the last decoder
output.<P>

For professional use the library can decode YUV 4:2:2 video in addition
to YUV 4:2:0.  The sampling format is determined at encoding time.  It
doesn't affect the usage of <B>mpeg3_read_frame</B>, but you do need an extra
function call in order to use <B>mpeg3_read_yuvframe</B>.  To determine
the encoding of the video stream use

<CODE><PRE>
mpeg3_colormodel(mpeg3_t *file, int stream)
</PRE></CODE>

This returns either <B>MPEG3_YUV420P</B> or <B>MPEG3_YUV422P</B>.  The
output buffers and the YUV to RGB conversion for
<B>mpeg3_read_yuvframe</B> must be adjusted for the higher sampling of
MPEG3_YUV422P.  When using <B>mpeg3_read_yuvframe_ptr</B> you don't
need to adjust any output buffers.<P>
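
The buffer adjustment can be sketched as follows.  This is an illustration
under stated assumptions, not libmpeg3 code: the enum and helper are
hypothetical stand-ins, and real code would compare the result of
<CODE>mpeg3_colormodel</CODE> against the constants in libmpeg3.h.  It relies
only on the definitions of the two formats: 4:2:0 halves the chroma resolution
in both directions, while 4:2:2 halves it horizontally only.<P>

```c
#include <stddef.h>

/* Hypothetical local tags standing in for the MPEG3_YUV420P and
 * MPEG3_YUV422P values defined in libmpeg3.h. */
typedef enum { SKETCH_YUV420P, SKETCH_YUV422P } sketch_colormodel;

/* Bytes needed for each output plane of one frame. */
typedef struct { size_t y_bytes, u_bytes, v_bytes; } yuv_plane_sizes;

yuv_plane_sizes yuv_sizes(int in_w, int in_h, sketch_colormodel model)
{
    size_t luma = (size_t)in_w * (size_t)in_h;
    /* 4:2:0 -> a quarter of the luma size per chroma plane,
     * 4:2:2 -> half of the luma size per chroma plane. */
    size_t chroma = (model == SKETCH_YUV422P) ? luma / 2 : luma / 4;
    yuv_plane_sizes sizes = { luma, chroma, chroma };
    return sizes;
}
```

For a 720x480 region this gives 345600 bytes of luma, with 86400 bytes per
chroma plane in 4:2:0 and 172800 bytes per chroma plane in 4:2:2.<P>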

<FONT FACE=HELVETICA SIZE=+4><B>Synchronizing video with audio</B></FONT><P>

To synchronize video with audio in realtime you sometimes need to delay
the video and sometimes need to drop frames.  It's easy to calculate the number
of frames to drop, but if you're using byte seeking you can't
calculate the exact byte offset to seek forward to.  Instead call <P>

<CODE>mpeg3_drop_frames(mpeg3_t *file, long frames, int stream);</CODE><P>

This skips <CODE>frames</CODE> frames from the current position whether
in byte seeking or frame/sample seeking mode.<P>
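
The frame count itself might be calculated along these lines.  This is a
minimal sketch with a hypothetical helper name; it assumes the application
keeps its own audio clock in seconds and its own count of frames already
shown, neither of which is provided by the functions described here.<P>

```c
/* Hypothetical helper: number of video frames to skip so the video
 * catches up with the audio clock.
 *   audio_seconds - playback time according to the audio clock
 *   frame_rate    - frames per second of the video stream
 *   current_frame - index of the next frame the decoder would return
 * Returns 0 when the video is on time or ahead; in that case the
 * caller delays the video instead of dropping frames. */
long frames_to_drop(double audio_seconds, float frame_rate, long current_frame)
{
    if (frame_rate <= 0.0f || audio_seconds < 0.0) return 0;
    long target_frame = (long)(audio_seconds * (double)frame_rate);
    long behind = target_frame - current_frame;
    return behind > 0 ? behind : 0;
}
```

A nonzero result would be passed as the <CODE>frames</CODE> argument of
<CODE>mpeg3_drop_frames</CODE>.<P>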

<FONT FACE=HELVETICA SIZE=+4><B>STEP 7: Close the file</B></FONT><P>

Be sure to close the file with <CODE>mpeg3_close(mpeg3_t *file)</CODE>
when you're done with it.<P>

<A NAME=TOC></A>
<H1>Using tables of contents for editing</H1><P>

In 1985 everyone watched Smurfs but one guy watched Robotech.  In 1990
everyone watched Teenage Mutant Ninja Turtles but one guy watched
Transformers.  In 1995 everyone watched Pokemon but one guy watched
Behind the Scenes.  Now everyone wants handheld organizers but one guy
wants MPEG editors.  For the weirdos who always looked at the camera
rig instead of the celebrity, libmpeg3 supports a way of seeking to an
exact frame or sample in any kind of MPEG encapsulation format for
editing.<P>

A table of contents must be built for any footage to be edited with
libmpeg3.  Run <TT>mpeg3toc &lt;mpeg stream&gt; &lt;output table of contents&gt;</TT><P>

For editing DVD footage, the mpeg stream argument should be the ifo
file belonging to the title set to be edited.  This utility reads
through every file comprising the mpeg stream and records the offset of
every 65536th sample and every keyframe, so it can be pretty slow.<P>
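
To show why recording every 65536th sample is enough for exact seeking, here
is a minimal sketch of a lookup over such an index.  It is an illustration
only: the in-memory array layout, the type, and the helper name are
assumptions, and the actual table of contents file format is not described
here.  The idea is to jump to the nearest recorded offset at or before the
target and then decode and discard the remaining samples.<P>

```c
#include <stddef.h>
#include <stdint.h>

#define SAMPLES_PER_ENTRY 65536L   /* interval quoted in the text above */

/* Hypothetical result: where to start reading and how much to discard. */
typedef struct {
    int64_t byte_offset;    /* file offset of the indexed sample */
    long    skip_samples;   /* samples to decode and throw away after it */
} seek_plan;

/* offsets[i] is assumed to hold the byte offset of sample i * 65536. */
seek_plan plan_sample_seek(const int64_t *offsets, size_t entries, long sample)
{
    seek_plan plan = { 0, 0 };
    if (!offsets || entries == 0 || sample < 0) return plan;

    size_t index = (size_t)(sample / SAMPLES_PER_ENTRY);
    if (index >= entries) index = entries - 1;   /* past the end: use last entry */

    plan.byte_offset = offsets[index];
    plan.skip_samples = sample - (long)index * SAMPLES_PER_ENTRY;
    return plan;
}
```

With entries at byte offsets 0, 9000 and 18000, seeking to sample 70000 starts
at offset 9000 and discards 4464 samples.<P>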

The resulting table of contents file should be passed to mpeg3_open
and mpeg3_open_copy just like a normal file.  The only difference is
that frame seeking of video becomes available.<P>