DOCS/tech/libmpcodecs.txt

NOTE: If you want to implement a new native codec, please add it to
libavcodec. libmpcodecs is considered mostly deprecated, except for wrappers
around external libraries and codecs requiring binary support.

The libMPcodecs API details, hints - by A'rpi
=============================================

See also: colorspaces.txt, codec-devel.txt, dr-methods.txt, codecs.conf.txt

The VIDEO path:
===============

    [MPlayer core]
          | (1)
    ______V______   (2)  /~~~~~~~~~~\    (3,4)  |~~~~~~|
    |           | -----> | vd_XXX.c |  -------> | vd.c |
    | dec_video |        \__________/  <-(3a)-- |______|
    |           | -----,  ,.............(3a,4a).....:
    ~~~~~~~~~~~~~  (6) V  V
                   /~~~~~~~~\     /~~~~~~~~\  (8)
                   | vf_X.c | --> | vf_Y.c | ---->  vf_vo.c / ve_XXX.c
                   \________/     \________/
                       |              ^
                   (7) |   |~~~~~~|   : (7a)
                       `-> | vf.c |...:
                           |______|

Short description of video path:
1. MPlayer/MEncoder core requests the decoding of a compressed video frame:
   calls dec_video.c::decode_video().

2. decode_video() calls the video codec previously selected in init_video().
   (vd_XXX.c file, where XXX == vfm name, see the 'driver' line of codecs.conf)

3. The codec should initialize the output device before decoding the first
   frame. This may happen in init() or in the middle of the first decode(),
   see 3a. It means calling vd.c::mpcodecs_config_vo() with the image
   dimensions and the _preferred_ (meaning: internal, native, best) colorspace.
   NOTE: This colorspace may not be equal to the colorspace that is actually
   used. It's just a _hint_ for the colorspace matching algorithm and mainly
   used as input format of the converter, but _only_ when colorspace
   conversion is required.

3a. Selecting the best output colorspace:
   The vd.c::mpcodecs_config_vo() function will go through the outfmt list
   defined by the 'out' lines in codecs.conf and query both vd (codec) and
   vo (output device/filter/encoder) whether the format is supported.

   For the vo, it calls the query_format() function of vf_XXX.c or ve_XXX.c.
   It should return a set of feature flags; the most important ones for this
   stage are: VFCAP_CSP_SUPPORTED (colorspace supported directly or by
   conversion) and VFCAP_CSP_SUPPORTED_BY_HW (colorspace supported
   WITHOUT any conversion).

   For the vd (codec), control() with VDCTRL_QUERY_FORMAT will be called.
   If it does not implement VDCTRL_QUERY_FORMAT (i.e. answers CONTROL_UNKNOWN
   or CONTROL_NA), it is assumed to be CONTROL_TRUE (colorspace supported)!

   So, by default, if the list of supported colorspaces is constant and does
   not depend on the actual file/stream header, then it is enough to list the
   formats in codecs.conf ('out' field) and not implement VDCTRL_QUERY_FORMAT.
   This is the case for most codecs.

   If the supported colorspace list depends on the file being decoded, list
   the possible out formats (colorspaces) in codecs.conf and implement the
   VDCTRL_QUERY_FORMAT to test the availability of the given colorspace for
   the given video file/stream.

   The vd.c core will find the best matching colorspace, depending on the
   VFCAP_CSP_SUPPORTED_BY_HW flag (see vfcap.h). If no match can be found,
   it will try again with the 'scale' filter inserted between vd and vo.
   If this is unsuccessful, it will fail :(

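Sample: a rough, self-contained sketch of the selection described in 3a.
It is not the actual vd.c code. The SK_* flag values, the callback typedefs
and the demo callbacks are made up for this example (the real flags are in
vfcap.h), and "the first format usable without conversion wins, otherwise
the first convertible one" is only an assumed reading of "best match".

```c
#include <stddef.h>

/* Placeholder flag values for this sketch; see vfcap.h for the real ones. */
#define SK_CSP_SUPPORTED        0x1
#define SK_CSP_SUPPORTED_BY_HW  0x2

/* Hypothetical callbacks standing in for the vo-side query_format() and
 * for the codec-side control(VDCTRL_QUERY_FORMAT). */
typedef int (*sk_vo_query_fn)(unsigned int fmt); /* returns capability flags */
typedef int (*sk_vd_query_fn)(unsigned int fmt); /* nonzero: codec can output fmt */

/* Walk the 'out' list in order and return the index of the chosen format:
 * the first one supported without conversion, else the first one supported
 * at all, else -1 (where vd.c would retry with 'scale' inserted). */
static int sk_pick_outfmt(const unsigned int *outfmt, size_t n,
                          sk_vd_query_fn vd_ok, sk_vo_query_fn vo_caps)
{
    int fallback = -1;
    for (size_t i = 0; i < n; i++) {
        int caps;
        if (!vd_ok(outfmt[i]))
            continue;                 /* codec cannot produce it */
        caps = vo_caps(outfmt[i]);
        if (!(caps & SK_CSP_SUPPORTED))
            continue;                 /* vo cannot take it */
        if (caps & SK_CSP_SUPPORTED_BY_HW)
            return (int)i;            /* no conversion needed */
        if (fallback < 0)
            fallback = (int)i;        /* remember first convertible format */
    }
    return fallback;
}

/* Demo callbacks: the codec offers everything; the vo takes format 2 via
 * conversion and format 3 directly. The format numbers are arbitrary. */
static int sk_demo_vd(unsigned int fmt) { (void)fmt; return 1; }
static int sk_demo_vo(unsigned int fmt)
{
    if (fmt == 2) return SK_CSP_SUPPORTED;
    if (fmt == 3) return SK_CSP_SUPPORTED | SK_CSP_SUPPORTED_BY_HW;
    return 0;
}
```

With the demo callbacks and the list {1, 2, 3}, sk_pick_outfmt() returns
index 2 (format 3, no conversion needed). With only {1, 2} it returns
index 1, and with only {1} it returns -1.
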
4. Requesting buffer for the decoded frame:
   The codec has to call mpcodecs_get_image() with proper imgtype & imgflags.
   It will find the optimal buffering setup (preferred stride, alignment etc)
   and return a pointer to the allocated and filled up mpi (mp_image_t*) struct.
   The 'imgtype' controls the buffering setup, e.g. STATIC (just one buffer that
   'remembers' its content between frames), TEMP (write-only, full update),
   EXPORT (memory allocation is done by the codec, not recommended) and so on.
   The 'imgflags' set up the limits for the buffer, e.g. stride limitations,
   readability, remembering content etc. See mp_image.h for the short
   description. See dr-methods.txt for the explanation of buffer
   importing and mpi imgtypes.

   Always try to implement stride support! (stride == bytes per line)
   Without stride support, stride==bytes_per_pixel*image_width.
   If you have stride support in your decoder, use the mpi->stride[] value
   as the bytes per line of each plane.
   Also take care of other imgflags, like MP_IMGFLAG_PRESERVE and
   MP_IMGFLAG_READABLE, MP_IMGFLAG_COMMON_STRIDE and MP_IMGFLAG_COMMON_PLANE!
   The file mp_image.h contains flag descriptions in comments, read it!
   If unsure, ask for help on the dev-eng mailing list, describing the
   behavior of your codec.

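Sample: what stride support boils down to for a simple copy. This is a
minimal sketch in plain C; sk_copy_plane() and its parameter names are
hypothetical and not part of libmpcodecs. In a real codec, dst and
dst_stride would come from mpi->planes[] and mpi->stride[].

```c
#include <string.h>

/* Copy one image plane line by line. bytes_per_line is the number of
 * payload bytes in a row; either stride may be larger than that because
 * of padding or alignment, so the rows are not assumed to be contiguous. */
static void sk_copy_plane(unsigned char *dst, int dst_stride,
                          const unsigned char *src, int src_stride,
                          int bytes_per_line, int height)
{
    for (int y = 0; y < height; y++)
        memcpy(dst + y * dst_stride, src + y * src_stride,
               (size_t)bytes_per_line);
}
```

Only bytes_per_line bytes of each row are touched, so whatever padding the
buffer owner put between the rows stays intact.
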
4a. Buffer allocation, vd.c::mpcodecs_get_image():
   If the requested buffer imgtype!=EXPORT, then vd.c will try to do
   direct rendering, i.e. ask the next filter/vo for the buffer allocation.
   It's done by calling get_image() of the vf_XXX.c file.
   If it was successful, the imgflag MP_IMGFLAG_DIRECT will be set, and one
   memcpy() will be saved when passing the data from vd to the next filter/vo.
   See dr-methods.txt for details and examples.

5. Decode the frame into the mpi structure requested in 4, then return the mpi
   to dec_video.c. Return NULL if the decoding failed or skipped the frame.

6. dec_video.c::decode_video() will now pass the 'mpi' to the next filter (vf_X).

7. The filter's (vf_X) put_image() then requests a new mpi buffer by calling
   vf.c::vf_get_image().

7a. vf.c::vf_get_image() will try to get direct rendering by asking the
   next filter to do the buffer allocation (calls vf_Y's get_image()).
   If it fails, it will fall back to normal system memory allocation.

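Sample: the "try direct rendering, else allocate" logic of 7a, reduced to a
sketch. fb_image is a cut-down stand-in for mp_image_t, FB_FLAG_DIRECT plays
the role of MP_IMGFLAG_DIRECT (its value is made up), and all function names
are invented for illustration; the real code is in vf.c.

```c
#include <stdlib.h>

#define FB_FLAG_DIRECT 0x1   /* stand-in for MP_IMGFLAG_DIRECT */

/* Cut-down stand-in for mp_image_t: one packed plane only. */
typedef struct fb_image {
    unsigned char *plane;
    int stride;
    int w, h;
    unsigned int flags;
} fb_image;

/* Hypothetical shape of the next filter's get_image(): on success it fills
 * in plane/stride and sets FB_FLAG_DIRECT, otherwise it leaves img alone. */
typedef void (*fb_get_image_fn)(fb_image *img);

/* Returns 0 on success, -1 if the fallback allocation failed. */
static int fb_get_buffer(fb_image *img, int w, int h,
                         fb_get_image_fn next_get_image)
{
    img->plane = NULL;
    img->stride = 0;
    img->w = w;
    img->h = h;
    img->flags = 0;

    if (next_get_image)
        next_get_image(img);          /* ask the next filter first */
    if (img->flags & FB_FLAG_DIRECT)
        return 0;                     /* direct rendering: buffer is theirs */

    img->stride = w;                  /* fallback: plain system memory */
    img->plane = malloc((size_t)w * (size_t)h);
    return img->plane ? 0 : -1;
}

/* Demo "next filter" that hands out a fixed buffer if the image fits. */
static unsigned char fb_demo_mem[64 * 64];
static void fb_demo_get_image(fb_image *img)
{
    if ((size_t)img->w * (size_t)img->h > sizeof(fb_demo_mem))
        return;                       /* too big: refuse, caller falls back */
    img->plane = fb_demo_mem;
    img->stride = img->w;
    img->flags |= FB_FLAG_DIRECT;
}
```

In this sketch the caller owns (and has to free()) the buffer only in the
fallback case, i.e. when FB_FLAG_DIRECT is not set.
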
8. When we're past the whole filter chain (multiple filters can be connected,
   even the same filter multiple times) then the last, 'leaf' filter will be
   called. The only difference between leaf and non-leaf filters is that leaf
   filters have to implement the whole filter API.
   Currently leaf filters are: vf_vo.c (wrapper over libvo) and ve_XXX.c
   (video encoders used by MEncoder).


Video Filters
=============

Video filters are plugin-like code modules implementing the interface
defined in vf.h.

Basically it means video output manipulation, i.e. these plugins can
modify the image and the image properties (size, colorspace, etc) between
the video decoders (vd.h) and the output layer (libvo or video encoders).

The actual API is a mixture of the video decoder (vd.h) and libvo
(video_out.h) APIs.

Main differences:
- vf plugins may be "loaded" multiple times, with different parameters
  and context - it's new in MPlayer, old APIs weren't reentrant.
- vf plugins don't have to implement all functions - all functions have a
  'fallback' version, so the plugins only override these if wanted.
- Each vf plugin has its own get_image context, and they can interchange
  images/buffers using these get_image/put_image calls.


The VIDEO FILTER API:
=====================
filename: vf_FILTERNAME.c

vf_info_t* info;
    pointer to the filter description structure:

    const char *info;    // description of the filter
    const char *name;    // short name of the filter, must be FILTERNAME
    const char *author;  // name and email/URL of the author(s)
    const char *comment; // comment, URL to papers describing algorithm etc.
    int (*open)(struct vf_instance *vf,char* args);
                         // pointer to the open() function

Sample:

vf_info_t vf_info_foobar = {
    "Universal Foo and Bar filter",
    "foobar",
    "Ms. Foo Bar",
    "based on algorithm described at http://www.foo-bar.org",
    open
};

The open() function:

    open() is called when the filter is appended/inserted in the filter chain.
    It'll receive the handle (vf) and the optional filter parameters as a
    char* string. Note that encoders (ve_*) and the vo wrapper (vf_vo.c) have
    a non-string arg, but it's specially handled by MPlayer/MEncoder.

    The open() function should fill the vf_instance_t structure with the
    implemented functions' pointers (see below).
    It can optionally allocate memory for its internal data (vf_priv_t) and
    store the pointer in vf->priv.

    The open() function should parse (or at least check syntax of) parameters,
    and fail (return 0) on error.

Sample:

static int open(vf_instance_t *vf, char* args)
{
    vf->query_format = query_format;
    vf->config       = config;
    vf->put_image    = put_image;
    // allocate local storage:
    vf->priv = malloc(sizeof(struct vf_priv_s));
    if(!vf->priv) return 0;   // out of memory
    vf->priv->w =
    vf->priv->h = -1;         // -1 means: not set by the user
    // parse args, both values must be given:
    if(args && sscanf(args, "%d:%d", &vf->priv->w, &vf->priv->h)!=2){
        free(vf->priv);       // don't leak the storage on failure
        vf->priv = NULL;
        return 0;
    }
    return 1;
}

Functions in vf_instance:

NOTE: All these are optional, their function pointer is either NULL or points
to a default implementation. If you implement them, don't forget to set
vf->FUNCNAME in your open() !

    int (*query_format)(struct vf_instance *vf, unsigned int fmt);

The query_format() function is called one or more times before the config(),
to find out the capabilities and/or support status of a given colorspace (fmt).
For the return values, see vfcap.h!
Normally, a filter should return at least VFCAP_CSP_SUPPORTED for all
colorspaces it accepts as input, and 0 for the unsupported ones.
If your filter does linear conversion, it should query the next filter,
and merge in its capability flags. Note: You should always ensure that the
next filter will accept at least one of your possible output colorspaces!

Sample:

static int query_format(struct vf_instance *vf, unsigned int fmt)
{
    switch(fmt){
    case IMGFMT_YV12:
    case IMGFMT_I420:
    case IMGFMT_IYUV:
    case IMGFMT_422P:
        return vf_next_query_format(vf,IMGFMT_YUY2) & (~VFCAP_CSP_SUPPORTED_BY_HW);
    }
    return 0;
}

For the more complex case, when you have an N -> M colorspace mapping matrix,
see vf_scale or vf_format for examples.


    int (*config)(struct vf_instance *vf,
        int width, int height, int d_width, int d_height,
        unsigned int flags, unsigned int outfmt);

config() is called to initialize/configure the filter before using it.
Its parameters are already well-known from libvo:
    width, height:     size of the coded image
    d_width, d_height: wanted display size (usually aspect corrected w/h)
        Filters should use width, height as input image dimension, but the
        resizing filters (crop, expand, scale, rotate, etc) should update
        d_width/d_height (display size) to preserve the correct aspect ratio!
        Filters should not rely on d_width, d_height as input parameters,
        the only exception is when a filter replaces some libvo functionality
        (like -vf scale with -zoom, or OSD rendering with -vf expand).
    flags:             the "good" old libvo flag set:
        0x01 - force fullscreen (-fs)
        0x02 - allow mode switching (-vm)
        0x04 - allow software scaling (-zoom)
        0x08 - flipping (-flip)
        (Usually you don't have to worry about flags, just pass them on to
        the next config().)
    outfmt:            the selected colorspace/pixelformat. You'll receive
        images in this format.

Sample:

static int config(struct vf_instance *vf,
                  int width, int height, int d_width, int d_height,
                  unsigned int flags, unsigned int outfmt)
{
    // use d_width/d_height if not set by the user:
    if (vf->priv->w == -1)
        vf->priv->w = d_width;
    if (vf->priv->h == -1)
        vf->priv->h = d_height;
    // initialize your filter code
    ...
    // OK now config the rest of the filter chain, with our output parameters:
    return vf_next_config(vf,vf->priv->w,vf->priv->h,d_width,d_height,flags,outfmt);
}

    void (*uninit)(struct vf_instance *vf);

Okay, uninit() is the simplest, it's called at the end. You can free your
private buffers etc here.

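Sample: a minimal uninit() sketch. struct un_priv and struct un_vf are
hypothetical stand-ins for your vf_priv_s and for vf_instance_t, and the
tmp buffer is just an example of private data that has to go away.

```c
#include <stdlib.h>

/* Hypothetical private data and a cut-down stand-in for vf_instance_t. */
struct un_priv { int w, h; unsigned char *tmp; };
struct un_vf   { struct un_priv *priv; };

static void un_uninit(struct un_vf *vf)
{
    if (!vf->priv)
        return;               /* open() failed early or uninit ran already */
    free(vf->priv->tmp);      /* free(NULL) is a no-op */
    free(vf->priv);
    vf->priv = NULL;          /* guard against accidental reuse */
}
```

Remember to set vf->uninit in your open(), like for the other functions.
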
    int (*put_image)(struct vf_instance *vf, mp_image_t *mpi);

Ah, put_image(). This is the main filter function, it should convert/filter/
transform the image data from one format/size/color/whatever to another.
Its input parameter is an mpi (mplayer image) structure, see mp_image.h.
Your filter has to request a new image buffer for the output, using the
vf_get_image() function. NOTE: Even if you don't want to modify the image
and just pass it to the next filter, you have to either
- not implement put_image() at all - then it will be skipped, or
- request a new image with type==EXPORT and copy the pointers.
NEVER pass the mpi as-is, it's local to the filters and may cause trouble.

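Sample: the "copy the pointers" part of the EXPORT case, as a sketch.
ex_image is a cut-down stand-in for mp_image_t and ex_export_planes() is a
made-up helper; in a real filter dmpi would come from vf_get_image() with
MP_IMGTYPE_EXPORT and then go to vf_next_put_image().

```c
/* Cut-down stand-in for mp_image_t. */
typedef struct ex_image {
    unsigned char *planes[3];
    int stride[3];
    int w, h;
} ex_image;

/* Make 'dmpi' refer to the pixel data of 'mpi'. Only pointers and strides
 * are copied, no pixels. */
static void ex_export_planes(ex_image *dmpi, const ex_image *mpi)
{
    for (int p = 0; p < 3; p++) {
        dmpi->planes[p] = mpi->planes[p];
        dmpi->stride[p] = mpi->stride[p];
    }
    dmpi->w = mpi->w;
    dmpi->h = mpi->h;
}
```

The next filter then sees its own dmpi structure, which merely points at
the same pixels as the incoming mpi.
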
If you completely copy/transform the image, then you probably want this:

    dmpi = vf_get_image(vf->next,mpi->imgfmt, MP_IMGTYPE_TEMP,
                        MP_IMGFLAG_ACCEPT_STRIDE, vf->priv->w, vf->priv->h);

It will allocate a new image and return an mp_image structure filled with
buffer pointers and stride (bytes per line) values, with a size of vf->priv->w
times vf->priv->h. If your filter cannot handle stride, then leave out
MP_IMGFLAG_ACCEPT_STRIDE. Note that you can do this, but it isn't recommended;
the whole video path is designed to use strides to get optimal throughput.
If your filter allocates output image buffers, then use MP_IMGTYPE_EXPORT
and fill the returned dmpi's planes[], stride[] with your buffer parameters.
Note that this is not recommended either (no direct rendering), so if you can,
use vf_get_image() for buffer allocation!
For other image types and flags see mp_image.h, it has comments.
If you are unsure, feel free to ask on the dev-eng mailing list. Please
describe the behavior of your filter, and its limitations, so we can
suggest the optimal buffer type + flags for your code.

Now that you have the input (mpi) and output (dmpi) buffers, you can do
the conversion. If you didn't notice yet, mp_image has some useful info
fields. They may help you a lot creating if() or for() structures:
    flags: MP_IMGFLAG_PLANAR, MP_IMGFLAG_YUV, MP_IMGFLAG_SWAPPED
        helps you to handle various pixel formats in single code.
    bpp: bits per pixel
        WARNING! It's number of bits _allocated_ to store a pixel,
        it is not the number of bits actually used to keep colors!
        So it's 16 for both 15 and 16 bit color depth, and is 32 for
        32bpp (actually 24 bit color depth) mode!
        It's 1 for 1bpp, 9 for YVU9, and is 12 for YV12 mode. Get it?
    For planar formats, you also have chroma_width, chroma_height and
    chroma_x_shift, chroma_y_shift, they specify the chroma subsampling
    for YUV formats:
        chroma_width  = luma_width  >> chroma_x_shift;
        chroma_height = luma_height >> chroma_y_shift;

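Sample: the two formulas above plus the bpp rule, as tiny helpers. The
function names are made up for this example. cs_frame_bytes() assumes a
tightly packed frame (no stride padding), and the shifts simply round down,
exactly like the formulas above.

```c
#include <stddef.h>

/* chroma dimension = luma dimension >> chroma shift */
static int cs_chroma_dim(int luma_dim, int chroma_shift)
{
    return luma_dim >> chroma_shift;
}

/* Bytes needed for one tightly packed frame, derived from bpp, i.e. from
 * the number of bits _allocated_ per pixel (12 for YV12, 16 for 15/16 bit
 * RGB, 32 for 32bpp RGB). */
static size_t cs_frame_bytes(int width, int height, int bpp)
{
    return (size_t)width * (size_t)height * (size_t)bpp / 8;
}
```

E.g. for a 720x576 YV12 image (both shifts are 1, bpp is 12) this gives
360x288 chroma planes and 622080 bytes for the whole frame.
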
When you're done, call the rest of the filter chain to process your output
image:
    return vf_next_put_image(vf,dmpi);


Ok, the rest is for advanced functionality only:

    int (*control)(struct vf_instance *vf, int request, void* data);

You can control the filter at runtime from MPlayer/MEncoder/dec_video:
#define VFCTRL_QUERY_MAX_PP_LEVEL 4 /* test for postprocessing support (max level) */
#define VFCTRL_SET_PP_LEVEL       5 /* set postprocessing level */
#define VFCTRL_SET_EQUALIZER      6 /* set color options (brightness,contrast etc) */
#define VFCTRL_GET_EQUALIZER      8 /* get color options (brightness,contrast etc) */
#define VFCTRL_DRAW_OSD           7
#define VFCTRL_CHANGE_RECTANGLE   9 /* Change the rectangle boundaries */


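Sample: a control() sketch that handles two requests itself and passes
everything else down the chain. struct eq_vf and struct eq_priv are stand-ins
for vf_instance_t and your private data, the EQ_CONTROL_* return codes are
assumed values, and treating 'data' as a plain int* is a simplification made
for this example (check vf.h for what each request really passes).

```c
#include <stddef.h>

#define VFCTRL_SET_EQUALIZER 6   /* values copied from the list above */
#define VFCTRL_GET_EQUALIZER 8

#define EQ_CONTROL_TRUE      1   /* assumed return codes for this sketch */
#define EQ_CONTROL_UNKNOWN  -1

struct eq_priv { int brightness; };
struct eq_vf {
    struct eq_priv *priv;
    struct eq_vf *next;
    int (*control)(struct eq_vf *vf, int request, void *data);
};

/* Forward a request to the next filter, if there is one that listens. */
static int eq_next_control(struct eq_vf *vf, int request, void *data)
{
    if (vf->next && vf->next->control)
        return vf->next->control(vf->next, request, data);
    return EQ_CONTROL_UNKNOWN;
}

static int eq_control(struct eq_vf *vf, int request, void *data)
{
    switch (request) {
    case VFCTRL_SET_EQUALIZER:
        vf->priv->brightness = *(int *)data;
        return EQ_CONTROL_TRUE;
    case VFCTRL_GET_EQUALIZER:
        *(int *)data = vf->priv->brightness;
        return EQ_CONTROL_TRUE;
    }
    /* not ours: let the rest of the chain have a go */
    return eq_next_control(vf, request, data);
}
```

Requests the filter does not know about are never swallowed here; they
reach the next filter unchanged.

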
    void (*get_image)(struct vf_instance *vf, mp_image_t *mpi);

This is for direct rendering support, works the same way as in libvo drivers.
It makes in-place pixel modifications possible.
If you implement it (vf->get_image!=NULL), then it will be called to do the
buffer allocation. You SHOULD check the buffer restrictions (stride, type,
readability etc) and if everything is OK, then allocate the requested buffer
using the vf_get_image() function and copy the buffer pointers.

NOTE: You HAVE TO save the dmpi pointer, as you'll need it in put_image()
later on. It is not guaranteed that you'll get the same mpi for put_image() as
in get_image() (think of out-of-order decoding, get_image is called in decoding
order, while put_image is called in display order) so the only safe place to
save it is in the mpi struct itself: mpi->priv=(void*)dmpi;


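Sample: why the dmpi pointer belongs in mpi->priv, as a sketch. dr_image is
a cut-down stand-in for mp_image_t, DR_FLAG_DIRECT plays the role of
MP_IMGFLAG_DIRECT (value made up), and both functions are hypothetical
helpers. In real code dmpi would be obtained with vf_get_image() inside
get_image().

```c
#include <stddef.h>

#define DR_FLAG_DIRECT 0x1   /* stand-in for MP_IMGFLAG_DIRECT */

/* Cut-down stand-in for mp_image_t. */
typedef struct dr_image {
    unsigned char *planes[3];
    int stride[3];
    unsigned int flags;
    void *priv;              /* per-image slot for the filter's own use */
} dr_image;

/* get_image() part: hand the memory of 'dmpi' to the decoder via 'mpi'
 * and remember which output buffer belongs to this particular mpi. */
static void dr_attach(dr_image *mpi, dr_image *dmpi)
{
    for (int p = 0; p < 3; p++) {
        mpi->planes[p] = dmpi->planes[p];
        mpi->stride[p] = dmpi->stride[p];
    }
    mpi->flags |= DR_FLAG_DIRECT;
    mpi->priv = dmpi;
}

/* put_image() part: find the output buffer again, whatever the order in
 * which the images arrive. Returns NULL if mpi was not direct rendered. */
static dr_image *dr_lookup(const dr_image *mpi)
{
    if (!(mpi->flags & DR_FLAG_DIRECT))
        return NULL;
    return (dr_image *)mpi->priv;
}
```

Because the link is stored per image, it still works if put_image() sees
the images in a different order than get_image() did.

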
    void (*draw_slice)(struct vf_instance *vf, unsigned char** src,
                       int* stride, int w,int h, int x, int y);

It's the good old draw_slice callback, already known from libvo.
If your filter can operate on partial images, you can implement this one
to improve performance (cache utilization).

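Sample: what operating on a partial image can look like, for one plane.
This sketch assumes, as in libvo, that src points at the first pixel of the
slice and that (x, y) locates the slice inside the full frame. The function
name and the inversion are arbitrary choices for the example.

```c
/* Invert a w x h slice and store it at position (x, y) of the destination
 * plane. Only the rows and columns of the slice are written. */
static void sl_invert_slice(unsigned char *dst, int dst_stride,
                            const unsigned char *src, int src_stride,
                            int w, int h, int x, int y)
{
    for (int row = 0; row < h; row++) {
        const unsigned char *s = src + row * src_stride;
        unsigned char *d = dst + (y + row) * dst_stride + x;
        for (int col = 0; col < w; col++)
            d[col] = (unsigned char)(255 - s[col]);
    }
}
```

The slice is processed while it is still hot in the cache, right after the
decoder produced it, which is the whole point of draw_slice.
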
Ah, and there are two sets of capability/requirement flags (vfcap.h type)
in vf_instance_t, used by the default query_format() implementation, and by
the automatic colorspace/stride matching code (vf_next_config()).

    // caps:
    unsigned int default_caps; // used by default query_format()
    unsigned int default_reqs; // used by default config()

BTW, you should avoid using global or static variables to store filter instance
specific stuff, as filters might be used multiple times and in the future even
multiple streams might be possible.


The AUDIO path:
===============
TODO!!!