Archive for December, 2013

Avid DMF Considerations

Monday, December 30th, 2013


One of the many new features that were part of the 7.0.0 release was Dynamic Media Folders or DMF. DMF allows for media services to be applied to folders as a set of rules and actions and processed whether the Media Composer application is running or not. The solution appears to be a localized version of Interplay Central as you can see “Interplay Central - Progress Monitor”  flash for a few seconds in the browser’s tab before changing to “Background Queue.” I think it’s great that technology and solutions get repurposed for different uses and markets once developed.

But in the case of DMF, there are some considerations to be made depending on your workflow and whether you are better served with a foreground/background transcode, or using DMF. And as usual, the answer is “it depends.” It depends on what your sources are and what you plan on doing with the media once edited: is it to be finished inside Media Composer or conformed elsewhere? An important consideration is whether you are dependent on FrameFlex as part of the process. As you can see from the following screenshot, not all transcodes are created equal and the user needs to be aware of which one is being used. The NOTES field indicates DMF, and Background Compatibility ON or OFF.

As you can see, the same original AMA linked clip will have different metadata attached to it after a DMF transcode or a foreground/background transcode process once linked in the bins. The Image Aspect Ratio and Reformat values are not the same (Stretch? Nothing’s been stretched). Some of this is based on whether you have compatibility ON or OFF in the transcode settings, and I will get to that in a future blog dedicated to FrameFlex. The issue is that even with these settings turned off in the DMF profile, the file is transcoded as though they were turned on.


It makes sense for Color Transform to be baked in (for now), but not for raster and image size. It is either a bug or a limitation of the DMF’s external transcode engine. Anything listed as 16:9 in the Image Aspect Ratio column at a 1920×1080 resolution will not have the ability to use FrameFlex in the timeline. For an offline to online process, FrameFlex on proxy media should be part of the basic design.

Another consideration is that when using DMF, there is no project association attached to the media. The DMF clip has an empty “Project” field. If you are using DMF for media to be associated with different projects and later use the Media Tool or third party software applications to find associated media, it will be problematic at best.

I find that background transcode is the overall better solution for formats that are not 16:9 or greater than HD in resolution. Background transcode allows me to manage clips to a defined project, keep the raster size “live” for FrameFlex and downstream conforms, all for the small price of keeping Media Composer up and running. Having Media Composer not running with DMF does not offer enough advantages considering what I am giving up. It would be an entirely different matter if I could run Background Services on a separate system that did not have Media Composer installed. Perhaps that will be a future consideration as referred to in this blog.

As with any project, think through all your needs from start to finish, and pick the best path for success.


Monday, December 30th, 2013


In the course of my “industry research” I came across a very cool little application called “Plotagon”. It is a simple integrated script and storyboard application with real time playback and voices of what is written, very similar to a gaming engine, or a Sims-type environment. It will be interesting to see where this application goes as they expand its toolset, as it can be used in marketing, social media, education and to some extent, filmmaking. I was able to very quickly write a bad script and, using the preset list of actions, create a small scene. Once complete, it can be shared via the Plotagon site as well as YouTube if desired. You can see this masterpiece here. The script can be exported using the Fountain markup language supported in several writing applications. The entire process is very easy, and somewhat addicting. The filmmaking process would need more controls over actions, timing, angles, etc. which would make the UI more involved, but I can see them pulling this off in future versions.

This reminds me of technology I have seen in the past with PowerProduction’s software offerings for storyboarding and recently its integration as a plug-in for NLE systems. Martini QuickShot can be used in Final Cut Pro and Media Composer as an AVX plug-in, allowing editors to add missing shots as needed rather than just a title with “Missing Shot”, providing better previsualization when working with production and producers. I have often sent printed timelines or exports in frame view to production to give them a better idea of the shot size and angle to better support the story. In 2005, Media Composer exported interactive HTML storyboards from Avid FilmScribe, but unfortunately most of the web-based templates no longer work.

Editing, like any language, is in a constant state of change. The combination of script, game engines and editing continues to shape how stories are told and shared across different distribution channels, and it will be fascinating to see how the tools used by storytellers evolve over time.

My Two Favorite 7.0.3 Bug Fixes

Friday, December 27th, 2013


Wim Van den Broeck does an excellent job of listing all the available fixes and “new” items in Media Composer v7.0.3 in his blog. My two favorites are listed as bug fixes in the 7.0.3 Read Me, but they border on being new features as they start enabling new workflows, or re-enabling old ones.

The first one is 24fps timecode support in the SoundTC column for “NTSC” based project types. I’ve written about workflow issues with SoundTC in this blog, but basically the issue was that SoundTC was always hard-coded to 30fps timecode in a 23.976 or 24.000 HD project type. This was a workflow carryover from NTSC based workflows before HD was available, but never updated in the nine years since HD was a standard Avid project. This is important to workflows that sync in third party systems prior to editorial as there is only one master clip that needs to carry all the timecode information. From a conform perspective, productions were putting this timecode in any of the other five AuxTC columns, making it very inconsistent from production to production. With 7.0.3 and forward, there is now a self defined timecode column for this metadata. There are some things to be aware of, though, as the “bug fix” does just enough to allow for 24 frame counts:

  • ALE has a new Global header “SOUNDTC_FPS” where the value 24 or 30 can be defined. Unlike the header for Video FPS, the ALE will not import if there is a 23.976 value. Only 24 or 30 can be defined. This type of ALE will only work in 7.0.3 and forward. I have not tried it in an earlier version to see what happens, but I suspect a similar error message will pop up.
  • Any one clip can either be 24 or 30. Unfortunately you cannot track what the value is other than loading the clip and scrolling past the :23 frame count to see what it’s doing. I recommend adding a custom column and entering a value there if it needs to be tracked. It would be nice if the global header from the ALE were available as a column once imported.
  • The user is prompted with the choice of 24 or 30 only when that field is empty and a new value is being entered by hand or by the “duplicate column” feature. If there is a value there, any new entry will assume the frame rate of the existing timecode. If you need to change the timecode count type, delete values first, then enter a new value, at which point you will be prompted. An exception to this is merging an ALE where SOUNDTC_FPS is defined. All existing values will now reflect what is defined in the ALE.
  • If you create an EDL with mixed timecode values in the SoundTC, it will not be flagged with an FCM command or comment.
  • If you export an ALE from the bin, it does not contain the new Global Header.
  • You can bring a bin with 24 frame timecode in the SoundTC back to MC v6.5.4.1 and it will count as 24, but you will have no ability to change it as in 7.0.3 if needed.
  • Bringing a bin forward will assume the 30fps timecode and remain as such in 7.0.3, but can be changed once value is deleted and a new one entered.
  • User will be prompted when doing a “duplicate column.”
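To make the header rule and the “scroll past :23” check concrete, here is a minimal Python sketch. It assumes the usual tab-delimited ALE Heading section; where SOUNDTC_FPS sits among the other global headers, and the helper names themselves, are my own assumptions rather than anything taken from the Read Me. The timecode helper assumes non-drop-frame counting.

```python
ALLOWED_SOUNDTC_FPS = {"24", "30"}  # per the post: 23.976 is rejected on import


def build_ale_heading(video_fps: str, soundtc_fps: str) -> str:
    """Return a tab-delimited ALE Heading section (hypothetical helper)."""
    if soundtc_fps not in ALLOWED_SOUNDTC_FPS:
        raise ValueError("SOUNDTC_FPS must be 24 or 30, got " + soundtc_fps)
    lines = [
        "Heading",
        "FIELD_DELIM\tTABS",
        "FPS\t" + video_fps,
        "SOUNDTC_FPS\t" + soundtc_fps,  # header position is an assumption
    ]
    return "\n".join(lines) + "\n"


def next_timecode(tc: str, fps: int) -> str:
    """Advance a non-drop HH:MM:SS:FF timecode by one frame at the given count."""
    hh, mm, ss, ff = (int(part) for part in tc.split(":"))
    total = ((hh * 60 + mm) * 60 + ss) * fps + ff + 1
    ff = total % fps
    seconds = total // fps
    return "{:02d}:{:02d}:{:02d}:{:02d}".format(
        (seconds // 3600) % 24, (seconds // 60) % 60, seconds % 60, ff
    )


print(build_ale_heading("23.976", "24"))
print(next_timecode("01:00:00:23", 24))  # rolls over to the next second
print(next_timecode("01:00:00:23", 30))  # keeps counting up to :29
```

Stepping one frame past :23 is the same test described in the list above: a 24-count clip rolls to the next second, while a 30-count clip shows :24.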

My second favorite one is the change to the ALE merge process. In all previous versions, merging an ALE into existing masterclips was not a true merge, but a “replace with” whatever was in the ALE. So if there was existing metadata in a column that was not in the ALE, it would get deleted. For example, here is a bin with metadata before the merge:

And in previous versions, merging the following ALE:


would result in the bin now looking like


Where only the fields in the ALE remain, but all the other column metadata has been deleted. In 7.0.3, the same ALE merge results in


Where all the metadata remains and only what is in the ALE changes after the merge. This one has lots of workflow benefits as external databases can now be easily repurposed in editorial at any point in the process. The metadata update does propagate to the subclips, but I still need to test for .sync and .grp. This starts enabling color workflows as mentioned in this blog entry.
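The difference between the two merge behaviors can be modeled with plain dictionaries. This is only a sketch of the behavior as I have described it, not Avid’s implementation; the column names and values are made up for illustration.

```python
def merge_before_703(clip: dict, ale_row: dict) -> dict:
    """Old behavior as described: the ALE row replaces the clip's metadata."""
    return dict(ale_row)


def merge_703(clip: dict, ale_row: dict) -> dict:
    """New behavior: existing columns are kept, ALE columns overwrite or add."""
    merged = dict(clip)
    merged.update(ale_row)
    return merged


# Hypothetical bin row and ALE row, for illustration only.
clip = {"Name": "A001C003", "Scene": "12", "Take": "3", "Comments": "good take"}
ale_row = {"Name": "A001C003", "Scene": "12A"}

print(merge_before_703(clip, ale_row))  # Take and Comments are gone
print(merge_703(clip, ale_row))         # only Scene changes
```

With the 7.0.3 behavior an ALE only has to carry the columns that changed, which is why trimming the ALE down to the relevant fields works so well.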

Both of these features are much needed workflow type solutions. I hope Avid is reaching out to the third parties that create dailies for post production so that they can offer ALE generation with only changed columns as a feature, and so that they are aware of the SoundTC column, even if it is to be tracked as redundant metadata during the transition process. ALE merging of a subset of columns is currently done by editing the ALE itself to only contain relevant fields, which is how the above example was created. Some databases will export only selected fields, which makes the process a whole lot smoother.

For me, these are more than bug fixes; they are workflow enhancements important to the post process. I would have highlighted these two in the “What’s New” section of the Read Me rather than in the bug fix list, but it is good to know these are now available.

Giving Voice to Metadata

Friday, December 27th, 2013


I think anyone using PhraseFind, ScriptSync, and SoundBite  appreciates what dialogue can bring in finding what you’re looking for as part of the editorial process. At times it is akin to finding a needle in a haystack. So it was interesting to see the Apple patent on voice-tagging photos and using Siri to retrieve them as part of the claims expanding the capabilities of voice based interaction with Apple devices. It will be seen as new and innovative if and when it hits a future version of iOS.

It reminded me an awful lot of a pending patent and prototype I had designed and built at Avid over three years ago that used multiple descriptive tracks on any given piece of media. Currently metadata tagging is either clip based, frame based, or span based and can be a drawn out process. The idea behind this solution was to add voice annotation and descriptions to the video. In its most simple form, a single track would describe what is going on in the scene. Because the tagging is time based, happening during record or playback, all search results line up to that portion of the clip. Using “context based range searches” can further refine search results. Things get even more “descriptive” when creating multiple metadata tracks, where each track can be of a certain category, for example:

  1. Characters, placement, movement, position, etc.
  2. Camera, angle, framing, movement, zooming, etc.
  3. Location, objects, day, night, interior, exterior, colors, etc.

Any search can now use all tracks, or just a subset of tracks to filter out results as needed. Combining voice tagging metadata with pre-existing “metadata” such as camera name, shoot date, scene and take can make for a very powerful media management system that could not only be considered new and innovative, but extremely useful as well to productions dealing with not hundreds, but thousands of hours of source material. Some customers I discussed this with had needs for forty or more descriptive tracks on any given source. One could even consider recording a tagged “descriptive” track directly to camera during production and using it anywhere downstream in the production cycle.
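As a rough illustration of the idea, and not the prototype itself, time-based descriptive tracks can be modeled as annotations that carry a track name and a time range. In this sketch the spoken descriptions are stood in for by plain text, and every name and value is hypothetical.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional, Set


@dataclass(frozen=True)
class Annotation:
    track: str    # category of the descriptive track, e.g. "camera"
    start: float  # seconds from the start of the clip
    end: float
    text: str     # stand-in for the spoken description


def search(annotations: Iterable[Annotation], term: str,
           tracks: Optional[Set[str]] = None) -> List[Annotation]:
    """Return annotations containing the term, optionally limited to some tracks."""
    needle = term.lower()
    return [
        a for a in annotations
        if (tracks is None or a.track in tracks) and needle in a.text.lower()
    ]


clip_tags = [
    Annotation("characters", 0.0, 8.0, "Anna enters from the left"),
    Annotation("camera", 0.0, 8.0, "wide shot, slow push in from the left"),
    Annotation("location", 0.0, 30.0, "interior kitchen, night"),
]

print(search(clip_tags, "left"))                     # hits on two tracks
print(search(clip_tags, "left", tracks={"camera"}))  # camera track only
```

Because each hit carries its own start and end, a result points at a portion of the clip rather than the whole clip.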

Voice, the new metadata.

FrameFlex vs. Resize

Saturday, December 14th, 2013


Avid FrameFlex is a new feature in Media Composer v7 that allows for image re-framing. The FrameFlex parameters go back to the original high resolution master, using more pixels to create the new frame rather than resizing an HD frame to the new size. The former takes more pixels and scales them down, while the latter takes fewer pixels and blows them up. Scaling down tends to result in a higher quality image compared to the reverse. So with this in mind, and knowing that only FrameFlex uses the original source file resolution, and any scaling operation that is not FrameFlex is restricted to the HD resolution of the project, I set out to compare the different methods of re-scaling versus extraction.

  1. FrameFlex
  2. 3D Warp Effect with HQ active
  3. Standard Resize effect 

The image above is a 4K (Quad HD) R3D file. As you can see from the FrameFlex bounding box, it is a rather aggressive “punch-in” for the shot. In FrameFlex terms, it is 50%; as far as resize goes, it is 200%. The results were really surprising. In the end, I did not see 200% of “wow” difference. For the most part, it was very difficult to see the differences between the two operations. While there is some very slight softening, it was not as much as I thought it was going to be. And just to be sure, I did the same extraction in RedCine X Pro to use as reference. In that frame there is a difference in the gray area of the shirt which could be attributed to the 12-bit to 10-bit transcode. In all tests, the R3D was done as a FULL debayer to an uncompressed HD MXF file.
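The arithmetic behind the 50% versus 200% comparison can be sketched in a few lines of Python. This assumes a 3840 pixel wide Quad HD source and a 1920 pixel wide HD project; the function name is mine and this only models pixel counts, not the actual scaling filters used by either effect.

```python
def upscale_factor(working_width: int, output_width: int, punch_in: float) -> float:
    """Enlargement applied to the pixels inside the punch-in window.

    punch_in is the push-in as a ratio (2.0 means a 200% resize).
    """
    window_width = working_width / punch_in  # pixels actually sampled
    return output_width / window_width


SOURCE_WIDTH = 3840   # assumed Quad HD source width
PROJECT_WIDTH = 1920  # HD project raster width

# FrameFlex samples the original source; a resize works on the HD frame.
frameflex = upscale_factor(SOURCE_WIDTH, PROJECT_WIDTH, 2.0)
resize = upscale_factor(PROJECT_WIDTH, PROJECT_WIDTH, 2.0)

print(frameflex)  # 1.0, the window is already 1920 source pixels wide
print(resize)     # 2.0, a 960 pixel wide window is blown up to 1920
```

Under these assumptions the FrameFlex extraction at this punch-in is a 1:1 pixel mapping, while the resize has to invent every other pixel, which is why I expected a bigger visible difference than I got.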

Here are the resulting frames exported as TIFF. Click links to download each file.

I also did a quick test with the standard Resize effect which does not have an HQ button and there is some very slight difference there, compared to the 3D Warp resize with HQ active. If you want to download the zip file with all the TIFF files, click here. In the end, it’s different tools for different jobs. The 3D Warp does give you extra image control such as rotation to level out a horizon when needed. 

Quality overall is difficult to tell from stills alone. Codec, aspect ratio (other than a multiple of 16:9), motion and other factors do come into play, but with all things relative, I was more surprised at how well the resize from HD stood up. Even the amount of detail and noise in a shot could affect the overall quality of the resize versus extraction operations. Here is a download of the same test with the XAVC 4K codec. In this case, the 3D Warp is less crisp at the same 200% push, but as expected, with a smaller push-in, it becomes less noticeable. Also, there would be a distinct visible quality difference had the same re-frame been shot as Quad HD resolution to start with versus an extraction, but that is a test for another day.

PhraseFind Tips

Friday, December 6th, 2013


PhraseFind is one of the more innovative methods available to search lots of footage based on what is spoken, in addition to what might be logged as metadata. This phonetic indexing and search functionality is technology licensed from Nexidia which powers other similar solutions such as SoundBite from Boris. Anyone who has used it will gush over the time-savings and usefulness this brings to workflows. But I bet that many users are not benefiting from the full functionality that PhraseFind offers, or do not know how to enter terms correctly to ensure proper results. Unfortunately this information is not part of any Media Composer documentation that I could easily find.

Users can use the following syntax operators to really zero in on specific content with extremely high confidence in the results. So, from my modified Nexidia documentation on search tips:

A key part of successfully searching audio with Nexidia’s technology is to understand a few basic rules regarding entering search terms. The syntax used here is entered directly in the text entry of the FIND window:

Characters:  A general rule of thumb is to spell out every word in the search query.  This includes:

  • Numbers:  Instead of ‘2008’, type ‘two thousand eight’.  Spell the number using the variation in which it is most likely to be spoken. For example, in an address, ‘495’ is likely to be referred to as either ‘four nine five’ or ‘four ninety five’.
  • Acronyms:  Separate acronyms that are spoken as a series of letters with spaces.  For example, ‘FBI’ would be entered ‘F B I’ and ‘NCAA’ would most likely be entered ‘N C double A’. 
  • Symbols, Punctuation: Omit all symbols and punctuation such as $, ! and -. 
  • Abbreviations:  Spell out an abbreviation the way it is pronounced.  For example ‘Mr’ should be ‘Mister’ and ‘Dr’ would be ‘Doctor’ (or ‘Drive’).

Quotation Marks:  These are the only non-alphabetic characters that are valid to use in the search box (other than “&” for spanned ranges - see below).  Placing quotation marks around two or more words tells PhraseFind to search for those words together in the sequence they were entered.  For example, entering “President Ford” “United Nations” will launch a search for President Ford as one term and United Nations as the other term.

Spelling:  Because we’re not searching a transcript or other text, correct spelling is not required.  In fact, modifying the spelling of words can actually help improve the results.  If the correct spelling of a word does not match the way in which it is typically spoken, adjust it so that the letters more closely match how it is spoken.  For example, when searching for ‘Barack Obama’, try ‘Buh rock o bahma’.
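The character rules above are easy to apply by hand, but a small pre-processing helper shows how they fit together. This is a minimal sketch of my own, not part of PhraseFind or Nexidia’s tooling. It only handles plain terms (it strips quotation marks and “&” as well), reads digits one at a time, which is only one of the spoken variations, and knows just the two abbreviations listed.

```python
import re

DIGIT_WORDS = ["zero", "one", "two", "three", "four",
               "five", "six", "seven", "eight", "nine"]
ABBREVIATIONS = {"mr": "Mister", "dr": "Doctor"}  # deliberately tiny list


def spell_out(query: str) -> str:
    """Rewrite plain search terms the way they are likely to be spoken."""
    words = []
    for token in query.split():
        core = re.sub(r"[^A-Za-z0-9]", "", token)  # omit symbols and punctuation
        if not core:
            continue
        if core.lower() in ABBREVIATIONS:
            words.append(ABBREVIATIONS[core.lower()])
        elif core.isdigit():
            words.extend(DIGIT_WORDS[int(d)] for d in core)  # 495 -> four nine five
        elif core.isupper() and len(core) > 1:
            words.extend(core)  # FBI -> F B I
        else:
            words.append(core)
    return " ".join(words)


print(spell_out("Mr. Smith, FBI, 495!"))
```

A helper like this would turn ‘NCAA’ into ‘N C A A’ rather than ‘N C double A’, so the result still deserves a quick read before searching.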

Expression Analysis

This method extracts discrete sequences of adjacent terms that are most likely to have been spoken together, and therefore yield good results for the user:

Multiple Searches:  Given a multi-word search – each individual term, as well as the literal string is independently searched.  An example — President Ronald Reagan – typed into the search box would produce the following search requests:

  • Search (President Ronald Reagan) 
  • Search (President) 
  • Search (Ronald) 
  • Search (Reagan) 

Quotation Marks:  If terms are enclosed in quotation marks, the terms inside the quotation marks are searched together, and the remaining words are each individually searched.  A literal search is not executed when quotation marks are included.  An example – healthcare reform  “Ronald Reagan” – typed into the search  box would produce the following search requests:

  • Search (healthcare) 
  • Search (reform) 
  • Search (Ronald Reagan) 
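The two expansion rules can be written down as a short Python function. This is a sketch of the behavior as described above, not Nexidia’s code; it expects straight double quotes, and the order in which the requests are returned is my assumption.

```python
import re
from typing import List


def search_requests(query: str) -> List[str]:
    """Expand a query into the individual searches described above."""
    phrases = re.findall(r'"([^"]+)"', query)
    if not phrases:
        words = query.split()
        if len(words) > 1:
            # Literal string first, then each term on its own.
            return [" ".join(words)] + words
        return words
    # With quotation marks: no literal search of the whole string.
    remaining = re.sub(r'"[^"]*"', " ", query).split()
    return remaining + phrases


print(search_requests("President Ronald Reagan"))
print(search_requests('healthcare reform "Ronald Reagan"'))
```

The first call returns four requests and the second returns three, matching the two lists above.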

Span Based Searches

This is very useful when looking for content in a more contextual manner rather than specific instances as noted with the previous method. The FIND interface gives no UI indication for this type of search, but it is possible if the proper string is entered. For example, if I were cutting a program on the Red Sox winning season from all the broadcast material available, I may want to find footage of Papi and Home Run within a certain time span of each other. The results would indicate that the clip context is probably what I am looking for in this case, versus any time Papi, home, run or “Home Run” are uttered.

The notation is:

  • “Term1 &X Term2” … where X is time in seconds.
  • So for my example: “Papi &20 home run” would find those terms within 20 seconds or less of each other.

This can also be used to identify 2 or more words within a certain amount of time in a clip.  For example,

  • “Term1 &5 Term2 &10 Term3”

This will locate files with Term1 and Term2 within 5 seconds of one another, then within those files, determine if Term3 is spoken within 10 seconds of the first two terms.  Note this order dependency.
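To show how the notation breaks down, here is a small Python sketch that splits a span query into its terms and time spans, plus a simple two-term proximity check. The function names and the hit times are hypothetical, and the check is my reading of “within X seconds of each other”, not the engine’s actual logic.

```python
import re
from typing import List, Sequence, Tuple


def parse_span_query(query: str) -> Tuple[List[str], List[int]]:
    """Split 'Term1 &5 Term2 &10 Term3' into terms and spans in seconds."""
    parts = re.split(r"\s*&(\d+)\s*", query.strip())
    terms = parts[0::2]                       # text between the operators
    spans = [int(s) for s in parts[1::2]]     # captured seconds values
    return terms, spans


def within_span(hits_a: Sequence[float], hits_b: Sequence[float],
                seconds: float) -> bool:
    """True if any hit of one term is within the span of any hit of the other."""
    return any(abs(a - b) <= seconds for a in hits_a for b in hits_b)


print(parse_span_query("Papi &20 home run"))
print(parse_span_query("Term1 &5 Term2 &10 Term3"))
print(within_span([12.0], [30.5], 20))  # 18.5 seconds apart
```

Note that a multi-word term such as “home run” stays together as one term on either side of the “&” operator.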

Update: Version 2.0 of PhraseFind re-introduced with Media Composer v8.8 no longer supports span based searches.