Title: DOBES Workflow
1 DOBES Workflow
- Paul Trilsbeek
- DOBES training week
- June 2007
2 Archiving workflow
The sequence of actions that are required to get
a piece of recorded material into the archive.
3 Video workflow (MPI)
Recording
Capturing (DV)
Transcoding (MPEG1 & 2, WAV)
DMF
Sessions
Cutting (MPEG1 & 2, WAV)
Transcribing/Annotating
Metadata
Archiving
4 Video workflow (alternative)
Recording
Capturing (DV)
Cutting (DV)
Sessions
Transcoding (MPEG1 & 2, WAV)
Transcribing/Annotating
Metadata creation
Archiving
5 Workflow: People involved
- DOBES teams (you)
- DOBES Corpus Manager (me)
- MPI Digiteam (Nick Wood, Bas Roset)
6 Digital Master File (DMF)
- The MPI Digiteam captures every tape it receives
in its entirety. We call this complete recording the
Digital Master File or DMF.
- Based on this DMF, the DOBES team can determine
the time codes of those parts that they want to
archive, the so-called Sessions.
7 Specifying Sessions
- A Session in terms of the IMDI corpus is a
resource bundle, generally a piece of video or
audio (in several formats) accompanied by an
annotation file and optionally some other
information files.
- When we talk about Sessions in relation to
cutting media, we mean the piece of audio or
video that you would want to archive in the
corpus as one linguistically meaningful unit
(one interview, one story, one song, ...).
8 Specifying Sessions
- Audio and Video Sessions can also consist of
several fragments merged together into one file.
- In order to cut or put together these Sessions,
the Archive needs to receive an IMDI file with
the exact time code specifications based on the
Digital Master File (DMF).
9 Video workflow: Who does what?
Which parts of the workflow can be done by whom?
10 Video workflow: Who does what?
- 3 common scenarios:
  - all capturing, transcoding and cutting done by the MPI Digiteam.
  - first version in MPEG1 done by the DOBES team (in the field), MPEG2 version done by the MPI Digiteam.
  - all capturing, transcoding and cutting done by the DOBES team.
11 Video workflow: Scenario 1
- the DOBES team sends tapes to the DOBES Archive.
- the MPI Digiteam captures the tapes and sends the
entire captured files (DMF) to the DOBES team.
- the DOBES team determines time codes for the
sections that need to be cut (Sessions) and sends
these in IMDI metadata files to the Corpus
Manager.
- the MPI Digiteam cuts Sessions and sends them to
the DOBES team.
12 Video workflow: Scenario 1
- advantage: less work for the DOBES team.
- disadvantage: the DOBES team has to wait for
DMFs and Sessions to be sent by the MPI Digiteam.
13 Video workflow: Scenario 2
- the DOBES team does the capturing, cutting, and
transcoding in the field to create MPEG1,
writing down all time codes while capturing and
cutting.
- after returning, the DOBES team sends the tapes
and all time code information in IMDI files to
the DOBES Archive.
- the MPI Digiteam creates MPEG2, MPEG1 and WAV
versions for archiving with exactly the same time
codes.
14 Video workflow: Scenario 2
- advantage: the DOBES team can start
working with the media right away in the field.
- disadvantages:
  - more work to be done in the field
  - not all field circumstances may allow for this scenario
  - time code administration during the whole process is very important
15 Video workflow: Scenario 3
- the DOBES team does the whole capturing, transcoding
and cutting process and sends the final Sessions in
MPEG2, MPEG1 and WAV (according to MPI
specifications) with IMDI metadata to the DOBES
Archive, or uploads them to the Archive using
LAMUS.
16 Video workflow: Scenario 3
- advantage: the DOBES team can start
working with the media right away.
- disadvantages:
  - all the work has to be done by the DOBES team
  - higher chance of errors during the whole process
  - not all field circumstances may allow for this scenario
17 Video workflow: general
- Every team has different field circumstances and
requirements, so a suitable workflow scenario should
be agreed upon between the Corpus Manager and each
team.
- When doing parts yourself, send examples and
communicate with the Corpus Manager at an early
stage, instead of processing a large batch of
tapes and discovering at the end that something
is wrong.
18 Video workflow: general
- continuous time code on a video tape (no gaps or
jumps) prevents problems during capturing: use
the end search or equivalent function on your
camera before recording a new scene and do not
re-use tapes.
- do not use long play (LP) mode or 12-bit sound on DV
cameras.
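As an illustration of what "gaps or jumps" means in practice, here is a minimal sketch that scans a sequence of per-frame time code values (expressed as running frame counts) and reports every place where the count does not simply increase by one. The function name and the input format are assumptions made for this example; how the frame counts are read from a captured file is not covered here.

```python
def find_timecode_breaks(frame_numbers):
    """Return (index, previous, current) for every discontinuity.

    frame_numbers: per-frame time code values as running frame counts
    (hypothetical input; obtaining them from a capture is out of scope).
    """
    breaks = []
    for index in range(1, len(frame_numbers)):
        previous = frame_numbers[index - 1]
        current = frame_numbers[index]
        # Continuous time code increases by exactly one frame at a time.
        if current != previous + 1:
            breaks.append((index, previous, current))
    return breaks


# A reset to zero, as can happen when recording resumes after blank tape.
print(find_timecode_breaks([0, 1, 2, 3, 0, 1, 2]))
# A forward jump in the time code.
print(find_timecode_breaks([0, 1, 2, 10, 11]))
```

An empty result means the time code is continuous over the whole sequence.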
19 Video workflow: general
- there are hundreds of settings for MPEG encoding,
and different encoders differ in quality:
use the tools recommended by the MPI and the settings
files provided by the MPI.
20 Audio workflow
- Tape- or disc-based recorders: the workflow is
similar to the video workflow, except that no
transcoding is needed (linear PCM WAV is the
archiving format).
- Flash-memory-based recorders: no capturing
needed. Files can be copied directly to the
computer, after which the Sessions can be cut.
- HiMD recorders: files can be copied to the
computer using SonicStage software.
21 Audio workflow
- HiMD & Flash recorders: use linear PCM recording
mode, not compressed audio (MP3 / ATRAC).
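One possible way to confirm that a file really is uncompressed linear PCM WAV before sending it is sketched below with Python's standard `wave` module, which reads uncompressed PCM WAV files and raises `wave.Error` for formats it does not support. The function name `describe_wav` and the in-memory test file are assumptions for this example, and the sketch does not encode any MPI requirement on sample rate or bit depth.

```python
import io
import wave


def describe_wav(file_or_path):
    """Return basic parameters of a PCM WAV file.

    wave.open raises wave.Error if the file is not a WAV file
    the module can read (for example, compressed audio).
    """
    with wave.open(file_or_path, "rb") as wav:
        return {
            "channels": wav.getnchannels(),
            "bits_per_sample": wav.getsampwidth() * 8,
            "sample_rate": wav.getframerate(),
            "duration_s": wav.getnframes() / wav.getframerate(),
        }


# Build one second of 16-bit stereo silence in memory as a stand-in recording.
buffer = io.BytesIO()
with wave.open(buffer, "wb") as out:
    out.setnchannels(2)
    out.setsampwidth(2)      # 2 bytes = 16 bits per sample
    out.setframerate(48000)
    out.writeframes(b"\x00" * 4 * 48000)  # 4 bytes per stereo frame
buffer.seek(0)

info = describe_wav(buffer)
print(info)
```

For real recordings, pass the file path instead of the in-memory buffer.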
22 Tape labeling
- When sending tapes to the MPI, always label them
using the DOBES tape labeling conventions (see A4
guide) with a unique code for your project (talk
to the Corpus Manager), e.g. AWSDVDP24Jun0301 stands
for Aweti, Sebastian Drude, Video, DV, PAL,
recorded on the 24th of June 2003, tape number
01.
- Label the tape itself, not (only) the box.
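To make the structure of such a label concrete, the sketch below splits the example label into its parts. The field widths (two letters for the project, two for the researcher, one each for media type, tape format and video standard, then DDMonYY and a two-digit tape number) are inferred from this single example only; the actual convention in the A4 guide may differ. The function name and the assumption that two-digit years fall in the 2000s are also illustrative.

```python
import re
from datetime import date

# Layout inferred from the one example label; not the official convention.
LABEL_PATTERN = re.compile(
    r"^(?P<project>[A-Z]{2})"       # project code, e.g. AW (Aweti)
    r"(?P<researcher>[A-Z]{2})"     # researcher initials, e.g. SD
    r"(?P<media>[A-Z])"             # media type, e.g. V (video)
    r"(?P<tape_format>[A-Z])"       # tape format, e.g. D (DV)
    r"(?P<standard>[A-Z])"          # video standard, e.g. P (PAL)
    r"(?P<day>\d{2})(?P<month>[A-Z][a-z]{2})(?P<year>\d{2})"
    r"(?P<tape>\d{2})$"             # tape number
)

MONTHS = {
    name: number
    for number, name in enumerate(
        ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
         "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"],
        start=1,
    )
}


def parse_tape_label(label):
    """Split a tape label into its fields, or raise ValueError."""
    match = LABEL_PATTERN.match(label)
    if match is None:
        raise ValueError(f"label does not match the assumed layout: {label!r}")
    fields = match.groupdict()
    if fields["month"] not in MONTHS:
        raise ValueError(f"unknown month abbreviation: {fields['month']!r}")
    # Assumption: two-digit years belong to the 2000s.
    recorded = date(2000 + int(fields["year"]),
                    MONTHS[fields["month"]],
                    int(fields["day"]))
    return {
        "project": fields["project"],
        "researcher": fields["researcher"],
        "media": fields["media"],
        "tape_format": fields["tape_format"],
        "standard": fields["standard"],
        "recorded": recorded,
        "tape": int(fields["tape"]),
    }


print(parse_tape_label("AWSDVDP24Jun0301"))
```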
23 Time codes
- Please use the correct format when writing down
time codes:
- For video sources: HH:MM:SS:FF (hours, minutes,
seconds, frames). Frames are the smallest
possible step within a video file. PAL video has
25 frames per second, NTSC has 30 (or 29.97).
- For audio sources: HH:MM:SS:SSS (hours, minutes,
seconds, milliseconds).
- Even if you don't need this degree of accuracy,
fill in the remaining digits with zeros (e.g.
01:14:58:000).
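The zero-padded forms described above can be produced mechanically. The following is a minimal sketch with hypothetical function names; the separator between fields is an assumption (a colon by default, or an empty string for the undelimited form), and only whole-number frame rates such as PAL's 25 are handled, so NTSC at 29.97 frames per second is outside its scope.

```python
def video_timecode(total_frames, fps=25, sep=":"):
    """Format a frame count as hours, minutes, seconds, frames (fps=25 is PAL)."""
    if total_frames < 0:
        raise ValueError("frame count must not be negative")
    seconds, frames = divmod(total_frames, fps)
    minutes, seconds = divmod(seconds, 60)
    hours, minutes = divmod(minutes, 60)
    return sep.join([f"{hours:02d}", f"{minutes:02d}",
                     f"{seconds:02d}", f"{frames:02d}"])


def audio_timecode(total_ms, sep=":"):
    """Format milliseconds as hours, minutes, seconds, milliseconds."""
    if total_ms < 0:
        raise ValueError("duration must not be negative")
    seconds, millis = divmod(total_ms, 1000)
    minutes, seconds = divmod(seconds, 60)
    hours, minutes = divmod(minutes, 60)
    # Milliseconds are always written with three digits, padded with zeros.
    return sep.join([f"{hours:02d}", f"{minutes:02d}",
                     f"{seconds:02d}", f"{millis:03d}"])


# 1 hour, 14 minutes, 58 seconds, given in milliseconds.
print(audio_timecode(4498000))
print(audio_timecode(4498000, sep=""))
# 61 seconds and 3 frames of PAL video.
print(video_timecode(61 * 25 + 3))
```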
24 Specifying time codes in IMDI
- Specify Session start and end time codes in
Resources > Source.
- For merged fragments, specify multiple sources.
Specify the order in which they should be merged
in a Description field.
- For merged fragments from one source, specify the
source multiple times with different time codes.
Specify the order in which they should be merged
in a Description field.
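The idea of listing one source several times with different time codes can be illustrated with a small data model. This is only a sketch: the class and field names are hypothetical and do not reproduce the IMDI schema, PAL (25 frames per second) is assumed, and time codes are accepted with or without colons between the fields.

```python
from dataclasses import dataclass

FPS = 25  # assumption: PAL video


@dataclass(frozen=True)
class SourceFragment:
    source_id: str  # hypothetical identifier of the tape / DMF
    start: str      # video time code: hours, minutes, seconds, frames
    end: str
    order: int      # merge position, as would be stated in the Description


def to_frames(timecode, fps=FPS):
    """Convert an 8-digit video time code to a running frame count."""
    digits = timecode.replace(":", "")
    if len(digits) != 8 or not digits.isdigit():
        raise ValueError(f"expected 8 digits, got {timecode!r}")
    hours, minutes, seconds, frames = (
        int(digits[i:i + 2]) for i in range(0, 8, 2)
    )
    if minutes >= 60 or seconds >= 60 or frames >= fps:
        raise ValueError(f"field out of range in {timecode!r}")
    return ((hours * 60 + minutes) * 60 + seconds) * fps + frames


def merged_length_frames(fragments, fps=FPS):
    """Total length of the merged Session, checking each fragment."""
    total = 0
    for fragment in sorted(fragments, key=lambda f: f.order):
        start = to_frames(fragment.start, fps)
        end = to_frames(fragment.end, fps)
        if end <= start:
            raise ValueError(f"end is not after start in {fragment}")
        total += end - start
    return total


# Two fragments cut from the same source, merged in the stated order.
session = [
    SourceFragment("tape01", "00:01:00:00", "00:02:00:00", order=1),
    SourceFragment("tape01", "00:05:00:00", "00:05:30:12", order=2),
]
print(merged_length_frames(session))
```

Checking that every end lies after its start catches swapped or mistyped time codes before they are sent to the Archive.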
25 Specifying time codes in IMDI
26 Further workflow
- Once the IMDI metadata descriptions and media
Sessions are ready, this material can already be
archived. This can be done by the DOBES Corpus
Manager, or by the DOBES team itself, using
LAMUS.
- Annotations for the Sessions can be uploaded
using LAMUS at a later stage.
27 Conclusions
- There is no single definitive workflow; one will
have to be worked out and agreed upon between
each project and the Corpus Manager.
- As technology advances, more and more can be done
directly in the field. Flash memory and hard disk
recording devices will simplify the workflow
significantly.