Title: Flexible Verification of MPEG-4 Stream in Peer-to-Peer CDN
1. Flexible Verification of MPEG-4 Stream in Peer-to-Peer CDN
ICICS 2004, Malaga, Spain
Tieyan Li, Yongdong Wu, Di Ma, Huafei Zhu, Robert Deng
InfoComm Security Department (ICSD), Institute for Infocomm Research (I2R)
27 Oct. 2004
2. Outline
- Packet-based Stream Authentication Schemes (P-SASs)
  - Why do P-SASs fail? And what did we find out?
  - Background on stream formats
- Unequal Loss Verification (ULV) scheme
  - ULP method
  - ULV scheme
  - Security and performance analysis
  - Showcasing our scheme
- Publishing streams in Peer-to-Peer CDN
  - The peer-to-peer CDN scenario
  - Publishing and retrieving
  - Imaginative usage modes (interactive, hierarchical)
- Conclusion and future work
3. Traditional Packet-based Stream Authentication Schemes (P-SASs)
- Hashing each packet
- Generating the hash tree
- Signing the root of the tree
- Amortizing hashes and signature over the whole packet group
[Figure: content split into packets P1, P2, ..., Pn]
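The four steps of a packet-based scheme can be sketched as follows. This is a minimal illustration, not the construction of any particular P-SAS: SHA-256 is an assumed hash function, and the `sign` helper is a hypothetical stand-in that uses an HMAC tag where a real scheme would use a digital signature. Amortization over the packets is omitted.

```python
import hashlib
import hmac


def h(data: bytes) -> bytes:
    """Hash helper (SHA-256 is an assumption, not prescribed by the slides)."""
    return hashlib.sha256(data).digest()


def merkle_root(leaf_hashes):
    """Root of a binary hash tree over a non-empty list of leaf hashes."""
    level = list(leaf_hashes)
    while len(level) > 1:
        if len(level) % 2:                 # odd count: pair the last node with itself
            level.append(level[-1])
        level = [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
    return level[0]


def sign(key: bytes, message: bytes) -> bytes:
    """Stand-in for a digital signature (assumption): an HMAC tag."""
    return hmac.new(key, message, hashlib.sha256).digest()


packets = [b"P1 payload", b"P2 payload", b"P3 payload", b"P4 payload"]
root = merkle_root([h(p) for p in packets])    # steps 1 and 2
signature = sign(b"demo-key", root)            # step 3
```

Because the root depends on every packet exactly as it was emitted, any later change to a packet changes the root and invalidates the signature.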
4. But when packets have a format
- In practice, most streams follow a specific format (e.g. JPEG2000, MPEG-4).
- As a result, a single packet may contain multiple descriptions from various sub-streams.
- Processing the whole packet as a single unit is not appropriate in some cases, such as error correction and authentication.
5. How are stream packets generated?
[Figure: pipeline with VS-A, VS-B, VS-C, MultiPlexer, packets P1, P2, ..., Pn, DeMUX]
Case 1: raw packets; no encoding/decoding, no authentication, no transcoding.
6. How are stream packets generated?
[Figure: pipeline with VS-A, VS-B, VS-C, FEC-A, FEC-B, FEC-C, MultiPlexer, packets P1, P2, ..., Pn, DeMUX, Decoding]
Case 2: raw packets; encoding/decoding, no authentication, no transcoding.
7. How are stream packets generated?
[Figure: pipeline with VS-A, VS-B, VS-C, FEC-A, FEC-B, FEC-C, MultiPlexer, AUTH, packets P1, P2, ..., Pn, DeMUX, Decoding, VRFY]
Case 3: raw packets; encoding/decoding and authentication, no transcoding.
8. How are stream packets generated?
[Figure: pipeline with AUTH, Transcoding, packets P1, P2, ..., Pn, DeMUX, Decoding, VRFY]
Case 4: raw packets; encoding/decoding, authentication and transcoding.
9. Why transcoding and transforming?
- Networks differ widely in bandwidth, error rate, end-system processing ability, QoS and policy.
- Transcoding means removing some layers of a stream (e.g. smaller packet size, lower quality) to avoid massive packet loss on narrow-bandwidth links.
- Transforming means enlarging or reducing the packet size without losing quality, e.g. sending twice as many packets, each half the size.
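The difference between the two operations can be made concrete with a toy sketch. The function names and the representation of layers and packets below are hypothetical, chosen only for illustration: transcoding drops layers (lossy), while transforming re-splits the same bytes into smaller packets (lossless).

```python
def transcode(layers, keep):
    """Keep only the first `keep` layers; the dropped layers are lost."""
    return layers[:keep]


def transform(packets, factor):
    """Split each packet into up to `factor` smaller packets; no byte is lost."""
    out = []
    for p in packets:
        size = max(1, -(-len(p) // factor))      # ceiling division, at least 1 byte
        out.extend(p[i:i + size] for i in range(0, len(p), size))
    return out


layers = ["base", "enhancement-1", "enhancement-2"]
low_quality = transcode(layers, 1)

packets = [b"AAAABBBB", b"CCCCDDDD"]
smaller = transform(packets, 2)                  # twice as many packets, half the size
```

Joining the transformed packets gives back the original byte stream, whereas the transcoded stream cannot be restored to full quality.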
10. Why did P-SASs fail?
- The proxy cannot store the stream as an ordinary file: it must keep the packets themselves, or else remember exactly how the packets were generated.
- The proxy cannot change the packets, which also means it cannot enlarge them or change the authentication data attached to each packet.
- The proxy cannot re-encode or decode the stream, since the encoding data is embedded in each packet.
- With so many can'ts, the proxy cannot be a good service provider; it runs out of money and cannot survive.
Authentication (vertically) based on packets is fading out. Authentication (horizontally) based on objects is on the way.
11. And what did we find out?
- For P-SASs, authentication can only be done after the packets are generated.
- Packets are only used for transmission at the network layer. Authentication should be done at a higher layer, which means it should happen before packets are generated.
- P-SASs authenticate all packet information, including the encoding information, which is not clean.
- What can we do?
  - Authenticate only the raw data, excluding encoding information (clean).
  - Authenticate objects; composing the objects is flexible.
  - Thus the proxy can provide good service and make big money.
12. Background on MPEG-4 stream format
An MPEG-4 presentation is divided into sessions including units of aural, visual, or audiovisual content, called media objects. A video sequence (or group, denoted VS) includes a series of video objects (VOs). Each VO is encoded into one or more video object layers (VOLs). Each layer includes information corresponding to a given level of temporal and spatial resolution, so that scalable transmission and storage are possible. Each VOL contains a sequence of 2D representations of arbitrary shapes at different time intervals, each referred to as a video object plane (VOP). Video object planes are divided further into macroblocks (MBs) of size 16 x 16. Each macroblock is encoded into six blocks B1, B2, ..., B6 of size 8 x 8 when the 4:2:0 format is applied.
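The count of six blocks per macroblock follows from 4:2:0 chroma subsampling: the 16 x 16 luma area gives four 8 x 8 blocks, and each of the two chroma components is halved in both directions, giving one 8 x 8 block each. The small calculation below reproduces this; the helper names are hypothetical, and the 352 x 288 frame is only an assumed example of a VOP bounding rectangle.

```python
def ceil_div(a, b):
    """Integer ceiling division."""
    return -(-a // b)


def blocks_per_macroblock(mb=16, block=8, chroma_subsampling=(2, 2)):
    """Luma blocks plus the blocks of the two subsampled chroma components."""
    luma = (mb // block) ** 2
    chroma_w = mb // chroma_subsampling[0]
    chroma_h = mb // chroma_subsampling[1]
    chroma = 2 * (chroma_w // block) * (chroma_h // block)
    return luma + chroma


def macroblocks_in_rectangle(width, height, mb=16):
    """Macroblocks needed to cover a width x height rectangle."""
    return ceil_div(width, mb) * ceil_div(height, mb)


blocks = blocks_per_macroblock()                      # 4 luma + 2 chroma
mbs = macroblocks_in_rectangle(352, 288)              # 22 columns x 18 rows
total_blocks = blocks * mbs
```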
13. Background on ULP packet format
- In a video sequence (VS), the VOs, VOLs, VOPs, MBs and blocks are arranged in a predefined order.
- The objects are encapsulated into n packets vertically.
- Data units in the same column form a packet, while data units in the same row form a layer.
- The data are protected unequally: high-priority layers carry more parity units, while low-priority layers carry fewer. The shaded units represent parity units for the media units.
[Figure: packet format of the Unequal Loss Protection (ULP) scheme]
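The row and column layout can be sketched with a toy model. The sketch assumes that each layer is protected by an erasure code in which any k of the n units in a row suffice to rebuild that row (as with Reed-Solomon codes); the actual coding is not performed here, and the function names and the example parameters are hypothetical.

```python
def ulp_layout(n, data_units_per_layer):
    """One row per layer: 'D' marks a data unit, 'P' a parity unit."""
    rows = []
    for k in data_units_per_layer:
        if not 0 < k <= n:
            raise ValueError("each layer needs between 1 and n data units")
        rows.append(["D"] * k + ["P"] * (n - k))
    return rows


def recoverable_layers(received, data_units_per_layer):
    """Layers that can be rebuilt when `received` of the n packets arrive,
    assuming any k units of a row are enough to decode it."""
    return [i for i, k in enumerate(data_units_per_layer) if received >= k]


n = 6
k_per_layer = [2, 4, 6]        # layer 0 has the highest priority and the most parity
rows = ulp_layout(n, k_per_layer)
packets = list(zip(*rows))     # column j holds the units carried by packet j

after_loss = recoverable_layers(4, k_per_layer)   # two of six packets lost
```

With four of six packets received, the two higher-priority layers are still decodable while the lowest-priority layer, which has no parity, is not.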
14. More background on Merkle hash tree
15. ULV scheme
- Flexible transcoding of a partial tree, e.g. the sub-tree VO2-VO3 can be removed.
- Flexible selection of a partial tree, e.g. the tree from the plane layer up to the root.
Step 1: Generating authentication data
Step 2: Amortizing authentication data
Step 3: Transcoding and transforming
Step 4: Verifying authentication data
16. Generating authentication data
- Formulas for calculating the MHT hash values at
different layers
- The object group hash is given as
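One common way to define hash values layer by layer, offered here as an assumption rather than as the scheme's exact formulas, is to hash each leaf directly and to hash every inner node over the concatenation of its children's hashes. The nested-list representation of the object group and the leaf contents below are hypothetical.

```python
import hashlib


def h(data: bytes) -> bytes:
    """Hash helper (SHA-256 is an assumption)."""
    return hashlib.sha256(data).digest()


def node_hash(node):
    """Leaf (bytes): hash of the data. Inner node (list): hash of the
    concatenated hashes of its children."""
    if isinstance(node, bytes):
        return h(node)
    return h(b"".join(node_hash(child) for child in node))


# Toy object group: group -> VOs -> VOLs -> VOPs (leaf bytes)
vo1 = [[b"vop-1a", b"vop-1b"], [b"vop-1c"]]    # a VO with two VOLs
vo2 = [[b"vop-2a"]]                            # a VO with one VOL
group = [vo1, vo2]

group_hash = node_hash(group)
```

A production design would normally add distinct prefixes for leaves and inner nodes so that the two cannot be confused; that detail is left out to keep the sketch short.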
17. Amortizing authentication data
- After generating the authentication data, we encode it with ECC encoders and amortize it over the packets.
- Assumption: the authentication data is treated uniformly, with the same encoding rate as the highest-priority layer, so that it can be recovered with the highest priority.
- Next, we append the integrity unit and the signature unit to the packets.
18. Transcoding and transforming
- Transcoding means we preserve the important parts of an MHT and truncate the unimportant parts of the tree to allow low-bandwidth transmission.
- Transforming means we preserve all the content in a stream (group) and only enlarge or reduce the packet sizes to adapt to different network conditions.
New authentication data is generated. It includes the original signature, the new integrity unit and the new signature (i.e. the proxy's signature on the sub-tree).
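Truncating part of a hash tree while keeping the original signature verifiable can be sketched as follows. The idea shown is the generic one for hash trees: a removed sub-tree is replaced by its hash, so the root value is unchanged. The `Pruned` class, the nested-list tree and SHA-256 are assumptions made for this illustration, and the proxy's own signature is not modelled.

```python
import hashlib
from dataclasses import dataclass


def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


@dataclass(frozen=True)
class Pruned:
    """Placeholder for a removed sub-tree; it carries only that sub-tree's hash."""
    digest: bytes


def node_hash(node):
    if isinstance(node, Pruned):
        return node.digest
    if isinstance(node, bytes):
        return h(node)
    return h(b"".join(node_hash(child) for child in node))


def prune(tree, index):
    """Copy of `tree` in which the child at `index` is replaced by its hash."""
    return [Pruned(node_hash(c)) if i == index else c for i, c in enumerate(tree)]


vo1 = [b"vop-1a", b"vop-1b"]
vo2 = [b"vop-2a", b"vop-2b"]
vo3 = [b"vop-3a"]
original = [vo1, vo2, vo3]

transcoded = prune(prune(original, 1), 2)      # drop VO2 and VO3, keep their hashes
same_root = node_hash(transcoded) == node_hash(original)
```

The receiver of the transcoded tree can still recompute the signed root from VO1 and the two stored hashes, without ever seeing VO2 or VO3.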
19. Verifying
- Verifying is the reverse process of generating.
- At least k out of n packets must be received.
  - If too few packets arrive, ignore the stream.
  - Otherwise the authentication data can be decoded.
- Verify the signatures.
  - If verification fails, discard the stream.
  - If it succeeds, continue.
- Verify the integrity units.
  - Reconstruct the hash tree from the collected packets.
  - Compare the reconstructed hash tree with the received one, from root to bottom.
  - If they differ, discard the stream.
  - If they match, accept the stream as authenticated.
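The decision flow of the verifier can be sketched as a single function. Several simplifications are assumed: erasure decoding is skipped (the hashes and the signature tag are taken as already recovered), the hash tree has a single level, and an HMAC tag stands in for a real digital signature. All names are hypothetical.

```python
import hashlib
import hmac


def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def root_of(unit_hashes):
    # One-level hash tree for brevity; a real MHT would have more levels.
    return h(b"".join(unit_hashes))


def verify_stream(received, k, unit_hashes, tag, key):
    """received maps unit index -> payload for the units that arrived.
    unit_hashes and tag stand for the already-decoded authentication data."""
    if len(received) < k:                          # too few packets
        return "ignored"
    expected_tag = hmac.new(key, root_of(unit_hashes), hashlib.sha256).digest()
    if not hmac.compare_digest(expected_tag, tag): # signature check
        return "discarded"
    for index, payload in received.items():        # integrity check per unit
        if h(payload) != unit_hashes[index]:
            return "discarded"
    return "authenticated"


units = [b"unit-0", b"unit-1", b"unit-2", b"unit-3"]
unit_hashes = [h(u) for u in units]
key = b"demo-key"
tag = hmac.new(key, root_of(unit_hashes), hashlib.sha256).digest()

result = verify_stream({0: units[0], 2: units[2], 3: units[3]}, 3, unit_hashes, tag, key)
```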
20. Security analysis
- First, the security of the scheme relies on the security of the MHT.
- Second, the probability of verification:
  - Signature verification (yes or no)
  - Integrity verification (xxx verified)
- The base layer (with the most FEC and the highest priority) is assumed recoverable.
- Assume an erasure channel with independent packet loss.
- The packet loss rate is p, and n packets are transmitted in total.
- The probability that exactly k packets are received is $\binom{n}{k} p^{n-k} (1-p)^{k}$.
- Verification can only be done on recovered packets, so the verification rate is T'/T, where T' is the recovered tree and T is the tree reconstructed from the recovered authentication data.
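Under the independent-loss assumption, the number of received packets follows a binomial distribution, so the chance of receiving enough packets to decode can be computed directly. The sketch below evaluates the probability of receiving exactly k packets and, by summation, at least k packets; the function names and the numeric example are illustrative only.

```python
from math import comb


def p_exactly(n, k, p):
    """Probability that exactly k of n packets arrive when each is lost
    independently with probability p."""
    return comb(n, k) * p ** (n - k) * (1 - p) ** k


def p_at_least(n, k, p):
    """Probability that at least k of n packets arrive."""
    return sum(p_exactly(n, j, p) for j in range(k, n + 1))


# Example: 10 packets, 10% loss, 8 packets needed to decode a layer
chance = p_at_least(10, 8, 0.1)
```

A layer with more parity has a smaller k, so the same call with a lower k shows how much more likely the high-priority layers are to survive.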
21. Performance analysis
- So far we can only estimate the computational cost of the cryptographic primitives. There is ample room for further experiments with the scheme.
- Some observations:
  - A signature is generated once initially, and once per transcoding operation.
  - Signature verification is much faster than signature generation.
  - Verifying an MHT over n data items needs about 2n hash operations.
  - Tree construction is needed by both the generation and the verification process.
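The estimate of roughly 2n hash operations can be checked by counting calls while building a binary hash tree: n hashes for the leaves plus n - 1 for the inner nodes. The counting helper below is a hypothetical sketch that assumes a binary tree in which an unpaired node is promoted to the next level unchanged.

```python
import hashlib


def count_hashes(n):
    """Build a binary hash tree over n dummy items (n >= 1) and count hash calls."""
    calls = 0

    def h(data: bytes) -> bytes:
        nonlocal calls
        calls += 1
        return hashlib.sha256(data).digest()

    level = [h(i.to_bytes(4, "big")) for i in range(n)]       # n leaf hashes
    while len(level) > 1:
        nxt = [h(level[i] + level[i + 1]) for i in range(0, len(level) - 1, 2)]
        if len(level) % 2:
            nxt.append(level[-1])          # unpaired node is promoted unchanged
        level = nxt
    return calls


cost_8 = count_hashes(8)
cost_5 = count_hashes(5)
```

Each inner hash merges two nodes into one, so reducing n leaves to a single root always takes n - 1 inner hashes, for 2n - 1 in total.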
22. Showcase the ULV scheme
The foreground of each image is an English word. The foreground words form a sentence.
Syntactic structure of image objects, where VF is the value of the foreground object and VB is the value of the background object.
23. Showcase the ULV scheme
Only the foreground of each image is received.
Syntactic structure of received image objects: the sub-tree VB is replaced with a hash value HVB.
24. The Peer-to-Peer CDN scenario
- Why do we need a Peer-to-Peer CDN instead of other retrieval models?
  - P2P CDN, or super distribution, will be the main trend in the near future.
  - Other models, such as centralized or distributed delivery, are established models.
  - Thus the transcoding operation described in this paper could be meaningful.
25. Publishing and retrieving
Suppose a stream S = VS1, VS2, ..., VSn. Its authentication data is HS, , where HS = hVS1, hVS2, ..., hVSn, and VS1, VS2, ..., VSn
Publishing and retrieving:
Key hN(S), - ?? content of S
Key hN(S), A ?? content of
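Publishing under a hash-derived key and checking the content on retrieval can be sketched with an in-memory stand-in for the peer-to-peer lookup service. Everything here is an assumption made for illustration: the `ToyOverlay` class, the choice to derive the key from a stream name, and the use of per-group SHA-256 hashes as authentication data. The signature that would protect the hash list is omitted.

```python
import hashlib


def h(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()


class ToyOverlay:
    """In-memory stand-in for a peer-to-peer lookup service (assumption)."""

    def __init__(self):
        self._store = {}

    def publish(self, name, groups):
        key = h(name.encode())                  # key derived from the stream name
        auth = [h(g) for g in groups]           # one hash per group
        self._store[key] = (list(groups), auth)
        return key

    def retrieve(self, name):
        return self._store.get(h(name.encode()))


def check(groups, auth):
    """True if every retrieved group matches its published hash."""
    return len(groups) == len(auth) and all(h(g) == a for g, a in zip(groups, auth))


overlay = ToyOverlay()
key = overlay.publish("demo-stream", [b"VS1 data", b"VS2 data", b"VS3 data"])
groups, auth = overlay.retrieve("demo-stream")
ok = check(groups, auth)
```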
26. Imaginative usage modes
- The only limitation on using the scheme is your imagination.
- Single source vs. multiple sources
  - Flexible verification of multiple sources
- Modifying layers
  - The scheme is as scalable as the stream format.
- An interactive model
  - An online verification server is needed.
- A hierarchical model
  - Every portion is protected, as well as composable.
  - It needs a reliable charging model.
27. Conclusions and future work
- We identified an important scenario in which all packet-based stream authentication schemes fail.
- We proposed a secure and efficient solution for this scenario.
- We pointed out some potential extensions of our basic scheme.
- Much future work remains:
  - Performance analysis is not done, but is necessary to learn the real differences between these two kinds of solution.
  - A demo is not done, but is necessary to illustrate the scheme.
  - DRM is not done, but is necessary to understand how the scheme interferes with existing DRM systems and to improve the DRM specifications.
  - More unknowns remain, such as user experience and how to choose cryptographic primitives.
28. Thank you! Q&A