Abstract
This article addresses one major
security demand in digital marketplaces: copyright protection. Our goal
is to present interactive tools that strengthen producers' acceptance
of digital watermarking techniques, so that they can offer their data in a more secure
way in the digital marketplace. Though security is recognized as an important
issue in multimedia, it is, ironically, rarely presented using the
new media themselves. Our work is motivated by this lack of media support, and our
focus is on the definition of new design criteria for digital watermarking
in multimedia environments. The paper gives a short introduction to digital
watermarking techniques, their robustness criteria, the security risks
and the visual interactive handling support essential for user acceptance.
Furthermore, we describe four prototype tools as parts of an interactive
multimedia watermarking environment for digital data: the two LabelingEditors
for images and video, the 3D-Watermark and the ObjectTrackingEditor.
This article addresses one major security demand in digital marketplaces: copyright protection. Our goal is to present interactive tools that strengthen producers' acceptance of digital watermarking techniques, so that they can offer their data in a more secure way in the digital marketplace.
Though security is recognized as an important issue in multimedia, it is, ironically, rarely presented using the new media themselves. Generally, common multimedia security mechanisms are not realized with multimedia tools that apply security. Usually the security algorithms are seen as background processes, invisible to the user. Our work is motivated by this lack of media support, and our focus is on the definition of new design criteria for digital watermarking in multimedia environments.
We begin with a short introduction to digital watermarking and explain what a digital watermark is. We present existing general technical approaches to digital watermarking and show application areas where digital watermarks can be used for copyright protection, customer information and embedding metadata. Besides the advantages, we show the acceptance problems and risks of existing techniques: robustness, quality loss and the lack of visual interactive handling support for watermark embedding and retrieval. Visual interaction can strengthen acceptance of the new technology and help overcome weaknesses of the watermarking algorithms, such as limited robustness and quality loss, by supporting the selection of appropriate watermarking features.
Furthermore, we describe four prototype
tools as parts of an interactive multimedia watermarking environment for
digital data: the two LabelingEditors for images and video, the 3D-Watermark
and the ObjectTrackingEditor. The LabelingEditor is a graphical interface
that supports the insertion of different types of label information
into video streams, as well as the retrieval of such information. The 3D-Watermark
gives an intuitive view of the embedded watermark. It shows the quality
loss and provides information about the robustness and efficiency of the
applied watermarking algorithm. The ObjectTrackingEditor is a tool that supports
direct watermarking of automatically or manually selected objects in multimedia
data. Finally, we assess our achievements so far, and provide an overview
of further work.
In summary, digital watermarking is intended to enable proof of ownership of copyrighted material, to detect the originator of illegally made copies, to monitor the usage of the copyrighted multimedia data and to analyze the spread of the data over networks and servers. It should be noted, though, that the embedded signaling can be used for a variety of purposes other than copyright control. In a distributed video production environment, for example, watermarking can be applied to integrate different kinds of data into the video material, such as customer information, or metadata as a format-independent part of the video, e.g. information about cuts or scene descriptions.
Requirements - robustness criteria: Currently we have concentrated our work on digital image and video data. There are a number of characteristics and requirements for watermarking mechanisms:
Transparent watermarking: Additionally, our latest experiments have shown that the user interaction for labeling image data is difficult, non-uniform and hard to comprehend. The user has no way to understand or interact with the watermarking process. He cannot verify the robustness or see what happens to his work, and he cannot measure or influence the quality loss. Important parts of the video cannot be selected separately to strengthen the robustness of the watermark. If only the important parts were watermarked, cutting out unimportant parts would not cause watermark retrieval to fail. Because watermarking works directly on the media material, the authors and producers should be involved in the labeling process, both to observe what happens to their data and to dispel acceptance problems with the labeling process. Thus, one major part of our work is the design of visual interaction tools for labeling and the design of metaphors to visualize labeled multimedia data.
Our goal is to design interactive
watermarking environments as an enabling technology for the use of
watermarking technologies in digital environments and to make the security
algorithms more attractive and usable.
Labeling and retrieval: Currently, two different kinds of watermarking are used and can be configured in the editor: an amplitude-modulation-based and a DCT-based algorithm, [13], [10]. The "File" menu offers the open function for original images or already watermarked images. The original image is presented on the left and the watermarked image on the right, see figure 1.
Figure 1: LabelingEditor for single images
Embedding and retrieval are controlled by the two control buttons. The scrolling area shows the detailed steps of the embedding or retrieval process. If the original image is available, the copyright holder can use the radio buttons to switch between two views: the watermarked image or the difference view. The difference view gives a more illustrative description of what happened to the original image. Black areas stand for no changes made during the watermarking process; the other colors show the changes that were made. A more detailed view can be seen in the attached demo video.
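The difference view described above can be illustrated with a minimal sketch. It assumes images are given as nested lists of (r, g, b) tuples; the function names `difference_view` and `changed_fraction` are hypothetical and are not taken from the LabelingEditor itself.

```python
def difference_view(original, watermarked):
    """Return per-pixel absolute differences of two equally sized RGB images.

    Images are lists of rows; each pixel is an (r, g, b) tuple of 0..255 ints.
    A result pixel of (0, 0, 0) is black and means "unchanged".
    """
    if len(original) != len(watermarked):
        raise ValueError("images must have the same height")
    diff = []
    for row_o, row_w in zip(original, watermarked):
        if len(row_o) != len(row_w):
            raise ValueError("images must have the same width")
        diff.append([tuple(abs(a - b) for a, b in zip(po, pw))
                     for po, pw in zip(row_o, row_w)])
    return diff


def changed_fraction(diff):
    """Fraction of pixels that are not black in the difference view."""
    pixels = [p for row in diff for p in row]
    changed = sum(1 for p in pixels if any(p))
    return changed / len(pixels)


# Tiny hand-made example: two of the four pixels were modified.
original = [[(10, 20, 30), (40, 50, 60)],
            [(70, 80, 90), (100, 110, 120)]]
watermarked = [[(10, 20, 30), (40, 50, 63)],
               [(70, 78, 90), (100, 110, 120)]]
diff = difference_view(original, watermarked)
```

In practice the differences are usually small, so a viewer would probably amplify them before display; that scaling step is left out here.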
Robustness test: Coupled with embedding and
retrieval, the editor provides basic robustness tests: geometrical transformations
such as rotation, scaling and cutting, as well as lossy compression and low pass filtering.
The editor performs the distortions and simultaneously checks whether the watermark
information can be retrieved correctly. The behavior is demonstrated in
the demo video.
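The "distort, then try to retrieve" loop can be sketched as a small test harness. Everything below is an assumption made for illustration: the toy parity-based scheme (`embed`, `retrieve`, `STEP`) stands in for the real amplitude-modulation and DCT algorithms, and only cutting, a brightness shift and a 3x3 mean filter are covered, not rotation, scaling or lossy compression.

```python
STEP = 16  # quantization step of the toy scheme (assumption)
MAX_Q = 255 // STEP


def embed(image, bits):
    """Toy embedding: pixel (x, y) carries bit (x + y) % len(bits) in the
    parity of its quantization index."""
    out = []
    for y, row in enumerate(image):
        new_row = []
        for x, v in enumerate(row):
            bit = bits[(x + y) % len(bits)]
            q = min(round(v / STEP), MAX_Q)
            if q % 2 != bit:
                q = q + 1 if q < MAX_Q else q - 1
            new_row.append(q * STEP)
        out.append(new_row)
    return out


def retrieve(image, n_bits):
    """Majority vote over all pixels that carry the same bit."""
    votes = [0] * n_bits
    for y, row in enumerate(image):
        for x, v in enumerate(row):
            parity = round(v / STEP) % 2
            votes[(x + y) % n_bits] += 1 if parity else -1
    return [1 if v > 0 else 0 for v in votes]


def crop(image, height, width):
    """Cut: keep only the top-left height x width region."""
    return [row[:width] for row in image[:height]]


def brighten(image, delta):
    return [[min(255, v + delta) for v in row] for row in image]


def box_blur(image):
    """3x3 mean filter as a simple low-pass filter; borders are clamped."""
    h, w = len(image), len(image[0])
    out = []
    for y in range(h):
        row = []
        for x in range(w):
            acc = [image[min(max(y + dy, 0), h - 1)][min(max(x + dx, 0), w - 1)]
                   for dy in (-1, 0, 1) for dx in (-1, 0, 1)]
            row.append(sum(acc) // 9)
        out.append(row)
    return out


def robustness_report(marked, bits, distortions):
    """Apply each distortion and record whether the label is still readable."""
    return {name: retrieve(fn(marked), len(bits)) == bits
            for name, fn in distortions.items()}


bits = [1, 0, 1, 1, 0, 0, 1, 0]
image = [[(7 * x + 13 * y) % 256 for x in range(32)] for y in range(32)]
marked = embed(image, bits)
report = robustness_report(marked, bits, {
    "none": lambda img: img,
    "cut": lambda img: crop(img, 16, 16),
    "brighten": lambda img: brighten(img, 3),
    "low-pass": box_blur,
})
```

The report maps each distortion name to a boolean, which is the kind of pass or fail feedback an editor could show next to each test. Whether the toy label survives the low-pass filter depends on the image content, so that entry is left for the reader to inspect.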
The following can be seen:
To give more information about the
visual distortions, the 3D-Watermark can visualize more than the absolute
differences. As an alternative, the 3D-Watermark presents the differences
separated according to the RGB or HSV model, figure 3:
a) red-channel
b) green-channel
c) blue-channel
d) hue-3D-watermark
e) saturation-3D-watermark
f) value-3D-watermark
Figure 3 a)-f): RGB and HSV models
The RGB model is hardware-oriented. The red channel was used less than the green and blue channels. By contrast, Smith's HSV (hue, saturation, value) model, [17], is user-oriented, being based on the intuitive appeal of the artist's tint, shade, and tone. The RGB watermarks show a smoother relief than the HSV watermarks. The hue and saturation views in particular show extreme alterations. Our next step is to couple the watermark embedding and retrieval steps directly with the 3D-Watermark output, to obtain visual feedback on what happens during these processes and to see which values are influenced or needed for retrieval.
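As a rough illustration of the channel-separated views, the following sketch computes six difference maps, one per RGB and HSV channel, which could serve as the height fields of the 3D reliefs. The function name `channel_differences` and the nested-list image format are assumptions; the HSV conversion uses Python's standard `colorsys` module.

```python
import colorsys


def channel_differences(original, watermarked):
    """Return six difference maps keyed 'r', 'g', 'b', 'h', 's', 'v'.

    RGB differences are in 0..255; HSV differences are in 0..1.
    Hue is treated as circular, so the shorter way around is used.
    """
    maps = {name: [] for name in "rgbhsv"}
    for row_o, row_w in zip(original, watermarked):
        rows = {name: [] for name in maps}
        for po, pw in zip(row_o, row_w):
            for name, a, b in zip("rgb", po, pw):
                rows[name].append(abs(a - b))
            hsv_o = colorsys.rgb_to_hsv(*(c / 255 for c in po))
            hsv_w = colorsys.rgb_to_hsv(*(c / 255 for c in pw))
            for name, a, b in zip("hsv", hsv_o, hsv_w):
                d = abs(a - b)
                if name == "h":
                    d = min(d, 1 - d)
                rows[name].append(d)
        for name in maps:
            maps[name].append(rows[name])
    return maps


# One-pixel example: only the blue channel is changed by the "watermark".
original = [[(200, 100, 50)]]
watermarked = [[(200, 100, 60)]]
maps = channel_differences(original, watermarked)
```

In this one-pixel example the change is confined to blue in the RGB view, while in the HSV view it appears in both hue and saturation and leaves value untouched.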
Because of the statistical properties
of a watermark, the 3D-Watermark cannot visualize all embedded data. It
is possible that information is embedded although no visible distortions
are observed, because the image data already had the required
properties. This disadvantage does not influence the review of the quality
loss and the visual distortions, but it can affect the review of the robustness
based on the intensity of the 3D relief. To check this, the LabelingEditor
can be used to cut out the appropriate sections and try to retrieve the
watermark again. Compared with the image background, the example shows few
distortions in the face region, so the watermark appears not to be robust against
cutting out the face region, figure 4:
Figure 4: 3D-face-region
The results in the LabelingEditor show that there is indeed a security weakness, figure 5:
Figure 5: retrieval in the face region
Our experiments showed that this drawback
rarely occurs. Nevertheless, to improve the 3D-Watermark, we will offer
an additional view of the pixels of the image that are actually used for retrieval.
For security reasons, this option is only accessible to copyright holders.
Of course, this additional view cannot provide information about the visual
quality loss compared to the original.
We now discuss the components of the LabelingEditor and their functionalities. We begin with the Editor, and then describe the Visualization Component.
The editing tool: The editing tool actually embodies two distinct features. For the copyright holder (the annotator) it provides the facilities for labeling. For security reasons this feature is only available after log-in with the secret owner key. All other users, including the annotator, can use the editing tool only as a retrieval interface.
The tool offers different views of the copyright, customer or metadata for each frame of the video. The labeling process can be configured dynamically by selecting the watermarking mechanism: a DCT-based or an amplitude-modulation-based algorithm. It must be stressed, however, that the current stage should merely be understood as a first step towards a more complete watermarking scheme that supports all kinds of existing watermarking techniques. At present, we are also working on a new watermarking mechanism that seeks to retain the advantages of the existing techniques while overcoming a number of their shortcomings, e.g. by increasing the amount of label data that can be embedded into image data.
In order to insert all information
from the Editor, we use a special syntax description and data reduction technique.
Before the label data is embedded, the information is separated and
transformed into the watermarking syntax, as described in figure 7.
Figure 7: labeling process
To separate public from private data, the private data P is encrypted with the secret owner key k and the encryption function F to obtain the final watermark W: W = F_k(P). The public data remains unencrypted and accessible. To reduce the data volume, the label data is Huffman-encoded. We use watermarking techniques in which the watermark is inserted into perceptually significant regions of the image, [10]. Depending on the characteristics of the image I, the watermarking information W is embedded and the labeled image I_W is created:
I_W = I + f(W, I) (public watermarking)
I_W = I + f(F_k(P), I) (private watermarking)
Depending on the watermarking technique the function f modifies the image data for embedding the label.
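The pipeline from label data to labeled image can be sketched end to end under several loudly stated assumptions. The order "Huffman-encode first, then encrypt" is a guess, `toy_encrypt` is a SHA-256-based XOR keystream used purely as a stand-in for the unspecified encryption F (not a vetted cipher), and `embed` is a simple additive f(W, I) whose read-back needs the original image. None of these are the mechanisms of [10] or [13], and the label text and key are made up.

```python
import hashlib
import heapq
from collections import Counter


def huffman_code(text):
    """Build a prefix code {symbol: bitstring} from symbol frequencies."""
    freq = Counter(text)
    if len(freq) == 1:
        return {next(iter(freq)): "0"}
    heap = [(n, i, {sym: ""}) for i, (sym, n) in enumerate(sorted(freq.items()))]
    heapq.heapify(heap)
    counter = len(heap)  # unique tie-breaker so dicts are never compared
    while len(heap) > 1:
        n1, _, c1 = heapq.heappop(heap)
        n2, _, c2 = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in c1.items()}
        merged.update({s: "1" + c for s, c in c2.items()})
        heapq.heappush(heap, (n1 + n2, counter, merged))
        counter += 1
    return heap[0][2]


def huffman_encode(text, code):
    return [int(b) for ch in text for b in code[ch]]


def huffman_decode(bits, code):
    inverse = {v: k for k, v in code.items()}
    out, cur = [], ""
    for b in bits:
        cur += str(b)
        if cur in inverse:
            out.append(inverse[cur])
            cur = ""
    return "".join(out)


def toy_encrypt(bits, key):
    """Stand-in for F_k: XOR with a hash-derived keystream (its own inverse)."""
    stream, block = [], 0
    while len(stream) < len(bits):
        digest = hashlib.sha256(key + block.to_bytes(4, "big")).digest()
        for byte in digest:
            stream.extend((byte >> i) & 1 for i in range(8))
        block += 1
    return [b ^ k for b, k in zip(bits, stream)]


def embed(pixels, w_bits):
    """I_W = I + f(W, I): shift pixel i up (bit 1) or down (bit 0).
    The step grows with the pixel value, so f depends on I as well as W."""
    if len(w_bits) > len(pixels):
        raise ValueError("label does not fit into the image")
    out = list(pixels)
    for i, bit in enumerate(w_bits):
        step = 1 + pixels[i] // 64
        out[i] = pixels[i] + (step if bit else -step)
    return out


def extract(original, marked, n_bits):
    """Read-back for this toy scheme; it needs the original image."""
    return [1 if marked[i] > original[i] else 0 for i in range(n_bits)]


label = "owner=example studio;year=1998"           # hypothetical private data P
key = b"secret owner key"                          # hypothetical key k
code = huffman_code(label)
p_bits = huffman_encode(label, code)               # reduced label data
w_bits = toy_encrypt(p_bits, key)                  # W = F_k(P)
pixels = [(37 * i) % 200 + 20 for i in range(len(w_bits))]  # flattened image I
marked = embed(pixels, w_bits)                     # I_W = I + f(F_k(P), I)
recovered = huffman_decode(
    toy_encrypt(extract(pixels, marked, len(w_bits)), key), code)
```

For public watermarking the `toy_encrypt` step would simply be skipped, so that W is the Huffman-encoded label itself.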
The label can be integrated into a single frame of a video, or can be spread over all frames. For the latter, though, a removal of frames might lead to a partial or complete loss of the label data. However, the advantage of this approach is that more data can be embedded.
For a transparent labeling process, the copyright holder gets the pixel positions where the label is embedded into the video frames.
The visualization component:
The video visualization component provides the view inside the video sequence (see the left window in figure 6). We use a video visualization in the form of a 3D-cube, [18]. The frames of a selected video are displayed as a floating 3D block, where the current image is represented as full image, while older images form the lateral surface of the cube. Thus, the 3D-Cube represents temporal and content aspects of the video sequence. Features, such as editing rhythm, shot boundaries, camera motion, etc., are clearly visible in the pattern appearing on the surface of the cube. The user can stroke across the cube with the cursor to access any part of the video, even a single frame, like thumbing through pages of a flip book.
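How the faces of such a cube can be assembled from frames, and why shot boundaries show up on its surface, can be sketched as follows. The functions `cube_faces` and `shot_boundaries`, the grayscale nested-list frame format and the threshold value are illustrative assumptions and are not taken from the DiVidEd implementation in [18].

```python
def cube_faces(frames):
    """Return (front, top, side) faces of a video cube.

    front: the current (latest) frame, shown as a full image
    top:   one row per frame, taken from the first row of each frame
    side:  one column per frame, taken from the last column of each frame
    """
    if not frames:
        raise ValueError("need at least one frame")
    front = frames[-1]
    top = [list(frame[0]) for frame in frames]
    side = [[row[-1] for row in frame] for frame in frames]
    return front, top, side


def shot_boundaries(top_face, threshold=50):
    """Indices of frames whose border row differs strongly from the previous one.

    A hard cut appears as a visible seam on the cube surface; here it is
    detected as a large mean absolute difference between neighboring rows.
    """
    cuts = []
    for i in range(1, len(top_face)):
        prev, cur = top_face[i - 1], top_face[i]
        mean_diff = sum(abs(a - b) for a, b in zip(prev, cur)) / len(cur)
        if mean_diff > threshold:
            cuts.append(i)
    return cuts


# Six tiny 4x4 grayscale frames: a dark shot followed by a bright shot.
dark = [[10] * 4 for _ in range(4)]
bright = [[200] * 4 for _ in range(4)]
frames = [dark] * 3 + [bright] * 3
front, top, side = cube_faces(frames)
cuts = shot_boundaries(top)
```

Stroking across the cube then amounts to mapping a cursor position on the top or side face back to the frame index of the row or column under the cursor.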
The Visualization Component is synchronized with the Editor. This means that while the user navigates through the VideoCube, the editor shows the annotated information. The Visualization Component is activated after a user has specified a query in the Editor. The retrieved video frames are highlighted with a distinct color to indicate their position in the video sequence.
The visualization component can be
exchanged to offer alternative presentations, e.g., to the traditional
video player or a linear two dimensional presentation.
Our goal is to support transparent and selective labeling. This means that characteristic objects of the multimedia data are pre-selected with an automatic object tracking algorithm based on the MPEG-4 standard, [16]. Such object tracking relies on "object-based" descriptions of video sequences in terms of semantic relations between objects or regions. The description of video objects may be performed automatically, semi-automatically or manually, i.e. as part of the video production process. An automatic process is described in [19]. Semi-automatic techniques use interactively pre-selected view points to recognize an object and use these points to perform object tracking, [15]. Manual techniques require user interaction to find objects and perform object tracking.
Once video objects have been located
in a video sequence, a tracking procedure can start to calculate the positions
of the video objects in successive frames and then find the trajectory
of the objects throughout the whole video. Content-based interactivity
is thus based on the data structure, and independent of the coding techniques
used for each accessible unit. In this way, the user is provided with
suggestions about video objects that might need watermarking, based on
a pre-parsing of the video. The actual labeling process is not started
before the user has agreed to the selection, which gives him or her
the possibility to deselect parts, or to select additional parts of the
video according to his or her interest. A possible outcome of the object identification
process is suggested in figure 8, where the left frame shows the original
image and the right frame the identified objects.
Figure 8 identification of objects
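The step from located objects to a trajectory can be illustrated with a deliberately simple sketch. It assumes that an earlier segmentation stage has already produced one binary mask per frame for a single object; the helper names `centroid`, `bounding_box` and `trajectory` are hypothetical, and this is plain centroid tracking, not the fuzzy contour refinement of [19].

```python
def centroid(mask):
    """Center of mass (x, y) of the non-zero pixels, or None if the mask is empty."""
    points = [(x, y) for y, row in enumerate(mask)
              for x, v in enumerate(row) if v]
    if not points:
        return None
    n = len(points)
    return (sum(x for x, _ in points) / n, sum(y for _, y in points) / n)


def bounding_box(mask):
    """(min_x, min_y, max_x, max_y) of the object: a candidate region to watermark."""
    points = [(x, y) for y, row in enumerate(mask)
              for x, v in enumerate(row) if v]
    if not points:
        return None
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    return (min(xs), min(ys), max(xs), max(ys))


def trajectory(masks):
    """Object position in each successive frame."""
    return [centroid(m) for m in masks]


def make_mask(offset, width=6, height=4):
    """Synthetic frame: a 2x2 object whose left edge sits at column `offset`."""
    return [[1 if 1 <= y <= 2 and offset <= x <= offset + 1 else 0
             for x in range(width)] for y in range(height)]


masks = [make_mask(t) for t in range(3)]  # the object moves one pixel per frame
path = trajectory(masks)
boxes = [bounding_box(m) for m in masks]
```

The per-frame boxes are the kind of suggestion that could be presented to the user for confirmation or deselection before any labeling starts.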
For our automatic object tracking in video sequences we use fuzzy contour refinement, developed by Steudel (1997), [19]. This algorithm provides good results for different kinds of test sequences compared to ordinary motion-based approaches. For the semi-automatic object detection we use edge detection algorithms based on the Canny technique, [4]. Currently we combine the object-tracking algorithm with our watermarking techniques described in Dittmann et al. (1998), [10]. We apply these techniques in the following way:
Figure 9: Watermarking editor with object identification
Our new watermarking mechanism supports
direct watermarking of automatically, semi-automatically or manually selected
objects. We combine the advantages of Caronni's approach, [5], and
the Fridrich approach, [12], which tag rectangles by modulating the
brightness of the objects for later retrieval, with the approach of Koch et
al. (1995), [14], which manipulates DCT coefficients to embed robust labels.
The retrieval process starts with a search inside the data and tries to
find watermarked objects by searching for a typical watermarking pattern.
All objects are then checked with an equivalent retrieval process and
the embedded information is retrieved.
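To make the idea of DCT coefficient manipulation concrete, the following sketch hides one bit in an 8x8 block by enforcing an order between two mid-frequency coefficients. It is a simplified illustration in the spirit of [14], not the combined mechanism described above; the coefficient positions `POS_A` and `POS_B`, the `MARGIN` value and all function names are assumptions.

```python
import math

N = 8
POS_A, POS_B = (1, 2), (2, 1)  # (row, column) of two mid-frequency coefficients
MARGIN = 20.0                  # enforced gap, large enough to survive rounding


def _alpha(k):
    return math.sqrt(1 / N) if k == 0 else math.sqrt(2 / N)


def _cos(i, k):
    return math.cos((2 * i + 1) * k * math.pi / (2 * N))


def dct2(block):
    """Orthonormal 2D DCT-II of an 8x8 block; result is indexed [v][u]."""
    return [[_alpha(u) * _alpha(v) *
             sum(block[y][x] * _cos(x, u) * _cos(y, v)
                 for y in range(N) for x in range(N))
             for u in range(N)] for v in range(N)]


def idct2(coeffs):
    """Inverse of dct2."""
    return [[sum(_alpha(u) * _alpha(v) * coeffs[v][u] * _cos(x, u) * _cos(y, v)
                 for v in range(N) for u in range(N))
             for x in range(N)] for y in range(N)]


def embed_bit(block, bit):
    """Force coefficient A above B for bit 1, and below B for bit 0."""
    c = dct2(block)
    (va, ua), (vb, ub) = POS_A, POS_B
    mean = (c[va][ua] + c[vb][ub]) / 2
    high, low = mean + MARGIN / 2, mean - MARGIN / 2
    c[va][ua], c[vb][ub] = (high, low) if bit else (low, high)
    return [[min(255, max(0, round(p))) for p in row] for row in idct2(c)]


def retrieve_bit(block):
    """Read the bit back from the order of the two coefficients."""
    c = dct2(block)
    (va, ua), (vb, ub) = POS_A, POS_B
    return 1 if c[va][ua] > c[vb][ub] else 0


# A smooth 8x8 test block standing in for part of a selected object.
block = [[100 + 3 * x + 2 * y for x in range(N)] for y in range(N)]
marked_one = embed_bit(block, 1)
marked_zero = embed_bit(block, 0)
```

In an object-based setting, one such bit could be written into every block covered by a selected object, so that retrieval only has to visit the blocks of the objects it finds. A larger `MARGIN` would make the bit more robust at the price of more visible changes.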
At present, we are engaged in research into the design of a framework for distributed, digital video production. The aim of the framework is to provide support for all stages of the video production process.
Furthermore, watermarking makes it possible not only
to prove ownership of copyrighted work and to detect the originator
of illegally made copies, but also to analyze the spread of the
data over networks and servers. Our recent work is focused on a MonitorAgent
that resembles, in parts, a search engine and provides the copyright holder
with results about where, by whom and for what purposes the work is used.
The visualization of monitoring results might be achieved in different
ways. The MonitorAgent performs an analysis of the usage of the detected
material using a 3D-metaphor.
[2] Aucella, A. F.: Converting to Graphical User Interfaces, Tutorials, CHI '95 Proceedings, 1995
[3] Bailey,W.A., Knox, S.T, and Lynch, E.F. Effects of interface design of user productivity. In Proceedings of the Conference of Human Factors in Computer Systems. New York: ACM, 1988, 207-212.
[4] Canny, J.: A Computational Approach to Edge Detection, IEEE Trans. on Pattern Anal. and Mach. Intell., Vol. PAMI-VIII, No. 6, Nov. 1986
[5] Caronni, G.: Assuring Ownership Rights for Digital Images, Proceedings of 'reliable IT systems' (verlaessliche IT-Systeme) VIS '95, H.H.
[6] Davies, S.E., Bury, K.F. and Darnell, M.J. An experimental comparison of a windowed vs. a non-windowed operating system environment. In Proceedings of the Human Factors Society 29th Annual Meeting. Santa Monica, CA: Human Factors Society, 1985, 250-254
[7] Digimarc: Watermarking Technology, http://www.digimarc.com/wt_page.html, PictureMarc(TM), 1996
[8] Dittmann, J., Steinmetz, A.: Konzeption für Sicherheitsmechanismen für das Projekt DiVidEd, GMD-Studie, 1997
[9] Dittmann, J., Steinmetz, A., Delp, M.: Robustness of Embedded Image Labels Against Several Damaging Possibilities, Technical paper, GMD, 1997
[10] Dittmann, J., Stabenau, M., Steinmetz, R.: Robust MPEG Copyright Protection Technologies, to appear at IFIP'98 - SEC'98 in Vienna/Budapest, 1998
[11] Dumais, S.T. and Jones, W.P. A comparison of symbolic and spatial filing. In Proceedings of the Conference of Human Factors in Computer Systems. New York: ACM, 1985, 127-130
[12] Fridrich, J.: Methods for Data Hiding, Center for Intelligent Systems & Department of Systems Science and Industrial Engineering, SUNY Binghamton, working paper, 1997
[13] Jordan, F., Kutter, M. and Bossen, F.: Digital signature of color images using amplitude modulation, SPIE-EI97 Proceedings, 1997
[14] Koch, E., Zhao, J.: Towards Robust and Hidden Image Copyright Labeling, Proc. of 1995 IEEE Workshop on Nonlinear Signal and Image Processing, Neos Marmaras, Greece, June 1995
[15] Mulroy, P.: Video Content Extraction: Review of Current Automatic Segmentation Algorithms, Proc. of WIAMIS'97 Workshop on Image Analysis for Multimedia Interactive Services, 1997
[16] Pereira, F.: MPEG-4: a new challenge for the representation of audio-visual information, keynote speech at the Picture Coding Symposium, Melbourne, Australia, March 1996
[17] Smith, A.: Color gamut transform pairs, Proc. of ACM SIGGRAPH'78, pp. 12-19, 1978
[18] Steinmetz, A.: DiVidEd: A Distributed Video Production System, work in progress, Proceedings of Visual96 Information Systems, February 1996
[19] Steudel, A. and Glesener M.: Object Tracking in Video Sequences with Fuzzy Contour Refinement, Proc. of WIAMIS'97, Belgium, 1997
[20] Whiteside, J., Jones, S., Levy, P. and Wixon, D. User performance with command, menu and iconic interfaces. In Proceedings of the Conference of Human Factors in Computer Systems. New York: ACM, 1985, 185-191
May 1998