Jessada Karnjana. Techniques for audio and speech information hiding based on singular-spectrum analysis and applications in service sector. Doctoral Degree (Engineering and Technology). Thammasat University. Thammasat University Library. Thammasat University, 2016.
Techniques for audio and speech information hiding based on singular-spectrum analysis and applications in service sector
Abstract:
The rapid growth of the Internet and the rise of information distribution systems have raised social concerns and created business demand for applications such as broadcast monitoring, owner identification, proof of ownership, transaction tracking, tampering detection, copy control, and information carrier. Audio and speech information hiding (ASIH) has been suggested as a solution to these demands. Information hiding is a scheme for making information unnoticeable. In general, ASIH requires five properties. (1) Inaudibility or transparency: the hidden information causes no perceptible difference between the host signal and the signal in which the information is embedded. (2) Robustness and fragility: the hidden information is robust against an attack if it can be correctly extracted after the attack has been applied to the signal carrying it; otherwise, it is fragile to that attack. (3) Blindness: the hidden information can be correctly extracted from a watermarked signal using only the watermarked signal. (4) Confidentiality: even if an unauthorized person knows that information is embedded in a signal, he or she cannot access it. (5) Capacity: the amount of information that can be embedded into a host signal. The first challenge of ASIH is that these required properties normally conflict with each other. The second is that the human auditory system is extremely sensitive. The last arises from the application viewpoint: for some applications, such as ownership protection or identification, attackers are motivated to remove or destroy the hidden information, and it is not easy to cope with all possible attacks.
This research aims to explore ASIH that can satisfy all of these requirements and, in particular, resolve the conflict between inaudibility and robustness. According to the literature, ASIH based on singular value decomposition (SVD) is one of the robust techniques. All SVD-based schemes embed information by slightly changing the singular values of a matrix representing a host signal. Since SVD-based analysis is a data-driven technique, all SVD-based ASIH schemes treat an audio or speech signal as a meaningless time series and take neither audio or speech features nor human auditory perception into consideration. As a result, all published embedding rules used in SVD-based schemes are uninformed. The disadvantage of these uninformed embedding rules is that it is difficult, if not impossible, to improve the overall performance further by taking characteristics of the input signal or of the human auditory system into account. Consequently, SVD-based schemes cannot resolve the conflict between inaudibility and robustness for some input signals. The motivation of this work is to resolve these problems by establishing the physical meaning of a singular value, so that combining the advantages of the SVD-based technique with informed embedding rules can overcome the problem of conflicting requirements. In this work, we propose a framework based on singular-spectrum analysis (SSA), which is closely related to the SVD. We show that singular values can have a physical meaning in SSA. Hence, SSA is adopted to exploit the advantages of the SVD-based technique while providing a framework in which an embedding rule can be informed. We propose the basic structure of ASIH based on SSA. The test results showed that the proposed SSA-based ASIH satisfied both the inaudibility and robustness criteria.
However, the scheme could not deal with some input signals, for which it resulted in poor sound quality or low watermark-extraction precision. We improved the scheme further by developing two methods for selecting embedding parameters: one based on differential evolution and one based on a psychoacoustic model. With these methods, the test results improved considerably. We also demonstrate that the proposed SSA-based structure is flexible enough to be adapted for fragile or semi-fragile watermarking, and we propose semi-fragile watermarking schemes accordingly. The test results showed that these schemes are robust against various non-malicious signal-processing operations but fragile to malicious attacks. Finally, we apply the SSA-based ASIH to two potential service applications: tampering detection and information carrier. For tampering detection, the proposed scheme could detect, locate, and identify the tampered area correctly. For the information carrier application, the proposed scheme could achieve an embedding capacity of 320 bps without sacrificing sound quality.
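The abstract describes embedding by slightly changing the singular values of a matrix built from the host signal. The following is a minimal sketch of that idea in an SSA setting, not the thesis's actual method: the function name `ssa_modify`, the window length, the band of singular values, the scaling factor `alpha`, and the "scale up for 1, scale down for 0" rule are all assumptions chosen for illustration.

```python
import numpy as np


def ssa_modify(frame, bit, window=32, band=(8, 24), alpha=0.05):
    """Illustrative only: embed one bit by scaling a band of singular values.

    window, band, and alpha are hypothetical parameters, not thesis values.
    """
    frame = np.asarray(frame, dtype=float)
    n = frame.size
    k = n - window + 1
    # Trajectory (Hankel) matrix: row i holds frame[i : i + k].
    traj = np.stack([frame[i:i + k] for i in range(window)])
    u, s, vt = np.linalg.svd(traj, full_matrices=False)
    # Assumed rule: raise the band for bit 1, lower it for bit 0.
    scale = 1.0 + alpha if bit else 1.0 - alpha
    s = s.copy()
    s[band[0]:band[1]] *= scale
    modified = (u * s) @ vt
    # Diagonal averaging (Hankelization) maps the matrix back to a signal.
    out = np.zeros(n)
    counts = np.zeros(n)
    for i in range(window):
        out[i:i + k] += modified[i]
        counts[i:i + k] += 1
    return out / counts


# Usage on a synthetic noise frame standing in for a host signal.
rng = np.random.default_rng(0)
host = rng.standard_normal(256)
marked = ssa_modify(host, bit=1)
```

With `alpha=0` the singular values are left unchanged and the function returns the host frame, which is a quick check that the decomposition and diagonal averaging are consistent. An informed rule of the kind the abstract argues for would presumably choose the band and scaling per signal rather than fix them as done here.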