Analysis of the Wavelet Domain Filtering Approach for Video Super-Resolution

Wavelet-domain algorithms for super-resolution give better visual quality and have been explored by many researchers. This visual quality usually comes with increased complexity and cost, as most systems embed several pre- and post-processing techniques. Frequency- and spatial-domain methods are the usual approaches to super-resolution, each with its benefits and limitations. Considering the benefits of wavelet-domain processing, this paper presents a new algorithm based on wavelet residues. The methodology uses wavelet-domain filtering and residue extraction to obtain super-resolved frames with better visual quality without embedding other techniques. The main targets of the super-resolution process are avoiding the noisy high-frequency components of low-quality videos and preserving the edge information in the frames. This inverse process is carried out by properly combining the information in the low-frequency bands with the residual information in the high-frequency components. Efficient known algorithms generally sacrifice simplicity to achieve accuracy, whereas the proposed algorithm achieves efficiency while remaining simple. The robustness of the algorithm is tested with different wavelet functions and at different noise levels. The proposed algorithm performs well in comparison with other techniques from the same domain.


INTRODUCTION
Super-Resolution (SR) is a leading field of Digital Signal Processing (DSP) with wide applicability in electronic imaging, such as biomedical, forensic, surveillance, and satellite imaging. High-quality images are needed not only for better visualization but also for reliable data extraction, yet they are not always readily available. A High-Resolution (HR) imaging system is expensive and is restricted by the sensor's capacity, the optics manufacturing technology, memory, and the sensor's transmission bandwidth. Physical upgrades have almost reached their limits, so the solution is to develop effective ways to overcome the hardware limitations of imaging systems. This leads to the SR concept and related developments. Many authors [1][2][3][4][5] have explored the SR concept, its recent applications, limitations, and scope for improvement.
The concept aims to produce HR frames from successive Low-Resolution (LR) images or frames by applying DSP techniques in a proper sequence. Some of the work related to SR is summarized below. The main contributions of the current paper are:
• An introduction of the SR concept with the required basics.
• A selection of a proper wavelet function through analyzing the simulation results for future work.
• Analysis of the proposed work with different noise levels and wavelet functions to check the robustness of the algorithm.
• The establishment that the combination of low-frequency components and wavelet residuals for high-frequency details like edges leads to increased efficiency in comparison with the state-of-the-art techniques.

A. Super-Resolution with the Observation Model
SR is the inverse process of recovering high-quality images or frames from their low-quality versions using digital image processing techniques. The original HR version of the scene is degraded by factors such as warping (W), blurring (B), aliasing or down-sampling (D), and additive noise (N) introduced by the environment or the imaging device. This degradation process results in the LR version of the image. SR is an ill-posed problem, i.e., it has no unique solution and no single mathematical expression. Researchers have tried to convert this ill-posed problem into a well-posed one in order to obtain a solution [1,2]. These attempts do not represent the exact problem, but they helped obtain close approximations and their solutions.
To represent the SR concept mathematically, let $X$ be the original HR image and $y_k$ the $k$th LR image/frame, $k = 1, \dots, K$, where $K$ is the number of observations. $D$ is the decimation matrix, $B$ the blurring matrix, $W_k$ the warping matrix, and $N_k$ the additive noise. The observation model is then given in (1):

$$y_k = D B W_k X + N_k, \quad k = 1, \dots, K \tag{1}$$

The generalized observation model is represented in Figure 1. The observation model shows how the LR frame is produced from the high-quality frame by the degradation factors. The SR process recovers the high-quality frames by restoring the degraded data using DSP techniques. This reverse process is carried out with different algorithms developed for different applications. SR techniques are mainly classified by their domain: spatial, frequency, and wavelet [3][4][5][6]. Each domain has its advantages and disadvantages depending on the application field. The next section summarizes some recent literature regarding SR.
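The degradation chain described above (warping, blurring, down-sampling, additive noise) can be sketched in a few lines. This is only an illustrative sketch, not the paper's implementation: the function names are hypothetical, the warp is assumed to be an integer circular shift, the blur is assumed to be a 3×3 mean filter, and frames are plain lists of rows.

```python
import random


def warp_shift(img, dy, dx):
    """W_k (assumed): integer circular translation by (dy, dx) pixels."""
    h, w = len(img), len(img[0])
    return [[img[(i - dy) % h][(j - dx) % w] for j in range(w)] for i in range(h)]


def box_blur(img):
    """B (assumed): 3x3 mean filter with edge replication."""
    h, w = len(img), len(img[0])
    out = []
    for i in range(h):
        row = []
        for j in range(w):
            acc = 0.0
            for di in (-1, 0, 1):
                for dj in (-1, 0, 1):
                    ii = min(max(i + di, 0), h - 1)
                    jj = min(max(j + dj, 0), w - 1)
                    acc += img[ii][jj]
            row.append(acc / 9.0)
        out.append(row)
    return out


def decimate(img, factor):
    """D: keep every factor-th pixel in both directions."""
    return [row[::factor] for row in img[::factor]]


def observe(hr, dy, dx, factor, noise_sigma, rng):
    """One LR observation: decimate(blur(warp(X))) plus Gaussian noise."""
    lr = decimate(box_blur(warp_shift(hr, dy, dx)), factor)
    return [[p + rng.gauss(0.0, noise_sigma) for p in row] for row in lr]
```

Calling `observe` several times with different shifts `(dy, dx)` and a shared `random.Random` instance yields a set of LR observations of the same HR frame, which is the input that multi-frame SR methods work from.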

B. Super-Resolution Process Categorization
SR techniques are categorized by the domain in which they are developed. Frequency-domain techniques treat the frequency content as the image feature. The frequency-domain approach depends on the shifting property, aliasing, and band-limitation of the signal [3][4][5][6]. The very first SR approach was in the frequency domain [7]. The system of equations relating the HR image to the observed degraded images was formulated by estimating the relative shifts among a sequence of down-sampled, aliased, noise-free LR images. This approach was extended in [8] by proposing a weighted least squares solution under the assumption that the blur and noise characteristics are identical for all LR images. The authors of [9] presented a DCT-based image quality improvement algorithm. The DCT can attain a notable improvement in picture quality even in frequently encountered cases. A major benefit of frequency-domain SR approaches is that they are usually theoretically straightforward and computationally reasonable.
The most elementary way to increase the resolution of a picture is spatial-domain interpolation. New pixels are estimated from the existing set of pixels, either by considering neighboring pixels or by averaging pixel values. These methods are known as nearest-neighbor, bilinear, and bicubic interpolation. More complex interpolation methods include cubic B-spline interpolation [10], New Edge-Directed Interpolation (NEDI) [11], and Edge-Guided Interpolation (EGI) [12]. Among spatial-domain SR approaches, non-uniform interpolation methods are some of the most intuitive, with relatively low computational complexity. Frequency- and spatial-domain methods each have their advantages and disadvantages. Wavelet Transform (WT) based methods provide frequency components as well as spatial information, which produces more encouraging results than the previous transforms. The same idea was investigated in [13]. The combination of the discrete WT and Gabor wavelets also gives promising results in the super-resolution reconstruction of satellite images, as explored in [14]. Wavelet-domain SR restoration can examine and manipulate global and local features at coarse and fine scales, respectively. Among the challenges in SR, one needs to preserve or recover the true edges of objects while suppressing noise, which is difficult to achieve simultaneously with frequency-domain processes because edges and noise respond similarly in the frequency range. This leads to combining the WT with edge-preserving algorithms like EGI [15]. The combination of the Keren algorithm for image registration, the DWT, and NEDI to improve the edges in the frames was explored in [16]. The appealing properties of the WT, for instance compactness, multi-resolution, and locality, are valuable for analyzing real-world signals. The WT offers an alternative way to treat true edges and noise separately.
A common assumption of WT-based techniques is that the LR frame is the low-pass filtered sub-band of the WT of the HR picture [17]. The technique given in [18] reconstructs images by interpolation and by using the difference between the wavelet-decomposed sub-bands and the actual low-resolution image. The authors of [19] proposed an SR method for degraded frames using the DWT and the stationary WT (SWT). Yet, these available methods have limited performance across the range of noise levels, motion levels, wavelets, and numbers of frames used. Researchers nevertheless continue to show interest in wavelet-domain processing to improve the performance of the algorithms.

III. THE PROPOSED WAVELET-DOMAIN VIDEO SR PROCESS
Enhancing video quality in the frequency and spatial domains is the traditional way, but wavelet-domain processing has become a trend because it combines the benefits of both domains. The proposed algorithm is used primarily to analyze the effect of wavelets in the super-resolution process. The proposed technique exploits the benefits of wavelet-domain processing, such as the separation of low-frequency and high-frequency components with the help of well-known wavelet families. The workflow of the proposed methodology is divided into two parts: (a) the degradation process and (b) the upgradation process. These processes are presented in Figures 2 and 3.
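The low-/high-frequency separation mentioned above can be illustrated with a one-level 2D Haar (db1) transform. The sketch below is a generic textbook decomposition written in plain Python, not the paper's MATLAB code. The function names are hypothetical, frames are assumed to be lists of rows with even dimensions, and the sub-band labels follow one common convention.

```python
import math

S = 1.0 / math.sqrt(2.0)  # orthonormal Haar scaling


def haar_1d(v):
    """Split an even-length sequence into approximation and detail halves."""
    half = len(v) // 2
    approx = [(v[2 * i] + v[2 * i + 1]) * S for i in range(half)]
    detail = [(v[2 * i] - v[2 * i + 1]) * S for i in range(half)]
    return approx, detail


def ihaar_1d(approx, detail):
    """Invert haar_1d exactly."""
    out = []
    for a, d in zip(approx, detail):
        out.append((a + d) * S)
        out.append((a - d) * S)
    return out


def transpose(m):
    return [list(c) for c in zip(*m)]


def haar_2d(img):
    """One-level 2D Haar DWT; returns the four sub-bands (LL, LH, HL, HH)."""
    lo_rows, hi_rows = [], []
    for row in img:  # filter along each row first
        a, d = haar_1d(row)
        lo_rows.append(a)
        hi_rows.append(d)

    def split_cols(m):  # then filter along each column
        lo, hi = [], []
        for col in transpose(m):
            a, d = haar_1d(col)
            lo.append(a)
            hi.append(d)
        return transpose(lo), transpose(hi)

    ll, lh = split_cols(lo_rows)
    hl, hh = split_cols(hi_rows)
    return ll, lh, hl, hh


def ihaar_2d(ll, lh, hl, hh):
    """Rebuild the frame from its four sub-bands."""
    def merge_cols(lo, hi):
        cols = [ihaar_1d(a, d) for a, d in zip(transpose(lo), transpose(hi))]
        return transpose(cols)

    lo_rows = merge_cols(ll, lh)
    hi_rows = merge_cols(hl, hh)
    return [ihaar_1d(a, d) for a, d in zip(lo_rows, hi_rows)]
```

The LL band carries the low-frequency content at half the resolution, while LH, HL, and HH carry the high-frequency details such as edges and noise. Modifying the detail bands before calling `ihaar_2d` is the basic mechanism of wavelet-domain filtering.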

A. Degradation Process
Naturally, the quality of images or video frames is degraded during acquisition by many factors, such as noise in the acquisition process or environment, hardware quality, etc. Often, because of limited storage and transmission capacity, images/videos are compressed, which leads to further degradation. Dealing with degraded data means inefficient information processing, less effective analysis, and lower-quality visuals. The degradation process of the input video frames is shown in Figure 2. In the proposed method, the original high-quality video frames are degraded by down-sampling and the addition of noise and are referred to as LR frames. The process is shown in Figure 3. The degradation process of the proposed method can be expressed mathematically by modifying (1) to:

$$y_k = D X + N_k \tag{2}$$

The steps followed in the degradation process are given in the form of descriptive pseudo code in Figure 3. These LR frames are used as input to the upgradation process.
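As a rough illustration of the simplified degradation (down-sampling followed by additive noise), the sketch below derives the Gaussian noise standard deviation from a target SNR in dB. The function names are hypothetical, and the signal power is assumed to be the mean squared pixel value of the down-sampled frame; the paper does not state which convention its MATLAB code uses.

```python
import math
import random


def noise_sigma_for_snr(frame, snr_db):
    """Noise standard deviation that yields the requested SNR (in dB).

    Assumption: signal power is the mean of the squared pixel values.
    """
    count = sum(len(row) for row in frame)
    power = sum(p * p for row in frame for p in row) / count
    return math.sqrt(power / (10.0 ** (snr_db / 10.0)))


def degrade(hr, factor, snr_db, rng):
    """Down-sample by `factor`, then add zero-mean Gaussian noise."""
    lr = [row[::factor] for row in hr[::factor]]  # D X
    sigma = noise_sigma_for_snr(lr, snr_db)
    return [[p + rng.gauss(0.0, sigma) for p in row] for row in lr]  # + N_k
```

With `factor=4`, a 512×512 frame becomes a 128×128 LR frame, and lowering `snr_db` increases the noise level.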

B. Upgradation Process
The wavelet transform is used here for frequency-domain filtering of the input image, without any other noise removal technique. Applying only wavelet-domain filtering affects the output, and this effect is analyzed for frame or image quality enhancement. The SR area has attracted the attention of researchers because it allows conventional image acquisition systems to be used as they are, with some post-processing. It thereby reduces the cost of replacing traditional systems with new technology. The simple and general steps involved in the super-resolution process are shown in Figure 4.

A. Peak Signal to Noise Ratio (PSNR)
The PSNR is a measure of the ratio between the maximum possible signal power and the power of the distorting noise that affects the quality of its representation. Since many signals have a very wide dynamic range, the PSNR is typically expressed on a logarithmic scale:

$$\mathrm{PSNR} = 10 \log_{10}\left(\frac{MAX^2}{\mathrm{MSE}}\right)$$

where $MAX$ is the maximum possible pixel value (255 for 8-bit frames), and the Mean Square Error (MSE) is given by:

$$\mathrm{MSE} = \frac{1}{mn}\sum_{i=1}^{m}\sum_{j=1}^{n}\left[x(i,j) - y(i,j)\right]^2$$

where $x$ represents the original image or frame, $y$ denotes the matrix data of the degraded image or frame, $m$ is the number of rows of pixels, and $n$ is the number of columns of pixels [1].
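A minimal sketch of the PSNR computation follows, using the standard definition PSNR = 10·log10(MAX² / MSE). The function names are hypothetical, and the default peak value of 255 assumes 8-bit frames stored as lists of rows.

```python
import math


def mse(x, y):
    """Mean square error between two frames of identical size."""
    m, n = len(x), len(x[0])
    total = sum((x[i][j] - y[i][j]) ** 2 for i in range(m) for j in range(n))
    return total / (m * n)


def psnr(x, y, max_val=255.0):
    """PSNR in dB; returns infinity when the frames are identical."""
    err = mse(x, y)
    if err == 0:
        return math.inf
    return 10.0 * math.log10(max_val ** 2 / err)
```

Higher values indicate that the super-resolved frame is numerically closer to the original. Averaging `psnr` over all frames of a video gives the kind of average PSNR reported in the result tables.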

B. Structural Similarity (SSIM)
Parameters like luminance, contrast, and structure are important to check the similarity between the original and super-resolved frames. Collecting all these terms together we get the Structural Similarity (SSIM) Index [16,20]. If the similarity is measured between images or frames $x$ and $y$ of the same size, then the SSIM is given by:

$$\mathrm{SSIM}(x, y) = \frac{(2\mu_x\mu_y + c_1)(2\sigma_{xy} + c_2)}{(\mu_x^2 + \mu_y^2 + c_1)(\sigma_x^2 + \sigma_y^2 + c_2)}$$

where $\mu_x$ and $\mu_y$ are the averages of $x$ and $y$, $\sigma_x^2$ and $\sigma_y^2$ are the variances of $x$ and $y$, and $\sigma_{xy}$ is the covariance of $x$ and $y$. $c_1$ and $c_2$ are constants [1].
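The SSIM index can be sketched directly from the means, variances, and covariance of the two frames. Two simplifying assumptions are made here and are not taken from the paper: the statistics are computed once over the whole frame rather than over local sliding windows, and the constants are set to the commonly used values c1 = (0.01·L)² and c2 = (0.03·L)², with L the peak pixel value. The function name is hypothetical.

```python
def ssim_global(x, y, max_val=255.0):
    """Single-window SSIM over whole frames (a simplification)."""
    xs = [p for row in x for p in row]
    ys = [p for row in y for p in row]
    n = len(xs)
    mu_x = sum(xs) / n
    mu_y = sum(ys) / n
    var_x = sum((p - mu_x) ** 2 for p in xs) / n
    var_y = sum((p - mu_y) ** 2 for p in ys) / n
    cov = sum((a - mu_x) * (b - mu_y) for a, b in zip(xs, ys)) / n
    c1 = (0.01 * max_val) ** 2  # assumed constant, stabilizes the mean term
    c2 = (0.03 * max_val) ** 2  # assumed constant, stabilizes the variance term
    num = (2.0 * mu_x * mu_y + c1) * (2.0 * cov + c2)
    den = (mu_x ** 2 + mu_y ** 2 + c1) * (var_x + var_y + c2)
    return num / den
```

Identical frames give a value of 1, and structurally dissimilar frames give lower values. Practical implementations usually average the index over local windows, so their numbers will differ from this global variant.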

V. EXPERIMENTAL RESULTS AND DISCUSSION
The proposed wavelet-domain SR method is verified on well-known non-copyrighted videos. The reason behind considering these videos is that they represent an easy [23] which contains non-copyrighted and copyrighted videos (to avoid copyright issues, the authors of the current paper considered only non-copyrighted videos). The frames were separated for further processing. The original high-resolution video was resized to 512×512 pixels. Based on the observation model, the input LR frames had dimensions of 128×128 pixels after down-sampling and were further degraded by adding different levels of noise. The noise used to degrade the frames was Gaussian noise with a specified SNR. The proposed algorithm was implemented in MATLAB (R2018b). The results and analysis are divided into three parts, explained below:
• Comparison of the proposed algorithm with existing techniques.
• Analysis of the proposed algorithm with diverse noise levels.
• Analysis of the proposed algorithm with different wavelet functions.

A. Comparison of the Proposed Algorithm with Existing Techniques
The average PSNR and SSIM values for the proposed method and other techniques are compared in Table I. The values in the table show the advantage of the proposed algorithm. Its most important characteristic is its simplicity, considering its efficiency. The algorithms used for comparison rely on several pre-processing steps and hybrid approaches, which makes them more complex than the proposed algorithm. The reason behind this improved performance is that DWT-based SR algorithms are better suited to recovering the high-frequency details of the degraded frames. The true edges are properly preserved and noise is removed by the filtering. For the "Foreman" frames, even though the PSNR and SSIM results attained by the proposed method are higher than those of the other methods, the performance gain can be increased further by adding other edge preservation and direct mapping techniques. Figure 6 shows the behavior of the SR methods in terms of the average PSNR and SSIM values. The most important quality metric, i.e., SSIM, shows a substantial increase with the proposed algorithm.

B. Analysis of the Proposed Algorithm with Diverse Noise Levels
The robustness of the proposed algorithm is validated at different noise levels, varying from 50 dB to 25 dB in steps of 5 dB. The sampled noisy frames of the "Foreman" video are shown in Figure 8. The frames were already down-sampled to 1/4 of the original size. The db1 wavelet function was used. The average PSNR and average SSIM values after upgrading the quality are shown in Table II. Figure 7 shows the graph of the quality metrics at different noise levels. It can be seen that the proposed algorithm performs well across a variety of noise levels. The PSNR values are not affected much, but the SSIM values decrease significantly. It can be concluded that the addition of noise significantly affects structural similarity.
Fig. 6. Comparative analysis of the proposed algorithm with other SR methods.
Fig. 7. Effect of different noise levels on super-resolution process.

C. Analysis of the Proposed Algorithm with Different Wavelet Functions
The literature provides different wavelet functions, which have been tried with varying results. To analyze how the efficiency of the proposed algorithm varies with the wavelet function, all other parameters were kept fixed, with the noise level at 30 dB. In the literature, the Daubechies family has been explored in the super-resolution process, but not all of its functions are used; db2 and db7/9 are the exceptions, with wide applicability. This paper utilized the db1, db2, db7, and db9 wavelet functions (Table III). Figure 8 shows the graph for the proposed algorithm using different wavelet functions. The db1/Haar function gives the most promising results due to its component reconstruction capability. Many researchers prefer db2 for decomposition and reconstruction, but in this case db1 is more effective in reconstruction while being simpler than db2.
The surveyed recent papers show that there is still scope for efficiency improvement in object detection areas like face [21] and logo [22] recognition. The recognition decision is based on the processes applied to the input data, but low-quality input pictures complicate the final decision making. To avoid such circumstances, quality details in the input data are needed. This motivates the interest in the SR area.
VI. CONCLUSION
The current paper explored a simple algorithm for analyzing the effect of wavelet-domain filtering on the visual quality upgradation of low-quality or degraded videos. This technique is based on wavelet residual mapping and interpolation. The robustness of the wavelet filtering approach was investigated for different noise levels and different wavelet functions, and the results indicate that the proposed algorithm surpasses the other popular methods from the same domain. Simplicity and efficiency are the two main advantages of the proposed algorithm. The purpose behind developing this algorithm is to reduce complexity. As the mapping of low-frequency information with wavelet residues gives promising results, combining the wavelet domain with neural networks to learn this SR mapping is a promising direction for future work.