Similar literature
 20 similar documents found; search took 15 ms
1.
Coding and pooling, the two major sequential procedures in sparse-coding-based scene categorization systems, have drawn much attention in recent years. Whereas previous improvements have addressed coding or pooling separately, this paper proposes a spatially constrained scheme for sparse coding that covers both steps. Specifically, we employ the m-nearest neighbors of a local feature in the image space to improve the consistency of coding. The benefit is that similar image features are encoded with similar codewords, which reduces the stochasticity of a conventional coding strategy. We also show that the Viola–Jones algorithm, well known in face detection, can be tailored to learning receptive fields, embedding spatially constrained information in the pooling step. Extensive experiments on the UIUC sport event, 15 natural scenes and Caltech 101 databases suggest that the scene categorization performance of several popular algorithms can be consistently improved by incorporating the two proposed spatially constrained sparse coding schemes.

2.
Shadow detection is significant for scene understanding. As a common scenario, soft shadows have more ambiguous boundaries than hard shadows. However, they are rarely present in the available benchmarks, since annotating them is time-consuming and requires expert help. This paper discusses how to transfer the shadow detection capability from available shadow data to soft shadow data and proposes a novel shadow detection framework (MUSD) based on multi-scale feature fusion and unsupervised domain adaptation. First, we set the existing labeled shadow dataset (i.e., SBU) as the source domain and collect an unlabeled soft shadow dataset (SSD) as the target domain to formulate an unsupervised domain adaptation problem. Next, we design an efficient shadow detection network based on the double attention module and multi-scale feature fusion. Then, we use the global–local feature alignment strategy to align the task-related feature distributions between the source and target domains. This allows us to obtain a robust model and achieve domain adaptation effectively. Extensive experimental results show that our method detects soft shadows more accurately than existing state-of-the-art methods.

3.
Recently, sparse coding has become popular for image classification. However, images are often captured under different conditions, such as varied poses, scales and camera parameters. This means local features may not be discriminative enough to cope with these variations. To solve this problem, affine transformation combined with sparse coding has been proposed. Although proven effective, affine sparse coding places no constraints on the tilt and orientations of the transformed local features, nor on the consistency of their encoding parameters. To solve these problems, we propose a Laplacian affine sparse coding algorithm that combines the tilt and orientations of affine local features with the dependency among local features. We add tilt and orientation smoothness constraints to the objective function of sparse coding. In addition, a Laplacian regularization term is used to characterize the similarity of the encoding parameters. Experimental results on several public datasets demonstrate the effectiveness of the proposed method.

4.
Gaze prediction is a significant approach for processing the large amount of incoming visual information in videos. Recent gaze prediction algorithms often employ sparse models under the assumption that every superpixel in the video frames can be represented as a linear combination of a few salient superpixels. However, they are not accurate enough because they do not exploit the non-negativity of video signals. Hence, we develop a novel gaze prediction method based on an inverse sparse coding framework with a determinant sparse measure. With this sparse measure, the solutions are non-negative and sparser than those obtained under conventional sparse constraints. However, the resulting optimization problem becomes nonconvex, which makes it difficult to solve. To address this nonconvex optimization problem efficiently, we propose a novel algorithm based on difference of convex functions (DC) programming, which can yield the global solutions. Experimental results indicate the improved accuracy of the proposed approach compared with state-of-the-art algorithms.

5.
A new block-based fractal image coding algorithm called Fractal Block Coding in Residue Domain (FBCRD) is proposed. In the basic Fractal Block Coding (FBC) algorithm, each block (called a range block) is encoded by an affine mapping from a domain block within the same image to itself. The decoder uses the parameters of these mappings to synthesize the reconstructed image through an iterative procedure. FBCRD is a modification of basic FBC. In FBCRD, range blocks and domain blocks are all residue blocks obtained by subtracting the block mean from each block, and both the parameters of the affine mappings and the block means are coded. This modification leads to fewer iterations at the decoder. An optimized decoding strategy is also introduced, which reduces the total decoding time by more than half relative to basic FBC. This improvement is favorable for real-time implementation of fractal image compression. Supported in part by the Defence Preresearch Foundation, the National Science Foundation of Guangdong Province and the National "Climbing" Project.

6.
To minimize the errors of the reconstructed values and improve the quality of the decoded image, an efficient reconstruction scheme for transform-domain Wyner-Ziv (WZ) video coding is proposed. The reconstruction scheme efficiently exploits the temporal correlation of the coefficient bands, the WZ decoded bit stream and the side information. When the side information is outside the decoded quantization bin, the reconstructed value is derived using the expectation of the WZ decoded bit stream and the side information. When the side information is within the decoded quantization bin, the reconstructed value is derived using the biased predictor. Simulation results show that the proposed reconstruction scheme gains up to 1.32 dB compared with the commonly used boundary reconstruction scheme at the same bit rates and similar computation cost.
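For context, the boundary reconstruction baseline that this abstract compares against can be sketched as below. This is only the commonly used baseline, not the proposed scheme (the abstract does not give its expectation or biased-predictor formulas). The function name and parameter names are hypothetical.

```python
def boundary_reconstruct(side_info, bin_low, bin_high):
    """Baseline WZ reconstruction (hypothetical helper name).

    If the side information lies inside the decoded quantization bin it is
    used directly; otherwise it is clamped to the nearest bin boundary.
    """
    if bin_low > bin_high:
        raise ValueError("bin_low must not exceed bin_high")
    return min(max(side_info, bin_low), bin_high)
```

The clamp guarantees that the reconstructed value never leaves the decoded bin, which bounds the reconstruction error by the bin width.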

7.
Spectral images (SIs) can be represented as 3D arrays of spatial information across multiple wavelengths. Compressive Spectral Imaging (CSI) reduces sensing costs by sensing compressed versions of the scene and recovering a suitable version of the original SI by solving a sparsity-inducing inverse problem. Convolutional Sparse Coding (CSC), for its part, has proven successful for representing gray-scale images; however, it ignores any correlation between images. This work accounts for the spatial-spectral correlation within SIs by introducing an extension of the CSC signal model that describes the SI as the sum of convolutions of 3D sparse coefficient maps with their respective 3D dictionary filters. Furthermore, we use the proposed CSC framework to recover SIs from CSI measurements. Simulation results, using two different CSI acquisition architectures, show that the proposed CSC framework yields better representations of the SIs than those obtained under the traditional sparse signal representation approach, improving the quality of the recovered SIs.

8.
Both nonlocal means (NLM) filtering and sparse-representation-based denoising have achieved remarkable denoising performance. To integrate the advantages of the two methods into a unified framework, we propose an image denoising algorithm that combines NLM and sparse representation to remove Gaussian noise mixed with random-valued impulse noise. For this non-Gaussian setting, we propose a customized blockwise NLM (CBNLM) filter to generate an initial denoised image. Based on it, we classify the noisy pixels according to the three-sigma rule. In addition, an overcomplete dictionary is trained on the initial denoised image. Then, a complementary sparse coding technique is used to find the sparse vector for each input noisy patch over the overcomplete dictionary. By solving a more reasonable variational denoising model, we can reconstruct the clean image. Experimental results verify that the proposed algorithm achieves the best denoising performance among the typical methods compared.

9.
This paper introduces a redundancy adaptation algorithm based on an on-the-fly erasure network coding scheme named Tetrys in the context of real-time video transmission. The algorithm exploits the relationship between the redundancy ratio used by Tetrys and the gain or loss in encoding bit rate from changing a video quality parameter called the Quantization Parameter (QP). Our evaluations show that, with equal or less bandwidth occupation, the video protected by Tetrys with the redundancy adaptation algorithm obtains a PSNR gain of 4 dB or more compared to the video without Tetrys protection. We demonstrate that the Tetrys redundancy adaptation algorithm performs well under variations of both the loss pattern and the delay induced by the networks. We also show that Tetrys with the redundancy adaptation algorithm outperforms traditional block-based FEC codes with and without redundancy adaptation.

10.
A novel and efficient time-domain, threshold-based sparse channel estimation technique is proposed for orthogonal frequency division multiplexing (OFDM) systems. The proposed method aims to realize effective channel estimation without prior knowledge of channel statistics or noise standard deviation, within a comparatively wide range of sparsity. First, the classical least squares (LS) method is used to obtain an initial channel impulse response (CIR) estimate. Then, an effective threshold, estimated from the noise coefficients of the initial estimated CIR, is proposed. Finally, the obtained threshold is used to select the most significant taps. Theoretical analysis and simulation results show that the proposed method achieves better performance in both BER (bit error rate) and NMSE (normalized mean square error) than the compared methods, and has good spectral efficiency and moderate computational complexity.
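The three steps in this abstract (LS estimate, threshold from the noise coefficients, tap selection) can be sketched roughly as follows. The abstract does not state the threshold formula, so the rule used here, a multiple of the median tap magnitude, is an assumption chosen only because most taps of a sparse channel are noise; the function names and the `factor` parameter are likewise hypothetical.

```python
import cmath
from statistics import median


def idft(freq):
    """Inverse DFT of a list of complex frequency-domain samples."""
    n = len(freq)
    return [
        sum(freq[k] * cmath.exp(2j * cmath.pi * k * t / n) for k in range(n)) / n
        for t in range(n)
    ]


def ls_cir_estimate(received, pilots):
    """Step 1: per-subcarrier LS estimate Y[k] / X[k], then IDFT to get the CIR."""
    h_freq = [y / x for y, x in zip(received, pilots)]
    return idft(h_freq)


def threshold_taps(cir, factor=3.0):
    """Steps 2 and 3: zero every tap whose magnitude is below the threshold.

    ASSUMPTION: the threshold is factor * median(|tap|). This stands in for
    the paper's noise-based threshold, which the abstract does not specify.
    """
    mags = [abs(tap) for tap in cir]
    threshold = factor * median(mags)
    return [tap if mag > threshold else 0j for tap, mag in zip(cir, mags)]
```

A median-based noise level needs no prior knowledge of the noise standard deviation, which matches the stated goal, but it degrades when more than about half of the taps carry signal.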

11.
Sparse coding has been used successfully for image representation. However, when there is considerable variation between the source and target domains, sparse coding cannot achieve satisfactory results. In this paper, we propose a Projected Transfer Sparse Coding algorithm. To reduce the distribution difference between the domains, we project source and target data into a shared low-dimensional space. Meanwhile, we learn a projection matrix, a shared dictionary, and the sparse codes of the source and target data in the low-dimensional space. Unlike existing methods, the sparse representations are learnt using the projected data, which are invariant to the distribution difference and to irrelevant samples. Thus, the sparse representations are robust and can improve classification performance. We do not need to know any explicit correspondence across domains. We learn the projection matrix, the discriminative sparse representations, and the dictionary in a unified objective function. Our image representation method yields state-of-the-art results.

12.
This paper addresses issues in visual tracking where videos contain object intersections, pose changes, occlusions, illumination changes, motion blur, and backgrounds with color distributions similar to the target. We apply the structural local sparse representation method to analyze the background region around the target. We then reduce the probability of prominent background features and add new information to the target model. In addition, a weighted search method is proposed to find the best candidate target region. To a certain extent, the weighted search method solves the local optimization problem. The proposed scheme, designed to track a single human through complex scenarios in videos, has been tested on several video sequences. Several existing tracking methods are applied to the same videos and the corresponding results are compared. Experimental results show that the proposed tracking scheme demonstrates very promising performance in terms of robustness to occlusions, appearance changes, and similarly colored backgrounds.

13.
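To address unsupervised cross-domain transfer in person re-identification, a person re-identification algorithm based on a domain discrimination network and domain adaptation is proposed. First, an improved ResNet-50 is used to train a supervised domain discrimination network model; a shared-space component is added to obtain feature-invariant attributes for distinguishing images of different classes, and contrastive loss and difference loss are used to improve the model's classification performance. Second, an unsupervised domain-adaptive transfer method derives the feature-invariant attributes from the source-domain dataset and applies them to the unlabeled target-domain dataset. Finally, query images are matched against gallery images in the shared space to perform cross-domain person re-identification. To verify the effectiveness of the algorithm, experiments were conducted on the CUHK03, Market-1501 and DukeMTMC-reID datasets, where the algorithm reaches Rank-1 accuracies of 34.1%, 38.1% and 28.3% and mAP values of 34.2%, 17.1% and 17.5%, respectively; the necessity of each model component during training is also verified. The results show that the proposed algorithm outperforms some existing unsupervised person re-identification methods on large-scale datasets and even approaches the performance of some traditional supervised learning methods.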

14.
Object tracking has long been an attractive research topic in computer vision and image processing. In this paper, an innovative method called the salient-sparse-collaborative tracker (SSCT) is put forward, which exploits both object saliency and sparse representation. Within the proposed collaborative appearance model, the object salient feature map is built to create a salient-sparse discriminative model (SSDM) and a salient-sparse generative model (SSGM). In the SSDM module, the presented sparse model effectively distinguishes the target region from its background by using the salient feature map, which further helps locate the object in complex environments. In the SSGM module, a sparse representation method with the salient feature map is designed to improve the effectiveness of the templates and deal with occlusions. The update scheme takes advantage of salient correction, so the SSCT algorithm can both handle appearance variation and reduce tracking drift effectively. Extensive experiments with quantitative and qualitative comparisons on a benchmark show that the SSCT tracker is more competitive than several popular approaches.

15.
The nonlocal self-similarity of images means that groups of similar patches have a low-dimensional property. This property has previously been used for image denoising, with particularly notable success via sparse coding. However, only a few studies have focused on the varying statistics of noise in different similar patches during the iterative denoising process. This has motivated us to introduce an improved weighted sparse coding method for gray-level image denoising in this paper. On the basis of traditional sparse coding, we introduce one weight matrix to account for the noise variation characteristics of different similar patches, and another weight matrix to make full use of the sparsity priors of natural images. Maximum a posteriori (MAP) estimation is used to obtain the closed-form solution of the proposed method. Experimental results demonstrate the competitiveness of the proposed method compared with state-of-the-art methods in both objective and perceptual quality.

16.
17.
In this paper we propose an online semi-supervised compressive coding algorithm, termed SCC, for robust visual tracking. The first contribution of this work is a novel adaptive compressive-sensing-based appearance model, which adopts weighted random projection to exploit both local and discriminative information of the object. The second contribution is a semi-supervised coding technique for online sample labeling, which iteratively updates the distributions of positive and negative samples during tracking. In this setting, the pseudo-labels of unlabeled samples from the current frame are predicted according to the local smoothness regularizer and the similarity between the prior and the current model. To track the object effectively, a discriminative classifier is updated online using the unlabeled samples with pseudo-labels in the weighted compressed domain. Experimental results demonstrate that the proposed algorithm outperforms state-of-the-art tracking methods on challenging video sequences.

18.
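To further reduce the bit rate, a variable-order method is used in linear prediction (LP) speech coding; that is, the order of the LP filter is determined by the properties of the current speech frame. However, if the prediction order is too small, the large dynamic range of the speech spectrum may prevent LP analysis from correctly matching the higher formants. A frequency-domain technique for speech coding is discussed that improves the performance of low-order linear prediction (LP) in modeling the formants of voiced speech.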
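A minimal sketch of variable-order LP analysis is given below, assuming the standard autocorrelation method with the Levinson-Durbin recursion. The order-selection rule (pick the smallest order whose residual energy falls below a fraction of the frame energy) and the names `choose_order`, `max_order` and `target_ratio` are illustrative assumptions; the abstract does not say how the order is chosen, and the frequency-domain technique it discusses is not reproduced here.

```python
def autocorrelation(frame, max_lag):
    """Autocorrelation r[0..max_lag] of one speech frame."""
    n = len(frame)
    return [
        sum(frame[i] * frame[i + lag] for i in range(n - lag))
        for lag in range(max_lag + 1)
    ]


def levinson_durbin(r, order):
    """Solve for LP coefficients a[1..order] with x[n] ~ sum_k a[k] * x[n-k].

    Returns (coefficients, residual_energy).
    """
    a = [0.0] * (order + 1)
    err = r[0]
    for i in range(1, order + 1):
        if err <= 0.0:
            break  # frame is already perfectly predicted (or silent)
        acc = r[i] - sum(a[j] * r[i - j] for j in range(1, i))
        k = acc / err  # reflection coefficient for this stage
        new_a = a[:]
        new_a[i] = k
        for j in range(1, i):
            new_a[j] = a[j] - k * a[i - j]
        a = new_a
        err *= 1.0 - k * k
    return a[1:], err


def choose_order(frame, max_order=10, target_ratio=0.05):
    """ASSUMED rule: smallest order with residual energy <= target_ratio * r[0]."""
    r = autocorrelation(frame, max_order)
    if r[0] == 0.0:
        return 0  # silent frame, no predictor needed
    for order in range(1, max_order + 1):
        _, err = levinson_durbin(r, order)
        if err / r[0] <= target_ratio:
            return order
    return max_order
```

A rule of this kind tends to pick low orders for frames that are easy to predict, which is the case where the abstract warns that higher formants may be matched poorly.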

19.
The l2-norm sparse representation (l2-SR) based face recognition method has attracted increasing attention due to its excellent performance, simple algorithm and high computational efficiency. However, one drawback of l2-SR is that the test sample may differ conspicuously from the training samples, even those of the same class, so the method shows poor robustness. Another drawback is that l2-SR does not perform well in identifying the training samples that are trivial in correctly classifying the test sample. In this paper, to avoid these imperfections, we propose a novel l2-SR. It first identifies the training samples that are important in correctly classifying the test sample and then neglects components that cannot be represented by the training samples. The proposed method also involves an in-depth analysis of l2-SR and provides novel ideas for improving previous methods. Experimental results on face datasets show that the proposed method can greatly improve l2-SR.

20.
Blocking artifacts, characterized by visually noticeable changes in pixel values along block boundaries, are a common problem in block-based image/video compression, especially in low-bitrate coding. Various post-processing techniques have been proposed to reduce blocking artifacts, but they usually introduce excessive blurring or ringing effects. This paper proposes a self-learning-based post-processing framework for image/video deblocking by formulating deblocking as an MCA (morphological component analysis)-based image decomposition problem via sparse representation. Without needing any prior knowledge about the blocking artifacts to be removed (e.g., the positions where blocking artifacts occur, the algorithm used for compression, or the characteristics of the image to be processed), the proposed framework can automatically learn two dictionaries for decomposing an input decoded image into its “blocking component” and “non-blocking component.” More specifically, the proposed method first decomposes a frame into low-frequency and high-frequency parts by applying the BM3D (block-matching and 3D filtering) algorithm. The high-frequency part is then decomposed into a blocking component and a non-blocking component by performing dictionary learning and sparse coding based on MCA. As a result, the blocking component can be removed from the image/video frame while preserving most of the original visual details. Experimental results demonstrate the efficacy of the proposed algorithm.
