Coded apertures are known to be an effective technique for removing optical blur from defocused images, but the design of the aperture shape itself has not been well explored. In this paper we present a practical design method with a detailed implementation of the aperture optimization; in particular, we evaluate and compare crossover operators, which strongly affect the performance of the genetic algorithm used for the optimization. When deblurred images are used for recognition, their quality should be judged not by visual quality but by the resulting recognition rate. We therefore also show, through experiments, the relationship between a PSNR-based image-quality criterion and the actual recognition rates.
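The aperture optimization described above hinges on the crossover operator of the genetic algorithm. Below is a minimal, illustrative sketch of two common crossover operators applied to binary aperture masks; the mask resolution, the operators, and the mutation rate are assumptions for illustration, not the configuration evaluated in the paper (whose fitness would be, e.g., the PSNR or recognition rate of the deblurred images).

```python
# Illustrative crossover operators for binary coded-aperture masks in a
# genetic algorithm. Mask size, operators, and mutation rate are
# assumptions, not the paper's configuration.
import numpy as np

rng = np.random.default_rng(0)
MASK_SIZE = 7  # hypothetical aperture resolution (7x7 binary mask)

def random_mask():
    return rng.integers(0, 2, size=(MASK_SIZE, MASK_SIZE), dtype=np.uint8)

def uniform_crossover(a, b):
    """Each cell is taken from parent a or parent b with equal probability."""
    take_a = rng.random(a.shape) < 0.5
    return np.where(take_a, a, b)

def single_point_crossover(a, b):
    """Split the flattened masks at one random point and concatenate."""
    flat_a, flat_b = a.ravel(), b.ravel()
    cut = rng.integers(1, flat_a.size)
    child = np.concatenate([flat_a[:cut], flat_b[cut:]])
    return child.reshape(a.shape)

def mutate(mask, rate=0.02):
    """Flip each cell independently with a small probability."""
    flips = rng.random(mask.shape) < rate
    return np.bitwise_xor(mask, flips.astype(np.uint8))

parent1, parent2 = random_mask(), random_mask()
child = mutate(uniform_crossover(parent1, parent2))
print(child)
```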
Proceedings of the IEEE International Conference on Computer Vision 2015, 3568-3576, February 17, 2015, Peer-reviewed
The central projection model that is commonly used to model cameras as well as projectors results in similar advantages and disadvantages in both types of system. In active stereo systems built from a projector and a camera, the central projection model creates several problems; among them, the narrow depth range and the need for a wide baseline are crucial. In this paper, we solve these problems by introducing a light field projector, which can project a depth-dependent pattern. The light field projector is realized by attaching a coded aperture with a high-frequency mask in front of the lens of a video projector that also projects a high-frequency pattern. Because the light field projector cannot be approximated by a thin-lens model and no precise calibration method has yet been established, we propose an image-based approach to apply a stereo technique to the system. Although image-based techniques usually require a large database and often imply heavy computational costs, we propose a hierarchical approach and a feature-based search as a solution. The experiments confirm that our method can accurately recover the dense shape of curved and textured objects over a wide range of depths from a single captured image.
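A minimal sketch of the image-based matching idea, under assumptions not taken from the paper: reference pattern images are pre-captured at known depths, and the observed window around a pixel is compared with the corresponding window in each reference by normalized cross-correlation. The brute-force search shown here is only for clarity; the paper's hierarchical, feature-based search is what keeps such a database lookup tractable.

```python
# Toy image-based depth lookup against a database of pattern images
# pre-captured at known depths. Window size, NCC score, and brute-force
# search are illustrative assumptions.
import numpy as np

def ncc(a, b, eps=1e-9):
    """Normalized cross-correlation between two equally sized windows."""
    a = a - a.mean()
    b = b - b.mean()
    return float((a * b).sum() / (np.linalg.norm(a) * np.linalg.norm(b) + eps))

def depth_at(observed, reference_by_depth, y, x, half=8):
    """Pick the database depth whose reference window best matches the
    observed window centred at (y, x)."""
    win = observed[y - half:y + half + 1, x - half:x + half + 1]
    scores = {
        depth: ncc(win, ref[y - half:y + half + 1, x - half:x + half + 1])
        for depth, ref in reference_by_depth.items()
    }
    return max(scores, key=scores.get)

# Hypothetical usage with synthetic data:
rng = np.random.default_rng(1)
refs = {d: rng.random((128, 128)) for d in (400, 500, 600)}  # depths in mm
observed = refs[500] + 0.05 * rng.random((128, 128))          # noisy copy
print(depth_at(observed, refs, 64, 64))                       # -> 500
```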
IPSJ Transactions on Computer Vision and Applications 6, 45-52, 2014, Peer-reviewed
This paper addresses the problem of recognizing defocused patterns. Although recognition algorithms assume that input images are focused and sharp, this does not always hold for actual camera-captured images. Thus, a method that can recognize defocused patterns is required. In this paper, we propose a novel recognition framework for defocused patterns that relies on a single camera without a depth sensor. The framework is based on the coded aperture, which can recover a less-degraded image from a defocused one if the depth is available. However, in the setting of a single camera without a depth sensor, depth estimation is ill-posed and requires an additional assumption. To solve the problem, we introduce a new assumption suitable for pattern recognition: the templates are known. This is based on the fact that, in pattern recognition, all templates must be available in advance for training. The experiments confirmed that the proposed method is fast and robust to defocus and scaling, especially for heavily defocused patterns.
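Since every template is available in advance, one simple way to realize this assumption is to pre-blur each template with a set of candidate PSFs and match the defocused input against the blurred templates. The sketch below uses a pillbox PSF and a brute-force search over blur radii as illustrative stand-ins for the coded-aperture PSF and the method actually proposed in the paper.

```python
# Toy recognition of a defocused pattern by blurring known templates.
# The pillbox PSF and exhaustive radius search are illustrative only.
import numpy as np
from scipy.signal import fftconvolve

def disc_psf(radius, size=31):
    """Illustrative circular (pillbox) PSF; a coded-aperture PSF would
    be used instead in the actual framework."""
    y, x = np.mgrid[-(size // 2):size // 2 + 1, -(size // 2):size // 2 + 1]
    psf = (x ** 2 + y ** 2 <= radius ** 2).astype(float)
    return psf / psf.sum()

def recognize(defocused, templates, radii=(1, 3, 5, 7)):
    """Return the index of the template whose blurred version best
    correlates with the defocused input (templates and input are
    assumed to be arrays of the same size)."""
    best, best_score = None, -np.inf
    for idx, tpl in enumerate(templates):
        for r in radii:
            blurred = fftconvolve(tpl, disc_psf(r), mode="same")
            a = blurred - blurred.mean()
            b = defocused - defocused.mean()
            score = (a * b).sum() / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-9)
            if score > best_score:
                best, best_score = idx, score
    return best
```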
Stereo matching can obtain a depth map of a scene without projecting any light onto it, but the correspondence between images becomes unstable when the texture or edges on the object form a repetitive pattern or are parallel to the epipolar line. On the other hand, depth from defocus, which estimates distance from the fact that the diameter of the circle of confusion varies with distance, has the drawback that its depth accuracy is limited by the lens aperture. In this work, we therefore fuse stereo and depth from defocus by giving different focus distances to the left and right cameras of a stereo pair and further introducing coded apertures, and we propose a depth estimation method that combines the stability and accuracy of both approaches.
2013 IEEE INTERNATIONAL CONFERENCE ON COMPUTATIONAL PHOTOGRAPHY (ICCP 2013), 2013, Peer-reviewed
This paper shows that a random and distinct light-sensitivity shape for each pixel improves the performance of super-resolution using multiple input images. Since the spatial light-sensitivity distribution in each pixel of an image sensor is rectangular and identical, the imaging process is equivalent to point sampling of a blurred image, i.e., the convolution of a rectangle with the original image. This convolution results in a loss of the high-spatial-frequency components of the original image, which limits the performance of super-resolution. Thus, we sprayed a fine-grained black powder on an image sensor to give a random code to the spatial light-sensitivity distribution in each pixel. This approach was combined with a reconstruction technique based on sparse regularization, which is commonly used in compressed sensing, in an experiment with an actual setup. A high-resolution image was reconstructed from a limited number of input images, and the performance of super-resolution was significantly improved.
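A toy sketch of the measurement model and a sparse reconstruction in the spirit of the paper: each low-resolution pixel integrates the high-resolution scene weighted by a random per-pixel code (emulating the sprayed powder), several shifted shots are captured, and the image is recovered by ISTA with an L1 prior. The image size, shifts, code, and pixel-domain sparsity are illustrative assumptions, not the paper's actual setup.

```python
import numpy as np

rng = np.random.default_rng(2)
R = 2          # downsampling factor (high-res pixels per sensor pixel side)
H = W = 32     # high-res image size
K = 4          # number of shifted input images

# Ground-truth high-res image: sparse dots, so a plain L1 prior on the
# pixels is a reasonable sparsity model for this toy example.
x_true = np.zeros((H, W))
x_true[rng.integers(0, H, 20), rng.integers(0, W, 20)] = 1.0

# Random per-(high-res)-pixel light sensitivity, emulating the powder;
# an ordinary sensor would have code == 1 everywhere.
code = rng.random((H, W))
shifts = [(0, 0), (0, 1), (1, 0), (1, 1)][:K]   # shifts in high-res pixels

def forward(x, shift):
    """Shift, weight by the pixel code, and integrate over RxR blocks."""
    s = np.roll(x, shift, axis=(0, 1)) * code
    return s.reshape(H // R, R, W // R, R).sum(axis=(1, 3))

def adjoint(y, shift):
    """Adjoint of `forward`: spread each measurement back over its block."""
    up = np.kron(y, np.ones((R, R))) * code
    return np.roll(up, (-shift[0], -shift[1]), axis=(0, 1))

ys = [forward(x_true, s) for s in shifts]

# ISTA: x <- soft(x + t * A^T(y - A x), t * lam)
lam, t = 0.01, 1.0 / (K * R * R)
x = np.zeros_like(x_true)
for _ in range(300):
    grad = sum(adjoint(forward(x, s) - y, s) for s, y in zip(shifts, ys))
    z = x - t * grad
    x = np.sign(z) * np.maximum(np.abs(z) - t * lam, 0.0)

print("relative reconstruction error:",
      np.linalg.norm(x - x_true) / np.linalg.norm(x_true))
```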
2013 IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 209-216, 2013, Peer-reviewed
In this paper, we propose a novel depth measurement method that fuses depth from defocus (DFD) and stereo. One of the problems of the passive stereo method is the difficulty of finding correct correspondences between images when an object has a repetitive pattern or edges parallel to the epipolar line. On the other hand, the accuracy of the DFD method is inherently limited by the effective diameter of the lens. Therefore, we propose fusing stereo and DFD by giving different focus distances to the left and right cameras of a stereo pair equipped with coded apertures. The two types of depth cues, defocus and disparity, are naturally integrated through the magnification and phase shift of a single point spread function (PSF) per camera. We prove the proportional relationship between the defocus diameter and the disparity, which makes calibration easy. We also show, through simulations and actual experiments, the outstanding performance of our method, which combines the advantages of both depth cues.
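The proportionality between defocus diameter and disparity can be sanity-checked numerically under a thin-lens model. The snippet below, with illustrative parameters rather than the paper's derivation or calibration, shows that the blur-circle diameter divided by the disparity offset from the in-focus plane equals the constant ratio of aperture diameter to baseline.

```python
# Thin-lens sanity check of the defocus-disparity relation (illustrative
# parameter values, not the paper's setup).
f   = 0.05    # focal length [m]
A   = 0.02    # aperture diameter [m]
B   = 0.10    # stereo baseline [m]
Z_f = 1.0     # focus distance [m]

v_f = f * Z_f / (Z_f - f)              # sensor (image-plane) distance

def blur_diameter(Z):
    v = f * Z / (Z - f)                # image distance of a point at depth Z
    return A * abs(v_f - v) / v        # blur-circle diameter on the sensor

def disparity(Z):
    return B * v_f / Z                 # metric disparity on the sensor

for Z in (0.6, 0.8, 1.5, 2.0, 3.0):
    c = blur_diameter(Z)
    d_offset = abs(disparity(Z) - disparity(Z_f))
    print(Z, c / d_offset)             # ~= A / B = 0.2 for every depth
```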
In this paper, we propose a method for accurately measuring the shape of objects by suppressing indirect reflection such as interreflection and subsurface scattering. We modulate the slit light with an M-sequence shifted along the slit so that it can be accurately detected in the captured image in two ways. This method has two advantages: one is the propagation characteristics of high-spatial-frequency components, and the other is the geometric constraint between the projector and the camera. Prior to the measurement, the epipolar constraint is obtained through calibration, and the phase consistency is then evaluated to suppress interreflection. The cross-correlation value is used to suppress the dilation of the light caused by subsurface scattering.
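A small sketch of the underlying idea, with illustrative parameters rather than the paper's actual code and projector setup: an M-sequence is generated with an LFSR, the observed signal along the slit is a shifted, attenuated, noisy copy, and circular cross-correlation recovers the shift thanks to the sequence's sharp autocorrelation peak, which indirect light mostly lacks.

```python
import numpy as np

def m_sequence(taps=(7, 6), length=127, seed=0b1111111):
    """Generate a ±1 maximal-length sequence with a 7-bit LFSR
    (taps x^7 + x^6 + 1); taps and length are illustrative choices."""
    state = seed
    bits = []
    for _ in range(length):
        bits.append(state & 1)
        fb = ((state >> (taps[0] - 1)) ^ (state >> (taps[1] - 1))) & 1
        state = (state >> 1) | (fb << (taps[0] - 1))
    return 2 * np.array(bits) - 1   # map {0,1} -> {-1,+1}

code = m_sequence()

# Simulated observation: the projected code arrives shifted along the
# slit, attenuated, and with additive noise (stand-ins for the real
# imaging conditions).
rng = np.random.default_rng(3)
true_shift = 40
observed = 0.6 * np.roll(code, true_shift) + 0.2 * rng.standard_normal(code.size)

# Circular cross-correlation; the single sharp autocorrelation peak of
# an m-sequence makes the recovered shift robust.
corr = np.array([np.dot(observed, np.roll(code, s)) for s in range(code.size)])
print("estimated shift:", int(np.argmax(corr)))   # -> 40
```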
ACM TRANSACTIONS ON GRAPHICS 28(3), 1-8, August 2009, Peer-reviewed
We show a new camera-based interaction solution in which an ordinary camera can detect small optical tags from a relatively large distance. Current optical tags, such as barcodes, must be read within a short range, and the codes occupy valuable physical space on products. We present a new low-cost optical design so that the tags can be shrunk to 3 mm visible diameter, and unmodified ordinary cameras several meters away can be set up to decode the identity plus the relative distance and angle. The design exploits the bokeh effect of ordinary camera lenses, which maps rays exiting from an out-of-focus scene point into a disk-like blur on the camera sensor. This bokeh-code, or Bokode, is a barcode design with a simple lenslet over the pattern. We show that a code with 15 µm features can be read using an off-the-shelf camera from distances of up to 2 meters. We use intelligent binary coding to estimate the relative distance and angle to the camera, and show potential for applications in augmented reality and motion capture. We analyze the constraints and performance of the optical system, and discuss several plausible application scenarios.
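A back-of-the-envelope note on why such tiny features remain readable (the numbers below are assumptions, not the prototype's specification): because the pattern sits at the focal plane of the Bokode lenslet, each feature emits a nearly collimated beam, and a camera focused at infinity magnifies the pattern by roughly f_camera / f_bokode independently of the distance to the tag, so a 15 µm feature can span many sensor pixels at a typical pixel pitch of a few micrometres.

```python
# Angular magnification of a Bokode-style tag viewed by a camera focused
# at infinity (illustrative focal lengths, not the actual prototype).
f_bokode = 0.005          # lenslet focal length [m] (assumed)
f_camera = 0.050          # camera focal length [m] (assumed)
feature  = 15e-6          # feature size on the pattern [m]

magnification = f_camera / f_bokode
print(feature * magnification)   # ~150 micrometres on the sensor
```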
CVPR: 2009 IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION, VOLS 1-4, 2544-2551, 2009, Peer-reviewed
Exaggerated defocus cannot be created with an ordinary compact digital camera because of its tiny sensor size, so it is hard to take pictures that draw the viewer's attention to the main subject. On the other hand, there are many methods for controlling the focus and defocus of previously taken pictures. However, most of these methods require purpose-built equipment such as a camera array to take the pictures. Therefore, in this paper we propose a method to create images focused at any depth with an arbitrarily blurred background from a set of images taken by a handheld compact digital camera moved at random. Using our method, it is possible to produce various aesthetic blurs by changing the size, shape, or density of the blur kernel. In addition, we confirm the potential of our method through a subjective evaluation of blurred images created by our system.
Kyokai Joho Imeji Zasshi/Journal of the Institute of Image Information and Television Engineers 63(6), 857-865, 2009, Peer-reviewed
Exaggerated defocus cannot be achieved with an ordinary compact digital camera because of its tiny sensor size, so taking pictures that draw the attention of a viewer to the subject is hard. Many methods are available for controlling the focus and defocus of previously taken pictures. However, most of these methods require custom-built equipment such as a camera array to take pictures. Therefore, in this paper, we describe a method for creating images focused at any depth with an arbitrarily blurred background from a set of images taken by a handheld compact digital camera that is moved at random. Our method can produce various aesthetic blurs by changing the size, shape, or density of the blur kernel. In addition, we demonstrate the potential of our method through a subjective evaluation of blurred images created by our system.
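A minimal sketch of the shift-and-average (synthetic aperture) idea behind this kind of refocusing, assuming per-image registration offsets are already known; it is not the paper's pipeline. Scaling the offsets selects the depth that stays aligned while everything else blurs, and the per-image weights play the role of the blur kernel's shape and density.

```python
import numpy as np

def refocus(images, offsets, depth_scale, weights=None):
    """Shift-and-average refocusing.

    images      : list of HxW grayscale arrays from the handheld burst
    offsets     : per-image 2-D camera offsets from registration (in pixels
                  at a reference depth)
    depth_scale : scales the offsets so that the chosen depth aligns;
                  points off that depth stay misaligned and blur
    weights     : optional per-image weights; their distribution over the
                  synthetic aperture shapes the resulting bokeh
    """
    if weights is None:
        weights = np.ones(len(images))
    acc = np.zeros_like(images[0], dtype=float)
    for img, (dy, dx), w in zip(images, offsets, weights):
        shift = (int(round(depth_scale * dy)), int(round(depth_scale * dx)))
        acc += w * np.roll(img, shift, axis=(0, 1))   # integer shifts only
    return acc / np.sum(weights)
```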
We developed a system for directing real-world operations from a remote site. The system consists of several sets of cameras, projectors, and PCs connected to each other via a network. First, the 3-D shape of the object is measured using a pattern-light projection method and sent to the remote PC. A supervisor can observe a CG rendering of the object from any viewpoint and draw annotation figures on it. The direction message is sent to the real field and projected onto the object by the projectors. The projected annotations are geometrically well aligned because all cameras and projectors are calibrated with a single reference object. The worker does not need to wear any equipment such as an HMD, and the use of multiple projectors avoids occlusion by the worker's body. Directions for alignment tasks involving both existing and new objects are also implemented.
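Because a projector follows the same pinhole projection model as a camera, a calibrated projector can display an annotation by projecting each 3-D annotation point through its projection matrix. A minimal sketch with placeholder calibration values (hypothetical, not the system's actual calibration):

```python
import numpy as np

def project(P, X):
    """Project a 3-D point X (metres) to pixel coordinates using a
    3x4 projection matrix P = K [R | t] obtained from calibration."""
    x = P @ np.append(X, 1.0)          # homogeneous image coordinates
    return x[:2] / x[2]

# Hypothetical calibration of one projector (values are placeholders).
K = np.array([[1400.0, 0.0, 512.0],
              [0.0, 1400.0, 384.0],
              [0.0, 0.0, 1.0]])
R = np.eye(3)
t = np.array([[0.1], [0.0], [0.0]])    # projector 10 cm to the side
P = K @ np.hstack([R, t])

annotation_3d = np.array([0.0, 0.05, 1.5])   # a point on the measured surface
print(project(P, annotation_3d))             # projector pixel to illuminate
```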