A team of researchers from Yale-NUS College has developed new computer vision and deep learning approaches to extract more accurate data from videos degraded by environmental factors such as rain, snow, and nighttime conditions. The team also improved the accuracy of 3D human pose estimation in videos.
Computer vision technology, which is used in applications such as automatic surveillance systems, autonomous vehicles, and health and social distancing tools, is often affected by environmental conditions that degrade the quality of the extracted data.
The new research was presented at the 2021 Computer Vision and Pattern Recognition (CVPR) Conference.
Environmental impact on images
Conditions such as low light and human-made lighting effects like glare, glow, and spotlights degrade nighttime images. Rainy scenes are similarly affected by rain streaks and rain accumulation.
The research team was led by Robby Tan, an associate professor of science at Yale-NUS College.
“Many computer vision systems, like automatic surveillance and self-driving cars, rely on clear visibility of input video to function well. For example, self-driving cars cannot perform robustly in heavy rain, and automatic CCTV surveillance systems often fail at night, especially if the scenes are dark or there is strong glare or spotlights,” said Assoc. Professor Tan.
The team relied on two separate studies that introduced deep learning algorithms to improve the quality of night and rain videos.
The first study focused on increasing brightness while simultaneously suppressing noise and light effects such as glare, glow, and spotlights to produce clear nighttime images. The technique aims to improve the clarity of nighttime images and videos even when glare is unavoidable, which existing methods have yet to achieve.
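The tension the study addresses can be illustrated with a toy sketch: brightening dark pixels amplifies glare unless glare regions are treated separately. The function below is not the paper's method (which is a learned deep network); it is a minimal hand-crafted illustration, with the gamma and threshold values chosen arbitrarily for the example.

```python
import numpy as np

def enhance_night_frame(frame, gamma=0.5, glare_thresh=0.9, glare_damp=0.6):
    """Toy illustration of nighttime enhancement with glare suppression.

    Brightens dark regions with a gamma curve (gamma < 1 lifts low values
    more than high ones) while damping pixels that are likely saturated
    glare or spotlight regions. `frame` is a float array scaled to [0, 1].
    The actual CVPR 2021 approach learns this decomposition end-to-end;
    the thresholds here are illustrative assumptions only.
    """
    frame = np.clip(frame, 0.0, 1.0)
    brightened = frame ** gamma          # lifts dark pixels toward visibility
    glare_mask = frame > glare_thresh    # pixels dominated by glare/glow
    out = np.where(glare_mask, frame * glare_damp, brightened)
    return np.clip(out, 0.0, 1.0)
```

A naive global gamma correction would brighten the glare regions along with everything else; the separate handling of the two pixel populations is the point the study's learned model makes far more carefully.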
In countries where heavy rain is common, rain accumulation reduces visibility in videos. The second study addressed this problem with a method based on frame alignment, which recovers visual information obscured by rain streaks by exploiting the fact that streaks appear at random positions in different frames. The team also used a moving camera to estimate scene depth, which allowed them to remove the rain veiling (haze) effect. While existing methods focus on removing rain streaks, the new method removes rain streaks and the veiling effect simultaneously.
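The intuition behind frame alignment can be sketched in a few lines: once neighbouring frames are aligned to a reference view, rain streaks land at different pixels in each frame, so a per-pixel temporal median recovers the stable background. This is only an illustration of why alignment helps, assuming the alignment has already been done; the paper's method is learned end-to-end and additionally handles the depth-dependent veiling effect.

```python
import numpy as np

def remove_rain_streaks(aligned_frames):
    """Toy illustration of temporal rain-streak suppression.

    `aligned_frames` is a list of HxW float arrays already warped to a
    common reference view. Because streaks occupy different pixels in
    each frame, the per-pixel median across time keeps the background
    and rejects the transient streaks. This is a classical baseline,
    not the learned method described in the study.
    """
    stack = np.stack(aligned_frames, axis=0)   # shape (T, H, W)
    return np.median(stack, axis=0)
```

Note that a median over unaligned frames of a moving camera would blur the background instead, which is why alignment is the prerequisite step.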
3D human pose estimation
Along with the new techniques, the team also presented their research on estimating human pose in 3D, which can be used in video surveillance, video games and sports broadcasting.
3D multi-person pose estimation from monocular video (video taken from a single camera) has been the subject of increasing research in recent years. Unlike multi-camera setups, monocular video is more flexible because it can be captured with a single device, such as a cell phone.
However, the accuracy of human detection drops in busy scenes with multiple individuals, especially when people closely interact with or occlude one another in monocular video.
The team’s third study estimates 3D human pose from video by combining two existing families of methods, the top-down and bottom-up approaches. The new method produces more reliable pose estimates in multi-person scenes than either approach alone, and is better equipped to handle varying distances between individuals.
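The complementary strengths of the two approaches can be sketched with a simple fusion toy: top-down estimators tend to be reliable for well-separated people, while bottom-up estimators cope better with overlap. The code below assumes each estimator returns per-joint 3D coordinates with confidences, and fuses them with a confidence-weighted average. This is a hypothetical illustration of the combination idea, not the paper's architecture, which integrates the two branches inside one network.

```python
import numpy as np

def fuse_pose_estimates(top_down, bottom_up):
    """Toy confidence-weighted fusion of two pose estimates.

    Each argument is a tuple (joints, conf) where `joints` has shape
    (J, 3) for J joints in 3D and `conf` has shape (J,) with values
    in [0, 1]. Joints where one branch is more confident are pulled
    toward that branch's estimate. All names and shapes here are
    assumptions made for the sketch.
    """
    (td_joints, td_conf), (bu_joints, bu_conf) = top_down, bottom_up
    w_td = td_conf / (td_conf + bu_conf + 1e-8)   # per-joint weight
    return w_td[:, None] * td_joints + (1.0 - w_td[:, None]) * bu_joints
```

With equal confidences the fusion reduces to a plain average; when, say, the top-down branch loses confidence on an occluded person, the bottom-up estimate dominates for those joints.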
“In the next step of our 3D human pose estimation research, which is supported by the National Research Foundation, we will examine how to protect private information in videos. For the visibility enhancement methods, we strive to contribute to advancements in the field of computer vision, as they are critical for many applications that can affect our daily lives, such as making self-driving cars perform better in adverse weather conditions,” said Assoc. Professor Tan.