A Faculty of Engineering Paper Presented at the Prestigious CVPR Conference

The paper by Shay Dekel, a PhD student of Prof. Yosi Keller, deals with computer vision and explores the estimation of the rotation between two input images that view the same object.

A paper by PhD student Shay Dekel was presented at CVPR, the Conference on Computer Vision and Pattern Recognition. The conference is affiliated with IEEE, the Institute of Electrical and Electronics Engineers, which covers the fields of electrical engineering, electronics, computers, and software. CVPR is one of the most important conferences in the realm of computer vision, and its acceptance rate is only about 20%.

Dekel’s research, supervised by Prof. Yosi Keller, focuses on neural networks for machine vision, or computer vision. “In general, neural networks are based on learning, which means there’s a period of learning or training during which the network adapts and learns from examples. Then, when it comes to crunch time and you actually use them, they receive new information that they’ve never been exposed to and make inferences based on the knowledge they’ve accumulated,” explains Dekel. “Neural networks for computer vision receive images as input, process them with a network-based algorithm, and produce information about the image. A classic example is a neural network that identifies the object in a picture. If you train the network to identify a cat based on numerous cat images, you expect that in real time, even if the network has never seen that particular cat image before, the algorithm will be able to identify it as a cat. It’s a fairly new field of research; until about a decade ago, it was practically considered science fiction.”
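The train-then-infer pattern Dekel describes can be illustrated with a short sketch. The following PyTorch snippet is a generic toy example, not the paper’s model: the tiny network, the synthetic images, and the cat/not-cat labels are all illustrative assumptions.

```python
# A minimal sketch of the train-then-infer pattern described above.
# The tiny network and the synthetic "cat vs. not-cat" data are
# illustrative placeholders, not the paper's actual model.
import torch
import torch.nn as nn

# Toy binary classifier: flattens a small image and predicts cat / not-cat.
model = nn.Sequential(
    nn.Flatten(),
    nn.Linear(3 * 32 * 32, 64),
    nn.ReLU(),
    nn.Linear(64, 2),  # two classes: 0 = not-cat, 1 = cat
)

# --- Training stage: the network adapts and learns from labeled examples. ---
images = torch.randn(100, 3, 32, 32)   # stand-in for real cat photos
labels = torch.randint(0, 2, (100,))   # stand-in for human-provided labels
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

for epoch in range(10):
    optimizer.zero_grad()
    loss = loss_fn(model(images), labels)
    loss.backward()
    optimizer.step()

# --- Inference stage: a new image the network has never seen before. ---
model.eval()
with torch.no_grad():
    new_image = torch.randn(1, 3, 32, 32)
    prediction = model(new_image).argmax(dim=1)
    print("predicted class:", prediction.item())
```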

Dekel’s paper explores the estimation of the angle between two input images that view the same object. “In my case, the network receives two images with minimal or nonexistent overlap. It picks up clues from both images and can tell what the angle between them is. If the two images overlap, the network finds identical features. In an extreme scenario where there’s no overlap whatsoever, the network detects hints such as straight building outlines, sidewalks, or objects whose shape can be analyzed,” Dekel elaborates. “In fact, the network learns that a building stands perpendicular to the ground, or that sidewalks are parallel to it. The policy of what to ‘look at’ in images is automatically learned during the training stage.”
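For context, the classical way to recover the rotation between two views relies on exactly the overlap that Dekel’s setting removes: matching identical features in both images and decomposing the essential matrix. The OpenCV sketch below shows that baseline under assumed inputs (the file names and the camera matrix K are placeholders); it breaks down when overlap is minimal or nonexistent, which is the gap the learned approach addresses.

```python
# Classical baseline for relative-rotation estimation between two
# OVERLAPPING views: match local features, then decompose the essential
# matrix. File names and the camera matrix K are illustrative assumptions.
import cv2
import numpy as np

img1 = cv2.imread("view_a.jpg", cv2.IMREAD_GRAYSCALE)
img2 = cv2.imread("view_b.jpg", cv2.IMREAD_GRAYSCALE)

# 1. Detect and match local features in both images.
orb = cv2.ORB_create(2000)
kp1, des1 = orb.detectAndCompute(img1, None)
kp2, des2 = orb.detectAndCompute(img2, None)
matches = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True).match(des1, des2)

pts1 = np.float32([kp1[m.queryIdx].pt for m in matches])
pts2 = np.float32([kp2[m.trainIdx].pt for m in matches])

# 2. Estimate the essential matrix (with assumed intrinsics K) and
#    recover the relative rotation R between the two camera poses.
K = np.array([[800.0, 0, 320], [0, 800.0, 240], [0, 0, 1]])
E, mask = cv2.findEssentialMat(pts1, pts2, K, method=cv2.RANSAC)
_, R, t, _ = cv2.recoverPose(E, pts1, pts2, K, mask=mask)

# 3. Express the rotation as a single angle (axis-angle magnitude, degrees).
angle = np.degrees(np.linalg.norm(cv2.Rodrigues(R)[0]))
print(f"estimated relative rotation: {angle:.1f} degrees")
```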

In his research, Dekel used transformer networks, which are relatively new to the world of computer vision. “The transformer learns what to pay more attention to in images. If, say, there’s an image of grass, sky, and a person, the network can figure out that the person is more significant than the sky or the grass,” says Dekel. “We used three transformer types: one that knows how to cross-check information from both images; one that encodes important information; and one that ‘refines’ the encoded information, filtering only what’s necessary out of the whole sea of information. In one example, the two images have minimal overlap, but the networks still managed to identify that the angle between them is 72 degrees horizontal and 27 degrees vertical.”
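The cross-checking role of the first transformer type can be sketched with a generic cross-attention layer, in which tokens from one image query tokens from the other. This PyTorch snippet is a schematic of the mechanism only, not the paper’s architecture; all dimensions, names, and the final angle head are illustrative assumptions.

```python
# Schematic of cross-attention between two views: tokens from image A
# "query" tokens from image B, so information is cross-checked between
# the images. Generic sketch, not the paper's architecture; all
# dimensions and names are illustrative assumptions.
import torch
import torch.nn as nn

dim, num_tokens = 256, 196                  # e.g., 14x14 patch tokens per image
feats_a = torch.randn(1, num_tokens, dim)   # features extracted from image A
feats_b = torch.randn(1, num_tokens, dim)   # features extracted from image B

cross_attn = nn.MultiheadAttention(embed_dim=dim, num_heads=8, batch_first=True)

# Tokens from image A attend to tokens from image B: the attention weights
# express how much each A-token "pays attention" to each B-token.
fused, attn_weights = cross_attn(query=feats_a, key=feats_b, value=feats_b)

# A small head could then map the fused tokens to two predicted angles
# (horizontal and vertical), as in the example quoted above.
angle_head = nn.Linear(dim, 2)
angles = angle_head(fused.mean(dim=1))  # shape: (1, 2)
print(angles.shape)
```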

The research shows particular promise in the field of robotics, of all things. “In robotics, the camera acts as the robot’s eyes,” explains Dekel. “If the robot strays off course, this kind of algorithm can estimate the deviation and return the robot to its course effortlessly.”
