Autonomous vehicles depend on a sense as crucial to them as vision is to human beings. It is not just a matter of the machine being able to see, which it already can, but of looking, analyzing, discriminating and acting within milliseconds. The challenge is to reproduce this human ability in just the time needed to make the necessary decision. For a machine, for example, seeing a tree next to the road is easy. The difficult part is knowing that it is not an object that is going to move or get in the way, and the opposite if it is a pedestrian. The scientific journal Nature publishes this Wednesday two advances in this regard: a processor that responds quickly to an event with minimal information, and a new system (an algorithm) that improves the precision of machine vision with lower latency (response time).
This research, fundamental to the development of self-driving vehicles and robotics, already has advanced counterparts at the Institute of Microelectronics of Seville (Imse), a center belonging to the Spanish National Research Council (CSIC) and the University of Seville. Multinationals such as Samsung and Sony already use patents sold by the company Prophesee.
The two papers published in Nature are innovations on systems based on foveation, the human mechanism that maximizes resolution in the area where vision is focused while lowering it in the non-relevant peripheral field. In this way, the amount of information is reduced, but the capacity to visually recognize the data essential for making decisions in milliseconds is maintained.
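The idea of foveation can be illustrated with a small sketch (purely illustrative, not code from either paper): pixels near the gaze point are kept at full resolution, while peripheral tiles are collapsed to a single averaged value, shrinking the data that must be processed.

```python
def foveate(image, cx, cy, radius, block=4):
    """Foveated sampling sketch: tiles whose center falls inside the
    foveal radius around the gaze point (cx, cy) are kept at full
    resolution; each peripheral tile is reduced to one averaged value.
    Returns a list of (row, col, value) samples."""
    h, w = len(image), len(image[0])
    samples = []
    for ty in range(0, h, block):
        for tx in range(0, w, block):
            ys = range(ty, min(ty + block, h))
            xs = range(tx, min(tx + block, w))
            # Tile center, used to decide fovea vs. periphery.
            cy_t = ty + block // 2
            cx_t = tx + block // 2
            if (cx_t - cx) ** 2 + (cy_t - cy) ** 2 <= radius ** 2:
                # Inside the fovea: keep every pixel.
                samples += [(y, x, image[y][x]) for y in ys for x in xs]
            else:
                # Periphery: one averaged sample per tile.
                tile = [image[y][x] for y in ys for x in xs]
                samples.append((ty, tx, sum(tile) / len(tile)))
    return samples
```

On a 64x64 image with a foveal radius of 8 pixels, this keeps a few hundred samples instead of 4,096, while the center of gaze stays at full detail.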
The key is accurate interpretation of the scene and rapid motion detection to allow immediate reactions. Conventional cameras can capture a frame and reproduce it at very high resolution, but all that information has to be processed and filtered, a cost in time and resources incompatible with the instantaneous decisions required by autonomous driving or advanced robotics.
One of the advances is authored by Daniel Gehrig, a researcher at the University of Pennsylvania (USA), and Davide Scaramuzza, professor of robotics at the University of Zurich (Switzerland). Both have addressed the difficulty of making decisions from high-resolution color images: these require a large bandwidth to be processed with the necessary fluidity, and reducing that bandwidth comes at the cost of greater latency, that is, more time to respond. The alternative is to use an event camera, which processes continuous streams of pulses, but at the sacrifice of precision.
To address these limitations, the authors have developed a hybrid system that achieves effective object detection with minimal latency. Their algorithm combines information from two cameras: one that slows down the color frame rate to reduce the bandwidth required, and an event camera that compensates for the resulting latency, ensuring that fast-moving objects, such as pedestrians and cars, can still be detected. “The results pave the way toward efficient and accurate object detection, especially in extreme scenarios,” the researchers say.
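A heavily simplified sketch of the hybrid idea (the function and event format below are illustrative assumptions, not the authors' code): the slow color frames provide accurate detections, and the event stream nudges each detection between frames so fast-moving objects are not lost.

```python
def track_between_frames(detection, events, frame_t):
    """Shift a bounding box (x, y, w, h) detected at time `frame_t`
    using the event stream that arrives afterwards.
    Each event is (t, x, y, dx, dy): timestamp, position, and a local
    motion estimate. Events inside the box drag it along."""
    x, y, w, h = detection
    for t, ex, ey, dx, dy in sorted(events):
        # Only events after the frame, and inside the current box.
        if t >= frame_t and x <= ex < x + w and y <= ey < y + h:
            x += dx
            y += dy
    return (x, y, w, h)
```

Because events arrive continuously rather than every 33 milliseconds, the box can be updated with sub-millisecond latency between two color frames.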
“It’s a breakthrough. Current driver assistance systems such as those from Mobileye, which are built into more than 140 million cars worldwide, work with standard cameras that take 30 frames per second, or one image every 33 milliseconds. Additionally, they require a minimum of three frames to reliably detect a pedestrian or car. This brings the total time to initiate the braking maneuver to 100 milliseconds. Our system reduces this time to below one millisecond without the need for a high-speed camera, which would entail an enormous computational cost,” explains Scaramuzza.
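Scaramuzza's figures follow directly from the frame rate; the arithmetic can be written out in a few lines:

```python
# A 30 fps camera delivers one frame every ~33 ms; if three frames are
# needed for a reliable detection, ~100 ms pass before braking can begin.
FPS = 30
frame_interval_ms = 1000 / FPS              # ~33.3 ms per frame
frames_needed = 3
detection_latency_ms = frames_needed * frame_interval_ms  # 100 ms

print(round(frame_interval_ms, 1))   # 33.3
print(round(detection_latency_ms))   # 100
```

At highway speed (say 120 km/h, about 33 m/s), those 100 milliseconds correspond to roughly 3.3 meters traveled before the braking maneuver even starts, which is why cutting the detection time below one millisecond matters.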
The technology has been “transferred to a top-level company,” according to the researcher. “If approved, going from proof of concept to final implementation can typically take many years,” he adds.
For his part, Luping Shi, director of the Center for Brain-Inspired Computing Research (CBICR) at Tsinghua University (China), has developed with his team the Tianmouc chip (processor). Inspired by the way the human visual system works, it combines fast but imprecise perception, like that of human peripheral vision, with higher-resolution perception that is slower to process.
In this way, the chip also works as an event camera, which, instead of full frames, processes continuous flows of electrical impulses (events or spikes) recorded by each photosensor when it detects a sufficient change in light. “Tianmouc has an array of hybrid pixels: some with low precision but fast, event-based detection, to allow quick responses to changes without the need for much detail, and others with slow processing to produce an accurate visualization of the scene,” the researcher explains. The chip has been tested in scenarios such as a dark tunnel suddenly illuminated by a dazzling light or a road crossed by a pedestrian.
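The behavior of a single event pixel can be sketched as follows (an illustrative model of the general event-camera principle, not the Tianmouc circuit): each photosensor remembers the last light level it reported and fires a positive or negative spike only when the change since then exceeds a contrast threshold.

```python
import math

class EventPixel:
    """Toy model of one event-camera pixel: emits +1/-1 spikes on
    sufficient log-intensity change, and nothing otherwise."""

    def __init__(self, threshold=0.15):
        self.threshold = threshold   # contrast threshold (log units)
        self.last_log = None         # last reported log-intensity

    def observe(self, intensity):
        """Return +1, -1, or None (no event) for a new light reading."""
        log_i = math.log(intensity)
        if self.last_log is None:    # first reading: just store it
            self.last_log = log_i
            return None
        delta = log_i - self.last_log
        if abs(delta) >= self.threshold:
            self.last_log = log_i    # reset reference after firing
            return 1 if delta > 0 else -1
        return None                  # unchanged scene: no data at all
```

A static scene produces no output, which is exactly why a sudden dazzle at a tunnel exit, or a pedestrian stepping onto the road, stands out immediately in the event stream.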
Bernabé Linares, research professor at Imse and responsible for the highest-resolution commercial event camera, points out that Scaramuzza uses drones to collect images both conventionally and with event cameras. “The advance is the algorithm used to recognize objects, and the result is interesting,” he highlights.
Imse works mainly on processors, and Linares points out that algorithmic developments such as those at the University of Zurich are an essential complement to chips and to robotic applications. Since these are very compact technologies, they require lightweight computing systems that consume little energy. “For drones it is an important development. This type of event camera is very good for them,” he highlights.
Luping Shi’s work is closer to the developments of the Neuromorphic Systems Group at Imse. In this case, it is a hybrid processor. “The pixels alternate and spatial differences are calculated. It stores the light from one image to the next and calculates the change. If there is no modification, the difference is zero. It provides data very infrequently from a fairly sophisticated sensor,” explains Linares.
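The difference scheme Linares describes can be sketched in a few lines (an illustrative assumption about the general technique, not the chip's actual circuitry): the sensor keeps the previous value of each pixel and outputs data only where the scene has changed, so a static scene produces no output at all.

```python
def frame_difference(prev, curr, threshold=0):
    """Return (row, col, delta) only for pixels whose change between
    two frames exceeds the threshold; unchanged pixels emit nothing."""
    changes = []
    for y, (row_p, row_c) in enumerate(zip(prev, curr)):
        for x, (p, c) in enumerate(zip(row_p, row_c)):
            if abs(c - p) > threshold:
                changes.append((y, x, c - p))
    return changes
```

For two identical frames the output is empty, which is what makes the data rate so low: the sensor only "speaks" when something moves.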
Although the uses highlighted by Nature are aimed at autonomous driving, these advances in vision are also highly relevant to robotics, which likewise requires the ability to discriminate information in order to make decisions at high speed, as in industrial automation processes. “But car manufacturers are very interested because they look for all kinds of developments, since it is safer this way and they can get the best out of each technology,” explains Linares, who notes that Renault is one of Prophesee’s investors.