Peter Kovacs, SVP of aiData and Deputy CEO, aiMotive Kft.
Data pipelines for ADAS and autonomous driving have traditionally focused on ingesting large sensor datasets, enabling search and curation, and supporting manual or automated annotation of driving-relevant entities such as objects, lanes, traffic signs, and traffic lights.
With the growing adoption of end-to-end driving models and trajectory-based training targets, the requirements on these pipelines are changing. Instead of primarily ensuring coverage of perception labels and Operational Design Domain (ODD) dimensions, the focus increasingly shifts toward capturing the diversity of dynamic behavior in traffic: interactions between the ego vehicle and surrounding participants, rare maneuver patterns, and complex multi-agent situations.
This talk discusses how data infrastructure must evolve to support these needs, including approaches for identifying and curating behavior-rich data, generating additional training scenarios, and evaluating coverage of traffic interactions rather than only static scene properties.
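As a rough illustration of what evaluating interaction coverage could look like, the sketch below bins clips by interaction type and ego-speed range and reports which bins are under-represented. It is not the pipeline described in the talk: the taxonomy, the speed thresholds, the clip fields (`interaction`, `ego_speed_mps`), and the `min_count` threshold are all hypothetical choices made for this example.

```python
from collections import Counter
from itertools import product

# Hypothetical taxonomy, chosen for illustration only.
INTERACTION_TYPES = ("cut_in", "lead_braking", "unprotected_turn", "pedestrian_crossing")
SPEED_BUCKETS = ("low", "medium", "high")


def speed_bucket(ego_speed_mps: float) -> str:
    """Map ego speed to a coarse bucket (thresholds are assumptions)."""
    if ego_speed_mps < 8.0:
        return "low"
    if ego_speed_mps < 20.0:
        return "medium"
    return "high"


def interaction_coverage(clips, min_count=2):
    """Return (fraction of covered bins, list of under-represented bins)."""
    counts = Counter(
        (clip["interaction"], speed_bucket(clip["ego_speed_mps"])) for clip in clips
    )
    all_bins = list(product(INTERACTION_TYPES, SPEED_BUCKETS))
    # A bin counts as a gap when it holds fewer than min_count clips.
    gaps = [b for b in all_bins if counts[b] < min_count]
    covered = 1.0 - len(gaps) / len(all_bins)
    return covered, gaps


# Toy clip metadata standing in for labels mined from recorded drives.
clips = [
    {"interaction": "cut_in", "ego_speed_mps": 25.0},
    {"interaction": "cut_in", "ego_speed_mps": 27.5},
    {"interaction": "lead_braking", "ego_speed_mps": 12.0},
    {"interaction": "lead_braking", "ego_speed_mps": 14.0},
    {"interaction": "pedestrian_crossing", "ego_speed_mps": 5.0},
]
covered, gaps = interaction_coverage(clips)
print(f"covered bins: {covered:.0%}, gaps: {len(gaps)}")
```

The list of gaps is the useful output here: it names the behavior bins where more data would have to be mined from recordings or generated as additional scenarios, which a coverage measure based only on static scene properties would not reveal.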
End-to-end systems also change how validation must be approached. Because their internal decision logic is less interpretable than that of traditional modular stacks, scenario-based testing and simulation become even more critical. We present an approach for capturing real-world locations and reconstructing them for simulation, enabling both replay and modification of recorded traffic situations as well as the creation of new scenarios in the same environments.
Using neural rendering techniques such as Gaussian Splatting, these reconstructed environments can be rendered in real time on commodity hardware, enabling closed-loop simulation with high visual fidelity. This allows the software under test to react differently than it did during the original recording while the visual environment remains realistic. The same environment models can be used in Software-in-the-Loop and Hardware-in-the-Loop setups, significantly reducing the need to repeatedly collect data in the same physical locations while still enabling large-scale validation.