August 5, 2022

The ultimate list of lidar datasets for autonomous vehicles

As vehicles with progressively higher levels of autonomy emerge, manufacturers must get the most out of sensors such as LiDAR, exploiting how the sensors complement one another and building redundancy into the information they collect.

What is data labeling?

Data labeling is the process of identifying raw data (images, text files, videos, etc.) and adding one or more meaningful labels that give it context, so that a machine learning model can learn from it.

What is LiDAR?

LiDAR (Light Detection and Ranging) is a remote sensing technique that measures ranges (distances) to the Earth's surface, or to any other target, using light in the form of a pulsed laser. A source emits laser light, which is reflected by the objects in the scene. The system's receiver detects the reflected light, and the time of flight is used to build a distance map of the scene.
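The time-of-flight idea can be illustrated with a few lines of Python. This is a minimal sketch, not the firmware of any particular sensor: the function name `range_from_time_of_flight` is made up for this example, and it assumes the pulse travels at the vacuum speed of light and that the measured time covers the round trip to the object and back.

```python
# Speed of light in a vacuum, in metres per second.
SPEED_OF_LIGHT_M_PER_S = 299_792_458.0


def range_from_time_of_flight(round_trip_seconds: float) -> float:
    """Return the distance (in metres) to the reflecting object.

    The pulse travels to the object and back, so the one-way
    distance is half of the total path covered in that time.
    """
    if round_trip_seconds < 0:
        raise ValueError("round-trip time cannot be negative")
    return SPEED_OF_LIGHT_M_PER_S * round_trip_seconds / 2.0


if __name__ == "__main__":
    # A pulse that returns after 200 nanoseconds comes from an
    # object roughly 30 metres away.
    print(f"{range_from_time_of_flight(200e-9):.2f} m")
```

The division by two is the important detail: forgetting it doubles every distance in the map.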

Getting started with LiDAR

LiDAR works on a simple principle. A concentrated beam of light is aimed at an object, and a sensor detects its reflection. When a reflection is detected, its intensity, its angle, and its timing (or phase) are measured. A fast onboard computer feeds these values into an equation to determine the position and properties of the reflecting object. By mechanically "sweeping" the beam and receiver array, the system quickly builds up a 3D "picture" of the surrounding area. This is usually shown as a "point cloud" that helps us visualize what the LiDAR is "seeing." Beyond self-driving vehicles, LiDAR is used in a variety of scientific applications, including weather forecasting and climate studies, urban planning, deep sea research, surveying, aeronautics, and forestry.
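To make the step from individual measurements to a point cloud concrete, here is a small Python sketch. It assumes each return is reported as a range plus two beam angles (azimuth and elevation), and it assumes a coordinate convention of x forward, y left, z up. Real sensors differ in their conventions, and the function names `polar_to_cartesian` and `sweep_to_point_cloud` are hypothetical.

```python
import math


def polar_to_cartesian(range_m, azimuth_deg, elevation_deg):
    """Convert one LiDAR return into an (x, y, z) point in metres.

    Assumed convention: azimuth is measured counterclockwise from the
    x axis in the horizontal plane, elevation is measured upward from
    that plane.
    """
    azimuth = math.radians(azimuth_deg)
    elevation = math.radians(elevation_deg)
    horizontal = range_m * math.cos(elevation)  # projection onto the ground plane
    x = horizontal * math.cos(azimuth)
    y = horizontal * math.sin(azimuth)
    z = range_m * math.sin(elevation)
    return (x, y, z)


def sweep_to_point_cloud(returns):
    """Turn a list of (range, azimuth, elevation) returns into a point cloud."""
    return [polar_to_cartesian(r, az, el) for r, az, el in returns]


if __name__ == "__main__":
    # Three made-up returns from one horizontal sweep.
    sweep = [(10.0, 0.0, 0.0), (10.0, 90.0, 0.0), (5.0, 45.0, 10.0)]
    for point in sweep_to_point_cloud(sweep):
        print(tuple(round(c, 3) for c in point))
```

Sweeping the beam over many azimuth and elevation angles and applying this conversion to every return produces the list of 3D points that the datasets below store and label.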

LiDAR datasets for autonomous vehicles

1. nuScenes: nuScenes is a large-scale self-driving dataset of urban street scenes recorded in Boston, Massachusetts, and Singapore. It has 850 labeled sequences in 23 categories for multi-object tracking and 850 labeled sequences in 32 categories for semantic and panoptic segmentation.

2. Audi Autonomous Driving Dataset (A2D2): A2D2 is a driving dataset published by Audi. The data was gathered in three German cities. It has 12,499 labeled frames in 14 categories for multi-object tracking and 41,280 labeled frames in 38 categories for semantic segmentation. The labels are derived from 2D semantic segmentation of the camera images.

3. ApolloScape: ApolloScape is an autonomous driving dataset created by Baidu Research. The data was gathered in Beijing, China, under varying illumination conditions and traffic densities, using two Riegl VMX-1HA lidar sensors. It is licensed for academic use only and features 53 labeled sequences in 5 categories for multi-object tracking.

4. DENSE Seeing Through Fog: Seeing Through Fog is a driving dataset created as part of the DENSE project. The data was collected in northern Europe and covers a range of weather conditions, including fog, snow, and rain. It is licensed for academic use only and has 12,000 labeled frames in 28 categories for object detection.

Understanding LiDAR datasets in autonomous driving

LiDAR gives self-driving cars a three-dimensional image to analyze, and many experts believe such images are more precise than those from cameras. Unlike cameras, LiDAR is unaffected by shadows, sunshine, or the headlights of oncoming vehicles. According to BloombergNEF's most recent assessment, 17 auto manufacturers were producing 21 passenger vehicle models equipped with LiDAR as of January 2022. Automakers use LiDAR technology in Advanced Driving Systems (ADS) and Advanced Driver Assistance Systems (ADAS), where it gives the driver a complete picture of the vehicle's surroundings.


To be useful, LiDAR data must be labeled accurately, which is a large task that is hard to scale. The challenge for AI engineers is converting enormous amounts of unstructured data into structured data that can be used to train machine learning models. We recommend working with a managed workforce that specializes in labeling LiDAR data from autonomous vehicles.
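As a rough illustration of what "structured" means here, the Python sketch below shows how a single 3D box annotation can turn raw points into labeled points. The `CuboidLabel` class and `label_points` function are hypothetical and do not follow the schema of any dataset listed above. The sketch assumes boxes are rotated only around the vertical axis, which is a common simplification for road scenes.

```python
import math
from dataclasses import dataclass


@dataclass
class CuboidLabel:
    """A hypothetical 3D box annotation drawn by a labeler."""
    category: str
    cx: float      # box centre, metres
    cy: float
    cz: float
    length: float  # extent along the box's own x axis
    width: float   # extent along the box's own y axis
    height: float  # extent along z
    yaw: float     # rotation around the z axis, radians

    def contains(self, x: float, y: float, z: float) -> bool:
        """Return True if the point lies inside the box."""
        dx, dy, dz = x - self.cx, y - self.cy, z - self.cz
        # Rotate the offset by -yaw to express it in the box's own frame.
        cos_yaw, sin_yaw = math.cos(self.yaw), math.sin(self.yaw)
        local_x = cos_yaw * dx + sin_yaw * dy
        local_y = -sin_yaw * dx + cos_yaw * dy
        return (abs(local_x) <= self.length / 2
                and abs(local_y) <= self.width / 2
                and abs(dz) <= self.height / 2)


def label_points(points, labels, background="unlabeled"):
    """Assign each point the category of the first box that contains it."""
    tagged = []
    for point in points:
        category = background
        for label in labels:
            if label.contains(*point):
                category = label.category
                break
        tagged.append(category)
    return tagged


if __name__ == "__main__":
    car = CuboidLabel("car", cx=10.0, cy=0.0, cz=1.0,
                      length=4.0, width=2.0, height=2.0, yaw=0.0)
    cloud = [(10.0, 0.0, 1.0), (11.5, 0.5, 0.5), (0.0, 0.0, 0.0)]
    print(label_points(cloud, [car]))
```

Each box a labeler draws can classify many points at once, but every frame still needs boxes placed and checked by a person, which is why the work is hard to scale.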

If your company uses LiDAR technology, Isahit's highly qualified staff is ready to provide high-quality LiDAR data to your project team.
