Meta, Facebook’s parent company, has introduced a new AI model called the Segment Anything Model (SAM) and a corresponding dataset.

SAM is designed to identify and segment different objects within images and videos. It can segment objects even if it did not encounter them during training.

SAM is promptable, meaning it takes input prompts such as points or boxes that specify which objects to segment. It can handle complex scenes with occlusions, reflections, and shadows, which makes it a potentially valuable tool for computer vision research.
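To make point prompting concrete, here is a minimal sketch. It assumes the open-source `segment_anything` Python package that Meta published alongside the model, and a locally downloaded checkpoint file. The helper names `build_point_prompt` and `segment_with_points` are hypothetical and exist only for illustration.

```python
def build_point_prompt(clicks):
    """Turn user clicks into coordinate and label lists.

    clicks: list of (x, y, is_foreground) tuples in pixel coordinates.
    Label 1 marks a point on the object, label 0 a point on the background.
    """
    coords = [[float(x), float(y)] for x, y, _ in clicks]
    labels = [1 if is_foreground else 0 for _, _, is_foreground in clicks]
    return coords, labels


def segment_with_points(image_rgb, clicks, checkpoint_path, model_type="vit_h"):
    """Run SAM on an RGB image (H x W x 3, uint8 array) with point prompts.

    Assumption: the `segment_anything` package and NumPy are installed and
    `checkpoint_path` points to a checkpoint matching `model_type`.
    """
    import numpy as np
    from segment_anything import SamPredictor, sam_model_registry

    sam = sam_model_registry[model_type](checkpoint=checkpoint_path)
    predictor = SamPredictor(sam)
    predictor.set_image(image_rgb)  # computes the image embedding once

    coords, labels = build_point_prompt(clicks)
    masks, scores, _ = predictor.predict(
        point_coords=np.array(coords),
        point_labels=np.array(labels),
        multimask_output=True,  # ask for several candidate masks
    )
    return masks, scores
```

A box prompt would follow the same pattern, passing a `box` argument in `[x0, y0, x1, y1]` form instead of points.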

SAM was trained on a dataset of 11 million images and 1.1 billion masks, or roughly 100 masks per image on average. The masks cover a wide range of objects and categories, including plants, animals, food, and furniture. Meta describes this dataset, called SA-1B, as the largest segmentation dataset to date.

SAM’s generalization ability and the diversity of its training data allow it to segment new objects. It shows strong zero-shot performance on a variety of segmentation tasks.

Users can select objects by clicking on them. Meta has also explored text prompts such as “cat” or “chair”.

Meta claims that the tool generalizes well enough to segment objects it has not seen before.

SAM can produce multiple valid masks when a prompt is ambiguous, a crucial skill for solving segmentation in the real world.
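One way an application might deal with several candidate masks is to keep the one the model rates highest. The sketch below assumes each candidate comes with a predicted quality score, as the `segment_anything` predictor returns when several masks are requested. The function name `pick_best_mask` is hypothetical.

```python
def pick_best_mask(masks, scores):
    """Return (index, mask) for the candidate with the highest score.

    masks:  sequence of candidate masks for one ambiguous prompt
    scores: sequence of quality scores, one per mask
    """
    if len(masks) == 0 or len(masks) != len(scores):
        raise ValueError("masks and scores must be non-empty and equal in length")
    best = max(range(len(scores)), key=lambda i: scores[i])
    return best, masks[best]


# Illustrative values only: a click on a shirt could mean the pocket,
# the shirt, or the whole person.
index, mask = pick_best_mask(["pocket", "shirt", "person"], [0.71, 0.93, 0.88])
```

Taking the top score is only one policy. An interactive tool could instead show all candidates and let the user choose.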

SAM also simplifies automatic object detection and masking. Given a prompt, it can return a segmentation mask in near real time.
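As a rough sketch of fully automatic masking, the code below assumes the `SamAutomaticMaskGenerator` class from Meta's open-source `segment_anything` package, which returns one record per mask with fields such as `area` and `predicted_iou`. The helper names and the threshold values are hypothetical choices for illustration, not recommended settings.

```python
def keep_confident_masks(records, min_area=100, min_predicted_iou=0.88):
    """Filter mask records by size and predicted quality, largest first."""
    kept = [
        r for r in records
        if r["area"] >= min_area and r["predicted_iou"] >= min_predicted_iou
    ]
    return sorted(kept, key=lambda r: r["area"], reverse=True)


def generate_all_masks(image_rgb, checkpoint_path, model_type="vit_h"):
    """Segment everything in an RGB image (H x W x 3, uint8 array).

    Assumption: the `segment_anything` package is installed and
    `checkpoint_path` points to a checkpoint matching `model_type`.
    """
    from segment_anything import SamAutomaticMaskGenerator, sam_model_registry

    sam = sam_model_registry[model_type](checkpoint=checkpoint_path)
    generator = SamAutomaticMaskGenerator(sam)
    return generator.generate(image_rgb)  # list of dicts, one per mask
```

In use, the output of `generate_all_masks` would be passed to `keep_confident_masks` to drop tiny or low-quality regions before display.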

The release of SAM and its dataset marks a significant advancement in computer vision research. Its promptability, strong zero-shot performance, and ability to handle complex scenes make it a useful tool for researchers and developers.

The model’s ability to produce many valid masks when uncertainty exists also demonstrates its potential for real-world applications.

