Saturday 08 March 2025
A team of researchers has developed an artificial intelligence system that can anticipate the next steps in complex activities, such as cooking or assembling furniture, by analyzing short videos of people performing them. The system, called MANTA, combines computer vision and machine learning techniques to identify patterns in the video data and predict what will happen next.
MANTA is designed to learn from small amounts of data, which makes it useful for applications where large datasets are not available. For example, it could help train robots to perform tasks in environments where little training data exists.
The system consists of a neural network that analyzes the video frames and identifies specific objects and actions within them. It then uses this information to predict what will happen next, such as whether a person will pick up an object or move it to a different location.
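To make the idea of predicting the next action concrete, here is a minimal sketch in Python. It is not MANTA's method: the article describes a neural network working on video frames, whereas this toy swaps in a simple first-order transition-count model over action labels that are assumed to have been recognized already. The action names and the functions `fit_transitions` and `predict_next` are hypothetical and chosen only for illustration.

```python
from collections import Counter, defaultdict


def fit_transitions(sequences):
    """Count how often each action label is directly followed by another."""
    counts = defaultdict(Counter)
    for seq in sequences:
        for current, following in zip(seq, seq[1:]):
            counts[current][following] += 1
    return counts


def predict_next(counts, observed):
    """Return the most frequent follower of the last observed action.

    Returns None when nothing has been observed yet or when the last
    action was never followed by anything in the training sequences.
    """
    if not observed:
        return None
    followers = counts.get(observed[-1])
    if not followers:
        return None
    return followers.most_common(1)[0][0]


# Hypothetical action sequences, standing in for labels that a vision
# model would extract from video. These are made up, not MANTA data.
training = [
    ["reach", "pick_up", "move", "place"],
    ["reach", "pick_up", "inspect", "place"],
    ["reach", "pick_up", "move", "place"],
]

model = fit_transitions(training)
prediction = predict_next(model, ["reach", "pick_up"])
print(prediction)
```

In this toy data, "pick_up" is followed by "move" more often than by "inspect", so the sketch predicts "move". A real system would have to infer the actions from raw frames and would typically use far more context than the single most recent action.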
MANTA has been tested on several datasets, including videos of people cooking, assembling furniture, and performing other everyday tasks. The results show that the system makes accurate predictions about 80% of the time, even when given only a few seconds of video.
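A figure such as "accurate about 80% of the time" is usually a fraction of correct predictions. The article does not say which metric was used, so the following Python sketch simply assumes plain top-1 accuracy. The function name `top1_accuracy` and the labels are hypothetical, and the toy lists are invented to illustrate the arithmetic; they are not MANTA's results.

```python
def top1_accuracy(predicted, actual):
    """Fraction of positions where the predicted label equals the true one."""
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    if not actual:
        return 0.0
    correct = sum(p == a for p, a in zip(predicted, actual))
    return correct / len(actual)


# Invented example: five predicted next actions versus what actually happened.
predicted = ["pick_up", "move", "place", "stir", "pour"]
actual = ["pick_up", "move", "place", "stir", "cut"]

accuracy = top1_accuracy(predicted, actual)
print(f"{accuracy:.0%}")
```

Four of the five toy predictions match, which gives an accuracy of 0.8.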
One potential application of MANTA is in the development of autonomous robots that can perform complex tasks in various environments. For example, it could be used to train a robot to assemble furniture in a factory setting or to cook meals in a kitchen.
Another potential application is in the field of healthcare, where MANTA could be used to analyze videos of patients performing daily activities and predict whether they are at risk of falling or experiencing other health problems.
Cite this article: “AI System Predicts Complex Actions from Short Videos”, The Science Archive, 2025.
Artificial Intelligence, Machine Learning, Computer Vision, Video Analysis, Pattern Recognition, Prediction, Robotics, Autonomous Systems, Healthcare, Fall Risk Assessment







