Revamped AI benchmark measures how quickly responses are generated for user queries

29th March, 2024
rohillaamit123

A recent addition to AI benchmarking gauges the responsiveness of large language models in question-and-answer scenarios. The benchmark is built on Llama 2, a 70-billion-parameter model developed by Meta Platforms.

On Wednesday, MLCommons, an artificial intelligence benchmarking group, presented a new set of tests and results focused on how quickly high-end hardware can execute AI tasks and respond to user queries.

Two new benchmarks from MLCommons assess how quickly AI chips and systems can produce results when running powerful, data-intensive AI models. The findings shed light on how quickly applications like ChatGPT can respond to user requests.

The first of the new benchmarks specifically gauges how quickly large language models can handle question-and-answer exchanges. It is built on Llama 2, a 70-billion-parameter model created by Meta Platforms.
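
As an illustration of the kind of thing such a benchmark measures, the sketch below times a single question-and-answer generation and reports tokens per second. It is a minimal example, not MLCommons' actual test harness; the model identifier is a placeholder (the Llama 2 70B weights are gated on Hugging Face, and any causal language model can be substituted).

```python
# Minimal latency sketch (not the MLPerf harness): time how long a causal
# language model takes to answer one prompt and derive tokens per second.
import time

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_ID = "meta-llama/Llama-2-70b-chat-hf"  # placeholder; swap in any causal LM

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForCausalLM.from_pretrained(
    MODEL_ID, torch_dtype=torch.float16, device_map="auto"  # requires accelerate
)

prompt = "What are the two new MLPerf inference benchmarks?"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

start = time.perf_counter()
outputs = model.generate(**inputs, max_new_tokens=128)
elapsed = time.perf_counter() - start

# Count only the newly generated tokens, not the prompt tokens.
new_tokens = outputs.shape[-1] - inputs["input_ids"].shape[-1]
print(f"{new_tokens} tokens in {elapsed:.2f}s -> {new_tokens / elapsed:.1f} tokens/s")
```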

Furthermore, MLCommons added a second benchmark, a text-to-image generator built on Stability AI’s Stable Diffusion XL model, to its MLPerf suite of benchmarking tools.
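
For a sense of what the text-to-image test exercises, the rough sketch below times one Stable Diffusion XL generation using the diffusers library. It is an illustration under the assumption of a single CUDA GPU, not MLCommons' measurement code; the prompt is arbitrary.

```python
# Rough timing sketch for text-to-image generation with Stable Diffusion XL.
import time

import torch
from diffusers import DiffusionPipeline

pipe = DiffusionPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0", torch_dtype=torch.float16
)
pipe.to("cuda")  # assumes a single CUDA GPU is available

start = time.perf_counter()
image = pipe("a photo of a server rack in a data center").images[0]
print(f"one image generated in {time.perf_counter() - start:.2f}s")
image.save("sample.png")
```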

Based on raw performance alone, servers built around Nvidia’s H100 chips from major participants in the market such as Google, Supermicro, and Nvidia itself came out on top in both new benchmarks. A number of server vendors also submitted systems that used Nvidia’s less powerful L40S GPU.

Server builder Krai entered the image-generation benchmark with a design that uses a Qualcomm AI chip, which draws significantly less power than Nvidia’s latest processors.

Intel also presented a design built around its Gaudi2 accelerator chips, with the company describing the results as “solid.”

While raw performance is important, energy consumption must also be taken into account when deploying AI applications. Modern AI chips are power-hungry, which makes it difficult for AI companies to strike the right balance between performance and energy efficiency.
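
One common way to quantify that trade-off is performance per watt. The hedged sketch below samples an Nvidia GPU’s power draw through NVML (via the pynvml bindings) while an inference workload runs in a separate process; dividing measured throughput by the average wattage then yields an efficiency figure. The ten-second window and 0.1-second sampling interval are arbitrary choices for illustration.

```python
# Sample GPU power draw with NVML while an inference workload runs in
# another process; average watts can then be combined with measured
# throughput to estimate performance per watt.
import time

import pynvml

pynvml.nvmlInit()
handle = pynvml.nvmlDeviceGetHandleByIndex(0)  # first GPU

samples = []
start = time.perf_counter()
while time.perf_counter() - start < 10.0:  # sampling window (arbitrary)
    # nvmlDeviceGetPowerUsage reports milliwatts; convert to watts.
    samples.append(pynvml.nvmlDeviceGetPowerUsage(handle) / 1000.0)
    time.sleep(0.1)

print(f"average draw: {sum(samples) / len(samples):.0f} W over {len(samples)} samples")
pynvml.nvmlShutdown()
```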