CSC and LUMI part of AI2’s OLMo language model
The Allen Institute for AI (AI2) announced the creation of an open, state-of-the-art generative language model: AI2 OLMo (Open Language Model). OLMo will be comparable in scale to other state-of-the-art large language models at 70 billion parameters, and is expected in early 2024.
OLMo will be a uniquely open language model intended to benefit the research community by providing access and education around all aspects of model creation. AI2 is developing OLMo in collaboration with AMD and CSC, using the new GPU portion of the LUMI pre-exascale supercomputer, which runs entirely on AMD processors and is one of the greenest supercomputers in the world.
OLMo will be a new avenue for many people in the AI research community to work directly on language models for the first time. All elements of the OLMo project will be accessible – not only will the data be available, but so will the code used to create the data. The model, the training code, the training curves, and evaluation benchmarks will be open-sourced. The ethical and educational considerations around the creation of this model will also be openly shared and discussed to help guide the understanding and responsible development of language modeling technology.
– With the scientific community in mind, OLMo will be purpose-built to advance the science of language models. OLMo will be the first language model specifically designed for scientific understanding and discovery, says Hannaneh Hajishirzi, an OLMo project lead and a Senior Director of NLP Research at AI2.
– AI2’s deep heritage in natural language processing (NLP) with AMD’s history of supporting the scientific community through our high-performance computing efforts are a perfect match for OLMo. With the new OLMo initiative from AI2, which is geared for science, we have the capability to extend our knowledge into generative AI using the impressive capabilities from the LUMI supercomputer powered by AMD EPYC™ CPUs and AMD Instinct™ accelerators, said Ian Ferreira, senior director, AI Solutions, AMD.
– OLMo will be something special. In a landscape where many are rushing to cash in on the business potential of generative language models, AI2 has the unique ability to bring our world-class expertise together with world-class hardware from AMD and LUMI to produce something explicitly designed for scientists and researchers to engage with, learn from, and use to create the next generation of safe, effective AI technologies, notes Noah Smith, an OLMo project lead and a Senior Director of NLP Research at AI2.
Pekka Manninen, Director of Science and Technology at CSC, adds:
– Generative AI carries the potential of being the breakthrough technology of this decade, analogous to how search engines and smartphones penetrated our society in the previous decades. Open, transparent, and explainable Large Language Models (LLMs) are vital for the democratization of this technology. We are proud to be part of this collaboration for its great societal impact and technological ambition level, and happy that we can contribute to it with the LUMI supercomputer and our expertise. Supercomputers like LUMI can accelerate LLM training by an order of magnitude, and many other features of the LUMI infrastructure position it as a leading platform for natural language processing.
Read the full text on the Allen Institute for AI website.
AMD, the AMD Arrow logo, EPYC, AMD Instinct, and combinations thereof are trademarks of Advanced Micro Devices, Inc.