The JUWELS supercomputer was used to train Teuken-7B. Copyright: Forschungszentrum Jülich / Sascha Kreklau
In late 2024, OpenGPT-X, a German consortium of artificial intelligence (AI) experts, released Teuken-7B, one of the first commercial large language models (LLMs) developed multilingually from the outset. Like other LLMs, Teuken-7B is designed to help individuals and organizations accelerate their workflows with customizable models while also protecting their data.
"This release is a great success," says Dr. Stefan Kesselheim, who is leading the project at JSC together with Dr. Andreas Herten. “It is the first model of its kind that we have trained on our computer.”
As a partner in the OpenGPT-X consortium, the Jülich Supercomputing Centre (JSC) supported Teuken-7B’s development by providing access to the center’s JUWELS supercomputer for training the model, and JSC’s AI experts were closely involved in the training itself. With its 3,744 NVIDIA A100 GPUs, JUWELS is the most powerful system of its kind in Germany and well suited to AI training tasks. The center’s upcoming flagship system, JUPITER, will lead Germany across the exascale threshold in 2025 and offer many times more computing power for AI training.
The consortium has already made progress on a number of important research questions, including how to train and operate a multilingual LLM as energy- and cost-efficiently as possible. The project has also developed a multilingual “tokenizer,” which breaks words down into individual subword components. Because it needs fewer tokens to represent the same text, this approach reduces training costs and is particularly valuable for European languages with longer word structures, such as German, Finnish, or Hungarian.
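A tokenizer’s efficiency is commonly measured as “fertility”: the average number of subword tokens produced per word, where lower fertility means fewer tokens, and therefore less compute, for the same text. The following is a minimal sketch of such a measurement, assuming the Hugging Face transformers library; the model ID shown is an assumption based on the public Teuken-7B release, and any multilingual tokenizer can be substituted.

```python
from transformers import AutoTokenizer

# Assumed Hugging Face model ID for the released Teuken-7B instruct model;
# swap in whichever tokenizer you want to evaluate.
tokenizer = AutoTokenizer.from_pretrained(
    "openGPT-X/Teuken-7B-instruct-research-v0.4",
    trust_remote_code=True,  # may be required for community-hosted tokenizers
)

# Roughly parallel sample sentences for a cross-language comparison.
samples = {
    "English": "The research centre develops energy-efficient language models.",
    "German": "Das Forschungszentrum entwickelt energieeffiziente Sprachmodelle.",
    "Finnish": "Tutkimuskeskus kehittää energiatehokkaita kielimalleja.",
}

for language, sentence in samples.items():
    tokens = tokenizer.tokenize(sentence)  # subword pieces the model actually sees
    words = sentence.split()               # naive whitespace word count
    fertility = len(tokens) / len(words)   # lower fertility => cheaper training
    print(f"{language:8s} {len(tokens):3d} tokens / {len(words):2d} words "
          f"(fertility {fertility:.2f})")
```

A whitespace split is only a rough proxy for word count, but it suffices to show how many subword pieces a tokenizer needs per word, and why a tokenizer tuned for compound-heavy European languages can lower the token budget of a training run.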
For more information about Teuken-7B, read the full release on the JSC website. For more information about OpenGPT-X, visit the consortium’s website.