‼️ After many requests, we are delighted to release the first research paper describing the method we use to create the BgGPT series of Bulgarian models (https://bggpt.ai/).

🎓 Our method, called BAM (Branch-And-Merge), is general and applicable to any language, task, or domain. Using BAM, we can fine-tune a generative AI model to acquire new skills (e.g., Bulgarian) without forgetting the skills of the base model (e.g., math, reasoning, English).

📈 We demonstrate that BAM-trained models achieve state-of-the-art performance on several new domains (e.g., Bulgarian, German).

♾️ The work is the first to explain the success of model merging as a way to mitigate forgetting when fine-tuning. We therefore propose to actively use model merging during model training.

👪 The work is a collaboration between researchers at INSAIT, Google DeepMind, Together AI, LogicStar AI, ETH Zürich, and the University of Chicago: Anton Alexandrov, Dr. Veselin Raychev, Dr. Mark Müller, Prof. Ce Zhang, Prof. Martin Vechev, Dr. Kristina Toutanova.

🔜 Expect a major new release soon: more powerful Bulgarian (and other) language models with an open license that permits commercial use, and a much improved BG Chat for the general public.

Link to paper in the comments.