OpenAI’s GPT-3 language model has been lauded for its impressive abilities, but it’s not without its drawbacks. Its massive size of 175 billion parameters makes it costly to run and environmentally unfriendly. Going smaller isn’t necessarily the solution either: smaller models like GPT-2, while more efficient, lack the robustness and versatility of their larger counterparts.
The sweet spot may lie in ‘medium’ models, such as Google’s MUM (Multitask Unified Model). MUM is a billion-parameter model that strikes a balance between efficiency and effectiveness: it can understand and generate language across 75 languages, and it outperforms GPT-3 in several tasks, including question answering and summarisation.
Another approach is to make larger models more efficient. Microsoft’s Turing-NLG, a 17-billion-parameter model, uses a technique called ‘model distillation’ to reduce its size without compromising performance. The technique trains a smaller ‘student’ model to mimic a larger ‘teacher’ model, effectively distilling the knowledge of the larger model into a smaller, more efficient one.
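To make the idea concrete, here is a minimal sketch of knowledge distillation in PyTorch. The tiny teacher and student networks, the dummy data, and the hyperparameters are illustrative assumptions, not Microsoft’s actual Turing-NLG setup; the point is the loss structure, where the student matches the teacher’s softened output distribution while still learning from the ground-truth labels.

```python
# Minimal knowledge-distillation sketch (illustrative models and data,
# not the Turing-NLG pipeline).
import torch
import torch.nn as nn
import torch.nn.functional as F

VOCAB, SEQ_LEN, TEMP = 1000, 16, 2.0  # assumed toy sizes and temperature

# A larger frozen "teacher" and a smaller trainable "student".
teacher = nn.Sequential(nn.Embedding(VOCAB, 512), nn.Flatten(),
                        nn.Linear(512 * SEQ_LEN, VOCAB))
student = nn.Sequential(nn.Embedding(VOCAB, 128), nn.Flatten(),
                        nn.Linear(128 * SEQ_LEN, VOCAB))

optimizer = torch.optim.Adam(student.parameters(), lr=1e-3)
tokens = torch.randint(0, VOCAB, (32, SEQ_LEN))   # dummy batch of token ids
targets = torch.randint(0, VOCAB, (32,))          # dummy next-token labels

for step in range(100):
    with torch.no_grad():                         # teacher stays frozen
        teacher_logits = teacher(tokens)
    student_logits = student(tokens)

    # Soft loss: KL divergence to the teacher's temperature-softened outputs.
    soft_loss = F.kl_div(
        F.log_softmax(student_logits / TEMP, dim=-1),
        F.softmax(teacher_logits / TEMP, dim=-1),
        reduction="batchmean") * TEMP ** 2

    # Hard loss: standard cross-entropy against the true labels.
    hard_loss = F.cross_entropy(student_logits, targets)

    loss = 0.5 * soft_loss + 0.5 * hard_loss      # assumed equal weighting
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```

In practice the teacher would be a pretrained large model and the weighting between the soft and hard losses, like the temperature, is a tuning choice rather than a fixed recipe.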
In conclusion, size does matter when it comes to AI language models, but bigger isn’t always better. The future may lie in finding the right balance between size and efficiency, or in making larger models more efficient through techniques like model distillation.
Go to source article: https://diginomica.com/not-too-big-not-too-small-just-right-size-does-matter-when-it-comes-gen-ai-language-models