Chinese AI darling DeepSeek is back with a new open-weights large language model that promises performance rivaling the best proprietary American LLMs. Perhaps more importantly, it claims to dramatically reduce inference costs and extends support for Huawei's Ascend family of AI accelerators.
Unveiled on Friday, DeepSeek V4 comes in two new flavors, available for download from popular model repos like Hugging Face as well as through the company's API and web service. The first is a smaller 284-billion-parameter Flash mixture-of-experts (MoE) model with 13 billion active parameters, while the larger of the two is a 1.6-trillion-parameter model, 49 billion of which are in use at any given moment.
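That gap between total and active parameters is the defining trick of MoE models: a router picks only a handful of expert sub-networks per token, so most weights sit idle on any given forward pass. The sketch below illustrates the idea with a toy top-k router; the expert counts, dimensions, and weights are illustrative stand-ins, not DeepSeek's actual configuration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy MoE config (illustrative, not DeepSeek's real architecture)
n_experts = 64    # total experts in the layer
top_k = 4         # experts actually evaluated per token
d_model = 128     # token embedding width
d_ff = 512        # expert hidden width

# Each expert is a small 2-layer MLP; compare total vs. active weights
params_per_expert = d_model * d_ff + d_ff * d_model
total_params = n_experts * params_per_expert
active_params = top_k * params_per_expert  # only top_k/n_experts of the total run

def moe_forward(x):
    """Route one token vector through only the top-k scoring experts."""
    router_logits = rng.standard_normal(n_experts)   # stand-in for a learned router
    chosen = np.argsort(router_logits)[-top_k:]      # indices of the top-k experts
    weights = np.exp(router_logits[chosen])
    weights /= weights.sum()                         # softmax over the chosen experts
    out = np.zeros_like(x)
    for w, e in zip(weights, chosen):
        # Stand-in expert MLP; a real model stores fixed weights per expert
        W1 = rng.standard_normal((d_ff, d_model)) * 0.01
        W2 = rng.standard_normal((d_model, d_ff)) * 0.01
        out += w * (W2 @ np.maximum(W1 @ x, 0.0))    # ReLU MLP, gated by router weight
    return out

y = moe_forward(rng.standard_normal(d_model))
print(f"total params: {total_params:,}, active per token: {active_params:,}")
```

Scaled up, the same ratio is what lets a 1.6-trillion-parameter model charge compute for only the 49 billion weights that fire per token.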
