Chinese AI research lab DeepSeek grabbed global attention last week with the release of its open-source AI model, DeepSeek-R1. The company says the model rivals industry giants like OpenAI in critical areas such as mathematical reasoning, code generation, and cost efficiency, signalling a shift in the global AI landscape.
What is DeepSeek?
DeepSeek is an artificial intelligence research lab that emerged from Fire-Flyer, a deep-learning branch of High-Flyer, a Chinese quantitative hedge fund. Established in 2015, High-Flyer gained prominence by using advanced computing to analyse financial data. In 2023, its founder, Liang Wenfeng, redirected resources towards creating DeepSeek, with the aim of developing groundbreaking AI models.
Unlike most Chinese AI firms, DeepSeek operates independently of major tech giants such as Baidu and Alibaba. Liang’s motivation for this ambitious venture was rooted in scientific curiosity rather than immediate financial returns. “Basic science research rarely offers high returns on investment,” he remarked.
What is DeepSeek-R1?
DeepSeek-R1 is an advanced reasoning model that DeepSeek says outperforms existing models on benchmarks for several critical tasks. The model and its variants, such as DeepSeek-R1-Zero, rely on large-scale reinforcement learning (RL) and multi-stage training to achieve their capabilities.
DeepSeek has also taken a notable step by open-sourcing not just its flagship models but also six smaller distilled variants, ranging from 1.5 billion to 70 billion parameters. These models are MIT-licensed, enabling researchers and developers to freely distil, fine-tune, and commercialise their work.
DeepSeek vs OpenAI: Is there a difference?
Both OpenAI and DeepSeek have built their own large language models (LLMs). Unlike traditional models that depend on supervised fine-tuning, however, DeepSeek-R1-Zero developed robust reasoning abilities after training solely with RL, according to DeepSeek. To improve readability and address language inconsistencies in that model's output, DeepSeek then introduced DeepSeek-R1, which the company says matches OpenAI’s o1 model in performance on reasoning tasks.
DeepSeek also relied on technical designs such as multi-head latent attention (MLA) and a mixture-of-experts architecture, which made its models more cost-effective. The latest DeepSeek model required just one-tenth of the computing power used by Meta’s comparable Llama 3.1 model, according to a report by Epoch AI.
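To illustrate why a mixture-of-experts design can reduce computing cost, here is a minimal Python sketch of generic top-k expert routing. It is not DeepSeek's implementation: the softmax gating scheme, the eight toy experts, the gate scores, and the choice of k = 2 are all assumptions made for illustration. The idea it shows is that only a few experts run for each input, so most of the model's parameters sit idle on any given step.

```python
import math


def softmax(scores):
    """Convert raw gate scores into probabilities that sum to 1."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def route_top_k(gate_scores, k):
    """Return (expert_index, weight) pairs for the k highest-scoring experts.

    Weights are renormalised so the selected experts' weights sum to 1.
    """
    probs = softmax(gate_scores)
    top = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]
    norm = sum(probs[i] for i in top)
    return [(i, probs[i] / norm) for i in top]


def moe_forward(x, experts, gate_scores, k=2):
    """Run only the selected experts and blend their outputs by gate weight."""
    selected = route_top_k(gate_scores, k)
    output = sum(weight * experts[i](x) for i, weight in selected)
    return output, [i for i, _ in selected]


# Hypothetical toy experts: each one simply scales its input by a fixed factor.
experts = [lambda x, s=s: s * x for s in (1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0)]

# Hypothetical gate scores; in a real model a learned router produces these.
gate_scores = [0.1, 2.0, -1.0, 0.5, 3.0, 0.0, -0.5, 1.0]

y, used = moe_forward(1.0, experts, gate_scores, k=2)
print(f"experts used: {used} of {len(experts)}, output: {y:.3f}")
```

In this toy setup only two of the eight experts are evaluated for the input, which is the source of the savings: the total parameter count can grow while the work done per input stays roughly constant.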
Who are the people behind DeepSeek?
Liang Wenfeng, born in 1985, is a Chinese entrepreneur and the founder and CEO of DeepSeek. He is also the co-founder of the quantitative hedge fund High-Flyer. Liang’s educational background includes a Bachelor of Engineering in electronic information engineering and a Master of Engineering in information and communication engineering from Zhejiang University.
In 2016, he co-founded the quantitative investment firm Ningbo High-Flyer, which utilised mathematics and AI for investment strategies. Liang expanded his focus on AI by founding High-Flyer AI in 2019, which specialised in AI algorithms and applications. Through DeepSeek, Liang has positioned himself at the forefront of AI research.