Wednesday, January 29, 2025

Product and Process Industrial Engineering of Generative AI - LLM Models - DeepSeek Story

DeepSeek's products, and the processes used to build them, reflect the cost-reduction and productivity-improvement orientation that industrial engineering promotes in engineering activities and tasks.

Modern Industrial Engineering - A Book of Online Readings.

365+ lessons and articles, and 100+ case studies on industrial engineering.

https://www.academia.edu/126612353/Modern_Industrial_Engineering_A_Book_of_Online_Readings


29.1.2025

🎉 DeepSeek-R1 is now live and open source, rivaling OpenAI's Model o1. Available on web, app, and API. 


https://www.deepseek.com/



On 29 November 2023, DeepSeek released the DeepSeek-LLM series of models, with 7B and 67B parameters in both Base and Chat variants, developed to compete with other LLMs available at the time. The paper claimed benchmark results higher than those of most open-source LLMs of the time, especially Llama 2. As with DeepSeek Coder, the code was released under the MIT license, with the DeepSeek license applying to the model itself.


7 May 2024

DeepSeek released DeepSeek-V2, a strong Mixture-of-Experts (MoE) language model characterized by economical training and efficient inference.

It comprises 236B total parameters, of which 21B are activated for each token, and supports a context length of 128K tokens. DeepSeek-V2 adopts innovative architectures including Multi-head Latent Attention (MLA) and DeepSeekMoE. MLA guarantees efficient inference through significantly compressing the Key-Value (KV) cache into a latent vector, while DeepSeekMoE enables training strong models at an economical cost through sparse computation. Compared with DeepSeek 67B, DeepSeek-V2 achieves significantly stronger performance, and meanwhile saves 42.5% of training costs, reduces the KV cache by 93.3%, and boosts the maximum generation throughput to 5.76 times. We pretrain DeepSeek-V2 on a high-quality and multi-source corpus consisting of 8.1T tokens, and further perform Supervised Fine-Tuning (SFT) and Reinforcement Learning (RL) to fully unlock its potential. Evaluation results show that, even with only 21B activated parameters, DeepSeek-V2 and its chat versions still achieve top-tier performance among open-source models.
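As a rough illustration of where the KV-cache saving comes from, the sketch below compares the per-token cache size of standard multi-head attention with a single compressed latent vector per layer, which is the core idea of MLA. The dimensions here are hypothetical, chosen only to show the mechanism; they are not DeepSeek-V2's actual configuration.

```python
def mha_kv_bytes_per_token(n_layers, n_heads, head_dim, bytes_per_elem=2):
    # Standard attention caches one key and one value vector per head,
    # per layer, for every token (factor of 2 = K and V), in fp16.
    return 2 * n_layers * n_heads * head_dim * bytes_per_elem

def mla_cache_bytes_per_token(n_layers, latent_dim, bytes_per_elem=2):
    # MLA instead caches only a compressed latent vector per layer;
    # keys and values are reconstructed from it at attention time.
    return n_layers * latent_dim * bytes_per_elem

# Hypothetical dimensions, for illustration only:
standard = mha_kv_bytes_per_token(n_layers=60, n_heads=128, head_dim=128)
latent = mla_cache_bytes_per_token(n_layers=60, latent_dim=576)
print(f"standard KV cache: {standard} bytes/token")
print(f"latent cache:      {latent} bytes/token")
print(f"reduction:         {1 - latent / standard:.1%}")
```

With these made-up dimensions the latent cache is under 2% of the full KV cache, which shows how a reduction on the order of the paper's reported 93.3% is plausible; the exact figure depends on the real model's dimensions.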

https://arxiv.org/abs/2405.04434


GitHub page on DeepSeek-V2

https://github.com/deepseek-ai/DeepSeek-V2


Model Architecture for Lower Cost

DeepSeek-V2 adopts innovative architectures to guarantee economical training and efficient inference:

For attention, MLA (Multi-head Latent Attention) uses low-rank key-value joint compression to eliminate the inference-time key-value cache bottleneck, supporting efficient inference.

For Feed-Forward Networks (FFNs), DeepSeek-V2 adopts the DeepSeekMoE architecture, a high-performance MoE design that enables training stronger models at lower cost.
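To make the sparse-computation idea concrete, here is a minimal top-k MoE routing sketch in plain Python. It is not DeepSeekMoE itself (which adds refinements such as shared experts and fine-grained expert segmentation); it only shows why activated parameters, rather than total parameters, drive per-token compute: the router scores all experts, but only the chosen top-k experts' FFNs actually run.

```python
import math
import random

random.seed(0)
d_model, n_experts, top_k = 8, 4, 2

def rand_matrix(rows, cols):
    return [[random.gauss(0, 1) for _ in range(cols)] for _ in range(rows)]

def matvec(m, v):
    return [sum(w * x for w, x in zip(row, v)) for row in m]

# Each "expert" here is a single weight matrix standing in for an FFN.
experts = [rand_matrix(d_model, d_model) for _ in range(n_experts)]
router = rand_matrix(n_experts, d_model)

def moe_forward(x):
    # The router scores every expert, but only the top-k run for this
    # token, so compute scales with activated (not total) parameters.
    scores = matvec(router, x)
    chosen = sorted(range(n_experts), key=lambda i: scores[i])[-top_k:]
    exps = [math.exp(scores[i]) for i in chosen]
    weights = [e / sum(exps) for e in exps]  # softmax over chosen experts
    out = [0.0] * d_model
    for w, i in zip(weights, chosen):
        for j, y in enumerate(matvec(experts[i], x)):
            out[j] += w * y
    return out, chosen

token = [random.gauss(0, 1) for _ in range(d_model)]
out, chosen = moe_forward(token)
print(f"output dim: {len(out)}, experts used: {len(chosen)} of {n_experts}")
```

DeepSeek-V2's figure of 236B total parameters with only 21B activated per token is this same pattern at scale: most expert weights sit idle for any given token.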



R1
On 20 November 2024, DeepSeek-R1-Lite-Preview became accessible via DeepSeek's API. It was trained for logical inference, mathematical reasoning, and real-time problem-solving.

The DeepSeek-R1 model provides responses comparable to those of other contemporary LLMs, such as OpenAI's GPT-4o and o1, despite being trained at a significantly lower cost (stated at US$6 million, versus the roughly $100 million reported for OpenAI's GPT-4 in 2023) and reportedly requiring about a tenth of the computing power of a comparable LLM.


In December 2024, DeepSeek released the base model DeepSeek-V3-Base and the chat model DeepSeek-V3. The model architecture is essentially the same as V2's.


By 27 January 2025, the DeepSeek chat app had surpassed ChatGPT as the highest-rated free app on the iOS App Store in the United States. The chatbot reportedly answers questions, solves logic problems, and writes computer programs on par with other chatbots on the market.



26.12.2024


🚀 Introducing DeepSeek-V3

Biggest leap forward yet

⚡ 60 tokens/second (3x faster than V2!)

💪 Enhanced capabilities

🛠 API compatibility intact

🌍 Fully open-source models & papers


https://api-docs.deepseek.com/news/news1226




Cost Reduction of DeepSeek Products

The Steps

[D] How exactly did Deepseek R1 achieve massive training cost reductions, most posts I read are about its performance, RL, chain of thought, etc, but it’s not clear how the cost of training of the model was brought down so drastically


DeepSeek’s AI Cuts $95M in Costs and 98% of GPUs

https://www.linkedin.com/pulse/deepseeks-ai-cuts-95m-costs-98-gpusthe-disruption-big-tech-lfhof/


DeepSeek’s Optimization Strategy: Redefining AI Cost and Efficiency.

Posted on Jan 29


By focusing on cost reduction, open-source collaboration, and efficient model architectures, DeepSeek is redefining what’s possible in AI—democratizing access and challenging the status quo.


As AI continues to evolve, one thing is clear: the future belongs to those who can do more with less. And DeepSeek is leading the way.


Technical Report

PDF Available

Analysis of Efficiency Enhancement through DEEPSEEK Technology and DIKWP Semantic Space Transformation Interaction

January 2025

DOI:10.13140/RG.2.2.29761.67684

Authors:

Yucong Duan, Hainan University

Zhendong Guo, Hainan University

https://www.researchgate.net/publication/388437776_Analysis_of_Efficiency_Enhancement_through_DEEPSEEK_Technology_and_DIKWP_Semantic_Space_Transformation_Interaction


6.2.2025

Event - Deploying DeepSeek V3 and DeepSeek-R1 on Amazon SageMaker



Speakers:

Supreeth S Angadi | GenAI/ML Startups Solution Architect, AWS

Pradipta Dash | Senior Startups Solutions Architect, AWS

Sourabh Jain | Sr. GenAI Startups Account Manager, AWS


Language:

English





Event details



Day: Thursday, February 6, 2025


Time: 10:00 AM - 4:00 PM India Time


Type: In person

Place: Bagmane Constellation Business Park Block-7, Bagmane Constellation Service Rd, Ferns City, Doddanekkundi, Bengaluru, Karnataka 560048, IN




Are you a startup founder or machine learning (ML) engineer looking to effectively deploy and manage AI models while optimizing costs?


Join us for an intensive hands-on workshop exploring Amazon SageMaker Studio's unified ML development environment and learn production-ready strategies for model deployment.


DeepSeek is a cutting-edge family of large language models that has gained significant attention in the AI community for its impressive performance, cost-effectiveness, and open-source nature. DeepSeek offers a range of models including the powerful DeepSeek-V3, the reasoning-focused DeepSeek-R1, and various distilled versions. These models stand out for their innovative architecture, utilizing techniques like Mixture-of-Experts and Multi-Head Latent Attention to achieve high performance with lower computational requirements.


In this hands-on workshop, you'll learn about Amazon SageMaker Studio's comprehensive toolkit to self-host large language models from DeepSeek while maintaining cost efficiency.


Who is this for? This workshop is ideal for:


Startup founders and technical leaders creating AI solutions

ML Engineers and Data Scientists

DevOps professionals managing GenAI/ML infrastructure

Technical decision-makers evaluating GenAI/ML platforms

Developers interested in self-hosting open-source LLMs

Engineers looking to optimize their GenAI/ML infrastructure costs

During this hands-on workshop, you'll learn how to leverage Amazon SageMaker Studio's unified environment to streamline your ML workflows and implement cost-effective model deployment strategies.


Key highlights:


Master Amazon SageMaker Studio's unified interface and development environment

Hands-on implementation of self-hosting DeepSeek and similar models

Deploy cost-optimization strategies including scale-to-zero capabilities

Enhance inference performance using Fast Model Loader and container caching

Best practices for managing GenAI/ML development lifecycle

Real-world examples of production GenAI/ML infrastructure optimization

Interactive troubleshooting and optimization sessions

This workshop is specifically designed for startup teams who want to productionize GenAI/ML infrastructure while maintaining cost efficiency. You'll gain hands-on experience with Amazon SageMaker's advanced features and learn practical strategies for managing GenAI/ML workloads.



Prerequisites:


Laptop with adequate specifications for hands-on exercises

Basic understanding of machine learning concepts

Familiarity with Python programming

AWS account access

Basic knowledge of container technologies

Understanding of ML deployment concepts


https://aws.amazon.com/startups/events/deploying-deepseek-v3-and-deepseek-r1-on-amazon-sagemaker-q1



https://en.wikipedia.org/wiki/DeepSeek














