🚨🚨 DeepSeek-R1 !! big 3 day tech crash!? 🚨🚨 short NVDA for 5 days at market open monday for $$$
hold TSM though
https://finance.yahoo.com/quote/TSM/
big 3 day tech crash?:
====
https://www.zerohedge.com/news/2025-01-25/sixteen-trillion-dollar-question
omg!!!
= = =
DeepSeek-R1
free, ready-to-run open-weight MODELS at all sizes from 1.5B up to a massive 671B.
https://github.com/deepseek-ai/DeepSeek-R1?tab=readme-ov-file#2-model-summary
- DeepSeek-R1-Distill-Qwen-1.5B
- DeepSeek-R1-Distill-Qwen-7B
- DeepSeek-R1-Distill-Llama-8B
- DeepSeek-R1-Distill-Qwen-14B
- DeepSeek-R1-Distill-Qwen-32B (requant to 5 bit for 24GB VRAM 4090, sweet! quick math right below the list)
- DeepSeek-R1-Distill-Llama-70B [has some censorship]
https://huggingface.co/deepseek-ai/DeepSeek-R1-Distill-Llama-70B
- DeepSeek-R1 671B !!!!! :
https://huggingface.co/deepseek-ai/DeepSeek-R1
All free! 99.999% uncensored. fast, and at or near the top of the open-model benchmarks!!!
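quick sanity check on the 32B-on-a-4090 claim. rough math only, ignores KV cache and runtime overhead, and assumes a plain ~5-bit weight-only quant:

# rough VRAM estimate for DeepSeek-R1-Distill-Qwen-32B at ~5 bits per weight
params = 32e9               # ~32 billion weights
bits_per_weight = 5.0       # e.g. a Q5-style weight-only quant (assumed)
weight_gb = params * bits_per_weight / 8 / 1e9
print(f"weights alone: ~{weight_gb:.1f} GB")   # ~20 GB, leaves a few GB of the 24GB card for KV cache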
= = = = =
https://github.com/deepseek-ai/DeepSeek-V3?tab=readme-ov-file#6-how-to-run-locally
big thread:
https://news.ycombinator.com/item?id=42768072
DeepSeek-V3 cost only about $5.58 million to train: 2,664,000 H800 GPU hours of pre-training (2.788M hours for the full run), using FP8 (8-bit) mixed-precision training
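the headline number checks out from the GPU hours alone (assuming the ~$2 per H800-hour rental rate the paper's estimate uses; the full run including context extension and post-training is the 2.788M figure):

# back-of-envelope: DeepSeek-V3 training cost from GPU hours
pretrain_hours    = 2_664_000   # pre-training share (the figure quoted in the thread)
total_gpu_hours   = 2_788_000   # full training run (paper figure)
usd_per_h800_hour = 2.0         # assumed rental price, same as the paper's estimate
print(f"pretrain: ~${pretrain_hours * usd_per_h800_hour / 1e6:.2f}M")    # ~$5.33M
print(f"total:    ~${total_gpu_hours * usd_per_h800_hour / 1e6:.2f}M")   # ~$5.58M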
no censorship in the 32B version
Run DeepSeek R1 with Ollama and Cline in VScode - 100% Local Solution:
https://www.youtube.com/watch?v=oeBDn6vclz0
How to Run DeepSeek R1 Locally (Better Than OpenAI o1) Tutorial:
https://www.youtube.com/watch?v=LJMqT-FZOX
This open source AI crushes everything - DeepSeek R1:
=====
https://www.youtube.com/watch?v=xCQXyZkMsbs
How to Run DeepSeek-R1 Locally | The FREE Open-Source Reasoning AI:
https://www.youtube.com/watch?v=rzMEieMXYFA
Install and Run Locally DeepSeek-R1 AI Model on Windows:
https://www.youtube.com/watch?v=1zsK0U33hN4
Chinese AI platform DeepSeek overtakes ChatGPT on Apple's app download rankings in the US
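if you'd rather skip the videos, here's a minimal Python sketch of talking to a locally running Ollama server (assumes you already pulled a deepseek-r1 tag with ollama and it's listening on the default localhost:11434; the model tag and prompt are just examples):

# minimal local chat with a DeepSeek-R1 distill via Ollama's REST API (no cloud, no API keys)
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"   # Ollama's default local endpoint
MODEL = "deepseek-r1:32b"                            # swap for 7b / 14b / 70b to match your VRAM

payload = json.dumps({
    "model": MODEL,
    "prompt": "Explain in two sentences why MoE models activate only part of their weights per token.",
    "stream": False,          # return one JSON blob instead of a token stream
}).encode("utf-8")

req = urllib.request.Request(OLLAMA_URL, data=payload, headers={"Content-Type": "application/json"})
with urllib.request.urlopen(req) as resp:
    print(json.loads(resp.read())["response"])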
======
= = = =
https://arxiv.org/abs/2412.19437
DeepSeek-V3, a strong Mixture-of-Experts (MoE) language model with 671B total parameters with 37B activated for each token. To achieve efficient inference and cost-effective training, DeepSeek-V3 adopts Multi-head Latent Attention (MLA) and DeepSeekMoE architectures, which were thoroughly validated in DeepSeek-V2. Furthermore, DeepSeek-V3 pioneers an auxiliary-loss-free strategy for load balancing and sets a multi-token prediction training objective for stronger performance. We pre-train DeepSeek-V3 on 14.8 trillion diverse and high-quality tokens, followed by Supervised Fine-Tuning and Reinforcement Learning stages to fully harness its capabilities. Comprehensive evaluations reveal that DeepSeek-V3 outperforms other open-source models and achieves performance comparable to leading closed-source models. Despite its excellent performance, DeepSeek-V3 requires only 2.788M H800 GPU hours for its full training. In addition, its training process is remarkably stable. Throughout the entire training process, we did not experience any irrecoverable loss spikes or perform any rollbacks.
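to make the '671B total / 37B active' line concrete: in an MoE layer a router sends each token to only a few experts, so just a slice of the total weights actually runs per token. toy top-k router below in plain numpy, just to show the idea, NOT DeepSeek's actual MLA/DeepSeekMoE code (expert count, k, and sizes are made up):

# toy Mixture-of-Experts forward pass: each token only touches its top-k experts
import numpy as np

d_model, n_experts, top_k = 64, 8, 2          # made-up toy sizes, nothing like the real model
rng = np.random.default_rng(0)
W_router = rng.normal(size=(d_model, n_experts))
experts = [rng.normal(size=(d_model, d_model)) for _ in range(n_experts)]

def moe_forward(x):
    """x: (d_model,) single token. only top_k of n_experts ever touch it."""
    logits = x @ W_router
    chosen = np.argsort(logits)[-top_k:]                          # top-k expert indices
    gate = np.exp(logits[chosen]) / np.exp(logits[chosen]).sum()  # softmax over the chosen experts
    return sum(g * (x @ experts[i]) for g, i in zip(gate, chosen))

out = moe_forward(rng.normal(size=d_model))
print(out.shape, f"-> used {top_k}/{n_experts} experts, ~{top_k/n_experts:.0%} of expert params active")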
- - - - - -
DeepSeek reportedly has ~50,000 Nvidia Hopper GPUs [probably the export-gimped H800 variants, same idea as the 4090D]
DeepSeek reportedly has ~$450 million in pretraining GPU cards, ~200 grunt employees, and just ~17 core AI experts
DeepSeek has rocked the LLM market: API pricing dozens of times cheaper than o1, with benchmarks in the same league or better!
ai pings : @QuestionEverything , @x0x7 , @Monica , @MasterSuppressionTechnique , @prototype , @observation1 , @taoV , @SecretHitler, @Master_Foo, @Crackinjokes, @Sheitstrom, @FuckShitJesus, @wyrmblut