Latest Tech News

Stay updated with the latest in technology, AI, cybersecurity, and more

Filtered by: torch

Show HN: Learn LLMs LeetCode Style

TorchLeet is broken into two sets of questions. Question Set: a collection of PyTorch practice problems, ranging from basic to hard, designed to enhance your skills in deep learning and PyTorch. LLM Set: a new set of questions focused on understanding and implementing Large Language Models (LLMs) from scratch, including attention mechanisms, embeddings, and more. Note: avoid using GPT; try to solve these problems on your own. The goal is to learn and understand PyTorch concepts deeply. …
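To give a flavor of what the LLM Set covers, here is a minimal single-head scaled dot-product attention in PyTorch; an illustrative sketch, not one of TorchLeet's actual problems:

```python
import torch
import torch.nn.functional as F

def scaled_dot_product_attention(q, k, v):
    """Single-head attention: softmax(QK^T / sqrt(d)) V."""
    d = q.size(-1)
    scores = q @ k.transpose(-2, -1) / d ** 0.5  # (batch, seq, seq)
    weights = F.softmax(scores, dim=-1)          # each query's weights over the keys
    return weights @ v                           # (batch, seq, d)

q = k = v = torch.randn(2, 5, 16)  # batch=2, seq=5, dim=16
print(scaled_dot_product_attention(q, k, v).shape)  # torch.Size([2, 5, 16])
```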

AI leadership development platform Praxis Labs sells to Torch

Praxis Labs, a learning development platform, announced on Thursday that it has been acquired by the leadership and coaching enterprise Torch for an undisclosed amount. “As a small company with fewer than 20 people serving companies as large as Amazon, we knew we needed to build powerful partnerships, across product and go-to-market, to reach more companies,” Praxis Labs co-founder and CEO Elise Smith told TechCrunch about the reason for the sale. She and her co-founder, Heather Shen, met Torch CEO …

Show HN: Microjax – JAX in two classes and six functions

Microjax: JAX in two classes and six functions, or read on GitHub (I recommend actually running the notebook, either on your own computer or on Colab). This is inspired by Andrej Karpathy's Micrograd, a PyTorch-like library in about 150 lines of code. Despite PyTorch's popularity, I prefer the way JAX works because it has a more functional style. This tutorial borrows heavily from Matthew J. Johnson's great 2017 presentation on the predecessor to JAX, autograd: Video / Slides / Code. My main contribution …
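To show the functional style being praised here, a tiny example using JAX itself (plain jax.grad, not Microjax): gradients come from transforming a pure function into a new function, rather than from mutating tensors with .backward().

```python
import jax
import jax.numpy as jnp

def loss(w, x, y):
    """A pure function: no hidden state, no in-place side effects."""
    pred = x @ w
    return jnp.mean((pred - y) ** 2)

grad_loss = jax.grad(loss)   # returns a new function computing d(loss)/dw
w = jnp.ones(3)
x = jnp.arange(6.0).reshape(2, 3)
y = jnp.array([1.0, 2.0])
print(grad_loss(w, x, y))    # gradient with respect to the first argument
```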

Fault Tolerant Llama training – PyTorch blog

Collaborators: Less Wright, Howard Huang, Chien-Chin Huang; Crusoe: Martin Cala, Ethan Petersen. tl;dr: we used torchft and torchtitan to train a model in a real-world environment with extreme synthetic failure rates to prove the reliability and correctness of fault-tolerant training. [Figure: training loss across 1,200 failures with no checkpoints. Note: each small spike is a non-participating worker recovering, which affects the metrics but not the model.] Introduction: We want to demonstrate torchft …
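For context on how torchft fits into a training loop, here is a minimal sketch in the spirit of the torchft README; the exact class names and signatures are an assumption and may differ between versions, so treat this as an outline rather than a definitive usage:

```python
import torch
import torch.nn as nn
from torchft import Manager, DistributedDataParallel, Optimizer, ProcessGroupGloo

net = nn.Linear(2, 3)

# State-transfer callbacks: failed replicas recover live from peers, not from checkpoints.
def state_dict():
    return {"model": net.state_dict()}

def load_state_dict(sd):
    net.load_state_dict(sd["model"])

manager = Manager(pg=ProcessGroupGloo(),
                  load_state_dict=load_state_dict,
                  state_dict=state_dict)
model = DistributedDataParallel(manager, net)  # fault-tolerant DDP wrapper
optimizer = Optimizer(manager, torch.optim.AdamW(net.parameters()))

dataloader = [torch.randn(8, 2) for _ in range(10)]  # dummy data for illustration
for batch in dataloader:
    optimizer.zero_grad()
    loss = model(batch).sum()
    loss.backward()
    optimizer.step()  # the manager only commits steps the quorum agrees succeeded
```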

PyTorch Reshaping with None

Currently I am learning the attention mechanism from the Dive into Deep Learning book. In the book I see the following implementation in masked softmax:

```python
import torch

def sequence_mask(X, valid_len, value=-1e6):
    """X is a 2D tensor (number_of_points, max_len); valid_len is a 1D tensor (number_of_points)."""
    max_len = X.size(1)
    # Compare each position index against the row's valid length via broadcasting.
    mask = torch.arange(max_len, dtype=torch.float32, device=X.device)[None, :] < valid_len[:, None]
    X[~mask] = value  # overwrite padded positions with a large negative value
    return X
```

In sequential data processing, …
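What the None indexing does: indexing with None (equivalent to unsqueeze) inserts a size-1 axis, so the comparison broadcasts a row of positions against a column of valid lengths to produce a 2D mask. A quick standalone demonstration:

```python
import torch

valid_len = torch.tensor([1.0, 3.0])           # per-row valid lengths
arange = torch.arange(4, dtype=torch.float32)  # positions 0..3, shape (4,)

row = arange[None, :]     # shape (1, 4): positions as a single row
col = valid_len[:, None]  # shape (2, 1): one valid length per row

mask = row < col          # broadcasts to shape (2, 4)
print(mask)
# tensor([[ True, False, False, False],
#         [ True,  True,  True, False]])
```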

Deep dive into everything about Llama3: revealing detailed insights and implementation

[ View in English | Click here for the Chinese documentation ] This project is an enhanced version based on naklecha/llama3-from-scratch. It has been comprehensively improved and optimized relative to the original project, aiming to help everyone more easily understand and master the implementation principles and the detailed reasoning process of the Llama3 model. Thanks to the original author for their contributions :) The following are the core improvements of this project: Structural Optimization: the presentation …
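As one example of the components such a walkthrough steps through, here is RMSNorm as used in Llama-family models; a generic sketch, not code taken from this repository:

```python
import torch

def rms_norm(x, weight, eps=1e-6):
    """RMSNorm: scale by the reciprocal root-mean-square of the last dim.
    Unlike LayerNorm, there is no mean subtraction and no bias term."""
    rms = torch.rsqrt(x.pow(2).mean(dim=-1, keepdim=True) + eps)
    return x * rms * weight

x = torch.randn(2, 5, 8)          # (batch, seq, hidden)
weight = torch.ones(8)            # learned per-feature gain
print(rms_norm(x, weight).shape)  # torch.Size([2, 5, 8])
```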