Latest Tech News

Stay updated with the latest in technology, AI, cybersecurity, and more

Overclocking LLM Reasoning: Monitoring and Controlling LLM Thinking Path Lengths

This work investigates how large reasoning models internally track their thinking progress and how such processes can be monitored and controlled. We focus on reasoning models that explicitly segment their computations using <think> and </think> tokens (e.g., DeepSeek-R1), allowing us to study the internal dynamics of the "thinking phase." 1. Monitoring the Thinking Phase: We hypothesize that hidden states encode a token's relative position within the thinking phase. To test this, we collect hi

Just Ask for Generalization (2021)

Generalizing to what you want may be easier than optimizing directly for what you want. We might even ask for "consciousness". This blog post outlines a key engineering principle I’ve come to believe strongly in for building general AI systems with deep learning. This principle guides my present-day research tastes and day-to-day design choices in building large-scale, general-purpose ML systems. Discoveries around Neural Scaling Laws, unsupervised pretraining on Internet-scale datasets, and o

Parametric shape optimization with differentiable FEM simulation

All examples are expected to run from the examples/<example_name> directory of the Tesseract-JAX repository. In this example, you will learn how to: (1) compose both Tesseracts with Tesseract-JAX to create a pipeline that can be used for differentiable shape optimization; and (2) build a Tesseract that uses finite differences under the hood to enable differentiability of a non-autodifferentiable geometry operation (computing a signed distance field from a 3D model). In this notebook, we explore the opt