Show HN: Llama 3.1 70B on a single RTX 3090 via NVMe-to-GPU bypassing the CPU
(news.ycombinator.com)
1.
2.
BarraCUDA: Open-source CUDA compiler targeting AMD GPUs
(news.ycombinator.com)
3.
Show HN: AutoShorts – Local, GPU-accelerated AI video pipeline for creators
(news.ycombinator.com)
4.
5.
Gaussian Splatting 3 Ways
(news.ycombinator.com)
6.
TRELLIS.2: state-of-the-art large 3D generative model (4B)
(news.ycombinator.com)
7.
CUDA-L2: Surpassing cuBLAS performance for matrix multiplication through RL
(news.ycombinator.com)
8.
CUDA-L2: Surpassing cuBLAS Performance for Matrix Multiplication Through RL
(news.ycombinator.com)
9.
10.
11.
12.
CUDA Ontology
(news.ycombinator.com)
13.
14.
15.
Nvidia DGX Spark: great hardware, early days for the ecosystem
(news.ycombinator.com)
16.
Nvmath-Python: Nvidia Math Libraries for the Python Ecosystem
(news.ycombinator.com)
17.
18.
Show HN: CUDA Fractal Renderer
(news.ycombinator.com)
20.
Asynchronous Error Handling Is Hard
(news.ycombinator.com)
21.
CUDA Ray Tracing 2x Faster Than RTX: My CUDA Ray Tracing Journey
(news.ycombinator.com)
22.
Show HN: I built a tensor library from scratch in C++/CUDA
(news.ycombinator.com)