Show HN: Llama 3.1 70B on a single RTX 3090 via NVMe-to-GPU bypassing the CPU
(news.ycombinator.com)
32.
BarraCUDA: Open-source CUDA compiler targeting AMD GPUs
(news.ycombinator.com)
33.
Show HN: AutoShorts – Local, GPU-accelerated AI video pipeline for creators
(news.ycombinator.com)
35.
Gaussian Splatting 3 Ways
(news.ycombinator.com)
36.
TRELLIS.2: state-of-the-art large 3D generative model (4B)
(news.ycombinator.com)
37.
CUDA-L2: Surpassing cuBLAS performance for matrix multiplication through RL
(news.ycombinator.com)
38.
CUDA-L2: Surpassing cuBLAS Performance for Matrix Multiplication Through RL
(news.ycombinator.com)
42.
CUDA Ontology
(news.ycombinator.com)
45.
Nvidia DGX Spark: great hardware, early days for the ecosystem
(news.ycombinator.com)
46.
Nvmath-Python: Nvidia Math Libraries for the Python Ecosystem
(news.ycombinator.com)
48.
Show HN: CUDA Fractal Renderer
(news.ycombinator.com)
50.
Asynchronous Error Handling Is Hard
(news.ycombinator.com)
51.
CUDA Ray Tracing 2x Faster Than RTX: My CUDA Ray Tracing Journey
(news.ycombinator.com)
52.
Show HN: I built a tensor library from scratch in C++/CUDA
(news.ycombinator.com)