Post-transformer inference: 224× compression of Llama-70B with improved accuracy
(news.ycombinator.com)