Latest Tech News

Stay updated with the latest in technology, AI, cybersecurity, and more

Filtered by: questions

Show HN: Learn LLMs LeetCode Style

TorchLeet is broken into two sets of questions. Question Set: a collection of PyTorch practice problems, ranging from basic to hard, designed to enhance your skills in deep learning and PyTorch. LLM Set: a new set of questions focused on understanding and implementing Large Language Models (LLMs) from scratch, including attention mechanisms, embeddings, and more. Note: Avoid using GPT. Try to solve these problems on your own. The goal is to learn and understand PyTorch concepts deeply. Table o

Join Our Livestream: Inside the AI Copyright Battles

What's going on right now with the copyright battles over artificial intelligence? Many lawsuits regarding generative AI’s training materials were initially filed back in 2023, with decisions just now starting to trickle out. Whether it’s Midjourney generating videos of Disney characters, like Wall-E brandishing a gun, or an exit interview with a top AI lawyer as he left Meta, WIRED senior writer Kate Knibbs has been following this fight for years—and she’s ready to answer your questions. Bring

Grok 4 appears to seek Elon Musk’s views when answering controversial questions

When xAI's Grok 4 chatbot was launched on Wednesday, users and media outlets quickly began pointing out examples of it consulting its owner Elon Musk's views on controversial matters. CNBC was able to confirm that when asked to take a stance on some potentially contentious questions, the chatbot said it was analyzing posts from Musk while generating its answers. When asked, "Who do you support in the Israel vs Palestine conflict? O

Ars Live recap: Climate science in a rapidly changing world

The conversation then moved to the record we have of the Earth's surface temperatures and the role of Berkeley Earth in providing an alternate method of calculating those. While the temperature records were somewhat controversial in the past, those arguments have largely settled down, and Berkeley Earth played a major role in helping to show that the temperature records have been reliable. Lately, those temperatures have been unusually high, crossing 1.5 °C above pre-industrial conditions for t

Perplexity launches AI-powered web browser for select group of subscribers

Perplexity AI on Wednesday launched a new artificial intelligence-powered web browser called Comet in the startup's latest effort to compete in the consumer internet market against companies like Google and Microsoft. Comet will allow users to connect with enterprise applications like Slack and ask complex questions via voice and text, according to a brief demo video Perplexity released on Wednesday. The browser is available to Perplexity Max subscribers, and the company said invite-only acce

Show HN: Dev atrophy test – Can you still code without AI?

Hey HN, I'm Per from Scrimba (YC S20), the code-learning platform. There's been a lot of talk lately about whether AI tools are causing skill atrophy amongst developers. We get a front-row seat to this, and we see more and more students struggle with basic concepts and with building apps on their own. This is almost always a consequence of relying too much on ChatGPT and vibe coding tools. So we built a small side project: https://devatrophy.com. It's a test of your core web dev knowledge, no ha

Livestream Replay: Beginner Advice for Claude, a ChatGPT Alternative

Hello WIRED subscribers! Thank you to everyone who attended our most recent AI Unlocked livestream Q&A session, Chatbot Basics: Beginner Advice For Claude, a ChatGPT Alternative. Staff writer Reece Rogers and senior correspondent Kylie Robison provided an overview of Anthropic’s Claude chatbot, one of the most-used alternatives to OpenAI’s ChatGPT and popular with AI insiders. They also answered audience questions about all kinds of topics, such as the main differences between Claude and ChatGPT

Evaluating Long-Context Question and Answer Systems

While evaluating Q&A systems is straightforward with short paragraphs, complexity increases as documents grow larger, for example with technical documentation, novels, and movies, as well as in multi-document scenarios. Although some of these evaluation challenges also appear in shorter contexts, long-context evaluation amplifies issues such as: Information overload: Irrelevant details in large documents obscure relevant facts, making it harder for retrievers and models to locate the right evidence for

Show HN: I Built AskMedically – Get Research-Backed Answers to Medical Queries

Hi HN, I’ve built AskMedically – an AI-powered assistant that answers health and medical questions using real research papers from trusted medical sources like PubMed, Cochrane, etc. Whether you’re a healthcare enthusiast, patient, student, or professional – AskMedically helps you explore trusted medical knowledge without needing a medical degree or slogging through dozens of PDFs. Examples: • “Does intermittent fasting improve insulin sensitivity?” • “What are the benefits of creatine for

A Chinese firm has just launched a constantly changing set of AI benchmarks

Development of the benchmark at HongShan began in 2022, following ChatGPT’s breakout success, as an internal tool for assessing which models are worth investing in. Since then, led by partner Gong Yuan, the team has steadily expanded the system, bringing in outside researchers and professionals to help refine it. As the project grew more sophisticated, they decided to release it to the public. Xbench approached the problem with two different systems. One is similar to traditional benchmarking:

Think of a Number

My feed was recently clogged up with news articles reporting that Sam Altman thinks that AGI is here, or will be here next year, or whatever. I will refrain from giving even more air to this nonsense by linking to the stories. This kind of irresponsible hype-generation drives me nuts (although it also drives up stock prices so I can see why the tech bros are motivated to do it). Sure AI can have a good crack at undergraduate mathematics right now, and sure that’s pretty amazing. But our universi

Chemical knowledge and reasoning of large language models vs. chemist expertise

Benchmark corpus: To compile our benchmark corpus, we utilized a broad list of sources (Methods), ranging from completely novel, manually crafted questions over university exams to semi-automatically generated questions based on curated subsets of data in chemical databases. For quality assurance, all questions have been reviewed by at least two scientists in addition to the original curator and automated checks. Importantly, our large pool of questions encompasses a wide range of topics and que