Latest Tech News

Stay updated with the latest in technology, AI, cybersecurity, and more

Filtered by: agents Clear Filter

Qodo CLI agent scores 71.2% on SWE-bench Verified

We’re excited to announce that Qodo Command, our CLI agent, achieved a scored of 71.2% on SWE-bench Verified (submission pending review), the leading benchmark for evaluating AI agents on real-world software engineering tasks. This achievement is a strong signal that Qodo’s agents are built for the realities of production development. For use cases like reviewing code, writing tests, fixing bugs, and generating features, our CLI agent goes beyond autocomplete to deliver thoughtful, context-awar

Study warns of security risks as ‘OS agents’ gain control of computers and phones

Want smarter insights in your inbox? Sign up for our weekly newsletters to get only what matters to enterprise AI, data, and security leaders. Subscribe Now Researchers have published the most comprehensive survey to date of so-called “OS Agents” — artificial intelligence systems that can autonomously control computers, mobile phones and web browsers by directly interacting with their interfaces. The 30-page academic review, accepted for publication at the prestigious Association for Computatio

Launch HN: Halluminate (YC S25) – Simulating the internet to train computer use

Hi everyone, Jerry and Wyatt here from Halluminate ( https://halluminate.ai/ ). We help AI labs train computer use agents with high quality data and RL environments. Training AI agents to use computers, browsers, and software is one of the highest-potential opportunities for AI. To date, however, this capability is still unreliable. The emerging method to improve this is called Reinforcement Learning with Verifiable Rewards (RLVR). However, researchers are currently bottlenecked by a lack of hi

Open SWE: An open-source asynchronous coding agent

The use of AI in software engineering has evolved over the past two years. It started as autocomplete, then went to a copilot in an IDE, and in the fast few months has evolved to be a long running, more end-to-end agent that run asynchronously in the cloud. We believe that all agents will long more like this in the future - long running, asynchronous, more autonomous. Specifically, we think that they will: Run asynchronously in the cloud Integrate directly with your tooling Have enough conte

We built an open-source asynchronous coding agent

The use of AI in software engineering has evolved over the past two years. It started as autocomplete, then went to a copilot in an IDE, and in the fast few months has evolved to be a long running, more end-to-end agent that run asynchronously in the cloud. We believe that all agents will long more like this in the future - long running, asynchronous, more autonomous. Specifically, we think that they will: Run asynchronously in the cloud Integrate directly with your tooling Have enough conte

Gen AI disillusionment looms, according to Gartner's 2025 Hype Cycle report

JuSun/Getty ZDNET's key takeaways Gartner has released its 2025 Hype Cycle report. AI agents and data are at their most inflated and need precise application to yield results. The report also emphasized trust and safety efforts as critical to the next five years. Research firm Gartner has released its annual Hype Cycle report, which investigates whether new technology is living up to expectations or is still far off from making a meaningful impact. At the top of the list this year? AI agent

Gartner's AI Hype Cycle reveals which AI tech is peaking - but will it last?

JuSun/Getty ZDNET's key takeaways Gartner has released its 2025 Hype Cycle report. AI agents and data are at their most inflated and need precise application to yield results. The report also emphasized trust and safety efforts as critical to the next five years. Research firm Gartner has released its annual Hype Cycle report, which investigates whether new technology is living up to expectations or is still far off from making a meaningful impact. At the top of the list this year? AI agent

Google’s new diffusion AI agent mimics human writing to improve enterprise research

Want smarter insights in your inbox? Sign up for our weekly newsletters to get only what matters to enterprise AI, data, and security leaders. Subscribe Now Google researchers have developed a new framework for AI research agents that outperforms leading systems from rivals OpenAI, Perplexity, and others on key benchmarks. The new agent, called Test-Time Diffusion Deep Researcher (TTD-DR), is inspired by the way humans write by going through a process of drafting, searching for information, an

Tavily raises $25M to connect AI agents to the web

Companies across many industries are implementing AI agents for internal use, automating a wide range of tasks. In the financial sector, AI agents are critical for fraud detection. They can analyze vast amounts of transaction data in real time. Meanwhile, sales organizations are using AI agents to gather data on potential customers. These AI sales agents can scour the web and social media for information. To be effective, these agents need to access the internet and find information from relev

Coding Agents 101

Coding Agents 101: The Art of Actually Getting Things Done The year is 2025. Coding agents aren't magic, but they're about the closest thing we have. We've noticed some engineers, in particular at the senior-to-staff level, finding success faster than others. Here we share some top lessons sourced from the experience of our customers and ourselves. About this guide: Product-agnostic We discuss tips that will help you be successful with any coding agent. Tactical We offer our favorite bits of act

Google Cloud’s data agents promise to end the 80% toil problem plaguing enterprise data teams

Want smarter insights in your inbox? Sign up for our weekly newsletters to get only what matters to enterprise AI, data, and security leaders. Subscribe Now Data doesn’t just magically appear in the right place for enterprise analytics or AI, it has to be prepared and directed with data pipelines. That’s the domain of data engineering and it has long been one of the most thankless and tedious tasks that enterprises need to deal with. Today, Google Cloud is taking direct aim at the tedium of da

Google embeds AI agents deep into its data stack - here's what they can do for you

Joan Cros/NurPhoto via Getty Images ZDNET's key takeaways Google is introducing powerful tech for agents and data. They are also introducing a series of data-centric agents. A new command-line AI coding tool is now available. I am no stranger to hyperbolic claims from tech companies. Anyone who's on the receiving end of a firehose of press announcements related to AI understands. Everything is game-changing, world-changing, the most, the best, yada, yada, yada. And then there's Google. Goo

Genie 3: A new frontier for world models

Given a text prompt, Genie 3 can generate dynamic worlds that you can navigate in real time at 24 frames per second, retaining consistency for a few minutes at a resolution of 720p. Towards world simulation At Google DeepMind, we have been pioneering research in simulated environments for over a decade, from training agents to master real-time strategy games to developing simulated environments for open-ended learning and robotics. This work motivated our development of world models, which are

DeepMind reveals Genie 3, a world model that could be the key to reaching AGI

Google DeepMind has revealed Genie 3, its latest foundation world model that the AI lab says presents a crucial stepping stone on the path to artificial general intelligence, or human-like intelligence. “Genie 3 is the first real-time interactive general purpose world model,” Shlomi Fruchter, a research director at DeepMind, said during a press briefing. “It goes beyond narrow world models that existed before. It’s not specific to any particular environment. It can generate both photo-realistic

Inside OpenAI’s quest to make AI do anything for you

Shortly after Hunter Lightman joined OpenAI as a researcher in 2022, he watched his colleagues launch ChatGPT, one of the fastest-growing products ever. Meanwhile, Lightman quietly worked on a team teaching OpenAI’s models to solve high school math competitions. Today that team, known as MathGen, is considered instrumental to OpenAI’s industry-leading effort to create AI reasoning models: the core technology behind AI agents that can do tasks on a computer like a human would. “We were trying t

Build an AI telephony agent for inbound and outbound calls

AI Telephony Agent Make INBOUND and OUTBOUND calls with AI agents using VideoSDK. Supports multiple SIP providers and AI agents with a clean, extensible architecture for VoIP telephony solutions. Installation Prerequisites Python 3.11+ VideoSDK account Twilio account (SIP trunking provider) Google API key (for Gemini AI) Setup Clone the repository git clone https://github.com/yourusername/ai-agent-telephony.git cd ai-agent-telephony Install dependencies pip install -r requirements.txt

Topics: add agent agents ai sip

Deep Agents

Using an LLM to call tools in a loop is the simplest form of an agent. This architecture, however, can yield agents that are “shallow” and fail to plan and act over longer, more complex tasks. Applications like “Deep Research”, “Manus”, and “Claude Code” have gotten around this limitation by implementing a combination of four things: a planning tool, sub agents, access to a file system, and a detailed prompt. Acknowledgements: this exploration was primarily inspired by Claude Code and reports o

Launch HN: Lucidic (YC W25) – Debug, test, and evaluate AI agents in production

Hi HN, we’re Abhinav, Andy, and Jeremy, and we’re building Lucidic AI ( https://dashboard.lucidic.ai ), an AI agent interpretability tool to help observe/debug AI agents. Here is a demo: https://youtu.be/Zvoh1QUMhXQ. Getting started is easy with just one line of code. You just call lai.init() in your agent code and log into the dashboard. You can see traces of each run, cumulative trends across sessions, built-in or custom evals, and grouped failure modes. Call lai.create_step() with any metad

Runloop lands $7M to power AI coding agents with cloud-based devboxes

Want smarter insights in your inbox? Sign up for our weekly newsletters to get only what matters to enterprise AI, data, and security leaders. Subscribe Now Runloop, a San Francisco-based infrastructure startup, has raised $7 million in seed funding to address what its founders call the “production gap” — the critical challenge of deploying AI coding agents beyond experimental prototypes into real-world enterprise environments. The funding round, led by The General Partnership with participati

How can enterprises keep systems safe as AI agents join human employees? Cyata launches with a new, dedicated solution

Want smarter insights in your inbox? Sign up for our weekly newsletters to get only what matters to enterprise AI, data, and security leaders. Subscribe Now You thought generative AI was a technological tidal wave of change coming for enterprises, but the truth is — at 2.5 years since the launch of ChatGPT — the change is only getting started. A whopping 96% of IT and data executives plan to increase their use of AI agents this year alone, according to a recent survey from Cloudera covered by C

Want AI agents to work together? The Linux Foundation has a plan

MR.Cole_Photographer/Getty With the rise of AI agents, AI programs that can perform tasks for you without being explicitly told how to carry out every individual step, a problem has arisen. It's an old one in tech circles: Interoperability. How do you get AI agents to work together? One answer is Cisco's AGNTCY (pronounced "agency") project. To prevent AI agency fragmentation, Cisco has donated the AGNTCY project to the Linux Foundation. The Thai project is backed by numerous industry heavywei

Principles for production AI agents

Every now and then, people ask me: “I am new to agentic development, I’m building something, but I feel like I'm missing some tribal knowledge. Help me catch up!”. I’m tempted to suggest some serious stuff like multiweek courses (e.g. by HuggingFace or Berkeley), but not everyone is interested in that level of diving. So I decided to gather six simple empirical learnings that helped me a lot during app.build development. This post is somewhat inspired by Design Decisions Behind app.build, but

Six Principles for Production AI Agents

Every now and then, people ask me: “I am new to agentic development, I’m building something, but I feel like I'm missing some tribal knowledge. Help me catch up!”. I’m tempted to suggest some serious stuff like multiweek courses (e.g. by HuggingFace or Berkeley), but not everyone is interested in that level of diving. So I decided to gather six simple empirical learnings that helped me a lot during app.build development. This post is somewhat inspired by Design Decisions Behind app.build, but

How E2B became essential to 88% of Fortune 100 companies and raised $21 million

Want smarter insights in your inbox? Sign up for our weekly newsletters to get only what matters to enterprise AI, data, and security leaders. Subscribe Now E2B, a startup providing cloud infrastructure specifically designed for artificial intelligence agents, has closed a $21 million Series A funding round led by Insight Partners, capitalizing on surging enterprise demand for AI automation tools. The funding comes as an remarkable 88% of Fortune 100 companies have already signed up to use E2B

Software engineer on the real state of AI agents (they're not there yet)

Serving tech enthusiasts for over 25 years.TechSpot means tech analysis and advice you can trust A hot potato: Amid growing hype around AI agents, one experienced engineer has brought a grounded perspective shaped by work on more than a dozen production-level systems spanning development, DevOps, and data operations. From his vantage point, the notion that 2025 will bring truly autonomous workforce-transforming agents looks increasingly unrealistic. In a recent blog post, systems engineer Utka

Anthropic unveils ‘auditing agents’ to test for AI misalignment

Want smarter insights in your inbox? Sign up for our weekly newsletters to get only what matters to enterprise AI, data, and security leaders. Subscribe Now When models attempt to get their way or become overly accommodating to the user, it can mean trouble for enterprises. That is why it’s essential that, in addition to performance evaluations, organizations conduct alignment testing. However, alignment audits often present two major challenges: scalability and validation. Alignment testing r

Gupshup raises $60M in equity and debt, leaves unicorn status hanging

Gupshup, a business messaging startup that began its journey in India over two decades ago and became a unicorn four years ago, has raised a new over $60 million round — but is keeping its new valuation under wraps. In 2021, Gupshup raised two funding rounds within four months, securing $340 million from prominent investors including Tiger Global, Fidelity Management, Think Investments, and Malabar Investments. These rounds — the startup’s first in roughly a decade — valued Gupshup at $1.4 bill

Open-source MCPEval makes protocol-level agent testing plug-and-play

Want smarter insights in your inbox? Sign up for our weekly newsletters to get only what matters to enterprise AI, data, and security leaders. Subscribe Now Enterprises are beginning to adopt the Model Context Protocol (MCP) primarily to facilitate the identification and guidance of agent tool use. However, researchers from Salesforce discovered another way to utilize MCP technology, this time to aid in evaluating AI agents themselves. The researchers unveiled MCPEval, a new method and open-so

This startup thinks email could be the key to usable AI agents

AI companies are pushing agents as the next Great Workplace Disruptor, but experts say they’re still not ready for prime time. AI agents often struggle to make decisions by themselves, hallucinate frequently, can’t cooperate with other agents, fail at confidentiality awareness, and integrate poorly with existing systems. Industry pioneers like Andrej Karpathy and Ali Ghodsi have said that, like the deployment of autonomous vehicles, humans need to be in the loop in order for agents to succeed.

OpenAI’s ChatGPT Agent Is Haunting My Browser

Most people’s browser tabs are filled with unread news articles. Mine are filled with AI agents and ghost clicks. I have four instances of OpenAI’s ChatGPT Agent—the generative AI tool released last week, which can run searches and perform tasks on the web—already open with each running in its own tab. I’ve given these first four agents relatively simple jobs based on ChatGPT’s suggestions. One is clicking around to find a birthday gift on the Target website, and another is generating a pitch d