When browsers become the next battlefield for AI, who will be eliminated?
The future of AI lies in autonomous web navigation agents. Major tech companies are investing in browser agents that aim to automate web tasks and raise productivity. This article explores their application scenarios, current challenges, and the opportunities offered by Web3-native solutions. It is based on a piece by Mario Chow, Figo, and @IOSG, compiled and authored by BlockBeats.

(Background: OpenAI's Sam Altman: I'm interested in acquiring Google Chrome! Competing for the market share of the largest browser)
(Background: Perplexity offered $34.5 billion to acquire the Chrome browser; a small AI search player takes on a giant)

In the past 12 months, the relationship between web browsers and automation has changed dramatically. Almost all major tech companies are racing to build autonomous browser agents. Since the end of 2024 the trend has become increasingly evident: OpenAI launched Agent mode in January, Anthropic introduced the "Computer Use" feature for the Claude model, Google DeepMind launched Project Mariner, Opera announced the agent-based browser Neon, and Perplexity AI released the Comet browser. The signal is clear: the future of AI lies in agents capable of autonomously navigating web pages.

This trend is not just about adding smarter chatbots to browsers. It represents a fundamental shift in how machines interact with the digital environment. Browser agents are a class of AI systems that can "see" web pages and take action: clicking links, filling out forms, scrolling pages, and entering text, just like human users. This model promises to unlock substantial productivity and economic value because it automates tasks that still require human intervention or are too complex for traditional scripts to handle.
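The "see, then act" cycle described above can be sketched as a small loop. This is only an illustration under stated assumptions: `FakePage`, `scripted_policy`, and `run_agent` are hypothetical names, the page is a toy in-memory object rather than a real browser, and a hard-coded policy stands in for the AI model that would choose each action.

```python
from dataclasses import dataclass, field


@dataclass
class FakePage:
    """Toy stand-in for a web page: form fields, a scroll offset, a click log."""
    fields: dict = field(default_factory=dict)
    scroll_y: int = 0
    clicked: list = field(default_factory=list)

    def observe(self) -> dict:
        # What the agent "sees": a snapshot of the current page state.
        return {
            "fields": dict(self.fields),
            "scroll_y": self.scroll_y,
            "clicked": list(self.clicked),
        }


def apply_action(page: FakePage, action: dict) -> None:
    """Execute one of the action types named in the text on the toy page."""
    kind = action["type"]
    if kind == "click":
        page.clicked.append(action["target"])
    elif kind == "type":
        page.fields[action["target"]] = action["text"]
    elif kind == "scroll":
        page.scroll_y += action["dy"]
    else:
        raise ValueError(f"unknown action type: {kind}")


def scripted_policy(observation: dict):
    """Hypothetical stand-in for the model: next action, or None when done."""
    if "email" not in observation["fields"]:
        return {"type": "type", "target": "email", "text": "user@example.com"}
    if observation["scroll_y"] < 400:
        return {"type": "scroll", "dy": 400}
    if "submit" not in observation["clicked"]:
        return {"type": "click", "target": "submit"}
    return None


def run_agent(page: FakePage, policy, max_steps: int = 10) -> list:
    """Observe, decide, act; stop when the policy returns None or steps run out."""
    history = []
    for _ in range(max_steps):
        action = policy(page.observe())
        if action is None:
            break
        apply_action(page, action)
        history.append(action["type"])
    return history


page = FakePage()
steps = run_agent(page, scripted_policy)
print(steps)
```

In a real agent the observation would be a screenshot or page structure and the policy would be a model call; the `max_steps` cap is a design choice here so that a confused policy cannot loop forever.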
▲ GIF demonstration: an AI browser agent in operation, following instructions, navigating to the target dataset page, automatically taking screenshots, and extracting the required data.

Who will win the AI browser war?

Almost all major tech companies (and some startups) are developing their own browser AI agent solutions. Here are some of the most representative projects:

OpenAI – Agent Mode

OpenAI's Agent mode (formerly known as Operator, launched in January 2025) is an AI agent equipped with its own browser. Operator can handle a range of repetitive online tasks, such as filling out web forms, ordering groceries, and scheduling meetings, all through the standard web interfaces that humans use.

▲ An AI agent schedules meetings like a professional assistant: checking calendars, finding available time slots, creating events, sending confirmations, and generating .ics files for you.

Anthropic – Claude's "Computer Use"

At the end of 2024, Anthropic introduced a new "Computer Use" feature for Claude 3.5, giving it the ability to operate computers and browsers like a human. Claude can see the screen, move the cursor, click buttons, and enter text. It was the first large-model agent tool of its kind to enter public beta, allowing developers to have Claude navigate websites and applications automatically. Anthropic positions it as an experimental feature aimed mainly at automating multi-step workflows on the web.

Perplexity – Comet

AI startup Perplexity (known for its Q&A engine) launched the Comet browser in mid-2025 as an AI-driven alternative to Chrome. The core of Comet is an AI-powered conversational search engine built into the URL bar (omnibox), which provides instant answers and summaries instead of traditional search links. Comet also features Comet Assistant, an agent that lives in the sidebar and can automatically perform everyday tasks across websites.
For example, it can summarize your open emails, schedule meetings, manage browser tabs, or browse and fetch web information on your behalf. By letting the agent perceive the current page content through the sidebar interface, Comet aims to integrate browsing and AI assistance seamlessly.

Real-world application scenarios of browser agents

Above, we reviewed how major tech companies (OpenAI, Anthropic, Perplexity, etc.) build capabilities into browser agents through different product forms. To better understand their value, we can look at how these capabilities apply to real scenarios in daily life and business workflows.

Daily web automation

# E-commerce and personal shopping

A very practical scenario is delegating shopping and booking tasks to agents. An agent can automatically fill your online shopping cart and place orders from a fixed list, or search for the lowest price across multiple retailers and complete the checkout for you. For travel, you can ask the AI to perform tasks like: "Help me book a flight to Tokyo next month (with a fare under $800), and book a hotel with free Wi-Fi." The agent handles the entire process: searching for flights, comparing options, filling in passenger information, and completing the hotel booking, all through the airline and hotel websites. This level of automation goes far beyond existing travel bots: it does not just recommend, it executes the purchase directly.

# Enhancing office efficiency

Agents can automate many repetitive business operations that people perform in browsers, such as organizing emails and extracting to-do items, or checking for free slots across multiple calendars and automatically scheduling meetings. Perplexity's Comet Assistant can already summarize the content of your inbox through the web interface or add new appointments for you.
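The calendar step mentioned above (finding a time when everyone is free) reduces to a simple interval sweep once the agent has read each person's busy times. The following is a minimal sketch, not how any of the products named here implement it; `find_free_slots`, the sample calendars, and the working-hours window are all illustrative assumptions.

```python
from datetime import datetime, timedelta


def find_free_slots(calendars, window_start, window_end, duration):
    """Return (start, end) gaps of at least `duration` free in every calendar."""
    # Pool everyone's busy intervals and sort them by start time.
    busy = sorted(interval for cal in calendars for interval in cal)

    free = []
    cursor = window_start  # earliest time not yet covered by a busy interval
    for start, end in busy:
        if start >= window_end:
            break  # remaining intervals begin after the window closes
        if start - cursor >= duration:
            free.append((cursor, start))
        cursor = max(cursor, end)
    # Check the tail of the window after the last busy interval.
    if window_end - cursor >= duration:
        free.append((cursor, window_end))
    return free


def at(hour, minute=0):
    """Helper: a time on one arbitrary example day."""
    return datetime(2025, 9, 1, hour, minute)


# Hypothetical busy times for two attendees.
alice = [(at(9), at(10)), (at(13), at(14))]
bob = [(at(9, 30), at(11)), (at(15), at(16))]

slots = find_free_slots([alice, bob], at(9), at(17), timedelta(minutes=60))
for start, end in slots:
    print(start.strftime("%H:%M"), "-", end.strftime("%H:%M"))
```

The hard part for a browser agent is usually reading the busy times out of each calendar's web interface; once that data is extracted, choosing a slot is ordinary code like this.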
Once authorized by you, the agent can also log into SaaS tools to generate regular reports, update spreadsheets, or submit forms. Imagine an HR agent that automatically logs into different recruitment websites to post jobs, or a sales agent that updates leads in the CRM system. These mundane daily tasks can take up a lot of employee time, but AI can complete them by automating web forms and page operations.

Beyond single tasks, agents can also link together complete workflows across multiple web systems. Every step requires operating a different web interface, which is precisely the strength of browser agents. Agents can log into various dashboards for troubleshooting and even orchestrate processes, such as completing onboarding tasks (by creating accounts on multiple SaaS websites). Essentially, any multi-step operation that currently requires opening multiple websites can be delegated to agents.

Current challenges and limitations

Despite the immense potential, today's browser agents are still far from perfect. Current implementations reveal some long-standing technical and infrastructure issues:

Mismatched architecture

Modern...