380,000 vibe-coded apps are sitting on the open web. 5,000 of them are leaking real data.
RedAccess found that AI coding tools like Lovable, Base44, and Replit default to public hosting, leaving medical records, bank internals, and corporate secrets indexed by Google.
RedAccess, an Israeli cybersecurity startup, scanned the public web and found 380,000 applications built with AI coding tools that anyone could access. About 5,000 of those apps contained sensitive data: medical records, bank financials, corporate documents, and full customer service transcripts. No login required.
The findings, first reported by Wired on May 8 and verified independently by Axios, paint a picture of what happens when non-developers build production apps with AI and nobody checks the defaults.
Vibe coding, the practice of generating full applications from natural-language prompts without reviewing the code underneath, went from a curiosity to a workplace norm in roughly six months. The speed is the selling point. The security model, or the absence of one, is the cost nobody priced in.
The default is public, and nobody’s changing it
The root cause isn’t complicated. Platforms like Lovable, Base44, Replit, and Netlify let anyone generate a working web app from a text prompt. The problem is that many of these platforms ship apps with public-by-default hosting. The generated apps land on the open internet, get indexed by Google, and sit there until someone notices.
RedAccess CEO Dor Zvi told Wired: “There is no limit to how easily people can make something like this and use it in a production environment without company permission.”
That last phrase matters. These aren’t developer side projects on someone’s personal domain. These are apps built by employees using company data, deployed to publicly indexed URLs, and forgotten.
Axios verified exposed apps that included a shipping company’s routing schedules showing which vessels were heading to which ports, internal financial documents from a Brazilian bank, and full unredacted customer service conversations from a cabinet supplier. One app built for a children’s care facility held patient conversations. Another, built for a hospital, contained doctor-patient summaries and patient complaints. Both were accessible to anyone with the URL, and Google had already indexed them.
The scale of the mess
Out of 380,000 publicly accessible apps that RedAccess identified, roughly 5,000 contained data that shouldn’t have been public. That’s about 1.3%, a number that sounds small until you consider what “sensitive” means here: clinical trial statuses, incident response logs from a security firm, school recordings with student and teacher information, and chatbot transcripts with personal data woven through them.
VentureBeat reports that roughly 40% of the flagged apps exposed data without any access controls at all. No authentication, no rate limiting, no obfuscation. Just raw data behind a URL that Google had already crawled.
The exposure patterns echo the S3-bucket misconfiguration wave of 2017-2019, when companies left Amazon storage buckets open to the internet. But there’s a key difference: those buckets were configured by engineers who should have known better. These apps were built by people who never saw the configuration in the first place. The AI generated the app. The AI picked the defaults. The user didn’t know there were defaults to change.
What “sensitive” actually means
RedAccess categorized the exposed data across the 5,000 flagged apps. The breakdown, reported by VentureBeat, attributes the clinical trial statuses to a healthcare company and adds staff scheduling systems and corporate strategy presentations to the categories already listed above.
The common thread: every one of these apps was built to solve a real workplace problem. Someone needed a quick tool to track patient intake, or a dashboard for shipping logistics, or a form for customer feedback. The AI delivered a working app in minutes. But “working” and “secure” aren’t the same thing, and none of these platforms insert a security review between “generate” and “deploy.”
The apps that contained medical data are particularly concerning. In the US, health data exposed without authorization triggers HIPAA obligations regardless of whether the person who built the app understood they were handling protected health information. A vibe-coded patient tracker built by a well-meaning administrator carries the same legal exposure as a breach at a major health system.
Phishing on the same rails
RedAccess found something worse than accidental exposure. Attackers are using the same vibe-coding platforms to build phishing sites. The researchers identified sites impersonating Bank of America, FedEx, Trader Joe’s, and McDonald’s, all built with Lovable.
These phishing pages benefit from the same convenience that makes vibe coding attractive to legitimate users: fast deployment, professional-looking output, and hosting on platform subdomains that don’t immediately trigger browser warnings. A phishing page on a Lovable subdomain looks more trustworthy than one on a freshly registered .xyz domain. Lovable told Wired it began investigating and removing the phishing sites after being notified.
The platforms push back
The responses from the platforms follow a pattern. Replit CEO Amjad Masad argued that RedAccess only gave the company 24 hours before going to the press. Lovable and Base44 said RedAccess didn’t provide the specific URLs needed to verify findings. These are process objections that don’t address the core problem: the defaults are wrong.
Masad’s broader defense, that public apps are public by design and should be accessible online, sidesteps the question of whether the people building these apps understood they were building in public. A developer who deploys to Vercel knows what “public” means. A non-technical employee who types “build me an app to track patient complaints” into Lovable probably doesn’t.
Lovable’s cleanup of the phishing sites, noted above, addresses the acute abuse. It doesn’t answer the larger question of why legitimate apps default to public hosting in the first place.
Shadow IT, but faster
CISOs have dealt with shadow IT for decades: employees spinning up tools without IT approval. Vibe coding accelerates the problem by orders of magnitude. An employee who wanted an unauthorized app five years ago had to find one, sign up, and learn it. Now they can build one in 15 minutes, populate it with production data, and forget about it.
The Five Eyes joint advisory on agentic AI warned about exactly this class of risk: AI tools operating with real data and insufficient oversight. That warning focused on agentic systems, but vibe-coded apps are the same problem wearing a friendlier face. They’re autonomous enough to deploy themselves, but not autonomous enough to secure themselves.
What this means for you
If your organization hasn’t audited for vibe-coded shadow apps, start now. Search for your company name and internal terminology on Lovable, Base44, and Replit subdomains. Check whether any employee-built apps reference internal APIs or databases.
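A minimal starting point is sketched below, assuming these platforms publish user apps on lovable.app, base44.app, and replit.app subdomains (verify the current hosting domains before relying on this), with placeholder company terms standing in for your own names:

```python
"""Shadow-app audit sketch. The hosting domains below are assumptions
about where these platforms publish user apps; verify them before use.
COMPANY_TERMS is a hypothetical placeholder for your own names."""
import urllib.request

COMPANY_TERMS = ["acme-corp", "acme-internal"]  # hypothetical search terms
PLATFORM_DOMAINS = ["lovable.app", "base44.app", "replit.app"]  # assumed hosting domains

# Step 1: print search-engine dorks to run manually; these surface apps
# that Google has already indexed, the exposure RedAccess measured.
for domain in PLATFORM_DOMAINS:
    for term in COMPANY_TERMS:
        print(f'site:{domain} "{term}"')

# Step 2: probe guessable subdomains directly; a response means something
# is live there even if search engines haven't crawled it yet.
for domain in PLATFORM_DOMAINS:
    for term in COMPANY_TERMS:
        url = f"https://{term}.{domain}"
        try:
            with urllib.request.urlopen(url, timeout=5) as resp:
                print(f"LIVE ({resp.status}): {url}")
        except Exception:
            pass  # no app at this subdomain, DNS failure, or timeout
```

Treat a hit from either step as a starting point for review, not proof of exposure.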
For individual developers, the lesson is narrower but still real: AI-generated code inherits whatever defaults the platform ships. VS Code’s recent Copilot co-author incident showed what happens when AI tools make assumptions about attribution. This research shows what happens when they make assumptions about access control.
The fix isn’t banning AI coding tools. It’s the same fix that eventually tamed S3 misconfigurations: make private the default, make public an explicit opt-in, and put audit tooling in between. AWS added S3 Block Public Access in 2018 after years of breach headlines. It took embarrassment at scale to change the default. The vibe-coding platforms are at the start of that same curve, and 380,000 exposed apps suggest the embarrassment phase has arrived.
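For a concrete picture of the control AWS eventually shipped, here is what bucket-level Block Public Access looks like via boto3, a minimal sketch with a placeholder bucket name. AWS has since gone further and made these settings the default for new buckets:

```python
"""What the S3 fix looks like in practice: boto3's bucket-level
Block Public Access call. The bucket name is a placeholder."""
import boto3

s3 = boto3.client("s3")
s3.put_public_access_block(
    Bucket="example-bucket",  # hypothetical bucket name
    PublicAccessBlockConfiguration={
        "BlockPublicAcls": True,        # reject new public ACLs
        "IgnorePublicAcls": True,       # neutralize existing public ACLs
        "BlockPublicPolicy": True,      # reject public bucket policies
        "RestrictPublicBuckets": True,  # block cross-account public access
    },
)
```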
Until the platforms ship private-by-default, that number is going to keep growing.
Frequently Asked Questions
- What is vibe coding?
- Vibe coding is the practice of using AI tools like Lovable, Replit, or Base44 to generate full applications from natural-language prompts, often without writing or reviewing the underlying code.
- Why are vibe-coded apps leaking data?
- Many AI coding platforms default to making apps publicly accessible and indexed by search engines. Users who don't manually change privacy settings end up hosting sensitive data on the open web.
- How can companies protect themselves from vibe-coded data leaks?
- Audit for shadow apps built on AI platforms, enforce private-by-default deployment policies, and scan for exposed API keys and database credentials in any AI-generated code before it touches production data. A minimal version of that scan is sketched below.
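That last answer leans on secret scanning, so here is what a bare-bones version could look like: a hedged Python sketch matching a few well-known credential formats. Production scanners such as gitleaks or TruffleHog cover far more patterns and reduce false positives.

```python
"""Minimal sketch of a pre-deployment secret scan over AI-generated code.
The regexes cover a few well-known key formats; real scanners
(e.g. gitleaks, TruffleHog) cover far more."""
import re
from pathlib import Path

# Patterns for common credential formats (illustrative, not exhaustive)
SECRET_PATTERNS = {
    "AWS access key": re.compile(r"AKIA[0-9A-Z]{16}"),
    "generic API key assignment": re.compile(
        r"(?i)(api[_-]?key|secret|token)\s*[:=]\s*['\"][^'\"]{16,}['\"]"
    ),
    "Postgres connection string": re.compile(r"postgres(ql)?://\S+:\S+@\S+"),
}

def scan(root: str) -> None:
    """Walk a project tree and print file:line for each possible secret."""
    for path in Path(root).rglob("*"):
        if not path.is_file():
            continue
        try:
            text = path.read_text(errors="ignore")
        except OSError:
            continue
        for label, pattern in SECRET_PATTERNS.items():
            for match in pattern.finditer(text):
                line_no = text.count("\n", 0, match.start()) + 1
                print(f"{path}:{line_no}: possible {label}")

if __name__ == "__main__":
    scan(".")  # run from the root of the generated project
```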