PromptSpy: Android malware that puts generative AI in the loop
PromptSpy is an Android malware family that folds a generative AI service into its execution flow. It is a small twist with outsized impact, because the AI guides interface interactions that traditional scripts often miss.
What makes PromptSpy different: AI-guided persistence
Most Android malware that automates taps and swipes relies on fixed coordinates or brittle selectors. Change the skin, layout, or gesture and the script stumbles. PromptSpy takes a different path. It snapshots the current screen as structured data, prompts a generative model for stepwise instructions, then executes those instructions via Accessibility Services. The goal is simple but potent: stay locked in the recent apps list so the process is not easily killed.
How the loop works
- Capture the current screen structure as XML, including element bounds and labels.
- Send that context to the AI with a natural-language goal, for example "lock the app in recents".
- Receive JSON instructions, such as long press a card or swipe to reveal the lock icon.
- Execute the gesture through Accessibility Services, then repeat until the AI confirms success.
Example: on one device, locking an app in recents requires a long press on the app card, then tapping a padlock icon. Because the AI reasons about visible elements rather than fixed coordinates, the same logic adapts across launchers and manufacturer skins. A practical mitigation is to audit Accessibility permissions and require justifications for any app that requests them.
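To see why label-based reasoning survives layout changes, here is a minimal sketch for analysts. It assumes the screen dump resembles the XML that Android's UI testing tools produce, with `text`, `content-desc`, and `bounds` attributes on each `node`; the exact format PromptSpy uses is not documented here. The function name and both sample layouts are hypothetical.

```python
import re
import xml.etree.ElementTree as ET

# Assumed bounds format: "[left,top][right,bottom]"
BOUNDS_RE = re.compile(r"\[(-?\d+),(-?\d+)\]\[(-?\d+),(-?\d+)\]")


def find_center_by_label(xml_text, label):
    """Return the (x, y) center of the first node whose visible label
    contains `label` (case-insensitive), or None if nothing matches."""
    root = ET.fromstring(xml_text)
    wanted = label.lower()
    for node in root.iter("node"):
        visible = (node.get("text", "") + " " + node.get("content-desc", "")).lower()
        match = BOUNDS_RE.fullmatch(node.get("bounds", ""))
        if wanted in visible and match:
            left, top, right, bottom = map(int, match.groups())
            return ((left + right) // 2, (top + bottom) // 2)
    return None


# Two hypothetical launchers that place the same control in different spots.
LAYOUT_A = """<hierarchy>
  <node text="Notes" content-desc="" bounds="[0,0][1080,200]">
    <node text="" content-desc="Lock this app" bounds="[900,40][1000,140]"/>
  </node>
</hierarchy>"""

LAYOUT_B = """<hierarchy>
  <node text="" content-desc="Lock this app" bounds="[100,1800][300,2000]"/>
</hierarchy>"""

print(find_center_by_label(LAYOUT_A, "lock"))  # (950, 90)
print(find_center_by_label(LAYOUT_B, "lock"))  # (200, 1900)
```

The same lookup resolves to different coordinates on each layout, which is why signatures built on fixed tap positions tend to miss this behavior.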
Capabilities after the foothold
Persistence is only the start. PromptSpy ships with a Virtual Network Computing (VNC) component that provides remote, real-time viewing and control of the device. Communication with the command server runs over the VNC protocol, wrapped in symmetric encryption, and the malware can receive additional configuration, including an AI key for its prompting routine. Once active, the operator can watch the screen, simulate input, and gather details about the environment.
What operators can do
- List installed apps and the foreground activity to scope targets.
- Capture lockscreen interactions, including recorded pattern entries and password fields.
- Take screenshots or record the screen on demand.
- Use invisible overlays to interfere with uninstallation or service deactivation.
Real-world scenario: a victim thinks they closed the suspicious app, but the recents view shows a small lock icon on its card. The malware remains alive, the VNC feed stays open, and overlays quietly block attempts to tap Uninstall or Force stop. A quick check is to open recents and ensure no unknown app is pinned.
Distribution clues and targeting signals
Analysis links PromptSpy to sideloaded packages hosted on look-alike sites rather than official stores. One campaign used bank-style branding with the name MorganArg and an icon that echoed a well-known U.S. bank, paired with a Spanish login lure. That combination suggests a regional focus and a preference for financial themes. Debug strings contained Simplified Chinese text, a common sign that development occurred in a Chinese-speaking environment, though code ancestry does not prove operator geography.
Example: a site using a bank-like color palette and the label Iniciar sesión coaxed visitors to install an update outside normal channels. Legitimate Android apps rarely require manual sideloading for updates. A simple control is to restrict installation from unknown sources at the policy level, then maintain a short exception list for test devices.
Response playbook if infection is suspected
Immediate actions
- Isolate the device from networks to cut the VNC channel.
- Capture a screen recording of the recents view, then power cycle.
- Boot into Safe Mode, which disables third-party apps, and uninstall the suspicious package from Settings > Apps.
- Reboot normally, then revoke any lingering Accessibility permissions for apps that do not need them.
- Reset the device lock method, and rotate credentials that were accessed during the exposure window.
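The Accessibility review in the steps above can be partly scripted. The sketch below assumes the input string comes from `adb shell settings get secure enabled_accessibility_services`, which returns a colon-separated list of `package/service` components. The allowlist, the sample value, and the function name are hypothetical.

```python
def unexpected_accessibility_packages(setting_value, allowlist):
    """Return packages with an enabled accessibility service that are
    not in `allowlist`, in first-seen order and without duplicates."""
    value = setting_value.strip()
    if not value or value == "null":  # setting is unset or empty
        return []
    flagged = []
    for component in value.split(":"):
        # Each component looks like "package/service"; keep the package part.
        package = component.split("/", 1)[0].strip()
        if package and package not in allowlist and package not in flagged:
            flagged.append(package)
    return flagged


# Hypothetical allowlist and sample value, for illustration only.
ALLOWLIST = {"com.google.android.marvin.talkback"}
SAMPLE_VALUE = (
    "com.google.android.marvin.talkback/.TalkBackService:"
    "com.example.unknown/.HelperService"
)

print(unexpected_accessibility_packages(SAMPLE_VALUE, ALLOWLIST))
# ['com.example.unknown']
```

Anything the function returns still needs a human decision, since legitimate assistive tools also appear in this setting.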
Network and endpoint indicators
Blocklists change quickly, so treat these as starting points at the time of writing:
| Domain or IP | Why it matters |
|---|---|
| m-mgarg[.]com | Look-alike site used in the install flow |
| mgardownload[.]com | Distribution host for droppers |
| 54.67.2[.]84 | Command server for the VNC channel |
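As a rough starting point, the indicators in the table can be matched against DNS or proxy logs. This sketch assumes plain-text log lines; the log format and sample entries are hypothetical, and the matcher tolerates both defanged (`[.]`) and plain indicator spellings.

```python
import re

# Indicators from the table, stored defanged so they are not clickable.
DEFANGED_IOCS = ["m-mgarg[.]com", "mgardownload[.]com", "54.67.2[.]84"]


def refang(indicator):
    """Convert a defanged indicator such as 'a[.]com' back to 'a.com'."""
    return indicator.replace("[.]", ".")


def build_matcher(indicators):
    """Compile one regex that matches any indicator, but not when it is
    embedded in a longer hostname label or a longer number."""
    parts = [re.escape(refang(i).lower()) for i in indicators]
    return re.compile(r"(?<![a-z0-9-])(?:" + "|".join(parts) + r")(?![a-z0-9-])")


def matching_lines(log_lines, matcher):
    """Return the original log lines that contain at least one indicator."""
    return [line for line in log_lines if matcher.search(line.lower())]


# Hypothetical log lines, for illustration only.
SAMPLE_LOG = [
    "dns query www.example.org",
    "dns query mgardownload.com",
    "tcp connect 154.67.2.84:443",
    "tcp connect 54.67.2.84:443",
]

MATCHER = build_matcher(DEFANGED_IOCS)
for hit in matching_lines(SAMPLE_LOG, MATCHER):
    print(hit)
# dns query mgardownload.com
# tcp connect 54.67.2.84:443
```

The boundary checks keep `154.67.2.84` from matching the listed IP, while subdomains of the listed domains still match.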
Tip: on managed fleets, create a detection that flags any app that both requests Accessibility Service privileges and opens a persistent foreground service while pinning itself in recents.
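A minimal sketch of that rule follows, assuming a device-management tool can export per-app telemetry with these three signals. The `AppTelemetry` record and its field names are hypothetical, and how a fleet collects the recents-lock signal will vary by vendor.

```python
from dataclasses import dataclass


@dataclass
class AppTelemetry:
    """Hypothetical per-app record exported by a fleet management tool."""
    package: str
    requests_accessibility: bool
    persistent_foreground_service: bool
    locked_in_recents: bool


def flag_suspicious(apps, allowlist=frozenset()):
    """Return packages that show all three signals and are not allowlisted."""
    return [
        app.package
        for app in apps
        if app.package not in allowlist
        and app.requests_accessibility
        and app.persistent_foreground_service
        and app.locked_in_recents
    ]


# Hypothetical inventory, for illustration only.
INVENTORY = [
    AppTelemetry("com.example.reader", True, False, False),
    AppTelemetry("com.example.player", False, True, True),
    AppTelemetry("com.example.unknown", True, True, True),
]

print(flag_suspicious(INVENTORY))  # ['com.example.unknown']
```

Requiring all three signals together keeps the rule narrow; each signal on its own is common among legitimate apps.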
Prevention and detection ideas for teams
- Tighten sideloading: disable installation from unknown sources on production devices, and grant time-boxed exceptions only for validated test builds.
- Guard Accessibility Services: require justification and review for apps that request Accessibility privileges, then audit for unusual event volumes or click automation.
- Harden against overlays: enable settings that warn when an app draws over others, and train help desk staff to ask about grayed or untappable buttons during removal attempts.
- Leverage built-in protection: devices with Google Play Services typically include Play Protect by default, which helps block known malware families; keep it active and up to date.
- Hunt for AI-in-the-loop behavior: look for repeated cycles of Accessibility actions interleaved with short network bursts to an AI endpoint, followed by more UI actions.
Common pitfalls: legitimate accessibility tools can resemble automation malware, so pair alerts with behavioral thresholds and human review. Another trap is assuming a static gesture, for example, long press then lock, will identify this threat. The model can adapt the sequence, which means detection should focus on intent patterns, not hardcoded coordinates.
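One way to express an intent pattern with a behavioral threshold is to count how often a UI action is followed by a short network burst and then another UI action. The sketch below assumes a per-app event timeline is available; the event names `ui_action` and `net_burst`, the five-second gap, and the two-cycle threshold are all illustrative assumptions that would need tuning against real telemetry.

```python
def count_ai_loop_cycles(events, max_gap=5.0):
    """Count UI -> network -> UI sequences in which each hop happens
    within `max_gap` seconds. `events` is a list of (seconds, kind)."""
    ordered = sorted(events)
    cycles = 0
    for i in range(len(ordered) - 2):
        (t0, k0), (t1, k1), (t2, k2) = ordered[i:i + 3]
        if (
            (k0, k1, k2) == ("ui_action", "net_burst", "ui_action")
            and t1 - t0 <= max_gap
            and t2 - t1 <= max_gap
        ):
            cycles += 1
    return cycles


def looks_like_ai_loop(events, min_cycles=2, max_gap=5.0):
    """Flag a timeline for human review once enough cycles are seen."""
    return count_ai_loop_cycles(events, max_gap) >= min_cycles


# Hypothetical timelines, for illustration only.
TIGHT_LOOP = [
    (0.0, "ui_action"), (1.2, "net_burst"), (2.0, "ui_action"),
    (3.1, "net_burst"), (4.0, "ui_action"),
    (60.0, "net_burst"), (120.0, "ui_action"),
]
SPARSE = [(0.0, "ui_action"), (30.0, "net_burst"), (90.0, "ui_action")]

print(count_ai_loop_cycles(TIGHT_LOOP))  # 2
print(looks_like_ai_loop(TIGHT_LOOP))    # True
print(looks_like_ai_loop(SPARSE))        # False
```

The rule ignores which gesture was performed and where, so it does not break when the model changes the sequence. A hit is a prompt for review, not a verdict.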
Why AI-choreographed UI attacks matter
Generative models reason about what is on the screen, not where it sits. That single shift gives attackers flexibility across device families and launcher tweaks. Instead of rewriting scripts for each target, an operator can ask the model to navigate from context to goal. The approach is not omnipotent: it depends on reliable Accessibility access and on the model interpreting the screen structure correctly. Still, it reduces the maintenance burden for criminals and widens the set of devices a campaign can handle. Security teams that tune controls around Accessibility, overlays, and unusual recents-lock behavior will catch much of this technique before it evolves further.