I Turned WeChat into an API Using Claude Code and Frida

Here’s what I built: a bridge that turns the official WeChat desktop client into something you can control programmatically. It’s an OpenClaw plugin, so any AI agent running on OpenClaw can send and receive WeChat messages natively. It also implements a Wechaty puppet, so it plugs into the most popular WeChat bot framework. You can send text, receive messages, download images, and switch between conversations, all without touching the GUI.

One person. A little over two weeks. Most of the hard work was done by Claude Code.

Why This Exists

This project started as a side effect of building Agent RDP, a tool for controlling remote desktops via AI. One of Agent RDP’s outputs is the accessibility tree, the structured representation of every UI widget on screen that operating systems expose for assistive technologies. I discovered that some models struggle to control a computer from screenshots alone, but giving them the accessibility tree makes them dramatically better at it.

While working on that, I started using OpenClaw, an open-source AI personal assistant that you interact with through messaging platforms like WhatsApp, Telegram, and Slack. It bothered me that there was no WeChat integration. WeChat is the dominant messaging platform for over a billion people, and there’s no official API for personal accounts. I went looking for existing solutions and came up empty.

There are good reasons for that.

Why Nobody Has Done This

WeChat is one of the hardest messaging platforms to build third-party integrations for. Four reasons:

1. The protocol is closed. There’s no official API for personal accounts. Tencent offers APIs for Official Accounts (public pages) and Mini Programs, but nothing for person-to-person messaging.

2. MMTLS, a proprietary encryption protocol. WeChat doesn’t use standard TLS. It uses MMTLS (“MicroMessenger TLS”), a custom protocol based on TLS 1.3 drafts with significant modifications. On top of that, there’s a second business-layer encryption underneath, so the client performs double encryption before anything hits the wire. Citizen Lab published a security analysis of MMTLS in October 2024, and prior to their research, no tools existed to inspect MMTLS-encrypted traffic. Reverse-engineering this protocol from scratch is not a realistic option.

3. Aggressive crackdowns on reverse-engineering efforts. In January 2026, Tencent filed DMCA takedown requests with GitHub, resulting in the removal of over 4,000 repositories (including forks) related to WeChat database decryption and chat history export, tools that circumvented SQLCipher encryption on WeChat’s local databases. Earlier, Tencent restricted Web WeChat access starting in 2017, killing an entire ecosystem of bots and automation libraries that depended on the web protocol. Accounts registered after 2017 can’t even log into Web WeChat anymore.

4. Account bans. There are widespread reports of accounts being suspended after suspicious activity. Tencent monitors for bot-like behavior and will flag or ban accounts they believe are being automated.

How They Detect You

I studied WeChat’s detection strategies before writing any code. They operate on several layers:

Behavioral monitoring. Sending too many messages too quickly, adding lots of contacts in a short window, or repetitive message patterns all trigger flags.
Client fingerprinting. They check whether you’re using an official client. Projects that emulate or puppet the protocol (like the old “iPad protocol” or “Web protocol” approaches) leave detectable traces.
Network-level analysis. Unusual IP patterns, VPN usage, or connections that don’t match typical client behavior get flagged.
Legal enforcement. Beyond technical measures, they actively scan GitHub and send DMCA takedowns and cease-and-desist letters for commercial operations.

Why I Thought It Was Feasible Anyway

I believed there were three angles that made this different from previous approaches:

Behavioral: This is going to be used by my AI assistant, and the assistant acts like a human: it reads messages, occasionally replies, maybe sends a few messages in a conversation. It doesn’t blast hundreds of messages per minute or add strangers en masse. From the standpoint of WeChat’s behavioral detection, it should look like a normal user.

Network and client fingerprinting: I’m not reverse-engineering MMTLS or emulating the wire protocol. I’m running the actual, official WeChat binary. It handles all network communication, authentication, and encryption. There’s no protocol emulation, no forged packets, no spoofed client signatures. Nothing for client fingerprinting to flag.

Legal: The DMCA takedowns targeted tools that offer database decryption as a service, letting users export and crack open their own WeChat databases. Our tool doesn’t do that. While we did reverse-engineer the database encryption in order to build the project, the end product just lets people use WeChat: send messages, receive messages, the same things the official client does. We’re not distributing a decryption tool.

With those constraints in mind, I started building.

Approach 1: Accessibility Tree Control on macOS

My first instinct was to use what I already had from Agent RDP, AI-driven computer control via the accessibility tree. On macOS, WeChat exposes a reasonable accessibility tree, so an AI agent could theoretically read the chat window and type messages by controlling the keyboard and mouse.

This worked, technically. But it had fatal problems:

It takes over your input devices. When the agent is controlling WeChat, it’s moving your mouse and typing on your keyboard. You can’t use your computer for anything else.
LLM reasoning is expensive for deterministic tasks. Most actions in a chat app are entirely predictable: select a conversation, paste text, press enter. Burning GPT-4-class tokens to figure out “click the text box and type” is absurd. LLMs should handle decisions, not UI automation.
It’s slow. Every action requires a round trip to the model, a screenshot or tree capture, and another inference step.

Verdict: clever but impractical.

Approach 2: Linux WeChat with the Accessibility Tree

Next idea: instead of controlling macOS WeChat with a model, what about using the accessibility tree directly for deterministic RPA on a Linux build?

Tencent released an official Linux version of WeChat in late 2024, which made this possible. The Linux client uses the Qt framework. By default, it doesn’t expose accessibility information, but Qt supports Linux’s AT-SPI (Assistive Technology Service Provider Interface), the D-Bus-based accessibility framework. You can force it on with two environment variables:

export QT_ACCESSIBILITY=1
export QT_LINUX_ACCESSIBILITY_ALWAYS_ON=1

Launch WeChat with those set, and you get a structured tree of every widget (buttons, text fields, chat lists, message panels) queryable via Python’s gi.repository.Atspi bindings.
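
To make "queryable" concrete, here is a minimal sketch of a depth-first walk over such a tree. It is an illustration, not the project's code. The walker only relies on four accessor methods that Atspi.Accessible provides (get_name, get_role_name, get_child_count, get_child_at_index), so it also runs against the FakeNode stand-in, which is hypothetical and exists only so the example works without a desktop session. The find_wechat_app helper and the application name "wechat" are assumptions; it needs PyGObject and a running AT-SPI bus, and is not called here.

```python
from dataclasses import dataclass, field
from typing import Iterator, List, Tuple


def walk(node, depth: int = 0) -> Iterator[Tuple[int, str, str]]:
    """Yield (depth, role, name) for a node and all of its descendants."""
    yield depth, node.get_role_name(), node.get_name()
    for i in range(node.get_child_count()):
        child = node.get_child_at_index(i)
        if child is not None:
            yield from walk(child, depth + 1)


@dataclass
class FakeNode:
    """Hypothetical stand-in for an Atspi.Accessible (same four accessors)."""
    role: str
    name: str = ""
    children: List["FakeNode"] = field(default_factory=list)

    def get_role_name(self): return self.role
    def get_name(self): return self.name
    def get_child_count(self): return len(self.children)
    def get_child_at_index(self, i): return self.children[i]


def find_wechat_app(app_name: str = "wechat"):
    """Find the application node on a live session (assumed name; not run here)."""
    import gi
    gi.require_version("Atspi", "2.0")
    from gi.repository import Atspi
    desktop = Atspi.get_desktop(0)
    for i in range(desktop.get_child_count()):
        app = desktop.get_child_at_index(i)
        if app is not None and app.get_name() == app_name:
            return app
    return None


# Demo on a tiny fake tree, printed in an indented form.
demo = FakeNode("application", "wechat", [
    FakeNode("frame", "WeChat", [FakeNode("push-button", "Contacts")]),
])
lines = [f"{'  ' * d}{role} \"{name}\"" for d, role, name in walk(demo)]
print("\n".join(lines))
```

On a real session, passing the result of find_wechat_app() to walk() should produce the same kind of listing for the live client.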

Here’s what the accessibility tree looks like for WeChat on Linux (via AT-SPI):

application "wechat"
└─ frame "WeChat" [ACTIVE]
   ├─ tool-bar "Navigation"
   │  ├─ push-button "WeChat"
   │  ├─ push-button "Contacts"
   │  ├─ push-button "Favorites"
   │  ├─ push-button "Moments"
   │  └─ push-button "Mini Programs Panel"
   └─ split-pane
      ├─ list "Chats"
      │  ├─ list-item "昊罡 Hey! What's up? 😊 22:53" [SELECTED]
      │  ├─ list-item "Workflowly.AI 🗳️ Openclaw 的bug…… 21:58"
      │  ├─ list-item "昊罡和 Nick Bot [Video] 21:56"
      │  ├─ list-item "File Transfer"
      │  └─ ...
      └─ filler (chat content)
         ├─ label "昊罡"                          ← chat title
         ├─ list "Messages"
         │  ├─ list-item "🦞 OpenClaw 2026.3.1-beta.1 …"
         │  ├─ list-item "16:53"
         │  ├─ list-item "Here?"
         │  ├─ list-item "Yep, right here! 🫡"
         │  ├─ list-item "hello"
         │  └─ list-item "Hey! What's up? 😊"
         ├─ tool-bar
         │  ├─ push-button "Send sticker(Alt+E)"
         │  ├─ push-button "Send File"
         │  ├─ push-button "Screenshot(Alt+A)"
         │  ├─ push-button "Voice Call"
         │  └─ push-button "Video Call"
         ├─ text [EDITABLE,FOCUSED]               ← message input
         └─ push-button "Send(S)" [DISABLED]

Every widget is there: navigation, chat list, message history, toolbar, input field. You can programmatically walk this tree, find a chat, click it, and paste text into the input. But receiving messages was a mess:

The accessibility tree shows display names, not stable IDs. If someone changes their nickname, you lose track of which conversation is which.
Messages appear as flat text. There’s no way to tell who sent what. It’s just a plain text log in the message panel, and you have to scroll to see older history.
No structured data. No per-message timestamps (only an occasional time separator such as "16:53"), no message types, no media. Just whatever text is currently visible on screen.

Verdict: sending works, receiving doesn’t. Half a solution.

Approach 3: Going Deeper with Database Decryption

If the accessibility tree can’t give me message history, where else could I find it? Messages persist between app restarts, so they’re stored somewhere on disk. I asked Claude Code to find out where.

It located the databases quickly: a set of SQLCipher-encrypted files in the user’s data directory. Each database is encrypted with a different AES-256 key. There are about 16 of them: contact.db, message_0.db, session.db, emoticon.db, and more.

The files were clearly encrypted. But the app itself can read them, which means the keys exist somewhere in process memory.

I told Claude Code to find them.

Memory Pattern Scanning

Claude Code suggested using Frida, a dynamic instrumentation toolkit that can inject JavaScript into running processes to inspect and modify their behavior at runtime. My initial thought was to hook into Qt’s internal calls to intercept database operations, but the WeChat Linux binary is statically linked against Qt: the framework is compiled directly into the 134MB binary rather than loaded as shared libraries. That means tools like GammaRay that expect dynamic Qt don’t work, and hooking Qt function calls is much harder. So instead of trying to intercept at the framework level, Claude Code took a different approach: scan the WeChat process’s memory for cipher_ctx structures, SQLCipher’s internal encryption context objects.

The scan targets a 16-byte signature pattern that identifies SQLCipher 4 contexts:

20 00 00 00 10 00 00 00 10 00 00 00 00 10 00 00

For 15 open databases, the scanner found roughly 33 matches (read contexts and write contexts). From each match, it walks pointer chains at various offsets to extract 32-byte candidate keys, producing around 1,500 candidates initially.
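
As a rough illustration of the scanning step, the sketch below searches a byte buffer for that signature. It is not the project's Frida script: it runs over a synthetic buffer rather than live process memory, and the names find_signature_offsets and dump are made up for the example. Decoded as four little-endian 32-bit integers, the signature reads 32, 16, 16 and 4096. Treating those as key, IV, block and page sizes is a guess, not something established above.

```python
import struct
from typing import List

# The 16-byte signature quoted in the text.
SIGNATURE = bytes.fromhex("20 00 00 00 10 00 00 00 10 00 00 00 00 10 00 00")


def find_signature_offsets(buf: bytes, sig: bytes = SIGNATURE) -> List[int]:
    """Return every offset in buf at which sig starts."""
    offsets = []
    pos = buf.find(sig)
    while pos != -1:
        offsets.append(pos)
        pos = buf.find(sig, pos + 1)
    return offsets


# The same 16 bytes viewed as four little-endian unsigned 32-bit integers.
fields = struct.unpack("<4I", SIGNATURE)

# Synthetic "memory": filler bytes with two embedded copies of the signature.
dump = b"\xaa" * 40 + SIGNATURE + b"\xbb" * 100 + SIGNATURE + b"\xcc" * 8
hits = find_signature_offsets(dump)
print(fields, hits)
```

Each hit would then be the starting point for the pointer-chain walk described above, which is specific to the binary and is not reproduced here.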

Most of those are garbage. A multi-pass filter narrows it down:

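Strict pass: at most 19 printable ASCII characters out of 32 bytes, at most 6 zero bytes, at least 16 unique byte values
Moderate pass: relaxes those limits to at most 24 printable, at most 12 zero bytes, at least 12 unique values
Relaxed pass: relaxes further to at most 28 printable, at most 18 zero bytes, at least 8 unique values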
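
A minimal sketch of how such a filter could look, using the thresholds listed above. It reads the three numbers in each pass as maximum printable bytes, maximum zero bytes, and minimum unique byte values, in that order. "Printable" is assumed to mean bytes 0x20 through 0x7E. The names Pass, PASSES and classify are illustrative.

```python
from typing import NamedTuple, Optional


class Pass(NamedTuple):
    name: str
    max_printable: int
    max_zeros: int
    min_unique: int


# Thresholds from the text, ordered from strictest to most relaxed.
PASSES = [
    Pass("strict", 19, 6, 16),
    Pass("moderate", 24, 12, 12),
    Pass("relaxed", 28, 18, 8),
]


def classify(candidate: bytes) -> Optional[str]:
    """Return the first pass a 32-byte candidate satisfies, or None if it fails all."""
    if len(candidate) != 32:
        return None
    printable = sum(0x20 <= b <= 0x7E for b in candidate)  # assumed definition
    zeros = candidate.count(0)
    unique = len(set(candidate))
    for p in PASSES:
        if (printable <= p.max_printable
                and zeros <= p.max_zeros
                and unique >= p.min_unique):
            return p.name
    return None
```

The idea behind the heuristic is that a real key looks like random bytes, so buffers that are mostly text, mostly zeros, or highly repetitive can be discarded before the expensive verification step.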

Each candidate is verified by attempting to open an actual database with sqlcipher using PRAGMA cipher_compatibility = 4. If the exit code is 0, it’s a real key.
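
The verification step could be sketched roughly as follows. This assumes the sqlcipher command-line shell is on PATH, that the candidate is a raw 32-byte key (passed with SQLCipher's x'...' raw-key syntax), and that a simple query against sqlite_master is enough to prove the key works. The function names and the choice of query are illustrative, and the client's databases may need further settings that are not covered here.

```python
import subprocess


def build_probe_sql(key: bytes) -> str:
    """SQL that should only succeed if key is the raw 32-byte database key."""
    if len(key) != 32:
        raise ValueError("expected a 32-byte raw key")
    return (
        f"PRAGMA key = \"x'{key.hex()}'\"; "
        "PRAGMA cipher_compatibility = 4; "
        "SELECT count(*) FROM sqlite_master;"
    )


def key_opens_database(db_path: str, key: bytes,
                       sqlcipher_bin: str = "sqlcipher") -> bool:
    """Run the probe through the sqlcipher shell; exit code 0 means the key worked."""
    result = subprocess.run(
        [sqlcipher_bin, db_path, build_probe_sql(key)],
        capture_output=True,
        timeout=30,
    )
    return result.returncode == 0
```

Running the probe against a copy of the database file, rather than the one the client has open, keeps the check from interfering with the running app.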

This worked. Every database key was recovered.

Image Decryption

Text messages come from the databases, but images are stored as .dat files with their own encryption. Claude Code reverse-engineered this too.

WeChat images use a two-layer hybrid encryption:

AES-128-ECB on the first 1024 bytes (padded to 1040 bytes with PKCS7)
Single-byte XOR on all remaining bytes
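
The 1040 figure follows from PKCS7: 1024 is already a multiple of the 16-byte AES block size, so padding adds one full extra block. A sketch of how the two layers fit together is below. To stay dependency-free it takes the AES-ECB step as a caller-supplied function instead of calling a crypto library. It also assumes the payload has no header, is at least 1024 bytes long, and that the XOR byte is already known. All names are hypothetical.

```python
from typing import Callable, Tuple

AES_PLAIN_LEN = 1024    # plaintext bytes covered by AES, per the description
AES_CIPHER_LEN = 1040   # 1024 bytes plus one full 16-byte PKCS7 padding block


def split_layers(blob: bytes) -> Tuple[bytes, bytes]:
    """Split a payload into its AES-encrypted head and XOR-encoded tail."""
    return blob[:AES_CIPHER_LEN], blob[AES_CIPHER_LEN:]


def pkcs7_unpad(data: bytes, block: int = 16) -> bytes:
    """Strip PKCS7 padding, raising ValueError if it is malformed."""
    if not data or len(data) % block:
        raise ValueError("length is not a multiple of the block size")
    n = data[-1]
    if not 1 <= n <= block or data[-n:] != bytes([n]) * n:
        raise ValueError("bad PKCS7 padding")
    return data[:-n]


def xor_bytes(data: bytes, key_byte: int) -> bytes:
    """XOR every byte with one key byte (the same call encodes and decodes)."""
    return bytes(b ^ key_byte for b in data)


def decrypt_image(blob: bytes,
                  aes_ecb_decrypt: Callable[[bytes], bytes],
                  xor_key: int) -> bytes:
    """Undo both layers: AES on the head, single-byte XOR on the tail."""
    head, tail = split_layers(blob)
    return pkcs7_unpad(aes_ecb_decrypt(head)) + xor_bytes(tail, xor_key)
```

In practice aes_ecb_decrypt would wrap an AES-128-ECB decryptor from whichever crypto library is available.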

The AES key is stored in process memory as a 32-character hex string, but here’s the trap that Claude Code caught: the key is the first 16 ASCII bytes of that hex string, not the hex-decoded value. If the hex string is 2db48e820850a7cff445fb86ce85a4fa, the actual AES key is b"2db48e820850a7cf", the literal characters 2, d, b, 4…, not the bytes 0x2d, 0xb4, 0x8e.
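
The difference is easy to see in a few lines of Python, using the example string from the paragraph above. Both readings give 16 bytes, which is what makes the wrong one so easy to fall for.

```python
hex_string = "2db48e820850a7cff445fb86ce85a4fa"

# What the text describes: the first 16 characters, taken as ASCII bytes.
aes_key = hex_string[:16].encode("ascii")

# The tempting but wrong reading: hex-decode the whole string.
wrong_key = bytes.fromhex(hex_string)

print(aes_key)               # the literal characters '2', 'd', 'b', '4', ...
print(wrong_key[:3].hex())   # the bytes 0x2d, 0xb4, 0x8e
print(len(aes_key), len(wrong_key))
```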

The hex string is stored XOR-obfuscated in memory, using a compile-time mask baked into the .rodata section of the binary. The mask differs per build: it is different for aarch64 and x86_64, and different between WeChat versions. To extract the key, you find the mask in the binary, scan process memory for XOR-obfuscated hex strings, and apply the mask.
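
The last two steps might look something like the sketch below. It runs on a synthetic buffer and makes several assumptions the text does not confirm: that the mask is 32 bytes long, that it is applied position by position, and that the hidden string is lowercase hex. Finding the mask in .rodata is build-specific and is not shown. The function names are illustrative.

```python
import re
from typing import List, Tuple

HEX32 = re.compile(rb"[0-9a-f]{32}")


def xor_with_mask(data: bytes, mask: bytes) -> bytes:
    """XOR data against mask position by position."""
    return bytes(d ^ m for d, m in zip(data, mask))


def find_masked_hex_strings(memory: bytes, mask: bytes) -> List[Tuple[int, str]]:
    """Report windows of memory that unmask to exactly 32 lowercase hex characters."""
    hits = []
    n = len(mask)
    for off in range(len(memory) - n + 1):
        plain = xor_with_mask(memory[off:off + n], mask)
        if HEX32.fullmatch(plain):
            hits.append((off, plain.decode("ascii")))
    return hits
```

A byte-by-byte slide like this is slow on a real address space. It is meant to show the logic, not to be fast.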

The Chat Selection Problem

Now I had message receiving (via database) and message sending (via accessibility tree paste). But there was a gap: the database uses internal user IDs like wxid_abc123, while the accessibility tree only shows display names. How do you select the right conversation to send a message to?

The solution was a hybrid of reverse engineering and RPA. Claude Code found that WeChat’s main process keeps a session manager object on the heap containing a vector of active sessions, each with its internal wxid at a known offset. By scanning the heap for the manager’s vtable pointer, you can enumerate all sessions and their IDs.
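
Scanning for a vtable pointer comes down to looking for one known 8-byte value at pointer-aligned addresses. A minimal sketch over a synthetic buffer, assuming 64-bit little-endian pointers (true of both x86_64 and aarch64 Linux) and 8-byte alignment. The address below is a placeholder, and the layout of the session objects themselves is not modeled.

```python
import struct
from typing import List


def find_pointer(heap: bytes, pointer: int, align: int = 8) -> List[int]:
    """Aligned offsets where heap holds pointer as a little-endian 64-bit value."""
    needle = struct.pack("<Q", pointer)
    return [
        off
        for off in range(0, len(heap) - len(needle) + 1, align)
        if heap[off:off + len(needle)] == needle
    ]


# Hypothetical vtable address, for illustration only.
VTABLE = 0x00005555DEADB000
needle = struct.pack("<Q", VTABLE)

# One aligned copy at offset 16 and one unaligned copy at offset 27.
heap = b"\x00" * 16 + needle + b"\x01" * 3 + needle + b"\x00" * 5
matches = find_pointer(heap, VTABLE)
print(matches)
```

Restricting the search to aligned offsets skips byte sequences that happen to match but cannot be the first field of a heap object.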

But you can’t just call the session-selection function directly from Frida. It has to run on Qt’s main thread, or the app deadlocks on internal mutexes. The workaround: use xdotool to click on any chat in the sidebar (triggering a real UI event on the main thread), then hook the selection function to redirect it to the target session based on internal ID.

What Claude Code Actually Did

I want to be specific about Claude Code’s role here, because it was more than “AI helped.” Claude Code was the primary reverse engineer.

I described what I wanted. Claude Code:

Identified that WeChat uses WCDB/SQLCipher and determined the encryption parameters
Wrote the Frida scripts to scan memory for cipher contexts and extract keys
Figured out the image encryption scheme (AES-ECB + XOR, the ASCII-bytes key encoding trap)
Reverse-engineered the service registry structure for CDN image downloads
Discovered the chat selection mechanism by scanning heap structures and hooking vtable entries
Generated the multi-pass key filtering algorithm

When Tencent pushed a new WeChat version mid-development, I gave Claude Code the porting documentation, and within two days it found all the updated offsets for both aarch64 and x86_64. The disassembled code is completely different between the two architectures (different register usage, different instruction sequences, different function layouts), but the offsets were extracted correctly.

This is the part that should make you sit up: one person, armed with Claude Code and Frida, reverse-engineered a major commercial application’s encryption, database access, image decryption, and UI automation in about two weeks. The AI didn’t just assist. It drove the discovery process.

Lessons

AI tools are a cybersecurity force multiplier, for offense and defense. If a single person with an AI coding assistant can break open a billion-user application’s local encryption in two weeks, the implications for security research (and security threats) are significant. This isn’t theoretical; this is what I just did as a side project.

Don’t put your security in the client. WeChat’s encryption protects data at rest on the user’s device, but the keys live in the same process memory as the encrypted data. Any tool that can attach to the process (Frida, GDB, /proc/pid/mem) can extract them. This is a fundamental limitation of client-side encryption where the client itself needs the keys.

Design for zero trust. If you’re building a system, assume every client request could be hostile. Don’t rely on the assumption that only your official client will talk to your servers. Validate everything server-side. WeChat’s server-side detection (behavioral analysis, client fingerprinting) is actually the stronger defense; the local encryption is just a speed bump.

The accessibility tree is underrated. AT-SPI on Linux and UI Automation on Windows provide a structured, queryable representation of any GUI application. For the RPA parts of this project, the accessibility tree was far more reliable than screenshot-based approaches. If you’re building any kind of desktop automation, start there.

Wechaty’s puppet architecture is well-designed. Adding a new backend to Wechaty was straightforward. The puppet abstraction cleanly separates “how you connect to WeChat” from “what you do with messages.” My implementation is just another puppet provider that happens to work by instrumenting the official Linux client.

The code is open source at thisnick/agent-wechat. If you’re interested or have thoughts on the security implications, reach out.