APK Reverse Engineering + Agent Report Generation: How Hermes Agent Turns Binary Analysis into Executive Summaries

An agent that reads decompiled Java, parses network traces, and writes executive summaries is not a chatbot. It is a workflow orchestrator that bridges the gap between low-level reverse engineering toolchains and high-level business reporting.

Adrien Sales reverse-engineered his mobile operator’s APK, extracted private HTTP endpoints, built a Go CLI that snapshots consumption data into DuckDB every 5 minutes, and then used Hermes Agent to generate a 17-page multi-stakeholder report. The agent consumed SQL schemas, API logs, and business context, then produced three different report variants for a CEO, CIO, and network admin. Total cost: $19.57.

This is not a toy demo. It shows how agents orchestrate analysis, extraction, and synthesis workflows when the input is thousands of files and the output is a structured document.

The Reverse Engineering Stack

The mobile operator’s app had no public API. The workflow started with APK decompilation:

APKTool decodes the manifest and resources and disassembles the DEX bytecode into smali
jadx decompiles the DEX bytecode directly to Java source for readability
Manual inspection of network calls in decompiled code reveals private HTTP endpoints
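
The manual inspection step can be partly mechanized. The following is a minimal sketch, not the method Sales used: it assumes the jadx output is a directory tree of .java files and that endpoints appear as plain string literals. The function names and the example URL are hypothetical. URLs assembled at runtime from fragments would not match and still need manual review.

```python
import re
from pathlib import Path

# Matches http(s) URLs that appear as double-quoted Java string literals.
URL_LITERAL = re.compile(r'"(https?://[^"\s]+)"')


def extract_urls(java_source: str) -> list[str]:
    """Return the distinct URL literals in one source file, in sorted order."""
    return sorted(set(URL_LITERAL.findall(java_source)))


def find_endpoints(source_root: str) -> dict[str, list[str]]:
    """Map each URL literal to the names of the decompiled files that mention it."""
    hits: dict[str, list[str]] = {}
    for java_file in sorted(Path(source_root).rglob("*.java")):
        text = java_file.read_text(encoding="utf-8", errors="replace")
        for url in extract_urls(text):
            hits.setdefault(url, []).append(java_file.name)
    return hits
```

Running find_endpoints on the jadx output directory gives a candidate list to verify by hand against real traffic.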

Once endpoints were identified, Sales rebuilt them in a Go CLI that polls every 5 minutes and writes voice, data, and SMS consumption to DuckDB. The CLI also powers KDE Plasma widgets that display live metrics on his desktop.

The agent’s job was not to perform the reverse engineering. The agent’s job was to consume the artifacts (SQL schemas, API logs, consumption trends) and produce reports for three different audiences.

Agent Orchestration Flow

Hermes Agent is a multi-agent framework that supports role-based delegation. Sales designed a three-agent pipeline:

Data Analyst Agent: Queries DuckDB, calculates burn rates, identifies anomalies
Business Analyst Agent: Translates metrics into ROI, SLA compliance, cost projections
Report Writer Agent: Synthesizes findings into audience-specific narratives

Each agent has access to tools:

SQL execution against DuckDB
File system access to read API logs and schemas
PDF generation libraries for final output

The orchestration pattern is sequential with state handoff. The Data Analyst Agent runs queries and writes intermediate JSON. The Business Analyst Agent reads that JSON, applies business logic, and writes another JSON. The Report Writer Agent consumes both and generates Markdown, which is then rendered to PDF.
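
A sequential pipeline with state handoff can be sketched in a few lines. This is an illustration of the pattern, not Hermes Agent's API: the three stage functions are stubs standing in for LLM-backed agents, the numbers are made up, and names such as run_pipeline are hypothetical. Each stage sees only what earlier stages persisted to disk.

```python
import json
from pathlib import Path
from typing import Callable

# A stage receives every artifact written so far and returns its own.
Stage = Callable[[dict], dict]


def load_artifacts(workdir: Path) -> dict:
    """Read every JSON artifact in workdir, keyed by file stem."""
    return {p.stem: json.loads(p.read_text()) for p in sorted(workdir.glob("*.json"))}


def run_pipeline(stages: list[tuple[str, Stage]], workdir: Path) -> dict:
    """Run stages in order, persisting each result as <artifact_name>.json."""
    workdir.mkdir(parents=True, exist_ok=True)
    for artifact_name, stage in stages:
        result = stage(load_artifacts(workdir))
        (workdir / f"{artifact_name}.json").write_text(json.dumps(result, indent=2))
    return load_artifacts(workdir)


# Stub stages standing in for the agents (illustrative values only).
def data_analyst(_: dict) -> dict:
    return {"total_gb": 12.0, "days": 4}


def business_analyst(artifacts: dict) -> dict:
    summary = artifacts["data_summary"]
    return {"burn_rate_gb_per_day": summary["total_gb"] / summary["days"]}


def report_writer(artifacts: dict) -> dict:
    rate = artifacts["business_metrics"]["burn_rate_gb_per_day"]
    return {"markdown": f"# Usage report\n\nBurn rate: {rate:.1f} GB/day\n"}
```

A run would pass the stages as [("data_summary", data_analyst), ("business_metrics", business_analyst), ("report", report_writer)] together with a fresh working directory. Using a new directory per run keeps stale artifacts from leaking into later stages.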

Prompt Engineering for Multi-Stakeholder Output

The hardest part was not querying the database. The hardest part was getting the agent to produce three different reports from the same data:

CEO Report: 30-second summary, screenshot-ready visuals, no jargon
CIO Report: ROI in euros, SLA compliance percentages, cost projections
Network Admin Report: Actionable tickets, specific hours, error patterns

Sales used role-playing prompts. Each agent was given a persona and explicit output constraints:

You are a network administrator. Your audience is technical.
Your report must include:
- Specific timestamps for anomalies
- Error codes and HTTP status patterns
- Actionable remediation steps

Do not include business justifications or ROI calculations.

This pattern works because it constrains the agent’s output space. Without it, the agent produces generic summaries that satisfy no one.
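
One way to keep the three personas consistent is to generate the prompts from structured specs instead of hand-editing free text. The sketch below is an assumption about how that could look, not how Sales built his prompts. ReportSpec and build_prompt are hypothetical names, and the network administrator spec reproduces the prompt quoted above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ReportSpec:
    persona: str                        # e.g. "a network administrator"
    audience: str                       # e.g. "technical"
    must_include: tuple[str, ...]       # required report elements
    must_exclude: tuple[str, ...] = ()  # topics the report must avoid


def build_prompt(spec: ReportSpec) -> str:
    """Render a persona prompt with explicit output constraints."""
    lines = [
        f"You are {spec.persona}. Your audience is {spec.audience}.",
        "Your report must include:",
    ]
    lines += [f"- {item}" for item in spec.must_include]
    if spec.must_exclude:
        lines += ["", "Do not include " + " or ".join(spec.must_exclude) + "."]
    return "\n".join(lines)


NETWORK_ADMIN = ReportSpec(
    persona="a network administrator",
    audience="technical",
    must_include=(
        "Specific timestamps for anomalies",
        "Error codes and HTTP status patterns",
        "Actionable remediation steps",
    ),
    must_exclude=("business justifications", "ROI calculations"),
)
```

Calling build_prompt(NETWORK_ADMIN) yields the prompt text shown earlier. A CEO or CIO variant would differ only in its spec, which makes the constraints easy to diff and review.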

State Management and Intermediate Artifacts

The agent does not load the entire DuckDB dataset into its context window. It queries incrementally and writes intermediate JSON files:

data_summary.json: Raw metrics (total consumption, peak hours, error counts)
business_metrics.json: Derived KPIs (burn rate, cost per GB, SLA compliance)
anomalies.json: Outliers flagged by statistical thresholds

Each agent reads the artifacts it needs and writes new ones. This keeps token budgets manageable and makes the pipeline debuggable. If the Business Analyst Agent produces bad ROI calculations, you can inspect business_metrics.json without re-running the Data Analyst Agent.
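
The source does not say which statistical thresholds feed anomalies.json. As one plausible reading, the sketch below flags samples by z-score. The function name and the default threshold of 3.0 are assumptions, not details from the article.

```python
from statistics import mean, pstdev


def flag_outliers(samples: list[float], z_threshold: float = 3.0) -> list[int]:
    """Return the indices of samples whose z-score exceeds z_threshold."""
    if len(samples) < 2:
        return []
    mu = mean(samples)
    sigma = pstdev(samples)  # population standard deviation
    if sigma == 0:
        return []  # all samples identical, nothing to flag
    return [i for i, x in enumerate(samples) if abs(x - mu) / sigma > z_threshold]
```

With very short series a single spike cannot reach a z-score of 3, so a real pipeline would need enough 5-minute snapshots per window or a more robust statistic such as the median absolute deviation.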

The final Report Writer Agent consumes all three JSON files and generates Markdown. The Markdown is then passed to a PDF renderer (likely Pandoc or a similar tool, though the article does not specify).

Tool Calling and Security Boundaries

The agents have file system access and SQL execution privileges. This is a local workflow, not a SaaS product, so the security boundary is the developer’s machine.

In a production deployment, you would need:

Read-only SQL credentials for the Data Analyst Agent
Sandboxed file system access (no writes outside a designated temp directory)
Rate limiting on PDF generation to prevent resource exhaustion
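
The file system restriction above can be approximated with a path check in the tool layer. This is a minimal sketch under the assumption that every agent file operation goes through one helper. resolve_in_sandbox is a hypothetical name, and the check is not a substitute for OS-level isolation.

```python
from pathlib import Path


def resolve_in_sandbox(sandbox: Path, requested: str) -> Path:
    """Resolve a requested path and refuse anything outside the sandbox root."""
    root = sandbox.resolve()
    # Joining an absolute path discards root, so the check below also
    # rejects absolute paths that point elsewhere.
    target = (root / requested).resolve()
    if not target.is_relative_to(root):  # requires Python 3.9+
        raise PermissionError(f"{requested!r} escapes sandbox {root}")
    return target
```

A write tool would call resolve_in_sandbox before opening the file, so a path such as ../outside.txt is refused instead of silently followed.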

The article does not mention observability, but a production version would log:

Every SQL query executed by the Data Analyst Agent
Every file read/write operation
Token usage per agent
Latency for each stage of the pipeline

Without these logs, debugging a failed report generation is guesswork.
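
A small context manager is enough to capture these fields per stage. The sketch below is illustrative: stage_span and the record keys are hypothetical, and the token count would have to come from whatever usage data the LLM client returns.

```python
import json
import logging
import time
from contextlib import contextmanager

logger = logging.getLogger("pipeline")


@contextmanager
def stage_span(stage: str, records: list):
    """Time one pipeline stage and emit a structured record when it ends."""
    start = time.perf_counter()
    record = {"stage": stage, "status": "ok"}
    try:
        # The caller attaches details, e.g. record["sql"] or record["tokens"].
        yield record
    except Exception as exc:
        record["status"] = "error"
        record["error"] = repr(exc)
        raise
    finally:
        record["latency_ms"] = round((time.perf_counter() - start) * 1000, 3)
        records.append(record)
        logger.info(json.dumps(record))
```

Wrapping each agent call in "with stage_span("data_analyst", records) as rec:" gives one JSON line per stage, including failed stages, which is usually enough to locate where a report run went wrong.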

Failure Modes and Mitigation

Failure Mode	Symptom	Mitigation
Agent hallucinates SQL syntax	Query fails, pipeline stops	Validate SQL with a linter before execution
Business logic drift	ROI calculations change over time	Version control the prompts and JSON schemas
PDF rendering breaks	Markdown is valid but PDF is malformed	Add a Markdown-to-HTML preview step
Token budget exceeded	Agent truncates output mid-sentence	Chunk large datasets, process incrementally
Stale data in DuckDB	Report reflects old consumption patterns	Add a freshness check before pipeline starts
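
The freshness check in the last row is the simplest of these mitigations to sketch. The version below assumes the caller has already fetched the timestamp of the newest snapshot from DuckDB. The 15-minute default (three missed 5-minute polls) and the function names are assumptions for illustration.

```python
from datetime import datetime, timedelta


def is_fresh(latest_snapshot: datetime, now: datetime,
             max_age: timedelta = timedelta(minutes=15)) -> bool:
    """True if the newest snapshot is no older than max_age."""
    return now - latest_snapshot <= max_age


def require_fresh(latest_snapshot: datetime, now: datetime,
                  max_age: timedelta = timedelta(minutes=15)) -> None:
    """Abort the pipeline before any LLM call if the data is stale."""
    if not is_fresh(latest_snapshot, now, max_age):
        age = now - latest_snapshot
        raise RuntimeError(f"latest snapshot is {age} old, limit is {max_age}")
```

Calling require_fresh as the first pipeline step fails fast and avoids paying for a report built on old data.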

The biggest risk is prompt drift. If you tweak the CEO report prompt to add a new metric, you may inadvertently break the CIO report. The solution is to version control all prompts and test all three reports after every change.
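
Version control records that a prompt changed; a fingerprint check can turn that into a test gate. The following sketch assumes a locked manifest of prompt hashes committed alongside the prompts. The names fingerprint and changed_prompts are hypothetical.

```python
import hashlib


def fingerprint(prompt: str) -> str:
    """Short, stable identifier for one prompt text."""
    return hashlib.sha256(prompt.encode("utf-8")).hexdigest()[:12]


def changed_prompts(current: dict[str, str], locked: dict[str, str]) -> list[str]:
    """Names of prompts whose text no longer matches the locked fingerprint."""
    return sorted(
        name for name, text in current.items()
        if locked.get(name) != fingerprint(text)
    )
```

If changed_prompts returns anything, a CI job could regenerate all three reports and require a review before the manifest is updated.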

Deployment Shape

This is a local script, not a web service. The deployment is:

Cron job triggers the Go CLI every 5 minutes to poll the mobile operator’s API
Go CLI writes to DuckDB
Separate cron job (daily or weekly) triggers the Hermes Agent pipeline
Agent pipeline generates three PDFs and writes them to a local directory
PDFs are manually reviewed and distributed

A production version might:

Expose the pipeline as an HTTP API (POST request with date range, returns PDF)
Store PDFs in S3 with signed URLs
Send reports via email or Slack
Add a web UI for selecting date ranges and report types

The current design is optimized for a single user who controls the entire stack. Scaling to multiple users requires authentication, multi-tenancy, and audit logs.

Cost Breakdown

Sales reports a total cost of $19.57 for the entire project. This likely includes:

LLM API calls (GPT-4 or similar for the three agents)
PDF rendering (if using a paid service)

The cost is low because the pipeline runs infrequently (daily or weekly) and the input data is small (a few thousand rows in DuckDB). If you were generating reports for 1,000 customers, the cost would scale roughly linearly with the number of LLM calls and the tokens each call consumes; the local SQL queries add negligible cost.

Token usage is the main cost driver. The Data Analyst Agent reads the entire DuckDB schema on every run. If the schema is large, you could reduce costs by caching the schema and only re-reading it when the database structure changes.
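
One way to read the caching idea: fingerprint the schema DDL and redo the expensive description step only when the fingerprint changes. This is a sketch under that assumption. The describe callback stands in for whatever step turns the DDL into agent-ready notes, and all names are hypothetical.

```python
import hashlib
import json
from pathlib import Path
from typing import Callable


def schema_fingerprint(ddl: str) -> str:
    """Hash of the schema DDL; changes whenever the structure changes."""
    return hashlib.sha256(ddl.encode("utf-8")).hexdigest()


def cached_schema_notes(ddl: str, cache_file: Path,
                        describe: Callable[[str], str]) -> str:
    """Return cached notes if the DDL is unchanged, else recompute and store."""
    current = schema_fingerprint(ddl)
    if cache_file.exists():
        cached = json.loads(cache_file.read_text())
        if cached["fingerprint"] == current:
            return cached["notes"]
    notes = describe(ddl)  # the expensive step, e.g. an LLM call
    cache_file.write_text(json.dumps({"fingerprint": current, "notes": notes}))
    return notes
```

The saving depends on how much of each run is spent on schema discovery, which the source does not quantify.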

Technical Verdict

Use this pattern when:

You have structured data (SQL, JSON, CSV) and need narrative reports for non-technical stakeholders
The input data is too large to fit in a single LLM context window
You need multiple output formats from the same data (CEO summary, technical deep-dive, compliance audit)
You control the entire stack and can debug intermediate artifacts

Avoid this pattern when:

The input data is unstructured (images, PDFs, audio) and requires multimodal models
You need real-time report generation (the multi-agent pipeline adds latency)
You cannot version control prompts and JSON schemas (prompt drift will break the pipeline)
You need to scale to thousands of concurrent users (the current design is single-user)

The real insight here is that agents are not magic. They are orchestrators. The value is in the pipeline design: which agent runs when, what artifacts it consumes, what artifacts it produces, and how you validate the output. The LLM is just one component in a larger workflow.

Source Links

Primary Article