Source code → github.com/pon00050/krff-shell
Behind the four-repo platform layer that ingests DART, scores anomalies, and runs statistical validation, there has to be a delivery layer. Otherwise the system is a directory full of parquet files and a researcher with a SQL client.
krff-shell is that delivery layer. It is the CLI that orchestrates the pipeline. It is the DuckDB query layer that wraps the parquets behind a clean API. It is the per-company HTML report generator that turns a parquet row into a visual forensic dossier. And it is the MCP server that exposes the entire forensic-finance stack as eleven tools that Claude (or any MCP-aware client) can call in conversation.
The MCP layer is the most interesting piece, and it is what this post is about.
Why MCP Matters for This Stack
Forensic finance work has an unusual access pattern. The questions are concrete — “give me the Beneish trajectory for this ticker over the last five years” — but they come from people who do not write SQL. Lawyers building cases. Journalists researching a story. Compliance officers checking exposure. Regulators triaging a backlog. Every one of those users has a specific factual question that the parquet files contain the answer to. None of them is going to learn DuckDB syntax to ask it.
The conventional path is to build a web frontend with forms for every question type. That works for the questions you anticipated. It fails for the long tail. A real forensic question often combines three or four facts in a way the form designer did not predict — “show me the M-Score history for companies whose officers also sit on the board of Company A, filtered to those with CB issuances above ₩10B in the last two years.” Building UI for that combinatorial space is hopeless.
MCP solves the access problem differently. The tools expose specific factual primitives — get the Beneish history, list flagged companies, fetch the officer-network neighbors — and Claude composes them into answers. The user types the question in natural language. Claude figures out which tools to call in what order. The system answers.
The Eleven Tools
Each tool is a single Python function with typed parameters and a documented return shape; a representative definition is sketched below the table. The full set:
| # | Tool | What it returns |
|---|---|---|
| 1 | lookup_corp_code | Resolves a Korean or English company name (or ticker) to the canonical DART corp_code. Always called first. |
| 2 | get_company_summary | One-page profile: market, ticker, BRN, CRN, recent disclosures, current flags. |
| 3 | get_beneish_scores | M-Score time series with all 8 components, plus sector percentiles and DART links. |
| 4 | get_cb_bw_events | Convertible bond and bond-with-warrant issuance and repricing history. |
| 5 | get_price_volume | OHLCV time series in a configurable window. |
| 6 | get_officer_holdings | Per-officer shareholding changes over time. |
| 7 | get_timing_anomalies | Disclosure filings clustered around abnormal price or volume days. |
| 8 | get_major_holders | 5%+ ownership filings (대량보유상황보고서). |
| 9 | get_officer_network | The officer-network neighbors of a company in the cross-company directorship graph. |
| 10 | search_flagged_companies | The current priority queue: companies with multiple active flags, with rationale. |
| 11 | search_jfia_literature | Searches the 469-article JFIA catalog for forensic accounting research relevant to a query. |
lookup_corp_code is the entry point because every other tool keys on corp_code. The MCP convention is that Claude calls it first, gets the canonical identifier, then composes the rest of the call sequence.
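As a concrete illustration of that shape, here is roughly what one tool definition can look like, assuming the FastMCP decorator API from the MCP Python SDK. The function body, column names, and the db.query helper are illustrative sketches under those assumptions, not the exact krff-shell code.

```python
# Sketch of one MCP tool: typed parameters, documented return shape,
# all data access routed through the DuckDB query layer.
from mcp.server.fastmcp import FastMCP

from krff import db  # the DuckDB query layer described below (assumed import path)

mcp = FastMCP("krff")

@mcp.tool()
def get_beneish_scores(corp_code: str, years: int = 5) -> list[dict]:
    """M-Score time series for one company, oldest year first.

    Returns one dict per year with the M-Score, its eight components,
    and the sector percentile for that year.
    """
    rows = db.query(
        """
        SELECT year, m_score, dsri, gmi, aqi, sgi, depi, sgai, lvgi, tata,
               sector_percentile
        FROM beneish_scores
        WHERE corp_code = ?
        ORDER BY year DESC
        LIMIT ?
        """,
        [corp_code, years],
    )
    return list(reversed(rows))
```

The important part is not the SQL but the contract: one function, one factual primitive, a return shape Claude can read back to the user without further transformation.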
The second-most-used pattern is search_flagged_companies followed by the relevant detail tools — Claude pulls the current flag list, picks the cases that match the user’s interest, and drills into each with the per-company tools. From the user’s perspective, none of that is visible: they ask “which companies should I be looking at this week?” and Claude returns a ranked answer with the underlying signals.
What the DuckDB Layer Does Underneath
Every tool is a thin wrapper over krff/db.py — the DuckDB query layer that owns all parquet access. There is no direct pandas reading anywhere in the MCP server. The reasons are mundane but accumulate.
DuckDB reads parquets faster than pandas for the column-projected, row-filtered queries the MCP tools issue. The 100-row M-Score history for a single company is a WHERE corp_code = ? ORDER BY year against beneish_scores.parquet — DuckDB returns it in milliseconds; pandas loads the whole file. For an MCP server expected to serve interactive turn-taking, that latency difference is the difference between usable and unusable.
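To make the latency claim concrete, here is the access-pattern difference in miniature. The file path and column names are assumptions about the parquet schema; the point is the projected, filtered read versus the whole-file load.

```python
import duckdb
import pandas as pd

corp_code = "00126380"  # illustrative corp_code
con = duckdb.connect()

# DuckDB: projects two columns and pushes the filter down into the parquet,
# so only the row groups that can contain this corp_code are read.
history = con.execute(
    """
    SELECT year, m_score
    FROM 'data/beneish_scores.parquet'
    WHERE corp_code = ?
    ORDER BY year
    """,
    [corp_code],
).df()

# pandas: loads the entire file into memory first, then filters.
all_rows = pd.read_parquet("data/beneish_scores.parquet")
history_pd = all_rows.loc[all_rows["corp_code"] == corp_code, ["year", "m_score"]]
```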
The query() function is parameterized — every user input goes through DuckDB’s parameter binding, never string interpolation. This was added after a SQL injection vulnerability in an early version. The pattern is now boring: every tool that accepts a parameter passes it through query() with ? placeholders.
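A plausible shape for that helper, sketched under the assumption that krff/db.py exposes a single query() entry point over one DuckDB connection:

```python
# One place owns parquet access; every user-supplied value arrives via params.
# The view setup and names are assumptions, not the real module.
import duckdb

_con = duckdb.connect()
_con.execute(
    "CREATE OR REPLACE VIEW beneish_scores AS "
    "SELECT * FROM 'data/beneish_scores.parquet'"
)

def query(sql: str, params: list | None = None) -> list[dict]:
    """Run a read-only query; user input is always bound, never interpolated."""
    cur = _con.execute(sql, params or [])
    cols = [d[0] for d in cur.description]
    return [dict(zip(cols, row)) for row in cur.fetchall()]

# Tools never build SQL strings around user input:
rows = query(
    "SELECT * FROM beneish_scores WHERE corp_code = ? ORDER BY year",
    ["00126380"],
)
```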
There is an async variant for the FastAPI endpoints (the same tools are exposed as REST as well as MCP). Both call paths go through the same query layer. There is no second copy of the SQL anywhere.
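One way the async path can share that layer without a second copy of the SQL is to run the synchronous helper in a worker thread. The endpoint and import paths below are illustrative:

```python
# Sketch: the FastAPI route calls the same db.query as the MCP tool,
# dispatched via asyncio.to_thread so the event loop is never blocked.
import asyncio
from fastapi import FastAPI

from krff import db  # assumed import path for the query layer

app = FastAPI()

async def query_async(sql: str, params: list | None = None) -> list[dict]:
    return await asyncio.to_thread(db.query, sql, params)

@app.get("/companies/{corp_code}/beneish")
async def beneish_scores(corp_code: str, years: int = 5):
    # Identical SQL to the MCP tool; only the transport differs.
    return await query_async(
        "SELECT * FROM beneish_scores WHERE corp_code = ? ORDER BY year DESC LIMIT ?",
        [corp_code, years],
    )
```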
The Per-Company HTML Report
Outside the MCP layer, the most-used delivery surface is <corp_code>_report.html — a self-contained HTML file generated per company that bundles the full forensic dossier into one document.
The report has four sections:

- M-Score trend — eight years of M-Score history with the eight-component breakdown, threshold lines, and sector percentile bands.
- CB/BW timeline — every issuance and repricing event with conversion price, market price, moneyness, and dilution flag.
- Timing anomalies — disclosure filings overlaid on price and volume time series, with abnormal-day highlighting.
- Officer holdings and network — shareholding changes and the company’s position in the cross-company directorship graph.
Plotly powers the interactive charts. The HTML is fully self-contained — no external JavaScript, no API calls at view time. A reviewer can hand the report to a non-technical colleague who can open it in a browser. The sections themselves are populated by the same DuckDB queries the MCP tools use; the report is structurally just a pre-baked composition of MCP tool outputs.
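A minimal sketch of how one such section might be rendered, assuming the db.query helper from above and Plotly's ability to inline its JavaScript into the output. The threshold is a commonly used Beneish cutoff; the column names are assumptions about the schema:

```python
import plotly.graph_objects as go

from krff import db  # same DuckDB query layer the MCP tools use (assumed path)

def mscore_section_html(corp_code: str) -> str:
    rows = db.query(
        "SELECT year, m_score FROM beneish_scores WHERE corp_code = ? ORDER BY year",
        [corp_code],
    )
    fig = go.Figure(go.Scatter(
        x=[r["year"] for r in rows],
        y=[r["m_score"] for r in rows],
        mode="lines+markers",
        name="M-Score",
    ))
    # -1.78 is the commonly cited Beneish manipulation threshold.
    fig.add_hline(y=-1.78, line_dash="dash", annotation_text="manipulation threshold")
    # include_plotlyjs=True inlines plotly.js so the file opens offline.
    return fig.to_html(full_html=False, include_plotlyjs=True)
```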
When ANTHROPIC_API_KEY is configured, an optional final section adds Claude-generated narrative synthesis — a few paragraphs walking the reader through what the signals collectively suggest, what the priority follow-up questions are, and what the relevant Korean enforcement precedents would be. The synthesis is clearly labeled as AI-generated and remains opt-in.
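A sketch of what that opt-in step can look like with the Anthropic Python SDK. The prompt, model id, and HTML wrapping are illustrative, not the shipped implementation:

```python
import os
import anthropic

def narrative_section(dossier_summary: str) -> str | None:
    if not os.environ.get("ANTHROPIC_API_KEY"):
        return None  # section is skipped entirely when the key is absent
    client = anthropic.Anthropic()
    message = client.messages.create(
        model="claude-sonnet-4-5",  # placeholder model id
        max_tokens=1500,
        messages=[{
            "role": "user",
            "content": (
                "You are summarizing forensic-finance signals for a human reviewer. "
                "Explain what the following signals collectively suggest and list "
                "priority follow-up questions. Signals:\n\n" + dossier_summary
            ),
        }],
    )
    text = message.content[0].text
    return (
        '<section class="ai-generated"><h2>AI-generated synthesis</h2>'
        f"<p>{text}</p></section>"
    )
```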
What This Layer Is Not
It is not a hosted service. The MCP server runs locally; the FastAPI service runs locally; the reports are generated locally. There is no persistent backend, no user accounts, no rate limits, no access control. For analysts running the system on their own data, this is the right shape. For a hosted multi-tenant deployment, additional infrastructure is required — and is intentionally out of scope for the open-source release.
It is not a fully-automated alerting system. The eleven MCP tools and the per-company reports are query interfaces — pull, not push. There is no scheduled job that emails you “company X has tripped flag Y this week.” Building that on top is straightforward (the DuckDB layer makes the queries trivial), but it is application logic, not infrastructure.
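For a sense of how little is missing, here is a hypothetical cron-style script that diffs the current flag list against the last run. The flags table and its columns are assumptions about the schema:

```python
import json
from pathlib import Path

from krff import db  # the same DuckDB query layer (assumed path)

STATE = Path("seen_flags.json")  # remembers which flags were already reported

def new_flags() -> list[dict]:
    current = db.query(
        "SELECT corp_code, flag, rationale FROM flags WHERE active = TRUE"
    )
    seen = set(json.loads(STATE.read_text())) if STATE.exists() else set()
    fresh = [r for r in current if f"{r['corp_code']}:{r['flag']}" not in seen]
    STATE.write_text(
        json.dumps(sorted({f"{r['corp_code']}:{r['flag']}" for r in current}))
    )
    return fresh  # hand these to whatever email or chat notifier you prefer

if __name__ == "__main__":
    for row in new_flags():
        print(f"NEW FLAG {row['corp_code']}: {row['flag']} ({row['rationale']})")
```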
It is not a substitute for human review. Every flag the system produces is a signal that requires interpretation. The MCP tools and reports surface the signals quickly and consistently; they do not draw conclusions. The reframe of forensic-finance work the system enables is “review more cases, faster” — not “automate the review.”
Why It Was Built This Way
Forensic infrastructure like this has historically existed in Korean capital markets only as closed-source institutional products. The underlying data is public, and it deserves an infrastructure layer that does not gate access by institutional affiliation. This stack is the open-data equivalent.
The MCP interface is the bet that the most useful access pattern for that infrastructure is conversational. The tools are typed, parameterized, and individually documented. Claude composes them. The user asks the question.
The repository is at github.com/pon00050/krff-shell. MIT license. 317 tests. Korean-bilingual documentation. The platform layer it sits on is documented separately at forensic-platform-architecture.