Building a Cobalt Strike MCP With 4.12’s REST API and FastMCP
Earlier this year, I was playing around with a personal project consisting of an LLM-based framework for malware development when I hit a critical roadblock: how do you programmatically deploy and test the malware without manual GUI interaction? Anyone who has written malware knows how much of an iterative process it can be, especially if you're working on something novel or somewhat undocumented. It's a trial-and-error process, and as rewarding as it is to see that final debug statement confirming a new TTP works, much of that loop can be automated. This ultimately allows us to spend more time in the research phase and less time in the implementation phase.
I had been following Fortra's webinar series on Cobalt Strike 4.12, and when they announced a REST API was coming, I made a deliberate decision: wait for the official API rather than building fragile workarounds. When CS 4.12 dropped with its REST API and proper OpenAPI specification, combined with Anthropic's Model Context Protocol, I realised building an AI-powered interface to Cobalt Strike would be remarkably straightforward. This post demonstrates how fewer than 300 lines of Python bridge Claude and Cobalt Strike, so we can write the malware and let an LLM exercise it in a controlled environment until it finds the right implementation sweet spot.
Enter MCP and FastMCP
The Model Context Protocol (MCP) is Anthropic's standard for connecting AI models to external tools and data sources. FastMCP is a Python framework that makes building MCP servers simple, and here's the key insight: FastMCP can automatically generate tools from OpenAPI specifications.
This creates an elegant match. Cobalt Strike 4.12 ships with an OpenAPI spec describing every REST API operation; once the csrestapi service is started, the spec is served at https://teamserver:50443/v3/api-docs. FastMCP can consume that spec and dynamically create MCP tools for each operation. Instead of manually defining tools for listing Beacons, uploading files, or executing BOFs, we point FastMCP at the API documentation and let it handle the heavy lifting. The diagram below shows what the implementation will look like. It's messy but you get the idea... The most important takeaway is that there are three main components: the Client Layer (your Claude Desktop instance, or whatever LLM platform you're using to provide natural-language commands), the MCP Server, and Cobalt Strike.
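To make the spec-walking idea concrete, here's a minimal sketch of the metadata a framework pulls out of an OpenAPI document. The toy spec below is invented for illustration; the real document served at /v3/api-docs has the same shape, just far more paths:

```python
# Toy OpenAPI fragment, invented for illustration -- the real spec
# at /v3/api-docs follows the same "paths -> method -> operation" structure.
openapi_spec = {
    "paths": {
        "/api/beacons": {
            "get": {"operationId": "listBeacons", "summary": "List all Beacons"}
        },
        "/api/beacons/{id}/exit": {
            "post": {"operationId": "exitBeacon", "summary": "Exit a Beacon"}
        },
    }
}

# Enumerate every (method, path, operationId) triple -- exactly the
# metadata needed to turn each API operation into a callable tool.
operations = [
    (method.upper(), path, op["operationId"])
    for path, item in openapi_spec["paths"].items()
    for method, op in item.items()
]

for method, path, op_id in operations:
    print(f"{op_id}: {method} {path}")
```

Each of those triples becomes one tool the model can call by name; this is the heavy lifting FastMCP does for us.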
Implementation: The Technical Deep-Dive
The entire CS-MCP server is under 300 lines of Python. Let me walk through the key components.
Authentication and Client Setup
First, we authenticate with Cobalt Strike and obtain a bearer token:
```python
async def authenticate() -> str:
    """Authenticate with Cobalt Strike API and retrieve bearer token"""
    login_url = f"{BASE_URL}/api/auth/login"
    login_data = {
        "username": USERNAME,
        "password": PASSWORD
    }
    async with httpx.AsyncClient(verify=VERIFY_SSL) as client:
        response = await client.post(login_url, json=login_data)
        response.raise_for_status()
        auth_response = response.json()
        bearer_token = auth_response.get("access_token")
        return bearer_token
```

We then initialize an HTTP client with the token for all subsequent requests:
```python
async def initialize_api_client(token: str) -> httpx.AsyncClient:
    """Initialize the API client with bearer token authentication"""
    headers = {
        "Authorization": f"Bearer {token}",
        "Content-Type": "application/json"
    }
    return httpx.AsyncClient(
        base_url=BASE_URL,
        headers=headers,
        verify=VERIFY_SSL,
        timeout=30.0
    )
```

Dynamic Tool Generation
Here's where it gets interesting. We fetch the OpenAPI spec and generate MCP tools from it dynamically:
```python
def create_tools_from_openapi() -> list[Tool]:
    """Create MCP tools from the OpenAPI specification"""
    tools = []
    for path, path_item in openapi_spec["paths"].items():
        for method in ["get", "post", "put", "delete", "patch"]:
            if method not in path_item:
                continue
            operation = path_item[method]
            operation_id = operation.get("operationId")

            # Build input schema from parameters
            properties = {}
            required = []
            if "parameters" in operation:
                for param in operation["parameters"]:
                    param_name = param.get("name")
                    properties[param_name] = {
                        "type": param.get("schema", {}).get("type", "string"),
                        "description": param.get("description", "")
                    }
                    if param.get("required", False):
                        required.append(param_name)

            tool = Tool(
                name=operation_id,
                description=f"{operation.get('summary')}\n\nPath: {method.upper()} {path}",
                inputSchema={
                    "type": "object",
                    "properties": properties,
                    "required": required
                }
            )
            tools.append(tool)
    return tools
```

This code walks through the OpenAPI spec, extracts operation metadata, and creates MCP Tool objects. Every Cobalt Strike operation automatically becomes available to Claude without manual tool definitions. This is the real magic of FastMCP and basically why I went with it.
Executing API Operations
When Claude uses a tool, we translate it into an API call:
```python
async def call_api_operation(operation_id: str, arguments: Dict[str, Any]) -> Any:
    """Call an API operation based on operation ID and arguments"""
    for path, path_item in openapi_spec["paths"].items():
        for method in ["get", "post", "put", "delete", "patch"]:
            if method not in path_item:
                continue
            operation = path_item[method]
            if operation.get("operationId") != operation_id:
                continue
            actual_path = path
            request_params = {}

            # Substitute path parameters and extract query params
            if "parameters" in operation:
                for param in operation["parameters"]:
                    param_name = param.get("name")
                    if param_name in arguments:
                        if param.get("in") == "path":
                            actual_path = actual_path.replace(
                                f"{{{param_name}}}",
                                str(arguments[param_name])
                            )
                        elif param.get("in") == "query":
                            request_params[param_name] = arguments[param_name]

            # Execute the HTTP call
            response = await api_client.request(
                method, actual_path, params=request_params
            )
            response.raise_for_status()
            return response.json()
    # Fail loudly rather than silently returning None for unknown operations
    raise ValueError(f"Unknown operation: {operation_id}")
```

The function matches the operation ID to the spec, constructs the proper URL, and executes the HTTP call. Claude just calls tools by name with arguments; the MCP server handles the translation.
Real-World Usage
With the CS-MCP running, Cobalt Strike interactions become conversational. You can ask Claude to perform high-level operations and it will use the tools exposed via the MCP to complete the task. Some very simple examples are shown below.
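For reference, wiring the server into Claude Desktop is just an entry in claude_desktop_config.json. The server name, script path, and environment variable names below are placeholders; adapt them to however your server reads its configuration:

```json
{
  "mcpServers": {
    "cs-mcp": {
      "command": "python",
      "args": ["/path/to/cs_mcp_server.py"],
      "env": {
        "CS_BASE_URL": "https://teamserver:50443",
        "CS_USERNAME": "operator",
        "CS_PASSWORD": "<redacted>"
      }
    }
  }
}
```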
Now, as useful as this may be in theory, any experienced red teamer will probably be skeptical at this point. Two main issues come to mind:
OPSEC
We want granular control over how TTPs are carried out in real-life scenarios. For example, if we want to spawn a new Beacon, most of the time you'll be using a custom BOF or a .NET tool you wrote to do it. Using the built-in CS commands will often get us detected unless you've done some heavy modification of the kits before setting up the teamserver. Although LLMs are aware of the general idea of OPSEC, they are not yet good at being OPSEC-safe autonomously.
Data confidentiality
Since LLMs started becoming commercially available, we've had to assume that all the data that goes in and out of them is visible to the providers (Anthropic, OpenAI, Google, etc.). This is a major issue when it comes to real engagements because we don't want to be leaking client data.
Proposed Solutions
OPSEC
Regarding the OPSEC issue: the solution is essentially to wait for the next iterations of the REST API, where Fortra has hinted they will be adding a permissions model for finer-grained control of exposed endpoints and privilege groups. A couple of comments on the REST API release post point in this direction:
"Roles or ‘command restrictions’: A junior operator or LLM can be restricted from running unauthorized functionality."
"No roles, authorizations and command restrictions yet: The functionality for roles or ‘command restrictions’ as described in our motivations is not yet provided. However, we did design our API to facilitate this in future releases."
Source: Release Out: Finally, Some REST
Hopefully, this functionality will let operators restrict the LLM's toolkit to only the actions they deem safe for an agent to call autonomously.
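Until that lands, a stopgap is to filter the generated tools against an operator-approved allowlist before the MCP server ever advertises them. A minimal sketch follows; the operation IDs are hypothetical, and the tools are shown as plain dicts rather than real MCP Tool objects:

```python
# Operator-approved operations; anything not listed is never exposed
# to the model. Operation IDs here are hypothetical.
ALLOWED_OPERATIONS = {"listBeacons", "getBeaconLog"}

def filter_tools(tools: list) -> list:
    """Drop any generated tool whose name isn't explicitly approved."""
    return [t for t in tools if t["name"] in ALLOWED_OPERATIONS]

# Dicts stand in for MCP Tool instances in this sketch.
generated = [
    {"name": "listBeacons"},
    {"name": "exitBeacon"},    # destructive -- filtered out
    {"name": "getBeaconLog"},
]
approved = filter_tools(generated)
print([t["name"] for t in approved])  # ['listBeacons', 'getBeaconLog']
```

Note this is client-side defence in depth only: the teamserver still accepts every call from anyone holding a valid token, which is exactly why the server-side permissions model matters.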
Data confidentiality
I have yet to explore solutions to this issue, but a couple of obvious ones are:
- Using self-hosted models, either on your own hardware or on beefy cloud VMs. This option is limited by cost, access to hardware, and the quality of available open-source models.
- Using solutions like AWS Bedrock, where data is not exposed to third parties (to some extent; keep in mind this still runs on AWS infrastructure). The benefit of this option is that it unlocks access to closed-source models too while largely solving the confidentiality issue.
Closing Thoughts
What strikes me most about this project is how accessible it has become. The combination of FastMCP and well-documented REST APIs means building AI-powered security tooling no longer requires heroic engineering effort or long stretches of time. The pattern demonstrated here (authenticate, fetch an OpenAPI spec, generate tools dynamically, route calls to API operations) works for any well-documented API.
I've made the CS-MCP code available on GitHub for operators to use and extend. The barrier to building AI integrations with security tools has dropped dramatically. If you can write basic Python and understand REST APIs, you can build MCPs that connect Claude to your security infrastructure, whether that's your SIEM, vulnerability scanner, C2, or custom internal tools.
GitHub link: