Hyland Connect

angelborroy · 04-21-2026

A practical walkthrough of the Alfresco Model Context Protocol, a Spring AI agent, and an Angular extension — from zero to a working summarization feature.

Why Alfresco + AI, and Why Now

Alfresco Content Services is a mature, enterprise-grade ECM platform. It stores documents, enforces governance, and provides a rich REST API. What it doesn't do out of the box is reason about content: summarizing a 40-page report, classifying an invoice, or answering a question about a folder full of contracts.

Large language models can do all of that, but connecting them to Alfresco safely and repeatably is where things get complicated. That is exactly what the Alfresco MCP Server solves, and what the sample project demonstrates end-to-end.

What Is the Model Context Protocol?

The Model Context Protocol (MCP) is an open standard that lets AI agents call external systems through a well-defined, discoverable interface. Think of it as a USB standard for AI tool use: a server exposes a list of tools, each with a name, a description, and a JSON schema for its inputs. Any MCP-aware client (an LLM framework, a chat application, a custom agent) can discover and call those tools without knowing anything about the underlying system.
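To make the discovery idea concrete, here is a minimal Java sketch of what a client might keep for each discovered tool. The `ToolDescriptor` class and its `requiredInputs` field are illustrative names, not part of any MCP SDK, and the set of required inputs is a simplification of a full JSON schema.

```java
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

/** Illustrative model of one MCP tool entry: name, description, and required inputs. */
final class ToolDescriptor {
    final String name;
    final String description;
    // Stands in for the "required" list of the tool's JSON input schema (simplified).
    final Set<String> requiredInputs;

    ToolDescriptor(String name, String description, Set<String> requiredInputs) {
        this.name = name;
        this.description = description;
        this.requiredInputs = Set.copyOf(requiredInputs);
    }

    /** Returns the required inputs that are absent from the arguments, sorted by name. */
    List<String> missingInputs(Map<String, ?> arguments) {
        TreeSet<String> missing = new TreeSet<>(requiredInputs);
        missing.removeAll(arguments.keySet());
        return List.copyOf(missing);
    }
}
```

A client could run a check like `missingInputs` before sending a call, so that an incomplete request fails locally instead of on the server.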

For Alfresco, this means:

Without MCP	With MCP
Agent must know the Alfresco REST API surface	Agent calls named tools like `search_content`, `get_text_content`
Authentication handled ad hoc per request	Ticket forwarded as a standard `alf_ticket` parameter
Tight coupling between agent code and Alfresco version	MCP server abstracts the REST layer; agent is version-agnostic
Every new workflow requires new REST client code	New workflows reuse the same tool registry

The Alfresco MCP Server

The alfresco-mcp-server is a lightweight Python service that wraps the Alfresco REST API and exposes it as an MCP endpoint. It runs in a single Docker container and is available on Docker Hub:

docker pull angelborroy/alfresco-mcp-server:latest

Transport modes

The server supports three MCP transport protocols:

Mode	Use case
`stdio`	Local development, CLI tools, desktop AI assistants
`SSE` (Server-Sent Events)	Browser clients, streaming responses
`http` (StreamableHTTP)	Server-to-server, recommended for production agents

For the sample project we use HTTP mode. The endpoint is always at /mcp (not /):

http://localhost:8003/mcp
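Since pointing a client at the bare host is an easy mistake, a client can normalize whatever URL it is given. The sketch below is one possible approach; `McpEndpoint` is a hypothetical helper, not something the sample project is known to contain.

```java
/** Hypothetical helper that makes sure a configured URL ends in the /mcp path. */
final class McpEndpoint {
    private McpEndpoint() {}

    static String normalize(String baseUrl) {
        String trimmed = baseUrl.strip();
        // Drop any trailing slashes so "host/" and "host/mcp/" are handled the same way.
        while (trimmed.endsWith("/")) {
            trimmed = trimmed.substring(0, trimmed.length() - 1);
        }
        return trimmed.endsWith("/mcp") ? trimmed : trimmed + "/mcp";
    }
}
```

With this in place, `http://localhost:8003`, `http://localhost:8003/` and `http://localhost:8003/mcp` all resolve to the same endpoint.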

Available tools

The server exposes a rich set of tools covering the full document lifecycle:

Tool	What it does
`search_content`	Full-text search (AFTS)
`advanced_search`	AFTS with sorting
`search_by_metadata`	Filter by creator, content type, custom metadata
`cmis_search`	CMIS query
`browse_repository`	List folder contents
`get_node_properties`	Retrieve all metadata for a node
`update_node_properties`	Set name, title, description, author
`get_text_content`	Request the PDF rendition and return it as base64
`download_document`	Download raw document content
`upload_document`	Upload a new file
`create_folder`	Create a folder
`delete_node`	Move to trash or permanently delete
`checkout_document`	Lock a document and create a working copy
`checkin_document`	Check in with optional new content
`cancel_checkout`	Discard working copy
`get_repository_info_tool`	Repository version and capabilities
`set_ticket`	Session-wide ticket (for testing)
`set_host`	Change Alfresco host at runtime

Authentication model

Alfresco uses ticket-based authentication. The MCP server never stores credentials: every tool call accepts an alf_ticket parameter that it forwards to the Alfresco REST API. Tickets are obtained at runtime by the caller and passed per-request:

# Obtain a ticket
curl -s -X POST http://localhost:8080/alfresco/api/-default-/public/authentication/versions/1/tickets \
  -H "Content-Type: application/json" \
  -d '{"userId": "admin", "password": "admin"}' | jq -r .entry.id
# → TICKET_xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx

This design means the MCP server is completely stateless with respect to credentials. It can be safely shared by multiple users or agents, each providing their own ticket.
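On the caller's side, the per-request model can be sketched as a small helper that attaches the ticket to each tool call's arguments and keeps nothing afterwards. `TicketArguments` is a hypothetical name, and rejecting blank tickets is an assumption made for this example rather than documented server behaviour.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper: adds the caller's ticket to a tool call without storing it. */
final class TicketArguments {
    private TicketArguments() {}

    /** Returns a copy of the arguments with alf_ticket added; the input map is not modified. */
    static Map<String, Object> withTicket(Map<String, ?> arguments, String ticket) {
        if (ticket == null || ticket.isBlank()) {
            // Assumption for this sketch: fail fast instead of sending an unauthenticated call.
            throw new IllegalArgumentException("An Alfresco ticket is required for every tool call");
        }
        Map<String, Object> copy = new LinkedHashMap<>(arguments);
        copy.put("alf_ticket", ticket.strip());
        return copy;
    }
}
```

Because the helper has no fields, two users calling it concurrently cannot see each other's tickets.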

The Sample Project: Three-Phase Architecture

The sample repository is structured in three progressive phases, each building on the previous one:

┌─────────────────────────────────────────────────────────────────┐
│  Phase 3 — ACA Extension (Angular / NgRx)                       │
│  Right-click → "Summarize Description"                          │
└───────────────────────────┬─────────────────────────────────────┘
                            │  POST /summarize/{nodeId}
                            │  X-Alfresco-Ticket: TICKET_...
                            ▼
┌─────────────────────────────────────────────────────────────────┐
│  Phase 2 — AI Agent (Spring Boot 3 / Spring AI)                 │
│  Extracts text via MCP → calls LLM → writes description via MCP │
└──────────┬──────────────────────────────────────────────────────┘
           │  MCP tool calls (StreamableHTTP)         │  OpenAI-compatible
           ▼                                          ▼  chat completions
┌────────────────────────┐              ┌─────────────────────────┐
│  Phase 1 — MCP Server  │              │  Docker Model Runner    │
│  Alfresco REST wrapper │              │  Local LLM inference    │
└────────────────────────┘              └─────────────────────────┘
           │
           ▼
┌────────────────────────┐
│  Alfresco Content      │
│  Services (external)   │
└────────────────────────┘

Service ports at a glance

Service	Port	URL
Alfresco Content Services	8080	`http://localhost:8080`
Alfresco MCP Server	8003	`http://localhost:8003/mcp`
AI Agent (Phase 2)	8081	`http://localhost:8081`
Docker Model Runner	12434	`http://localhost:12434/engines/v1`
ACA Extension (Phase 3)	4200	`http://localhost:4200`

Phase 1. Running the MCP Server

The first phase is deliberately minimal: spin up the MCP server and verify that it can reach Alfresco.

# phase-1-mcp-server/compose.yaml
services:
  alfresco-mcp-server:
    image: angelborroy/alfresco-mcp-server:1.1.0
    environment:
      TRANSPORT_MODE: http
      PORT: 8003
      ALFRESCO_HOST: ${ALFRESCO_URL:-http://host.docker.internal:8080}
    ports:
      - "8003:8003"
    extra_hosts:
      - "host.docker.internal:host-gateway"

Copy .env.example to .env, set ALFRESCO_URL to point at your Alfresco instance, and start:

docker compose -f phase-1-mcp-server/compose.yaml --env-file phase-1-mcp-server/.env up -d

You can now call any tool directly. For example, to search for documents containing "annual report":

curl -s -X POST http://localhost:8003/mcp \
  -H "Content-Type: application/json" \
  -d '{
    "jsonrpc": "2.0",
    "id": 1,
    "method": "tools/call",
    "params": {
      "name": "search_content",
      "arguments": {
        "query": "annual report",
        "max_results": 5,
        "alf_ticket": "TICKET_xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx"
      }
    }
  }' | jq .
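The same request can also be assembled from Java with the JDK's built-in `java.net.http` classes. This is a sketch that mirrors the curl call above; `McpSearchRequest` is a hypothetical class, and its `escape` method only covers backslashes and double quotes, so a real client would use a JSON library.

```java
import java.net.URI;
import java.net.http.HttpRequest;

/** Hypothetical builder for the search_content tools/call request shown above. */
final class McpSearchRequest {
    private McpSearchRequest() {}

    /** Escapes backslashes and double quotes only; not a complete JSON encoder. */
    static String escape(String value) {
        return value.replace("\\", "\\\\").replace("\"", "\\\"");
    }

    /** Builds the JSON-RPC 2.0 body for a search_content call. */
    static String payload(int id, String query, int maxResults, String ticket) {
        return "{\"jsonrpc\":\"2.0\",\"id\":" + id
                + ",\"method\":\"tools/call\",\"params\":{\"name\":\"search_content\","
                + "\"arguments\":{\"query\":\"" + escape(query) + "\","
                + "\"max_results\":" + maxResults + ","
                + "\"alf_ticket\":\"" + escape(ticket) + "\"}}}";
    }

    /** Wraps the body in a POST request with the same header as the curl example. */
    static HttpRequest build(String endpoint, int id, String query, int maxResults, String ticket) {
        return HttpRequest.newBuilder(URI.create(endpoint))
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(payload(id, query, maxResults, ticket)))
                .build();
    }
}
```

The resulting request can be sent with `HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofString())`.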

This phase is also where you can configure a desktop AI assistant (Claude Desktop, Cursor, VS Code GitHub Copilot) to point at the MCP server and start exploring Alfresco interactively. The mcp.json file in the phase folder contains the ready-to-paste client configuration.

Phase 2. The AI Agent

Phase 2 is where the real integration happens. A Spring Boot 3 application uses two Spring AI beans:

McpSyncClient (StreamableHTTP transport) → talks to the MCP server
ChatClient (OpenAI-compatible) → talks to Docker Model Runner for local LLM inference

Spring AI configuration

# application.properties

# MCP Server — StreamableHTTP transport
spring.ai.mcp.client.streamable-http.connections.alfresco.url=${MCP_SERVER_URL}
spring.ai.mcp.client.request-timeout=60s

# OpenAI-compatible endpoint (Docker Model Runner)
spring.ai.openai.api-key=unused
spring.ai.openai.base-url=${MODEL_RUNNER_URL}
spring.ai.openai.chat.options.model=${LLM_MODEL}
spring.ai.openai.chat.options.temperature=0.3
spring.ai.openai.chat.options.max-tokens=2000
spring.ai.openai.read-timeout=300s

Spring AI auto-configures both beans from these properties — no boilerplate client setup required.

The summarization endpoint

The agent exposes a single endpoint:

POST /summarize/{nodeId}
X-Alfresco-Ticket: TICKET_…

The ticket arrives as an HTTP header, is passed through to every MCP tool call as alf_ticket, and is never stored anywhere:

@PostMapping("/summarize/{nodeId}")
public ResponseEntity<Map<String, String>> summarize(
        @PathVariable String nodeId,
        @RequestHeader(value = "X-Alfresco-Ticket", defaultValue = "") String ticket) {

    String summary = summarizeService.summarize(nodeId, ticket);
    return ResponseEntity.ok(Map.of("nodeId", nodeId, "summary", summary));
}

Text extraction via MCP

The McpService implements a two-step extraction strategy: first request the PDF rendition via get_text_content and decode it with Apache PDFBox; if that fails, fall back to download_document decoded as UTF-8.

public String extractText(String nodeId, String ticket) {
    // Step 1: get_text_content → creates rendition → polls until CREATED → returns base64 PDF
    String pdfB64 = callTool("get_text_content",
            Map.of("node_id", nodeId, "alf_ticket", ticket));

    if (pdfB64 != null && !pdfB64.isBlank()) {
        try {
            byte[] pdfBytes = Base64.getDecoder().decode(pdfB64.strip());
            try (PDDocument doc = Loader.loadPDF(pdfBytes)) {
                String text = new PDFTextStripper().getText(doc).strip();
                if (!text.isBlank()) return truncate(text);
            }
        } catch (Exception e) {
            log.warn("PDF extraction failed: {}", e.getMessage());
        }
    }

    // Step 2: fallback — download raw content (plain text, CSV, …)
    String raw = callTool("download_document",
            Map.of("node_id", nodeId, "save_to_disk", false, "alf_ticket", ticket));
    return truncate(decodeBase64OrReturn(raw));
}
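The method above relies on two helpers, truncate and decodeBase64OrReturn, whose bodies are not shown. The sketch below is one plausible way to write them. The 12,000-character limit and the decision to treat every Base64-decodable string as encoded content are assumptions, not the sample project's confirmed behaviour.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

/** Hypothetical versions of the helpers used by extractText. */
final class TextHelpers {
    // Assumed cap that keeps the prompt within a small local model's context window.
    static final int MAX_CHARS = 12_000;

    private TextHelpers() {}

    /** Cuts the text to MAX_CHARS characters; null becomes an empty string. */
    static String truncate(String text) {
        if (text == null) {
            return "";
        }
        return text.length() <= MAX_CHARS ? text : text.substring(0, MAX_CHARS);
    }

    /** Decodes Base64 to UTF-8 text, or returns the input unchanged if it is not valid Base64. */
    static String decodeBase64OrReturn(String raw) {
        if (raw == null || raw.isBlank()) {
            return "";
        }
        try {
            byte[] bytes = Base64.getDecoder().decode(raw.strip());
            return new String(bytes, StandardCharsets.UTF_8);
        } catch (IllegalArgumentException notBase64) {
            return raw;
        }
    }
}
```

One limitation of this approach: a short plain-text document that happens to be valid Base64 would be decoded by mistake, so a stricter implementation might check the content type reported by the tool first.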

The get_text_content tool handles the Alfresco rendition lifecycle automatically: it issues a POST to the rendition API, polls until the PDF is ready, then fetches and returns the content — all within a single MCP tool call.
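The poll-until-ready step follows a common pattern, sketched below in Java for illustration only. The MCP server itself is written in Python, so this is not its code; `RenditionPoller`, the attempt limit, and the delay are all hypothetical.

```java
import java.util.Optional;
import java.util.function.Supplier;

/** Illustrative bounded polling loop; not the MCP server's implementation. */
final class RenditionPoller {
    private RenditionPoller() {}

    /**
     * Calls the probe until it returns a value or maxAttempts is reached.
     * The probe returns Optional.empty() while the rendition is still being created.
     */
    static <T> Optional<T> pollUntilReady(Supplier<Optional<T>> probe, int maxAttempts, long delayMillis) {
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            Optional<T> result = probe.get();
            if (result.isPresent()) {
                return result;
            }
            if (attempt < maxAttempts) {
                try {
                    Thread.sleep(delayMillis);
                } catch (InterruptedException e) {
                    // Restore the interrupt flag and give up rather than keep polling.
                    Thread.currentThread().interrupt();
                    return Optional.empty();
                }
            }
        }
        return Optional.empty();
    }
}
```

Bounding the number of attempts matters here: a document that cannot be rendered to PDF should make the tool call fail in a predictable time rather than hang.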

LLM summarization

Once text is extracted it is sent to the LLM with a focused system prompt:

private static final String SYSTEM_PROMPT =
    "You are a document summarizer. Given the text content of a document, " +
    "return a concise 2-3 sentence summary suitable for a document description field. " +
    "Do not include any preamble or explanation — output only the summary.";

public String summarize(String text) {
    return chatClient.prompt()
            .system(SYSTEM_PROMPT)
            .user(text)
            .call()
            .content();
}
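For readers curious what travels over the wire, the sketch below builds roughly the kind of chat-completions body an OpenAI-compatible endpoint expects for this call. It is an approximation under stated assumptions: `ChatRequestBody` is a hypothetical class, the temperature and token limit are copied from the properties shown earlier, and the exact body Spring AI produces may contain additional fields.

```java
/** Hypothetical builder approximating an OpenAI-compatible chat completion request body. */
final class ChatRequestBody {
    private ChatRequestBody() {}

    /** Encodes a Java string as a JSON string literal, escaping quotes and control characters. */
    static String quote(String s) {
        StringBuilder out = new StringBuilder("\"");
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            switch (c) {
                case '"':  out.append("\\\""); break;
                case '\\': out.append("\\\\"); break;
                case '\n': out.append("\\n"); break;
                case '\r': out.append("\\r"); break;
                case '\t': out.append("\\t"); break;
                default:
                    if (c < 0x20) {
                        out.append(String.format("\\u%04x", (int) c));
                    } else {
                        out.append(c);
                    }
            }
        }
        return out.append('"').toString();
    }

    /** Builds a body with one system message and one user message. */
    static String build(String model, String systemPrompt, String documentText) {
        return "{\"model\":" + quote(model)
                + ",\"temperature\":0.3,\"max_tokens\":2000"
                + ",\"messages\":["
                + "{\"role\":\"system\",\"content\":" + quote(systemPrompt) + "},"
                + "{\"role\":\"user\",\"content\":" + quote(documentText) + "}]}";
    }
}
```

The point of the sketch is the split of responsibilities: the fixed instructions go in the system message, and the extracted document text is the only content of the user message.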

Writing the result back

After the LLM returns a summary, the agent calls update_node_properties through MCP to set the cm:description field on the document:

callTool("update_node_properties", Map.of(
        "node_id", nodeId,
        "description", summary,
        "alf_ticket", ticket));

The complete round-trip:

Agent → MCP: get_text_content(node_id, alf_ticket)
MCP → Alfresco: POST /renditions (pdf) → poll → GET content
MCP → Agent: base64 PDF

Agent → PDFBox: extract plain text
Agent → Docker Model Runner: chat completion (system + text)
LLM → Agent: 2-3 sentence summary

Agent → MCP: update_node_properties(node_id, description=summary, alf_ticket)
MCP → Alfresco: PUT /nodes/{nodeId}
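The round-trip above boils down to three steps chained together. The following sketch models them as plain functional interfaces so the flow can be read, and tested, without Spring, MCP, or an LLM. `SummarizePipeline` and `DescriptionWriter` are hypothetical names, and failing on empty text is an assumption rather than the sample's confirmed behaviour.

```java
import java.util.function.BiFunction;
import java.util.function.UnaryOperator;

/** Hypothetical orchestration of the extract, summarize, write-back round-trip. */
final class SummarizePipeline {
    /** Writes the summary back, for example by calling update_node_properties through MCP. */
    interface DescriptionWriter {
        void write(String nodeId, String description, String ticket);
    }

    private final BiFunction<String, String, String> extractor; // (nodeId, ticket) -> text
    private final UnaryOperator<String> summarizer;             // text -> summary
    private final DescriptionWriter writer;

    SummarizePipeline(BiFunction<String, String, String> extractor,
                      UnaryOperator<String> summarizer,
                      DescriptionWriter writer) {
        this.extractor = extractor;
        this.summarizer = summarizer;
        this.writer = writer;
    }

    /** Runs the three steps in order; the ticket is passed along and never stored. */
    String run(String nodeId, String ticket) {
        String text = extractor.apply(nodeId, ticket);
        if (text == null || text.isBlank()) {
            // Assumption: stop here instead of asking the LLM to summarize nothing.
            throw new IllegalStateException("No text could be extracted from node " + nodeId);
        }
        String summary = summarizer.apply(text).strip();
        writer.write(nodeId, summary, ticket);
        return summary;
    }
}
```

Keeping the steps behind interfaces also makes it straightforward to swap the summarizer for a different model without touching extraction or write-back.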

Running Phase 2 standalone

cp phase-2-agent/.env.example phase-2-agent/.env
# edit .env: set LLM_MODEL, MCP_SERVER_URL, MODEL_RUNNER_URL
docker compose -f phase-2-agent/compose.yaml --env-file phase-2-agent/.env up -d

# verify connectivity
curl http://localhost:8081/health

# summarize a document
curl -s -X POST http://localhost:8081/summarize/<nodeId> \
  -H "X-Alfresco-Ticket: TICKET_…" | jq .

Phase 3. The ACA Extension

Phase 3 wraps everything in a user interface. The extension is an Nx Angular library (@sample/summarize-extension) that integrates into the Alfresco Content Application (ACA) and adds a Summarize Description item to the document context menu.

Extension manifest

The extension declares its context-menu entry and the action it dispatches in a JSON manifest:

{
  "actions": [
    {
      "id": "summarize.node.action",
      "type": "SUMMARIZE_NODE",
      "payload": "$(context.selection.first.entry)"
    }
  ],
  "features": {
    "contextMenu": [
      {
        "id": "summarize.context.menu",
        "order": 900,
        "icon": "auto_awesome",
        "title": "SUMMARIZE.ACTION_TITLE",
        "actions": { "click": "summarize.node.action" },
        "rules": {
          "visible": ["app.selection.file", "!app.navigation.isTrashcan"]
        }
      }
    ]
  }
}

The $(context.selection.first.entry) token is ACA's built-in way to pass the currently selected node (a full Node object) to the action payload.

NgRx action and effect

The extension follows ACA's NgRx architecture. An action carries the selected node; an effect handles the async HTTP call:

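// summarize.actions.ts
import { Action } from '@ngrx/store';
import { Node } from '@alfresco/js-api';

export const SUMMARIZE_NODE = 'SUMMARIZE_NODE';

export class SummarizeNodeAction implements Action {
  readonly type = SUMMARIZE_NODE;
  constructor(public payload: Node) {}
}

// summarize.effects.ts
import { inject, Injectable } from '@angular/core';
import { HttpErrorResponse } from '@angular/common/http';
import { Actions, createEffect, ofType } from '@ngrx/effects';
import { NotificationService } from '@alfresco/adf-core';
import { EMPTY } from 'rxjs';
import { catchError, switchMap, tap } from 'rxjs/operators';
import { SUMMARIZE_NODE, SummarizeNodeAction } from './summarize.actions';
import { SummarizeService } from './summarize.service';

@Injectable()
export class SummarizeEffects {
  // Dependency wiring is assumed here; the sample may use constructor injection instead.
  private readonly actions$ = inject(Actions);
  private readonly notification = inject(NotificationService);
  private readonly summarizeService = inject(SummarizeService);
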
  summarize$ = createEffect(
    () =>
      this.actions$.pipe(
        ofType<SummarizeNodeAction>(SUMMARIZE_NODE),
        switchMap((action) => {
          this.notification.showInfo('SUMMARIZE.PROCESSING');
          return this.summarizeService.summarize(action.payload.id!).pipe(
            tap(() => this.notification.showInfo('SUMMARIZE.SUCCESS')),
            catchError((error: HttpErrorResponse) => {
              this.notification.showError(error.error?.error || 'SUMMARIZE.ERROR');
              return EMPTY;
            })
          );
        })
      ),
    { dispatch: false }
  );
}

Ticket propagation from ACA

The extension reads the Alfresco ticket from the browser's local storage (where ADF stores it after login) and attaches it as a request header:

// summarize.service.ts
summarize(nodeId: string): Observable<{ nodeId: string; summary: string }> {
  const agentUrl = this.appConfig.get<string>('summarizeAgentUrl', '/api/agent');
  const ticket   = localStorage.getItem('ticket-ECM');
  const headers  = ticket ? new HttpHeaders({ 'X-Alfresco-Ticket': ticket }) : undefined;

  return this.http.post<{ nodeId: string; summary: string }>(
    `${agentUrl}/summarize/${nodeId}`,
    {},
    { headers }
  ).pipe(timeout(300_000));  // LLM calls can take a while
}

The 5-minute timeout is intentional: generating a summary goes through rendition creation, PDF extraction, and LLM inference. The snackbar notification keeps the user informed while this happens.

Running Phase 3

docker compose -f phase-3-aca-extension/compose.yaml up -d
# open http://localhost:4200

The Dockerfile clones ACA, applies the extension, and starts the dev server with proxies for both Alfresco and the agent. See phase-3-aca-extension/README.md for full details.

Running the Full Stack

From the repository root a single command starts all three phases together:

cp .env.example .env
# edit .env with your values
docker compose up -d

The root compose.yaml simply includes all three phase files. Services that talk to each other (agent → MCP server) use Docker service names. Only Alfresco and Docker Model Runner, which are external, use host.docker.internal.

Complete Data Flow

Here is what happens when a user right-clicks a PDF in ACA and selects Summarize Description:

User selects Summarize Description from the context menu
ACA dispatches SUMMARIZE_NODE with the Node object as payload
SummarizeEffects intercepts the action; snackbar shows "Generating summary, please wait…"
SummarizeService sends POST /api/agent/summarize/{nodeId} with X-Alfresco-Ticket header (proxied to the agent on port 8081)
AgentController receives the request
McpService calls get_text_content via MCP → MCP server requests the PDF rendition from Alfresco, polls until ready, returns base64 PDF
Apache PDFBox extracts plain text from the PDF bytes
LlmService sends the text to Docker Model Runner; LLM returns a 2-3 sentence summary
McpService calls update_node_properties via MCP → MCP server sets cm:description on the Alfresco node
Agent returns { nodeId, summary } to the extension; snackbar shows "Document description updated successfully."
Refreshing the document info drawer shows the new description

Key Design Decisions

Tickets at runtime, never at rest. Alfresco tickets expire (default: 1 hour). Storing them in .env files or environment variables is both fragile and insecure. Instead, every request carries the ticket as an HTTP header and the agent forwards it per MCP call. The MCP server is completely stateless with respect to credentials.

PDFBox over raw text. The get_text_content tool returns a PDF rendition, not raw bytes. PDFBox handles encoding, embedded fonts, and multi-column layouts far better than trying to parse raw content. The fallback to download_document covers plain-text and CSV files that don't need a rendition.

Local LLM via Docker Model Runner. No data leaves the machine, no API keys are required, and the latency profile is predictable. Switching to a cloud provider only requires changing the spring.ai.openai.* properties (base URL, API key, and model) in application.properties.

NgRx Actions + Effects, not direct HTTP. The ACA extension uses the same NgRx pattern ACA itself uses internally. This makes the extension composable: other effects can listen to the same actions, and error handling is centralised.

Adding a New AI Workflow

The architecture is designed to be extended. To add a new workflow — say, automatic tagging based on document content:

Agent side: add a new @PostMapping to AgentController.java following the /summarize/{nodeId} pattern. Inject McpService to call any of the available MCP tools.
MCP side: no changes needed — all tools are already available. If you need a tool that doesn't exist yet, see the contribution section below.
Extension side: add an NgRx action class, a new createEffect in the effects class, and a new entry in summarize-extension.json.
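For the tagging example, the agent-side piece that differs most from summarization is turning the LLM's free-text reply into clean tags. The sketch below shows one way to do that; `TagParser` is hypothetical, and it assumes the prompt asks the model for a comma-separated list. Persisting the tags is left out, since the tool list above does not include a tagging tool.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

/** Hypothetical helper that normalizes an LLM reply into a short list of tags. */
final class TagParser {
    private TagParser() {}

    /** Splits on commas and newlines, lowercases, removes blanks and duplicates, keeps at most maxTags. */
    static List<String> parse(String llmReply, int maxTags) {
        if (llmReply == null) {
            return List.of();
        }
        Set<String> tags = new LinkedHashSet<>(); // preserves the order the model produced
        for (String candidate : llmReply.split("[,\\n]")) {
            String tag = candidate.strip().toLowerCase(Locale.ROOT);
            if (!tag.isEmpty() && tags.size() < maxTags) {
                tags.add(tag);
            }
        }
        return new ArrayList<>(tags);
    }
}
```

Normalizing on the agent side keeps the repository free of near-duplicate tags such as "Finance" and "finance", whatever the model decides to return.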

Contributing to the Alfresco MCP Server

The alfresco-mcp-server is open source and actively maintained. There are many directions the community could take it:

New tools: version history, rendition management, permissions/ACLs, workflow and process integration, tagging and categories, audit log querying, site management
New transport modes: WebSocket support, gRPC
Authentication improvements: OAuth2 / SSO support alongside ticket-based auth
Multi-tenant support: routing tool calls to different Alfresco instances based on configuration
Alfresco Process Services integration: tools for starting and querying workflow instances

If you have an idea for a new tool, open an issue at github.com/AlfrescoLabs/alfresco-mcp-server/issues and describe the Alfresco API it would wrap and the use case it enables. If you want to contribute code, the server is a single Python file with a clear pattern for adding new tools.

Similarly, the sample project welcomes pull requests demonstrating new workflows (classification, comparison, extraction, question-answering) built on top of the MCP server.

Wrapping Up

The combination of Alfresco MCP Server + Spring AI + a local LLM gives you a fully local, enterprise-ready pipeline for adding AI to document management workflows, without sending documents to a cloud API, without writing Alfresco REST client boilerplate, and without tight coupling between your AI logic and the Alfresco version you are running.

The three-phase structure of the sample project is intentional: you can stop at Phase 1 and use the MCP server directly from a desktop AI assistant, stop at Phase 2 and integrate the agent into any existing application, or go all the way to Phase 3 and ship a polished UI feature in ACA.

The full source is at github.com/aborroy/alfresco-mcp-sample. Clone it, run docker compose up -d, and start building.

Questions? Open an issue on the sample project or the MCP server.