ttoine
Community Manager

Large Language Models are changing how we build applications that work with content. However, using them in the development cycle is not trivial: privacy, performance, and infrastructure all factor into the equation.

In this session, we focus on practical ways to run LLMs on-premises, from development to production.

We start with simple setups for local development. Then we move to testing approaches and finally to production-ready architectures. You will learn how to choose models that match your hardware, how to improve inference performance, and how to connect local LLMs with content management workflows.
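One of the first questions when matching a model to your hardware is whether it will fit in memory at all. As a rough illustration (not taken from the session itself), memory needs can be estimated from the parameter count and the quantization level:

```python
# Back-of-the-envelope memory estimate for loading a local LLM:
# parameter count times bytes per parameter, plus ~20% overhead
# for the KV cache and activations. A rule of thumb, not a guarantee.
def estimated_vram_gb(params_billion: float, bits_per_param: int,
                      overhead: float = 0.2) -> float:
    """Estimate GB of memory needed to run a model locally."""
    bytes_total = params_billion * 1e9 * (bits_per_param / 8)
    return bytes_total * (1 + overhead) / 1e9

# An 8B model quantized to 4 bits fits comfortably on a laptop GPU or
# in system RAM, while the same model in fp16 needs far more:
print(round(estimated_vram_gb(8, 4), 1))   # ~4.8 GB
print(round(estimated_vram_gb(8, 16), 1))  # ~19.2 GB
```

This is why quantized 8B-class models are a popular starting point for local development: they run on ordinary hardware while leaving room for embeddings and retrieval alongside.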

As a concrete, hands-on example, Florian Priede will present a home-made RAG service running entirely on a Hyland laptop. The setup uses a locally hosted 8B model and locally generated embeddings from the alfresco-docs repository, enabling interaction with the full documentation corpus. All steps, from model execution to embedding and retrieval, are performed locally, with performance comparable to ChatGPT.
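To give a feel for what the retrieval step in such a pipeline does, here is a toy sketch in pure Python. It is illustrative only: the session's actual service uses real embeddings generated locally from the alfresco-docs repository, while this example ranks hand-made vectors by cosine similarity.

```python
import math

def cosine(a, b):
    """Cosine similarity between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

# Pretend these are embedding vectors for documentation chunks.
# In a real setup, a local embedding model produces them from text.
index = {
    "install-guide": [0.9, 0.1, 0.0],
    "search-api":    [0.1, 0.9, 0.2],
    "upgrade-notes": [0.2, 0.2, 0.9],
}

def retrieve(query_vec, k=1):
    """Return the k chunk ids most similar to the query embedding."""
    ranked = sorted(index, key=lambda doc: cosine(query_vec, index[doc]),
                    reverse=True)
    return ranked[:k]

# A query embedding close to the install-guide vector retrieves it:
print(retrieve([0.85, 0.15, 0.05]))  # ['install-guide']
```

The retrieved chunks are then passed to the locally hosted model as context, which is what lets the assistant answer questions over the full documentation corpus.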

Building on this example, the session then explains how such a local setup can be scaled up, covering the key considerations when moving from a single-machine RAG service to more robust and scalable LLM architectures.

Running LLMs privately is becoming a key skill for building intelligent, content-centric applications. This session will help you build that skill and be ready to use Cloud Content Repository effectively as soon as it is available.

Session:
Wednesday 28th January 2026
10 AM EST | 3 PM GMT | 4 PM CET

Speaker:
Angel Borroy, Technical Evangelist at Hyland
Florian Priede, Technical Account Manager at Hyland

Register:
TTL #175 registration
