Why I’m Building This

After reading some articles about ServiceNow AI use cases, I became curious about the platform and wanted to better understand how AI solutions can be combined with modern SaaS offerings.

Having worked with Oracle SaaS solutions, in roles where we built extensions using low-code tools, chatbots, business processes, and custom applications on top of HCM/SCM/ERP, I was interested in exploring the additional value that AI can bring to the SaaS ecosystem.

My initial plan was to start with a ServiceNow Personal Developer Instance (PDI) and explore ServiceNow AI capabilities, but I cannot get an instance and I am on a waitlist instead 🙁

Lack of access to a ServiceNow instance means I need to rethink the use case, but that won’t be a problem. The alternative to using Now Assist native capabilities is an equally valid and strong use case, as many organizations are looking beyond a single platform and experimenting with AI solutions that combine ServiceNow, Jira, Confluence, and internal knowledge repositories.

The Goal

Build an AI Helpdesk Conversational Assistant* capable of answering employee questions from a knowledge base and eventually escalating unresolved issues into ServiceNow incidents.

* Conversational Assistant and Chatbot are terms being replaced by Agent and AI Agent, but the two should not be confused. An Agent can perform actions. A Conversational Assistant does not: it answers questions and can point users to a resolution, but that is the boundary.

Architecture and Considerations

With access to a ServiceNow instance, one could leverage Now Assist capabilities such as AI Search (vector database) to act as the semantic database required for a proper RAG implementation. We could also use some out-of-the-box connectors and even workflows to access data.

Given that none of that is possible, I will create a custom external pipeline with the following layers.

Knowledge data is typically spread across multiple systems, and we need the ability to connect to these systems. This is classic application integration, a space where I spent a lot of time in my career.

Ingestion Pipeline. This is all about data quality, nothing new here. Anyone who has worked in data integration knows that this is one of the most critical success factors. If data is not properly cleaned, consolidated, normalized, and enriched, the outcome will be poor regardless of how powerful the LLM is. Garbage in, garbage out still applies in the AI era.
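
A minimal sketch of what the cleaning and consolidation step could look like for plain-text articles. The function names `normalize_text` and `deduplicate` are hypothetical, and the rules (strip leftover HTML, unify Unicode, collapse whitespace, drop exact duplicates) are only assumptions about what a helpdesk KB export might need, not a complete data quality process.

```python
import html
import re
import unicodedata


def normalize_text(raw: str) -> str:
    """Return a cleaned version of a raw KB article body (hypothetical helper)."""
    text = html.unescape(raw)                    # decode entities such as &amp; and &nbsp;
    text = re.sub(r"<[^>]+>", " ", text)         # drop leftover HTML tags
    text = unicodedata.normalize("NFKC", text)   # unify Unicode variants (e.g. non-breaking spaces)
    text = re.sub(r"[ \t]+", " ", text)          # collapse runs of spaces and tabs
    text = re.sub(r" ?\n ?", "\n", text)         # trim spaces around line breaks
    text = re.sub(r"\n{3,}", "\n\n", text)       # keep at most one blank line
    return text.strip()


def deduplicate(articles: list[str]) -> list[str]:
    """Drop empty articles and articles whose normalized text was already seen."""
    seen: set[str] = set()
    unique: list[str] = []
    for article in articles:
        cleaned = normalize_text(article)
        key = cleaned.casefold()                 # case-insensitive comparison
        if key and key not in seen:
            seen.add(key)
            unique.append(cleaned)
    return unique
```

Exact-match deduplication only catches verbatim copies; near-duplicates coming from different systems would need a fuzzier comparison.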

Vector DB. I’ve recently spent some time exploring different strategies for chunking, indexing, and retrieval. This is where many RAG implementations can go sideways, and there is an element of iteration here: this is not a one-size-fits-all approach. The way information is broken down and retrieved has a direct impact on answer quality.

LLM layer. This may surprise some people, but this is the easiest part of this architecture. If we have good data consolidation, good data quality, and proper strategies to chunk and retrieve, then the LLM just needs to prettify the answer, since it already has all the required data.
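
To illustrate why the LLM step is comparatively simple, here is a sketch of how retrieved chunks could be assembled into a grounded prompt. The function `build_prompt`, the dictionary keys (`title`, `article_id`, `content`), and the prompt wording are all assumptions made for illustration.

```python
def build_prompt(question: str, chunks: list[dict[str, str]]) -> str:
    """Assemble a prompt that restricts the model to the retrieved context."""
    if not chunks:
        context = "(no relevant articles found)"
    else:
        # Number each chunk so the model can cite its sources.
        context = "\n\n".join(
            f"[{i}] {chunk['title']} ({chunk['article_id']})\n{chunk['content']}"
            for i, chunk in enumerate(chunks, start=1)
        )
    return (
        "You are an IT helpdesk assistant. Answer using ONLY the context below.\n"
        "If the context does not contain the answer, say you do not know.\n"
        "Cite the sources you used by their [number].\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\n"
        "Answer:"
    )
```

The resulting string can be sent to any chat or completion endpoint; the model only has to rephrase what retrieval already found.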

UX. Finally, we need to figure out how to expose this interface. Inside ServiceNow? On a separate system? I will build a simple UI to demo the solution.

As with any good PoC, I’m ignoring RBAC, authentication, latency, backups, disaster recovery, monitoring & observability, scalability, and all the other juicy areas that one would need to worry about just before someone says “let’s put this in production”.

The Stack

There is really no right or wrong at this point, so I’ll pick tools that are open source and easily available.

Language: Python
Vector DB: SQLite
LLM: Ollama with llama3.2 (I have this locally)
Embeddings: nomic-embed-text, as I already have Ollama locally
UI: simple chat page built with Streamlit (Python)

Building the Knowledge Base

For this exercise there is no need to go all in with application integration, especially given that I do not have access to any systems 🙂 So let’s just create some KB articles as text files to keep us going.

Example article:

# VPN Troubleshooting

Symptoms

- VPN connection fails
- Unable to access internal systems

Resolution

1. Verify internet connection
2. Restart VPN client
3. Reauthenticate
4. Retry connection

Escalation

If the issue persists after three attempts, close the laptop and take the rest of the day off.

I will ask ChatGPT to generate a few more similar articles covering:

password reset
MFA
Outlook
software requests
laptop does not turn on

This should not be that far from what a real IT support knowledge base looks like.

The next step is to enrich the knowledge base with metadata. This is not essential for this initial PoC, but metadata becomes critical later because it enables filtering, ranking, access control, and context-aware retrieval. For example, we may want to limit searches to a specific product, prioritize recently updated articles, or return content tailored to a particular audience.

---
id: kb-001
title: Reset VPN Access
product: VPN
audience: support
updated_at: 2026-06-01
---

# VPN Troubleshooting

Symptoms

- VPN connection fails
- Unable to access internal systems

....
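
A small sketch of how the metadata header shown above could be read and then used for filtering and ranking. It assumes flat `key: value` lines between two `---` markers, so it is not a full YAML parser, and the names `parse_article` and `select_articles` are hypothetical.

```python
from datetime import date
from typing import Optional


def parse_article(text: str) -> tuple[dict[str, str], str]:
    """Split an article into (metadata, body)."""
    lines = text.strip().splitlines()
    if not lines or lines[0].strip() != "---":
        return {}, text.strip()                  # no metadata header at all
    # Find the closing '---' marker after the opening one.
    end = next((i for i in range(1, len(lines)) if lines[i].strip() == "---"), None)
    if end is None:
        return {}, text.strip()                  # unterminated header: treat everything as body
    metadata: dict[str, str] = {}
    for line in lines[1:end]:
        key, sep, value = line.partition(":")    # split on the first colon only
        if sep:
            metadata[key.strip()] = value.strip()
    body = "\n".join(lines[end + 1:]).strip()
    return metadata, body


def select_articles(
    entries: list[dict[str, str]], product: Optional[str] = None
) -> list[dict[str, str]]:
    """Keep entries for one product (or all), most recently updated first."""
    hits = [e for e in entries if product is None or e.get("product") == product]
    return sorted(
        hits,
        # Entries without a date sort last (assumed placeholder date).
        key=lambda e: date.fromisoformat(e.get("updated_at", "1970-01-01")),
        reverse=True,
    )
```

In a real pipeline the same fields would be stored next to each chunk so that filters can be applied at query time.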

This simplistic approach, a consequence of not having access to a ServiceNow PDI, loses the access control capabilities that one would get with KB articles coming from ServiceNow. Not that it would matter, as there is also no authentication at this moment 🙂

Ingestion Pipeline

Let’s assume the data is clean, structured, and has metadata, so the remaining steps are to create chunks (splitting the data), generate embeddings, and ingest them into the database.

structured KB article
-> split into chunks
-> generate embeddings
-> store chunks + metadata in the vector database

chunks = [
    chunk
    for article in articles
    for chunk in chunk_article(
        article,
        max_words=260,
        overlap_words=40,
    )
]
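
The body of `chunk_article` is not shown here, so the following is only a sketch of the sliding-window idea behind the `max_words` and `overlap_words` parameters. The function `chunk_words` is a hypothetical stand-in that works on plain text and ignores headings and metadata.

```python
def chunk_words(text: str, max_words: int = 260, overlap_words: int = 40) -> list[str]:
    """Split text into windows of at most max_words words.

    Consecutive windows share overlap_words words, so a sentence cut at a
    boundary still appears whole in at least one neighbouring chunk.
    """
    if max_words <= 0:
        raise ValueError("max_words must be positive")
    if not 0 <= overlap_words < max_words:
        raise ValueError("overlap_words must be in [0, max_words)")
    words = text.split()                      # note: this flattens line breaks
    step = max_words - overlap_words          # how far each window advances
    chunks: list[str] = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + max_words]))
        if start + max_words >= len(words):
            break                             # this window already reached the end
    return chunks
```

Splitting on a fixed word count is the simplest option; splitting along article sections (Symptoms, Resolution, Escalation) is an alternative worth comparing when tuning retrieval.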

The chunk content is what gets embedded. The metadata is what makes the final answer traceable back to the original KB article and section.

from dataclasses import dataclass


@dataclass
class DocumentChunk:
    chunk_id: str
    article_id: str
    title: str
    heading_path: str
    content: str
    metadata: dict[str, str]

Generate embeddings

embedding_model = build_embedding_model("ollama:nomic-embed-text")

embeddings = embedding_model.embed_many(
    [chunk.search_text for chunk in chunks]
)
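
The helper `build_embedding_model` is not shown, so here is a sketch of how embeddings could be requested from a local Ollama server using only the standard library. It assumes Ollama is listening on its default port and is recent enough to offer the batch `/api/embed` endpoint; the names `post_json` and `embed_many`, and the injectable `post` parameter, are hypothetical.

```python
import json
import urllib.request
from typing import Callable

OLLAMA_URL = "http://localhost:11434/api/embed"  # default local Ollama address (assumed)


def post_json(url: str, payload: dict) -> dict:
    """Send a JSON POST request and decode the JSON response."""
    request = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=60) as response:
        return json.loads(response.read().decode("utf-8"))


def embed_many(
    texts: list[str],
    model: str = "nomic-embed-text",
    post: Callable[[str, dict], dict] = post_json,
) -> list[list[float]]:
    """Return one embedding vector per input text, in the same order."""
    if not texts:
        return []                                 # nothing to embed, skip the request
    reply = post(OLLAMA_URL, {"model": model, "input": texts})
    embeddings = reply["embeddings"]
    if len(embeddings) != len(texts):
        raise ValueError("Ollama returned a different number of embeddings")
    return embeddings
```

Passing the transport in as `post` keeps the function easy to test without a running Ollama server.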

Store the chunks in the vector DB

store = SQLiteVectorStore("data/kb.sqlite")
store.initialize()

store.upsert_chunks(
    chunks,
    embeddings,
    embedding_model=embedding_model.name,
)
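
SQLite has no built-in vector type, so `SQLiteVectorStore` has to bring its own storage and search logic. The class below is not that implementation, only a sketch of one simple way to do it: embeddings are stored as JSON text and searched with a brute-force cosine similarity scan. The class name `TinyVectorStore`, the table layout, and the method names are assumptions.

```python
import json
import math
import sqlite3


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity of two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


class TinyVectorStore:
    def __init__(self, path: str = ":memory:") -> None:
        self.conn = sqlite3.connect(path)
        self.conn.execute(
            "CREATE TABLE IF NOT EXISTS chunks ("
            "chunk_id TEXT PRIMARY KEY, content TEXT NOT NULL, "
            "metadata TEXT NOT NULL, embedding TEXT NOT NULL)"
        )

    def upsert(self, chunk_id: str, content: str,
               metadata: dict[str, str], embedding: list[float]) -> None:
        """Insert a chunk, or overwrite the row with the same chunk_id."""
        with self.conn:  # commits on success, rolls back on error
            self.conn.execute(
                "INSERT OR REPLACE INTO chunks VALUES (?, ?, ?, ?)",
                (chunk_id, content, json.dumps(metadata), json.dumps(embedding)),
            )

    def search(self, query_embedding: list[float], top_k: int = 3):
        """Return up to top_k (score, chunk_id, content) tuples, best first."""
        rows = self.conn.execute(
            "SELECT chunk_id, content, embedding FROM chunks"
        ).fetchall()
        scored = [
            (cosine(query_embedding, json.loads(emb)), chunk_id, content)
            for chunk_id, content, emb in rows
        ]
        scored.sort(key=lambda item: item[0], reverse=True)
        return scored[:top_k]
```

A full table scan is fine for a handful of KB articles; it would need an index or a dedicated vector engine once the knowledge base grows.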

UX

Finally, for all of this to make some sense, let’s quickly put together a Streamlit app.

For this step I used Codex (this is our engineering mandate at the moment) to quickly create a working prototype with Streamlit.

What’s Next?

If you read this far, you might be thinking: “Where the hell is ServiceNow in all of this?”

Fair question.

For this PoC, I’m focusing on the knowledge and retrieval layers first. The goal is to prove that we can pull information from multiple sources and generate useful answers.

Once I get access to a ServiceNow PDI, I’ll extend the solution to create tickets automatically when no relevant knowledge article is found. I may also use the ServiceNow Knowledge Base as one of the data sources, but I still need to explore the best way to integrate it.
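
As a rough preview of that escalation step, the sketch below builds (but does not send) a request to the ServiceNow Table API to create an incident. The instance name, the credentials, the basic authentication scheme, the choice of fields, and the 0.5 similarity threshold are all placeholders and assumptions; the function names are hypothetical.

```python
import base64
import json
import urllib.request


def should_escalate(best_score: float, threshold: float = 0.5) -> bool:
    """Escalate when the best retrieval score is below an (arbitrary) threshold."""
    return best_score < threshold


def build_incident_request(
    instance: str, user: str, password: str, question: str, transcript: str
) -> urllib.request.Request:
    """Prepare a POST to the incident table of a ServiceNow instance."""
    url = f"https://{instance}.service-now.com/api/now/table/incident"
    payload = {
        "short_description": question[:160],   # keep the summary short
        "description": transcript,             # full conversation for the agent
    }
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Accept": "application/json",
            "Authorization": f"Basic {token}",
        },
        method="POST",
    )
```

Sending the request would be a matter of passing it to `urllib.request.urlopen`, which I cannot try until I have an instance to test against.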

Tech Trantor

AI, LLMs, Agents, Cloud & Integration

Service Now KB Chatbot – Part 1
