Insights with MiniLM: Hands-On Text Embeddings for MLOps

From MOT Notes to Insights with MiniLM: A Practical Guide to Text Embeddings

 

Intro: I’m not a data scientist or statistician – I’m a DevOps engineer who got interested in ML through building CarHunch.
 
This post shares what I’ve learned about embeddings through that journey, hopefully presented in a way that other DevOps engineers and people interested in AI/ML can understand and experiment with.
 
The Jupyter notebook is a simplified version of techniques I use in CarHunch at a much larger scale, made quick and easy to run so you can see the concepts in action.
 

 

Every year, millions of vehicles undergo MOT testing in the UK, generating a massive amount of free-text defect notes that could revolutionize how we understand vehicle maintenance patterns.

 

But there’s a catch – these notes are messy, inconsistent, and nearly impossible to analyze at scale using traditional methods.

 

Consider these real examples from MOT records:

 

“Nearside rear brake pipe corroded”
“Brake hose deteriorated”
“Brakes imbalanced across an axle”
“Headlamp aim too high”
“Exhaust leaking gases”

 

While these notes are invaluable for mechanics, they create a nightmare for data analysis. Every tester phrases things slightly differently, and traditional keyword searches miss the bigger picture. How do you find all brake-related issues when they’re described in dozens of different ways?

 

The answer lies in embeddings – a powerful technique that transforms unstructured text into structured, analyzable data.

 

Embeddings convert text into numeric vectors, placing similar meanings close together in high-dimensional space. With embeddings, “brake hose deteriorated” and “brake pipe corroded” become neighbors – even though the wording differs significantly. This opens up entirely new possibilities for analyzing text data at scale.

 

This post demonstrates a practical, hands-on approach using MiniLM to:

 

  • Transform messy MOT defect notes into structured embeddings
  • Cluster similar defects automatically using machine learning
  • Run semantic search to find related issues by meaning, not just keywords
  • Visualize the results to understand patterns in vehicle defects

 


Try the Interactive Demo

 

The demonstration is a Jupyter notebook that you can open directly in Google Colab – no setup required on your local machine.

 

Important Note about Google Colab: When you click the link below, you’ll be prompted to sign in to Google. This is completely normal and free – Google Colab requires a Google account to save your work and provide computational resources. Your data remains private, and you can always download your work or run it locally if you prefer.

 

Open the demo in Google Colab

 

Or, if you’d rather, you can view the repository and run it locally:

github.com/DonaldSimpson/mot_embeddings_demo

 


How It Works: From Text to Insights

 
The demo consists of three key steps, each building on the previous one:
 

Step 1: Text to Numbers

 

The MiniLM model (specifically “all-MiniLM-L6-v2”) converts each defect note into a 384-dimensional vector. Think of this as creating a unique “fingerprint” for each piece of text that captures its semantic meaning. Notes about similar issues will have similar fingerprints.
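As a rough illustration of this step, here is a minimal sketch using the sentence-transformers library. The function name `embed_notes` is my own for this example and is not taken from the notebook; the import sits inside the function so that defining it does not require the package or a model download.

```python
DEFAULT_MODEL = "all-MiniLM-L6-v2"


def embed_notes(notes, model_name=DEFAULT_MODEL):
    """Return one embedding vector per note (hypothetical helper name)."""
    # Imported lazily: the model is only downloaded and loaded when called.
    from sentence_transformers import SentenceTransformer

    model = SentenceTransformer(model_name)
    # normalize_embeddings=True scales each vector to unit length, which
    # makes later cosine-similarity comparisons a simple dot product.
    return model.encode(list(notes), normalize_embeddings=True)
```

Calling `embed_notes(["Brake hose deteriorated", "Headlamp aim too high"])` should give an array with one row per note and 384 columns for this model. The first call needs network access to fetch the model weights.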

 

Step 2: Finding Patterns

 

K-means clustering automatically groups these fingerprints together. The algorithm discovers that “brake pipe corroded” and “brake hose deteriorated” belong in the same cluster, while “headlamp aim too high” forms its own group. You’ll see this visualized in a 2D scatter plot using PCA (Principal Component Analysis).
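The clustering and projection can be sketched roughly as below, assuming scikit-learn is installed. The helper names (`cluster_notes`, `project_2d`, `group_by_cluster`) are illustrative and not from the notebook, and `n_init=10` is my own choice to keep results stable across scikit-learn versions.

```python
from collections import defaultdict


def cluster_notes(embeddings, n_clusters=3):
    """Assign each embedding to one of n_clusters groups with K-means."""
    from sklearn.cluster import KMeans  # lazy import, only needed when called

    kmeans = KMeans(n_clusters=n_clusters, random_state=42, n_init=10)
    return kmeans.fit_predict(embeddings).tolist()


def project_2d(embeddings):
    """Reduce the embeddings to two dimensions for a scatter plot."""
    from sklearn.decomposition import PCA  # lazy import

    return PCA(n_components=2).fit_transform(embeddings)


def group_by_cluster(notes, labels):
    """Collect the notes that landed in each cluster, keyed by label."""
    groups = defaultdict(list)
    for note, label in zip(notes, labels):
        groups[int(label)].append(note)
    return dict(groups)
```

Printing the output of `group_by_cluster` is a quick way to sanity-check the groupings before looking at the plot.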
 

Step 3: Intelligent Search

 

Semantic search uses cosine similarity to find the most relevant notes for any query. When you search for “brake failure,” it doesn’t just look for those exact words – it finds notes that are semantically similar, even if they use completely different terminology.
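To show the mechanics without needing a model, here is a small pure-Python sketch of cosine-similarity ranking. The three-dimensional vectors are made-up stand-ins for real 384-dimensional embeddings, so only the ordering is meaningful; in the notebook the vectors would come from the model instead.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors (1.0 = same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # treat a zero vector as unrelated to everything
    return dot / (norm_a * norm_b)


def semantic_search(query_vec, note_vecs, notes, top_k=3):
    """Return the top_k (score, note) pairs, most similar first."""
    scored = [(cosine_similarity(query_vec, v), n) for v, n in zip(note_vecs, notes)]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:top_k]


# Toy data: invented vectors standing in for real embeddings.
notes = [
    "Brake hose deteriorated",
    "Nearside rear brake pipe corroded",
    "Headlamp aim too high",
]
toy_vectors = [[0.9, 0.1, 0.0], [0.8, 0.2, 0.1], [0.0, 0.1, 0.9]]
toy_query = [1.0, 0.0, 0.0]  # stand-in for the embedding of "brake failure"

results = semantic_search(toy_query, toy_vectors, notes, top_k=2)
for score, note in results:
    print(f"{score:.3f}  {note}")
```

With these toy vectors, the two brake notes rank above the headlamp note, which is the behaviour the real embeddings are meant to reproduce.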
 

The notebook demonstrates this with a carefully curated set of real MOT defect notes, including:

 

  • Brake-related issues (pipes, hoses, imbalance)
  • Lighting problems (headlamp aim, functionality)
  • Steering and suspension defects
  • Exhaust system issues
  • Tyre wear problems

 

Each example is designed to show how embeddings capture meaning beyond literal word matching.

 


Hands-On Experimentation: Make It Your Own

 

This isn’t just a static demonstration – it’s a tool for discovery. The notebook is designed for active exploration, and the best way to understand embeddings is to experiment with them yourself.

 

Here’s a roadmap for turning this demo into a more personal learning experience:

 

    1. Start with your own data  
      The most rewarding experiment is using your own MOT notes. Have you had an MOT recently? Try adding those defect notes to see how they cluster with the sample data. You might be surprised by the patterns that emerge.

      notes = [
          "Engine oil leak",
          "Headlight not working", 
          "Nearside front tyre bald",
          "Steering pulling to the left",
          "Brake discs worn and pitted",
          # Add your own notes here...
          "Your defect notes here",
          "More of your defect notes"
      ]

      Suggestion: Try adding notes from different vehicle types (cars, vans, motorcycles) to see if the clustering adapts to different contexts.

 

    2. Play with clustering granularity 
      This is where things can get really interesting. Change the number of clusters and watch how the groupings shift:

      # Try different values: 2, 3, 4, 5, 6...
      kmeans = KMeans(n_clusters=3, random_state=42)

      This uses scikit-learn’s KMeans implementation.

      Start with 3 clusters and gradually increase. You’ll see how the algorithm balances between creating too many small groups versus too few large ones. The visualization will show you exactly how your notes are being grouped – some results might surprise you!


    3. Make your own queries 
      The semantic search feature is incredibly powerful. Try queries that test the model’s understanding:

      # Test the model's semantic understanding
      query = "tyre wear"           # Should find tyre-related issues
      query = "steering problem"    # Should find steering defects  
      query = "engine issue"        # Should find engine problems
      query = "safety concern"      # Should find safety-related defects

      Try abstract concepts like “safety concern” or “performance issue” to see how well the model understands context beyond literal word matching.

 

  4. Experiment with different models (for the curious) 
    If you want to see how different embedding models perform, try swapping out MiniLM:

    # Larger model tuned for question-answer style semantic search
    # (trained for dot-product scoring rather than cosine similarity)
    model = SentenceTransformer("multi-qa-mpnet-base-dot-v1")

    # Or try a larger general-purpose model
    model = SentenceTransformer("all-mpnet-base-v2")

    These are SentenceTransformer models from the Hugging Face model hub.

    Compare the results – do the clusters change? Are the search results more relevant? This is a great way to understand how model choice affects performance.


  5. Scale up and discover patterns 
    Once you’re comfortable with the basics, try working with larger datasets. The DVSA MOT dataset contains millions of records, and you’ll start to see fascinating patterns emerge:

    • Which vehicle makes have the most brake-related failures?
    • Do certain types of defects cluster by geographic region?
    • How do defect patterns change over time?

    This is where embeddings really shine – they can reveal insights that would be impossible to find with traditional keyword searches.

 

Each of these modifications provides immediate feedback – you can see the results directly in the notebook, making it an ideal learning environment.
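When you change the cluster count or swap the model, it helps to put a number on how much the groupings moved. One simple option (my suggestion, not something the notebook does) is the Rand index: the fraction of note pairs that two runs treat the same way, either together in both or apart in both. The sketch below is pure Python and ignores the arbitrary cluster numbering.

```python
from itertools import combinations


def rand_index(labels_a, labels_b):
    """Fraction of item pairs on which two clusterings agree (0.0 to 1.0)."""
    if len(labels_a) != len(labels_b):
        raise ValueError("both label lists must cover the same notes")
    pairs = list(combinations(range(len(labels_a)), 2))
    if not pairs:
        return 1.0  # zero or one note: nothing to disagree about
    agreements = sum(
        (labels_a[i] == labels_a[j]) == (labels_b[i] == labels_b[j])
        for i, j in pairs
    )
    return agreements / len(pairs)
```

A value of 1.0 means the two runs grouped the notes identically, even if the cluster numbers differ; lower values mean more notes changed neighbours. For example, you could compare the labels from `n_clusters=3` on MiniLM against the labels from the same setting on a larger model.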


 

Real-World Applications: CarHunch

 

For CarHunch, I’ve been applying this same approach to millions of MOT records. Embeddings make it possible to:

 

  • Standardize messy defect notes into consistent categories
  • Compare your car’s defects with similar vehicles
  • Surface patterns across the UK fleet (e.g., which makes and models fail most often on brakes)
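Once each defect note has a category (for example, a cluster label that has been given a readable name), fleet-level questions become simple counting. The sketch below is an assumption about how such a summary could be computed, not CarHunch's actual code; the makes and records are placeholders, not real MOT data.

```python
from collections import Counter


def category_share_by_make(records, category):
    """Share of each make's defects that fall into the given category.

    records is an iterable of (make, category) pairs.
    """
    totals = Counter()
    hits = Counter()
    for make, cat in records:
        totals[make] += 1
        if cat == category:
            hits[make] += 1
    return {make: hits[make] / totals[make] for make in totals}


# Placeholder records purely to show the shape of the input.
example_records = [
    ("MakeA", "brakes"),
    ("MakeA", "lighting"),
    ("MakeB", "brakes"),
]
shares = category_share_by_make(example_records, "brakes")
```

At real scale this kind of aggregation would more likely run as a query in the data warehouse, but the logic is the same.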

 

A Surprising ‘Discovery’: The Land Rover Defender Seatbelt issue


Sometimes, the most interesting insights come from patterns you’d never expect to find. Take my own Land Rover (original) Defender 110 as an example. When I analyzed its MOT history alongside thousands of similar vehicles, I discovered something surprising:

 

Seatbelt damage is the single most common issue for Defenders – not the engine, suspension or rust problems you’d probably expect from a rugged old off-road vehicle!

 

This revelation only became apparent through the kind of clustering and semantic analysis we’re exploring in this notebook. Traditional keyword searches would have missed this pattern entirely, because MOT testers describe seatbelt issues in dozens of different ways:

 

“Seatbelt webbing frayed”
“Driver’s seatbelt damaged”
“Seatbelt retraction mechanism faulty”
“Belt webbing showing signs of wear”

 

But embeddings revealed the underlying pattern: all these different descriptions clustered together as the same fundamental issue.

 

Even more fascinating, the analysis showed this is a design quirk specific to Defenders: the front seatbelts naturally fall right into the door jambs as there’s nowhere else for them to go (plus the tensioners are weak/slow), so when the doors are closed they get trapped, causing accelerated wear that doesn’t occur in most other vehicles.

 

The Bigger Picture: How This Could Transform Automotive Design

 

This Defender example hints at something much larger: embeddings could impact how car manufacturers identify design flaws and improve vehicle quality. Imagine if every manufacturer had access to this kind of analysis across their entire fleet:

  • Early Warning System: Spot recurring issues before they become widespread problems
  • Design Validation: Verify that design changes actually solve the problems they’re meant to address
  • Cost-Benefit Analysis: Quantify the real-world impact of design decisions on maintenance costs
  • Competitive Intelligence: Understand how your vehicles compare to competitors in terms of reliability

 

Traditional quality control relies on warranty claims and customer complaints – reactive data that comes too late. But MOT data is generated continuously, providing a real-time view of how vehicles perform in the wild. The challenge has always been extracting meaningful insights from the unstructured text that testers write.

 

This is exactly the kind of insight that would be impossible to discover without the semantic understanding that embeddings provide. You can explore this particular analysis yourself with CarHunch’s enhanced hunches feature, which uses the same techniques demonstrated in this notebook.

 

This example is just a small subset of what that larger platform does, showing how embeddings can transform unstructured text data into actionable insights that reveal patterns invisible to traditional analysis methods.

 


From Experimentation to Production

 

Once you’ve experimented with the notebook and understand how embeddings work, you might be wondering: “How do I turn this into a production system?” This is where the journey from data science experimentation to operational ML begins.

 

In my previous post, “MLOps for DevOps Engineers – MiniLM & MLflow demo”, I showed how to take these same embedding techniques and build them into a proper MLOps pipeline. That post covers:

 

  • Containerizing the embedding pipeline with Docker
  • Tracking experiments and model versions with MLflow
  • Automating the entire workflow with Makefiles
  • Building quality gates and reproducibility into the process

 

Think of it this way: this notebook is your playground for understanding embeddings, while the MLOps post shows you how to turn that playground into a production system. The same MiniLM model that powers this interactive demo is the foundation for the automated pipeline in the MLOps example.

 

For DevOps engineers, this represents a natural progression: start with hands-on experimentation to understand the concepts, then apply your existing automation and infrastructure skills to make it production-ready.

 


Key Takeaways

 

For DevOps and SRE engineers curious about machine learning, embeddings represent an excellent entry point:

 

  • No GPU required for basic experimentation
  • Easy to run locally or in cloud environments
  • Immediately useful for messy, real-world text data
  • Natural bridge to production MLOps workflows

 

Give the notebook a try, experiment with your own MOT notes, and discover what insights you can uncover. When you’re ready to take it further, the MLOps post will show you how to automate and scale these techniques.

 

Open the demo in Google Colab

 


 

Contains public sector information licensed under the Open Government Licence v3.0.

MLOps for DevOps Engineers – MiniLM & MLflow demo

MLOps for DevOps Engineers – MiniLM & MLflow pipeline demo

 

As a DevOps and SRE engineer, I’ve spent a lot of time building automated, reliable pipelines and cloud platforms. Over the last couple of years, I’ve been applying the same principles to machine learning (ML) and AI projects.

 

One of those projects is CarHunch, a vehicle insights platform I developed. CarHunch ingests and analyses MOT data at scale, using both traditional pipelines and applied AI. Building it taught me first-hand how DevOps practices map directly onto MLOps: versioning datasets and models, tracking experiments, and automating deployment workflows. It’s a new and exciting area, but the core idea is very much the same, with some interesting new tools and concepts added.

 

To make those ideas more approachable for other DevOps engineers, I have put together a minimal, reproducible demo using MiniLM and MLflow.

 

You can find the full source code here:

github.com/DonaldSimpson/mlops_minilm_demo

 

The quick way: make run

The simplest way to try this demo is with the included Makefile; that way, all you need is Docker installed.

# clone the repo
git clone https://github.com/DonaldSimpson/mlops_minilm_demo.git

cd mlops_minilm_demo

# build and run everything (training + MLflow UI)
make run

 

That one ‘make run’ command will:

  • Spin up a containerised environment
  • Run the demo training script (using MiniLM embeddings + Logistic Regression)
  • Start the MLflow tracking server and UI

 

Here’s a quick screengrab of it running in the console:

Once it’s up & running, open http://localhost:5001 in your browser to explore logged experiments.

 

What the demo shows

– MiniLM embeddings turn short MOT-style notes (e.g. “brakes worn”) into vectors

– A Logistic Regression classifier predicts pass/fail

– Parameters, metrics (accuracy), and the trained model are logged in MLflow

– You can inspect and compare runs in the MLflow UI – just like you’d review builds and artifacts in CI/CD

– Run detail; accuracy metrics and model artifact stored alongside parameters
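The flow above can be sketched roughly as follows, assuming mlflow and scikit-learn are installed. This is not the repo's training script: the function name, experiment name, split ratio and `max_iter` value are all my own choices for illustration, and the tracking URI simply reuses the port mentioned earlier.

```python
def train_and_log(embeddings, labels, tracking_uri="http://localhost:5001"):
    """Train a pass/fail classifier on embeddings and record the run in MLflow."""
    # Lazy imports so defining the function needs neither package.
    import mlflow
    import mlflow.sklearn
    from sklearn.linear_model import LogisticRegression
    from sklearn.metrics import accuracy_score
    from sklearn.model_selection import train_test_split

    X_train, X_test, y_train, y_test = train_test_split(
        embeddings, labels, test_size=0.25, random_state=42
    )

    mlflow.set_tracking_uri(tracking_uri)
    mlflow.set_experiment("minilm-demo-sketch")  # hypothetical experiment name

    with mlflow.start_run():
        params = {"classifier": "LogisticRegression", "max_iter": 1000}
        mlflow.log_params(params)

        clf = LogisticRegression(max_iter=params["max_iter"])
        clf.fit(X_train, y_train)

        accuracy = accuracy_score(y_test, clf.predict(X_test))
        mlflow.log_metric("accuracy", accuracy)

        # Store the fitted model as a run artifact alongside the metrics.
        mlflow.sklearn.log_model(clf, "model")

    return accuracy
```

Each call creates one run in the MLflow UI, so calling it with different parameters gives you runs to compare side by side.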

 

Here are screenshots of the relevant areas from the MLflow UI:











 

Why this matters for DevOps engineers

    • Familiar workflows: MLflow feels like Jenkins/GitHub Actions for models – every run is logged, reproducible, and auditable

 

    • Quality gates: just as builds pass/fail CI, models can be gated by accuracy thresholds before promotion

 

    • Reproducibility: datasets, parameters and artifacts are versioned and tied to each run

 

    • Scalability: the same demo pattern can scale to real workloads – this is a scaled down version of my local process
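A quality gate can be as small as a script that compares logged metrics against minimum values and exits non-zero on failure, which is how most CI systems decide pass/fail. The sketch below is a generic illustration with hypothetical function names and thresholds, not code from the demo repo.

```python
import sys


def check_quality_gate(metrics, thresholds):
    """Return a list of human-readable failures (empty list means pass)."""
    failures = []
    for name, minimum in thresholds.items():
        value = metrics.get(name)
        if value is None:
            failures.append(f"{name}: missing metric")
        elif value < minimum:
            failures.append(f"{name}: {value:.3f} < required {minimum:.3f}")
    return failures


def gate_or_exit(metrics, thresholds):
    """Exit with status 1 if any threshold is missed, so CI marks the job failed."""
    failures = check_quality_gate(metrics, thresholds)
    if failures:
        print("Quality gate failed:", "; ".join(failures))
        sys.exit(1)
    print("Quality gate passed")
```

In a pipeline you might call `gate_or_exit({"accuracy": accuracy}, {"accuracy": 0.8})` after training and only promote the model if the step succeeds; the 0.8 threshold is just an example value.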

 

 

Other ways to run it

 

If you prefer, the repo includes alternatives:

 

    • Python venv: create a virtualenv, install requirements.txt, run train_light.py

 

    • Docker Compose: build and run services with docker-compose up --build

 

    • Make targets: make train_light (quick run) or make train (full run)

 

These are useful if you want to dig a little deeper and see exactly what’s happening

 

Next steps

Once you’re comfortable with this small demo, natural extensions are:

 

    • Swap in a real dataset (e.g. DVSA MOT data)

 

    • Add data validation gates (e.g. Great Expectations)

 

    • Introduce bias/fairness checks with tools like Fairlearn

 

    • Run the pipeline in Kubernetes (KinD/Argo) for reproducibility

 

    • Hook it into GitHub Actions for end-to-end CI/CD

 

 

Closing thoughts

DevOps and MLOps share the same DNA: versioning, automation, observability, reproducibility. This demo repo is a small but practical bridge between the two.

 

Working on CarHunch gave me the chance to apply these ideas in a real platform. This demo distills those lessons into something any DevOps engineer can try locally.

 

Try it out at github.com/DonaldSimpson/mlops_minilm_demo and let me know how you get on.

 

CarHunch – Vehicle Insights Platform

CarHunch Logo

Turning billions of MOT and accident records into real-time vehicle insights.

Visit the live project here:

www.carhunch.com


What CarHunch Does

  • Aggregates billions of MOT test results and STATS19 UK accident records.
  • Provides real-time analytics on vehicle makes, models, years, and conditions.
  • Compares a specific car against similar vehicles (make/model/year).
  • Highlights common MOT failures and safety risks for different vehicles.

How It Works

CarHunch is powered by a ClickHouse data warehouse for ultra-fast queries, with:

  • Python ETL pipelines for MOT and accident data ingestion.
  • Incremental updates from DVSA bulk & delta files.
  • Redis caching for instant lookups.
  • Machine learning (MiniLM embeddings + clustering) to spot defect patterns.
  • LLM integration (LLaMA) to generate natural-language insights.

Example Insights

“Your 2010 Ford Focus has a 28% higher MOT failure rate than average for similar cars, mainly due to suspension wear.”

“BMW 3 Series (2008–2012) commonly fail MOTs due to brake issues around 80,000 miles.”

“Motorcycles show a different pattern of MOT failures compared to cars, with lighting and tyre defects being most common.”

Technical Overview

CarHunch isn’t just about insights — it’s also a demonstration of building a modern, high-performance OLAP data platform from the ground up.

  • Database: ClickHouse OLAP warehouse for real-time analytics on billions of records.
  • ETL: Python pipelines ingesting DVSA MOT bulk/delta files and STATS19 accident datasets.
  • Data Modeling: Normalised vehicle/test/defect schema with indexing and partitioning for query performance.
  • APIs: REST endpoints (Flask/FastAPI) serving real-time queries to front-end applications.
  • Caching: Redis for ultra-fast repeated lookups.
  • Machine Learning: MiniLM embeddings + HDBSCAN clustering for identifying defect patterns and grouping similar vehicles.
  • LLM Integration: Local LLaMA models for natural-language explanations and summaries.
  • Deployment: Dockerised services on a Proxmox node, easily portable to cloud infrastructure.
  • Monitoring: Logging & system metrics (rsyslog, lm-sensors) for reliability and performance tracking.

Why CarHunch?

CarHunch shows how big data + AI can turn raw government datasets into meaningful insights that benefit both consumers and the automotive industry.

👉 Explore more at

CarHunch.com

CarHunch Screenshot

 
Get in touch if you’d like to collaborate or learn more.

Monitoring Proxmox with Grafana and InfluxDB

I took these notes while setting up Grafana and InfluxDB on Proxmox.

I hit a few minor issues so thought I’d post it here as a mini “How To” or reference for others.

 

 

NOTE: If you are just looking for a simple and light-weight way to monitor Proxmox stats (including memory, CPU, disk for your LXCs and VMs), check out the brief section on “Pulse” at the end of this page!

 

 

This setup allows me to easily monitor my Proxmox host and the VMs and LXCs it runs via a nice Grafana dashboard, with the data/metrics stored in InfluxDB.

 

The main steps are:

 

1. Install InfluxDB
2. Install Grafana
3. Configure Proxmox
4. Configure InfluxDB
5. Configure Grafana

Install InfluxDB

Proxmox makes this very quick and very easy, if you’re happy to trust the Community scripts available here:

https://community-scripts.github.io/ProxmoxVE/

which just means running this one-liner in the Proxmox console:

 

bash -c "$(curl -fsSL https://raw.githubusercontent.com/community-scripts/ProxmoxVE/main/ct/influxdb.sh)"

 

This created an InfluxDB LXC in a couple of minutes.

 

For me, the IP and port were: http://192.168.0.24:8086

 

Install Grafana

This was much the same with a different script, and just meant running:

 

bash -c "$(curl -fsSL https://raw.githubusercontent.com/community-scripts/ProxmoxVE/main/ct/grafana.sh)"

 

Then I also had a new Grafana instance here:

 

http://192.168.0.114:3000

 

Note that the default user:password for Grafana is admin:admin

 

Configure Proxmox

Next you need to set the Metrics Server used by Proxmox; this tells Proxmox to send all metrics on itself and the VMs and LXCs it runs to InfluxDB.

This is set under “Datacenter” in the Proxmox UI:

 

This looked straightforward too, but there were conflicting opinions on how to do it. I initially went with UDP which didn’t work for me; there was nowhere to set any authentication and I wasn’t allowing anonymous access to InfluxDB, so I switched to using HTTP which then allowed me to specify the (InfluxDB) credentials.

 

Configure InfluxDB

I created a “proxmox” organisation and a “proxmox” bucket in InfluxDB

 

I then created an API key/Token specifically for that proxmox bucket, which I used in the above pic.

 

To verify things were working between Proxmox and InfluxDB, I took a look in the data explorer:

 

 

You can see in that pic that InfluxDB has data on my VMs and LXCs, which it must have received from Proxmox, so I then knew my remaining issues were with the connection between InfluxDB <-> Grafana.

 

Configure Grafana

 

Initially I was getting “InfluxDB returned error: Unauthorized error reading influxDB” – hence the check above to confirm that Proxmox -> InfluxDB was working ok.

 

I couldn’t see anywhere in this version of Grafana to specify the Token for InfluxDB though – other screenshots on the ‘net had & used that option, but it wasn’t available for me 🙁

 

After some reading I learned you could set the Token by creating a new Custom HTTP Header called “Authorization” with the value “Token BXx…….7yBkw==” (that’s the word Token, a space, then the full Token you got from InfluxDB, all set as the Value for a new Custom HTTP Header called Authorization…)

 

This seemed surprisingly flaky to me, but it worked.
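If you want to check the token and header format outside Grafana, a small standard-library script can send the same header straight to InfluxDB. This is a sketch under assumptions: it targets the InfluxDB 2.x `/api/v2/buckets` endpoint, the function names are my own, and the address and token are placeholders for your own values.

```python
import urllib.request


def build_influx_request(base_url, token, path="/api/v2/buckets"):
    """Build a request carrying the same header Grafana needs: 'Token <token>'."""
    url = base_url.rstrip("/") + path
    return urllib.request.Request(url, headers={"Authorization": f"Token {token}"})


def check_token(base_url, token, timeout=5):
    """Return the HTTP status; urlopen raises HTTPError (e.g. 401) on a bad token."""
    request = build_influx_request(base_url, token)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status
```

For example, `check_token("http://192.168.0.24:8086", "<your token>")` should return 200 if the token is accepted, which tells you any remaining problem is on the Grafana side.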

 

My (working) connection details look like this:

 

Prior to adding that HTTP Header, I was getting a successful connection but “0 measurements found”.

 

Next I added a new Proxmox dashboard to Grafana from here:
https://grafana.com/grafana/dashboards/10048-proxmox/

 

You don’t need to sign up there or anything else, just enter the ID: 10048 like in this pic and it’ll pull the Dashboard down:

 

Now I was finally able to see data being populated in Grafana from my Proxmox node & its VMs & LXCs:
Happy days.

 

The Pulse option

 

A possible alternative to the above Grafana and InfluxDB stack is to use “Pulse” – this was new to me and I have recently set it up too (you can never have enough monitoring!).

 

This is a very lightweight and more focused option that is really quick and easy to set up.

 

While the InfluxDB and Grafana approach can be extended to cover a vast range of monitoring and alerting for all sorts of things – I have set up and used it in several large companies I’ve worked for – if all you really want is Proxmox monitoring without those possibilities, this looks perfect.

 

 

It comes with a simple install script for Proxmox:

 

bash -c "$(wget -qLO - https://github.com/community-scripts/ProxmoxVE/raw/main/ct/pulse.sh)"

 

 

Here’s my settings screen:

 

And here’s what it looks like on my Proxmox host:

 

Neat!

 

Beech Tree – 2025

This (awesome and huge) tree fell down in a storm at the start of 2025:

 

Stump cut:

 

 

Trunk cut in to sections for milling:

 

Gaps cut out and removed between the sections so I can get the mill in, and some firewood removed: