Event Sourcing 2026 — A Field Guide is about the work that turns raw data into decisions. The path from "we have data" to "we have a model that runs in production" follows the same broad stages in most industries: ingest, transform, model, serve, monitor. This guide collects field-tested patterns for each stage in a production data context.
The data pipeline
The canonical data pipeline has five stages:
- Ingest. Pull from source (database, API, file, stream).
- Store. Land in a warehouse (Snowflake, BigQuery, Postgres) or a lake (S3, GCS).
- Transform. Clean, join, aggregate (dbt or Spark, often orchestrated with Airflow).
- Serve. Expose to consumers (BI tool, API, application).
- Monitor. Quality, freshness, lineage.
A data pipeline that nobody can trust is useless. A data pipeline that runs but nobody knows what it does is a liability. Treat the pipeline as a product: documented, tested, monitored, owned.
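The five stages above can be sketched end to end in a few lines of Python. This is a toy illustration, not a production design: the in-memory `SOURCE_ROWS` list and the `warehouse` dictionary are hypothetical stand-ins for a real source system and a real warehouse, and every function name is an assumption made for the example.

```python
from datetime import datetime, timezone

# Hypothetical source data; a real pipeline would read a database, API, file or stream.
SOURCE_ROWS = [
    {"customer_id": 1, "amount": "120.50"},
    {"customer_id": 2, "amount": "80.00"},
    {"customer_id": 1, "amount": "19.50"},
]


def ingest() -> list[dict]:
    """Ingest: pull raw rows from the source."""
    return [dict(row) for row in SOURCE_ROWS]


def store(rows: list[dict], warehouse: dict) -> None:
    """Store: land the raw rows unchanged, stamped with a load time."""
    loaded_at = datetime.now(timezone.utc)
    warehouse["raw_invoices"] = [{**row, "_loaded_at": loaded_at} for row in rows]


def transform(warehouse: dict) -> None:
    """Transform: cast the amounts and aggregate per customer."""
    totals: dict[int, float] = {}
    for row in warehouse["raw_invoices"]:
        customer = row["customer_id"]
        totals[customer] = totals.get(customer, 0.0) + float(row["amount"])
    warehouse["customer_totals"] = totals


def serve(warehouse: dict, customer_id: int) -> float:
    """Serve: expose one aggregated value to a consumer."""
    return warehouse["customer_totals"].get(customer_id, 0.0)


def monitor(warehouse: dict) -> dict:
    """Monitor: report simple volume figures for each layer."""
    return {
        "raw_rows": len(warehouse["raw_invoices"]),
        "customers": len(warehouse["customer_totals"]),
    }


def run_pipeline() -> dict:
    warehouse: dict = {}
    store(ingest(), warehouse)
    transform(warehouse)
    return warehouse
```

Keeping each stage as its own function is what makes the "documented, tested, monitored, owned" discipline practical: each one can be tested and replaced on its own.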
dbt for analytics engineering
dbt is the de facto standard for SQL-based data transformation. A typical incremental model looks like this:
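-- models/marts/finance/ar_aging.sql
-- Caveat: on incremental runs the filter below keeps only newly loaded invoices,
-- so the sums cover just that batch and rows are appended (no unique_key is set).
{{ config(materialized="incremental", on_schema_change="append_new_columns") }}
with ar as (
    select * from {{ ref("stg_ar_invoices") }}
    where docdate <= current_date
    {% if is_incremental() %}
      -- only pick up rows loaded since the last run
      and _loaded_at > (select max(_loaded_at) from {{ this }})
    {% endif %}
),
customers as (
    select * from {{ ref("dim_customer") }}
)
select
    ar.customer_id,
    c.acctname as customer_name,
    ar.status,
    sum(ar.curyorigdocamt) as total_amount,
    sum(ar.curydocbal) as open_balance,
    -- min() is the earliest (oldest) due date; max() would return the newest
    min(ar.duedate) as oldest_due_date,
    datediff(day, min(ar.duedate), current_date) as days_overdue,
    -- carried through so the incremental filter can read it from {{ this }}
    max(ar._loaded_at) as _loaded_at
from ar
left join customers c on ar.customer_id = c.baccountid
where ar.status = 'O'  -- single quotes: a string literal, not an identifier
group by ar.customer_id, c.acctname, ar.status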
For the broader data engineering patterns, see the data warehouse feed guide.
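The incremental filter in the model above follows a high-water-mark pattern: find the newest `_loaded_at` already in the target, then take only source rows that are newer. A minimal Python sketch of that logic, assuming rows are plain dictionaries (the function name `incremental_load` is hypothetical and is not part of dbt):

```python
def incremental_load(source_rows: list[dict], target_rows: list[dict]) -> list[dict]:
    """Append only the source rows newer than the target's high-water mark."""
    if not target_rows:
        # First run: there is no watermark yet, so load everything.
        return list(source_rows)
    watermark = max(row["_loaded_at"] for row in target_rows)
    new_rows = [row for row in source_rows if row["_loaded_at"] > watermark]
    return target_rows + new_rows


source = [
    {"id": 1, "_loaded_at": 1},
    {"id": 2, "_loaded_at": 2},
    {"id": 3, "_loaded_at": 3},
]
target = incremental_load(source, [])       # first run loads all three rows
target = incremental_load(source, target)   # second run finds nothing new
```

Note the strict `>` comparison: a row that arrives late with a `_loaded_at` equal to or older than the watermark is skipped, which is one reason to schedule periodic full refreshes.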
ML ops in production
The lifecycle of an ML model in production:
- Train. Offline training on historical data.
- Evaluate. Test on a held-out set; track metrics.
- Deploy. Serve the model (REST, batch, embedded).
- Monitor. Track drift, accuracy, latency.
- Retrain. When drift exceeds threshold.
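from fastapi import FastAPI
import joblib
import numpy as np

app = FastAPI()
# Load the trained model once at startup, not on every request.
model = joblib.load("model.pkl")

@app.post("/predict")
def predict(features: list[float]):
    # The request body is a JSON array of floats, reshaped to a single row.
    X = np.array(features).reshape(1, -1)
    prediction = model.predict(X)
    # Confidence is the highest class probability; assumes the model implements predict_proba.
    probability = model.predict_proba(X).max()
    return {"prediction": int(prediction[0]), "confidence": float(probability)}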
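The Monitor and Retrain steps of the lifecycle can be sketched with a simple drift score. The example below uses the Population Stability Index (PSI) on a single numeric feature; this is one common choice, swapped in for illustration, not something the lifecycle above prescribes. The function names and the 0.2 threshold (a commonly quoted rule of thumb) are assumptions to tune for your own data.

```python
import math

DRIFT_THRESHOLD = 0.2  # assumption: rule-of-thumb cut-off, tune per feature


def population_stability_index(expected: list[float], actual: list[float], bins: int = 10) -> float:
    """Compare two samples of one feature using equal-width buckets."""
    lo, hi = min(expected), max(expected)
    width = (hi - lo) / bins or 1.0  # fall back to 1.0 if all values are identical

    def bucket_shares(values: list[float]) -> list[float]:
        counts = [0] * bins
        for value in values:
            idx = int((value - lo) / width)
            idx = min(max(idx, 0), bins - 1)  # clamp values outside the training range
            counts[idx] += 1
        # A small floor avoids log(0) for empty buckets.
        return [max(count / len(values), 1e-6) for count in counts]

    expected_shares = bucket_shares(expected)
    actual_shares = bucket_shares(actual)
    return sum(
        (a - e) * math.log(a / e)
        for e, a in zip(expected_shares, actual_shares)
    )


def needs_retrain(training_values: list[float], live_values: list[float],
                  threshold: float = DRIFT_THRESHOLD) -> bool:
    """Flag the model for retraining when drift exceeds the threshold."""
    return population_stability_index(training_values, live_values) > threshold
```

In practice a check like this would run on a schedule against recent request logs, with one score per feature, and a retrain would be triggered or at least reviewed when several features cross the threshold.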
Vector databases and embeddings
Vector databases store embeddings — the dense numeric representation of text, images, audio. The use cases:
- Semantic search. "Find me the docs that are about X" — even if X is not a keyword.
- Recommendation. "Find me the items similar to this one."
- RAG. Retrieve relevant passages to ground a language model's answer, as discussed in the AI agents guide.
- Clustering. Group similar items for analysis.
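import os
from pinecone import Pinecone
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("all-MiniLM-L6-v2")
# Assumes the current Pinecone Python client and an API key in PINECONE_API_KEY.
pc = Pinecone(api_key=os.environ["PINECONE_API_KEY"])
index = pc.Index("my-index")

def search(query: str, k: int = 5) -> list[str]:
    # Embed the query with the same model that was used to index the documents.
    embedding = model.encode(query).tolist()
    results = index.query(vector=embedding, top_k=k, include_metadata=True)
    # Assumes each vector was upserted with a "text" metadata field.
    return [r["metadata"]["text"] for r in results["matches"]]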
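To see what the index is doing conceptually, here is a brute-force version of the same search using cosine similarity in plain Python. It is a sketch for intuition only: the three-dimensional vectors in `CORPUS` are made-up toy values, not real model output, and a real vector database uses approximate nearest-neighbour indexes instead of scanning every vector.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors; 1.0 means same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(query: list[float], corpus: dict[str, list[float]], k: int = 2) -> list[str]:
    """Return the k texts whose vectors are most similar to the query vector."""
    scored = [(cosine_similarity(query, vector), text) for text, vector in corpus.items()]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [text for _, text in scored[:k]]


# Toy "embeddings" invented for the example; real models emit hundreds of dimensions.
CORPUS = {
    "invoice overdue reminder": [0.9, 0.1, 0.0],
    "payment received notice": [0.7, 0.3, 0.1],
    "office party invitation": [0.0, 0.2, 0.9],
}

matches = top_k([1.0, 0.0, 0.0], CORPUS, k=2)
```

The two finance-related texts rank ahead of the party invitation because their vectors point in nearly the same direction as the query, which is the property semantic search, recommendation and RAG all rely on.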
Monitoring the pipeline
The monitoring that matters:
- Freshness. Is the data up to date?
- Volume. Is the volume what we expect?
- Schema. Has the source schema changed?
- Quality. Are the values within expected ranges?
- Lineage. Where did this number come from?
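Four of the checks above (volume, schema, freshness, quality) can be sketched as one function over a batch of rows. Lineage is left out because it needs metadata about upstream jobs and cannot be derived from the rows alone. The column names, thresholds and the `check_batch` function are all hypothetical, chosen only to make the example concrete.

```python
from datetime import datetime, timedelta, timezone

EXPECTED_COLUMNS = {"customer_id", "amount", "_loaded_at"}  # assumed schema


def check_batch(
    rows: list[dict],
    now: datetime,
    max_age: timedelta = timedelta(hours=24),
    min_rows: int = 1,
    amount_range: tuple[float, float] = (0.0, 1_000_000.0),
) -> list[str]:
    """Return the names of the checks that failed for this batch."""
    failures = []
    # Volume: did we receive at least the expected number of rows?
    if len(rows) < min_rows:
        failures.append("volume")
    # Schema: does every row have exactly the expected columns?
    if any(set(row) != EXPECTED_COLUMNS for row in rows):
        failures.append("schema")
        return failures  # the remaining checks rely on the expected columns
    if rows:
        # Freshness: is the newest row recent enough?
        newest = max(row["_loaded_at"] for row in rows)
        if now - newest > max_age:
            failures.append("freshness")
        # Quality: are the amounts within the expected range?
        low, high = amount_range
        if any(not (low <= row["amount"] <= high) for row in rows):
            failures.append("quality")
    return failures
```

Returning a list of failed check names, instead of raising on the first problem, lets an alert show everything wrong with a batch at once.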
For the broader monitoring patterns, see the monitoring guide and the Query Store guide.
Wrapping up
The pipeline, the transformation, the model serving, the vector search, the monitoring. Get all five right and the data is a product. The discipline is the same as any production system — fail safely, monitor always, and improve the system after every incident.
That is the working approach I use on Acumatica projects. The same patterns show up whether you are in Nairobi, Johannesburg, Kigali, Lusaka or Harare — and they are the things that keep work moving when an upgrade lands at 6 PM on a Friday. If you are stuck on something specific, reach out or keep reading through the rest of the Acumatica blog.
Independent software engineer in Nairobi specialising in Acumatica customisations, Laravel backends, and tax fiscalisation integrations across East and Southern Africa.