Local Development with KIND
Deploy Data Index to a local Kubernetes cluster using KIND (Kubernetes in Docker) for development and testing.
Quick Start
cd data-index/scripts/kind
# 1. Create KIND cluster
./setup-cluster.sh
# 2. Install PostgreSQL
MODE=postgresql ./install-dependencies.sh
# 3. Deploy Data Index service
./deploy-data-index.sh postgresql
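# 4. Deploy FluentBit (see Step 4 under Detailed Setup)
(cd ../fluentbit && ./deploy-fluentbit.sh postgresql)
# 5. Deploy test workflow app
./deploy-workflow-app.sh
# 6. Test end-to-end
./test-mode1-e2e.sh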
Detailed Setup
Step 1: Create KIND Cluster
cd data-index/scripts/kind
./setup-cluster.sh
What this does:
- Creates a KIND cluster named data-index-test
- Configures single control-plane node (can run workloads)
- Sets up NodePort mappings for local access:
  - 30080 → GraphQL API
  - 30432 → PostgreSQL
  - 30920 → Elasticsearch (future)
Configuration:
Override cluster name:
CLUSTER_NAME=my-cluster ./setup-cluster.sh
Verify cluster:
kubectl cluster-info --context kind-data-index-test
kubectl get nodes
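Cluster creation can return before the node is ready to schedule pods. The following is a minimal sketch of waiting for readiness. The helper names `kind_context` and `wait_for_nodes` are hypothetical and not part of the repository scripts. It relies on KIND's convention of naming the kubectl context `kind-<cluster name>`.

```shell
# Hypothetical helpers (not part of the repository scripts).

# Print the kubectl context name KIND assigns to a cluster.
# Falls back to the default cluster name used in this guide.
kind_context() {
  printf 'kind-%s\n' "${1:-data-index-test}"
}

# Block until every node reports Ready, or fail after the timeout.
# Honors CLUSTER_NAME in the same way as the override shown above.
wait_for_nodes() {
  kubectl wait --context "$(kind_context "${CLUSTER_NAME:-}")" \
    --for=condition=Ready nodes --all --timeout="${1:-120s}"
}
```

For example, `CLUSTER_NAME=my-cluster wait_for_nodes 180s` would wait up to three minutes for the nodes of `my-cluster`.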
Step 2: Install Dependencies
PostgreSQL Mode
MODE=postgresql ./install-dependencies.sh
What this installs:
- PostgreSQL 16.x (Bitnami Helm chart)
  - Namespace: postgresql
  - Database: dataindex
  - User: dataindex
  - Password: dataindex123
  - Service: postgresql.postgresql.svc.cluster.local:5432
  - NodePort: localhost:30432
Verify PostgreSQL:
kubectl get pods -n postgresql
kubectl logs -n postgresql postgresql-0
# Test connection
kubectl exec -n postgresql postgresql-0 -- \
env PGPASSWORD=dataindex123 psql -U dataindex -d dataindex -c '\l'
Step 3: Deploy Data Index Service
./deploy-data-index.sh postgresql
What this does:
- Builds Data Index container image
- Loads image to KIND cluster
- Initializes database schema (executes SQL migration script)
- Deploys Data Index service to data-index namespace
- Creates NodePort service on port 30080
Verify deployment:
kubectl get pods -n data-index
kubectl logs -n data-index -l app=data-index-service
# Check GraphQL endpoint
curl http://localhost:30080/graphql \
-H "Content-Type: application/json" \
-d '{"query":"{ __schema { types { name } } }"}'
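The service may take a few seconds to pass its readiness probe after deployment, so a single curl can fail spuriously. As a sketch, the check can be wrapped in a small retry loop. The helpers `retry` and `graphql_ready` are hypothetical names, and the `{ __typename }` query is used only because it is valid against any GraphQL schema.

```shell
# Hypothetical helpers (not part of the repository scripts).

# retry ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
# Runs COMMAND until it succeeds or ATTEMPTS is exhausted.
retry() {
  retry_attempts=$1
  retry_delay=$2
  shift 2
  retry_i=1
  while [ "$retry_i" -le "$retry_attempts" ]; do
    if "$@"; then
      return 0
    fi
    # Do not sleep after the final failed attempt.
    if [ "$retry_i" -lt "$retry_attempts" ]; then
      sleep "$retry_delay"
    fi
    retry_i=$((retry_i + 1))
  done
  return 1
}

# Succeeds once the GraphQL endpoint answers with an HTTP success status.
graphql_ready() {
  curl -fsS http://localhost:30080/graphql \
    -H "Content-Type: application/json" \
    -d '{"query":"{ __typename }"}' >/dev/null
}
```

A call such as `retry 30 2 graphql_ready && echo "GraphQL endpoint is up"` would poll every two seconds for up to about a minute.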
Deployment includes:
- Service: data-index-service (NodePort 30080)
- ConfigMap: Database connection settings
- Deployment: 1 replica, readiness/liveness probes
- Environment:
  - QUARKUS_DATASOURCE_JDBC_URL
  - QUARKUS_DATASOURCE_USERNAME
  - QUARKUS_DATASOURCE_PASSWORD
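The JDBC URL presumably combines the service host, port, and database name from Step 2. The sketch below shows how those values fit the standard PostgreSQL JDBC URL format. The function name `jdbc_url` is hypothetical, and the value actually set by deploy-data-index.sh may differ (for example, by carrying extra query parameters).

```shell
# Hypothetical helper: assemble a PostgreSQL JDBC URL.
# Arguments: host, port, database. Defaults are the Step 2 values.
jdbc_url() {
  printf 'jdbc:postgresql://%s:%s/%s\n' \
    "${1:-postgresql.postgresql.svc.cluster.local}" \
    "${2:-5432}" \
    "${3:-dataindex}"
}
```

Called without arguments it prints the in-cluster URL. `jdbc_url localhost 30432` prints the equivalent URL for a client on the host that connects through the NodePort.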
Step 4: Deploy FluentBit
cd ../fluentbit
./deploy-fluentbit.sh postgresql
What this does:
- Generates FluentBit ConfigMap from source files:
  - postgresql/fluent-bit.conf
  - postgresql/parsers.conf
  - postgresql/flatten-event.lua
- Deploys FluentBit DaemonSet to logging namespace
- Configures to tail logs from workflows namespace
- Sets up PostgreSQL output plugin
FluentBit configuration highlights:
[INPUT]
    Name    tail
    Path    /var/log/containers/*_workflows_*.log
    Parser  cri
    Tag     kube.*
[OUTPUT]
    Name      pgsql
    Host      postgresql.postgresql.svc.cluster.local
    Port      5432
    Database  dataindex
    # Table is workflow_events_raw or task_events_raw
    Table     workflow_events_raw
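The Path glob works because Kubernetes names container log files as `<pod>_<namespace>_<container>-<container id>.log`, so `*_workflows_*` selects pods in the workflows namespace. The sketch below mirrors that match with a shell `case` pattern. The function name and the sample file names are made up for illustration.

```shell
# Hypothetical helper: succeed if a log path would be picked up by the
# tail input's Path glob shown above.
matches_workflows_glob() {
  case "$1" in
    /var/log/containers/*_workflows_*.log) return 0 ;;
    *) return 1 ;;
  esac
}

# Illustrative file names (the pod and container IDs are invented).
for f in \
  /var/log/containers/workflow-test-app-abc12_workflows_app-0123.log \
  /var/log/containers/data-index-service-xyz89_data-index_svc-4567.log
do
  if matches_workflows_glob "$f"; then
    echo "tailed:  $f"
  else
    echo "ignored: $f"
  fi
done
```

Only the first path is in the workflows namespace, so only that file would be tailed.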
Verify FluentBit:
kubectl get pods -n logging
kubectl logs -n logging -l app=workflows-fluent-bit --tail=50
# Check FluentBit is tailing workflow logs
kubectl logs -n logging -l app=workflows-fluent-bit | grep "inotify_fs_add"
Step 5: Deploy Test Workflow Application
cd ../kind
./deploy-workflow-app.sh
What this does:
- Builds workflow-test-app container (Quarkus Flow app)
- Loads image to KIND cluster
- Deploys to workflows namespace
- Creates Service on port 8082
- Configures structured logging to stdout
Test workflows included:
- /test-workflows/simple-set: Simple workflow with 2 set operations
- /test-workflows/hello-world: Hello world workflow
- /test-workflows/hello-world-fail: Intentional failure for testing
Verify deployment:
kubectl get pods -n workflows
kubectl logs -n workflows -l app=workflow-test-app
# Port-forward to access locally
kubectl port-forward -n workflows svc/workflow-test-app 8082:8080
Testing
Execute Test Workflow
# Port-forward workflow app
kubectl port-forward -n workflows svc/workflow-test-app 8082:8080 &
# Execute workflow
curl -X POST http://localhost:8082/test-workflows/simple-set \
-H "Content-Type: application/json" \
-d '{"name": "test-execution"}'
# Wait for events to propagate (5-10 seconds)
sleep 10
Query GraphQL API
# Get all workflow instances
curl http://localhost:30080/graphql \
-H "Content-Type: application/json" \
-d '{
"query": "{ getWorkflowInstances(limit: 10) { id name status startDate endDate } }"
}'
# Get workflow with tasks
curl http://localhost:30080/graphql \
-H "Content-Type: application/json" \
-d '{
"query": "{ getWorkflowInstances(limit: 5) { id name status taskExecutions { id taskName status } } }"
}'
Expected response:
{
"data": {
"getWorkflowInstances": [
{
"id": "01KQ...",
"name": "simple-set",
"status": "COMPLETED",
"startDate": "2026-04-27T20:30:00Z",
"endDate": "2026-04-27T20:30:05Z"
}
]
}
}
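For scripted checks, the response can be scanned for completed instances. The following is a deliberately crude sketch that uses only grep and reads JSON on standard input. The function name `count_completed` is hypothetical, and a real JSON parser would be more robust if one is available.

```shell
# Hypothetical helper: count occurrences of "status": "COMPLETED" in a
# GraphQL JSON response read from stdin. This is a text match, not a
# JSON parse, so it counts any object with that status (tasks included).
count_completed() {
  grep -o '"status"[[:space:]]*:[[:space:]]*"COMPLETED"' | wc -l | tr -d ' '
}
```

Piping the output of the first query above into `count_completed` should print at least 1 once the simple-set workflow has finished.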
Verify Database
# Check raw events
kubectl exec -n postgresql postgresql-0 -- \
env PGPASSWORD=dataindex123 psql -U dataindex -d dataindex \
-c "SELECT COUNT(*) FROM workflow_events_raw;"
# Check normalized workflow instances
kubectl exec -n postgresql postgresql-0 -- \
env PGPASSWORD=dataindex123 psql -U dataindex -d dataindex \
-c "SELECT id, name, status FROM workflow_instances;"
# Check task instances
kubectl exec -n postgresql postgresql-0 -- \
env PGPASSWORD=dataindex123 psql -U dataindex -d dataindex \
-c "SELECT task_execution_id, task_name, status FROM task_instances;"
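The three checks above repeat the same kubectl and psql invocation. As a convenience sketch, they can be wrapped in a small function. The name `dbq` is hypothetical, and the added `-At` flags (unaligned, tuples-only output) are a choice made here so the result is easy to use in scripts.

```shell
# Hypothetical helper: run one SQL statement against the dataindex
# database inside the PostgreSQL pod, using the Step 2 credentials.
dbq() {
  kubectl exec -n postgresql postgresql-0 -- \
    env PGPASSWORD=dataindex123 psql -U dataindex -d dataindex -At -c "$1"
}
```

For example, `dbq "SELECT COUNT(*) FROM workflow_events_raw;"` prints only the count, which can then be compared in a test script.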
End-to-End Test Script
Run the complete test suite:
./test-mode1-e2e.sh
What this tests:
- Workflow execution triggers events
- FluentBit collects events from container logs
- PostgreSQL receives raw events
- Events are normalized in real-time to workflow_instances and task_instances
- GraphQL API returns normalized data
Test output example:
[INFO] Testing PostgreSQL mode end-to-end
[STEP] Executing test workflow...
[INFO] ✓ Workflow executed
[STEP] Waiting for event propagation (10s)...
[STEP] Checking FluentBit...
[INFO] ✓ FluentBit is collecting events
[STEP] Checking raw events in PostgreSQL...
[INFO] ✓ Raw events: 8
[STEP] Checking normalized workflows...
[INFO] ✓ Workflows: 1
[STEP] Checking normalized tasks...
[INFO] ✓ Tasks: 2
[STEP] Querying GraphQL API...
[INFO] ✓ GraphQL returned workflow data
[INFO] ========================================
[INFO] PostgreSQL Mode End-to-End Test: PASSED
[INFO] ========================================
Development Workflow
Building and Deploying Changes
After making code changes:
# 1. Rebuild and redeploy Data Index service
cd data-index/data-index-service
mvn clean package -DskipTests
cd ../scripts/kind
./deploy-data-index.sh postgresql
# 2. Or rebuild just the workflow app
cd data-index/data-index-integration-tests
mvn clean package -DskipTests
cd ../scripts/kind
./deploy-workflow-app.sh
After FluentBit config changes:
cd data-index/scripts/fluentbit
./deploy-fluentbit.sh postgresql
Accessing Services Locally
GraphQL API:
# Already accessible via NodePort
open http://localhost:30080/graphql
PostgreSQL:
# Already accessible via NodePort
psql -h localhost -p 30432 -U dataindex -d dataindex
# Password: dataindex123
Workflow App:
kubectl port-forward -n workflows svc/workflow-test-app 8082:8080
open http://localhost:8082
FluentBit Metrics:
kubectl port-forward -n logging <fluentbit-pod> 2020:2020
curl http://localhost:2020/api/v1/metrics
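The `<fluentbit-pod>` placeholder can be filled in by looking the pod up by label. This sketch assumes the `app=workflows-fluent-bit` label used in the verification commands earlier. The function name `fluentbit_pod` is hypothetical.

```shell
# Hypothetical helper: print the name of the first FluentBit pod in the
# logging namespace, selected by the label used elsewhere in this guide.
fluentbit_pod() {
  kubectl get pods -n logging -l app=workflows-fluent-bit \
    -o jsonpath='{.items[0].metadata.name}'
}
```

The port-forward then becomes `kubectl port-forward -n logging "$(fluentbit_pod)" 2020:2020`. On a single-node KIND cluster the DaemonSet runs one pod, so taking the first match is sufficient.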
Troubleshooting
Cluster Won’t Start
Check Docker:
docker info
# If error, start Docker Desktop
Delete and recreate:
kind delete cluster --name data-index-test
./setup-cluster.sh
Pods Not Starting
Check images loaded:
docker exec -it data-index-test-control-plane crictl images | grep data-index
If missing, reload:
kind load docker-image kubesmarts/data-index-service:999-SNAPSHOT --name data-index-test
kind load docker-image local/workflow-test-app:1.0.0 --name data-index-test
No Events in Database
Check FluentBit is running:
kubectl get pods -n logging
kubectl logs -n logging -l app=workflows-fluent-bit | grep -i error
Check workflow app is in workflows namespace:
kubectl get pods -n workflows
Check FluentBit parser:
# KIND uses containerd (CRI runtime) - verify parser is 'cri', not 'docker'
kubectl get configmap -n logging workflows-fluent-bit-config -o yaml | grep Parser
See Troubleshooting Guide for more issues.
Cleanup
Reset Without Deleting Cluster
To redeploy with fresh data (keeps the cluster; PostgreSQL is wiped and reinstalled):
# Delete Data Index and FluentBit
kubectl delete namespace data-index
kubectl delete namespace logging
# Clear PostgreSQL data
kubectl delete -n postgresql pvc --all
kubectl delete -n postgresql statefulset postgresql
# Reinstall PostgreSQL
cd data-index/scripts/kind
MODE=postgresql ./install-dependencies.sh
# Redeploy Data Index and FluentBit
./deploy-data-index.sh postgresql
cd ../fluentbit
./deploy-fluentbit.sh postgresql
Note: This is faster than recreating the entire cluster and is useful for testing schema changes or FluentBit configuration.