OMS Kubernetes Migration: Detail Docs
Companion to OMS Migration Plan: AWS EC2 → Azure AKS. The plan page is the executive view; these pages are the implementation-level detail produced by the Rails team during the prep work on the docker-setup-chnages branch.
Reading order
| Doc | What it covers |
|---|---|
| Workloads | The five Kubernetes Deployments (web + three Sidekiq capsules + scheduler) with replica counts, commands, resource targets, and termination grace periods. |
| Environment variables | Every env var the app reads. The input to the ConfigMap and Secret manifests. |
| External services | Inbound and outbound integrations: Shopify, SFCC, Amazon SP-API, Cirro 3PL, Azure Service Bus, Postmark, Klaviyo. Egress firewall inputs. |
| Scheduled jobs | Why sidekiq-scheduler must run as a singleton. Why we are not migrating to Kubernetes CronJobs. |
| Known issues | Ten things in the codebase that need attention before cutover, including four specific filesystem-write call sites. |
| Decisions needed | Open questions for leadership in suggested decision order. |
| Database migration | MySQL 5.7 → 8.0: three strategy options compared, recommendation, sequence of work, risks. |
| Rails team role | Clean ownership split: Rails team, DevOps (Senith), joint, leadership. |
| Shopify integration | Shopify Partners app ownership, in-tree shopify-app-admin engine, OAuth tokens, webhook URLs, cutover risks. |
Status
Rails-side prep is committed on the docker-setup-chnages branch. Every change is backwards-compatible — gated on environment variables — so the existing Capistrano deploy is unaffected.
Delivered:
- Production Dockerfile (multi-stage, non-root, jemalloc, wkhtmltopdf)
- `.dockerignore` for build context hygiene
- Per-capsule Sidekiq configs (`sidekiq.default.yml`, `sidekiq.limited.yml`, `sidekiq.single.yml`, `sidekiq.scheduler.yml`)
- `/up` health endpoint for liveness and readiness probes
- STDOUT logging gated on `RAILS_LOG_TO_STDOUT`
- Force-SSL, ASSUME_SSL, and hosts allowlist via env vars
- Env-driven AWS S3 config (`AWS_*` env vars)
- Env-driven Datadog StatsD destination (`DD_AGENT_HOST`, `DD_DOGSTATSD_PORT`)
- Sidekiq graceful-shutdown timeout (`:timeout: 25`)
- Build-resilient credential lookups (`rescue` wrappers in `database.yml`, `secrets.yml`, `storage.yml`)
- A `local-dev/` validation harness: a kind-based local cluster that mirrors the target shape
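As a rough illustration of the env-gating pattern behind several of the items above, the sketch below shows how STDOUT logging and the StatsD destination might be derived from environment variables while leaving behaviour unchanged when they are unset. The helper names `log_to_stdout?` and `statsd_destination` are hypothetical, and the fallback host and port are assumptions (8125 is the usual DogStatsD port); the actual code on the branch may differ.

```ruby
# Hypothetical helpers; names are illustrative, not taken from the OMS codebase.

# True only when RAILS_LOG_TO_STDOUT is set to a non-blank value,
# so a deploy that never sets it keeps file-based logging.
def log_to_stdout?(env = ENV)
  !env["RAILS_LOG_TO_STDOUT"].to_s.strip.empty?
end

# Returns [host, port] for the StatsD client.
# The fallbacks (127.0.0.1 and 8125) are assumptions for this sketch.
def statsd_destination(env = ENV)
  host = env.fetch("DD_AGENT_HOST", "127.0.0.1")
  port = Integer(env.fetch("DD_DOGSTATSD_PORT", "8125"))
  [host, port]
end
```

Passing a plain Hash instead of `ENV` keeps the helpers easy to exercise in specs, for example `statsd_destination("DD_AGENT_HOST" => "10.0.0.5")`.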
Still owed:
- Refactor of four filesystem-write call sites to stream to S3 (see Known issues)
- MySQL 8.0 compatibility pass (RSpec against 8.0)
- GitHub Actions workflow to build the image (replaces Capistrano)
- Cutover smoke test script
See Rails team role for the full breakdown.
Related pages in this site
- OMS Migration Plan: AWS EC2 → Azure AKS — the executive overview this folder supports
- OMS — current OMS architecture and ownership
- Camel Topology — Azure-side integrations