fix: zero-downtime deploy with start-first and healthcheck

- Remove `docker service update --force`, which was causing downtime
- Group the Convex env vars into a single update (avoids multiple restarts)
- Add delay: 10s and monitor: 30s to update_config
- Web healthcheck uses /api/health with a timeout
- Set start_period to 180s (web) and 60s (convex)
- Convex backend is no longer force-restarted after stack deploy
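
The single-update change concatenates flags into one string and expands it unquoted, so any value containing spaces or glob characters would be split or expanded. A minimal sketch of an alternative, assuming a POSIX shell: collect the flags as positional parameters so each `NAME=value` pair stays one argument. The names `apply_env_updates` and `DOCKER` are hypothetical and not part of this commit.

```shell
#!/bin/sh
# Sketch only: apply_env_updates and DOCKER are hypothetical names.
# DOCKER defaults to the real CLI; point it at a stub for a dry run.
DOCKER="${DOCKER:-docker}"

# apply_env_updates SERVICE
apply_env_updates() {
  service="$1"
  set --   # reuse the positional parameters as an argument list
  for name in MACHINE_PROVISIONING_SECRET MACHINE_TOKEN_TTL_MS FLEET_SYNC_SECRET; do
    eval "value=\${$name:-}"
    if [ -n "$value" ]; then
      # Each flag and each NAME=value pair stays a single argument, spaces included.
      set -- "$@" --env-add "$name=$value"
    fi
  done
  if [ "$#" -eq 0 ]; then
    echo "No env updates to apply"
    return 0
  fi
  # One update call, so the service goes through at most one rolling update.
  "$DOCKER" service update "$@" "$service"
}
```

In the workflow step this would be called as `apply_env_updates sistema_convex_backend || true`, after the `.env` file has been sourced.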

Correct deploy flow:
1. docker stack deploy detects the change
2. A new container is created (start-first)
3. Swarm waits for the healthcheck to pass
4. Swarm waits for the monitor period (30s)
5. The old container is removed
6. Zero downtime throughout the process
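
Later steps such as smoke tests only make sense once the flow above has finished. One way to check that, sketched here under assumptions, is to poll the service's update state until it settles. `wait_for_rollout` and `DOCKER` are hypothetical names, and the default timeout and poll interval are illustrative rather than taken from this commit.

```shell
#!/bin/sh
# Sketch only: wait_for_rollout and DOCKER are hypothetical names.
DOCKER="${DOCKER:-docker}"

# wait_for_rollout SERVICE [TIMEOUT_SECONDS] [POLL_SECONDS]
wait_for_rollout() {
  service="$1"
  timeout="${2:-300}"
  poll="${3:-5}"
  elapsed=0
  while [ "$elapsed" -le "$timeout" ]; do
    # UpdateStatus is absent on a service that was never updated, hence the guard.
    state=$("$DOCKER" service inspect \
      --format '{{if .UpdateStatus}}{{.UpdateStatus.State}}{{end}}' \
      "$service") || return 1
    case "$state" in
      ""|completed)
        echo "rollout finished: ${state:-no update recorded}"
        return 0 ;;
      paused|rollback_*)
        echo "rollout failed: $state" >&2
        return 1 ;;
    esac
    sleep "$poll"
    elapsed=$((elapsed + poll))
  done
  echo "rollout timed out after ${timeout}s" >&2
  return 1
}
```

A workflow step could then run `wait_for_rollout sistema_web 600 5` before the smoke test, failing the job if the update pauses or rolls back.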

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Seu Nome 2025-12-08 15:07:13 -03:00
parent d8936899ee
commit 40e2c01abd
2 changed files with 35 additions and 19 deletions

@@ -296,26 +296,31 @@ jobs:
echo "Using APP_DIR (stable)=$APP_DIR_STABLE"
APP_DIR="$APP_DIR_STABLE" RELEASE_SHA=${{ github.sha }} docker stack deploy --with-registry-auth -c stack.yml sistema
- name: Ensure Convex service envs and restart
- name: Ensure Convex service envs (no forced restart)
run: |
cd "$EFFECTIVE_APP_DIR"
set -o allexport
if [ -f .env ]; then . ./.env; fi
set +o allexport
echo "Ensuring Convex envs on service: sistema_convex_backend"
# Accumulate all env vars into a single update to avoid multiple restarts
UPDATE_ARGS=""
if [ -n "${MACHINE_PROVISIONING_SECRET:-}" ]; then
docker service update --env-add MACHINE_PROVISIONING_SECRET="${MACHINE_PROVISIONING_SECRET}" sistema_convex_backend || true
UPDATE_ARGS="$UPDATE_ARGS --env-add MACHINE_PROVISIONING_SECRET=${MACHINE_PROVISIONING_SECRET}"
fi
if [ -n "${MACHINE_TOKEN_TTL_MS:-}" ]; then
docker service update --env-add MACHINE_TOKEN_TTL_MS="${MACHINE_TOKEN_TTL_MS}" sistema_convex_backend || true
UPDATE_ARGS="$UPDATE_ARGS --env-add MACHINE_TOKEN_TTL_MS=${MACHINE_TOKEN_TTL_MS}"
fi
if [ -n "${FLEET_SYNC_SECRET:-}" ]; then
docker service update --env-add FLEET_SYNC_SECRET="${FLEET_SYNC_SECRET}" sistema_convex_backend || true
UPDATE_ARGS="$UPDATE_ARGS --env-add FLEET_SYNC_SECRET=${FLEET_SYNC_SECRET}"
fi
echo "Current envs:"
if [ -n "$UPDATE_ARGS" ]; then
echo "Applying env updates (will respect update_config.order: start-first)..."
docker service update $UPDATE_ARGS sistema_convex_backend || true
fi
echo "Current envs:"
docker service inspect sistema_convex_backend --format '{{range .Spec.TaskTemplate.ContainerSpec.Env}}{{println .}}{{end}}' || true
echo "Forcing service restart..."
docker service update --force sistema_convex_backend || true
# Do NOT use --force here, so that the start-first strategy from stack.yml is respected
- name: Smoke test — register + heartbeat
run: |
@@ -375,10 +380,11 @@ jobs:
run: |
docker service update --force sistema_web
- name: Restart Convex backend service (optional)
run: |
# Fail the job if the convex backend cannot restart
docker service update --force sistema_convex_backend
# Commented out: stack deploy already updates the services with update_config.order: start-first
# Forcing an update here causes downtime because it ignores the rolling update strategy
# - name: Restart Convex backend service (optional)
# run: |
# docker service update --force sistema_convex_backend
convex_deploy:
name: Deploy Convex functions