✅ Redis Implementation Validation Checklist

Use this checklist to validate the Redis implementation before deploying to production.


📋 Pre-Deployment Checklist

1. ✅ Code Implementation

  • [ ] redis_session_service.py created (450 lines)
      • [ ] RPUSH for append_event() (O(1))
      • [ ] Pipeline transactions for atomicity
      • [ ] Automatic TTL on all keys
      • [ ] Correct event serialization/deserialization

  • [ ] redis_memory_service.py created (200 lines)
      • [ ] Sorted sets for temporal indexing
      • [ ] Cleanup of old memories

  • [ ] backends.py created (220 lines)
      • [ ] Factory get_session_service(backend)
      • [ ] Factory get_memory_service(backend)
      • [ ] Supports: redis, cloudsql, inmemory

  • [ ] slack_bot.py modified
      • [ ] Imports get_session_service and get_memory_service
      • [ ] Uses the factory functions
      • [ ] Logs the configured backend

  • [ ] Package exports updated
      • [ ] session/__init__.py exports RedisSessionService
      • [ ] memory/__init__.py exports RedisMemoryService

  • [ ] requirements.txt updated
      • [ ] redis[asyncio]>=5.0.0 added
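
The session-service items above (RPUSH append, pipeline transaction, TTL on every key) can be sketched as follows. This is a minimal sketch under stated assumptions, not the contents of redis_session_service.py: it assumes `client` behaves like a redis.asyncio client, that events are JSON-serializable dicts, and that keys follow the session:<id>:meta / session:<id>:events layout used in the integration checks of this checklist. The function names and parameters are illustrative.

```python
import json


async def append_event(client, session_id: str, event: dict, ttl_seconds: int = 3600) -> int:
    """Append one event and refresh the TTL on both session keys atomically.

    `client` is assumed to behave like redis.asyncio.Redis.
    """
    events_key = f"session:{session_id}:events"
    meta_key = f"session:{session_id}:meta"

    # transaction=True queues the commands and sends them inside MULTI/EXEC.
    async with client.pipeline(transaction=True) as pipe:
        pipe.rpush(events_key, json.dumps(event))  # O(1) append to the list tail
        pipe.expire(events_key, ttl_seconds)       # refresh TTL on the event list
        pipe.expire(meta_key, ttl_seconds)         # keep metadata alive equally long
        results = await pipe.execute()

    return results[0]  # RPUSH reply: list length after the append


def load_events(raw_items) -> list:
    """Decode the items returned by LRANGE back into event dicts."""
    return [json.loads(item) for item in raw_items]
```

Against a real server the client would typically come from `redis.asyncio.from_url(...)` with the configured REDIS_URL. Queuing the append and both EXPIRE calls in one transaction means a session cannot end up with a fresh event list but stale metadata.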

2. ✅ Configuration

  • [ ] .env created (based on .env.redis.example)

    SESSION_BACKEND=redis
    MEMORY_BACKEND=cloudsql
    REDIS_URL=redis://localhost:6379/0
    REDIS_SESSION_TTL=3600
    

  • [ ] Variables validated

    python -c "
    import os
    from dotenv import load_dotenv
    load_dotenv()
    assert os.getenv('SESSION_BACKEND') == 'redis'
    # Defaults make a missing variable fail the assert instead of raising
    assert os.getenv('REDIS_URL', '').startswith(('redis://', 'rediss://'))
    assert int(os.getenv('REDIS_SESSION_TTL', '0')) > 0
    print('✅ Config valid')
    "
    

3. ✅ Local Environment

  • [ ] Docker Redis running

    docker run -d --name ifriend-redis -p 6379:6379 redis:7-alpine
    docker ps | grep redis  # Should show the running ifriend-redis container
    

  • [ ] Redis connection working

    redis-cli ping  # Should return: PONG
    

  • [ ] Dependencies installed

    pip install -r ifriend_agent/requirements.txt
    pip list | grep redis  # Should show redis 5.0.0 or later
    

4. ✅ Functional Testing

  • [ ] Bot starts without errors

    python slack_bot.py
    

Expected logs:

✅ SessionService inicializado
✅ MemoryService inicializado
✅ Runner configurado
   • Session: redis
   • Memory: cloudsql
   ⚡ Redis: Performance otimizada (~10ms vs ~800ms CloudSQL)

  • [ ] Session creation works

    python -c "
    import asyncio
    from ifriend_agent.config.backends import get_session_service
    
    async def test():
        svc = get_session_service('redis')
        session = await svc.create_session('test_app', 'test_user')
        print(f'✅ Session created: {session.id}')
    
        # Verify in Redis
        import redis.asyncio as redis
        r = await redis.from_url('redis://localhost:6379/0')
        exists = await r.exists(f'session:{session.id}:meta')
        assert exists, 'Session not found in Redis!'
        print('✅ Session verified in Redis')
    
    asyncio.run(test())
    "
    

  • [ ] Event append works

    python -c "
    import asyncio
    from ifriend_agent.config.backends import get_session_service
    from google.genai import types
    
    async def test():
        svc = get_session_service('redis')
        session = await svc.create_session('test_app', 'test_user')
    
        # Append event
        event = types.LiveClientMessageEvent(
            client_message=types.LiveClientMessage(
                turns=[types.LiveClientContent(
                    turn_id='test_turn',
                    parts=[types.Part(text='Test message')]
                )]
            )
        )
    
        before = len((await svc.get_session(session.id)).events)
        await svc.append_event(session.id, event)
        print('✅ Event appended')
    
        # Verify that exactly one event was added
        loaded = await svc.get_session(session.id)
        assert len(loaded.events) == before + 1, f'Expected {before + 1} events, got {len(loaded.events)}'
        print(f'✅ Session has {len(loaded.events)} events')
    
    asyncio.run(test())
    "
    

  • [ ] TTL cleanup works

    redis-cli TTL "session:<session_id>:meta"  # use a session ID printed by the tests above
    # Should return: positive number (seconds until expiration)
    # NOT -1 (no expiration) or -2 (doesn't exist)
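
The three possible TTL replies can also be interpreted from a script. A small sketch: `describe_ttl` is a hypothetical helper, and the integer it receives is assumed to come from `redis-cli TTL` or from a client's `ttl()` call.

```python
def describe_ttl(ttl: int) -> str:
    """Translate the integer reply of the Redis TTL command into a verdict."""
    if ttl == -2:
        return "missing: key does not exist (expired, or wrong session id)"
    if ttl == -1:
        return "no-expiry: key exists but has no TTL set"
    if ttl >= 0:
        return f"ok: expires in {ttl}s"
    raise ValueError(f"unexpected TTL reply: {ttl}")
```

Only the "ok" verdict satisfies this check; "no-expiry" would mean the session key is never cleaned up.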
    

5. ✅ Performance Testing

  • [ ] Run benchmark

    python benchmark_session_performance.py
    

Expected results:

Redis:
  Average: 2-10ms
  P95: 8-15ms
  P99: 15-25ms

CloudSQL:
  Average: 150-200ms
  P95: 300-400ms
  P99: 600-800ms

⚡ Redis is 50-80x FASTER than CloudSQL

  • [ ] Verify O(1) append
      • Redis average should be < 10ms
      • Redis P95 should be < 20ms
      • Redis should be 50-80x faster than CloudSQL
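
The average and P95 thresholds above can be checked mechanically from raw latency samples. The sketch below is not taken from benchmark_session_performance.py, whose internals are not shown here. It assumes latencies are collected in milliseconds and uses the nearest-rank percentile definition; all function names are illustrative.

```python
import math
import statistics


def percentile(samples, pct):
    """Nearest-rank percentile: the smallest sample with at least pct% of values <= it."""
    if not samples:
        raise ValueError("no samples")
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def summarize(latencies_ms):
    """Average, P95 and P99 of a list of latencies in milliseconds."""
    return {
        "avg": statistics.fmean(latencies_ms),
        "p95": percentile(latencies_ms, 95),
        "p99": percentile(latencies_ms, 99),
    }


def meets_redis_targets(summary, avg_limit_ms=10.0, p95_limit_ms=20.0):
    """Apply the two Redis thresholds from this checklist."""
    return summary["avg"] < avg_limit_ms and summary["p95"] < p95_limit_ms
```

Percentiles matter here because a handful of slow appends can hide behind a good average; checking P95 alongside the mean catches that case.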

6. ✅ Integration Testing (Slack)

  • [ ] Bot responds to messages

    Slack: @IFriendBot hello
    Bot: [response in < 1s]
    

  • [ ] Check Redis keys created

    redis-cli KEYS "session:*"
    # Should show: session:slack_CHANNEL_USER_THREAD:meta
    #              session:slack_CHANNEL_USER_THREAD:events
    

  • [ ] Verify session persists

    Slack: @IFriendBot first message
    Bot: [response]
    
    Slack: @IFriendBot continue conversation
    Bot: [response with context from first message]
    

  • [ ] Multi-turn conversation works

    redis-cli LLEN "session:slack_C123_U456_T789:events"
    # Should increase with each message
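
The key names used in these checks follow a fixed pattern, so they can be derived from the Slack identifiers when scripting the verification. A minimal sketch, assuming the slack_CHANNEL_USER_THREAD layout shown above; the helper names are hypothetical and the bot may build its IDs differently.

```python
def slack_session_id(channel: str, user: str, thread: str) -> str:
    """Build the session ID in the slack_CHANNEL_USER_THREAD form."""
    return f"slack_{channel}_{user}_{thread}"


def session_keys(session_id: str) -> dict:
    """Return the two Redis keys that belong to one session."""
    return {
        "meta": f"session:{session_id}:meta",
        "events": f"session:{session_id}:events",
    }
```

For example, `session_keys(slack_session_id("C123", "U456", "T789"))["events"]` yields the key passed to LLEN above.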
    

7. ✅ Error Handling

  • [ ] Graceful Redis failure

    # Stop Redis
    docker stop ifriend-redis
    
    # Bot should show error but not crash
    python slack_bot.py
    # Expected: "❌ Erro ao inicializar SessionService"
    

  • [ ] Fallback to CloudSQL works

    SESSION_BACKEND=cloudsql python slack_bot.py
    # Should work (slower but functional)
    

  • [ ] Fallback to InMemory works

    SESSION_BACKEND=inmemory python slack_bot.py
    # Should work (fast but no persistence)
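
Switching backends through SESSION_BACKEND, as in the fallback checks above, implies a small factory that maps a backend name to a service class. The sketch below illustrates that pattern with placeholder classes. The real get_session_service() in backends.py may differ in signature and behavior, and the "inmemory" default is an assumption of this sketch.

```python
import os


class InMemorySessionService:   # placeholder standing in for the real class
    name = "inmemory"


class CloudSQLSessionService:   # placeholder
    name = "cloudsql"


class RedisSessionService:      # placeholder
    name = "redis"


_SESSION_BACKENDS = {
    "redis": RedisSessionService,
    "cloudsql": CloudSQLSessionService,
    "inmemory": InMemorySessionService,
}


def get_session_service(backend=None):
    """Instantiate the service for `backend`, or for SESSION_BACKEND if omitted."""
    # Falling back to "inmemory" when nothing is configured is an assumption.
    name = (backend or os.getenv("SESSION_BACKEND", "inmemory")).strip().lower()
    try:
        return _SESSION_BACKENDS[name]()
    except KeyError:
        supported = ", ".join(sorted(_SESSION_BACKENDS))
        raise ValueError(f"Unknown session backend {name!r}; supported: {supported}") from None
```

Rejecting unknown names with an explicit error keeps a typo in SESSION_BACKEND from silently selecting the wrong store.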
    

8. ✅ Documentation

  • [ ] README_REDIS.md created
  • [ ] docs/REDIS_SETUP.md created with full setup instructions
  • [ ] .env.redis.example created as a template
  • [ ] REDIS_IMPLEMENTATION_SUMMARY.md created
  • [ ] REDIS_ROLLBACK_GUIDE.md created
  • [ ] benchmark_session_performance.py created

🚀 Production Deployment Checklist

1. ✅ Google Cloud Infrastructure

  • [ ] Memorystore instance created

    gcloud redis instances create ifriend-redis \
      --size=1 \
      --region=us-central1 \
      --tier=standard \
      --redis-version=redis_7_0
    

  • [ ] VPC Connector created

    gcloud compute networks vpc-access connectors create ifriend-connector \
      --region=us-central1 \
      --network=default \
      --range=10.8.0.0/28
    

  • [ ] Memorystore IP obtained

    gcloud redis instances describe ifriend-redis \
      --region=us-central1 \
      --format="get(host)"
    # Note the IP: 10.x.x.x
    

2. ✅ Cloud Run Configuration

  • [ ] cloudbuild.yaml updated with VPC connector

    - name: 'gcr.io/cloud-builders/gcloud'
      args:
        - 'run'
        - 'deploy'
        - 'ifriend-slack-bot'
        - '--vpc-connector=ifriend-connector'
        - '--set-env-vars=SESSION_BACKEND=redis,REDIS_URL=redis://10.x.x.x:6379/0,REDIS_SESSION_TTL=3600'
    

  • [ ] Environment variables configured

    gcloud run services describe ifriend-slack-bot \
      --region=us-central1 \
      --format="get(spec.template.spec.containers[0].env)"
    
    # Should include:
    # SESSION_BACKEND=redis
    # REDIS_URL=redis://10.x.x.x:6379/0
    # REDIS_SESSION_TTL=3600
    

3. ✅ Deployment

  • [ ] Build and deploy

    gcloud builds submit --config cloudbuild.slack.yaml
    

  • [ ] Deployment successful

    gcloud run services list | grep ifriend-slack-bot
    # Should show: ✔ READY
    

4. ✅ Post-Deployment Validation

  • [ ] Logs show Redis initialization

    gcloud run services logs read ifriend-slack-bot --region=us-central1 --limit=50 | grep Redis
    

Expected:

✅ SessionService inicializado
⚡ Redis: Performance otimizada (~10ms vs ~800ms CloudSQL)

  • [ ] No connection errors

    gcloud run services logs read ifriend-slack-bot --region=us-central1 --limit=100 | grep -i error
    # Should be empty or unrelated to Redis
    

  • [ ] Slack bot responds

    Slack: @IFriendBot test production deploy
    Bot: [response in < 1s]
    

  • [ ] Sessions created in Memorystore

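    # From Cloud Shell, SSH into a VM on the same VPC as Memorystore:
    gcloud compute ssh <instance-in-same-vpc>
    # Then, on that VM (SCAN avoids blocking the instance the way KEYS can):
    redis-cli -h 10.x.x.x --scan --pattern "session:*"
    # Should show sessions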
    

5. ✅ Performance Monitoring

  • [ ] Response time improved
      • Before: ~2-3s
      • After: < 1s
      • Improvement: 2-3x

  • [ ] Cloud Logging metrics

    resource.type="cloud_run_revision"
    jsonPayload.message=~"Redis"
    

  • [ ] Memorystore metrics

    gcloud redis instances describe ifriend-redis \
      --region=us-central1 \
      --format="get(currentLocationId,memorySizeGb)"
    

6. ✅ Rollback Plan Validated

  • [ ] Environment variable fallback tested

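    # Test rollback (--update-env-vars changes one variable; --set-env-vars would drop REDIS_URL and the rest)
    gcloud run services update ifriend-slack-bot \
      --region=us-central1 \
      --update-env-vars=SESSION_BACKEND=cloudsql
    
    # Verify still works
    # Test in Slack
    
    # Switch back to Redis
    gcloud run services update ifriend-slack-bot \
      --region=us-central1 \
      --update-env-vars=SESSION_BACKEND=redis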
    

  • [ ] Rollback documentation ready (REDIS_ROLLBACK_GUIDE.md)


📊 Success Metrics

Metric                 | Target      | Actual            | Status
-----------------------|-------------|-------------------|-------
append_event latency   | < 10ms avg  | ___ ms            |
Total response time    | < 1s        | ___ ms            |
Speedup vs CloudSQL    | 50-80x      | ___x              |
Session persistence    | ✅ Works    | ☐ Pass / ☐ Fail   |
TTL cleanup            | ✅ Auto     | ☐ Pass / ☐ Fail   |
Error rate             | < 1%        | ___%              |
Uptime                 | > 99%       | ___%              |
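
Once the blanks are filled in, the numeric targets can be evaluated in one pass. This is an illustrative sketch only: the dictionary keys are hypothetical, and treating the lower bound of the 50-80x range as the pass threshold is an assumption.

```python
def evaluate_metrics(measured: dict) -> dict:
    """Compare measured values with the numeric targets and return Pass/Fail per metric."""
    # Speedup is the ratio of average append latencies (CloudSQL over Redis).
    speedup = measured["cloudsql_avg_ms"] / measured["redis_avg_ms"]
    checks = {
        "append_event latency": measured["redis_avg_ms"] < 10,
        "Total response time": measured["response_ms"] < 1000,
        "Speedup vs CloudSQL": speedup >= 50,  # assumed threshold: low end of 50-80x
        "Error rate": measured["error_rate_pct"] < 1,
        "Uptime": measured["uptime_pct"] > 99,
    }
    return {name: "Pass" if ok else "Fail" for name, ok in checks.items()}
```

Session persistence and TTL cleanup are left out on purpose, since they are verified by the functional checks rather than by a number.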

🎯 Final Validation

All checks passed? ✅

  • [ ] Local testing complete
  • [ ] Performance benchmark shows 50-80x improvement
  • [ ] Integration testing successful
  • [ ] Error handling validated
  • [ ] Documentation complete
  • [ ] Production deploy successful
  • [ ] Post-deploy monitoring shows improvement
  • [ ] Rollback plan tested and documented

Ready for Production? 🚀

  • [ ] Team notified of deployment
  • [ ] Monitoring dashboards updated
  • [ ] On-call engineer aware of changes
  • [ ] Rollback procedure documented and tested

📞 Post-Deployment

Monitor for 24h:

  • [ ] Response times stable (< 1s)
  • [ ] No Redis connection errors
  • [ ] Memory usage stable (< 80%)
  • [ ] Session persistence working
  • [ ] TTL cleanup functioning
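
The memory-usage item above can be checked from the reply to Redis `INFO memory`, which reports `used_memory` and `maxmemory` in bytes. A minimal sketch, assuming the reply has already been parsed into a dict (for example by a client's `info("memory")` call); the helper names are hypothetical.

```python
def memory_usage_ratio(info: dict):
    """Return used_memory / maxmemory, or None when no memory limit is configured."""
    max_memory = int(info.get("maxmemory", 0))
    if max_memory <= 0:
        return None  # maxmemory=0 means "no limit", so a percentage is undefined
    return int(info["used_memory"]) / max_memory


def memory_is_stable(info: dict, threshold: float = 0.80) -> bool:
    """True when usage is below the threshold; False if it cannot be computed."""
    ratio = memory_usage_ratio(info)
    return ratio is not None and ratio < threshold
```

Sampling this a few times during the 24h window shows whether usage is flat or creeping toward the limit.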

After 1 week:

  • [ ] Compare metrics before/after
  • [ ] Document performance improvements
  • [ ] Update team on results
  • [ ] Consider migrating Memory to Redis too (optional)
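
If memory does move to Redis, the sorted-set indexing and cleanup listed for redis_memory_service.py could follow the pattern below. This is a hedged sketch, not the actual service: the memory:<user_id>:index key name, the function names, and the `client` parameter (assumed to behave like a redis.asyncio client) are all illustrative assumptions.

```python
import time
from typing import Optional


def _index_key(user_id: str) -> str:
    # Hypothetical key layout for the per-user temporal index.
    return f"memory:{user_id}:index"


async def index_memory(client, user_id: str, memory_id: str,
                       created_at: Optional[float] = None) -> None:
    """Register a memory in the sorted set, scored by its creation timestamp."""
    score = time.time() if created_at is None else created_at
    await client.zadd(_index_key(user_id), {memory_id: score})


async def recent_memories(client, user_id: str, limit: int = 10) -> list:
    """Return up to `limit` memory IDs, newest first."""
    return await client.zrevrange(_index_key(user_id), 0, limit - 1)


async def cleanup_old_memories(client, user_id: str, max_age_seconds: float,
                               now: Optional[float] = None) -> int:
    """Drop index entries older than the cutoff; returns how many were removed."""
    now = time.time() if now is None else now
    cutoff = now - max_age_seconds
    return await client.zremrangebyscore(_index_key(user_id), "-inf", cutoff)
```

Scoring by timestamp lets one ZREMRANGEBYSCORE call remove everything older than the cutoff without scanning individual memories.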

Checklist Owner: __
Date Started: __
Date Completed: __
Production Deploy Date: __


✅ All checks passed → Deploy to production
⚠️ Some checks failed → Fix issues and retest
❌ Critical checks failed → Do not deploy, investigate