✅ Redis Implementation Validation Checklist
Use this checklist to validate the Redis implementation before deploying to production.
📋 Pre-Deployment Checklist
1. ✅ Code Implementation
- [ ] redis_session_service.py created (450 lines)
  - [ ] RPUSH for append_event() (O(1))
  - [ ] Pipeline transactions for atomicity
  - [ ] Automatic TTL on all keys
  - [ ] Correct serialization/deserialization of events
- [ ] redis_memory_service.py created (200 lines)
  - [ ] Sorted sets for time-based indexing
  - [ ] Cleanup of old memories
- [ ] backends.py created (220 lines)
  - [ ] Factory get_session_service(backend)
  - [ ] Factory get_memory_service(backend)
  - [ ] Supports: redis, cloudsql, inmemory
- [ ] slack_bot.py modified
  - [ ] Imports get_session_service and get_memory_service
  - [ ] Uses factory functions
  - [ ] Backend logging configured
- [ ] Package exports updated
  - [ ] session/__init__.py exports RedisSessionService
  - [ ] memory/__init__.py exports RedisMemoryService
- [ ] requirements.txt updated
  - [ ] redis[asyncio]>=5.0.0 added
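The RPUSH, pipeline, and TTL items above can be illustrated with a minimal sketch. It is not the contents of redis_session_service.py: the function signature, the JSON serialization, and the decision to refresh both keys on every append are assumptions. The `client` argument is assumed to behave like `redis.asyncio.Redis`, and the key names follow the `session:<id>:meta` / `session:<id>:events` layout used later in this checklist.

```python
import asyncio
import json
from typing import Any

SESSION_TTL_SECONDS = 3600  # mirrors REDIS_SESSION_TTL=3600 from the configuration step


async def append_event(client: Any, session_id: str, event: dict,
                       ttl: int = SESSION_TTL_SECONDS) -> int:
    """Append one event to the session's list and refresh both TTLs in one transaction.

    `client` is assumed to expose the redis.asyncio.Redis interface.
    Returns the length of the events list after the push.
    """
    events_key = f"session:{session_id}:events"
    meta_key = f"session:{session_id}:meta"
    payload = json.dumps(event, separators=(",", ":"))

    # transaction=True queues the commands and sends them inside MULTI/EXEC.
    pipe = client.pipeline(transaction=True)
    pipe.rpush(events_key, payload)  # O(1) append at the tail of the list
    pipe.expire(events_key, ttl)     # refresh the TTL on the events list
    pipe.expire(meta_key, ttl)       # keep the metadata alive as long as the events
    results = await pipe.execute()
    return results[0]                # reply of RPUSH: the new list length
```

With a real server this would be called as `await append_event(redis.asyncio.from_url(REDIS_URL), session.id, {...})`; reading the session back is then a single `LRANGE events_key 0 -1`.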
2. ✅ Configuration
- [ ] .env created (based on .env.redis.example)

      SESSION_BACKEND=redis
      MEMORY_BACKEND=cloudsql
      REDIS_URL=redis://localhost:6379/0
      REDIS_SESSION_TTL=3600

- [ ] Variables validated

      python -c "
      import os
      from dotenv import load_dotenv
      load_dotenv()
      assert os.getenv('SESSION_BACKEND') == 'redis'
      assert os.getenv('REDIS_URL', '').startswith('redis://')
      assert int(os.getenv('REDIS_SESSION_TTL', '0')) > 0
      print('✅ Config is valid')
      "
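The same validation can be written as a reusable function that reports every problem at once instead of stopping at the first failed assert. This is a sketch under assumptions: the function name and the error messages are illustrative, and accepting `rediss://` (the TLS scheme understood by redis-py) goes beyond what the one-liner above checks.

```python
from urllib.parse import urlparse


def validate_redis_config(env: dict) -> list:
    """Return a list of problems found in the config; an empty list means it looks valid.

    `env` is expected to map variable names to strings, like os.environ.
    """
    problems = []

    if env.get("SESSION_BACKEND") != "redis":
        problems.append("SESSION_BACKEND must be 'redis'")

    url = urlparse(env.get("REDIS_URL", ""))
    if url.scheme not in ("redis", "rediss"):  # rediss:// is an assumption (TLS)
        problems.append("REDIS_URL must start with redis:// or rediss://")
    elif not url.hostname:
        problems.append("REDIS_URL has no host")

    ttl = env.get("REDIS_SESSION_TTL", "")
    if not ttl.isdigit() or int(ttl) <= 0:
        problems.append("REDIS_SESSION_TTL must be a positive integer")

    return problems
```

Calling `validate_redis_config(dict(os.environ))` after `load_dotenv()` would give the same pass/fail answer as the shell check, with a message per failing variable.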
3. ✅ Local Environment
- [ ] Docker Redis running

      docker run -d --name ifriend-redis -p 6379:6379 redis:7-alpine
      docker ps | grep redis
      # Should show the running container

- [ ] Redis connection working

      redis-cli ping
      # Should return: PONG

- [ ] Dependencies installed

      pip install -r ifriend_agent/requirements.txt
      pip list | grep redis
      # Should show redis 5.0.0 or later
4. ✅ Functional Testing
- [ ] Bot starts without errors

      python slack_bot.py

  Expected logs:

      ✅ SessionService inicializado
      ✅ MemoryService inicializado
      ✅ Runner configurado
      • Session: redis
      • Memory: cloudsql
      ⚡ Redis: Performance otimizada (~10ms vs ~800ms CloudSQL)

- [ ] Session creation works

      python -c "
      import asyncio
      from ifriend_agent.config.backends import get_session_service

      async def test():
          svc = get_session_service('redis')
          session = await svc.create_session('test_app', 'test_user')
          print(f'✅ Session created: {session.id}')

          # Verify in Redis
          import redis.asyncio as redis
          r = await redis.from_url('redis://localhost:6379/0')
          exists = await r.exists(f'session:{session.id}:meta')
          # No exclamation mark in the message: inside double quotes an
          # interactive bash would try history expansion on it.
          assert exists, 'Session not found in Redis'
          print('✅ Session verified in Redis')

      asyncio.run(test())
      "

- [ ] Event append works

      python -c "
      import asyncio
      from ifriend_agent.config.backends import get_session_service
      from google.genai import types

      async def test():
          svc = get_session_service('redis')
          session = await svc.create_session('test_app', 'test_user')

          # Append event
          event = types.LiveClientMessageEvent(
              client_message=types.LiveClientMessage(
                  turns=[types.LiveClientContent(
                      turn_id='test_turn',
                      parts=[types.Part(text='Test message')]
                  )]
              )
          )
          await svc.append_event(session.id, event)
          print('✅ Event appended')

          # Verify
          loaded = await svc.get_session(session.id)
          assert len(loaded.events) == 2, f'Expected 2 events, got {len(loaded.events)}'
          print(f'✅ Session has {len(loaded.events)} events')

      asyncio.run(test())
      "

- [ ] TTL cleanup works

      redis-cli TTL "session:<session_id>:meta"
      # Replace <session_id> with an ID printed by the checks above.
      # Should return: a positive number (seconds until expiration)
      # NOT -1 (no expiration) or -2 (key does not exist)
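The three possible outcomes of the TTL check can be encoded in a small helper, which is handy when the check is automated instead of read off the terminal. The function name and the status labels are illustrative; the meaning of -1 and -2 is the standard reply of the Redis TTL command.

```python
def describe_ttl(ttl: int) -> str:
    """Classify the integer reply of the Redis TTL command."""
    if ttl == -2:
        return "missing"      # the key does not exist
    if ttl == -1:
        return "no-expiry"    # the key exists but has no TTL: cleanup will never happen
    if ttl >= 0:
        return "expiring"     # seconds left until Redis deletes the key
    raise ValueError(f"unexpected TTL reply: {ttl}")


def ttl_check_passes(ttl: int) -> bool:
    """The checklist item passes only when the key exists and will expire."""
    return describe_ttl(ttl) == "expiring"
```

With redis-py this would be fed by `await client.ttl(key)`, which returns the same integer that `redis-cli TTL` prints.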
5. ✅ Performance Testing
- [ ] Run benchmark

      python benchmark_session_performance.py

  Expected results:

      Redis:
        Average: 2-10ms
        P95: 8-15ms
        P99: 15-25ms
      CloudSQL:
        Average: 150-200ms
        P95: 300-400ms
        P99: 600-800ms
      ⚡ Redis is 50-80x FASTER than CloudSQL

- [ ] Verify O(1) append
  - Redis average should be < 10ms
  - Redis P95 should be < 20ms
  - Redis should be 50-80x faster than CloudSQL
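For reference, the average, P95, P99, and speedup figures above can be derived from raw latency samples as follows. This is a sketch, not the contents of benchmark_session_performance.py: the function names are illustrative, and the nearest-rank percentile definition is an assumption (the benchmark script may interpolate instead).

```python
import math
import statistics


def percentile(samples, pct):
    """Nearest-rank percentile: the smallest sample with at least pct% of samples at or below it."""
    if not samples:
        raise ValueError("no samples")
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def summarize(samples_ms):
    """Average, P95 and P99 of a list of latencies in milliseconds."""
    return {
        "avg": statistics.fmean(samples_ms),
        "p95": percentile(samples_ms, 95),
        "p99": percentile(samples_ms, 99),
    }


def speedup(baseline_ms, candidate_ms):
    """How many times faster the candidate is, comparing average latencies."""
    return statistics.fmean(baseline_ms) / statistics.fmean(candidate_ms)
```

The speedup here compares averages; comparing P95 or P99 instead can give a noticeably different ratio, so it is worth stating which one a reported "Nx faster" refers to.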
6. ✅ Integration Testing (Slack)
- [ ] Bot responds to messages

      Slack: @IFriendBot hello
      Bot: [response in < 1s]

- [ ] Check Redis keys created

      redis-cli KEYS "session:*"
      # Should show: session:slack_CHANNEL_USER_THREAD:meta
      #              session:slack_CHANNEL_USER_THREAD:events

- [ ] Verify session persists

      Slack: @IFriendBot first message
      Bot: [response]
      Slack: @IFriendBot continue conversation
      Bot: [response with context from first message]

- [ ] Multi-turn conversation works

      redis-cli LLEN "session:slack_C123_U456_T789:events"
      # Should increase with each message
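The key names checked above follow a fixed pattern, which a pair of helpers makes explicit. Only the resulting strings come from this checklist; the helper names are hypothetical, and the bot may build its keys differently internally.

```python
def slack_session_id(channel: str, user: str, thread: str) -> str:
    """Build the session ID used for a Slack thread: slack_CHANNEL_USER_THREAD."""
    return f"slack_{channel}_{user}_{thread}"


def session_keys(session_id: str) -> dict:
    """Return the two Redis keys that belong to one session."""
    return {
        "meta": f"session:{session_id}:meta",      # session metadata
        "events": f"session:{session_id}:events",  # list of events, one entry per append
    }
```

For example, `session_keys(slack_session_id("C123", "U456", "T789"))["events"]` gives the key passed to `LLEN` in the multi-turn check.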
7. ✅ Error Handling
- [ ] Graceful Redis failure

      # Stop Redis (use the container ID from docker ps if the container has no name)
      docker stop ifriend-redis

      # Bot should show an error but not crash
      python slack_bot.py
      # Expected: "❌ Erro ao inicializar SessionService"

- [ ] Fallback to CloudSQL works

      SESSION_BACKEND=cloudsql python slack_bot.py
      # Should work (slower but functional)

- [ ] Fallback to InMemory works

      SESSION_BACKEND=inmemory python slack_bot.py
      # Should work (fast but no persistence)
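The fallback checks rely on the backend being chosen from SESSION_BACKEND at startup. A minimal sketch of such a factory is shown below. It is not the code in backends.py: the three classes are hypothetical stand-ins, and the `env` parameter, the `inmemory` default, and the error message are assumptions.

```python
import os
from typing import Mapping, Optional


class InMemorySessionService:
    """Hypothetical stand-in: fast, but sessions are lost on restart."""


class CloudSQLSessionService:
    """Hypothetical stand-in for the CloudSQL-backed service."""


class RedisSessionService:
    """Hypothetical stand-in; the real class lives in redis_session_service.py."""

    def __init__(self, url: str, ttl: int) -> None:
        self.url = url
        self.ttl = ttl


def get_session_service(backend: Optional[str] = None,
                        env: Mapping[str, str] = os.environ):
    """Pick a session service by name, falling back to the SESSION_BACKEND variable."""
    name = (backend or env.get("SESSION_BACKEND", "inmemory")).lower()
    if name == "redis":
        return RedisSessionService(
            url=env.get("REDIS_URL", "redis://localhost:6379/0"),
            ttl=int(env.get("REDIS_SESSION_TTL", "3600")),
        )
    if name == "cloudsql":
        return CloudSQLSessionService()
    if name == "inmemory":
        return InMemorySessionService()
    # Fail loudly on typos instead of silently picking a default backend.
    raise ValueError(f"Unknown session backend: {name!r}")
```

Keeping the choice behind one environment variable is what makes the rollback a configuration change rather than a code change.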
8. ✅ Documentation
- [ ] README_REDIS.md created
- [ ] docs/REDIS_SETUP.md created with complete setup instructions
- [ ] .env.redis.example created with template
- [ ] REDIS_IMPLEMENTATION_SUMMARY.md created
- [ ] REDIS_ROLLBACK_GUIDE.md created
- [ ] benchmark_session_performance.py created
🚀 Production Deployment Checklist
1. ✅ Google Cloud Infrastructure
- [ ] Memorystore instance created

      gcloud redis instances create ifriend-redis \
        --size=1 \
        --region=us-central1 \
        --tier=standard \
        --redis-version=redis_7_0

- [ ] VPC Connector created

      gcloud compute networks vpc-access connectors create ifriend-connector \
        --region=us-central1 \
        --network=default \
        --range=10.8.0.0/28

- [ ] Memorystore IP obtained

      gcloud redis instances describe ifriend-redis \
        --region=us-central1 \
        --format="get(host)"
      # Note the IP: 10.x.x.x
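Before creating the connector, it can help to sanity-check the chosen range. Serverless VPC Access connectors take a /28 range that must not overlap existing subnets; the sketch below checks both with the standard library. The function name is illustrative, and the subnet list is something you would fill in from your own network.

```python
import ipaddress


def check_connector_range(cidr: str, existing_subnets: list) -> list:
    """Return a list of problems with a proposed VPC connector range (empty list = OK)."""
    network = ipaddress.ip_network(cidr, strict=True)  # raises ValueError on host bits set
    problems = []

    if network.prefixlen != 28:
        problems.append(f"{cidr} is a /{network.prefixlen}; a connector range must be a /28")

    for subnet in existing_subnets:
        if network.overlaps(ipaddress.ip_network(subnet)):
            problems.append(f"{cidr} overlaps existing subnet {subnet}")

    return problems
```

For the range used above, `check_connector_range("10.8.0.0/28", [...])` should return an empty list as long as none of the listed subnets covers 10.8.0.0 to 10.8.0.15.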
2. ✅ Cloud Run Configuration
- [ ] cloudbuild.yaml updated with VPC connector

      - name: 'gcr.io/cloud-builders/gcloud'
        args:
          - 'run'
          - 'deploy'
          - 'ifriend-slack-bot'
          - '--vpc-connector=ifriend-connector'
          - '--set-env-vars=SESSION_BACKEND=redis,REDIS_URL=redis://10.x.x.x:6379/0,REDIS_SESSION_TTL=3600'

- [ ] Environment variables configured

      gcloud run services describe ifriend-slack-bot \
        --region=us-central1 \
        --format="get(spec.template.spec.containers[0].env)"
      # Should include:
      # SESSION_BACKEND=redis
      # REDIS_URL=redis://10.x.x.x:6379/0
      # REDIS_SESSION_TTL=3600
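The environment check can also be scripted against the service description in JSON form (`--format=json`), which is easier to compare than the flattened `get(...)` output. The sketch below assumes the same `spec.template.spec.containers[0].env` path used in the command above, with each entry holding `name` and `value`; the function name is illustrative.

```python
def mismatched_env_vars(service: dict, expected: dict) -> dict:
    """Compare a Cloud Run service description with the expected variables.

    Returns {name: actual_value} for every expected variable that is missing
    (actual_value is None) or set to a different value. Empty dict = all good.
    """
    containers = service["spec"]["template"]["spec"]["containers"]
    actual = {item["name"]: item.get("value") for item in containers[0].get("env", [])}
    return {
        name: actual.get(name)
        for name, wanted in expected.items()
        if actual.get(name) != wanted
    }
```

The input would come from `json.loads(...)` on the output of `gcloud run services describe ifriend-slack-bot --region=us-central1 --format=json`.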
3. ✅ Deployment
- [ ] Build and deploy

      gcloud builds submit --config cloudbuild.slack.yaml

- [ ] Deployment successful

      gcloud run services list | grep ifriend-slack-bot
      # Should show: ✔ READY
4. ✅ Post-Deployment Validation
- [ ] Logs show Redis initialization

      gcloud run services logs read ifriend-slack-bot --region=us-central1 --limit 50 | grep -E "Redis|SessionService"

  Expected:

      ✅ SessionService inicializado
      ⚡ Redis: Performance otimizada (~10ms vs ~800ms CloudSQL)

- [ ] No connection errors

      gcloud run services logs read ifriend-slack-bot --region=us-central1 --limit 100 | grep -i erro
      # "erro" matches both "error" and the bot's Portuguese "Erro ..." messages
      # Should be empty or unrelated to Redis

- [ ] Slack bot responds

      Slack: @IFriendBot test production deploy
      Bot: [response in < 1s]

- [ ] Sessions created in Memorystore

      # From Cloud Shell, SSH into a VM on the same VPC as Memorystore:
      gcloud compute ssh <instance-in-same-vpc>
      # Then, on that VM (--scan iterates incrementally instead of blocking the server like KEYS):
      redis-cli -h 10.x.x.x --scan --pattern "session:*"
      # Should show sessions
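The "empty or unrelated to Redis" judgment on the error logs can be made mechanical with a small filter. This is a sketch: the function name and the default subject keywords are assumptions, chosen to match the log messages quoted in this checklist.

```python
def session_backend_error_lines(lines, subjects=("redis", "sessionservice")):
    """Return the log lines that look like errors and mention the session backend.

    Matching is case-insensitive. The substring "erro" is used on purpose so that
    both English "error" and Portuguese "Erro" are caught.
    """
    hits = []
    for line in lines:
        lowered = line.lower()
        if "erro" in lowered and any(subject in lowered for subject in subjects):
            hits.append(line)
    return hits
```

An empty result for the last 100 log lines corresponds to the "No connection errors" item passing.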
5. ✅ Performance Monitoring
- [ ] Response time improved
  - Before: ~2-3s
  - After: < 1s
  - Improvement: 2-3x
- [ ] Cloud Logging metrics

      resource.type="cloud_run_revision"
      jsonPayload.message=~"Redis"

- [ ] Memorystore metrics

      gcloud redis instances describe ifriend-redis \
        --region=us-central1 \
        --format="get(currentLocationId,memorySizeGb)"
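6. ✅ Rollback Plan Validated
- [ ] Environment variable fallback tested

      # Test rollback. Use --update-env-vars so only this variable changes;
      # --set-env-vars would remove every other variable (REDIS_URL, REDIS_SESSION_TTL, ...).
      gcloud run services update ifriend-slack-bot \
        --region=us-central1 \
        --update-env-vars=SESSION_BACKEND=cloudsql

      # Verify the bot still works
      # Test in Slack

      # Switch back to Redis
      gcloud run services update ifriend-slack-bot \
        --region=us-central1 \
        --update-env-vars=SESSION_BACKEND=redis

- [ ] Rollback documentation ready (REDIS_ROLLBACK_GUIDE.md)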
📊 Success Metrics
| Metric | Target | Actual | Status |
|---|---|---|---|
| append_event latency | < 10ms avg | ___ ms | ☐ |
| Total response time | < 1s | ___ ms | ☐ |
| Speedup vs CloudSQL | 50-80x | ___x | ☐ |
| Session persistence | ✅ Works | ☐ Pass / ☐ Fail | ☐ |
| TTL cleanup | ✅ Auto | ☐ Pass / ☐ Fail | ☐ |
| Error rate | < 1% | ___% | ☐ |
| Uptime | > 99% | ___% | ☐ |
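Once the "Actual" column is filled in, the numeric rows of the table can be checked against their targets automatically. The sketch below is an assumption-laden illustration: the dictionary keys are hypothetical names, and "50-80x" is read as "at least 50x".

```python
def evaluate_metrics(actual: dict) -> dict:
    """Return pass/fail per numeric metric, using the targets from the table."""
    return {
        "append_event latency": actual["append_event_avg_ms"] < 10,   # < 10ms avg
        "Total response time": actual["response_time_ms"] < 1000,     # < 1s
        "Speedup vs CloudSQL": actual["speedup"] >= 50,               # 50-80x, read as >= 50
        "Error rate": actual["error_rate_pct"] < 1,                   # < 1%
        "Uptime": actual["uptime_pct"] > 99,                          # > 99%
    }


def all_targets_met(actual: dict) -> bool:
    """True only when every numeric target is met."""
    return all(evaluate_metrics(actual).values())
```

Session persistence and TTL cleanup stay manual pass/fail rows, since they are verified by the functional checks rather than by a number.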
🎯 Final Validation
All checks passed? ✅
- [ ] Local testing complete
- [ ] Performance benchmark shows 50-80x improvement
- [ ] Integration testing successful
- [ ] Error handling validated
- [ ] Documentation complete
- [ ] Production deploy successful
- [ ] Post-deploy monitoring shows improvement
- [ ] Rollback plan tested and documented
Ready for Production? 🚀
- [ ] Team notified of deployment
- [ ] Monitoring dashboards updated
- [ ] On-call engineer aware of changes
- [ ] Rollback procedure documented and tested
📞 Post-Deployment
Monitor for 24h:
- [ ] Response times stable (< 1s)
- [ ] No Redis connection errors
- [ ] Memory usage stable (< 80%)
- [ ] Session persistence working
- [ ] TTL cleanup functioning
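The "Memory usage stable (< 80%)" item can be checked from the `used_memory` and `maxmemory` fields of the Redis `INFO memory` reply. The sketch below assumes the reply has already been parsed into a dict, as redis-py's `info("memory")` does; the function names are illustrative, and treating an instance with no configured limit as healthy is an assumption.

```python
from typing import Optional


def memory_usage_ratio(info: dict) -> Optional[float]:
    """Return used_memory / maxmemory, or None when no memory limit is configured."""
    max_memory = int(info.get("maxmemory", 0))
    if max_memory <= 0:  # maxmemory 0 means "no limit" in Redis
        return None
    return int(info["used_memory"]) / max_memory


def memory_is_healthy(info: dict, threshold: float = 0.80) -> bool:
    """True when usage is below the threshold (80% by default, per the checklist)."""
    ratio = memory_usage_ratio(info)
    return ratio is None or ratio < threshold
```

Sampling this a few times over the 24-hour window shows whether usage is flat or creeping up, which is the signal that TTL cleanup is or is not keeping pace with new sessions.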
After 1 week:
- [ ] Compare metrics before/after
- [ ] Document performance improvements
- [ ] Update team on results
- [ ] Consider migrating Memory to Redis too (optional)
Checklist Owner: __
Date Started: __
Date Completed: __
Production Deploy Date: __
✅ All checks passed → Deploy to production
⚠️ Some checks failed → Fix issues and retest
❌ Critical checks failed → Do not deploy, investigate