- Data at rest is an attack surface
- Breach = user data exposed
- GDPR/CCPA deletion requirements = data management headaches
- Compliance certifications required
- Backup/recovery complexity
Don't store customer data at all. Only store what's absolutely necessary.
Store: Only the user's Plaid access token in secure session Not Stored: Transactions, balances, credit scores, personal info
# User logs in with Plaid Link
# Plaid returns: access_token
# Your app: Store ONLY in encrypted session/cookie
# Customer visits dashboard
# Your app: Fetch fresh data from Plaid on-demand
# Display live data
# Discard after responsefrom flask import session
from cryptography.fernet import Fernet
class StatelessPlaidClient:
"""No database. Session-only token storage."""
def __init__(self, encryption_key):
self.cipher = Fernet(encryption_key)
def store_token_in_session(self, access_token):
"""Encrypt token and store in browser cookie (httpOnly)"""
encrypted = self.cipher.encrypt(access_token.encode())
session['plaid_token'] = encrypted.hex()
session.permanent = True
session.modified = True
def get_token_from_session(self):
"""Retrieve and decrypt token from session"""
if 'plaid_token' not in session:
raise ValueError("No Plaid token in session")
encrypted = bytes.fromhex(session['plaid_token'])
token = self.cipher.decrypt(encrypted).decode()
return token
def get_dashboard_data(self):
"""Fetch fresh data on-demand, don't store"""
token = self.get_token_from_session()
# Fetch from Plaid (live data)
accounts = plaid_client.accounts_get(access_token=token)
transactions = plaid_client.transactions_get(
access_token=token,
start_date='2024-09-13',
end_date='2024-12-13'
)
balances = plaid_client.accounts_balance_get(access_token=token)
# Return to browser
# Data is NOT stored in database
# Discarded after response
return {
'accounts': accounts.to_dict(),
'transactions': transactions.to_dict(),
'balances': balances.to_dict()
}
# Flask setup
app.secret_key = os.getenv('FLASK_SECRET_KEY')
app.config['SESSION_COOKIE_HTTPONLY'] = True # Can't be accessed by JavaScript
app.config['SESSION_COOKIE_SECURE'] = True # HTTPS only
app.config['PERMANENT_SESSION_LIFETIME'] = timedelta(hours=24)
@app.route('/')
def dashboard():
stateless = StatelessPlaidClient(os.getenv('ENCRYPTION_KEY'))
data = stateless.get_dashboard_data() # Fresh from Plaid
return render_template('dashboard.html', data=data)- Zero data breach risk - Nothing stored in database
- GDPR compliant - No customer data to delete (it never existed)
- No compliance certifications needed - You're not a data processor
- Simple deployment - No database required
- Cost-effective - No database server
- Always fresh data - Real-time from Plaid
- Plaid API calls on every page load - Slower (200-500ms per request)
- Rate limiting risk - Plaid has request limits
- No offline mode - Can't show data if Plaid is down
- No historical analysis - Can't show trends over time
- Session expires - User must re-authenticate
- MVP/Proof of concept
- Low-traffic apps (<1000 users)
- When you want zero data liability
Store: Only Plaid access token in secure session Not Stored: Transactions, balances, credit scores Why: Let Plaid store the data - they're compliance-certified
# Plaid Link => access_token
# Store token in session (encrypted, httpOnly cookie)
# Every request => fetch fresh from Plaid
# Plaid handles compliance, backups, securityUser Browser
↓
Session (encrypted cookie)
├─ plaid_access_token (encrypted)
└─ refresh_token (encrypted)
↓
Plaid API (your "database")
├─ Transactions (Plaid stores)
├─ Accounts (Plaid stores)
├─ Balances (Plaid stores)
└─ Credit data (Plaid stores)
Your Database = EMPTY (or only stores subscription/billing)
class PlaidAsDatabase:
"""Plaid is our database. We only store access tokens."""
def __init__(self):
self.plaid_client = plaid.ApiClient(
configuration=plaid.Configuration(
host=plaid.Environment.Sandbox,
api_key=os.getenv('PLAID_SECRET')
)
)
def get_transactions(self, access_token, days=90):
"""Query Plaid (your database)"""
start_date = (datetime.now() - timedelta(days=days)).date()
end_date = datetime.now().date()
response = self.plaid_client.transactions_get(
access_token=access_token,
start_date=start_date,
end_date=end_date
)
# Don't store in your database
# Return directly to frontend
return response.to_dict()
def get_spending_by_category(self, access_token):
"""Aggregate from Plaid, don't store"""
transactions = self.get_transactions(access_token)
# Aggregate in memory (don't store)
by_category = {}
for txn in transactions['transactions']:
category = txn['personal_finance_category']['primary']
if category not in by_category:
by_category[category] = 0
by_category[category] += abs(txn['amount'])
return by_category
def get_credit_score(self, access_token):
"""Query Plaid, don't store"""
response = self.plaid_client.credit_report_get(
access_token=access_token
)
score = response['credit_report']['scores'][0]['value']
return score
# Flask routes - no database queries
@app.route('/api/dashboard')
@require_auth
def get_dashboard():
token = session['plaid_token'] # From encrypted cookie
plaid_db = PlaidAsDatabase()
return jsonify({
'transactions': plaid_db.get_transactions(token, days=30),
'spending': plaid_db.get_spending_by_category(token),
'credit_score': plaid_db.get_credit_score(token)
})- Zero data breach risk - Nothing in your database
- GDPR/CCPA effortless - No customer data to manage
- Live data always - Real-time from Plaid
- Plaid handles compliance - SOC 2 Type II certified
- Plaid handles backups - Not your responsibility
- Plaid handles security - Enterprise security
- Plaid dependency - If Plaid is down, app is down
- Rate limiting - Plaid has API quotas per user
- Slower - API call per request (~200-500ms)
- No historical trends - Can't show "spending over 6 months"
- Costs scale - More users = more Plaid API calls
- Production apps where data liability is concern
- Companies without data engineering resources
- When Plaid uptime is acceptable
- Budget-conscious startups
Store: Access token + last 7 days of transactions (metadata only) Not Stored: Personal info, SSN, credit scores Benefit: Better UX + minimal compliance burden
class MinimalCacheArchitecture:
"""Cache only transaction metadata, not personal data"""
def __init__(self):
self.redis = redis.Redis() # In-memory cache
def sync_transactions(self, user_id, access_token):
"""Sync only transaction METADATA to Redis (TTL: 7 days)"""
# Fetch from Plaid
transactions = plaid_client.transactions_get(access_token)
# Store ONLY metadata (no sensitive data)
cache_data = [
{
'id': txn['transaction_id'],
'date': txn['date'],
'amount': txn['amount'],
'category': txn['personal_finance_category']['primary'],
'merchant': txn['merchant_name']
# NOT stored: name, address, SSN, credit info
}
for txn in transactions['transactions']
]
# Store in Redis with 7-day expiry
self.redis.setex(
f"user:{user_id}:transactions",
timedelta(days=7),
json.dumps(cache_data)
)
def get_dashboard(self, user_id, access_token):
"""Serve from cache, skip sensitive data"""
# Try cache first
cached = self.redis.get(f"user:{user_id}:transactions")
if cached:
return json.loads(cached) # Fast, cached response
# Cache miss: sync from Plaid
self.sync_transactions(user_id, access_token)
return self.get_dashboard(user_id, access_token)
# No SQL database needed, just Redis cache- Minimal data liability - Only transaction metadata
- Faster UX - Served from Redis cache
- Still GDPR-friendly - 7-day auto-deletion
- Scales better - Fewer Plaid API calls
- Simple compliance - No PII stored
- Still need infrastructure - Redis server required
- Cache invalidation - "Is cache stale?" complexity
- Eventual consistency - Data lags behind Plaid by up to 7 days
- Scaling from MVP to production
- Apps that need better performance
- When you want SOME caching without full database
Layer 1: httpOnly encrypted session cookie
- Plaid access token
- User ID
- Session metadata
Layer 2: Redis cache (optional, 7-day TTL)
- Transaction metadata only (no PII)
- Spending aggregates (calculated on-the-fly)
- User preferences
Layer 3: Postgres (minimal)
- Only: user ID, email, subscription status
- NO financial data
- NO personal data
User logs in
↓
Session cookie stores: access_token
↓
Check Redis for cached transactions
├─ Hit: Return from cache (fast)
└─ Miss: Fetch from Plaid + cache in Redis
↓
Postgres stores: { user_id, email, subscription }
(Just enough for billing/auth)
Data liability: MINIMAL
Performance: FAST
Compliance: TRIVIAL (no PII to manage)
from flask import session
import redis
from sqlalchemy import Column, String
from datetime import timedelta
# Minimal Postgres model
class User(db.Model):
id = Column(UUID, primary_key=True)
email = Column(String(255), unique=True) # Only email
subscription = Column(String(50)) # premium/free
created_at = Column(DateTime)
# NO credit data, NO transactions, NO personal info
# Session (encrypted cookie)
session['plaid_token'] # Plaid access token
session['user_id'] # User ID
session.cookie_secure = True
session.cookie_httponly = True
# Redis cache (7-day TTL)
redis.setex(
f"cache:txns:{user_id}",
timedelta(days=7),
transaction_data_json
)
# Hybrid dashboard
@app.route('/dashboard')
def dashboard():
user_id = session['user_id']
access_token = session['plaid_token']
# Try cache first
cached = redis.get(f"cache:txns:{user_id}")
if cached:
return render_template('dashboard.html',
transactions=json.loads(cached))
# Cache miss: fetch from Plaid
txns = plaid_client.transactions_get(access_token)
# Cache for 7 days
redis.setex(f"cache:txns:{user_id}",
timedelta(days=7),
json.dumps(txns))
return render_template('dashboard.html', transactions=txns)
# Compliance is trivial
# User deletion: Just clear their session + Redis cache
@app.route('/api/delete-me', methods=['DELETE'])
def delete_account():
user_id = session['user_id']
User.query.filter_by(id=user_id).delete() # Delete email only
redis.delete(f"cache:txns:{user_id}") # Clear cache
session.clear() # Clear session
db.commit()
return {'status': 'deleted'}| Feature | Session Only | Plaid as DB | Minimal Cache | Hybrid |
|---|---|---|---|---|
| Data Breach Risk | Zero ✅ | Zero ✅ | Very Low ✅ | Very Low ✅ |
| GDPR Compliance | Trivial | Trivial | Easy | Easy |
| API Calls/User | High | High | Low ✅ | Low ✅ |
| Performance | Slow | Slow | Fast ✅ | Fast ✅ |
| Cost | Low | Medium | Medium | Medium |
| Complexity | Low ✅ | Low ✅ | Medium | Medium |
| Infrastructure | Nothing | Nothing | Redis | Redis + Postgres |
| Can Show Trends | No | No | Depends | Yes ✅ |
| Offline Mode | No | No | 7 days | 7 days |
Use Option 1: Session-Only
- No database
- No infrastructure
- Deploy to Render free tier
- Focus on features
- Acceptable performance for low traffic
Switch to Option 4: Hybrid
- Redis cache for performance
- Minimal Postgres for billing/auth
- Still minimal compliance burden
- Scales to 10k+ users
Add real database only if:
- You need 6+ months of historical data
- You're offering trend analysis
- You're willing to invest in compliance
- User data is a business asset
With stateless/minimal storage:
❌ Data breach = user info NOT exposed
(Only encrypted token was stored)
❌ Ransomware = attacker gets NOTHING valuable
(No customer data to exfiltrate)
❌ Disgruntled employee = can't steal customer data
(They can steal code, not data)
❌ SQL injection = no customer data exposed
(Attacker gets subscription status, not transactions)
✅ GDPR requests = "We don't store your data"
(Literally nothing to export/delete)
Choose Option 1 (Session-Only) if:
- Building MVP
- <1000 users
- Want zero infrastructure
- Can tolerate 200-500ms latency
- Acceptable Plaid API limits
Choose Option 4 (Hybrid) if:
- Ready for production
- Need better performance
- Want historical data (but not 6+ months)
- Have some infrastructure budget
- Need faster GDPR responses
Choose traditional database only if:
- Offering complex analytics
- Need 6+ months of history
- Building ML features
- Willing to invest in compliance
- Data is your product
-
Week 1: Refactor to session-only storage
- Remove user data storage from app.py
- Keep only: user_id, plaid_token in session
- Update templates to request fresh data
-
Week 2: Add Redis caching (optional)
- Cache transaction metadata only
- 7-day TTL
- No PII/sensitive data
-
Week 3: Simplify database
- Keep only: User (id, email, subscription)
- Move everything else to session/cache
-
Week 4: Document compliance
- "We don't store customer financial data"
- "Data is encrypted in transit"
- "GDPR: No data to export/delete"
You can eliminate ~95% of your data compliance burden by not storing customer data.
It seems counterintuitive, but:
- Storing less data = fewer security concerns
- Fewer security concerns = easier compliance
- Easier compliance = less expensive to operate
- Less expensive = more profit margin
The question isn't "How do we securely store customer data?" The question is "Do we need to store customer data at all?"
Often the answer is: No.