Python Profiling¶
Find and fix performance bottlenecks in Python code.
CPU Profiling¶
cProfile (Built-in)¶
# Profile script
python -m cProfile -s cumtime script.py
# Save to file
python -m cProfile -o profile.prof script.py
# View results
python -m pstats profile.prof
Reading output:
ncalls tottime percall cumtime percall filename:lineno(function)
10000 0.500 0.000 2.500 0.000 slow_function
10000 2.000 0.000 2.000 0.000 inner_function
- ncalls: Number of calls
- tottime: Time in function (excluding subcalls)
- cumtime: Time in function (including subcalls)
Programmatic Profiling¶
import cProfile
import functools
import pstats
from io import StringIO

def profile_function(func):
    """Decorator to profile a function."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        profiler = cProfile.Profile()
        profiler.enable()
        try:
            result = func(*args, **kwargs)
        finally:
            profiler.disable()
        stream = StringIO()
        stats = pstats.Stats(profiler, stream=stream)
        stats.sort_stats('cumtime')
        stats.print_stats(20)
        print(stream.getvalue())
        return result
    return wrapper

@profile_function
def slow_function():
    # ... code to profile
    pass
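Since Python 3.8, `cProfile.Profile` also works as a context manager, which avoids the explicit enable/disable bookkeeping when you only need to profile one section of code. A minimal sketch with a toy workload:

```python
import cProfile
import pstats
from io import StringIO

def work():
    # Toy workload to profile
    return sum(i * i for i in range(100_000))

# enable()/disable() are handled by the context manager (Python 3.8+)
with cProfile.Profile() as profiler:
    work()

stream = StringIO()
stats = pstats.Stats(profiler, stream=stream)
stats.sort_stats('cumtime').print_stats(10)
report = stream.getvalue()
print(report)
```

The report uses the same columns (`ncalls`, `tottime`, `cumtime`) described above.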
py-spy (Sampling Profiler)¶
No code changes needed, minimal overhead.
# Install
pip install py-spy
# Profile running process
py-spy top --pid 12345
# Record to flamegraph
py-spy record -o profile.svg --pid 12345
# Profile a script
py-spy record -o profile.svg -- python script.py
line_profiler¶
Profile line-by-line with kernprof (`pip install line_profiler`). The `@profile` decorator is injected by kernprof at runtime, so no import is needed; run the script with `kernprof -l -v script.py`.
# script.py
@profile  # Decorator injected by kernprof, no import needed
def slow_function():
    result = []
    for i in range(1000):
        result.append(i ** 2)
    return result

slow_function()
Output columns:
Line #  Hits  Time  Per Hit  % Time  Line Contents
Async Profiling¶
import asyncio
import cProfile
import pstats

async def main():
    await slow_async_function()

# Profile async code
profiler = cProfile.Profile()
profiler.enable()
asyncio.run(main())
profiler.disable()

stats = pstats.Stats(profiler)
stats.sort_stats('cumtime')
stats.print_stats(20)
Memory Profiling¶
memory_profiler¶
from memory_profiler import profile

@profile
def memory_intensive():
    large_list = [i ** 2 for i in range(1000000)]
    return sum(large_list)

memory_intensive()
Output:
Line # Mem usage Increment Line Contents
4 50.0 MiB 0.0 MiB def memory_intensive():
5 88.5 MiB 38.5 MiB large_list = [...]
6 88.5 MiB 0.0 MiB return sum(large_list)
tracemalloc (Built-in)¶
import tracemalloc
tracemalloc.start()
# Run code
result = memory_intensive_function()
snapshot = tracemalloc.take_snapshot()
top_stats = snapshot.statistics('lineno')
print("Top 10 memory allocations:")
for stat in top_stats[:10]:
    print(stat)
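Besides snapshots, tracemalloc keeps running totals: `tracemalloc.get_traced_memory()` returns the current and peak traced sizes in bytes, which is handy for a quick before/after check without diffing snapshots. A minimal sketch:

```python
import tracemalloc

tracemalloc.start()

# Allocate roughly 10 MB, then release it
data = [bytes(1000) for _ in range(10_000)]
del data

# current drops after the free; peak records the high-water mark
current, peak = tracemalloc.get_traced_memory()
print(f"current={current / 1024:.1f} KiB, peak={peak / 1024:.1f} KiB")
tracemalloc.stop()
```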
objgraph¶
Visualize object references.
import objgraph
# Most common types
objgraph.show_most_common_types(limit=10)
# Growth between calls
objgraph.show_growth()
# Reference graph
obj = find_leaky_object()
objgraph.show_backrefs([obj], filename='refs.png')
Finding Memory Leaks¶
import gc
import tracemalloc

def find_leaks():
    tracemalloc.start()

    # Force garbage collection
    gc.collect()
    snapshot1 = tracemalloc.take_snapshot()

    # Run suspected leaky code
    for _ in range(100):
        leaky_function()

    gc.collect()
    snapshot2 = tracemalloc.take_snapshot()

    # Compare snapshots
    diff = snapshot2.compare_to(snapshot1, 'lineno')
    print("Memory growth:")
    for stat in diff[:10]:
        print(stat)

find_leaks()
Flame Graphs¶
Visual representation of where time is spent.
Reading flame graphs:

- Width = time spent
- Stack grows upward
- Plateaus = time in that function
- Wide bars at bottom = worth optimizing
FastAPI Profiling¶
from fastapi import FastAPI, Request
import cProfile
import pstats
from io import StringIO

app = FastAPI()

@app.middleware("http")
async def profile_requests(request: Request, call_next):
    if request.query_params.get("profile"):
        profiler = cProfile.Profile()
        profiler.enable()
        response = await call_next(request)
        profiler.disable()
        stream = StringIO()
        stats = pstats.Stats(profiler, stream=stream)
        stats.sort_stats('cumtime')
        stats.print_stats(30)
        print(stream.getvalue())
        return response
    return await call_next(request)

# Usage: GET /api/users?profile=1
Database Query Profiling¶
import logging
import time

from sqlalchemy import event
from sqlalchemy.engine import Engine

logger = logging.getLogger("sqlalchemy.engine")

@event.listens_for(Engine, "before_cursor_execute")
def before_cursor_execute(conn, cursor, statement, parameters, context, executemany):
    conn.info.setdefault('query_start_time', []).append(time.time())

@event.listens_for(Engine, "after_cursor_execute")
def after_cursor_execute(conn, cursor, statement, parameters, context, executemany):
    total = time.time() - conn.info['query_start_time'].pop()
    if total > 0.1:  # Log slow queries
        logger.warning(f"Slow query ({total:.2f}s): {statement[:100]}")
Profiling Tips¶
What to Profile¶
- Slow endpoints — Start with user-reported issues
- High-traffic paths — Small improvements = big impact
- Background jobs — Often overlooked
- Startup time — Affects deployment
Common Bottlenecks¶
| Symptom | Likely Cause |
|---|---|
| High CPU, low I/O | Algorithm inefficiency |
| Low CPU, high I/O | Database/network bound |
| Memory growth | Leaks, caching gone wrong |
| Spiky latency | GC pauses, locks |
Optimization Checklist¶
- Profile before optimizing
- Identify the actual bottleneck
- Make one change at a time
- Measure after each change
- Document what you changed and why
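For the "measure after each change" step, the standard library's `timeit` gives more stable numbers than a single wall-clock run, since it repeats the statement over several trials. A sketch comparing two versions of the earlier `slow_function`:

```python
import timeit

def v1():
    result = []
    for i in range(1000):
        result.append(i ** 2)
    return result

def v2():
    # One change at a time: replace the loop with a comprehension
    return [i ** 2 for i in range(1000)]

# repeat() returns one total per trial; take the minimum as the least-noisy estimate
t1 = min(timeit.repeat(v1, number=1000, repeat=5))
t2 = min(timeit.repeat(v2, number=1000, repeat=5))
print(f"loop+append: {t1:.4f}s  comprehension: {t2:.4f}s")
```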