Concurrency Fundamentals

The building blocks of concurrent programming.

Processes vs Threads

Processes

A process is an independent execution unit with its own memory space.

Process 1                Process 2
┌──────────────────┐    ┌──────────────────┐
│   Code           │    │   Code           │
│   Data           │    │   Data           │
│   Heap           │    │   Heap           │
│   Stack          │    │   Stack          │
│   File handles   │    │   File handles   │
└──────────────────┘    └──────────────────┘
     Isolated               Isolated

Characteristics:

- Own memory space (isolated)
- Heavier to create (memory allocation, OS setup)
- Communicate via IPC (pipes, sockets, shared memory)
- One process crash doesn't affect others
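A minimal sketch of IPC between processes, using Python's `multiprocessing.Pipe`. The names `ping` and `_worker` are illustrative, not from any library; the point is that the child process has its own memory, so the only way data moves between the two is through the pipe:

```python
import multiprocessing as mp

def _worker(conn):
    # The child process has its own copy of memory; data arrives
    # only through the pipe, never via shared variables.
    msg = conn.recv()
    conn.send(msg.upper())
    conn.close()

def ping(msg):
    """Round-trip a message through a child process via a Pipe."""
    parent_conn, child_conn = mp.Pipe()
    p = mp.Process(target=_worker, args=(child_conn,))
    p.start()
    parent_conn.send(msg)
    reply = parent_conn.recv()
    p.join()
    return reply

if __name__ == "__main__":
    print(ping("hello"))  # prints "HELLO"
```

Note the serialization step implied by `send`/`recv`: every message is pickled and copied across the process boundary, which is part of why IPC is slower than shared memory.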

Threads

A thread is a lightweight execution unit sharing memory with its process.

Process
┌────────────────────────────────────────────┐
│   Code (shared)                            │
│   Data (shared)                            │
│   Heap (shared)                            │
│ ┌──────────┐ ┌──────────┐ ┌──────────┐    │
│ │ Thread 1 │ │ Thread 2 │ │ Thread 3 │    │
│ │  Stack   │ │  Stack   │ │  Stack   │    │
│ └──────────┘ └──────────┘ └──────────┘    │
└────────────────────────────────────────────┘

Characteristics:

- Share memory with parent process
- Lighter to create (just a stack)
- Communicate via shared memory (fast, but tricky)
- One thread crash can crash the entire process
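The "fast, but tricky" part can be shown with Python's `threading` module. A sketch (the function name `count_with_threads` is made up for illustration): all threads mutate the same `counter`, which is exactly the shared-memory communication described above, and a lock is what keeps the concurrent `+=` from losing updates:

```python
import threading

def count_with_threads(n_threads=4, n_increments=100_000):
    counter = 0
    lock = threading.Lock()

    def increment():
        nonlocal counter
        for _ in range(n_increments):
            # Without the lock, the read-modify-write of `counter += 1`
            # can interleave between threads and lose updates.
            with lock:
                counter += 1

    threads = [threading.Thread(target=increment) for _ in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter

if __name__ == "__main__":
    print(count_with_threads())  # 400000
```

No pipes, no serialization: the threads communicate simply by reading and writing the same variable, at the cost of needing synchronization.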

Comparison

Aspect            Process       Thread
Memory            Isolated      Shared
Creation cost     High          Low
Communication     IPC (slow)    Shared memory (fast)
Context switch    Expensive     Cheaper
Crash isolation   Isolated      Affects all threads
Debugging         Easier        Harder

Concurrency vs Parallelism

Concurrency

Multiple tasks making progress, possibly by interleaving on one CPU.

CPU 0:  ─A─B─A─B─A─B─A─B─
        Time →

Tasks A and B are concurrent (interleaved)
but not parallel (not simultaneous)

Good for:

- I/O-bound tasks (waiting for network, disk)
- Keeping systems responsive
- Managing many connections

Parallelism

Multiple tasks executing simultaneously on multiple CPUs.

CPU 0:  ─A─A─A─A─A─A─A─A─
CPU 1:  ─B─B─B─B─B─B─B─B─
        Time →

Tasks A and B run in parallel (simultaneously)

Good for:

- CPU-bound tasks (computation)
- Processing large datasets
- Scientific computing

Combined

Real systems often use both.

CPU 0:  ─A1─B1─A1─B1─A1─B1─
CPU 1:  ─A2─B2─A2─B2─A2─B2─
        Time →

Concurrent (A and B interleaved) AND
Parallel (multiple CPUs)

Context Switching

A context switch happens when the OS switches the CPU from one thread or process to another.

What Happens

  1. Save current thread's state (registers, stack pointer)
  2. Store state in thread's control block
  3. Load next thread's state
  4. Resume execution

Cost

  • Thread switch: ~1-10 microseconds
  • Process switch: ~10-100 microseconds (more state to save, plus an address-space switch that invalidates TLB entries)

Why It Matters

Too many context switches = overhead eating your performance.

# Bad: too many tiny yields
for i in range(1000000):
    await asyncio.sleep(0)  # yields to the event loop on every iteration

# Better: batch work
for chunk in chunks(data, 1000):  # chunks() splits data into 1000-item batches
    process_chunk(chunk)
    await asyncio.sleep(0)  # occasional yield keeps other tasks responsive

CPU-Bound vs I/O-Bound

I/O-Bound

Task spends most time waiting for external operations.

Thread:  ─compute─[====wait for I/O====]─compute─

Examples:
- Network requests
- Database queries
- File operations
- User input

Strategy: concurrency (async/await, threads)

- While one task waits, another works
- Parallelism doesn't help much (the bottleneck is I/O)
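A sketch of the strategy using `asyncio` (here `fetch` is a stand-in for a real network call, not a library function): because each task spends its time waiting, ten of them overlap their waits and finish in roughly the time of one.

```python
import asyncio

async def fetch(i, delay=0.1):
    # Stand-in for a network request; the await yields the event
    # loop to other tasks while this one "waits for I/O".
    await asyncio.sleep(delay)
    return i

async def main(n=10):
    # 10 concurrent "requests" take ~0.1s total, not ~1s sequentially,
    # because their waits overlap on a single thread.
    return await asyncio.gather(*(fetch(i) for i in range(n)))

if __name__ == "__main__":
    print(asyncio.run(main()))
```

Note that only one coroutine runs at any instant; the speedup comes entirely from overlapping waits, which is why this approach does nothing for CPU-bound work.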

CPU-Bound

Task spends most time doing computation.

Thread:  ─compute─compute─compute─compute─compute─

Examples:
- Image processing
- Data transformation
- Mathematical computation
- Compression/encryption

Strategy: parallelism (multiple processes/workers)

- More CPUs = more computation per second
- Concurrency alone doesn't help (nothing to wait for)
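A sketch using `multiprocessing.Pool` (the `heavy` function is an arbitrary stand-in for real computation): each worker is a separate process with its own interpreter, so in CPython the work spreads across CPUs rather than serializing behind the GIL.

```python
import math
from multiprocessing import Pool

def heavy(n):
    # Stand-in for a CPU-bound computation.
    return sum(math.sqrt(i) for i in range(n))

def run_parallel(inputs, workers=4):
    # Each worker is a separate OS process; inputs and results are
    # pickled across the process boundary (the IPC cost from above).
    with Pool(workers) as pool:
        return pool.map(heavy, inputs)

if __name__ == "__main__":
    print(run_parallel([1_000_000] * 4))
```

The trade-off mirrors the process/thread table: real parallelism and crash isolation, paid for with serialization overhead on every input and result.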

Mixed Workloads

Real applications often have both.

Request:  ─[DB query]─process─[DB query]─process─respond
           I/O         CPU      I/O        CPU

Strategy: Use async for I/O, offload heavy CPU to workers.
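That strategy can be sketched with `asyncio` plus a `ProcessPoolExecutor` (here `fake_db_query` and `cpu_heavy` are illustrative placeholders): the event loop awaits the I/O steps directly, and `run_in_executor` ships the CPU step to a worker process so the loop stays responsive.

```python
import asyncio
from concurrent.futures import ProcessPoolExecutor

def cpu_heavy(n):
    # CPU-bound step; runs in a worker process so the event loop stays free.
    return sum(i * i for i in range(n))

async def fake_db_query(x):
    await asyncio.sleep(0.05)  # stand-in for an I/O wait
    return x

async def handle_request(pool, x):
    data = await fake_db_query(x)        # I/O: await it on the event loop
    loop = asyncio.get_running_loop()
    # CPU: offload to a worker process, awaiting the result without blocking
    return await loop.run_in_executor(pool, cpu_heavy, data)

async def main():
    with ProcessPoolExecutor() as pool:
        return await asyncio.gather(*(handle_request(pool, n) for n in (10, 20)))

if __name__ == "__main__":
    print(asyncio.run(main()))
```

Running `cpu_heavy` inline instead would stall every other request on the loop for the duration of the computation, which is exactly the failure mode this pattern avoids.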

User-Level vs Kernel-Level Threads

Kernel-Level (1:1)

Each user thread maps to one kernel thread.

User Space:     Thread 1    Thread 2    Thread 3
                   │           │           │
Kernel Space:  K-Thread 1  K-Thread 2  K-Thread 3

Characteristics:

- OS schedules threads
- Can run on multiple CPUs (true parallelism)
- System call overhead
- Used by: Java, C#, Rust, Python (threading)

User-Level (N:1)

Many user threads map to one kernel thread.

User Space:     Thread 1  Thread 2  Thread 3
                   └─────────┬─────────┘
Kernel Space:           K-Thread 1

Characteristics:

- User-space scheduling (fast context switch)
- Cannot run on multiple CPUs
- One blocking call blocks all
- Rarely used today

Hybrid (M:N)

Many user threads map to fewer kernel threads.

User Space:     T1  T2  T3  T4  T5  T6
                 └───┴───┘   └───┴───┘
Kernel Space:   K-Thread 1   K-Thread 2

Characteristics:

- Best of both worlds (potentially)
- Complex to implement
- Used by: Go (goroutines), Erlang (processes)

Green Threads / Coroutines

Lightweight threads managed in user space, not by OS.

# Python coroutines
async def task():
    await some_io()  # Yields control to the event loop; the OS thread isn't blocked

// Go goroutines
go func() {
    // Scheduled by the Go runtime, multiplexed onto a small pool of OS threads
}()

Characteristics:

- Very lightweight (thousands are fine)
- Cooperative scheduling (must yield explicitly)
- Cannot use multiple CPUs without help
- Great for I/O-bound concurrency
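"Thousands are fine" is easy to demonstrate with Python coroutines. A sketch (`tiny_task` is a made-up name): each coroutine costs roughly a Python object plus a small frame, so spawning ten thousand is routine, where ten thousand OS threads would exhaust memory on many systems.

```python
import asyncio

async def tiny_task(i):
    await asyncio.sleep(0)  # yield once to the scheduler
    return i

async def main(n=10_000):
    # Ten thousand coroutines on one OS thread; each is just a small
    # object managed by the event loop, not a kernel thread.
    results = await asyncio.gather(*(tiny_task(i) for i in range(n)))
    return len(results)

if __name__ == "__main__":
    print(asyncio.run(main()))
```

All of these run cooperatively on a single OS thread, which is why they scale so far for I/O-bound work yet can't add CPU parallelism on their own.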

Summary

Concept         When to Use
Threads         Shared memory needed, moderate parallelism
Processes       Isolation needed, CPU-bound parallelism
Async/await     Many I/O operations, high concurrency
Green threads   Massive concurrency, I/O-bound

Workload     Solution
I/O-bound    Async (Python asyncio, JS Promises)
CPU-bound    Parallel processes (Python multiprocessing, Web Workers)
Mixed        Async for I/O, offload CPU to workers