Python developers often face tasks that can benefit from running code in parallel, whether that’s making multiple web requests, processing large datasets, or leveraging modern multi-core CPUs. Traditionally, handling parallelism in Python meant dealing with the low-level and sometimes unwieldy threading and multiprocessing modules. Enter concurrent.futures: a high-level module that makes concurrent programming in Python simple, readable, and highly effective.
What is concurrent.futures?
Introduced in Python 3.2, the concurrent.futures module provides a clean, unified interface for asynchronously executing callables (functions or methods) using either threads (ThreadPoolExecutor) or processes (ProcessPoolExecutor). This abstraction lets you focus on what you want to run in parallel, instead of how to juggle threads and processes manually.
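Whichever executor you pick, the workflow is the same: submit a callable, get a Future back, and ask the Future for its result. Here is a minimal sketch; the slow_task function and its one-second sleep are invented purely for illustration:

import concurrent.futures
import time

def slow_task(n):
    # Stand-in for real work; the sleep is purely illustrative.
    time.sleep(1)
    return n * 10

with concurrent.futures.ThreadPoolExecutor() as executor:
    future = executor.submit(slow_task, 3)  # schedule the call, get a Future back
    print(future.done())    # probably False: the task is still sleeping
    print(future.result())  # blocks until the task finishes, then prints 30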
When Should You Use It?
- I/O-bound tasks, such as downloading files, scraping websites, or interacting with databases, benefit from threads (ThreadPoolExecutor).
- CPU-bound tasks, like image processing or heavy number crunching, lean on processes (ProcessPoolExecutor) to sidestep Python’s Global Interpreter Lock (GIL); switching between the two is usually a one-line change, as sketched after this list.
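Because both executors share one interface, choosing between them can be as simple as swapping the class. A rough sketch, where the work function and the io_bound flag are hypothetical placeholders:

import concurrent.futures

def work(x):
    # Placeholder for either an I/O call or a CPU-heavy computation.
    return x * x

io_bound = True  # hypothetical flag describing the workload

executor_cls = (concurrent.futures.ThreadPoolExecutor if io_bound
                else concurrent.futures.ProcessPoolExecutor)

if __name__ == '__main__':  # guard needed in case ProcessPoolExecutor is chosen
    with executor_cls() as executor:
        print(list(executor.map(work, range(5))))  # [0, 1, 4, 9, 16]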
Quick Start: Parallel Web Requests
Here’s how you can quickly spin off multiple web requests using the ThreadPoolExecutor:
import concurrent.futures
import requests

urls = [
    'https://www.python.org',
    'https://docs.python.org/3/',
    'https://realpython.com/'
]

def fetch(url):
    response = requests.get(url)
    return url, response.status_code

with concurrent.futures.ThreadPoolExecutor() as executor:
    # Submit every URL and map each Future back to the URL it belongs to.
    futures = {executor.submit(fetch, url): url for url in urls}
    # Handle results in completion order, not submission order.
    for future in concurrent.futures.as_completed(futures):
        url, status = future.result()
        print(f"{url} returned status {status}")
Effortless Parallel Processing
Suppose you need to square a list of numbers, a classic CPU-bound task. Here’s how to use ProcessPoolExecutor:
import concurrent.futures

def square(n):
    return n * n

numbers = [1, 2, 3, 4, 5]

if __name__ == '__main__':  # guards against re-import when worker processes start
    with concurrent.futures.ProcessPoolExecutor() as executor:
        results = executor.map(square, numbers)
        print(list(results))  # [1, 4, 9, 16, 25]
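One related knob worth knowing: executor.map accepts a chunksize argument (available since Python 3.5, and only meaningful with ProcessPoolExecutor) that batches items to reduce inter-process communication overhead for large inputs. The batch size of 100 below is an arbitrary choice:

import concurrent.futures

def square(n):
    return n * n

if __name__ == '__main__':
    with concurrent.futures.ProcessPoolExecutor() as executor:
        # Ship work to the worker processes in batches of 100 items instead of one at a time.
        results = executor.map(square, range(10_000), chunksize=100)
        print(sum(results))  # 333283335000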
Key Features
- Futures: Represent ongoing operations and make it easy to track task completion or retrieve results later.
- as_completed: Enables you to process results as soon as each task finishes (not just in submission order).
- Exception handling: Any exception raised in a worker thread or process is re-raised when you call future.result(), keeping error handling predictable (see the sketch after this list).
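To make the exception-handling behaviour concrete, here is a small sketch; the flaky_fetch function and its failing input are made up for illustration:

import concurrent.futures

def flaky_fetch(url):
    # Hypothetical worker that fails for one particular input.
    if 'bad' in url:
        raise ValueError(f"cannot fetch {url}")
    return url

urls = ['https://example.com/ok', 'https://example.com/bad']

with concurrent.futures.ThreadPoolExecutor() as executor:
    futures = [executor.submit(flaky_fetch, url) for url in urls]
    for future in concurrent.futures.as_completed(futures):
        try:
            print(f"fetched {future.result()}")
        except ValueError as exc:
            # The exception raised inside the worker thread is re-raised here.
            print(f"failed: {exc}")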
Best Practices
- Don’t mix up which executor to use: threads for I/O-bound, processes for CPU-bound tasks.
- Use context managers (with statements) to ensure that threads and processes are cleaned up properly.
- Limit the number of workers to what your application and hardware reasonably support; a brief sketch follows below.
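As one way to cap the pool size, both executors accept a max_workers argument; the value of 4 below is arbitrary and should be tuned to your workload:

import concurrent.futures
import requests

def fetch(url):
    return url, requests.get(url).status_code

urls = ['https://www.python.org', 'https://docs.python.org/3/']

# Cap the pool at four threads so the number of simultaneous connections stays bounded.
with concurrent.futures.ThreadPoolExecutor(max_workers=4) as executor:
    for url, status in executor.map(fetch, urls):
        print(f"{url} returned status {status}")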
Conclusion
If you’re still manually wrangling threads or processes in Python, concurrent.futures deserves a place in your toolkit. With a clear API and sensible defaults, it brings effortless concurrency within reach, no matter your level of experience. Next time your code needs a performance boost, reach for concurrent.futures and parallelize with confidence!
— Pythia