# How to Benchmark Different Python Implementations with `pytest-benchmark`

```{note}
Most of this text was generated with AI.
```

This guide will walk you through setting up and running performance benchmarks using `pytest-benchmark`. Benchmarking is crucial for making informed decisions about which libraries or implementation strategies offer the best performance for your specific use cases. We'll use the common example of comparing two JSON serialization libraries: the standard `json` module and the faster `orjson`.

## Why benchmark?

When you have multiple ways to achieve the same task (e.g., using different libraries or algorithms), benchmarks provide quantitative data on their performance. This data helps you:

- Identify performance bottlenecks.
- Choose the most efficient library/method for critical code paths.
- Track performance regressions or improvements over time.
- Justify technical decisions with concrete evidence.

## Prerequisites

Before you start, make sure you have the following installed in your Python environment:

1. **Python**: (e.g., Python 3.8+)
2. **`uv`**: Or your preferred Python package manager/runner.
3. **`pytest`**: The testing framework.
4. **`pytest-benchmark`**: The pytest plugin for benchmarking.
5. **`orjson`**: The alternative JSON library we'll be testing against (the standard `json` library is built-in).

You can install the necessary Python packages using `uv`:

```console
uv pip install pytest pytest-benchmark orjson
```

## Setting up Your Benchmark File

1. Create a directory for your benchmark scripts. Following your project structure, let's assume this is a `scripts/` directory.
2. Inside the `scripts/` directory, create a new Python file for your benchmarks. For our JSON example, let's name it `test_json_performance.py`.

```
project_root/
└── scripts/
    └── test_json_performance.py
```
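Before writing benchmarks, it can be worth a quick sanity check that both libraries actually produce equivalent results on your data, so you know you are comparing like with like. The snippet below is a minimal sketch of such a check; the file name and sample dictionary are just illustrative, and note that `orjson.dumps()` returns `bytes` while `json.dumps()` returns `str`.

```python
# scripts/check_json_equivalence.py -- hypothetical helper, not part of the benchmark suite
import json

import orjson

data = {"name": "Example User", "age": 30, "scores": [1, 2, 3]}

# json.dumps() returns str and orjson.dumps() returns bytes, so compare the
# round-tripped Python objects rather than the raw serialized output.
assert json.loads(json.dumps(data)) == orjson.loads(orjson.dumps(data))
print("json and orjson round-trip the sample data identically")
```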
## Writing Benchmark Functions

In your `test_json_performance.py` file, you'll write functions that pytest can discover and `pytest-benchmark` can measure. Each function will test a specific piece of code.

Here's how to structure the benchmarks for comparing `json.dumps` and `orjson.dumps`:

```python
# scripts/test_json_performance.py
import json

import orjson

# Sample data to be used for serialization
SAMPLE_DATA = {
    "name": "Example User",
    "email": "user@example.com",
    "age": 30,
    "is_active": True,
    "balance": 1234.56,
    "metadata": {"key" + str(i): "value" + str(i) for i in range(50)},
}


def test_standard_json_dumps(benchmark):
    """Benchmarks the standard json.dumps() function."""
    benchmark(json.dumps, SAMPLE_DATA)


def test_orjson_dumps(benchmark):
    """Benchmarks the orjson.dumps() function."""
    benchmark(orjson.dumps, SAMPLE_DATA)


# Pre-serialized inputs for the deserialization benchmarks
SERIALIZED_JSON_STD = json.dumps(SAMPLE_DATA)
SERIALIZED_JSON_ORJSON = orjson.dumps(SAMPLE_DATA)


def test_standard_json_loads(benchmark):
    """Benchmarks the standard json.loads() function."""
    benchmark(json.loads, SERIALIZED_JSON_STD)


def test_orjson_loads(benchmark):
    """Benchmarks orjson.loads() on orjson's own (bytes) output."""
    benchmark(orjson.loads, SERIALIZED_JSON_ORJSON)
```

**Key points in the code:**

- We import the libraries we want to test (`json`, `orjson`).
- `SAMPLE_DATA` provides a consistent input for all benchmarks.
- Each function follows pytest's normal naming convention (`test_*`) so it is collected as a regular test.
- The `benchmark` fixture (provided by `pytest-benchmark`) is passed as an argument to these functions.
- You call `benchmark(function_to_test, arg1, arg2, ...)` to run and measure `function_to_test` with its arguments.

## Running the Benchmarks

To run your benchmarks, navigate to your project's root directory in the terminal and use the command structure you've established:

```console
uv run pytest scripts/test_json_performance.py
```

If you have multiple benchmark files in the `scripts/` directory, you can run them one at a time:

```console
uv run pytest scripts/{BENCHMARK}.py
```

## Understanding the output

After running, `pytest-benchmark` will produce a table summarizing the performance results. It will look something like this (the exact numbers will vary based on your machine):

| Name (time in us)        | Min            | Max            | Mean           | StdDev        | Median         | IQR           | Outliers(\*) | Rounds | Iterations |
| ------------------------ | -------------- | -------------- | -------------- | ------------- | -------------- | ------------- | ------------ | ------ | ---------- |
| test_orjson_dumps        | 3.8530 (1.0)   | 6.5290 (1.0)   | 4.3386 (1.0)   | 0.3104 (1.0)  | 4.2600 (1.0)   | 0.3045 (1.0)  | 64;95        | 22893  | 1          |
| test_standard_json_dumps | 19.0930 (4.96) | 31.2950 (4.80) | 20.6635 (4.76) | 1.6072 (5.18) | 20.2170 (4.75) | 1.4480 (4.75) | 72;165       | 4633   | 1          |
| test_orjson_loads        | 3.3270 (1.0)   | 5.8330 (1.0)   | 3.6799 (1.0)   | 0.3019 (1.0)  | 3.6020 (1.0)   | 0.2660 (1.0)  | 101;111      | 26329  | 1          |
| test_standard_json_loads | 6.8310 (2.05)  | 11.2870 (1.94) | 7.5088 (2.04)  | 0.7889 (2.61) | 7.2790 (2.02)  | 0.6900 (2.59) | 84;116       | 12691  | 1          |

**Key columns to look at:**

- **Name:** The name of your benchmark function.
- **Min, Max, Mean, Median:** Timings (often in microseconds, `us`, or milliseconds, `ms`). **Lower values are better.** `Mean` or `Median` are often good general indicators.
- **StdDev:** Standard deviation, showing the variability of the measurements. Lower is generally better, indicating more consistent performance.
- **Rounds:** How many times the core benchmark loop was run by `pytest-benchmark` to gather statistics.
- **Iterations:** How many times your target function was called within each round (see the sketch after this list for a way to control rounds and iterations explicitly).
- **Ops/s:** Operations per second. **Higher values are better.** (This column might not always be present by default or may be named differently based on configuration; the `Min`, `Mean`, and `Median` times are the primary metrics.)
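If you want tighter control over how many rounds and iterations are used, `pytest-benchmark` also provides a pedantic mode via `benchmark.pedantic()`. The sketch below shows what that might look like for the `orjson.dumps` case; the file name and the specific `rounds`/`iterations` values are arbitrary illustrative choices, not recommendations.

```python
# scripts/test_json_performance_pedantic.py -- optional, illustrative only
import orjson

SAMPLE_DATA = {"name": "Example User", "age": 30}


def test_orjson_dumps_pedantic(benchmark):
    """Benchmark orjson.dumps with explicitly pinned rounds and iterations."""
    # benchmark.pedantic() fixes the number of rounds and iterations instead
    # of letting pytest-benchmark calibrate them automatically.
    result = benchmark.pedantic(
        orjson.dumps,
        args=(SAMPLE_DATA,),
        rounds=100,
        iterations=10,
    )
    # The benchmarked call's return value is passed through, so we can still
    # assert on correctness alongside the timing.
    assert result == orjson.dumps(SAMPLE_DATA)
```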
The numbers in parentheses (e.g., `(1.0)`, `(4.96)`) next to each metric show performance relative to the baseline, i.e. the fastest test it is compared against. For `test_orjson_dumps`, that baseline is itself, hence `(1.0)`. For `test_standard_json_dumps`, the `(4.96)` next to its `Min` time means it was 4.96 times slower than the `Min` time of the fastest test (`test_orjson_dumps`).

From the example output, you could conclude that `orjson` is significantly faster than the standard `json` module for both `dumps` and `loads` operations on this particular `SAMPLE_DATA` and machine.
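As an optional refinement, `pytest-benchmark` supports a `group` option on the `pytest.mark.benchmark` marker, which groups related benchmarks so the relative numbers compare `dumps` against `dumps` and `loads` against `loads` rather than everything against the single fastest test. The sketch below is a minimal grouped variant of the earlier file; the group names (`"dumps"`, `"loads"`) and the reduced sample data are just illustrative.

```python
# scripts/test_json_performance_grouped.py -- optional variant, illustrative only
import json

import orjson
import pytest

SAMPLE_DATA = {"name": "Example User", "age": 30}


@pytest.mark.benchmark(group="dumps")
def test_standard_json_dumps(benchmark):
    benchmark(json.dumps, SAMPLE_DATA)


@pytest.mark.benchmark(group="dumps")
def test_orjson_dumps(benchmark):
    benchmark(orjson.dumps, SAMPLE_DATA)


@pytest.mark.benchmark(group="loads")
def test_standard_json_loads(benchmark):
    # Serialize once up front; only json.loads() is measured.
    benchmark(json.loads, json.dumps(SAMPLE_DATA))


@pytest.mark.benchmark(group="loads")
def test_orjson_loads(benchmark):
    benchmark(orjson.loads, orjson.dumps(SAMPLE_DATA))
```

With grouping enabled, each group gets its own `(1.0)` baseline in the summary table, so a deserialization benchmark is never measured against a serialization one.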