# How to Benchmark Different Python Implementations with `pytest-benchmark`

```{note}
Most of this text was generated with AI.
```

This guide will walk you through setting up and running performance benchmarks using `pytest-benchmark`. Benchmarking is crucial for making informed decisions about which libraries or implementation strategies offer the best performance for your specific use cases. We'll use the common example of comparing two JSON serialization libraries: the standard `json` module and the faster `orjson`.

## Why benchmark?

When you have multiple ways to achieve the same task (e.g., using different libraries or algorithms), benchmarks provide quantitative data on their performance. This data helps you:

- Identify performance bottlenecks.
- Choose the most efficient library/method for critical code paths.
- Track performance regressions or improvements over time.
- Justify technical decisions with concrete evidence.

## Prerequisites

Before you start, make sure you have the following installed in your Python environment:

1. **Python**: (e.g., Python 3.8+)
2. **`uv`**: Or your preferred Python package manager/runner.
3. **`pytest`**: The testing framework.
4. **`pytest-benchmark`**: The pytest plugin for benchmarking.
5. **`orjson`**: The alternative JSON library we'll be testing against (the standard `json` library is built-in).

You can install the necessary Python packages using `uv`:

```console
uv pip install pytest pytest-benchmark orjson
```

## Setting up Your Benchmark File

1. Create a directory for your benchmark scripts. Following your project structure, let's assume this is a `scripts/` directory.
2. Inside the `scripts/` directory, create a new Python file for your benchmarks. For our JSON example, let's name it `test_json_performance.py`.

```
project_root/
└── scripts/
    └── test_json_performance.py
```
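Before writing benchmarks, it can be worth a quick sanity check that both libraries actually produce equivalent results on your data, so you know you are comparing like with like. The snippet below is a minimal sketch of such a check; the file name and sample dictionary are just illustrative, and note that `orjson.dumps()` returns `bytes` while `json.dumps()` returns `str`.

```python
# scripts/check_json_equivalence.py -- hypothetical helper, not part of the benchmark suite
import json

import orjson

data = {"name": "Example User", "age": 30, "scores": [1, 2, 3]}

# json.dumps() returns str and orjson.dumps() returns bytes, so compare the
# round-tripped Python objects rather than the raw serialized output.
assert json.loads(json.dumps(data)) == orjson.loads(orjson.dumps(data))
print("json and orjson round-trip the sample data identically")
```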
## Writing Benchmark Functions

In your `test_json_performance.py` file, you'll write functions that pytest can discover and `pytest-benchmark` can measure. Each function will test a specific piece of code.

Here's how to structure the benchmarks for comparing `json.dumps` and `orjson.dumps`:

```python
# scripts/test_json_performance.py
import json

import orjson

# Sample data to be used for serialization
SAMPLE_DATA = {
    "name": "Example User",
    "email": "user@example.com",
    "age": 30,
    "is_active": True,
    "balance": 1234.56,
    "metadata": {"key" + str(i): "value" + str(i) for i in range(50)},
}


def test_standard_json_dumps(benchmark):
    """Benchmarks the standard json.dumps() function."""
    benchmark(json.dumps, SAMPLE_DATA)


def test_orjson_dumps(benchmark):
    """Benchmarks the orjson.dumps() function."""
    benchmark(orjson.dumps, SAMPLE_DATA)


# Pre-serialized inputs for the deserialization benchmarks
SERIALIZED_JSON_STD = json.dumps(SAMPLE_DATA)
SERIALIZED_JSON_ORJSON = orjson.dumps(SAMPLE_DATA)


def test_standard_json_loads(benchmark):
    """Benchmarks the standard json.loads() function."""
    benchmark(json.loads, SERIALIZED_JSON_STD)


def test_orjson_loads(benchmark):
    """Benchmarks orjson.loads() on orjson's own (bytes) output."""
    benchmark(orjson.loads, SERIALIZED_JSON_ORJSON)
```

**Key points in the code:**

- We import the libraries we want to test (`json`, `orjson`).
- `SAMPLE_DATA` provides a consistent input for all benchmarks.
- Each function follows pytest's normal naming convention (`test_*`) so it is collected as a regular test.
- The `benchmark` fixture (provided by `pytest-benchmark`) is passed as an argument to these functions.
- You call `benchmark(function_to_test, arg1, arg2, ...)` to run and measure `function_to_test` with its arguments.

## Running the Benchmarks

To run your benchmarks, navigate to your project's root directory in the terminal and use the command structure you've established:

```console
uv run pytest scripts/test_json_performance.py
```

If you have multiple benchmark files in the `scripts/` directory, you can run them one at a time:

```console
uv run pytest scripts/{BENCHMARK}.py
```

## Understanding the output

After running, `pytest-benchmark` will produce a table summarizing the performance results. It will look something like this (the exact numbers will vary based on your machine):

| Name (time in us)        | Min            | Max            | Mean           | StdDev        | Median         | IQR           | Outliers(\*) | Rounds | Iterations |
| ------------------------ | -------------- | -------------- | -------------- | ------------- | -------------- | ------------- | ------------ | ------ | ---------- |
| test_orjson_dumps        | 3.8530 (1.0)   | 6.5290 (1.0)   | 4.3386 (1.0)   | 0.3104 (1.0)  | 4.2600 (1.0)   | 0.3045 (1.0)  | 64;95        | 22893  | 1          |
| test_standard_json_dumps | 19.0930 (4.96) | 31.2950 (4.80) | 20.6635 (4.76) | 1.6072 (5.18) | 20.2170 (4.75) | 1.4480 (4.75) | 72;165       | 4633   | 1          |
| test_orjson_loads        | 3.3270 (1.0)   | 5.8330 (1.0)   | 3.6799 (1.0)   | 0.3019 (1.0)  | 3.6020 (1.0)   | 0.2660 (1.0)  | 101;111      | 26329  | 1          |
| test_standard_json_loads | 6.8310 (2.05)  | 11.2870 (1.94) | 7.5088 (2.04)  | 0.7889 (2.61) | 7.2790 (2.02)  | 0.6900 (2.59) | 84;116       | 12691  | 1          |

**Key columns to look at:**

- **Name:** The name of your benchmark function.
- **Min, Max, Mean, Median:** Timings (often in microseconds, `us`, or milliseconds, `ms`). **Lower values are better.** `Mean` or `Median` are often good general indicators.
- **StdDev:** Standard deviation, showing the variability of the measurements. Lower is generally better, indicating more consistent performance.
- **Rounds:** How many times the core benchmark loop was run by `pytest-benchmark` to gather statistics.
- **Iterations:** How many times your target function was called within each round (see the sketch after this list for a way to control rounds and iterations explicitly).
- **Ops/s:** Operations per second. **Higher values are better.** (This column might not always be present by default or may be named differently based on configuration; the `Min`, `Mean`, and `Median` times are the primary metrics.)
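If you want tighter control over how many rounds and iterations are used, `pytest-benchmark` also provides a pedantic mode via `benchmark.pedantic()`. The sketch below shows what that might look like for the `orjson.dumps` case; the file name and the specific `rounds`/`iterations` values are arbitrary illustrative choices, not recommendations.

```python
# scripts/test_json_performance_pedantic.py -- optional, illustrative only
import orjson

SAMPLE_DATA = {"name": "Example User", "age": 30}


def test_orjson_dumps_pedantic(benchmark):
    """Benchmark orjson.dumps with explicitly pinned rounds and iterations."""
    # benchmark.pedantic() fixes the number of rounds and iterations instead
    # of letting pytest-benchmark calibrate them automatically.
    result = benchmark.pedantic(
        orjson.dumps,
        args=(SAMPLE_DATA,),
        rounds=100,
        iterations=10,
    )
    # The benchmarked call's return value is passed through, so we can still
    # assert on correctness alongside the timing.
    assert result == orjson.dumps(SAMPLE_DATA)
```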
The numbers in parentheses (e.g., `(1.0)`, `(4.96)`) next to each metric show performance relative to the baseline, i.e. the fastest test it is compared against. For `test_orjson_dumps`, that baseline is itself, hence `(1.0)`. For `test_standard_json_dumps`, the `(4.96)` next to its `Min` time means it was 4.96 times slower than the `Min` time of the fastest test (`test_orjson_dumps`).

From the example output, you could conclude that `orjson` is significantly faster than the standard `json` module for both `dumps` and `loads` operations on this particular `SAMPLE_DATA` and machine.
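As an optional refinement, `pytest-benchmark` supports a `group` option on the `pytest.mark.benchmark` marker, which groups related benchmarks so the relative numbers compare `dumps` against `dumps` and `loads` against `loads` rather than everything against the single fastest test. The sketch below is a minimal grouped variant of the earlier file; the group names (`"dumps"`, `"loads"`) and the reduced sample data are just illustrative.

```python
# scripts/test_json_performance_grouped.py -- optional variant, illustrative only
import json

import orjson
import pytest

SAMPLE_DATA = {"name": "Example User", "age": 30}


@pytest.mark.benchmark(group="dumps")
def test_standard_json_dumps(benchmark):
    benchmark(json.dumps, SAMPLE_DATA)


@pytest.mark.benchmark(group="dumps")
def test_orjson_dumps(benchmark):
    benchmark(orjson.dumps, SAMPLE_DATA)


@pytest.mark.benchmark(group="loads")
def test_standard_json_loads(benchmark):
    # Serialize once up front; only json.loads() is measured.
    benchmark(json.loads, json.dumps(SAMPLE_DATA))


@pytest.mark.benchmark(group="loads")
def test_orjson_loads(benchmark):
    benchmark(orjson.loads, orjson.dumps(SAMPLE_DATA))
```

With grouping enabled, each group gets its own `(1.0)` baseline in the summary table, so a deserialization benchmark is never measured against a serialization one.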