Prometheus Middleware
=====================

Blacksmith can expose api calls metrics using :term:`Prometheus`.

It requires the extra dependency `prometheus_client`_ installed using the
following command.

.. _`prometheus_client`: https://pypi.org/project/prometheus-client/

::

   pip install blacksmith[prometheus]

Or using poetry

::

   poetry add blacksmith -E prometheus


To use the prometheus middlware, it has to be added to the `ClientFactory`.


All the available metrics are defined in :class:`blacksmith.PrometheusMetrics`,
histograms buckets can be configured, and some metrics are exposed using other
middleware, such as the  :ref:`HTTP Cache Middleware` or the
:ref:`Circuit Breaker Middleware`.

Usage example
-------------

Async
~~~~~

.. literalinclude:: prometheus_middleware_async.py

Sync
~~~~

.. literalinclude:: prometheus_middleware_sync.py


Default Metrics
---------------

While installing the metrics collector, it will add metrics on api call made.

There is `blacksmith_request_latency_seconds` Histogram and `blacksmith_info` Gauge.


blacksmith_request_latency_seconds Histogram
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Histogram have 3 metrics that are ``blacksmith_request_latency_seconds_count``,
``blacksmith_request_latency_seconds_sum`` and
``blacksmith_request_latency_seconds_bucket``.

All those metrics are incremented on every API calls.


You may configure the buckets using the parameter buckets

.. code-block:: python

   from blacksmith import PrometheusMetrics, AsyncPrometheusMiddleware

   BUCKETS = [0.05, 0.1, 0.2, 0.4, 0.8, 1.6, 3.2, 6.4, 12.8, 25.6]
   metrics = PrometheusMetrics(buckets=BUCKETS)
   middleware = AsyncPrometheusMiddleware(metrics)


``blacksmith_request_latency_seconds`` labels are  ``client_name``, ``method``,
``path``, ``status_code``.


.. note::

   The :term:`client_name` can indicated the service at its version, and, because a
   service can register the same method/path many times, it can be usefull
   to get the monitoring on every binding.

   Imagine the same route is consumed to get different aspect of the resource
   in many place of a code base. It can be appropriate to register different
   clients to distingate them.


.. figure:: ../../../../examples/prometheus_metrics/screenshot.png

   Example of `blacksmith_request_latency_seconds` Histogram


blacksmith_info Gauge
~~~~~~~~~~~~~~~~~~~~~

The metrics is ``blacksmith_info`` which is a Gauge that always return 1, it is usefull
to get the version of the blacksmith client installed, in its label `version`.


More Metrics by combining middlewares
-------------------------------------

blacksmith_circuit_breaker_state Gauge
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

While combining with the :ref:`Circuit Breaker Middleware`,
a metrics ``blacksmith_circuit_breaker_state`` Gauge is added to get the
states of circuit breakers per :term:`client_name`.

 * `0` - the circuit breaker is `closed`.
 * `1` - the circuit breaker is `half-open`.
 * `2` - the circuit breaker is `open`.


blacksmith_circuit_breaker_error Counter
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

While combining with the :ref:`Circuit Breaker Middleware`,
a metrics ``blacksmith_circuit_breaker_error_total`` count the number
of errors of service.

.. note:

   When the circuit breaker is open, the errors are not included
   in the total count. Only the error of the service.


blacksmith_cache_hit
~~~~~~~~~~~~~~~~~~~~

While combining with the :ref:`HTTP Cache Middleware`, a metrics
``blacksmith_cache_hit_total`` count the number of responses served
from the cache.


blacksmith_cache_miss
~~~~~~~~~~~~~~~~~~~~~

While combining with the :ref:`HTTP Cache Middleware`, a metrics
``blacksmith_cache_miss_total`` count the number of responses that cannot
be served from the cache.

The ``cachable_state`` label indicated if the data is cachable or not.
Sometime the request is not cachable (the default policy cache only ``GET``),
sometime the response does not have a cache header, so it cannot be cached.

The cachable state can only contains `uncachable_request`, `uncachable_response`,
`cached`.

When the response is ``cached``, then the next request will be a hit (except if
the cache has expired).


blacksmith_cache_latency_seconds Histogram
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Histogram have 3 metrics that are ``blacksmith_cache_latency_seconds_count``,
``blacksmith_cache_latency_seconds_sum`` and ``blacksmith_cache_latency_seconds_bucket``.

It can be used to measure the performance of the cache.


.. code-block:: python

   from blacksmith import PrometheusMetrics, AsyncPrometheusMiddleware

   CACHE_BUCKETS = [0.005, 0.01, 0.02, 0.04, 0.08, 0.16, 0.32, 0.64, 1.28, 2.56]
   metrics = PrometheusMetrics(hit_cache_buckets=CACHE_BUCKETS)
   middleware = AsyncPrometheusMiddleware(metrics)


Expose metrics
--------------

After collecting metrics in the registry, the metrics has to be exposed,
because blacksmith is a client purpose API, it does not offer a way to expose
them, but, usually, a web framework application is used for that,
and used scrapped by a Prometheus instanced.


Example using starlette
~~~~~~~~~~~~~~~~~~~~~~~

::

   from prometheus_client import (
      generate_latest, CONTENT_TYPE_LATEST, REGISTRY
   )
   from starlette.applications import Starlette
   from starlette.responses import Response

   app = Starlette()

   @app.route("/metrics", methods=["GET"])
   async def get_metrics(request):
      resp = Response(
         generate_latest(REGISTRY),
         media_type=CONTENT_TYPE_LATEST,
         )
      return resp


.. note::

   REGISTRY is the default registry, `PrometheusMetrics` can be
   build by specifying another registry if necessary:

   ::

      from blacksmith import AsyncPrometheusMiddleware
      prom_middleware = AsyncPrometheusMiddleware(registry=my_registry)


Full examples of prometheus metrics
-----------------------------------

You will find an example using prometheus in the examples directory:

 * https://github.com/mardiros/blacksmith/tree/master/examples/prometheus_metrics

 * https://github.com/mardiros/blacksmith/tree/master/examples/circuit_breaker