
Handling Validation Errors

.. currentmodule:: jsonschema.exceptions

When an invalid instance is encountered, a `ValidationError` will be
raised or returned, depending on which method or function is used.
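
For example, `jsonschema.validate` raises the single most relevant error it
finds, while a validator's `iter_errors` returns each error lazily; a small
sketch:

>>> from jsonschema import Draft202012Validator
>>> validator = Draft202012Validator({"type": "integer"})
>>> error = next(validator.iter_errors("spam"))  # returned, not raised
>>> print(error.message)
'spam' is not of type 'integer'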

.. autoexception:: ValidationError
    :noindex:

    The information carried by an error roughly breaks down into:

    ===============  =================  ========================
     What Happened   Why Did It Happen  What Was Being Validated
    ===============  =================  ========================
    `message`        `context`          `instance`

                     `cause`            `json_path`

                                        `path`

                                        `schema`

                                        `schema_path`

                                        `validator`

                                        `validator_value`
    ===============  =================  ========================


    .. attribute:: message

        A human readable message explaining the error.

    .. attribute:: validator

        The name of the failed `keyword
        <https://json-schema.org/draft/2020-12/json-schema-validation.html#name-a-vocabulary-for-structural>`_.

    .. attribute:: validator_value

        The associated value for the failed keyword in the schema.

    .. attribute:: schema

        The full schema that this error came from. This is potentially a
        subschema from within the schema that was passed in originally,
        or even an entirely different schema if a :kw:`$ref` was
        followed.

    .. attribute:: relative_schema_path

        A `collections.deque` containing the path to the failed keyword
        within the schema.

    .. attribute:: absolute_schema_path

        A `collections.deque` containing the path to the failed
        keyword within the schema, but always relative to the
        *original* schema as opposed to any subschema (i.e. the one
        originally passed into a validator class, *not* `schema`).

    .. attribute:: schema_path

        Same as `relative_schema_path`.

    .. attribute:: relative_path

        A `collections.deque` containing the path to the
        offending element within the instance. The deque can be empty if
        the error happened at the root of the instance.

    .. attribute:: absolute_path

        A `collections.deque` containing the path to the
        offending element within the instance. The absolute path
        is always relative to the *original* instance that was
        validated (i.e. the one passed into a validation method, *not*
        `instance`). The deque can be empty if the error happened
        at the root of the instance.

    .. attribute:: json_path

        A `JSON path <https://goessner.net/articles/JsonPath/index.html>`_
        to the offending element within the instance.

    .. attribute:: path

        Same as `relative_path`.

    .. attribute:: instance

        The instance that was being validated. This will differ from
        the instance originally passed into ``validate`` if the
        validator object was in the process of validating a (possibly
        nested) element within the top-level instance. The path within
        the top-level instance (i.e. `ValidationError.path`) could
        be used to find this object, but it is provided for convenience.

    .. attribute:: context

        If the error was caused by errors in subschemas, the list of errors
        from the subschemas will be available on this property. The
        `schema_path` and `path` of these errors will be relative
        to the parent error.

    .. attribute:: cause

        If the error was caused by a *non*-validation error, the
        exception object will be here. Currently this is only used
        for the exception raised by a failed format checker in
        `jsonschema.FormatChecker.check`.

    .. attribute:: parent

        A validation error which this error is the `context` of.
        ``None`` if there wasn't one.
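
Before moving on, here is a small sketch illustrating `cause` with a format
checker whose underlying callable raises (the ``even`` format and the
``is_even`` function are invented purely for this example):

>>> from jsonschema import Draft202012Validator, FormatChecker
>>> checker = FormatChecker()
>>> @checker.checks("even", raises=ValueError)
... def is_even(value):
...     if value % 2:
...         raise ValueError("odd!")
...     return True
>>> validator = Draft202012Validator({"format": "even"}, format_checker=checker)
>>> error = next(validator.iter_errors(3))
>>> error.cause
ValueError('odd!')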


In case an invalid schema itself is encountered, a `SchemaError` is
raised.
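
For instance, a brief sketch using the `check_schema` classmethod (``12`` is
not a valid value for :kw:`type` in the metaschema):

>>> from jsonschema import Draft202012Validator
>>> from jsonschema.exceptions import SchemaError
>>> try:
...     Draft202012Validator.check_schema({"type": 12})
... except SchemaError as error:
...     print(type(error).__name__)
SchemaError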

.. autoexception:: SchemaError
    :noindex:

    The same attributes are present as for `ValidationError`s.


These attributes can be clarified with a short example:

.. testcode::

    schema = {
        "items": {
            "anyOf": [
                {"type": "string", "maxLength": 2},
                {"type": "integer", "minimum": 5}
            ]
        }
    }
    instance = [{}, 3, "foo"]
    v = Draft202012Validator(schema)
    errors = sorted(v.iter_errors(instance), key=lambda e: e.path)

The error messages in this situation are not very helpful on their own.

.. testcode::

    for error in errors:
        print(error.message)

outputs:

.. testoutput::

    {} is not valid under any of the given schemas
    3 is not valid under any of the given schemas
    'foo' is not valid under any of the given schemas

If we look at `ValidationError.path` on each of the errors, we can find
out which elements in the instance correspond to each of the errors. In
this example, `ValidationError.path` will have only one element, which
will be the index in our list.

.. testcode::

    for error in errors:
        print(list(error.path))

.. testoutput::

    [0]
    [1]
    [2]
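
The `json_path` attribute expresses the same locations as JSON path strings
(a small sketch; integer indices render as ``[i]``):

.. testcode::

    for error in errors:
        print(error.json_path)

.. testoutput::

    $[0]
    $[1]
    $[2]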

Since our schema contained nested subschemas, it can be helpful to look at
the specific part of the instance and subschema that caused each of the errors.
This can be seen with the `ValidationError.instance` and
`ValidationError.schema` attributes.

With keywords like :kw:`anyOf`, the `ValidationError.context`
attribute can be used to see the sub-errors which caused the failure. Since
these errors actually came from two separate subschemas, it can be helpful to
look at the `ValidationError.schema_path` attribute as well to see where
exactly in the schema each of these errors comes from. In the case of sub-errors
from the `ValidationError.context` attribute, this path will be relative
to the `ValidationError.schema_path` of the parent error.

.. testcode::

    for error in errors:
        for suberror in sorted(error.context, key=lambda e: e.schema_path):
            print(list(suberror.schema_path), suberror.message, sep=", ")

.. testoutput::

    [0, 'type'], {} is not of type 'string'
    [1, 'type'], {} is not of type 'integer'
    [0, 'type'], 3 is not of type 'string'
    [1, 'minimum'], 3 is less than the minimum of 5
    [0, 'maxLength'], 'foo' is too long
    [1, 'type'], 'foo' is not of type 'integer'

The string representation of an error combines some of these attributes for
easier debugging.

.. testcode::

    print(errors[1])

.. testoutput::

    3 is not valid under any of the given schemas

    Failed validating 'anyOf' in schema['items']:
        {'anyOf': [{'maxLength': 2, 'type': 'string'},
                   {'minimum': 5, 'type': 'integer'}]}

    On instance[1]:
        3


ErrorTrees

If you want to programmatically query which validation keywords
failed when validating a given instance, you may want to do so using
`jsonschema.exceptions.ErrorTree` objects.

.. autoclass:: jsonschema.exceptions.ErrorTree
    :noindex:
    :members:
    :special-members:
    :exclude-members: __dict__,__weakref__

    .. attribute:: errors

        The mapping of validator keywords to the error objects (usually
        `jsonschema.exceptions.ValidationError`s) at this level
        of the tree.

Consider the following example:

.. testcode::

    schema = {
        "type" : "array",
        "items" : {"type" : "number", "enum" : [1, 2, 3]},
        "minItems" : 3,
    }
    instance = ["spam", 2]

For clarity’s sake, the given instance has three errors under this schema:

.. testcode::

    v = Draft202012Validator(schema)
    for error in sorted(v.iter_errors(["spam", 2]), key=str):
        print(error.message)

.. testoutput::

    'spam' is not of type 'number'
    'spam' is not one of [1, 2, 3]
    ['spam', 2] is too short

Let’s construct a `jsonschema.exceptions.ErrorTree` so that we
can query the errors a bit more easily than by just iterating over the
error objects.

.. testcode::

    tree = ErrorTree(v.iter_errors(instance))

As you can see, `jsonschema.exceptions.ErrorTree` takes an
iterable of `ValidationError`s when constructing a tree, so
you can directly pass it the return value of a validator object’s
`jsonschema.protocols.Validator.iter_errors` method.

ErrorTrees support a number of useful operations. The first one we
might want to perform is to check whether a given element in our instance
failed validation. We do so using the :keyword:`in` operator:

>>> 0 in tree
True

>>> 1 in tree
False

The interpretation here is that the 0th index into the instance ("spam")
did have an error (in fact it had two), while the 1st index (2) did not (i.e.
it was valid).

If we want to see which errors a child had, we index into the tree and look at
the `ErrorTree.errors` attribute.

>>> sorted(tree[0].errors)
['enum', 'type']

Here we see that the :kw:`enum` and :kw:`type` keywords failed for
index 0. In fact, `ErrorTree.errors` is a dict whose values are the
`ValidationError`s, so we can get at those directly if we want them.

>>> print(tree[0].errors["type"].message)
'spam' is not of type 'number'

Of course this means that if we want to know if a given validation
keyword failed for a given index, we check for its presence in
`ErrorTree.errors`:

>>> "enum" in tree[0].errors
True

>>> "minimum" in tree[0].errors
False

Finally, if you were paying close enough attention, you’ll notice that
we haven’t seen our :kw:`minItems` error appear anywhere yet. This is
because :kw:`minItems` is an error that applies globally to the instance
itself. So it appears in the root node of the tree.

>>> "minItems" in tree.errors
True

That’s all you need to know to use error trees.

To summarize, each tree contains child trees that can be accessed by
indexing the tree to get the corresponding child tree for a given
index into the instance. Each tree and child has an `ErrorTree.errors`
attribute, a dict, that maps the failed validation keyword to the
corresponding validation error.
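
One more convenience worth knowing about (a small sketch using the tree built
above): the `ErrorTree.total_errors` property counts the errors in a tree and
all of its children.

>>> tree.total_errors
3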

best_match and relevance

The `best_match` function is a simple but useful function for attempting
to guess the most relevant error in a given bunch.

>>> from jsonschema import Draft202012Validator
>>> from jsonschema.exceptions import best_match

>>> schema = {
...     "type": "array",
...     "minItems": 3,
... }
>>> print(best_match(Draft202012Validator(schema).iter_errors(11)).message)
11 is not of type 'array'

.. autofunction:: best_match
    :noindex:


.. function:: relevance(validation_error)
    :noindex:

    A key function that sorts errors based on heuristic relevance.

    If you want to sort a bunch of errors entirely, you can use
    this function to do so. Using this function as a key to e.g.
    `sorted` or `max` will cause more relevant errors to be
    considered greater than less relevant ones.

    Within the different validation keywords that can fail, this
    function considers :kw:`anyOf` and :kw:`oneOf` to be *weak*
    validation errors, and will sort them lower than other errors at the
    same level in the instance.

    If you want to change the set of weak [or strong] validation
    keywords you can create a custom version of this function with
    `by_relevance` and provide a different set of each.

>>> schema = {
...     "properties": {
...         "name": {"type": "string"},
...         "phones": {
...             "properties": {
...                 "home": {"type": "string"}
...             },
...         },
...     },
... }
>>> instance = {"name": 123, "phones": {"home": [123]}}
>>> errors = Draft202012Validator(schema).iter_errors(instance)
>>> [
...     e.path[-1]
...     for e in sorted(errors, key=exceptions.relevance)
... ]
['home', 'name']

.. autofunction:: by_relevance
    :noindex:
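
As a brief sketch of customizing the weak keyword set, the key function
returned by `by_relevance` can be passed anywhere `relevance` itself would be
used (the keyword sets below are chosen only for illustration):

>>> from jsonschema.exceptions import best_match, by_relevance
>>> key = by_relevance(weak={"type"}, strong={"minimum"})
>>> errors = Draft202012Validator(schema).iter_errors(instance)
>>> best = best_match(errors, key=key)
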
Contents

Introduction
Usage example
BaseModel
ValidationError as JSON
Range of allowed values
@validator
@root_validator
Full example code

Introduction

Pydantic is a library for parsing and validating data.

Installation is covered in a separate article.

A loose translation of the project's own description, with comments:

Data validation and settings management using Python type annotations.

pydantic enforces type hints (see PEP 484) at runtime and provides user-friendly error messages when data is invalid.

You describe what the data should look like in pure, canonical Python, and pydantic takes care of the validation.

Pydantic relies on modern Python features, so make sure you are running at least Python 3.7. Installing the latest stable Python release is recommended; if needed, see the guide on installing Python on Linux.

That said, if you plan to exchange JSON data with Swagger or OpenAPI, check the current compatibility of the libraries first.

Usage example

Consider the script PydanticDemo.py:

from dataclasses import dataclass
from typing import Tuple
from enum import Enum


@dataclass
class IceCreamMix:
    name: str
    flavor: str
    toppings: Tuple[str, ...]
    scoops: int


def main():
    ice_cream_mix = IceCreamMix(
        "PB&J",
        "peanut butter",
        ("strawberries", "sprinkles"),
        2
    )

    print(ice_cream_mix)


if __name__ == '__main__':
    main()

python PydanticDemo.py

IceCreamMix(name='PB&J', flavor='peanut butter', toppings=('strawberries', 'sprinkles'), scoops=2)

The script happily prints our ice cream mix.

Let's add a bit more OOP.

from dataclasses import dataclass
from typing import Tuple
from enum import Enum


class Flavor(str, Enum):
    chocolate = 'chocolate'
    vanilla = 'vanilla'
    strawberry = 'strawberry'
    mint = 'mint'
    coffeee = 'coffee'
    peanut_butter = 'peanut butter'


class Topping(str, Enum):
    sprinkles = 'sprinkles'
    hot_fudge = 'hot fudge'
    cookies = 'cookies'
    brownie = 'brownie'
    whipped_cream = 'whipped cream'
    strawberries = 'strawberries'


@dataclass
class IceCreamMix:
    name: str
    flavor: Flavor
    toppings: Tuple[Topping, ...]
    scoops: int


def main():
    ice_cream_mix = IceCreamMix(
        "PB&J",
        Flavor.peanut_butter,
        (Topping.strawberries, Topping.sprinkles),
        2
    )

    print(ice_cream_mix)


if __name__ == '__main__':
    main()

$ python PydanticDemo.py

IceCreamMix(name='PB&J', flavor=<Flavor.peanut_butter: 'peanut butter'>, toppings=(<Topping.strawberries: 'strawberries'>, <Topping.sprinkles: 'sprinkles'>), scoops=2)

The script still works.

What if we accidentally pass a flavor or topping that does not exist?

def main():
    ice_cream_mix = IceCreamMix(
        "PB&J",
        "smells like shit",
        (Topping.strawberries, 111),
        2
    )

python PydanticDemo.py

IceCreamMix(name='PB&J', flavor='smells like shit', toppings=(<Topping.strawberries: 'strawberries'>, 111), scoops=2)

The script does not notice anything wrong.

To have the data validated automatically, install pydantic and change just one thing, the first import line:

from pydantic.dataclasses import dataclass

python PydanticDemo.py

Traceback (most recent call last):
  File "PydanticDemo.py", line 41, in <module>
    main()
  File "PydanticDemo.py", line 31, in main
    ice_cream_mix = IceCreamMix(
  File "<string>", line 7, in __init__
  File "C:\Users\Andrei\python\pydantic\venv\lib\site-packages\pydantic\dataclasses.py", line 99, in _pydantic_post_init
    raise validation_error
pydantic.error_wrappers.ValidationError: 2 validation errors for IceCreamMix
flavor
  value is not a valid enumeration member; permitted: 'chocolate', 'vanilla', 'strawberry', 'mint', 'coffee', 'peanut butter' (type=type_error.enum; enum_values=[<Flavor.chocolate: 'chocolate'>, <Flavor.vanilla: 'vanilla'>, <Flavor.strawberry: 'strawberry'>, <Flavor.mint: 'mint'>, <Flavor.coffeee: 'coffee'>, <Flavor.peanut_butter: 'peanut butter'>])
toppings -> 1
  value is not a valid enumeration member; permitted: 'sprinkles', 'hot fudge', 'cookies', 'brownie', 'whipped cream', 'strawberries' (type=type_error.enum; enum_values=[<Topping.sprinkles: 'sprinkles'>, <Topping.hot_fudge: 'hot fudge'>, <Topping.cookies: 'cookies'>, <Topping.brownie: 'brownie'>, <Topping.whipped_cream: 'whipped cream'>, <Topping.strawberries: 'strawberries'>])

pydantic did not let our code through. Let's look at the output in more detail.

pydantic.error_wrappers.ValidationError: 2 validation errors for IceCreamMix

The number of errors and the class are reported. This would help with debugging if we did not already know where the errors are and how many there are.

flavor
value is not a valid enumeration member; permitted:

'chocolate',
'vanilla',
'strawberry',
'mint',
'coffee',
'peanut butter'

(type=type_error.enum; enum_values=[<Flavor.chocolate: 'chocolate'>, <Flavor.vanilla: 'vanilla'>, <Flavor.strawberry: 'strawberry'>, <Flavor.mint: 'mint'>, <Flavor.coffeee: 'coffee'>, <Flavor.peanut_butter: 'peanut butter'>])

Pydantic tells us which values are allowed.

The same goes for the toppings, where 111 stands in place of a valid value:

toppings -> 1
value is not a valid enumeration member; permitted: 'sprinkles', 'hot fudge', 'cookies', 'brownie', 'whipped cream', 'strawberries' (type=type_error.enum; enum_values=[<Topping.sprinkles: 'sprinkles'>, <Topping.hot_fudge: 'hot fudge'>, <Topping.cookies: 'cookies'>, <Topping.brownie: 'brownie'>, <Topping.whipped_cream: 'whipped cream'>, <Topping.strawberries: 'strawberries'>])

Put correct values back for Flavor and Topping, but replace scoops 2 with '2':

def main():
    ice_cream_mix = IceCreamMix(
        "PB&J",
        Flavor.peanut_butter,
        (Topping.strawberries, Topping.sprinkles),
        '2'
    )

python PydanticDemo.py

IceCreamMix(name='PB&J', flavor=<Flavor.peanut_butter: 'peanut butter'>, toppings=(<Topping.strawberries: 'strawberries'>, <Topping.sprinkles: 'sprinkles'>), scoops=2)

scoops is still equal to two.

Pydantic supports type coercion.

BaseModel

To get access to extra features such as serialization and JSON support, let's use the BaseModel class.

As a quick reminder, first we had

from dataclasses import dataclass

then

from pydantic.dataclasses import dataclass

and now we need

from pydantic import BaseModel

Also remove the @dataclass decorator above class IceCreamMix:

class IceCreamMix: becomes class IceCreamMix(BaseModel):

and the code that creates the IceCreamMix object must now pass attribute names, i.e. name="PB&J", flavor=Flavor.peanut_butter, and so on.

    strawberries = 'strawberries'


class IceCreamMix(BaseModel):
    name: str
    flavor: Flavor
    toppings: Tuple[Topping, ...]
    scoops: int


def main():
    ice_cream_mix = IceCreamMix(
        name="PB&J",
        flavor=Flavor.peanut_butter,
        toppings=(Topping.strawberries, Topping.sprinkles),
        scoops=2
    )

python PydanticDemo.py

IceCreamMix(name='PB&J', flavor=<Flavor.peanut_butter: 'peanut butter'>, toppings=(<Topping.strawberries: 'strawberries'>, <Topping.sprinkles: 'sprinkles'>), scoops=2)

Everything works just as it did before the changes.

Now the result can be printed as JSON:

print(ice_cream_mix.json())

python PydanticDemo.py

{"name": "PB&J", "flavor": "peanut butter", "toppings": ["strawberries", "sprinkles"], "scoops": 2}

Take a look at the JSON you got above.

You can copy it, modify it if needed, and create another object directly from JSON with the parse_raw() method.

For example:

another_mix = IceCreamMix.parse_raw('{"name": "New mix", "flavor": "mint", "toppings": ["cookies", "hot fudge"], "scoops": 2}')
print(another_mix.json())

{"name": "New mix", "flavor": "mint", "toppings": ["cookies", "hot fudge"], "scoops": 2}

If you accidentally get an attribute value wrong, pydantic will not let it slide:

another_mix = IceCreamMix.parse_raw('{"name": "New mix", "flavor": "novichoke", "toppings": ["cookies", "hot fudge"], "scoops": 2}')
print(another_mix.json())

python PydanticDemo.py

Traceback (most recent call last):
  File "/home/avorotyn/python/pydantic/PydanticDemo.py", line 45, in <module>
    main()
  File "/home/avorotyn/python/pydantic/PydanticDemo.py", line 40, in main
    another_mix = IceCreamMix.parse_raw('{"name": "New mix", "flavor": "novichoke", "toppings": ["cookies", "hot fudge"], "scoops": 2}')
  File "pydantic/main.py", line 543, in pydantic.main.BaseModel.parse_raw
  File "pydantic/main.py", line 520, in pydantic.main.BaseModel.parse_obj
  File "pydantic/main.py", line 362, in pydantic.main.BaseModel.__init__
pydantic.error_wrappers.ValidationError: 1 validation error for IceCreamMix
flavor
  value is not a valid enumeration member; permitted: 'chocolate', 'vanilla', 'strawberry', 'mint', 'coffee', 'peanut butter' (type=type_error.enum; enum_values=[<Flavor.chocolate: 'chocolate'>, <Flavor.vanilla: 'vanilla'>, <Flavor.strawberry: 'strawberry'>, <Flavor.mint: 'mint'>, <Flavor.coffeee: 'coffee'>, <Flavor.peanut_butter: 'peanut butter'>])

ValidationError as JSON

Error messages can also be rendered as JSON. Import ValidationError from pydantic and use try/except:

from pydantic import BaseModel, ValidationError


def main():
    try:
        ice_cream_mix = IceCreamMix(
            name="PB&J",
            flavor="spring",
            toppings=(Topping.strawberries, Topping.sprinkles),
            scoops=2
        )
    except ValidationError as e:
        print(e.json())

python PydanticDemo.py

[
  {
    "loc": [
      "flavor"
    ],
    "msg": "value is not a valid enumeration member; permitted: 'chocolate', 'vanilla', 'strawberry', 'mint', 'coffee', 'peanut butter'",
    "type": "type_error.enum",
    "ctx": {
      "enum_values": [
        "chocolate",
        "vanilla",
        "strawberry",
        "mint",
        "coffee",
        "peanut butter"
      ]
    }
  }
]

Range of allowed values

Suppose you want the number of scoops to be a required attribute with a value between 0 and 5, exclusive:

from pydantic import BaseModel, ValidationError, Field

    strawberries = 'strawberries'


class IceCreamMix(BaseModel):
    name: str
    flavor: Flavor
    toppings: Tuple[Topping, ...]
    scoops: int = Field(..., gt=0, lt=5)

python PydanticDemo.py

{"name": "PB&J", "flavor": "peanut butter", "toppings": ["strawberries", "brownie"], "scoops": 2}

With 2 scoops there are no errors.

Let's try 5 scoops:

def main():
    try:
        ice_cream_mix = IceCreamMix(
            name="PB&J",
            flavor=Flavor.peanut_butter,
            toppings=(Topping.strawberries, Topping.sprinkles),
            scoops=5
        )
    except ValidationError as e:
        print(e.json())

python PydanticDemo.py

[
  {
    "loc": [
      "scoops"
    ],
    "msg": "ensure this value is less than 5",
    "type": "value_error.number.not_lt",
    "ctx": {
      "limit_value": 5
    }
  }
]
Traceback (most recent call last):
  File "/home/avorotyn/python/pydantic/PydanticDemo.py", line 50, in <module>
    main()
  File "/home/avorotyn/python/pydantic/PydanticDemo.py", line 41, in main
    print(ice_cream_mix.json())
UnboundLocalError: local variable 'ice_cream_mix' referenced before assignment

Don't mind the traceback: the print call could simply have been placed inside the try block. When the input is fine, the object is created and this error never appears; when it is not, pydantic catches the mismatch and reports a value_error.
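
For reference, here is a sketch of the variant with print moved inside the try block, so the UnboundLocalError cannot occur when validation fails:

def main():
    try:
        ice_cream_mix = IceCreamMix(
            name="PB&J",
            flavor=Flavor.peanut_butter,
            toppings=(Topping.strawberries, Topping.sprinkles),
            scoops=5
        )
        # Only reached when validation succeeded and the object exists.
        print(ice_cream_mix.json())
    except ValidationError as e:
        # Validation failed; pydantic reports the value_error as JSON.
        print(e.json())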

Validation with the validator decorator

Another useful way to add constraints is the @validator decorator.

It is used when you need to keep an eye on a single attribute.

from pydantic import BaseModel, ValidationError, Field, validator


class IceCreamMix(BaseModel):
    name: str
    flavor: Flavor
    toppings: Tuple[Topping, ...]
    scoops: int = Field(..., gt=0, lt=5)

    @validator('toppings')
    def check_toppings(cls, toppings):
        if len(toppings) > 4:
            raise ValueError('Too many toppings')
        return toppings

Running this code with two toppings produces no errors, so let's add three more straight away to bring the total to five:

def main():
    try:
        ice_cream_mix = IceCreamMix(
            name="PB&J",
            flavor=Flavor.peanut_butter,
            toppings=(Topping.strawberries, Topping.brownie, Topping.sprinkles, Topping.hot_fudge, Topping.whipped_cream),

python PydanticDemo.py

[
  {
    "loc": [
      "toppings"
    ],
    "msg": "Too many toppings",
    "type": "value_error"
  }
]

Now reduce the number of toppings to four and check that the error is gone.

@root_validator

This decorator is used when you need to validate the model as a whole, for example combinations of different attributes.

from pydantic import BaseModel, ValidationError, Field, validator, root_validator

Add one more class, Container:

    strawberries = 'strawberries'


class Container(str, Enum):
    cup = 'cup'
    cone = 'cone'
    waffle_cone = 'waffle cone'


class IceCreamMix(BaseModel):
    name: str
    container: Container
    flavor: Flavor

Let's add a rule: if one of the toppings is hot_fudge, no cone is allowed, only a cup.

We will implement this validation with @root_validator:

    @validator('toppings')
    def check_toppings(cls, toppings):
        if len(toppings) > 4:
            raise ValueError('Too many toppings')
        return toppings

    @root_validator
    def check_cone_toppings(cls, values):
        container = values.get('container')
        toppings = values.get('toppings')
        if container == Container.cone or container == Container.waffle_cone:
            if Topping.hot_fudge in toppings:
                raise ValueError('Cones cannot have hot fudge')
        return values


def main():
    try:
        ice_cream_mix = IceCreamMix(
            name="PB&J",
            container=Container.waffle_cone,
            flavor=Flavor.peanut_butter,

You should still have the hot fudge topping from the previous example; if not, add it back and run:

python PydanticDemo.py

[
  {
    "loc": [
      "__root__"
    ],
    "msg": "Cones cannot have hot fudge",
    "type": "value_error"
  }
]

Full example code

This brief overview of pydantic's capabilities has come to an end.

Thanks for reading; the complete code for this article is below.

from pydantic import BaseModel, ValidationError, Field, validator, root_validator
from typing import Tuple
from enum import Enum


class Flavor(str, Enum):
    chocolate = 'chocolate'
    vanilla = 'vanilla'
    strawberry = 'strawberry'
    mint = 'mint'
    coffeee = 'coffee'
    peanut_butter = 'peanut butter'


class Topping(str, Enum):
    sprinkles = 'sprinkles'
    hot_fudge = 'hot fudge'
    cookies = 'cookies'
    brownie = 'brownie'
    whipped_cream = 'whipped cream'
    strawberries = 'strawberries'


class Container(str, Enum):
    cup = 'cup'
    cone = 'cone'
    waffle_cone = 'waffle cone'


class IceCreamMix(BaseModel):
    name: str
    container: Container
    flavor: Flavor
    toppings: Tuple[Topping, ...]
    scoops: int = Field(..., gt=0, lt=5)

    @validator('toppings')
    def check_toppings(cls, toppings):
        if len(toppings) > 4:
            raise ValueError('Too many toppings')
        return toppings

    @root_validator
    def check_cone_toppings(cls, values):
        container = values.get('container')
        toppings = values.get('toppings')
        if container == Container.cone or container == Container.waffle_cone:
            if Topping.hot_fudge in toppings:
                raise ValueError('Cones cannot have hot fudge')
        return values


def main():
    try:
        ice_cream_mix = IceCreamMix(
            name="PB&J",
            container=Container.waffle_cone,
            flavor=Flavor.peanut_butter,
            # flavor='unknown flavour'
            toppings=(Topping.strawberries, Topping.brownie,
                      Topping.sprinkles),
            # to trigger the validator:
            # toppings=(Topping.strawberries, Topping.brownie,
            #           Topping.sprinkles, Topping.cookies, Topping.sprinkles),
            # to trigger the root_validator:
            # toppings=(Topping.strawberries, Topping.brownie,
            #           Topping.sprinkles, Topping.hot_fudge),
            scoops=2
            # scoops=5
        )
        print(ice_cream_mix.json())

    except ValidationError as e:
        print(e.json())


if __name__ == '__main__':
    main()


We use the Python package pydantic for fast and easy validation of input data. Here’s how.

We discovered the Python package pydantic through FastAPI, which we use for serving machine learning models. Pydantic is a Python package for data parsing and validation, based on type hints. We use pydantic because it is fast, does a lot of the dirty work for us, provides clear error messages and makes it easy to write readable code.

Two of our main use cases for pydantic are:

  1. Validation of settings and input data.
    We often read settings from a configuration file, which we use as inputs to our functions. We often end up doing quite a bit of input validation to ensure the settings are valid for further processing. To avoid starting our functions with a long set of validations and assertions, we use pydantic to validate the input.
  2. Sharing data requirements between machine learning applications.
    Our ML models have certain requirements for the data we use for training and prediction, for example that no data is missing. These requirements are used when we train our models, when we run online predictions and when we validate newly added training data. We use pydantic to specify the requirements we have and ensure that the same requirements are used everywhere, avoiding duplication of error-prone code across different applications.

This post will focus on the first use case, validation of settings and input data. A later post will cover the second use case.

Validation of settings and input data

In some cases, we read settings from a configuration file, such as a
toml file, to be parsed as nested dictionaries.
We use the settings as inputs to different functions. We
often end up doing quite a bit of input validation to ensure the settings parsed from
file are valid for further processing. A concrete example is settings for machine learning
models, where we use toml files for defining model parameters, features and training details for the models.

This is quite similar to how FastAPI uses pydantic for input validation: the input to the API call is json, which in Python translates to a dictionary, and input validation is done using pydantic.

In this post we will go through input validation for a function interpolating a time series to a higher frequency. If we want to do
interpolation, we set the interpolation factor, i.e., the factor of upsampling, the
interpolation method, and an option to interpolate on the integral. Our interpolation
function is just a wrapper around pandas interpolation methods,
including the validation of input and some data wrangling. The input validation code started out looking a bit like this:

from typing import Dict


def validate_input_settings(params_in: Dict) -> Dict:
    params_validated = {}
    for key, value in params_in.items():
        if key == "interpolation_factor":
            if not int(value) == value:
                raise ValueError(f"{key} has a non-int value")
            if not int(value) >= 2:
                raise ValueError(f"{key}: {value} should be >= 2")
            value = int(value)
        elif key == "interpolation_method":
            allowed_set = {
                "repeat", "distribute", "linear", "cubic", "akima"
            }
            if value not in allowed_set:
                raise ValueError(f"{key} should be one of {allowed_set}, got {value}")
        elif key == "interpolate_on_integral":
            if not isinstance(value, bool):
                raise ValueError(f"{key} should be bool, got {value}")
        else:
            raise ValueError(f"{key} not a recognized key")
        params_validated[key] = value
    return params_validated

This is heavily nested, which in itself makes it hard to read, and perhaps you find that the validation rules aren’t crystal clear at first glance. We use SonarQube for static code quality analysis, and this piece of code results in a code smell, complaining that the code is too complex. In fact, this already has a cognitive complexity
of 18 as SonarQube counts, above the default threshold of 15. Cognitive complexity is a measure of how difficult it is to read code, and increments for each break in linear flow, such as an if statement or a for loop. Nested breaks of the flow are incremented again.

Let’s summarize what we check for in validate_input_settings:

  • interpolation_factor is an integer
  • interpolation_factor is greater than or equal to 2
  • interpolation_method is in a set of allowed values
  • interpolate_on_integral is boolean
  • The keys in our settings dictionary are among the three mentioned above

In addition to the code above, we have a few more checks:

  • if an interpolation_factor is given, but no interpolation_method,
    use the default method linear
  • if an interpolation_factor is given, but not interpolate_on_integral,
    set the default option False
  • check for the invalid combination interpolate_on_integral = False
    and interpolation_method = "distribute"

With another three if statements inside the for loop to cover these checks,
we end up at a cognitive complexity of 24.

Pydantic to the rescue

We might consider using a pydantic
model for the input validation.

Minimal start

We can start out with the simplest form of a pydantic model, with field types:

from pydantic import BaseModel


class InterpolationSetting(BaseModel):
    interpolation_factor: int
    interpolation_method: str
    interpolate_on_integral: bool

Pydantic models are simply classes inheriting from the BaseModel class. We can create an instance of the new class as:

InterpolationSetting(
    interpolation_factor=2, 
    interpolation_method="linear", 
    interpolate_on_integral=True
)

This automatically does two of the checks we had implemented:

  • interpolation_factor is an int
  • interpolate_on_integral is a bool

In the original script, the fields are in fact optional, i.e., it is possible to
provide no interpolation settings, in which case we do not do interpolation. We will
set the fields to optional later, and then implement the additional necessary checks.

We can verify the checks we have enforced now by supplying non-valid input:

from pydantic import ValidationError


try:
    InterpolationSetting(
        interpolation_factor="text",
        interpolation_method="linear",
        interpolate_on_integral=True,
    )
except ValidationError as e:
    print(e)

which outputs:

    1 validation error for InterpolationSetting
    interpolation_factor
      value is not a valid integer (type=type_error.integer)

Pydantic raises a ValidationError when the validation of the model fails, stating
which field, i.e. attribute, raised the error and why. In this case
interpolation_factor raised a
type error because the value "text" is not a valid integer. The validation is
performed on instantiation of an InterpolationSetting object.

Validation of single fields and combinations of fields

Our original code also had some additional requirements:

  • interpolation_factor should be greater than or equal to two.
  • interpolation_method must be chosen from a set of valid methods.
  • We do not allow the combination of interpolate_on_integral=False
    and interpolation_method="distribute"

The first restriction can be implemented using pydantic types. Pydantic provides many different types; we will use a constrained type for this requirement, namely conint, a constrained integer type providing automatic restrictions such as lower limits.

The remaining two restrictions can be implemented as validators. We decorate our validation
functions with the validator decorator. The input argument to the validator decorator is the name of the attribute(s)
to perform the validation for.

All validators are run automatically when we instantiate
an object of the InterpolationSetting class, as for the type checking.

Our
validation functions are class methods, and the first argument is the class,
not an instance of the class. The second argument is the value to validate, and
can be named as we wish. We implement two validators, method_is_valid and valid_combination_of_method_and_on_integral:

from typing import Dict

from pydantic import BaseModel, conint, validator, root_validator


class InterpolationSetting(BaseModel):
    interpolation_factor: conint(gt=1)
    interpolation_method: str
    interpolate_on_integral: bool

    @validator("interpolation_method")
    def method_is_valid(cls, method: str) -> str:
        allowed_set = {"repeat", "distribute", "linear", "cubic", "akima"}
        if method not in allowed_set:
            raise ValueError(f"must be in {allowed_set}, got '{method}'")
        return method

    @root_validator()
    def valid_combination_of_method_and_on_integral(cls, values: Dict) -> Dict:
        on_integral = values.get("interpolate_on_integral")
        method = values.get("interpolation_method")
        if on_integral is False and method == "distribute":
            raise ValueError(
                f"Invalid combination of interpolation_method "
                f"{method} and interpolate_on_integral {on_integral}"
            )
        return values

There are a few things to note here:

  • Validators should return a validated value. The validators are run
    sequentially, and populate the fields of the data model if they are valid.
  • Validators should only raise ValueError, TypeError or AssertionError.
    Pydantic will catch these errors to populate the ValidationError and raise
    one exception regardless of the number of errors found in validation. You can read
    more about error handling
    in the docs.
  • When we validate a field against another, we can use the root_validator, which
    runs validation on the entire model. Root validators are a little different: they have
    access to the values argument, which
    is a dictionary containing all fields that have already been validated. When the
    root validator runs, the interpolation_method may have failed to validate, in
    which case it will not be added to the values dictionary. Here, we handle that
    by using values.get("interpolation_method") which returns None if the key is
    not in values. The docs contain more information on root
    validators
    and
    field ordering,
    which is important to consider when we are using the values dictionary.

Again, we can verify by choosing input parameters to trigger the errors:

from pydantic import ValidationError


try:
    InterpolationSetting(
        interpolation_factor=1,
        interpolation_method="distribute",
        interpolate_on_integral=False,
    )
except ValidationError as e:
    print(e)

which outputs:

    2 validation errors for InterpolationSetting
    interpolation_factor
      ensure this value is greater than 1 (type=value_error.number.not_gt; limit_value=1)
    __root__
      Invalid combination of interpolation_method distribute and interpolate_on_integral False (type=value_error)

As we see, pydantic raises a single ValidationError regardless of the number of ValueErrors raised in our model.

Implementing dynamic defaults

We also had some default values if certain parameters were not given:

  • If an interpolation_factor is given, set the default value linear
    for interpolation_method if none is given.
  • If an interpolation_factor is given, set the default value False
    for interpolate_on_integral if none is given.

In this case, we have dynamic defaults dependent on other fields.

This can also be achieved with root validators, by returning a conditional value.
As this means validating one field against another, we must take care to ensure
our code runs whether or not the two fields have passed validation and been added to
the values dictionary. We will now also use
Optional types, because we will handle the cases where not all values are provided. We add the new validators set_method_given_interpolation_factor and set_on_integral_given_interpolation_factor:

from typing import Dict, Optional

from pydantic import BaseModel, conint, validator, root_validator


class InterpolationSetting(BaseModel):
    interpolation_factor: Optional[conint(gt=1)]
    interpolation_method: Optional[str]
    interpolate_on_integral: Optional[bool]

    @validator("interpolation_method")
    def method_is_valid(cls, method: Optional[str]) -> Optional[str]:
        allowed_set = {"repeat", "distribute", "linear", "cubic", "akima"}
        if method is not None and method not in allowed_set:
            raise ValueError(f"must be in {allowed_set}, got '{method}'")
        return method

    @root_validator()
    def valid_combination_of_method_and_on_integral(cls, values: Dict) -> Dict:
        on_integral = values.get("interpolate_on_integral")
        method = values.get("interpolation_method")
        if on_integral is False and method == "distribute":
            raise ValueError(
                f"Invalid combination of interpolation_method "
                f"{method} and interpolate_on_integral {on_integral}"
            )
        return values

    @root_validator()
    def set_method_given_interpolation_factor(cls, values: Dict) -> Dict:
        factor = values.get("interpolation_factor")
        method = values.get("interpolation_method")
        if method is None and factor is not None:
            values["interpolation_method"] = "linear"
        return values

    @root_validator()
    def set_on_integral_given_interpolation_factor(cls, values: Dict) -> Dict:
        on_integral = values.get("interpolate_on_integral")
        factor = values.get("interpolation_factor")
        if on_integral is None and factor is not None:
            values["interpolate_on_integral"] = False
        return values

We can verify that the default values are set only when interpolation_factor
is provided: running InterpolationSetting(interpolation_factor=3) returns:

InterpolationSetting(interpolation_factor=3, interpolation_method='linear', interpolate_on_integral=False)

whereas supplying no input parameters, InterpolationSetting(), returns a data model with all parameters set to None:

InterpolationSetting(interpolation_factor=None, interpolation_method=None, interpolate_on_integral=None)

Note: If we have static defaults, we can simply set them for the fields:

class InterpolationSetting(BaseModel):
    interpolation_factor: Optional[int] = 42

Final safeguard against typos

Finally, we had one more check in our previous script: that no unknown keys were provided. If we provide unknown keys to our data model now, nothing really happens; for example, InterpolationSetting(hello="world") outputs:

InterpolationSetting(interpolation_factor=None, interpolation_method=None, interpolate_on_integral=None)

Often, an unknown field name is the result of a typo
in the toml file. Therefore we want to raise an error to alert the user.
We do this using the model config, which controls the behaviour of the model. The extra attribute of the config determines what happens with extra fields. The default is ignore, as we can see in the example above, where the field is simply ignored and not added to the model (the allow option would add it). We can use the forbid option to raise an exception when extra fields are supplied.

from typing import Dict, Optional

from pydantic import BaseModel, conint, validator, root_validator


class InterpolationSetting(BaseModel):
    interpolation_factor: Optional[conint(gt=1)]
    interpolation_method: Optional[str]
    interpolate_on_integral: Optional[bool]

    class Config:
        extra = "forbid"

    @validator("interpolation_method")
    def method_is_valid(cls, method: Optional[str]) -> Optional[str]:
        allowed_set = {"repeat", "distribute", "linear", "cubic", "akima"}
        if method is not None and method not in allowed_set:
            raise ValueError(f"must be in {allowed_set}, got '{method}'")
        return method

    @root_validator()
    def valid_combination_of_method_and_on_integral(cls, values: Dict) -> Dict:
        on_integral = values.get("interpolate_on_integral")
        method = values.get("interpolation_method")
        if on_integral is False and method == "distribute":
            raise ValueError(
                f"Invalid combination of interpolation_method "
                f"{method} and interpolate_on_integral {on_integral}"
            )
        return values

    @root_validator()
    def set_method_given_interpolation_factor(cls, values: Dict) -> Dict:
        factor = values.get("interpolation_factor")
        method = values.get("interpolation_method")
        if method is None and factor is not None:
            values["interpolation_method"] = "linear"
        return values

    @root_validator()
    def set_on_integral_given_interpolation_factor(cls, values: Dict) -> Dict:
        on_integral = values.get("interpolate_on_integral")
        factor = values.get("interpolation_factor")
        if on_integral is None and factor is not None:
            values["interpolation_factor"] = False
        return values

If we try again with an unknown key, we now get a ValidationError:

from pydantic import ValidationError


try:
    InterpolationSetting(hello=True)
except ValidationError as e:
    print(e)

This raises a validation error for the unknown field:

1 validation error for InterpolationSetting
hello
  extra fields not permitted (type=value_error.extra)

Adapting our existing code is easy

Now we have implemented all our checks, and can go on to adapt our existing code to use the new data model. In our original implementation, we would do something like

params_in = toml.load(path_to_settings_file)
params_validated = validate_input_settings(params_in)
interpolate_result(params_validated)

We can replace the call to validate_input_settings with instantiation of the pydantic model: params_validated = InterpolationSetting(**params_in). Each pydantic data model has a .dict() method that returns the parameters as a dictionary, so we can pass it to interpolate_result directly: interpolate_result(params_validated.dict()). Another option is to refactor interpolate_result to use the attributes of the InterpolationSetting object, such as params_validated.interpolation_method, instead of the values of a dictionary.
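
A minimal sketch of the adapted call site (interpolate_result and path_to_settings_file are the same assumed placeholder names as in the snippet above):

import toml

params_in = toml.load(path_to_settings_file)
# Raises pydantic.ValidationError if the settings are invalid.
params_validated = InterpolationSetting(**params_in)
interpolate_result(params_validated.dict())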

Conclusion

In the end, we can replace one 43 line method (for the full functionality)
and cognitive complexity of 24 with one 40 line
class containing six methods, each with cognitive complexity less than 4.
The pydantic data models will not necessarily be shorter than the custom validation code they replace, and since there are a few quirks and concepts to pay attention to, they are not necessarily easier to read on the first try.

However, as we use the library for validation in our APIs, we are getting familiar with it, and it becomes easier to understand.

Some of the benefits of using pydantic for this are:

  • Type checking (and in fact also some
    type conversion),
    which we previously did ourselves, is now done automatically for us, saving us the work of repeating lines of error-prone code in many different functions.
  • Each validator has a name which, if we put a little thought into it, makes it very
    clear what we are trying to achieve. In our previous example, the purpose of each
    nested condition had to be deduced from the many if clauses and error messages. This
    should be a lot clearer now, especially if we use pydantic across different
    projects.
  • If speed is important, pydantic’s
    benchmarks show that it is
    fast compared to similar libraries.

Hopefully this will help you determine whether or not you should consider using pydantic models in your projects. In a later post, we will show how we use pydantic data models to share metadata between machine learning applications, and to share data requirements between the applications.

Form and field validation

Form validation happens when the data is cleaned. If you want to customize
this process, there are various places to make changes, each one serving a
different purpose. Three types of cleaning methods are run during form
processing. These are normally executed when you call the is_valid()
method on a form. There are other things that can also trigger cleaning and
validation (accessing the errors attribute or calling full_clean()
directly), but normally they won’t be needed.

In general, any cleaning method can raise ValidationError if there is a
problem with the data it is processing, passing the relevant information to
the ValidationError constructor. See below
for the best practice in raising ValidationError. If no ValidationError
is raised, the method should return the cleaned (normalized) data as a Python
object.

Most validation can be done using validators — helpers that can be reused.
Validators are functions (or callables) that take a single argument and raise
ValidationError on invalid input. Validators are run after the field’s
to_python and validate methods have been called.

Validation of a form is split into several steps, which can be customized or
overridden:

  • The to_python() method on a Field is the first step in every
    validation. It coerces the value to a correct datatype and raises
    ValidationError if that is not possible. This method accepts the raw
    value from the widget and returns the converted value. For example, a
    FloatField will turn the data into a Python float or raise a
    ValidationError.

  • The validate() method on a Field handles field-specific validation
    that is not suitable for a validator. It takes a value that has been
    coerced to a correct datatype and raises ValidationError on any error.
    This method does not return anything and shouldn’t alter the value. You
    should override it to handle validation logic that you can’t or don’t
    want to put in a validator.

  • The run_validators() method on a Field runs all of the field’s
    validators and aggregates all the errors into a single
    ValidationError. You shouldn’t need to override this method.

  • The clean() method on a Field subclass is responsible for running
    to_python(), validate(), and run_validators() in the correct
    order and propagating their errors. If, at any time, any of the methods
    raise ValidationError, the validation stops and that error is raised.
    This method returns the clean data, which is then inserted into the
    cleaned_data dictionary of the form.

  • The clean_<fieldname>() method is called on a form subclass – where
    <fieldname> is replaced with the name of the form field attribute.
    This method does any cleaning that is specific to that particular
    attribute, unrelated to the type of field that it is. This method is not
    passed any parameters. You will need to look up the value of the field
    in self.cleaned_data and remember that it will be a Python object
    at this point, not the original string submitted in the form (it will be
    in cleaned_data because the general field clean() method, above,
    has already cleaned the data once).

    For example, if you wanted to validate that the contents of a
    CharField called serialnumber was unique,
    clean_serialnumber() would be the right place to do this. You don’t
    need a specific field (it’s a CharField), but you want a
    formfield-specific piece of validation and, possibly, cleaning/normalizing
    the data.

    The return value of this method replaces the existing value in
    cleaned_data, so it must be the field’s value from cleaned_data (even
    if this method didn’t change it) or a new cleaned value.

  • The form subclass’s clean() method can perform validation that requires
    access to multiple form fields. This is where you might put in checks such as
    “if field A is supplied, field B must contain a valid email address”.
    This method can return a completely different dictionary if it wishes, which
    will be used as the cleaned_data.

    Since the field validation methods have been run by the time clean() is
    called, you also have access to the form’s errors attribute which
    contains all the errors raised by cleaning of individual fields.

    Note that any errors raised by your Form.clean() override will not
    be associated with any field in particular. They go into a special
    “field” (called __all__), which you can access via the
    non_field_errors() method if you need to. If you
    want to attach errors to a specific field in the form, you need to call
    add_error().

    Also note that there are special considerations when overriding
    the clean() method of a ModelForm subclass. (see the
    ModelForm documentation for more information)

These methods are run in the order given above, one field at a time. That is,
for each field in the form (in the order they are declared in the form
definition), the Field.clean() method (or its override) is run, then
clean_<fieldname>(). Finally, once those two methods are run for every
field, the Form.clean() method, or its override, is executed whether
or not the previous methods have raised errors.

Examples of each of these methods are provided below.

As mentioned, any of these methods can raise a ValidationError. For any
field, if the Field.clean() method raises a ValidationError, any
field-specific cleaning method is not called. However, the cleaning methods
for all remaining fields are still executed.

Raising ValidationError

In order to make error messages flexible and easy to override, consider the
following guidelines:

  • Provide a descriptive error code to the constructor:

    # Good
    ValidationError(_('Invalid value'), code='invalid')
    
    # Bad
    ValidationError(_('Invalid value'))
    
  • Don’t coerce variables into the message; use placeholders and the params
    argument of the constructor:

    # Good
    ValidationError(
        _('Invalid value: %(value)s'),
        params={'value': '42'},
    )
    
    # Bad
    ValidationError(_('Invalid value: %s') % value)
    
  • Use mapping keys instead of positional formatting. This enables putting
    the variables in any order or omitting them altogether when rewriting the
    message:

    # Good
    ValidationError(
        _('Invalid value: %(value)s'),
        params={'value': '42'},
    )
    
    # Bad
    ValidationError(
        _('Invalid value: %s'),
        params=('42',),
    )
    
  • Wrap the message with gettext to enable translation:

    # Good
    ValidationError(_('Invalid value'))
    
    # Bad
    ValidationError('Invalid value')
    

Putting it all together:

raise ValidationError(
    _('Invalid value: %(value)s'),
    code='invalid',
    params={'value': '42'},
)

Following these guidelines is particularly necessary if you write reusable
forms, form fields, and model fields.

While not recommended, if you are at the end of the validation chain
(i.e. your form clean() method) and you know you will never need
to override your error message you can still opt for the less verbose:

ValidationError(_('Invalid value: %s') % value)

The Form.errors.as_data() and
Form.errors.as_json() methods
greatly benefit from fully featured ValidationErrors (with a code name
and a params dictionary).

Raising multiple errors

If you detect multiple errors during a cleaning method and wish to signal all
of them to the form submitter, it is possible to pass a list of errors to the
ValidationError constructor.

As above, it is recommended to pass a list of ValidationError instances
with codes and params but a list of strings will also work:

# Good
raise ValidationError([
    ValidationError(_('Error 1'), code='error1'),
    ValidationError(_('Error 2'), code='error2'),
])

# Bad
raise ValidationError([
    _('Error 1'),
    _('Error 2'),
])

Using validation in practice

The previous sections explained how validation works in general for forms.
Since it can sometimes be easier to put things into place by seeing each
feature in use, here are a series of small examples that use each of the
previous features.

Using validators

Django’s form (and model) fields support use of utility functions and classes
known as validators. A validator is a callable object or function that takes a
value and returns nothing if the value is valid or raises a
ValidationError if not. These can be passed to a
field’s constructor, via the field’s validators argument, or defined on the
Field class itself with the default_validators
attribute.

Validators can be used to validate values inside the field; let’s have a look
at Django’s SlugField:

from django.core import validators
from django.forms import CharField

class SlugField(CharField):
    default_validators = [validators.validate_slug]

As you can see, SlugField is a CharField with a customized validator
that validates that submitted text obeys some character rules. This can also
be done on field definition so:

slug = forms.SlugField()

is equivalent to:

slug = forms.CharField(validators=[validators.validate_slug])

Common cases such as validating against an email or a regular expression can be
handled using existing validator classes available in Django. For example,
validators.validate_slug is an instance of
a RegexValidator constructed with the first
argument being the pattern: ^[-a-zA-Z0-9_]+$. See the section on
writing validators to see a list of what is already
available and for an example of how to write a validator.
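
As a brief sketch of what such a validator looks like (validate_even is a hypothetical example, not a built-in Django validator):

from django.core.exceptions import ValidationError

def validate_even(value):
    # A validator is just a callable taking one value and raising
    # ValidationError when that value is invalid.
    if value % 2 != 0:
        raise ValidationError(
            '%(value)s is not an even number',
            code='invalid',
            params={'value': value},
        )

It could then be attached to a form field via the validators argument, e.g. even_field = forms.IntegerField(validators=[validate_even]).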

Form field default cleaning

Let’s first create a custom form field that validates its input is a string
containing comma-separated email addresses. The full class looks like this:

from django import forms
from django.core.validators import validate_email

class MultiEmailField(forms.Field):
    def to_python(self, value):
        """Normalize data to a list of strings."""
        # Return an empty list if no input was given.
        if not value:
            return []
        return value.split(',')

    def validate(self, value):
        """Check if value consists only of valid emails."""
        # Use the parent's handling of required fields, etc.
        super().validate(value)
        for email in value:
            validate_email(email)

Every form that uses this field will have these methods run before anything
else can be done with the field’s data. This is cleaning that is specific to
this type of field, regardless of how it is subsequently used.

Let’s create a ContactForm to demonstrate how you’d use this field:

class ContactForm(forms.Form):
    subject = forms.CharField(max_length=100)
    message = forms.CharField()
    sender = forms.EmailField()
    recipients = MultiEmailField()
    cc_myself = forms.BooleanField(required=False)

Use MultiEmailField like any other form field. When the is_valid()
method is called on the form, the MultiEmailField.clean() method will be
run as part of the cleaning process and it will, in turn, call the custom
to_python() and validate() methods.
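
For instance, a short sketch (assuming a configured Django project) of binding
data to the form and letting the cleaning chain run:

form = ContactForm(data={
    "subject": "hello",
    "message": "Hi there",
    "sender": "sender@example.com",
    "recipients": "one@example.com,two@example.com",
    "cc_myself": "on",
})
if form.is_valid():
    # to_python() has already split the string into a list and
    # validate() has checked each address.
    print(form.cleaned_data["recipients"])
    # e.g. ['one@example.com', 'two@example.com']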

Cleaning a specific field attribute¶

Continuing on from the previous example, suppose that in our ContactForm,
we want to make sure that the recipients field always contains the address
"fred@example.com". This is validation that is specific to our form, so we
don’t want to put it into the general MultiEmailField class. Instead, we
write a cleaning method that operates on the recipients field, like so:

from django import forms
from django.core.exceptions import ValidationError

class ContactForm(forms.Form):
    # Everything as before.
    ...

    def clean_recipients(self):
        data = self.cleaned_data['recipients']
        if "fred@example.com" not in data:
            raise ValidationError("You have forgotten about Fred!")

        # Always return a value to use as the new cleaned data, even if
        # this method didn't change it.
        return data

Cleaning and validating fields that depend on each other¶

Suppose we add another requirement to our contact form: if the cc_myself
field is True, the subject must contain the word "help". We are
performing validation on more than one field at a time, so the form’s
clean() method is a good spot to do this. Notice that we are
talking about the clean() method on the form here, whereas earlier we were
writing a clean() method on a field. It’s important to keep the field and
form difference clear when working out where to validate things. Fields are
single data points, forms are a collection of fields.

By the time the form’s clean() method is called, all the individual field
clean methods will have been run (the previous two sections), so
self.cleaned_data will be populated with any data that has survived so
far. You also need to remember to allow for the fact that the fields you
want to validate might not have survived the initial individual field checks.

There are two ways to report any errors from this step. Probably the most
common method is to display the error at the top of the form. To create such
an error, you can raise a ValidationError from the clean() method. For
example:

from django import forms
from django.core.exceptions import ValidationError

class ContactForm(forms.Form):
    # Everything as before.
    ...

    def clean(self):
        cleaned_data = super().clean()
        cc_myself = cleaned_data.get("cc_myself")
        subject = cleaned_data.get("subject")

        if cc_myself and subject:
            # Only do something if both fields are valid so far.
            if "help" not in subject:
                raise ValidationError(
                    "Did not send for 'help' in the subject despite "
                    "CC'ing yourself."
                )

In this code, if the validation error is raised, the form will display an
error message at the top of the form (normally) describing the problem. Such
errors are non-field errors, which are displayed in the template with
{{ form.non_field_errors }}.

The call to super().clean() in the example code ensures that any validation
logic in parent classes is maintained. If your form inherits another that
doesn’t return a cleaned_data dictionary in its clean() method (doing
so is optional), then don’t assign cleaned_data to the result of the
super() call and use self.cleaned_data instead:

def clean(self):
    super().clean()
    cc_myself = self.cleaned_data.get("cc_myself")
    ...

The second approach for reporting validation errors might involve assigning the
error message to one of the fields. In this case, let’s assign an error message
to both the “subject” and “cc_myself” rows in the form display. Be careful when
doing this in practice, since it can lead to confusing form output. We’re
showing what is possible here and leaving it up to you and your designers to
work out what works effectively in your particular situation. Our new code
(replacing the previous sample) looks like this:

from django import forms

class ContactForm(forms.Form):
    # Everything as before.
    ...

    def clean(self):
        cleaned_data = super().clean()
        cc_myself = cleaned_data.get("cc_myself")
        subject = cleaned_data.get("subject")

        if cc_myself and subject and "help" not in subject:
            msg = "Must put 'help' in subject when cc'ing yourself."
            self.add_error('cc_myself', msg)
            self.add_error('subject', msg)

The second argument of add_error() can be a string, or preferably an
instance of ValidationError. See Raising ValidationError for more
details. Note that add_error() automatically removes the field from
cleaned_data.
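
A quick sketch of that removal in action (assuming the ContactForm above and a
configured Django project):

form = ContactForm(data={
    "subject": "no keyword here",
    "message": "Hi",
    "sender": "me@example.com",
    "recipients": "fred@example.com",
    "cc_myself": "on",
})
form.is_valid()                          # False
print(form.errors["subject"])            # the message added via add_error()
print("subject" in form.cleaned_data)    # False: add_error() removed it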


In pydantic, custom validation and complex relationships between objects can be achieved using the validator decorator.

from pydantic import BaseModel, ValidationError, validator


class UserModel(BaseModel):
    name: str
    username: str
    password1: str
    password2: str

    @validator('name')
    def name_must_contain_space(cls, v):
        if ' ' not in v:
            raise ValueError('must contain a space')
        return v.title()

    @validator('password2')
    def passwords_match(cls, v, values, **kwargs):
        if 'password1' in values and v != values['password1']:
            raise ValueError('passwords do not match')
        return v

    @validator('username')
    def username_alphanumeric(cls, v):
        assert v.isalnum(), 'must be alphanumeric'
        return v


user = UserModel(
    name='samuel colvin',
    username='scolvin',
    password1='zxcvbn',
    password2='zxcvbn',
)
print(user)
#> name='Samuel Colvin' username='scolvin' password1='zxcvbn' password2='zxcvbn'

try:
    UserModel(
        name='samuel',
        username='scolvin',
        password1='zxcvbn',
        password2='zxcvbn2',
    )
except ValidationError as e:
    print(e)
    """
    2 validation errors for UserModel
    name
      must contain a space (type=value_error)
    password2
      passwords do not match (type=value_error)
    """

You need to be aware of these validator behaviours.

  • Validators are "class methods", so the first argument value they receive is the UserModel class, not an instance of UserModel.
  • The second argument is always the field value to validate; it can be named as you please.
  • You can also add any subset of the following arguments to the signature (the names must match); see the sketch after this list:
    • values: a dict containing the name-to-value mapping of any previously-validated fields.
    • config: the model config.
    • field: the field being validated.
    • **kwargs: if provided, this will include the arguments above not explicitly listed in the signature.
  • Validators should either return the parsed value or raise a ValueError, TypeError, or AssertionError (assert statements may be used).
  • Where validators rely on other values, you should be aware that:
    • Validation is done in the order fields are defined.
    • If validation fails on another field (or that field is missing) it will not be included in values, hence if 'password1' in values and ... in this example.
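
A sketch of a validator opting into some of these extra arguments (the Order
model is made up for illustration; field is pydantic v1's ModelField object for
the field being validated):

from pydantic import BaseModel, validator


class Order(BaseModel):
    quantity: int
    total: float

    @validator('total')
    def check_total(cls, v, values, field, **kwargs):
        # `values` holds previously validated fields; `field.name` is 'total'.
        if values.get('quantity') == 0 and v != 0:
            raise ValueError(f'{field.name} must be 0 when quantity is 0')
        return v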

Pre and per-item validators⚑

Validators can do a few more complex things:

  • A single validator can be applied to multiple fields by passing it multiple field names.
  • A single validator can also be called on all fields by passing the special value '*'.
  • The keyword argument pre will cause the validator to be called prior to other validation.
  • Passing each_item=True will result in the validator being applied to individual values (e.g. of List, Dict, Set, etc.), rather than the whole object.

from typing import List
from pydantic import BaseModel, ValidationError, validator


class DemoModel(BaseModel):
    square_numbers: List[int] = []
    cube_numbers: List[int] = []

    # '*' is the same as 'cube_numbers', 'square_numbers' here:
    @validator('*', pre=True)
    def split_str(cls, v):
        if isinstance(v, str):
            return v.split('|')
        return v

    @validator('cube_numbers', 'square_numbers')
    def check_sum(cls, v):
        if sum(v) > 42:
            raise ValueError('sum of numbers greater than 42')
        return v

    @validator('square_numbers', each_item=True)
    def check_squares(cls, v):
        assert v ** 0.5 % 1 == 0, f'{v} is not a square number'
        return v

    @validator('cube_numbers', each_item=True)
    def check_cubes(cls, v):
        # 64 ** (1 / 3) == 3.9999999999999996 (!)
        # this is not a good way of checking cubes
        assert v ** (1 / 3) % 1 == 0, f'{v} is not a cubed number'
        return v


print(DemoModel(square_numbers=[1, 4, 9]))
#> square_numbers=[1, 4, 9] cube_numbers=[]
print(DemoModel(square_numbers='1|4|16'))
#> square_numbers=[1, 4, 16] cube_numbers=[]
print(DemoModel(square_numbers=[16], cube_numbers=[8, 27]))
#> square_numbers=[16] cube_numbers=[8, 27]
try:
    DemoModel(square_numbers=[1, 4, 2])
except ValidationError as e:
    print(e)
    """
    1 validation error for DemoModel
    square_numbers -> 2
      2 is not a square number (type=assertion_error)
    """

try:
    DemoModel(cube_numbers=[27, 27])
except ValidationError as e:
    print(e)
    """
    1 validation error for DemoModel
    cube_numbers
      sum of numbers greater than 42 (type=value_error)
    """

Subclass Validators and each_item

If using a validator with a subclass that references a List type field on a parent class, each_item=True will cause the validator not to run; instead, the list must be iterated over programmatically.

from typing import List
from pydantic import BaseModel, ValidationError, validator


class ParentModel(BaseModel):
    names: List[str]


class ChildModel(ParentModel):
    @validator('names', each_item=True)
    def check_names_not_empty(cls, v):
        assert v != '', 'Empty strings are not allowed.'
        return v


# This will NOT raise a ValidationError because the validator was not called
try:
    child = ChildModel(names=['Alice', 'Bob', 'Eve', ''])
except ValidationError as e:
    print(e)
else:
    print('No ValidationError caught.')
    #> No ValidationError caught.


class ChildModel2(ParentModel):
    @validator('names')
    def check_names_not_empty(cls, v):
        for name in v:
            assert name != '', 'Empty strings are not allowed.'
        return v


try:
    child = ChildModel2(names=['Alice', 'Bob', 'Eve', ''])
except ValidationError as e:
    print(e)
    """
    1 validation error for ChildModel2
    names
      Empty strings are not allowed. (type=assertion_error)
    """

Validate Always⚑

For performance reasons, by default validators are not called for fields when a value is not supplied. However there are situations where it may be useful or required to always call the validator, e.g. to set a dynamic default value.

from datetime import datetime

from pydantic import BaseModel, validator


class DemoModel(BaseModel):
    ts: datetime = None

    @validator('ts', pre=True, always=True)
    def set_ts_now(cls, v):
        return v or datetime.now()


print(DemoModel())
#> ts=datetime.datetime(2020, 7, 15, 20, 1, 48, 966302)
print(DemoModel(ts='2017-11-08T14:00'))
#> ts=datetime.datetime(2017, 11, 8, 14, 0)

You’ll often want to use this together with pre, since otherwise with always=True pydantic would try to validate the default None which would cause an error.

Reuse validators⚑

Occasionally, you will want to use the same validator on multiple fields/models (e.g. to normalize some input data). The "naive" approach would be to write a separate function, then call it from multiple decorators. Obviously, this entails a lot of repetition and boilerplate code. To circumvent this, the allow_reuse parameter was added to pydantic.validator in v1.2 (False by default):

from pydantic import BaseModel, validator


def normalize(name: str) -> str:
    return ' '.join((word.capitalize()) for word in name.split(' '))


class Producer(BaseModel):
    name: str

    # validators
    _normalize_name = validator('name', allow_reuse=True)(normalize)


class Consumer(BaseModel):
    name: str

    # validators
    _normalize_name = validator('name', allow_reuse=True)(normalize)


jane_doe = Producer(name='JaNe DOE')
john_doe = Consumer(name='joHN dOe')
assert jane_doe.name == 'Jane Doe'
assert john_doe.name == 'John Doe'

As you can see, repetition has been reduced and the models are again almost declarative.

Tip

If you have a lot of fields that you want to validate, it usually makes sense to define a helper function so you avoid setting allow_reuse=True over and over again.
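
For example, a sketch of such a helper (reuse_normalize is a made-up name):

from pydantic import BaseModel, validator


def normalize(name: str) -> str:
    return ' '.join(word.capitalize() for word in name.split(' '))


def reuse_normalize(*field_names):
    # Wrap validator() so allow_reuse=True is written only once.
    return validator(*field_names, allow_reuse=True)(normalize)


class Producer(BaseModel):
    name: str

    _normalize_name = reuse_normalize('name')


class Consumer(BaseModel):
    name: str

    _normalize_name = reuse_normalize('name')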

Root Validators⚑

Validation can also be performed on the entire model’s data.

from pydantic import BaseModel, ValidationError, root_validator


class UserModel(BaseModel):
    username: str
    password1: str
    password2: str

    @root_validator(pre=True)
    def check_card_number_omitted(cls, values):
        assert 'card_number' not in values, 'card_number should not be included'
        return values

    @root_validator
    def check_passwords_match(cls, values):
        pw1, pw2 = values.get('password1'), values.get('password2')
        if pw1 is not None and pw2 is not None and pw1 != pw2:
            raise ValueError('passwords do not match')
        return values


print(UserModel(username='scolvin', password1='zxcvbn', password2='zxcvbn'))
#> username='scolvin' password1='zxcvbn' password2='zxcvbn'
try:
    UserModel(username='scolvin', password1='zxcvbn', password2='zxcvbn2')
except ValidationError as e:
    print(e)
    """
    1 validation error for UserModel
    __root__
      passwords do not match (type=value_error)
    """

try:
    UserModel(
        username='scolvin',
        password1='zxcvbn',
        password2='zxcvbn',
        card_number='1234',
    )
except ValidationError as e:
    print(e)
    """
    1 validation error for UserModel
    __root__
      card_number should not be included (type=assertion_error)
    """

As with field validators, root validators can have pre=True, in which case they’re called before field validation occurs (and are provided with the raw input data), or pre=False (the default), in which case they’re called after field validation.

Field validation will not occur if pre=True root validators raise an error. As with field validators, "post" (i.e. pre=False) root validators by default will be called even if prior validators fail; this behaviour can be changed by setting the skip_on_failure=True keyword argument to the validator. The values argument will be a dict containing the values which passed field validation and field defaults where applicable.
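
A short sketch of the skip_on_failure option (the Payment model is made up):

from pydantic import BaseModel, root_validator


class Payment(BaseModel):
    amount: int
    currency: str

    # Only runs when every field validated successfully, so both keys
    # are guaranteed to be present in `values`.
    @root_validator(skip_on_failure=True)
    def amount_must_be_positive(cls, values):
        if values['amount'] <= 0:
            raise ValueError('amount must be positive')
        return values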

Field Checks⚑

On class creation, validators are checked to confirm that the fields they specify actually exist on the model.

Occasionally however this is undesirable: e.g. if you define a validator to validate fields on inheriting models. In this case you should set check_fields=False on the validator.
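
A sketch of that situation (the models here are made up):

from pydantic import BaseModel, validator


class NormalizedName(BaseModel):
    # `name` is not defined here; inheriting models are expected to declare it.
    @validator('name', check_fields=False)
    def strip_name(cls, v):
        return v.strip()


class Person(NormalizedName):
    name: str


assert Person(name='  Ada  ').name == 'Ada'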

Dataclass Validators⚑

Validators also work with pydantic dataclasses.

from datetime import datetime

from pydantic import validator
from pydantic.dataclasses import dataclass


@dataclass
class DemoDataclass:
    ts: datetime = None

    @validator('ts', pre=True, always=True)
    def set_ts_now(cls, v):
        return v or datetime.now()


print(DemoDataclass())
#> DemoDataclass(ts=datetime.datetime(2020, 7, 15, 20, 1, 48, 969037))
print(DemoDataclass(ts='2017-11-08T14:00'))
#> DemoDataclass(ts=datetime.datetime(2017, 11, 8, 14, 0))

Troubleshooting validators⚑

pylint complains on the validators⚑

Pylint complains that R0201: Method could be a function and N805: first argument of a method should be named 'self'. This appears to be a pylint false positive; people have solved it by adding the @classmethod decorator between the validator decorator and the method definition.
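
In code, that workaround looks like this (a sketch; the model is made up):

from datetime import datetime

from pydantic import BaseModel, validator


class DemoModel(BaseModel):
    ts: datetime = None

    @validator('ts', pre=True, always=True)
    @classmethod  # silences pylint; per the note above, pydantic accepts an explicit classmethod here
    def set_ts_now(cls, v):
        return v or datetime.now()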

