Jan 6, 2017

Django Under The Hood 2016: Validation

One of the many great perks of working for Kiwi.com is that the company gives you opportunities to get as much education and self-improvement as you can. That means attending conferences!

Here’s what I learned from Loïc Bistuer’s talk about Django validation at Django Under The Hood.

What are the main concerns with validation?

When it comes to validation, we all have similar concerns.

First, there is enforcement: how can you ensure that your validation logic runs? Then there is user experience: how do you give users good feedback about what went wrong? Then performance: how do you make sure that the validation logic isn’t wasteful? And finally, convenience: is the logic easy to change?

Some concerns go well together. Enforcement and user experience like each other. You don’t want incorrect data and you want good feedback. Validation helps with that.

But user experience and performance are harder to combine. Checks take time.

Similarly, it’s harder to combine user experience and developer convenience. Why do you have to check anything on the backend when you already checked it on the frontend? It means extra work.

Where to validate data and how?

This is a tough question. You can put validation code into any of the five parts of your application listed below, and none of them is the only correct place.

1. Frontend

It’s fast, provides direct feedback and is great for UX. The drawback is that, for security reasons, the validation has to be duplicated on the backend: you cannot trust anything coming from the frontend.

2. Django view

Not the most elegant way of doing validation, but it runs on the server side and can provide feedback by returning an HTTP error. It’s also easy to circumvent. For example, you can create a new object in the console, fill it with incorrect data, save it, and boom! You have a new invalid object in your database.

3. Django forms / DRF serializer

Just like views, they are very easy to circumvent. Imagine that you have a model form with excellent test coverage, but then your colleague adds a model admin for the same model and all your validation logic is bypassed. On the other hand, they are the designated tools for handling user submissions: they have dedicated APIs to validate data and to handle errors.

4. Model

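Model validation does not run automatically when you call save(); out of the box it only runs partially, for example when a model form validates its instance. It is fairly easy to enforce by calling full_clean() in the save() method, which makes it much harder to circumvent. Bulk-creating objects bypasses the save() method, though. Most of the time it’s redundant, as a lot of the validation already happens in the form.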
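The trade-off can be sketched without Django. The snippet below is a plain-Python stand-in, not Django’s Model class: Ticket, its storage list and the ValidationError class are hypothetical names used only for illustration. It shows how validating inside save() blocks invalid data, while a bulk path that skips save() lets it through.

```python
class ValidationError(Exception):
    """Stand-in for django.core.exceptions.ValidationError."""


class Ticket:
    storage = []  # stands in for the database table

    def __init__(self, price):
        self.price = price

    def full_clean(self):
        if self.price < 0:
            raise ValidationError("Price cannot be negative.")

    def save(self):
        self.full_clean()  # validate on every save
        self.storage.append(self)

    @classmethod
    def bulk_create(cls, tickets):
        # Mimics a bulk insert: rows are written without calling save().
        cls.storage.extend(tickets)


try:
    Ticket(-5).save()
    rejected = False
except ValidationError:
    rejected = True  # save() refused the invalid ticket

# The bulk path never calls save(), so the same invalid ticket is stored.
Ticket.bulk_create([Ticket(-5)])
```

After running this, rejected is True, yet storage still holds one invalid ticket that arrived through the bulk path.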

5. Database

It’s designed for the task and is always enforced. A huge advantage of database validation over model validation is performance. However, it is backend-specific, harder to write, harder to audit and harder to maintain.
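As a rough illustration of database-level enforcement, here is a minimal sketch using SQLite from Python’s standard library rather than Django. The booking table, its seats column and the insert_booking helper are made up for this example. The CHECK constraint is applied by the database itself, no matter which code path writes the row.

```python
import sqlite3

# In-memory database; table and column names are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE booking ("
    " id INTEGER PRIMARY KEY,"
    " seats INTEGER NOT NULL CHECK (seats > 0)"
    ")"
)


def insert_booking(seats):
    """Return True if the database accepted the row, False if it rejected it."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute("INSERT INTO booking (seats) VALUES (?)", (seats,))
        return True
    except sqlite3.IntegrityError:
        # Raised when the CHECK or NOT NULL constraint fails.
        return False


accepted = insert_booking(2)
rejected = not insert_booking(0)
row_count = conn.execute("SELECT COUNT(*) FROM booking").fetchone()[0]
```

Only the valid row ends up in the table. The constraint syntax shown here is SQLite’s, which is one reason database validation is backend-specific.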

Field validation

Fields are the cornerstone of Django validation: they contain most of the validation logic. The first thing about a field is its type. Type validation is performed by the to_python() method, which coerces the value to the correct data type and raises a ValidationError if that is not possible.

Then there is presence validation: required on form fields (True by default) and blank on model fields (False by default).

And similarly, bounds validation: min_length=None, max_length=None on text fields or min_value=None, max_value=None on integer fields.

Those are all set on the fields but you can also write your own validation.

from django.core.exceptions import ValidationError

def validate_even(value):
    if value % 2 != 0:
        raise ValidationError('Value is not an even number')

IntegerField(validators=[validate_even])

The great thing is that you can even customise the error messages raised by validation with a dict. If a ValidationError is raised, the validation mechanism looks for a code attribute on the error. This code is used as the key that’s looked up in the dict of error messages.

class Field(object):
    default_error_messages = {
        'required': _('This field is required'),
    }

    def __init__(self, error_messages=None, ...):
        messages = {}
        for c in reversed(self.__class__.__mro__):
            messages.update(
                getattr(c, 'default_error_messages', {})
            )
        messages.update(error_messages or {})
        self.error_messages = messages
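To see the merging order in action, here is a runnable simplification of the excerpt above. It drops the translation function and the elided constructor arguments. The IntegerField subclass and the message texts are hypothetical, chosen only to show which value wins.

```python
class Field:
    default_error_messages = {"required": "This field is required."}

    def __init__(self, error_messages=None):
        messages = {}
        # Walk the class hierarchy from the base class down, so that
        # subclasses override the defaults of their parents.
        for c in reversed(self.__class__.__mro__):
            messages.update(getattr(c, "default_error_messages", {}))
        # Per-instance overrides win over every class-level default.
        messages.update(error_messages or {})
        self.error_messages = messages


class IntegerField(Field):
    # Hypothetical subclass adding its own default message.
    default_error_messages = {"invalid": "Enter a whole number."}


field = IntegerField(error_messages={"required": "Please fill this in."})
```

Here field.error_messages ends up with the "invalid" default from the subclass and the "required" message passed to the constructor.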

Now that we’ve declared a field, we can trigger its validation cycle, which is done by calling its clean() method.

def clean(self, value):
    value = self.to_python(value)
    self.validate(value)
    self.run_validators(value)
    return value

It starts by calling to_python(), which is responsible for enforcing the data type. Then it calls validate(), which performs the presence validation and any validation that is specific to the field. Finally it calls run_validators(), which loops through every member of self.validators, calls each one with the value as its argument, and collects all the resulting errors in a list to raise them together at the end.
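The three steps can be sketched in plain Python. This is a simplified stand-in that mimics the cycle described above, not Django’s actual implementation: the ValidationError class, the IntegerField class and the validate_positive validator are assumptions made for the example.

```python
class ValidationError(Exception):
    """Stand-in for django.core.exceptions.ValidationError."""

    def __init__(self, messages):
        # Accept a single message or a list of messages.
        self.messages = messages if isinstance(messages, list) else [messages]
        super().__init__(self.messages)


def validate_even(value):
    if value % 2 != 0:
        raise ValidationError("Value is not an even number")


def validate_positive(value):
    if value <= 0:
        raise ValidationError("Value is not positive")


class IntegerField:
    def __init__(self, required=True, validators=()):
        self.required = required
        self.validators = list(validators)

    def to_python(self, value):
        # Type validation: coerce to int or fail.
        if value in (None, ""):
            return None
        try:
            return int(value)
        except (TypeError, ValueError):
            raise ValidationError("Enter a whole number.")

    def validate(self, value):
        # Presence validation.
        if value is None and self.required:
            raise ValidationError("This field is required.")

    def run_validators(self, value):
        if value is None:
            return
        errors = []
        for validator in self.validators:
            try:
                validator(value)
            except ValidationError as e:
                errors.extend(e.messages)  # keep going, collect everything
        if errors:
            raise ValidationError(errors)

    def clean(self, value):
        value = self.to_python(value)
        self.validate(value)
        self.run_validators(value)
        return value


field = IntegerField(validators=[validate_even, validate_positive])
cleaned = field.clean("42")  # coerced from str to int

try:
    field.clean("-3")
    collected = []
except ValidationError as e:
    collected = e.messages  # both validators failed, both are reported
```

Collecting the errors instead of stopping at the first one means the user sees every problem with a value at once.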

Triggering form validation

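The most common way to trigger field validation in forms is to call the is_valid() method. It runs validation and returns a boolean indicating whether the data is valid. You don’t have to call clean() yourself: is_valid() triggers the full validation cycle, including each field’s clean() and the form’s clean().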

The clean() method returns cleaned_data, so once you’ve created a form instance with a set of data and validated it, you can access the clean data via its cleaned_data attribute:

>>> data = {'subject': 'hello',
...         'message': 'Hi there',
...         'sender': 'foo@example.com',
...         'cc_myself': True}
>>> f = ContactForm(data)
>>> f.is_valid()
True
>>> f.cleaned_data
{'cc_myself': True, 'message': 'Hi there', 'sender': 'foo@example.com', 'subject': 'hello'}

The cleaned_data dictionary contains only the fields that passed validation.
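What happens to cleaned_data when one field fails can be illustrated with a small stand-in form. This is not Django’s Form class: the cleaning functions, the simplistic email check and this ContactForm are hypothetical and only mimic the behaviour described above.

```python
class ValidationError(Exception):
    """Stand-in for django.core.exceptions.ValidationError."""


def clean_text(value):
    if not value:
        raise ValidationError("This field is required.")
    return str(value)


def clean_email(value):
    value = clean_text(value)
    if "@" not in value:  # deliberately naive check, for illustration only
        raise ValidationError("Enter a valid email address.")
    return value


class ContactForm:
    # Field name -> cleaning function (stand-ins for Django form fields).
    fields = {"subject": clean_text, "message": clean_text, "sender": clean_email}

    def __init__(self, data):
        self.data = data
        self.errors = None  # validation has not run yet

    def full_clean(self):
        self.errors = {}
        self.cleaned_data = {}
        for name, clean in self.fields.items():
            try:
                self.cleaned_data[name] = clean(self.data.get(name))
            except ValidationError as e:
                self.errors[name] = [str(e)]

    def is_valid(self):
        if self.errors is None:
            self.full_clean()
        return not self.errors


f = ContactForm({"subject": "hello", "message": "Hi there", "sender": "not-an-email"})
valid = f.is_valid()
```

In this sketch valid is False, f.errors names the sender field, and f.cleaned_data holds only subject and message.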

Model validation

Model validation is similar to form validation, but there are a few differences. First, there is no is_valid() method. So if you want to check if a model is valid, you can either:

  • Validate the model fields — Model.clean_fields()
  • Validate the model as a whole — Model.clean()
  • Validate the field uniqueness — Model.validate_unique()

Or you can call the model’s full_clean() method directly, which performs all three of these steps.

Model validation is great if you want custom validation on your model fields, e.g. if you need to check relationships between model attributes.

You can also exclude some fields from validation and validate unique constraints.

def full_clean(self, exclude=None, validate_unique=True):
    errors = {}
    try:
        self.clean_fields(exclude=exclude)
    except ValidationError as e:
        errors = e.update_error_dict(errors)
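The way full_clean() gathers errors from several steps and honours exclude can be sketched in plain Python. This is a simplified stand-in and not Django’s implementation: the Flight class, its fields and the collect_errors helper are hypothetical, and the uniqueness step is left out because it needs a database. The "__all__" key mirrors the one Django uses for errors that do not belong to a single field.

```python
NON_FIELD_ERRORS = "__all__"


class ValidationError(Exception):
    """Stand-in carrying a dict of field name -> list of messages."""

    def __init__(self, error_dict):
        super().__init__(error_dict)
        self.error_dict = error_dict


class Flight:
    def __init__(self, seats, departure_hour, arrival_hour):
        self.seats = seats
        self.departure_hour = departure_hour
        self.arrival_hour = arrival_hour

    def clean_fields(self, exclude=None):
        # Per-field checks, skipping any excluded field.
        exclude = exclude or []
        if "seats" not in exclude and self.seats <= 0:
            raise ValidationError({"seats": ["Seats must be positive."]})

    def clean(self):
        # Whole-model check involving more than one attribute.
        if self.arrival_hour <= self.departure_hour:
            raise ValidationError(
                {NON_FIELD_ERRORS: ["Arrival must be after departure."]}
            )

    def full_clean(self, exclude=None):
        errors = {}
        try:
            self.clean_fields(exclude=exclude)
        except ValidationError as e:
            errors.update(e.error_dict)
        # clean() still runs even if field validation failed.
        try:
            self.clean()
        except ValidationError as e:
            errors.update(e.error_dict)
        if errors:
            raise ValidationError(errors)


def collect_errors(flight, exclude=None):
    """Return the error dict raised by full_clean(), or {} if valid."""
    try:
        flight.full_clean(exclude=exclude)
        return {}
    except ValidationError as e:
        return e.error_dict


all_errors = collect_errors(Flight(0, 10, 9))
without_seats = collect_errors(Flight(0, 10, 9), exclude=["seats"])
```

Here all_errors reports both the seats problem and the whole-model problem, while without_seats contains only the whole-model error.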

Final tips

  • Validate on your frontend if you can afford it: it’s fast and UX friendly.
  • Validate in the database for anything mission-critical.
  • Pick the spot where it is handiest for you — forms, models, fields, model forms, etc.

And also, look at the Django REST framework when thinking about validation. DRF did many things right.
