Chapter 2: Fields and Identity
In the previous chapter we created a simple Book aggregate with a handful
of fields. Now let's explore the full field system — the types available,
the options you can set, and how identity works in Protean.
Enriching the Book Aggregate
Our Book currently has just title, author, isbn, and price. A real
bookstore needs more. Let's add several new fields to demonstrate the range
of types available:
from enum import Enum
from protean import Domain
from protean.fields import Boolean, Date, Float, Integer, List, String, Text
domain = Domain()
class Genre(Enum):
FICTION = "FICTION"
NON_FICTION = "NON_FICTION"
SCIENCE = "SCIENCE"
HISTORY = "HISTORY"
BIOGRAPHY = "BIOGRAPHY"
FANTASY = "FANTASY"
MYSTERY = "MYSTERY"
@domain.aggregate
class Book:
title = String(max_length=200, required=True)
author = String(max_length=150, required=True)
isbn = String(max_length=13)
price = Float()
description = Text()
publication_date = Date()
page_count = Integer()
in_print = Boolean(default=True)
genre = String(max_length=20, choices=Genre)
tags = List(content_type=String)
We have added:
Textfordescription— long-form text, unlikeStringwhich has amax_lengthcap.Dateforpublication_date— a date without time.Integerforpage_count— whole numbers.Booleanforin_print— true/false with a default ofTrue.Listfortags— a list of strings. Thecontent_typeparameter specifies the type of each element.
These cover the most common field types. Protean also provides DateTime
(date with time), Float (which we already use for price), and Dict
(key-value pairs). See the
Fields reference for the
complete list.
Field Options
Every field accepts a set of common options:
| Option | Description | Example |
|---|---|---|
required |
Must be provided on creation | title = String(required=True) |
default |
Value when not provided | in_print = Boolean(default=True) |
max_length |
Maximum string length | isbn = String(max_length=13) |
choices |
Restrict to a set of values | genre = String(choices=Genre) |
Constraining with Choices
The genre field uses a Python Enum to restrict valid values:
FICTION = "FICTION"
NON_FICTION = "NON_FICTION"
SCIENCE = "SCIENCE"
HISTORY = "HISTORY"
BIOGRAPHY = "BIOGRAPHY"
FANTASY = "FANTASY"
MYSTERY = "MYSTERY"
Attempting to create a book with an invalid genre raises a ValidationError:
>>> Book(title="Test", author="Test", genre="ROMANCE")
ValidationError: {'genre': ["Value 'ROMANCE' is not a valid choice. ..."]}
Using enums for choices makes your code self-documenting and catches invalid values at the domain boundary.
What Happens When Validation Fails
When you provide invalid data — a missing required field, a string
exceeding max_length, or an invalid choice — Protean raises a
ValidationError with a dictionary mapping field names to error messages:
>>> from protean.exceptions import ValidationError
>>> try:
... Book(author="Unknown") # title is required
... except ValidationError as e:
... print(e.messages)
{'title': ['is required']}
Validation happens at creation time, so invalid aggregates never enter your domain.
Identity
Every aggregate needs a unique identifier. By default, Protean
auto-generates a UUID string as the id field:
>>> book = Book(title="Dune", author="Frank Herbert")
>>> book.id
'a3b2c1d0-e5f6-7890-abcd-ef1234567890'
The Identifier Field
You can also mark a specific field as the identifier using
identifier=True:
@domain.aggregate
class Book:
isbn = String(max_length=13, identifier=True)
title = String(max_length=200, required=True)
# ...
With this, isbn becomes the primary identity — no id field is
auto-generated. This is useful when your domain has a natural identity
(like ISBN for books, email for users, etc.).
Identity Strategies
Protean supports several identity strategies beyond UUID. You can configure a custom identity function at the domain level. See the Identity guide for details.
For this tutorial, we will stick with the default auto-generated UUIDs.
Querying the Repository
In Chapter 1, we used repo.get(id) to retrieve a single book. But a
bookstore needs richer queries — searching, filtering, and sorting.
Protean repositories expose a query API through the _dao (Data Access
Object):
tags=["history", "anthropology", "non-fiction"],
)
repo.add(sapiens)
# Retrieve by ID
book = repo.get(gatsby.id)
print(f"Retrieved: {book.title} by {book.author}")
print(f"Genre: {book.genre}, Pages: {book.page_count}")
print(f"Tags: {book.tags}")
# Query all books
all_books = repo._dao.query.all()
print(f"\nTotal books: {all_books.total}")
# Filter by genre
fiction_books = repo._dao.query.filter(genre="FICTION").all()
print(f"Fiction books: {fiction_books.total}")
for b in fiction_books.items:
print(f" - {b.title}")
Key Query Operations
| Operation | Description | Example |
|---|---|---|
.all() |
Return all records | repo._dao.query.all() |
.filter(...) |
Filter by field values | .filter(genre="FICTION") |
.order_by(...) |
Sort results | .order_by("title") or .order_by("-price") for descending |
.first() |
Return first match | repo._dao.query.filter(author="...").first() |
The .all() method returns a ResultSet with a .total count and
.items list. You can chain operations:
# Fiction books, sorted by price (descending), limited to 5
results = repo._dao.query.filter(genre="FICTION").order_by("-price").limit(5).all()
Lookups
Filter supports Django-style lookups for comparisons:
price__gte=10.0— price greater than or equal to 10page_count__lt=300— fewer than 300 pagestitle__contains="Great"— title contains "Great"
Running as a Script
Let's put it all together in a complete runnable script:
from enum import Enum
from protean import Domain
from protean.fields import Boolean, Date, Float, Integer, List, String, Text
domain = Domain()
class Genre(Enum):
FICTION = "FICTION"
NON_FICTION = "NON_FICTION"
SCIENCE = "SCIENCE"
HISTORY = "HISTORY"
BIOGRAPHY = "BIOGRAPHY"
FANTASY = "FANTASY"
MYSTERY = "MYSTERY"
@domain.aggregate
class Book:
title = String(max_length=200, required=True)
author = String(max_length=150, required=True)
isbn = String(max_length=13)
price = Float()
description = Text()
publication_date = Date()
page_count = Integer()
in_print = Boolean(default=True)
genre = String(max_length=20, choices=Genre)
tags = List(content_type=String)
domain.init(traverse=False)
if __name__ == "__main__":
with domain.domain_context():
repo = domain.repository_for(Book)
# Create several books
gatsby = Book(
title="The Great Gatsby",
author="F. Scott Fitzgerald",
isbn="9780743273565",
price=12.99,
description="A story of the mysteriously wealthy Jay Gatsby.",
page_count=180,
genre=Genre.FICTION.value,
tags=["classic", "american", "jazz-age"],
)
repo.add(gatsby)
brave_new = Book(
title="Brave New World",
author="Aldous Huxley",
isbn="9780060850524",
price=14.99,
description="A dystopian novel set in a futuristic World State.",
page_count=311,
genre=Genre.FICTION.value,
tags=["classic", "dystopia", "science-fiction"],
)
repo.add(brave_new)
sapiens = Book(
title="Sapiens",
author="Yuval Noah Harari",
isbn="9780062316097",
price=18.99,
description="A brief history of humankind.",
page_count=443,
genre=Genre.HISTORY.value,
tags=["history", "anthropology", "non-fiction"],
)
repo.add(sapiens)
# Retrieve by ID
book = repo.get(gatsby.id)
print(f"Retrieved: {book.title} by {book.author}")
print(f"Genre: {book.genre}, Pages: {book.page_count}")
print(f"Tags: {book.tags}")
# Query all books
all_books = repo._dao.query.all()
print(f"\nTotal books: {all_books.total}")
# Filter by genre
fiction_books = repo._dao.query.filter(genre="FICTION").all()
print(f"Fiction books: {fiction_books.total}")
for b in fiction_books.items:
print(f" - {b.title}")
# Order by title
ordered = repo._dao.query.order_by("title").all()
print("\nBooks alphabetically:")
for b in ordered.items:
print(f" - {b.title} (${b.price})")
# Verify
assert all_books.total == 3
assert fiction_books.total == 2
print("\nAll checks passed!")
Run it:
$ python bookshelf.py
Retrieved: The Great Gatsby by F. Scott Fitzgerald
Genre: FICTION, Pages: 180
Tags: ['classic', 'american', 'jazz-age']
Total books: 3
Fiction books: 2
- The Great Gatsby
- Brave New World
Books alphabetically:
- Brave New World ($14.99)
- Sapiens ($18.99)
- The Great Gatsby ($12.99)
All checks passed!
Summary
In this chapter you learned:
- Field types:
String,Text,Integer,Float,Boolean,Date,DateTime,List, andDictcover most data modeling needs. - Field options:
required,default,max_length, andchoicesconstrain field values and catch errors early. - Identity: Aggregates get auto-generated UUIDs by default, but you
can designate any field as the identifier with
identifier=True. - Querying: The repository's
_dao.queryAPI supports filtering, ordering, and pagination.
Our Book aggregate is getting richer, but it still models everything
with primitive fields. In the next chapter, we will introduce value
objects — a way to group related fields into meaningful, immutable
concepts like Money and Address.