2. Core Concepts

2.1 Document Definition

Butty documents are Pydantic models that inherit from Document[ID_T], where the generic parameter specifies the identity type (e.g., int, str, or ObjectId). They behave as standard Pydantic models and gain MongoDB persistence through the generic CRUD API built into the Document base class, which is parameterized by both the identity type and the concrete Document type. This lets IDEs and type checkers validate function calls and infer result types for single documents and sequences alike. The class definition should include an identity field whose type corresponds to the generic parameter, annotated with IdentityField().

When a class serves as the base for all concrete Documents in the application, it should be marked as ABC so that it is excluded from the document registry used for autobinding.

Example of custom str identity Document:

class BaseDocument(Document[str], ABC):
    id: Annotated[str | None, IdentityField(identity_provider=lambda: str(uuid4()))] = None

Example of MongoDB ObjectId identity Document:

class BaseDocument(Document[ObjectId], ABC):
    id: Annotated[ObjectId | None, IdentityField(alias="_id")] = None

    model_config = ConfigDict(
        arbitrary_types_allowed=True,
    )

Note: the arbitrary_types_allowed option is set in the model config (Pydantic v2 syntax is used here) because Pydantic does not support ObjectId by default. Full model configuration, e.g. to support JSON serialization of ObjectId, is up to developers and out of scope of this guide.

The collection name for a bound Document is generated automatically and defaults to the name of the class itself. Collection naming can be customized either through the collection_name_format callable during engine setup or via the collection_name and collection_name_from_model fields of the DocumentConfig class (see 2.7 Document Config).
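
As a rough illustration of a naming callable, the sketch below converts a class name to a snake_case collection name. The function name snake_case_collection_name is hypothetical, and the idea that collection_name_format receives the class name as a string is an assumption; check the engine setup signature before relying on it.

```python
import re


def snake_case_collection_name(class_name: str) -> str:
    """Hypothetical formatter: 'OrderItem' -> 'order_item'."""
    # Insert an underscore before every capital letter that is not at the start,
    # then lowercase the whole name
    return re.sub(r"(?<!^)(?=[A-Z])", "_", class_name).lower()


user_collection = snake_case_collection_name("User")
item_collection = snake_case_collection_name("OrderItem")
```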

Example of DocumentConfig with collection_name:

class User(BaseDocument):
    name: str
    
    class DocumentConfig(DocumentConfigBase):
        collection_name = "users_collection"

2.2 Identity Management

Butty offers flexible identity management for both MongoDB-native and custom identifiers. Each document must declare exactly one identity field marked with IdentityField(), with its type matching the document’s ID_T generic parameter. This field acts as the primary key for all persistence operations.

The identity lifecycle differentiates between transient documents (with None identity) and persisted ones, requiring the identity field to be optional in most cases. When saving a transient document, the system generates a new identity based on the configured strategy and inserts the document. For documents with existing identities, save operations perform updates by default, though this can be configured via the save() method’s mode parameter.
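
The lifecycle above can be sketched with an in-memory stand-in for a collection. This is not Butty's implementation: Doc, InMemoryStore, and the string return values are hypothetical names chosen only to show the transient/persisted decision.

```python
from dataclasses import dataclass
from itertools import count
from typing import Callable, Optional


@dataclass
class Doc:
    name: str
    id: Optional[int] = None  # None marks a transient document


class InMemoryStore:
    """Hypothetical stand-in for a collection; not part of Butty."""

    def __init__(self, identity_provider: Callable[[], int]) -> None:
        self._provider = identity_provider
        self.rows: dict[int, Doc] = {}

    def save(self, doc: Doc) -> str:
        if doc.id is None:
            # Transient: generate an identity, then insert
            doc.id = self._provider()
            self.rows[doc.id] = doc
            return "insert"
        if doc.id not in self.rows:
            # A document with an identity is updated by default,
            # so a missing row is treated as an error in this sketch
            raise KeyError(doc.id)
        self.rows[doc.id] = doc
        return "update"


_ids = count(1)
store = InMemoryStore(lambda: next(_ids))
chair = Doc(name="Chair")
first = store.save(chair)   # no identity yet, so this inserts
second = store.save(chair)  # identity is set, so this updates
```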

Three primary identity strategies are supported through IdentityField() configuration:

Native MongoDB _id

The strategy is automatically detected when the _id alias is present in the identity field declaration. While offering optimal database performance, this requires additional Pydantic configuration for JSON serialization and ties the application to MongoDB’s identifier format.

Example of native MongoDB identity field declaration:

class BaseDocument(Document[ObjectId], ABC):
    id: Annotated[ObjectId | None, IdentityField(alias="_id")] = None

Custom identity

This strategy supports synchronous and asynchronous providers through the identity_provider and identity_provider_factory parameters. The factory pattern enables class-aware identity sequences, maintaining JSON compatibility while supporting application-controlled identifiers like auto-incremented integers.

Example of an identity provider factory that returns an async provider for serial ID generation:
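
One way to accept both synchronous and asynchronous providers is to call the provider and await the result only when it is awaitable. The sketch below shows that pattern with the standard library alone; resolve_identity and both providers are hypothetical names, and Butty may handle this differently internally.

```python
import asyncio
import inspect
from typing import Any, Awaitable, Callable, Union
from uuid import uuid4

Provider = Callable[[], Union[Any, Awaitable[Any]]]


async def resolve_identity(provider: Provider) -> Any:
    """Call a provider and await the result only if it is awaitable."""
    value = provider()
    if inspect.isawaitable(value):
        value = await value
    return value


def sync_provider() -> str:
    return str(uuid4())


async def async_provider() -> int:
    await asyncio.sleep(0)  # stands in for a database round trip
    return 42


async def demo() -> tuple:
    return (
        await resolve_identity(sync_provider),
        await resolve_identity(async_provider),
    )


sync_id, async_id = asyncio.run(demo())
```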

class SerialIDCounter(Document[str]):
    # One counter document per collection, keyed by the collection name
    name: Annotated[str, IdentityField()]
    count: int


def _serial(doc_model: DocModel) -> Callable[[], Awaitable[int]]:
    # Receives the document class, so every class gets its own sequence
    async def identity_provider() -> int:
        # Increment the counter for this collection; upsert=True creates it on first use
        return (
            await SerialIDCounter.update_document(
                doc_model.__collection__.name,
                Inc({F(SerialIDCounter.count): 1}),
                upsert=True,
            )
        ).count

    return identity_provider


class SerialIDDocument(Document[int], ABC):
    id: Annotated[int | None, IdentityField(identity_provider_factory=_serial)] = None


Provided identities

Identity can be managed outside the Butty engine. For example, the identity field can use Pydantic's default_factory field argument. Documents with existing identities are treated as persistent by default, so new documents must be saved with insert mode specified explicitly.

Example of provided identity field:

class SerialIDCounter(Document[str]):
    name: Annotated[str, IdentityField()]
    count: int

2.3 Document Linking

Automatic Pipeline Generation

By analyzing the complete document relationship graph during initialization, Butty automatically generates MongoDB aggregation pipelines with nested lookups. This forms the core value of the engine, enabling efficient joins across related documents while maintaining type safety. The pipelines are generated statically and cannot be adjusted dynamically during read operations: all nesting levels and lookup paths are predetermined. As a consequence, the relationship graph cannot contain circular references or self-references, which ensures that all document connections can be resolved through a finite series of lookups.
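
A minimal sketch of how nested lookups could be derived from a link graph is given below. It is not Butty's generator: the LINKS mapping, build_pipeline, and the assumption that each link field stores the target's _id under the same name are hypothetical. The $lookup form that combines localField/foreignField with a sub-pipeline requires MongoDB 5.0 or later.

```python
from typing import Any

# Hypothetical relationship graph: collection -> {link field: target collection}
LINKS: dict[str, dict[str, str]] = {
    "order_item": {"order": "order", "product": "product"},
    "order": {},
    "product": {},
}


def build_pipeline(collection: str, path: tuple[str, ...] = ()) -> list[dict[str, Any]]:
    """Build nested $lookup stages for every link reachable from `collection`."""
    if collection in path:
        # A cycle would make the static pipeline infinitely deep
        raise ValueError("circular reference: " + " -> ".join(path + (collection,)))
    stages: list[dict[str, Any]] = []
    for field, target in LINKS[collection].items():
        stages.append(
            {
                "$lookup": {
                    "from": target,
                    "localField": field,
                    "foreignField": "_id",
                    # Links of the target are resolved inside the sub-pipeline
                    "pipeline": build_pipeline(target, path + (collection,)),
                    "as": field,
                }
            }
        )
        # A link holds a single document, so flatten the one-element array
        stages.append({"$unwind": f"${field}"})
    return stages


pipeline = build_pipeline("order_item")
```

Because the recursion walks the graph once at build time, a self-reference such as a link from "order" to "order" raises ValueError instead of producing an unbounded pipeline.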

2.4 Query Building

The query builder provides type-safe construction of MongoDB queries through operator overloading and logical composition. During engine setup, standard attributes of Document models are replaced with ButtyField instances that maintain original field aliases and document nesting structure. These references support attribute-style navigation through nested documents and collections while maintaining proper MongoDB field path notation. The system preserves original field names from Pydantic models while handling alias translation for database operations.

Comparison operations implement standard Python comparison operators (==, >, >=, <, <=, !=) which generate appropriate MongoDB query operators ($eq, $gt, $gte, $lt, $lte, $ne). The modulo operator (%) provides regular expression matching support, translating to MongoDB’s $regex operator with configurable options. Each comparison operation produces a query leaf node containing the field reference, operator, and comparison value.

Query components are combined with the & (AND) and | (OR) operators, building nested query structures that translate to MongoDB’s $and and $or operators. Because Python's & and | bind more tightly than comparison operators, each comparison must be wrapped in parentheses; this explicit grouping ensures logical expressions evaluate as intended. Complex queries can combine multiple levels of logical operations with various comparison conditions.

The query interface requires all components to implement conversion to native MongoDB query syntax through the to_mongo_query() method. This polymorphic design allows mixing raw dictionary queries with builder-constructed queries while maintaining consistent output format. The system automatically handles field alias substitution when converting builder queries to database syntax.
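
The operator-overloading approach described in this section can be sketched in plain Python. FieldRef and Query are hypothetical stand-ins, not Butty's ButtyField or its query classes; only the to_mongo_query() method name is taken from the text above.

```python
from typing import Any


class Query:
    """Hypothetical query node; Butty's own classes may differ."""

    def __init__(self, mongo: dict[str, Any]) -> None:
        self._mongo = mongo

    def to_mongo_query(self) -> dict[str, Any]:
        return self._mongo

    def __and__(self, other: "Query") -> "Query":
        return Query({"$and": [self.to_mongo_query(), other.to_mongo_query()]})

    def __or__(self, other: "Query") -> "Query":
        return Query({"$or": [self.to_mongo_query(), other.to_mongo_query()]})


class FieldRef:
    """Hypothetical field reference holding a dotted MongoDB path."""

    def __init__(self, path: str) -> None:
        self.path = path

    def _leaf(self, op: str, value: Any) -> Query:
        # A leaf node: field path, operator, and comparison value
        return Query({self.path: {op: value}})

    def __eq__(self, value: Any) -> Query:  # type: ignore[override]
        return self._leaf("$eq", value)

    def __ne__(self, value: Any) -> Query:  # type: ignore[override]
        return self._leaf("$ne", value)

    def __gt__(self, value: Any) -> Query:
        return self._leaf("$gt", value)

    def __ge__(self, value: Any) -> Query:
        return self._leaf("$gte", value)

    def __lt__(self, value: Any) -> Query:
        return self._leaf("$lt", value)

    def __le__(self, value: Any) -> Query:
        return self._leaf("$lte", value)

    def __mod__(self, pattern: str) -> Query:
        return self._leaf("$regex", pattern)


price, name = FieldRef("price"), FieldRef("name")
# Parentheses are required: & and | bind tighter than comparisons in Python
query = ((price > 100) & (name == "Chair")) | (name % "^Sofa")
mongo_query = query.to_mongo_query()
```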

Utility functions F() and Q() provide explicit type conversion points for static type checkers. F() casts model fields to ButtyField instances for query building, while Q() finalizes query construction by converting builder objects or hybrid dictionaries to pure MongoDB query syntax. These functions serve as integration points between the type-safe builder and raw dictionary queries.

Update operations follow a similar pattern with dedicated operators like Set and Inc that construct MongoDB update documents. These operations maintain field reference integrity while supporting both direct values and nested update expressions. The update builder ensures proper syntax generation for atomic update operations.
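
For illustration, update operators of this kind can be modeled as small objects that each render one MongoDB update operator. SetOp, IncOp, to_mongo_update, and merge_updates are hypothetical names used only in this sketch; they are not Butty's Set and Inc classes.

```python
from typing import Any


class UpdateOp:
    """Hypothetical base for update operators; not Butty's actual classes."""

    operator = ""

    def __init__(self, values: dict[str, Any]) -> None:
        self.values = values

    def to_mongo_update(self) -> dict[str, Any]:
        return {self.operator: dict(self.values)}


class SetOp(UpdateOp):
    operator = "$set"


class IncOp(UpdateOp):
    operator = "$inc"


def merge_updates(*ops: UpdateOp) -> dict[str, Any]:
    """Combine several operators into one MongoDB update document."""
    merged: dict[str, Any] = {}
    for op in ops:
        # Operators of the same kind share one sub-document
        merged.setdefault(op.operator, {}).update(op.values)
    return merged


update = merge_updates(SetOp({"name": "Chair"}), IncOp({"count": 1}), IncOp({"stock": -2}))
```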

Example of queries:

class Product(BaseDocument):
    name: str
    price: float


class Order(BaseDocument):
    # OrderItem is defined below, so it is referenced by name and resolved by Order.model_rebuild()
    order_items: Annotated[list["OrderItem"] | None, BackLinkField()] = None


class OrderItem(BaseDocument):
    order: Order
    product: Product


Order.model_rebuild()


async def main():
    await Product.find(F(Product.name) == "Chair")
    await Product.find({F(Product.name): "Chair"})
    await Product.find(F(Product.price) > 100)
    await Product.find((F(Product.price) > 100) & (F(Product.name) == "Chair"))
    await Order.find(F(Order.order_items[...].product.name) == "Chair")

Note: The array query syntax [...] primarily serves to satisfy IDE attribute validation. The expression can alternatively be written as F(Order.order_items).product.name; this bypasses IDE attribute resolution while still producing a valid MongoDB query.

2.5 Collection Views

Butty supports MongoDB collection views through the document configuration system. Multiple document types can reference the same underlying MongoDB collection while presenting different schemas and validation rules. This enables scenarios where different application components need varying perspectives on the same data.

Collection views are configured by specifying the source document class in the collection_name_from_model field of the DocumentConfig. The view document inherits the collection binding of its source while maintaining independent schema validation and field definitions. All documents sharing a collection must use compatible identity types.

When saving documents through a view, only fields explicitly defined in that view document will be modified in the database. All other fields in the underlying collection remain untouched.
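
The partial-update behavior can be pictured with plain dictionaries. In this sketch the collection dict, VIEW_FIELDS, and save_through_view are hypothetical; it only demonstrates that a save restricted to the view's fields leaves the other stored fields alone.

```python
from typing import Any

# One stored row in the shared collection, keyed by identity
collection: dict[str, dict[str, Any]] = {
    "u1": {"_id": "u1", "name": "Ann", "password": "s3cret"},
}

VIEW_FIELDS = ("name",)  # fields declared by the hypothetical view document


def save_through_view(doc_id: str, view_doc: dict[str, Any]) -> dict[str, Any]:
    """Apply a $set-style update limited to the view's declared fields."""
    changes = {key: view_doc[key] for key in VIEW_FIELDS if key in view_doc}
    collection[doc_id].update(changes)
    return collection[doc_id]


row = save_through_view("u1", {"name": "Anna"})
```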

Example of collection view:

class User(BaseDocument):
    name: str
    password: str


class UserView(BaseDocument):
    name: str

    class DocumentConfig(DocumentConfigBase):
        collection_name_from_model = User

2.6 Versioning for Optimistic Concurrency Control

Butty implements optimistic concurrency control through configurable version fields. The version field can use any comparable type (integer, string, etc.) annotated with VersionField, with custom logic for generating new version values. During save operations, the system performs an atomic check comparing the document’s current version against the stored value, rejecting the update if they don’t match.

Version value generation is fully customizable through the version provider function, which receives the current value and returns the next version. This allows implementations ranging from simple counters to UUID-based schemes or timestamp versions. Failed updates due to version conflicts raise DocumentNotFound, while successful updates atomically persist both the new document state and its updated version identifier.
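
The version check can be sketched as a compare-and-swap against an in-memory store. Everything here is hypothetical scaffolding: save_versioned and VersionConflict are illustration names (the text above states that Butty itself raises DocumentNotFound), and a real implementation would express the check as a single update filtered on both identity and version.

```python
from typing import Any, Callable, Optional


class VersionConflict(Exception):
    """Hypothetical error used only in this sketch."""


def next_version(current: Optional[int]) -> int:
    # Same rule as the version_provider in the example below: start at 0, then count up
    return 0 if current is None else current + 1


def save_versioned(
    store: dict[str, dict[str, Any]],
    doc: dict[str, Any],
    provider: Callable[[Optional[int]], int] = next_version,
) -> dict[str, Any]:
    stored = store.get(doc["_id"])
    # The update only applies if the stored version equals the version we loaded
    if stored is None or stored.get("version") != doc.get("version"):
        raise VersionConflict(doc["_id"])
    updated = {**doc, "version": provider(doc.get("version"))}
    store[doc["_id"]] = updated
    return updated


store = {"p1": {"_id": "p1", "price": 10, "version": 0}}
copy_a = dict(store["p1"])  # two clients load the same version
copy_b = dict(store["p1"])

saved = save_versioned(store, {**copy_a, "price": 12})  # version 0 -> 1
try:
    save_versioned(store, {**copy_b, "price": 15})  # still carries stale version 0
    conflict = False
except VersionConflict:
    conflict = True
```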

Example of BaseDocument with version field:

class BaseDocument(OIDDocument):
    version: Annotated[int | None, VersionField(version_provider=lambda v: 0 if v is None else v + 1)] = None

2.7 Document Config

Butty provides document configuration through a nested DocumentConfig class that should inherit from DocumentConfigBase. This inheritance enables IDE autocompletion and type checking for the available configuration options:

  • collection_name: Explicit MongoDB collection name

  • collection_name_from_model: Document class whose collection should be reused (creates a view)

2.8 Fields Declaration

Butty supports two syntax variations for field definitions, both using native Pydantic field declarations enhanced with specialized field information:

  • Annotated style: Annotated[FieldType, FieldInformation(...)]

  • Default value style: FieldType = FieldInformation(...)

Available field information types provide document-specific capabilities:

  • IdentityField(): Designates the primary key field. Supports custom identity providers through identity_provider and identity_provider_factory parameters.

  • VersionField(): Enables optimistic concurrency control. Requires a version_provider function to generate version values.

  • LinkField(): Defines document relationships. Configurable with:

    • link_name: Custom storage field name

    • on_delete: Cascade behavior (“nothing”, “cascade”, “propagate”)

    • link_ignore: Skip link processing

  • BackLinkField(): Creates reverse references from linked documents.

  • IndexedField(): Specifies fields for MongoDB indexing. Supports unique constraint flag.

All field information types maintain compatibility with standard Pydantic field arguments while extending functionality for MongoDB operations.

Example of IndexedField declarations:

class Department(BaseDocument):
    name: Annotated[str, IndexedField()] = "unknown"

# or

class Department(BaseDocument):
    name: str = IndexedField("unknown")

2.9 Error Handling

Butty uses a hierarchy of exceptions while also propagating relevant MongoDB driver exceptions for operational integrity. All Butty-specific errors inherit from ButtyError, providing consistent error handling while maintaining separation from other exception types. Certain operations may raise native MongoDB exceptions like DuplicateKeyError alongside Butty’s exception types to reflect database-level constraints.

Key error types include:

  • ButtyError: Base class for all Butty-specific exceptions

  • ButtyValueError: Indicates invalid field values or operation parameters

  • DocumentNotFound: Signals missing documents during get/update operations (contains doc_model, op, and query attributes)

  • MongoDB driver exceptions: Including DuplicateKeyError for identity conflicts during insert operations
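
A typical handling pattern catches the most specific exception first and falls back to the base class. The classes below are local stand-ins that mirror the names and attributes listed above so the sketch runs without Butty installed; the constructor signature of DocumentNotFound and the helper functions are assumptions.

```python
class ButtyError(Exception):
    """Stand-in for the base class named above."""


class DocumentNotFound(ButtyError):
    """Stand-in carrying the doc_model, op, and query attributes."""

    def __init__(self, doc_model: str, op: str, query: dict) -> None:
        super().__init__(f"{op} on {doc_model} matched nothing: {query}")
        self.doc_model, self.op, self.query = doc_model, op, query


def get_product(products: dict, product_id: str) -> dict:
    # Hypothetical lookup that reports a miss the way a get operation might
    try:
        return products[product_id]
    except KeyError:
        raise DocumentNotFound("Product", "get", {"_id": product_id}) from None


def describe(products: dict, product_id: str) -> str:
    try:
        return get_product(products, product_id)["name"]
    except DocumentNotFound as exc:
        # Most specific handler first
        return f"missing: {exc.query}"
    except ButtyError:
        # Any other library-specific failure
        return "other Butty error"


found = describe({"p1": {"name": "Chair"}}, "p1")
missing = describe({}, "p1")
```

Driver-level exceptions such as DuplicateKeyError do not inherit from ButtyError, so they need their own except clause where insert conflicts are expected.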