Architecture

Inference

The Inference architecture comprises several core components: a text (frontend) client, a FastAPI webserver, a database with several tables, Redis for queueing, and distributed GPU workers.
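
The sketch below illustrates how these pieces can fit together: the FastAPI webserver accepts a request, enqueues a job in Redis, and returns a job id, while a separate GPU worker process pops jobs off the queue. The route name, queue key, and request shape here are illustrative assumptions, not the project's actual identifiers.

```python
# Minimal sketch of the webserver-to-queue-to-worker flow (illustrative only).
import json
import uuid

import redis
from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI()
queue = redis.Redis(host="localhost", port=6379, db=0)

INFERENCE_QUEUE = "inference:requests"  # hypothetical queue key


class GenerateRequest(BaseModel):
    prompt: str


@app.post("/generate")
def generate(req: GenerateRequest):
    # The webserver does no model work itself: it assigns a job id,
    # pushes the job onto the Redis queue, and returns immediately so
    # the client can poll for the result while a GPU worker runs inference.
    job_id = str(uuid.uuid4())
    queue.rpush(INFERENCE_QUEUE, json.dumps({"id": job_id, "prompt": req.prompt}))
    return {"job_id": job_id, "status": "queued"}


def worker_loop():
    # Runs in a separate GPU worker process: block until a job arrives,
    # run inference (stubbed here), and store the result for retrieval.
    while True:
        _, raw = queue.blpop(INFERENCE_QUEUE)
        job = json.loads(raw)
        result = f"generated text for: {job['prompt']}"  # stand-in for model call
        queue.set(f"result:{job['id']}", result)
```

In this layout the webserver stays stateless and fast, and throughput scales by adding more GPU worker processes that consume from the same queue.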

A more detailed overview can be viewed here.