Introduction
Welcome to Multimodal! We're excited to understand your approach to backend system design. This take-home assignment is designed to assess your ability to architect, design, and justify a backend service relevant to the kind of work we do – automating complex knowledge tasks.
Our systems often process documents or data, sometimes through long-running tasks (such as calls to LLMs or complex databases), and then make the results available via APIs. This assignment asks you to design a small piece of such a system.
Focus: This assignment is about design and justification, not coding. We want to see how you think about building robust, scalable, and maintainable systems.
Estimated Time: We expect you to spend approximately 3-5 hours thinking through and documenting your design.
Scenario: Asynchronous Task Processing API Design
Multimodal needs a backend service that can accept tasks, process them asynchronously, and allow clients to check the status and retrieve the results of these tasks. For this assignment, the "task" will be a simplified version of text processing – analyzing a piece of text to extract some basic information.
Imagine this service could later be extended to handle more complex tasks, like summarizing text using an LLM, extracting entities, or classifying documents.
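To make the scenario concrete, here is a minimal, non-binding sketch of the submit-then-poll flow described above. Everything in it is an assumption made for illustration: the function names `submit_task` and `get_task`, the in-memory dictionary standing in for a datastore, and the background thread standing in for a real worker. It is not a suggested design, and your document is free to differ in every respect.

```python
import threading
import time
import uuid
from typing import Dict, Optional

# In-memory stand-in for a real datastore (illustrative assumption only).
_TASKS: Dict[str, dict] = {}
_LOCK = threading.Lock()


def _process(task_id: str, text: str, delay: float) -> None:
    """Hypothetical worker body: mark PROCESSING, do the work, record the outcome."""
    with _LOCK:
        _TASKS[task_id]["status"] = "PROCESSING"
    try:
        time.sleep(delay)  # stands in for slow work such as an LLM call
        result = {"word_count": len(text.split())}
        with _LOCK:
            _TASKS[task_id].update(status="COMPLETED", result=result)
    except Exception as exc:  # any failure is recorded on the task itself
        with _LOCK:
            _TASKS[task_id].update(status="FAILED", error=str(exc))


def submit_task(text: str, delay: float = 0.05) -> str:
    """Register a task, start background processing, and return its ID at once."""
    task_id = str(uuid.uuid4())  # random UUID, so collisions are not a practical concern
    with _LOCK:
        _TASKS[task_id] = {"status": "PENDING", "result": None, "error": None}
    worker = threading.Thread(target=_process, args=(task_id, text, delay), daemon=True)
    worker.start()
    return task_id


def get_task(task_id: str) -> Optional[dict]:
    """Return a copy of the task record, or None if the ID is unknown."""
    with _LOCK:
        task = _TASKS.get(task_id)
        return dict(task) if task is not None else None
```

A client would call `submit_task(...)` once and then call `get_task(task_id)` repeatedly until the status reaches "COMPLETED" or "FAILED". A single-process thread like this loses all work on restart, which is one of the trade-offs the design questions below ask you to reason about.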
Your Task: Design the System
You are to provide a detailed design document for the "Asynchronous Task Processing API" described above. Your document should cover the following aspects:
- Overall Architecture:
- Provide a high-level architectural diagram or a clear textual description of the main components of the service and how they interact (e.g., API gateway, web server, task queue, worker processes, database).
- Explain your rationale for this overall architecture.
- API Design:
- Submit a Task:
- Describe the API endpoint (e.g., POST /tasks). Specify the request payload (JSON structure) for submitting a piece of text.
- Describe the immediate synchronous response (e.g., task_id, initial status).
- How would you ensure task_id is unique?
- Check Task Status & Results:
- Describe the API endpoint (e.g., GET /tasks/{task_id}).
- Specify the response format, including how status (e.g., "PENDING", "PROCESSING", "COMPLETED", "FAILED") and results (if available) are conveyed.
- How would you handle a request for a non-existent task_id?
- List Tasks:
- Describe the API endpoint (e.g., GET /tasks).
- What information would be returned for each task?
- What considerations would you have for pagination and potentially filtering (e.g., by status)?
- Asynchronous Task Processing:
- Describe the mechanism you would use for asynchronous task execution.
- What technologies or patterns would you consider (e.g., Celery with Redis/RabbitMQ, Python's asyncio with a background task manager, a simpler queue, etc.)? Justify your choice for this specific scenario and discuss trade-offs.
- Please explain how your solution addresses:
- Idempotency
- Retries
- Race conditions
- Explain the lifecycle of a task: from submission, through processing, to completion or failure. How is task state managed and updated?
- For the "text processing" logic (simulating a 5-10 second delay, calculating word count, identifying 5 most frequent words – ignoring common stop words):
- How would this logic be implemented within a worker?
- How would you handle potential errors during this processing?
- If the text processing step used a third-party LLM API (e.g., OpenAI), how would that affect your async design?
- Data Persistence:
- Describe the data model/schema for storing task information (e.g., task ID, status, submitted text, results, timestamps).
- What type of database (e.g., relational like PostgreSQL, NoSQL like MongoDB, or a key-value store like Redis for certain parts) would you choose for this service?
- Justify your choice for both the scale of this assignment AND for a production environment handling high throughput. Discuss trade-offs.
- Technology Stack:
- Propose a primary technology stack, focusing on Python for the backend.
- If you were to migrate to a more performant language, which language would you choose and why?
- Specify your choice of web framework (e.g., FastAPI, Flask, Django) and justify why it's suitable.
- Mention any other key libraries or tools you'd employ for aspects like task queuing, data validation, etc.
- Scalability and Reliability:
- How would you design the service to scale horizontally to handle an increasing number of tasks and concurrent users? (Consider the API layer, workers, and database).
- How would you ensure the reliability of task processing? What if a worker crashes mid-task? How would you handle retries or dead-letter queues?
- Containerization & Deployment (Conceptual):
- Briefly describe how you would approach containerizing this service using Docker. What would be the key concerns for the Dockerfile (without writing the actual file)?
- Conceptually, how might this service be deployed (e.g., to a cloud provider, Kubernetes)?
- Testing Strategy:
- What is your strategy for testing this service?
- What types of tests would you prioritize (e.g., unit, integration, end-to-end) and why?
- What key components or interactions would be most critical to test?
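For reference, the "text processing" logic named under Asynchronous Task Processing (word count plus the 5 most frequent words, ignoring common stop words) could look roughly like the sketch below. The function name `analyze_text`, the result keys, the tokenization rule, and the very short stop-word list are all assumptions made for illustration, and the simulated 5-10 second delay is left out so the snippet runs instantly. This is only meant to pin down the expected behavior, not to prescribe an implementation.

```python
import re
from collections import Counter

# Deliberately tiny stop-word list (an assumption); a real one would be much longer.
STOP_WORDS = frozenset({
    "a", "an", "and", "are", "as", "at", "be", "by", "for", "in",
    "is", "it", "of", "on", "or", "that", "the", "to", "was", "with",
})

# Lowercase alphanumeric runs, allowing inner apostrophes (e.g. "don't").
_WORD_RE = re.compile(r"[a-z0-9]+(?:'[a-z0-9]+)*")


def analyze_text(text: str, top_n: int = 5) -> dict:
    """Count all words and return the top_n most frequent non-stop words."""
    words = _WORD_RE.findall(text.lower())
    counts = Counter(word for word in words if word not in STOP_WORDS)
    return {
        "word_count": len(words),  # stop words still count toward the total
        "top_words": counts.most_common(top_n),  # list of (word, count) pairs
    }
```

Whether stop words count toward the total, and how ties among equally frequent words are ordered, are design decisions this brief leaves open; the sketch simply picks one option for each.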
Deliverables