Status: ready
S3 bucket: comp4349-a2-yawu8371
S3 key: uploads/1780492903_ff8f4926cbe44d718e28aa73c3741ca3_comp4349_comp5349_assignment2_2026.pdf
Uploaded: 2026-06-03 13:21:43.815181+00:00
| Strategy | Status | Chunks | Average length (characters) | Processing time | Error |
|---|---|---|---|---|---|
| Fixed-size chunking | completed | 20 | 967.8 | 0.353 sec | |
| Paragraph-aware chunking | completed | 15 | 1035.3 | 0.403 sec | |
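
The per-strategy figures in the table (chunk count, average length, processing time) could be gathered with a small wrapper around each chunker. This is only a sketch under assumptions: `summarize`, its arguments, and the dictionary keys are hypothetical names, not taken from the application.

```python
import time


def summarize(strategy_name, chunker, text):
    """Run a chunker and collect figures like those shown in the table.

    `chunker` is any callable that takes a string and returns a list of
    strings. All names here are hypothetical, for illustration only.
    """
    started = time.perf_counter()
    chunks = chunker(text)
    elapsed = time.perf_counter() - started

    lengths = [len(chunk) for chunk in chunks]
    average = sum(lengths) / len(lengths) if lengths else 0.0

    return {
        "strategy": strategy_name,
        "status": "completed",
        "chunks": len(chunks),
        "average_length": round(average, 1),       # characters per chunk
        "processing_time_sec": round(elapsed, 3),  # wall-clock seconds
    }
```

Timing with `time.perf_counter` measures only the chunking call itself; any time spent downloading the PDF or writing results to a database would need to be measured separately.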
Fixed-size chunking
Chunk 0 - 1000 characters
# Page 1 School of Computer Science Dr. Ying Zhou COMP4349/COMP5349: Cloud Computing Sem. 1/2026 Assignment 2: AWS Project Individual Work: 20% 08.05.2026 Tasks In this assignment, you will deploy an event-driven Python web application on AWS. The application allows users to upload PDF documents and compare two different text chunk- ing strategies for retrieval. The retrieval uses a simple local TF-IDF embedding approach to avoid long latency caused by calls to LLM services and to avoid potential API rate-limit issues. You are required to submit a report describing your deployment and to attend a demonstration to verify the deployment. Application Description The application supports the following high-level workflow: 1. A user accesses the web application through a public web endpoint. 2. The user uploads a PDF document through the web application. 3. The uploaded PDF is stored in dura...
Chunk 1 - 1000 characters
. The user uploads a PDF document through the web application. 3. The uploaded PDF is stored in durable object storage. 4. The application records metadata about the uploaded document in a database. 5. The system processes the uploaded PDF using two different chunking strategies: •fixed-size chunking; •paragraph-aware chunking. 6. The system generates chunks and processing statistics for each strategy . 7. The web application retrieves processing status and generated results from the database. 8. Once both processing strategies have completed, the web application displays a side- by-side comparison of their results. 9. The user may enter a retrieval query to compare which chunks are retrieved by each strategy . 1 # Page 2 Required Architectural Properties Your deployed system must satisfy the following architectural properties: •The web application must be accessed through an Applicati...
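
The two 1000-character previews above look like the output of a sliding window: chunk 1 opens with the same 100 characters that close the chunk 0 preview. A minimal sketch of that idea follows, assuming a 1000-character window and a 100-character overlap. Both values are inferred from the previews, not confirmed by the application, and `fixed_size_chunks` is a hypothetical name.

```python
def fixed_size_chunks(text, size=1000, overlap=100):
    """Split text into windows of `size` characters.

    Each window starts `size - overlap` characters after the previous one,
    so neighbouring chunks share `overlap` characters. The defaults are
    assumptions inferred from the previews, not known settings.
    """
    if size <= 0 or not 0 <= overlap < size:
        raise ValueError("need size > 0 and 0 <= overlap < size")
    step = size - overlap
    # The last chunk is usually shorter than `size`; for some input lengths
    # it may contain only text already covered by the previous chunk.
    return [text[start:start + size] for start in range(0, len(text), step)]
```

Under these assumptions, a 17,456-character input (a hypothetical length) yields 20 chunks with an average length of 967.8 characters, which matches the fixed-size row of the table. That agreement supports the inferred settings but does not prove them.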
Paragraph-aware chunking
Chunk 0 - 8 characters
# Page 1
Chunk 1 - 1512 characters
School of Computer Science Dr. Ying Zhou COMP4349/COMP5349: Cloud Computing Sem. 1/2026 Assignment 2: AWS Project Individual Work: 20% 08.05.2026 Tasks In this assignment, you will deploy an event-driven Python web application on AWS. The application allows users to upload PDF documents and compare two different text chunk- ing strategies for retrieval. The retrieval uses a simple local TF-IDF embedding approach to avoid long latency caused by calls to LLM services and to avoid potential API rate-limit issues. You are required to submit a report describing your deployment and to attend a demonstration to verify the deployment. Application Description The application supports the following high-level workflow: 1. A user accesses the web application through a public web endpoint. 2. The user uploads a PDF document through the web application. 3. The uploaded PDF is stored in durable object...
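
The two previews above, an 8-character page marker followed by a 1512-character block, are what a paragraph-respecting splitter would tend to produce: chunk boundaries follow paragraph breaks, so chunk lengths vary widely. The sketch below is one possible approach, not the application's actual algorithm. It assumes paragraphs are separated by blank lines and packs them greedily up to a hypothetical `max_chars` limit, keeping an oversized paragraph whole rather than cutting it.

```python
import re


def paragraph_chunks(text, max_chars=1000):
    """Greedily pack whole paragraphs into chunks of at most `max_chars`.

    A paragraph longer than `max_chars` is kept intact as its own chunk.
    The blank-line separator and the limit are assumptions for illustration.
    """
    paragraphs = [p.strip() for p in re.split(r"\n\s*\n", text) if p.strip()]
    chunks = []
    current = ""
    for para in paragraphs:
        candidate = f"{current}\n\n{para}" if current else para
        if current and len(candidate) > max_chars:
            # Adding this paragraph would exceed the limit: close the chunk.
            chunks.append(current)
            current = para
        else:
            current = candidate
    if current:
        chunks.append(current)
    return chunks
```

With this packing rule, a short heading followed by a paragraph longer than the limit stays in a chunk of its own, which would explain a very small chunk sitting next to a large one.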