Open Engineer

Project Definition: Upgrading Paul Graham Essay Search

In the previous lesson, we built a semantic search pipeline over Paul Graham's collection of online essays. We used OpenAI's Embedding API to embed each paragraph of each essay, then performed an exhaustive search to find the 5 paragraphs most similar to a given query. However, we concluded by noting two limitations: (a) we are not persisting our embeddings anywhere, and (b) our search would not scale well to millions of documents, because an exhaustive search compares the query against every stored embedding. In this lesson, we are going to address both limitations.
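As a reminder of what "exhaustive search" means here, the following is a minimal sketch, not the code from the previous lesson. It assumes cosine similarity as the similarity measure and uses tiny hand-written vectors in place of real OpenAI embeddings; the names `cosine_similarity`, `top_k_exhaustive`, `paragraph_vecs`, and `query_vec` are hypothetical.

```python
import heapq
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def top_k_exhaustive(query_vec, paragraph_vecs, k=5):
    """Score the query against every paragraph and keep the k best.

    Returns a list of (similarity, paragraph_index) pairs, best first.
    """
    scored = (
        (cosine_similarity(query_vec, vec), i)
        for i, vec in enumerate(paragraph_vecs)
    )
    return heapq.nlargest(k, scored)


# Toy stand-ins for paragraph embeddings (real ones would come from the
# embedding API and have far more dimensions).
paragraph_vecs = [
    [1.0, 0.0, 0.0],
    [0.9, 0.1, 0.0],
    [0.0, 1.0, 0.0],
    [0.0, 0.0, 1.0],
    [0.7, 0.7, 0.0],
    [-1.0, 0.0, 0.0],
]
query_vec = [1.0, 0.05, 0.0]

results = top_k_exhaustive(query_vec, paragraph_vecs, k=3)
for score, index in results:
    print(f"paragraph {index}: similarity {score:.4f}")
```

Both limitations are visible in the sketch: `paragraph_vecs` lives only in memory, and every query touches every vector, so the cost of a single search grows linearly with the number of stored paragraphs.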