ASTRA (ArXiv Sourced Text Recommendation Agent) is, as the name suggests, a paper recommendation service built on arXiv. Its collection covers 57,000+ papers across various scientific disciplines. Simply provide a paper's title, arXiv ID, or a handful of keywords, and ASTRA will curate a list of relevant papers.
The core component of ASTRA's recommendation engine is Embed, a large language model developed by Cohere that maps documents to embedding vectors. Embed was built to handle mixed-modality documents and performs well on retrieval tasks, making it a good fit for ASTRA. With every paper represented as a vector, recommendation reduces to a k-nearest neighbors search around the query's embedding, which FAISS makes fast.
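To illustrate the k-nearest neighbors idea, here is a minimal sketch in plain Python. It is not ASTRA's actual code: the function names, the paper ids, and the 3-dimensional toy vectors are all made up for illustration, and a brute-force cosine-similarity search stands in for both the Embed API call and the FAISS index.

```python
import heapq
import math


def normalize(vec):
    """Scale a vector to unit length so a dot product equals cosine similarity."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in vec]


def k_nearest(query, corpus, k):
    """Return the ids of the k papers most similar to the query, best first.

    `corpus` maps a (hypothetical) paper id to its embedding vector.
    """
    q = normalize(query)
    scored = (
        (sum(a * b for a, b in zip(q, normalize(vec))), paper_id)
        for paper_id, vec in corpus.items()
    )
    return [paper_id for _, paper_id in heapq.nlargest(k, scored)]


# Toy "embeddings"; in practice these would come from the embedding model.
toy_corpus = {
    "paper-a": [0.9, 0.1, 0.0],
    "paper-b": [0.0, 1.0, 0.0],
    "paper-c": [0.8, 0.3, 0.1],
    "paper-d": [0.0, 0.1, 0.9],
}

# The query vector stands in for the embedding of a title or keyword list.
recommendations = k_nearest([1.0, 0.2, 0.0], toy_corpus, k=2)
print(recommendations)
```

This loop compares the query against every paper, which is fine for a toy corpus. For tens of thousands of high-dimensional vectors, a library such as FAISS performs the same search far more efficiently; for example, its `IndexFlatIP` index does an exact inner-product search, which matches cosine similarity when the vectors are normalized.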
The full source code for ASTRA, including the raw datasets, can be found here.