
Performance

Here are some top-level performance metrics for Khoj. These are rough estimates and will vary based on your hardware and data.

Info: These performance metrics were last evaluated in 2022.

Search performance

Semantic search using the default embeddings model is fairly fast, at <100 ms across all content types.
Reranking using the cross-encoder model is slower, at <2 s for 15 results. Tweak top_k to trade off speed against accuracy of results.
Filters in the query (e.g. by file, word or date) usually add <20 ms to query latency.
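To illustrate why top_k controls the speed/accuracy trade-off, here is a minimal sketch of the general retrieve-then-rerank pattern. It is not Khoj's implementation: the function `retrieve_then_rerank` and the two toy scorers are hypothetical stand-ins, with `overlap_count` playing the role of the fast embeddings search and `jaccard` playing the role of the slower cross-encoder.

```python
from typing import Callable, List, Tuple


def retrieve_then_rerank(
    query: str,
    documents: List[str],
    cheap_score: Callable[[str, str], float],
    expensive_score: Callable[[str, str], float],
    top_k: int = 15,
) -> List[Tuple[str, float]]:
    """Hypothetical two-stage search: cheap scoring on everything, costly scoring on a shortlist."""
    # Stage 1: score every document with the cheap scorer and keep the best top_k.
    shortlist = sorted(documents, key=lambda d: cheap_score(query, d), reverse=True)[:top_k]
    # Stage 2: run the expensive scorer on the shortlist only, so its cost grows with top_k.
    reranked = [(doc, expensive_score(query, doc)) for doc in shortlist]
    reranked.sort(key=lambda pair: pair[1], reverse=True)
    return reranked


def overlap_count(query: str, doc: str) -> float:
    """Toy stand-in for the fast first-stage scorer: number of shared words."""
    return float(len(set(query.lower().split()) & set(doc.lower().split())))


def jaccard(query: str, doc: str) -> float:
    """Toy stand-in for the slow second-stage scorer: Jaccard similarity of word sets."""
    q, d = set(query.lower().split()), set(doc.lower().split())
    return len(q & d) / len(q | d) if (q | d) else 0.0


notes = [
    "buy milk and eggs",
    "milk the cow at dawn and dusk every day",
    "file taxes",
    "call mom",
]

for doc, score in retrieve_then_rerank("buy milk", notes, overlap_count, jaccard, top_k=2):
    print(f"{score:.2f}  {doc}")
```

In this sketch the expensive scorer is called exactly top_k times per query, so a smaller top_k makes reranking faster but gives it fewer candidates to reorder.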

Indexing performance

Indexing is more strongly impacted by the size of the source data than search is.
Indexing a 100K+ line corpus of notes takes about 10 minutes.
Note: It should only take this long on the first run, as the index is updated incrementally afterwards.
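The following sketch shows one common way an index can be updated incrementally, which is why later runs tend to be much cheaper than the first. It is an assumption for illustration, not Khoj's actual indexing code: `entry_hash`, `update_index`, and `toy_embed` are hypothetical names, and the content-hash approach is only one possible strategy.

```python
import hashlib
from typing import Callable, Dict, List, Tuple


def entry_hash(text: str) -> str:
    """Stable fingerprint of an entry's content."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def update_index(
    index: Dict[str, List[float]],
    entries: List[str],
    embed: Callable[[str], List[float]],
) -> Tuple[Dict[str, List[float]], int]:
    """Hypothetical incremental update: embed only entries whose content is new.

    Returns the new index and the number of entries that had to be embedded.
    """
    new_index: Dict[str, List[float]] = {}
    embedded = 0
    for text in entries:
        key = entry_hash(text)
        if key in index:
            # Unchanged entry: reuse the stored embedding.
            new_index[key] = index[key]
        else:
            # New or modified entry: pay the embedding cost.
            new_index[key] = embed(text)
            embedded += 1
    # Entries missing from `entries` are dropped, since only current keys are copied.
    return new_index, embedded


def toy_embed(text: str) -> List[float]:
    """Toy stand-in for a real embeddings model."""
    return [float(len(text))]


index, first_run = update_index({}, ["note one", "note two", "note three"], toy_embed)
index, second_run = update_index(index, ["note one", "note two", "note three, edited"], toy_embed)
print(first_run, second_run)
```

Here the first run embeds every entry, while the second run embeds only the one entry whose text changed.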

Miscellaneous

Testing was done on a Mac M1 with a >100K line corpus of notes.
