AVNLP

Biothink

Github: https://github.com/avnlp/biothink

RAG-Model-Training

Github: https://github.com/avnlp/rag-model-training

GRPO

Github: https://github.com/avnlp/grpo

LLM-Finetuning

Github: https://github.com/avnlp/llm-finetuning

LLM Rankers

Github: https://github.com/avnlp/rankers
Paper: LLM Rankers

Pairwise Ranking Prompting (PRP)

Github: https://github.com/avnlp/prp

RRF

Github: https://github.com/avnlp/rrf
Paper: Performance Evaluation of Rankers and RRF Techniques for Retrieval Pipelines
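
For readers unfamiliar with the technique, the following is a minimal sketch of generic Reciprocal Rank Fusion, not the code from the repository above. The function name `reciprocal_rank_fusion`, the document IDs, and the default constant `k = 60` (a commonly used value) are assumptions chosen for illustration.

```python
from collections import defaultdict
from typing import Dict, List, Sequence


def reciprocal_rank_fusion(rankings: Sequence[Sequence[str]], k: int = 60) -> List[str]:
    """Fuse several ranked lists of document IDs into one ranking.

    Each document receives 1 / (k + rank) from every list it appears in,
    where rank starts at 1. The default k = 60 is an illustrative choice.
    """
    scores: Dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties are broken by document ID for determinism.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical rankings from two retrievers over the same corpus.
fused = reciprocal_rank_fusion([["a", "b", "c"], ["b", "c", "a"]])
print(fused)
```

Because only ranks are used, the fusion does not depend on the retrievers producing comparable raw scores.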

LLM Blender

Github: https://github.com/avnlp/llm-blender
Paper: LLM Ensembling: Haystack Pipelines with LLM-Blender

Omega RAG

Omega RAG provides a framework for combining several advanced RAG techniques into a single high-performing pipeline. The techniques are: query rewriting, HyDE, adaptive retrieval (no retrieval, single-step, or iterative retrieval), correction by retrieval evaluation and confidence scoring, unified active retrieval, reranking, citation generation, user feedback, a hybrid structured router, a scattered knowledge structurizer, and a structured knowledge utilizer.

Under active development.
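
The adaptive retrieval idea listed above (no retrieval, single-step, or iterative retrieval) could be sketched roughly as follows. This is not Omega RAG's implementation: the `Strategy` enum, the `choose_strategy` and `gather_context` functions, the complexity score, and the thresholds are all hypothetical names and values introduced for illustration, and the retriever is passed in as a plain callable.

```python
from enum import Enum
from typing import Callable, List


class Strategy(Enum):
    NO_RETRIEVAL = "no_retrieval"
    SINGLE_STEP = "single_step"
    ITERATIVE = "iterative"


def choose_strategy(complexity: float, low: float = 0.3, high: float = 0.7) -> Strategy:
    """Map a query complexity score in [0, 1] to a retrieval strategy.

    The thresholds are assumed values; a real system would likely use a
    trained classifier rather than fixed cut-offs.
    """
    if complexity < low:
        return Strategy.NO_RETRIEVAL
    if complexity < high:
        return Strategy.SINGLE_STEP
    return Strategy.ITERATIVE


def gather_context(
    query: str,
    complexity: float,
    retrieve: Callable[[str], List[str]],
    max_steps: int = 3,
) -> List[str]:
    """Collect context passages according to the chosen strategy."""
    strategy = choose_strategy(complexity)
    if strategy is Strategy.NO_RETRIEVAL:
        return []
    if strategy is Strategy.SINGLE_STEP:
        return retrieve(query)

    # Iterative: re-query with the latest passage appended until no new
    # passages are found or the step budget is exhausted.
    context: List[str] = []
    current_query = query
    for _ in range(max_steps):
        new_docs = [doc for doc in retrieve(current_query) if doc not in context]
        if not new_docs:
            break
        context.extend(new_docs)
        current_query = f"{query} {new_docs[-1]}"
    return context
```

Keeping the routing decision in a separate function makes it easy to swap the fixed thresholds for a learned classifier later.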

Vector-DB

Github: https://github.com/avnlp/vectordb

| Feature | Milvus | Weaviate | Qdrant | Pinecone | Chromadb |
| --- | --- | --- | --- | --- | --- |
| Indexes | Supports both sparse and dense vectors, using IVF for dense indexing and BM25 for sparse retrieval | Supports both sparse and dense vectors, using HNSW for dense and BM25 for sparse indexing | Supports both sparse and dense vectors, using HNSW for dense and hybrid search mechanisms for sparse | Supports only dense vectors, optimized for approximate nearest neighbor (ANN) search. Sparse vectors not supported | Supports only dense vectors with flat embeddings, optimized for in-memory search |
| Hybrid Search | BM25 + vector search using hybrid query modes | BM25 + vector search with alpha parameter for balance | BM25 + ANN search with structured filtering | Single sparse-dense index. Requires both sparse and dense query vectors | Not supported |
| Partition | Uses partitions to separate data. Queries limited to a partition | Uses tenants for isolation. Queries limited to a tenant | Uses named collections for data separation. Queries filtered within collections | Uses namespaces to partition records. Queries limited to one namespace | Uses collections as namespaces. Queries directed to a collection |
| Semantic Search | Uses IVF, HNSW, and ANNOY for efficient vector retrieval | Vector-based retrieval. Results based on embedding similarity | Real-time vector similarity search with contextual relevance | Finds similar content using vector proximity. Supports metadata filtering | Stores and retrieves vector embeddings for similarity search |
| Metadata Filtering | SQL-like filtering with structured metadata fields | GraphQL-based filtering with hierarchical queries | Payload-based filtering with structured metadata | Dictionary-based metadata filtering attached to vectors | Key-value filtering using Pythonic expressions |
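
The "alpha parameter for balance" mentioned for hybrid search can be illustrated with a generic weighted score fusion. The sketch below is not any database's actual implementation: `min_max_normalize`, `hybrid_scores`, and the sample scores are hypothetical, and it assumes the convention that `alpha = 1.0` means pure vector search and `alpha = 0.0` means pure keyword (BM25) search.

```python
from typing import Dict, List, Tuple


def min_max_normalize(scores: Dict[str, float]) -> Dict[str, float]:
    """Rescale scores to [0, 1] so dense and sparse scores are comparable."""
    if not scores:
        return {}
    lowest, highest = min(scores.values()), max(scores.values())
    if highest == lowest:
        # All scores equal: treat every document as a full match.
        return {doc_id: 1.0 for doc_id in scores}
    return {doc_id: (s - lowest) / (highest - lowest) for doc_id, s in scores.items()}


def hybrid_scores(
    dense: Dict[str, float],
    sparse: Dict[str, float],
    alpha: float = 0.5,
) -> List[Tuple[str, float]]:
    """Blend normalized dense and sparse scores.

    alpha = 1.0 uses only the dense scores, alpha = 0.0 only the sparse ones.
    A document missing from one result set contributes 0 for that side.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must be between 0 and 1")
    dense_n = min_max_normalize(dense)
    sparse_n = min_max_normalize(sparse)
    fused = {
        doc_id: alpha * dense_n.get(doc_id, 0.0) + (1.0 - alpha) * sparse_n.get(doc_id, 0.0)
        for doc_id in set(dense_n) | set(sparse_n)
    }
    # Highest blended score first; ties are broken by document ID.
    return sorted(fused.items(), key=lambda item: (-item[1], item[0]))


# Hypothetical scores: cosine similarity (dense) and BM25 (sparse).
dense_hits = {"a": 0.9, "b": 0.1}
sparse_hits = {"b": 12.0, "c": 3.0}
print(hybrid_scores(dense_hits, sparse_hits, alpha=0.75))
```

Normalizing before blending matters because BM25 scores are unbounded while cosine similarities are not, so an unnormalized sum would be dominated by the sparse side.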

Hyperparameter-Tuning

Github: https://github.com/avnlp/hyperparameter-tuning
Paper: Optimizer Inclusions