Build LLM applications that connect to your private data using LlamaIndex (formerly GPT Index), a comprehensive data framework for retrieval-augmented generation (RAG). This skill guides you through installing LlamaIndex, ingesting data from various sources, creating vector indices, and querying your data with LLMs.
LlamaIndex helps you augment LLMs with your own data by providing:
- **Data connectors** that ingest documents from files, APIs, and databases
- **Indices** (vector stores, structured indices) that organize your data for retrieval
- **Query engines** that retrieve relevant context and pass it to an LLM
Choose your installation approach based on your needs.
**Option A: Install core + specific integrations (recommended for production)**
```bash
pip install llama-index-core
pip install llama-index-llms-openai
pip install llama-index-embeddings-openai
```
**Option B: Install starter bundle (includes common integrations)**
```bash
pip install llama-index
```
**For OpenAI:**
```python
import os

os.environ["OPENAI_API_KEY"] = "your-api-key-here"

from llama_index.core import VectorStoreIndex, SimpleDirectoryReader

# Load every file in ./data and build an in-memory vector index
documents = SimpleDirectoryReader("./data").load_data()
index = VectorStoreIndex.from_documents(documents)
```
**For Llama 2 via Replicate (or other providers)** (requires the `llama-index-llms-replicate` and `llama-index-embeddings-huggingface` integration packages):
```python
import os

os.environ["REPLICATE_API_TOKEN"] = "your-replicate-token"

from llama_index.core import Settings, VectorStoreIndex, SimpleDirectoryReader
from llama_index.embeddings.huggingface import HuggingFaceEmbedding
from llama_index.llms.replicate import Replicate
from transformers import AutoTokenizer

# Use Llama 2 hosted on Replicate as the LLM
Settings.llm = Replicate(
    model="meta/llama-2-7b-chat:8e6975e5ed6174911a6ff3d60540dfd4844201974602551e10e9e87ab143d81e",
    temperature=0.01,
    additional_kwargs={"top_p": 1, "max_new_tokens": 300},
)

# Match the tokenizer to the model so token counting is accurate
Settings.tokenizer = AutoTokenizer.from_pretrained("NousResearch/Llama-2-7b-chat-hf")

# Embed documents locally instead of calling OpenAI
Settings.embed_model = HuggingFaceEmbedding(model_name="BAAI/bge-small-en-v1.5")

documents = SimpleDirectoryReader("./data").load_data()
index = VectorStoreIndex.from_documents(documents)
```
Place your documents (PDFs, text files, markdown, etc.) in a directory and load them:
```python
from llama_index.core import VectorStoreIndex, SimpleDirectoryReader
documents = SimpleDirectoryReader("./data").load_data()
index = VectorStoreIndex.from_documents(documents)
```
Create a query engine and ask questions about your data:
```python
query_engine = index.as_query_engine()
response = query_engine.query("What are the key findings in the research papers?")
print(response)
```
**Save index to disk:**
```python
index.storage_context.persist(persist_dir="./storage")  # "./storage" is the default
```
**Reload index from disk:**
```python
from llama_index.core import StorageContext, load_index_from_storage
storage_context = StorageContext.from_defaults(persist_dir="./storage")
index = load_index_from_storage(storage_context)
```
**End-to-end example: query company documents**
```python
import os
os.environ["OPENAI_API_KEY"] = "sk-..."
from llama_index.core import VectorStoreIndex, SimpleDirectoryReader
documents = SimpleDirectoryReader("./company_docs").load_data()
index = VectorStoreIndex.from_documents(documents)
query_engine = index.as_query_engine()
response = query_engine.query("What is our company's return policy?")
print(response)
```
**Load a specific PDF** (requires the `llama-index-readers-file` package):
```python
from llama_index.readers.file import PDFReader
from llama_index.core import VectorStoreIndex
loader = PDFReader()
documents = loader.load_data(file="./research_paper.pdf")
index = VectorStoreIndex.from_documents(documents)
```
**Retrieve more context per query:**
```python
from llama_index.core import VectorStoreIndex
index = VectorStoreIndex.from_documents(documents)
query_engine = index.as_query_engine(similarity_top_k=5)  # top 5 chunks instead of the default 2
response = query_engine.query("Explain the methodology")
print(response)
```