1. Text Chunking
In RAG systems, documents are broken into smaller chunks to enable more precise retrieval. Common chunking strategies include:
- Fixed-size chunks: Character-based chunking with an adjustable size. Simple to implement, but may split semantic units mid-sentence or mid-word.
- Sliding window: Fixed-size chunks that overlap with their neighbors, which helps preserve context across chunk boundaries.
- Sentence-based: Groups sentences together, preserving semantic meaning but creating variable-sized chunks.
- Paragraph-based: Uses natural paragraph breaks to create chunks, maintaining complete thoughts.
The choice of chunking strategy significantly impacts retrieval quality. Semantic-preserving methods (sentence/paragraph) often perform better for complex queries.
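The sketch below illustrates these four strategies using only plain Python and the standard library. The function names, default sizes, and the regex-based sentence splitter are illustrative choices; a production system would more likely chunk by tokens or use a dedicated text-splitting library.

```python
import re

def fixed_size_chunks(text, size=50):
    """Fixed-size chunks: split every `size` characters, ignoring semantics."""
    return [text[i:i + size] for i in range(0, len(text), size)]

def sliding_window_chunks(text, size=50, overlap=10):
    """Sliding window: fixed-size chunks that share `overlap` characters."""
    step = size - overlap
    return [text[i:i + size] for i in range(0, len(text), step)]

def sentence_chunks(text, sentences_per_chunk=2):
    """Sentence-based: group whole sentences, so chunk sizes vary."""
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    return [" ".join(sentences[i:i + sentences_per_chunk])
            for i in range(0, len(sentences), sentences_per_chunk)]

def paragraph_chunks(text):
    """Paragraph-based: split on blank lines, keeping complete thoughts."""
    return [p.strip() for p in text.split("\n\n") if p.strip()]
```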
2. Vector Embedding
Each text chunk is converted into a high-dimensional vector (embedding) using a language model.
Real embeddings typically have 768-1536 dimensions, where each dimension represents abstract semantic features.
Below is a simplified representation showing only 10 dimensions per chunk. For educational purposes, we've labeled these dimensions with semantic categories, though in real embedding models, dimensions don't have human-interpretable labels.
Note: In this demo, words related to specific topics (like "Mars" or "red") influence certain dimensions more than others. For example, planet names have higher values in the "Celestial Bodies" dimension.
Chunk | Dim 1 (Celestial Bodies) | Dim 2 (Planetary Features) | Dim 3 (Size/Scale) | Dim 4 (Position/Order) | Dim 5 (Composition) | Dim 6 (Surface Features) | Dim 7 (Satellites) | Dim 8 (Color/Appearance) | Dim 9 (Scientific Interest) | Dim 10 (Historical Significance) |
---|---|---|---|---|---|---|---|---|---|---|
The solar system consists of the Sun and everythin | 0.83 | 0.33 | 0.17 | 0.17 | 0.56 | 0.94 | 1.00 | 0.78 | 0.61 | 0.61 |
g that orbits around it. This includes eight plane | -1.00 | -1.00 | -1.00 | -1.00 | -1.00 | -1.00 | -1.00 | -1.00 | -1.00 | -1.00 |
ts: Mercury, Venus, Earth, Mars, Jupiter, Saturn, | 1.00 | 1.00 | 0.73 | 0.47 | 0.38 | 0.20 | 0.07 | 0.12 | 0.25 | 0.34 |
Uranus, and Neptune. Earth is our home planet, wit | 0.13 | 1.00 | 0.54 | 0.79 | 0.63 | 0.46 | 0.42 | 0.25 | 0.08 | 0.08 |
h vast oceans covering most of its surface. It has | -0.69 | 0.37 | 0.16 | -0.05 | -0.26 | 1.00 | -0.69 | -0.90 | -0.48 | 0.79 |
one natural satellite called the Moon. Mars, ofte | 0.22 | 0.28 | 1.00 | 0.82 | 0.64 | 0.46 | 0.70 | 0.34 | 0.16 | 0.16 |
n called the Red Planet, has captured our imaginat | 0.23 | 0.72 | 0.23 | 0.79 | 0.65 | 0.51 | 0.51 | 1.00 | 0.79 | 0.65 |
ion for centuries. Scientists believe Mars once ha | 0.87 | 1.00 | 0.87 | 0.74 | 0.61 | 0.48 | 0.35 | 0.74 | 0.61 | 0.48 |
d flowing water on its surface. Jupiter is the lar | 0.40 | 0.91 | 0.79 | 0.66 | 0.53 | 1.00 | 0.28 | 0.28 | 0.79 | 0.91 |
gest planet in our solar system, with a distinctiv | 0.48 | 0.62 | 0.28 | 0.52 | 0.76 | 1.00 | 0.90 | 0.79 | 0.69 | 0.59 |
e Great Red Spot and numerous moons. | 0.39 | 0.47 | 0.54 | 0.47 | 0.39 | 0.31 | 0.39 | 1.00 | 0.85 | 0.77 |
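As a rough sketch of how real embeddings are produced, the snippet below assumes the sentence-transformers package and the all-MiniLM-L6-v2 model (384 dimensions); any embedding model with a similar interface would work. The chunk strings are the fixed-size chunks from the table above.

```python
# pip install sentence-transformers
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("all-MiniLM-L6-v2")  # 384-dimensional embeddings

chunks = [
    "The solar system consists of the Sun and everythin",
    "g that orbits around it. This includes eight plane",
    # ... remaining fixed-size chunks from the table above
]

# One vector per chunk. Each dimension is an abstract learned feature,
# not a human-interpretable category like the labels used in this demo.
embeddings = model.encode(chunks, normalize_embeddings=True)
print(embeddings.shape)  # (num_chunks, 384)
```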
3. Vector Similarity Visualization
When a query is entered, it's also converted to an embedding vector.
The system calculates similarity between the query vector and all chunk vectors using cosine similarity.
The visualization below shows how the query embedding overlaps with each chunk embedding:
- Query column: Shows the query's embedding values across dimensions
- Chunk column: Shows the chunk's embedding values across dimensions
- Overlap column: Shows the product of query and chunk values for each dimension
Positive products (green) contribute positively to similarity, while negative products (red) reduce it. The width of each bar indicates the magnitude of that dimension's contribution. Summing the per-dimension products gives the dot product of the two vectors; dividing by the product of their lengths yields the cosine similarity.
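The sketch below reproduces this computation with NumPy: the per-dimension products correspond to the Overlap column, and their normalized sum is the cosine similarity. The chunk vector is taken from the table above; the query vector is illustrative.

```python
import numpy as np

def cosine_similarity(query_vec, chunk_vec):
    q = np.asarray(query_vec, dtype=float)
    c = np.asarray(chunk_vec, dtype=float)
    overlap = q * c  # per-dimension products shown in the Overlap column
    score = overlap.sum() / (np.linalg.norm(q) * np.linalg.norm(c))
    return score, overlap

# Chunk vector: the "n called the Red Planet..." row from the table above.
query = [0.9, 0.6, 0.1, 0.2, 0.3, 0.2, 0.1, 0.8, 0.4, 0.3]   # illustrative
chunk = [0.23, 0.72, 0.23, 0.79, 0.65, 0.51, 0.51, 1.00, 0.79, 0.65]
score, per_dim = cosine_similarity(query, chunk)
print(round(float(score), 3))
```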
4. Retrieval Results
Chunks are ranked by similarity score, and the most relevant ones are retrieved.
These chunks provide the context needed to answer the query accurately.
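A minimal sketch of this ranking step, assuming the query and chunk embeddings are NumPy arrays such as those produced in the embedding sketch above (the function name and the value of k are illustrative):

```python
import numpy as np

def retrieve_top_k(query_vec, chunk_vecs, chunk_texts, k=3):
    q = np.asarray(query_vec, dtype=float)
    scored = []
    for vec, text in zip(chunk_vecs, chunk_texts):
        v = np.asarray(vec, dtype=float)
        score = float(np.dot(q, v) / (np.linalg.norm(q) * np.linalg.norm(v)))
        scored.append((score, text))
    # Rank by cosine similarity, highest first, and keep the k best chunks.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]
```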
5. LLM Prompt Construction
In a RAG system, the retrieved chunks are combined with the original query to create a prompt for the language model.
This prompt provides the LLM with the necessary context to generate an accurate and relevant response.
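A hedged sketch of this assembly step follows. The template wording is one common pattern rather than a prescribed format, and the example query and retrieved chunks are illustrative (the chunks are rows from the table above).

```python
def build_prompt(query_text, retrieved_chunks):
    """Combine retrieved chunks and the user query into an LLM prompt."""
    context = "\n\n".join(retrieved_chunks)
    return (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {query_text}\n"
        "Answer:"
    )

prompt = build_prompt(
    "Why is Mars called the Red Planet?",
    ["n called the Red Planet, has captured our imaginat",
     "ion for centuries. Scientists believe Mars once ha"],
)
print(prompt)
```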