Interactive RAG-based Learning Platform

GenAI, Machine Learning, Streamlit, RAG, AI

Tuesday, October 15, 2024

Developing an Interactive RAG-Powered Learning Tool

In this project, I set out to create an engaging and personalized learning experience using Retrieval-Augmented Generation (RAG) technology and AI-driven tools. Here’s a detailed look at the journey and outcome.

The Problem Statement

The aim was to address two key challenges:

  1. Implementing a RAG system and developing tools to leverage its capabilities.
  2. Adding text-to-speech functionality to make learning more accessible and interactive.

To determine the tools to develop, I started with some fundamental questions:

  • How can we make studying more interactive and engaging?
  • Can we provide instant, accurate answers to students' questions?
  • Is it possible to generate custom study materials tailored to individual needs?

With these objectives in mind, I ventured into the world of RAG and AI agents.

Developing the RAG System

The core of the project is the Retrieval-Augmented Generation (RAG) system, which I built through the following steps:

  1. Document Ingestion: Created an ingest.py script to process the NCERT Sound chapter PDF. Used PyPDFLoader and PDFPlumberLoader for robust PDF parsing. Split the text into manageable chunks using RecursiveCharacterTextSplitter.
  2. Vector Database: Implemented vector_db.py to create a Chroma vector store for efficient similarity searches. Used the sentence-transformers/all-MiniLM-L6-v2 model for generating embeddings, balancing performance and accuracy.
  3. RAG System Implementation: Developed the RAG system in rag_system.py using Google’s Gemini 1.5 Flash model for generating responses. Integrated context retrieval from the vector store to inform responses.
  4. API Development: Built a FastAPI backend (app.py) to expose endpoints for generating responses, creating quizzes, and more.
  5. Frontend Design: Designed a Streamlit frontend (frontend.py) to provide an intuitive interface for students to interact with the system.
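
To make the retrieval flow concrete, here is a minimal, dependency-free sketch of the ideas behind the first three steps: overlapping chunks, similarity search, and a context-grounded prompt. It is not the project's code. split_text is a simplified fixed-size stand-in for RecursiveCharacterTextSplitter, the bag-of-words embed function stands in for the all-MiniLM-L6-v2 embeddings, a sorted list stands in for Chroma, and every function name and the prompt wording are assumptions made for illustration.

```python
import math
import re
from collections import Counter


def split_text(text: str, chunk_size: int = 200, overlap: int = 40) -> list[str]:
    """Cut text into fixed-size character chunks that share `overlap` characters."""
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be non-negative and smaller than chunk_size")
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the chunk just added already reaches the end of the text
        start += chunk_size - overlap
    return chunks


def embed(text: str) -> Counter:
    """Toy embedding: lowercase word counts (stand-in for a sentence-transformer)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[token] * b[token] for token in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(question: str, chunks: list[str], k: int = 2) -> list[str]:
    """Return the k chunks most similar to the question."""
    query = embed(question)
    ranked = sorted(chunks, key=lambda c: cosine(query, embed(c)), reverse=True)
    return ranked[:k]


def build_prompt(question: str, context_chunks: list[str]) -> str:
    """Combine retrieved chunks and the question into one prompt string."""
    context = "\n\n".join(context_chunks)
    return (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}"
    )
```

In the real pipeline the resulting prompt would be sent to Gemini 1.5 Flash; the sketch stops at building it, so it runs without any API key.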

Specialized Agents and Tools

To enhance the learning experience, I developed several specialized agents and tools:

  1. Quiz Agent (quiz_agent.py): Generates dynamic quizzes based on chapter content. Supports multiple question types (e.g., multiple-choice, true/false). Provides instant feedback on user responses.
  2. Diagram Agent (diagram_agent.py): Creates ASCII flowcharts to visualize complex concepts. Summarizes key chapter topics in an engaging format.
  3. Exam Guide Agent (exam_guide_agent.py): Generates custom exam guides with important questions and detailed solutions. Tailored for effective test preparation.
  4. Summary Tool (summary_tool.py): Produces concise summaries for quick revision. Lists important topics in a structured format.
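
As an illustration of the quiz agent's "instant feedback" idea, here is a minimal sketch of how a generated question could be represented and graded. The QuizQuestion class, its field names, and the feedback wording are assumptions for this example, not the contents of quiz_agent.py. A true/false question is modelled as a multiple-choice question with two options.

```python
from dataclasses import dataclass


@dataclass
class QuizQuestion:
    prompt: str
    options: list[str]
    answer_index: int      # position of the correct option in `options`
    explanation: str = ""  # optional note shown after answering


def grade(question: QuizQuestion, chosen_index: int) -> tuple[bool, str]:
    """Return (is_correct, feedback) for the option the student picked."""
    if not 0 <= chosen_index < len(question.options):
        raise ValueError("chosen_index is out of range")
    correct_option = question.options[question.answer_index]
    is_correct = chosen_index == question.answer_index
    if is_correct:
        feedback = f'Correct. The answer is "{correct_option}".'
    else:
        feedback = f'Not quite. The answer is "{correct_option}".'
    if question.explanation:
        feedback += " " + question.explanation
    return is_correct, feedback


# Example: a true/false question about the Sound chapter.
question = QuizQuestion(
    prompt="Sound can travel through a vacuum.",
    options=["True", "False"],
    answer_index=1,
    explanation="Sound needs a material medium to travel.",
)
is_correct, feedback = grade(question, chosen_index=0)
```

In the project the questions come from the model; keeping the answer key in a structured object like this is one way to grade a response without a second model call.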

Text-to-Speech Integration

To make learning accessible and interactive, I integrated the Sarvam.ai API for text-to-speech functionality. This feature:

  • Lets students listen to answers, summaries, and other content.
  • Caters to auditory learners and supports multi-modal learning.

Integration Process:

  1. Set up authentication with the Sarvam.ai API key.
  2. Added a dedicated FastAPI endpoint for text-to-speech requests.
  3. Implemented frontend controls for text-to-speech conversion and playback.
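
The first integration step, attaching the API key to a text-to-speech request, can be sketched with the standard library alone. Everything provider-specific here is a hypothetical placeholder: TTS_URL, the X-API-Key header name, and the payload fields "text" and "language" are assumptions, not the Sarvam.ai API, and should be replaced with the values from the provider's documentation. The function only builds the request; it does not send it.

```python
import json
import urllib.request

# Hypothetical placeholders: replace with the values from the provider's API docs.
TTS_URL = "https://tts.example.invalid/v1/speech"
API_KEY_HEADER = "X-API-Key"


def build_tts_request(text: str, api_key: str, language: str = "en-IN") -> urllib.request.Request:
    """Build (but do not send) an authenticated JSON POST request for speech synthesis."""
    text = text.strip()
    if not text:
        raise ValueError("text must not be empty")
    if not api_key:
        raise ValueError("api_key must not be empty")
    # Payload field names are illustrative, not a real provider schema.
    payload = json.dumps({"text": text, "language": language}).encode("utf-8")
    return urllib.request.Request(
        TTS_URL,
        data=payload,
        headers={API_KEY_HEADER: api_key, "Content-Type": "application/json"},
        method="POST",
    )
```

In the backend, the key would come from the SARVAM_API_KEY environment variable, and the request would be sent from inside the dedicated FastAPI endpoint so that the key never reaches the Streamlit frontend.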

Key Features

  1. Question & Answer System: Ask detailed questions about the Sound chapter and get answers grounded in the retrieved chapter text.
  2. Text-to-Speech: Convert text responses to speech for auditory learning.
  3. Chapter Summary: Generate concise summaries of chapter content.
  4. Interactive Quiz: Take quizzes with dynamically generated questions.
  5. Summary Flowchart: Visualize key concepts as flowcharts.
  6. Exam Guide: Create tailored guides with practice questions.
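
For the Summary Flowchart feature, one simple way to draw a linear chain of concepts as an ASCII flowchart is sketched below. The render_flowchart function and its box style are assumptions for illustration; in the project the list of steps would come from the model's summary rather than being hard-coded.

```python
def render_flowchart(steps: list[str]) -> str:
    """Render steps as equally wide ASCII boxes joined by downward arrows."""
    if not steps:
        return ""
    inner_width = max(len(step) for step in steps) + 2  # one space of padding per side
    border = "+" + "-" * inner_width + "+"
    arrow_pad = " " * (inner_width // 2 + 1)  # lines the arrow up with the box centre
    lines = []
    for index, step in enumerate(steps):
        if index > 0:
            lines.append(arrow_pad + "|")
            lines.append(arrow_pad + "v")
        lines.append(border)
        lines.append("|" + step.center(inner_width) + "|")
        lines.append(border)
    return "\n".join(lines)


print(render_flowchart(["Vibrating object", "Sound wave in medium", "Ear"]))
```

Plain text output like this can be shown in Streamlit with a monospaced element such as st.code, which keeps the boxes aligned.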

Technology Stack

  • Backend: FastAPI
  • Frontend: Streamlit
  • AI Model: Google’s Gemini 1.5 Flash
  • Vector Database: Chroma
  • Embeddings: Hugging Face (sentence-transformers/all-MiniLM-L6-v2)
  • PDF Processing: PyPDFLoader, PDFPlumberLoader
  • Text-to-Speech: Sarvam AI API

Setup and Installation

Clone the repository:

git clone <repository_url>
cd <repository_directory>

Install dependencies:

pip install fastapi streamlit langchain google-generativeai requests chromadb sentence_transformers langchain_community pydantic uvicorn pypdf pdfplumber

Set up environment variables:

  • GOOGLE_API_KEY: Your Google API key for Gemini.
  • SARVAM_API_KEY: Your Sarvam AI API key for text-to-speech.
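
A small start-up check can catch a missing key before the first request fails. The sketch below is an assumption about how such a check could look, not code from the repository; load_api_keys is a hypothetical helper name, and only the two variable names come from the list above.

```python
import os
from collections.abc import Mapping
from typing import Optional

REQUIRED_VARS = ("GOOGLE_API_KEY", "SARVAM_API_KEY")


def load_api_keys(env: Optional[Mapping[str, str]] = None) -> dict[str, str]:
    """Return the required keys, or raise with the names of any that are unset or empty."""
    env = os.environ if env is None else env
    missing = [name for name in REQUIRED_VARS if not env.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return {name: env[name] for name in REQUIRED_VARS}
```

Calling load_api_keys() once when the backend starts turns a missing key into an immediate, readable error instead of a failed API call later on.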

Run the application:

Start the FastAPI backend (the module name before the colon must match the backend file's name, without the .py extension):

uvicorn api:app --reload

Run the Streamlit frontend:

streamlit run frontend.py

Results and Applications

This interactive learning tool demonstrates the potential of AI in education by offering:

  • Engaging Study Sessions: Interactive Q&A, quizzes, and flowcharts.
  • Accessibility: Multi-modal learning with text-to-speech capabilities.
  • Customization: Personalized study materials and exam guides.

Potential applications include:

  • Student Learning: Enhancing comprehension of complex topics.
  • Test Preparation: Efficiently revising and practicing key concepts.
  • Educational Platforms: Integrating advanced tools for interactive learning.

Conclusion

Developing this RAG-powered interactive learning tool has been a rewarding project. By combining RAG technology, specialized AI agents, and text-to-speech capabilities, I’ve built a platform that gives students several ways to engage with study materials: asking questions, taking quizzes, reviewing summaries and flowcharts, and listening to answers.

This open-source project is available on GitHub. Feel free to explore, contribute, or adapt it to your needs. Let’s make learning more accessible, engaging, and effective for everyone!

Happy learning!