niri-ai-sidebar/.kiro/specs/ai-sidebar-enhancements/design.md

Design Document: AI Sidebar Enhancements

Overview

This design document outlines the technical approach for enhancing the AI sidebar module with streaming responses, improved UI, conversation management commands, persistence features, and reasoning mode controls. The enhancements build upon the existing GTK4-based architecture using the Ollama Python SDK.

The current implementation uses:

  • GTK4 for UI with gtk4-layer-shell for Wayland integration
  • Ollama Python SDK for LLM interactions
  • JSON-based conversation persistence via ConversationManager
  • Threading for async operations with GLib.idle_add for UI updates

Architecture

Current Architecture Overview

┌─────────────────────────────────────────────────────────┐
│                    SidebarWindow (GTK4)                  │
│  ┌────────────────────────────────────────────────────┐ │
│  │  Header (Title + Model Label)                      │ │
│  ├────────────────────────────────────────────────────┤ │
│  │  ScrolledWindow                                    │ │
│  │    └─ Message List (Gtk.Box vertical)             │ │
│  ├────────────────────────────────────────────────────┤ │
│  │  Input Box (Entry + Send Button)                  │ │
│  └────────────────────────────────────────────────────┘ │
└─────────────────────────────────────────────────────────┘
         │                           │
         ▼                           ▼
┌──────────────────┐        ┌──────────────────┐
│ ConversationMgr  │        │  OllamaClient    │
│  - Load/Save     │        │  - chat()        │
│  - Messages      │        │  - stream_chat() │
└──────────────────┘        └──────────────────┘

Enhanced Architecture

The enhancements will introduce:

  1. CommandProcessor: New component to parse and execute slash commands
  2. StreamingHandler: Manages token streaming and UI updates
  3. ConversationArchive: Extends ConversationManager for multi-conversation management
  4. ReasoningController: Manages reasoning mode state and formatting
  5. Enhanced Input Widget: Multi-line text view replacing single-line entry

Components and Interfaces

1. Streaming Response Display

StreamingHandler Class

class StreamingHandler:
    """Manages streaming response display with token-by-token updates."""
    
    def __init__(self, message_widget: Gtk.Label, scroller: Gtk.ScrolledWindow):
        self._widget = message_widget
        self._scroller = scroller
        self._buffer = ""
        self._is_streaming = False
    
    def start_stream(self) -> None:
        """Initialize streaming state."""
        
    def append_token(self, token: str) -> None:
        """Add token to buffer and update UI via GLib.idle_add."""
        
    def finish_stream(self) -> str:
        """Finalize streaming and return complete content."""

Integration Points

  • Modify _request_response() to use ollama_client.stream_chat() instead of chat()
  • Use GLib.idle_add to schedule UI updates for each token on the main thread
  • Create message widget before streaming starts, update label text progressively
  • Maintain smooth scrolling by calling _scroll_to_bottom() periodically (not per token)

Technical Considerations

  • Token updates must occur on GTK main thread via GLib.idle_add
  • Buffer tokens to reduce UI update frequency (e.g., every 3-5 tokens or 50ms)
  • Handle stream interruption and error states gracefully
  • Show visual indicator (e.g., cursor or "..." suffix) during active streaming
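The buffering rule above can be sketched as a small helper. This is a hypothetical sketch: `flush_cb` stands in for a callable that wraps `GLib.idle_add` to update the message label on the main thread, and the batch size and interval are the illustrative 3-5 tokens / 50 ms thresholds, not fixed API.

```python
import time

class TokenBuffer:
    """Batches streamed tokens so the UI is updated every few tokens
    (or after a short interval) instead of once per token."""

    def __init__(self, flush_cb, batch_size=4, max_interval=0.05):
        self._flush_cb = flush_cb      # in the sidebar: wraps GLib.idle_add
        self._batch_size = batch_size
        self._max_interval = max_interval
        self._pending = []
        self._last_flush = time.monotonic()

    def append(self, token: str) -> None:
        """Queue a token; flush when the batch or interval threshold is hit."""
        self._pending.append(token)
        if (len(self._pending) >= self._batch_size
                or time.monotonic() - self._last_flush >= self._max_interval):
            self.flush()

    def flush(self) -> None:
        """Emit all pending tokens as one UI update."""
        if self._pending:
            self._flush_cb("".join(self._pending))
            self._pending.clear()
        self._last_flush = time.monotonic()
```

`StreamingHandler.append_token()` would delegate to `append()`, and `finish_stream()` would call `flush()` one final time before returning the complete buffer.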

2. Improved Text Input Field

TextView Widget Replacement

Replace Gtk.Entry with Gtk.TextView wrapped in Gtk.ScrolledWindow:

# Current: Gtk.Entry (single line)
self._entry = Gtk.Entry()

# Enhanced: Gtk.TextView (multi-line)
self._text_view = Gtk.TextView()
self._text_view.set_wrap_mode(Gtk.WrapMode.WORD_CHAR)
self._text_buffer = self._text_view.get_buffer()
text_scroller = Gtk.ScrolledWindow()
text_scroller.set_child(self._text_view)
text_scroller.set_min_content_height(40)
text_scroller.set_max_content_height(200)
# max_content_height only takes effect when natural-height propagation is on
text_scroller.set_propagate_natural_height(True)

Features

  • Automatic text wrapping with set_wrap_mode(Gtk.WrapMode.WORD_CHAR)
  • Dynamic height expansion up to max height (200px), then scroll
  • Shift+Enter for new lines, Enter alone to submit
  • Placeholder text using CSS or empty buffer state
  • Maintain focus behavior with proper event controllers

Key Bindings

  • Enter: Submit message (unless Shift is held)
  • Shift+Enter: Insert newline
  • Ctrl+A: Select all text
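The Enter/Shift+Enter policy can be expressed as a small pure function. The sketch below avoids a `gi` dependency by defining keyval constants whose values match `Gdk.KEY_Return` and `Gdk.KEY_KP_Enter`; in the widget it would be called from a `Gtk.EventControllerKey` "key-pressed" handler, with the Shift flag read from the modifier state.

```python
# Keyval constants; values match Gdk.KEY_Return and Gdk.KEY_KP_Enter
KEY_RETURN = 0xFF0D
KEY_KP_ENTER = 0xFF8D

def handle_enter(keyval: int, shift_held: bool) -> str:
    """Decide what a keypress should do in the input TextView.

    Returns "submit", "newline", or "ignore". The key-pressed handler
    returns True for "submit" to stop the default newline insertion.
    """
    if keyval not in (KEY_RETURN, KEY_KP_ENTER):
        return "ignore"
    return "newline" if shift_held else "submit"
```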

3. Conversation Management Commands

CommandProcessor Class

class CommandProcessor:
    """Parses and executes slash commands."""
    
    COMMANDS = {
        "/new": "start_new_conversation",
        "/clear": "start_new_conversation",  # Alias for /new
        "/models": "list_models",
        "/model": "switch_model",
        "/resume": "resume_conversation",
        "/list": "list_conversations",
    }
    
    def is_command(self, text: str) -> bool:
        """Check if text starts with a command."""
        
    def execute(self, text: str) -> CommandResult:
        """Parse and execute command, return result."""

Command Implementations

/new and /clear

  • Save current conversation with timestamp-based ID
  • Reset conversation manager to new default conversation
  • Clear message list UI
  • Show confirmation message

/models

  • Query ollama_client.list_models()
  • Display formatted list in message area
  • Highlight current model

/model <name>

  • Validate model name against available models
  • Update _current_model attribute
  • Update model label in header
  • Show confirmation message

/list

  • Scan conversation storage directory
  • Display conversations with ID, timestamp, message count
  • Format as selectable list

/resume <id>

  • Load specified conversation via ConversationManager
  • Clear and repopulate message list
  • Update window title/header with conversation ID

UI Integration

  • Check for commands in _on_submit() before processing as user message
  • Display command results as system messages (distinct styling)
  • Provide command help via /help command
  • Support tab completion for commands (future enhancement)

4. Conversation Persistence and Resume

ConversationArchive Extension

Extend ConversationManager with multi-conversation capabilities:

class ConversationArchive:
    """Manages multiple conversation files."""
    
    def __init__(self, storage_dir: Path):
        self._storage_dir = storage_dir
    
    def list_conversations(self) -> List[ConversationMetadata]:
        """Return metadata for all saved conversations."""
        
    def archive_conversation(self, conversation_id: str) -> str:
        """Save conversation with timestamp-based archive ID."""
        
    def load_conversation(self, archive_id: str) -> ConversationState:
        """Load archived conversation by ID."""
        
    def generate_archive_id(self) -> str:
        """Create unique ID: YYYYMMDD_HHMMSS_<short-hash>"""

File Naming Convention

  • Active conversation: default.json
  • Archived conversations: archive_YYYYMMDD_HHMMSS_<hash>.json
  • Metadata includes: id, created_at, updated_at, message_count, first_message_preview

Workflow

  1. User types /new or /clear

  2. Current conversation saved as archive file

  3. New ConversationManager instance created with "default" ID

  4. UI cleared and reset

  5. Confirmation message shows archive ID

  6. User types /list

  7. System scans storage directory for archive files

  8. Displays formatted list with metadata

  9. User types /resume <id>

  10. ConversationManager loads specified archive

  11. UI repopulated with conversation history

  12. User can continue conversation

5. Reasoning Mode Toggle

ReasoningController Class

class ReasoningController:
    """Manages reasoning mode state and API parameters."""
    
    def __init__(self):
        self._enabled = False
        self._preference_file = Path.home() / ".config" / "aisidebar" / "preferences.json"
    
    def is_enabled(self) -> bool:
        """Check if reasoning mode is active."""
        
    def toggle(self) -> bool:
        """Toggle reasoning mode and persist preference."""
        
    def get_chat_options(self) -> dict:
        """Return Ollama API options for reasoning mode."""

UI Components

Add toggle button to header area:

self._reasoning_toggle = Gtk.ToggleButton(label="🧠 Reasoning")
self._reasoning_toggle.connect("toggled", self._on_reasoning_toggled)

Ollama Integration

When reasoning mode is enabled, pass additional options to Ollama:

# Standard mode
ollama.chat(model=model, messages=messages)

# Reasoning mode (model-dependent)
ollama.chat(
    model=model,
    messages=messages,
    options={
        "temperature": 0.7,
        # Model-specific reasoning parameters
    }
)

Message Formatting

When reasoning is enabled and model supports it:

  • Display thinking process in distinct style (italic, gray text)
  • Separate reasoning from final answer with visual divider
  • Use expandable/collapsible section for reasoning (optional)
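One way to separate reasoning from the final answer, assuming the model wraps its thinking in `<think>…</think>` tags (as some reasoning models served through Ollama do); the tag convention is an assumption and varies by model.

```python
def split_reasoning(text: str) -> tuple[str, str]:
    """Split a response into (reasoning, answer).

    Assumes the model emits <think>...</think> around its thinking;
    returns empty reasoning when no tags are present.
    """
    start, end = text.find("<think>"), text.find("</think>")
    if start != -1 and end != -1:
        reasoning = text[start + len("<think>"):end].strip()
        answer = (text[:start] + text[end + len("</think>"):]).strip()
        return reasoning, answer
    return "", text.strip()
```

The reasoning half would get the italic gray styling, the answer half the normal message styling.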

Persistence

  • Save reasoning preference to ~/.config/aisidebar/preferences.json
  • Load preference on startup
  • Apply to all new conversations
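The load/save cycle can be sketched as below; key names mirror the `PreferencesState` data model, and missing or corrupt files fall back to defaults so the preferences file stays optional.

```python
import json
from pathlib import Path

DEFAULTS = {"reasoning_enabled": False, "default_model": None, "theme": "default"}

def load_preferences(path: Path) -> dict:
    """Load preferences, falling back to defaults if the file is absent or corrupt."""
    prefs = dict(DEFAULTS)
    try:
        prefs.update(json.loads(path.read_text()))
    except (FileNotFoundError, json.JSONDecodeError):
        pass  # first run, or unreadable file: keep defaults
    return prefs

def save_preferences(path: Path, prefs: dict) -> None:
    """Persist preferences, creating the config directory if needed."""
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(prefs, indent=2))
```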

Data Models

ConversationMetadata

@dataclass
class ConversationMetadata:
    """Metadata for conversation list display."""
    archive_id: str
    created_at: str
    updated_at: str
    message_count: int
    preview: str  # First 50 chars of first user message

CommandResult

@dataclass
class CommandResult:
    """Result of command execution."""
    success: bool
    message: str
    data: dict | None = None

PreferencesState

@dataclass
class PreferencesState:
    """User preferences for sidebar behavior."""
    reasoning_enabled: bool = False
    default_model: str | None = None
    theme: str = "default"

Error Handling

Streaming Errors

  • Connection Lost: Display partial response + error message, allow retry
  • Model Unavailable: Fall back to non-streaming mode with error notice
  • Stream Timeout: Cancel after 60s, show timeout message
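The timeout rule can be enforced by wrapping the token iterator; this is a sketch using a wall-clock deadline (the 60 s default matches the rule above). Tokens already yielded stay in the caller's buffer, so the partial response can still be displayed alongside the error message.

```python
import time

def with_timeout(token_iter, timeout_s: float = 60.0):
    """Yield tokens from a stream, raising TimeoutError once the whole
    stream has run longer than timeout_s."""
    deadline = time.monotonic() + timeout_s
    for token in token_iter:
        if time.monotonic() > deadline:
            raise TimeoutError("stream exceeded timeout")
        yield token
```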

Command Errors

  • Invalid Command: Show available commands with /help
  • Invalid Arguments: Display command usage syntax
  • File Not Found: Handle missing conversation archives gracefully
  • Permission Errors: Show clear error message for storage access issues

Conversation Loading Errors

  • Corrupted JSON: Log error, skip file, continue with other conversations
  • Missing Files: Remove from list, show warning
  • Version Mismatch: Attempt migration or show incompatibility notice
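The skip-and-continue policy for corrupted archives can be sketched as below; the on-disk shape (a top-level "messages" list) is an assumption about the existing JSON format.

```python
import json
from pathlib import Path

def list_archives(storage_dir: Path) -> list[dict]:
    """Collect archive metadata, skipping files that fail to parse."""
    entries = []
    for path in sorted(storage_dir.glob("archive_*.json")):
        try:
            data = json.loads(path.read_text())
        except (OSError, json.JSONDecodeError):
            continue  # corrupted or unreadable: log and move on
        entries.append({
            "archive_id": path.stem.removeprefix("archive_"),
            "message_count": len(data.get("messages", [])),
        })
    return entries
```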

Testing Strategy

Unit Tests

  1. StreamingHandler

    • Token buffering logic
    • Thread-safe UI updates
    • Stream completion handling
  2. CommandProcessor

    • Command parsing (valid/invalid formats)
    • Each command execution path
    • Error handling for malformed commands
  3. ConversationArchive

    • Archive ID generation uniqueness
    • List/load/save operations
    • File system error handling
  4. ReasoningController

    • Toggle state management
    • Preference persistence
    • API option generation

Integration Tests

  1. End-to-End Streaming

    • Mock Ollama stream response
    • Verify UI updates occur
    • Check final message persistence
  2. Command Workflows

    • /new → archive → /list → /resume flow
    • Model switching with active conversation
    • Command execution during streaming (edge case)
  3. Multi-line Input

    • Text wrapping behavior
    • Submit vs newline key handling
    • Height expansion limits

Manual Testing Checklist

  • Stream response displays smoothly without flicker
  • Multi-line input expands and wraps correctly
  • All commands execute successfully
  • Conversation archives persist across restarts
  • Resume loads correct conversation history
  • Reasoning toggle affects model behavior
  • UI remains responsive during streaming
  • Error states display helpful messages

Implementation Notes

GTK4 Threading Considerations

  • All UI updates must occur on main thread via GLib.idle_add()
  • Worker threads for Ollama API calls to prevent UI blocking
  • Use GLib.PRIORITY_DEFAULT for normal updates, GLib.PRIORITY_HIGH for critical UI state

Performance Optimizations

  • Buffer tokens (3-5 at a time) to reduce GLib.idle_add overhead
  • Limit scroll updates to every 100ms during streaming
  • Cache conversation metadata to avoid repeated file I/O
  • Lazy-load conversation content only when resuming

Backward Compatibility

  • Existing default.json conversation file remains compatible
  • New archive files use distinct naming pattern
  • Preferences file is optional; defaults work without it
  • Graceful degradation if gtk4-layer-shell unavailable

Future Enhancements

  • Command history with up/down arrow navigation
  • Conversation search functionality
  • Export conversations to markdown
  • Custom keyboard shortcuts
  • Syntax highlighting for code in messages
  • Image/file attachment support