Diffstat (limited to 'CLAUDE.md')
-rw-r--r-- CLAUDE.md | 97
1 files changed, 86 insertions, 11 deletions
diff --git a/CLAUDE.md b/CLAUDE.md
index c08d8c8..211510e 100644
--- a/CLAUDE.md
+++ b/CLAUDE.md
@@ -3,7 +3,7 @@
This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.
## Project Overview
-SORMARK is a bookmark management application that uses LLMs to analyze and categorize bookmarks from Twitter (X), browser bookmarks, and other sources. The LLM processes each bookmark to categorize it, read linked content, and decide actions like storing in an Obsidian-like vault or creating calendar events.
+SORMARK is an AI-powered bookmark management application that analyzes and categorizes Twitter bookmarks using LLMs. It fetches bookmarks from Twitter, processes them with Google's Gemini AI to categorize content, and provides a workflow to review and organize bookmarks by category.
## Tech Stack
- **Frontend**: Waku (React framework with server components)
@@ -11,6 +11,8 @@ SORMARK is a bookmark management application that uses LLMs to analyze and categ
- **Styling**: Tailwind CSS v4
- **Language**: TypeScript with strict configuration
- **Development**: Devenv for environment management
+- **LLM**: Google Gemini 2.5-flash for categorization
+- **Data Storage**: JSON files and potential Obsidian integration
## Development Commands
@@ -37,9 +39,9 @@ bun start
### Environment Setup
The project uses devenv for environment management. Key environment variables:
-- `ANTHROPIC_BASE_URL`: Set to "https://api.moonshot.ai/anthropic"
-- `ANTHROPIC_AUTH_TOKEN`: API key for LLM integration (already configured)
-- `TWITTER_COKI`: Cookie used on the x.com frontend.
+- `GEMINI_API_KEY`: Google Gemini API key for LLM categorization
+- `TWITTER_COKI`: Cookie used on x.com frontend for bookmark access
+- `TWATTER_COKI`: Alternative environment variable name for Twitter cookie
## Project Structure
```
@@ -47,20 +49,93 @@ app/
├── src/
│ ├── pages/ # Waku pages (file-based routing)
│ │ ├── _layout.tsx # Root layout component
-│ │ ├── index.tsx # Home page
+│ │ ├── index.tsx # Home page - displays all bookmarks
+│ │ ├── categorize.tsx # Bookmark categorization workflow
│ │ └── about.tsx # About page
│ ├── components/ # React components
│ │ ├── counter.tsx # Client-side counter example
│ │ ├── header.tsx # Site header
│ │ └── footer.tsx # Site footer
+│ ├── lib/ # Core libraries
+│ │ ├── twitter-api.ts # Twitter API integration
+│ │ ├── categorization.ts # Category definitions and types
+│ │ ├── llm-service.ts # Google Gemini integration
+│ │ ├── llm-prompts.ts # LLM categorization prompts
+│ │ ├── bookmark-storage.ts # Bookmark storage utilities
+│ │ └── testData.json # Test bookmark data
│ └── styles.css # Global styles with Tailwind
├── public/ # Static assets
└── package.json # Dependencies and scripts
```
-## Architecture Notes
-- Uses Waku framework with React Server Components
-- File-based routing in `src/pages/` directory
-- Static rendering configured for all pages
-- Client components use `'use client'` directive
-- Tailwind CSS v4 configured with PostCSS
+## Core Features Implemented
+
+### ✅ Twitter Integration
+- **Bookmark Fetching**: Fetches all bookmarks from Twitter using authenticated API calls
+- **Media Processing**: Extracts images and video thumbnails from bookmarks
+- **Rate Limiting**: Respects Twitter API limits with 1-second delays between requests
+- **Error Handling**: Comprehensive error handling for API failures
+
+### ✅ AI Categorization
+- **LLM Integration**: Uses Google Gemini 2.5-flash for intelligent content categorization
+- **Image Analysis**: Analyzes bookmark images to enhance categorization accuracy
+- **Custom Categories**: User-defined category system with criteria
+- **Multi-category Support**: Allows bookmarks to belong to multiple categories
+- **Confidence Scoring**: Provides confidence levels for categorization suggestions
+
+### ✅ Server-Rendered Categorization Workflow
+- **Progressive Processing**: Processes bookmarks one-by-one with server-side LLM calls
+- **Progress Tracking**: Shows "Bookmark X of Y" with visual progress bar
+- **Category Selection**: Checkbox-based interface for selecting categories
+- **Save & Next**: Form-based navigation to next bookmark after categorization
+- **Skip Functionality**: Allows skipping bookmarks without categorization
+
+### ✅ Bookmark Management
+- **Complete Data**: Stores tweet text, author info, media, hashtags, and URLs
+- **Search & Filter**: Filter bookmarks by categories and search text
+- **Remove Bookmarks**: API endpoint to remove bookmarks from Twitter
+- **Export Options**: Integration with Obsidian vault for note storage
+
+## Next Steps & Roadmap
+
+### 🔧 Immediate Improvements
+1. **Database Integration**: Replace JSON storage with SQLite/PostgreSQL for persistence
+2. **Category Management**: Add UI for managing custom categories
+3. **Search/Filter**: Implement advanced filtering by date, author, categories
+4. **Bulk Operations**: Allow bulk categorization of similar bookmarks
+
+### 🚀 Advanced Features
+1. **Smart Suggestions**: Improve LLM prompts based on user feedback
+2. **Content Analysis**: Extract and analyze linked article content
+3. **Calendar Integration**: Create events from bookmarks with dates
+4. **RSS Export**: Generate RSS feeds for categorized bookmarks
+5. **Browser Extension**: Chrome/Firefox extension for bookmarking directly
+
+### 🎯 User Experience
+1. **Bookmark Queues**: Create queues for different processing priorities
+2. **Keyboard Shortcuts**: Add keyboard navigation for faster categorization
+3. **Dark Mode**: Implement dark theme support
+4. **Mobile Responsive**: Optimize for mobile devices
+
+### 📝 Data Persistence
+1. **Save Configuration**: Store user category preferences
+2. **Processing State**: Save categorization progress across sessions
+3. **Export Formats**: Add CSV, JSON, and Markdown export options
+4. **Backup/Restore**: Implement bookmark backup and restore functionality
+
+## Environment Variables Required
+```bash
+# Required
+GEMINI_API_KEY=your_google_gemini_api_key
+TWITTER_COKI=your_twitter_authentication_cookie
+
+# Optional
+LLM_BASE_URL=custom_llm_endpoint_if_needed
+LLM_API_KEY=custom_llm_api_key_if_needed
+```
+
+## Development Tips
+- **Test Data**: Use `testData.json` for development without Twitter API calls
+- **Cookie Refresh**: Twitter cookies expire frequently - refresh as needed
+- **Rate Limits**: Be mindful of Twitter API rate limits during development
+- **LLM Costs**: Monitor Gemini API usage during development
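
The environment variables listed in this change could be resolved at startup with a small helper along the following lines. This is only a sketch, not code from the repository: `loadConfig`, `SormarkConfig`, and `Env` are hypothetical names, and the precedence of `TWITTER_COKI` over `TWATTER_COKI` is an assumption, since the diff only describes the latter as an alternative name.

```typescript
// Hypothetical helper for illustration; not part of the repository in this diff.
type Env = Record<string, string | undefined>;

interface SormarkConfig {
  geminiApiKey: string;
  twitterCookie: string;
  llmBaseUrl?: string; // optional custom LLM endpoint
  llmApiKey?: string; // optional custom LLM API key
}

function loadConfig(env: Env): SormarkConfig {
  const geminiApiKey = env.GEMINI_API_KEY;
  if (!geminiApiKey) {
    throw new Error("GEMINI_API_KEY is not set");
  }

  // Assumption: TWITTER_COKI wins when both cookie variables are present.
  const twitterCookie = env.TWITTER_COKI || env.TWATTER_COKI;
  if (!twitterCookie) {
    throw new Error("Neither TWITTER_COKI nor TWATTER_COKI is set");
  }

  return {
    geminiApiKey,
    twitterCookie,
    // Empty strings are treated the same as unset values.
    llmBaseUrl: env.LLM_BASE_URL || undefined,
    llmApiKey: env.LLM_API_KEY || undefined,
  };
}

// Typical call site (in Node.js or Bun): const config = loadConfig(process.env);
```

Taking the environment as a parameter instead of reading `process.env` directly keeps the helper testable without touching the real environment, and failing fast on a missing required variable surfaces configuration problems before any Twitter or Gemini request is made.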