# Data Flow

This document describes how data flows through the DeepLint system, focusing on the current implementation.

{% hint style="info" %}
This is part of the Architecture documentation series:

1. [Architecture Overview](https://docs.deeplint.com/developer-guide/broken-reference) - High-level system architecture and design principles
2. [Components](https://docs.deeplint.com/developer-guide/broken-reference) - Detailed component descriptions and relationships
3. **Data Flow** (current page) - How data flows through the system
   {% endhint %}

## Overview

DeepLint's data flow centers on context building and LLM-powered analysis: the system gathers information from the Git repository and codebase, sends it to an LLM, and displays the results.

## Detailed Data Flow

### 1. Command Line Input

The data flow begins with the user running the DeepLint CLI:

```bash
deeplint [command] [options]
```

The CLI parses the command-line arguments using yargs and identifies the appropriate command to execute.

**Data**: Command name, options, and arguments
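To make the shape of this data concrete, here is a minimal hand-rolled sketch of the parsing step. DeepLint actually uses yargs; this simplified version only illustrates what the parsed result looks like, and the option names are hypothetical.

```typescript
// Shape of the data produced by argument parsing (illustrative).
interface ParsedArgs {
  command: string;
  options: Record<string, string | boolean>;
}

// Simplified stand-in for yargs: first non-flag token is the command;
// "--key=value" and "--flag" tokens become options.
function parseArgs(argv: string[]): ParsedArgs {
  const options: Record<string, string | boolean> = {};
  let command = "default";
  for (const token of argv) {
    if (token.startsWith("--")) {
      const [key, value] = token.slice(2).split("=");
      options[key] = value ?? true;
    } else {
      command = token;
    }
  }
  return { command, options };
}
```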

### 2. Command Execution

The command registry identifies the command class and creates an instance. The command's `execute()` method is called with the parsed arguments.

**Data**: Parsed command-line arguments
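The registry lookup can be sketched as a map from command names to factories. The class and method names here are assumptions for illustration; DeepLint's actual registry API may differ.

```typescript
// Hypothetical command abstraction: each command exposes execute().
interface Command {
  execute(args: Record<string, unknown>): string;
}

class CommandRegistry {
  private commands = new Map<string, () => Command>();

  register(name: string, factory: () => Command): void {
    this.commands.set(name, factory);
  }

  run(name: string, args: Record<string, unknown>): string {
    const factory = this.commands.get(name);
    if (!factory) throw new Error(`Unknown command: ${name}`);
    // Instantiate the command class, then call execute() with parsed args.
    return factory().execute(args);
  }
}
```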

### 3. Configuration Loading

The command loads configuration from:

* Default values
* Configuration file (deeplint.config.js or .deeplintrc.yml)
* Environment variables
* Command-line options

**Data**: Configuration object with typed properties
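The list above is a precedence order: later sources override earlier ones. A minimal sketch of that merge, with an illustrative (not actual) option set:

```typescript
// Illustrative config shape; DeepLint's real options may differ.
interface Config {
  model?: string;
  maxTokens?: number;
  debug?: boolean;
}

// Object spread applies precedence: CLI options win over environment
// variables, which win over the config file, which wins over defaults.
function loadConfig(
  defaults: Config,
  fileConfig: Config,
  envConfig: Config,
  cliConfig: Config
): Config {
  return { ...defaults, ...fileConfig, ...envConfig, ...cliConfig };
}
```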

### 4. Git Integration

The context builder uses Git operations to:

1. Check if there are staged changes
2. Get the diff for staged changes
3. Parse the diff into a structured format

**Data**:

* Staged files list
* Diff information (file paths, additions, deletions, changes)
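The diff-parsing step can be sketched as follows. This is a deliberately minimal unified-diff parser that only counts additions and deletions per file; a real implementation also handles renames, binary files, and hunk headers.

```typescript
// Per-file diff summary (illustrative shape).
interface FileDiff {
  path: string;
  additions: number;
  deletions: number;
}

function parseDiff(diff: string): FileDiff[] {
  const files: FileDiff[] = [];
  let current: FileDiff | null = null;
  for (const line of diff.split("\n")) {
    if (line.startsWith("+++ b/")) {
      // New file section: "+++ b/<path>" names the post-change file.
      current = { path: line.slice(6), additions: 0, deletions: 0 };
      files.push(current);
    } else if (current && line.startsWith("+") && !line.startsWith("+++")) {
      current.additions++;
    } else if (current && line.startsWith("-") && !line.startsWith("---")) {
      current.deletions++;
    }
  }
  return files;
}
```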

### 5. Repository Indexing

The repository indexing process involves several steps:

#### 5.1 File System Scanning

The file system scanner traverses the repository to:

1. Build a directory structure
2. Collect file metadata
3. Categorize files by type

**Data**:

* Repository structure (directories, files)
* File metadata (path, size, type, last modified)
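The categorization step can be sketched as an extension-based classifier. The category names here are illustrative, not DeepLint's actual taxonomy:

```typescript
type FileCategory = "source" | "test" | "config" | "other";

// Order matters: test files also match the source-extension pattern,
// so they are checked first.
function categorize(path: string): FileCategory {
  if (/\.(test|spec)\.(ts|js)$/.test(path)) return "test";
  if (/\.(ts|js|tsx|jsx)$/.test(path)) return "source";
  if (/\.(json|ya?ml|toml)$/.test(path)) return "config";
  return "other";
}
```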

#### 5.2 Dependency Analysis

The dependency analyzer examines files to:

1. Parse import/export statements
2. Build a dependency graph
3. Identify direct dependencies of changed files

**Data**:

* Dependency graph
* Direct dependencies of changed files
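The first two steps above can be sketched with a regex-based import extractor. A production implementation would use a real parser (for example, the TypeScript compiler API) rather than regexes, which miss dynamic imports and comments:

```typescript
// Extract module specifiers from static import statements.
function extractImports(source: string): string[] {
  const imports: string[] = [];
  const pattern = /import\s+(?:[\w*{},\s]+\s+from\s+)?["']([^"']+)["']/g;
  let match: RegExpExecArray | null;
  while ((match = pattern.exec(source)) !== null) {
    imports.push(match[1]);
  }
  return imports;
}

// Build the dependency graph: map each file to the modules it imports.
function buildGraph(files: Record<string, string>): Record<string, string[]> {
  const graph: Record<string, string[]> = {};
  for (const [path, source] of Object.entries(files)) {
    graph[path] = extractImports(source);
  }
  return graph;
}
```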

#### 5.3 Code Structure Analysis

The code structure analyzer extracts:

1. Functions and their signatures
2. Classes and their methods
3. Interfaces and types
4. Export information

**Data**:

* Code structure information (functions, classes, interfaces, types)
* Export information

### 6. Context Assembly

The context builder assembles all the gathered information into a structured context:

```typescript
interface LLMContext {
  repository: {
    name: string;
    root: string;
    structure: ContextRepositoryStructure;
  };
  changes: {
    files: ContextChange[];
    summary: string;
  };
  relatedFiles: ContextFile[];
  metadata: {
    contextSize: {
      totalTokens: number;
      changesTokens: number;
      relatedFilesTokens: number;
      structureTokens: number;
    };
    generatedAt: string;
    contextType: "light" | "deep";
    error?: {
      message: string;
      timestamp: string;
      phase?: string;
    };
  };
}
```

**Data**: Assembled context object

### 7. Token Management

The token counter ensures the context fits within LLM token limits:

1. Count tokens for each part of the context
2. Truncate file content if necessary
3. Track total token usage

**Data**:

* Token counts
* Truncated file content
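The truncation logic can be sketched as a budget check. The chars/4 heuristic below is a common rough approximation; DeepLint's actual counter may use a real tokenizer, and the truncation marker is illustrative.

```typescript
// Rough token estimate: ~4 characters per token.
function countTokens(text: string): number {
  return Math.ceil(text.length / 4);
}

// Cut content down to the token budget and mark the cut point.
function truncateToBudget(content: string, maxTokens: number): string {
  if (countTokens(content) <= maxTokens) return content;
  const marker = "\n// [truncated]";
  const keep = maxTokens * 4 - marker.length;
  return content.slice(0, Math.max(0, keep)) + marker;
}
```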

### 8. Result Handling

The context builder returns the assembled context and statistics:

```typescript
interface ContextBuildResult {
  context: LLMContext;
  stats: {
    totalFiles: number;
    changedFiles: number;
    relatedFiles: number;
    totalTokens: number;
    buildTime: number;
    error?: string;
  };
}
```

**Data**: Context build result

### 9. Output Formatting

The command formats the results for display:

1. Summary of context building
2. Statistics about the context
3. Debug information if requested

**Data**: Formatted output

## Data Flow for Specific Commands

### Default Command

The default command follows the full data flow described above:

1. Check for staged changes
2. Build context
3. Display results or dump context to a file

### Init Command

The init command has a simpler data flow:

1. Check if configuration file exists
2. Create configuration file with default values
3. Display success message

## Error Flow

When errors occur, each component applies a consistent error-handling flow that:

1. Catches exceptions
2. Creates typed errors with context
3. Logs errors at the appropriate level
4. Returns error information in the result

**Data**:

* Error type
* Error message
* Error context
* Stack trace (in debug mode)
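The four-step flow above can be sketched with a typed error that carries context. The class name, fields, and helper below are assumptions for illustration, not DeepLint's actual API:

```typescript
// Hypothetical typed error: records which phase failed and extra context.
class ContextBuildError extends Error {
  constructor(
    message: string,
    public readonly phase: string,
    public readonly context: Record<string, unknown> = {}
  ) {
    super(message);
    this.name = "ContextBuildError";
  }
}

// Catch exceptions from a phase and return error info in the result
// instead of letting the exception propagate.
function safeStep<T>(
  phase: string,
  fn: () => T
): { result?: T; error?: ContextBuildError } {
  try {
    return { result: fn() };
  } catch (err) {
    const message = err instanceof Error ? err.message : String(err);
    return { error: new ContextBuildError(message, phase) };
  }
}
```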

## Related Resources

* [Architecture Overview](https://docs.deeplint.com/developer-guide/broken-reference) - High-level architecture and design principles
* [Components](https://docs.deeplint.com/developer-guide/broken-reference) - Detailed component descriptions and relationships
* [Context Builder](https://docs.deeplint.com/developer-guide/overview-1) - Detailed documentation of the Context Builder component

{% hint style="info" %}
The current implementation includes LLM-powered analysis. Auto-fix generation is planned for future implementation.
{% endhint %}
