# Data Flow

This document describes how data flows through the DeepLint system, focusing on the current implementation.

{% hint style="info" %}
This is part of the Architecture documentation series:

1. [Architecture Overview](https://docs.deeplint.com/developer-guide/broken-reference) - High-level system architecture and design principles
2. [Components](https://docs.deeplint.com/developer-guide/broken-reference) - Detailed component descriptions and relationships
3. **Data Flow** (current page) - How data flows through the system
   {% endhint %}

## Overview

DeepLint's data flow centers on context building and LLM-powered analysis: the system gathers information from the Git repository and codebase, sends it to an LLM, and displays the results.

## Detailed Data Flow

### 1. Command Line Input

The data flow begins with the user running the DeepLint CLI:

```bash
deeplint [command] [options]
```

The CLI parses the command-line arguments using yargs and identifies the appropriate command to execute.

**Data**: Command name, options, and arguments
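To make the shape of this data concrete, here is a minimal hand-rolled sketch of the parsing step. DeepLint actually uses yargs; this simplified version only illustrates what the parsed result looks like, and the option names are hypothetical.

```typescript
// Shape of the data produced by argument parsing (illustrative).
interface ParsedArgs {
  command: string;
  options: Record<string, string | boolean>;
}

// Simplified stand-in for yargs: first non-flag token is the command;
// "--key=value" and "--flag" tokens become options.
function parseArgs(argv: string[]): ParsedArgs {
  const options: Record<string, string | boolean> = {};
  let command = "default";
  for (const token of argv) {
    if (token.startsWith("--")) {
      const [key, value] = token.slice(2).split("=");
      options[key] = value ?? true;
    } else {
      command = token;
    }
  }
  return { command, options };
}
```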

### 2. Command Execution

The command registry identifies the command class and creates an instance. The command's `execute()` method is called with the parsed arguments.

**Data**: Parsed command-line arguments
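The registry lookup can be sketched as a map from command names to factories. The class and method names here are assumptions for illustration; DeepLint's actual registry API may differ.

```typescript
// Hypothetical command abstraction: each command exposes execute().
interface Command {
  execute(args: Record<string, unknown>): string;
}

class CommandRegistry {
  private commands = new Map<string, () => Command>();

  register(name: string, factory: () => Command): void {
    this.commands.set(name, factory);
  }

  run(name: string, args: Record<string, unknown>): string {
    const factory = this.commands.get(name);
    if (!factory) throw new Error(`Unknown command: ${name}`);
    // Instantiate the command class, then call execute() with parsed args.
    return factory().execute(args);
  }
}
```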

### 3. Configuration Loading

The command loads configuration from:

* Default values
* Configuration file (deeplint.config.js or .deeplintrc.yml)
* Environment variables
* Command-line options

**Data**: Configuration object with typed properties
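The list above is a precedence order: later sources override earlier ones. A minimal sketch of that merge, with an illustrative (not actual) option set:

```typescript
// Illustrative config shape; DeepLint's real options may differ.
interface Config {
  model?: string;
  maxTokens?: number;
  debug?: boolean;
}

// Object spread applies precedence: CLI options win over environment
// variables, which win over the config file, which wins over defaults.
function loadConfig(
  defaults: Config,
  fileConfig: Config,
  envConfig: Config,
  cliConfig: Config
): Config {
  return { ...defaults, ...fileConfig, ...envConfig, ...cliConfig };
}
```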

### 4. Git Integration

The context builder uses Git operations to:

1. Check if there are staged changes
2. Get the diff for staged changes
3. Parse the diff into a structured format

**Data**:

* Staged files list
* Diff information (file paths, additions, deletions, changes)
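The diff-parsing step can be sketched as follows. This is a deliberately minimal unified-diff parser that only counts additions and deletions per file; a real implementation also handles renames, binary files, and hunk headers.

```typescript
// Per-file diff summary (illustrative shape).
interface FileDiff {
  path: string;
  additions: number;
  deletions: number;
}

function parseDiff(diff: string): FileDiff[] {
  const files: FileDiff[] = [];
  let current: FileDiff | null = null;
  for (const line of diff.split("\n")) {
    if (line.startsWith("+++ b/")) {
      // New file section: "+++ b/<path>" names the post-change file.
      current = { path: line.slice(6), additions: 0, deletions: 0 };
      files.push(current);
    } else if (current && line.startsWith("+") && !line.startsWith("+++")) {
      current.additions++;
    } else if (current && line.startsWith("-") && !line.startsWith("---")) {
      current.deletions++;
    }
  }
  return files;
}
```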

### 5. Repository Indexing

The repository indexing process involves several steps:

#### 5.1 File System Scanning

The file system scanner traverses the repository to:

1. Build a directory structure
2. Collect file metadata
3. Categorize files by type

**Data**:

* Repository structure (directories, files)
* File metadata (path, size, type, last modified)
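The categorization step can be sketched as an extension-based classifier. The category names here are illustrative, not DeepLint's actual taxonomy:

```typescript
type FileCategory = "source" | "test" | "config" | "other";

// Order matters: test files also match the source-extension pattern,
// so they are checked first.
function categorize(path: string): FileCategory {
  if (/\.(test|spec)\.(ts|js)$/.test(path)) return "test";
  if (/\.(ts|js|tsx|jsx)$/.test(path)) return "source";
  if (/\.(json|ya?ml|toml)$/.test(path)) return "config";
  return "other";
}
```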

#### 5.2 Dependency Analysis

The dependency analyzer examines files to:

1. Parse import/export statements
2. Build a dependency graph
3. Identify direct dependencies of changed files

**Data**:

* Dependency graph
* Direct dependencies of changed files
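The first two steps above can be sketched with a regex-based import extractor. A production implementation would use a real parser (for example, the TypeScript compiler API) rather than regexes, which miss dynamic imports and comments:

```typescript
// Extract module specifiers from static import statements.
function extractImports(source: string): string[] {
  const imports: string[] = [];
  const pattern = /import\s+(?:[\w*{},\s]+\s+from\s+)?["']([^"']+)["']/g;
  let match: RegExpExecArray | null;
  while ((match = pattern.exec(source)) !== null) {
    imports.push(match[1]);
  }
  return imports;
}

// Build the dependency graph: map each file to the modules it imports.
function buildGraph(files: Record<string, string>): Record<string, string[]> {
  const graph: Record<string, string[]> = {};
  for (const [path, source] of Object.entries(files)) {
    graph[path] = extractImports(source);
  }
  return graph;
}
```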

#### 5.3 Code Structure Analysis

The code structure analyzer extracts:

1. Functions and their signatures
2. Classes and their methods
3. Interfaces and types
4. Export information

**Data**:

* Code structure information (functions, classes, interfaces, types)
* Export information

### 6. Context Assembly

The context builder assembles all the gathered information into a structured context:

```typescript
interface LLMContext {
  repository: {
    name: string;
    root: string;
    structure: ContextRepositoryStructure;
  };
  changes: {
    files: ContextChange[];
    summary: string;
  };
  relatedFiles: ContextFile[];
  metadata: {
    contextSize: {
      totalTokens: number;
      changesTokens: number;
      relatedFilesTokens: number;
      structureTokens: number;
    };
    generatedAt: string;
    contextType: "light" | "deep";
    error?: {
      message: string;
      timestamp: string;
      phase?: string;
    };
  };
}
```

**Data**: Assembled context object

### 7. Token Management

The token counter ensures the context fits within LLM token limits:

1. Count tokens for each part of the context
2. Truncate file content if necessary
3. Track total token usage

**Data**:

* Token counts
* Truncated file content
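The truncation logic can be sketched as a budget check. The chars/4 heuristic below is a common rough approximation; DeepLint's actual counter may use a real tokenizer, and the truncation marker is illustrative.

```typescript
// Rough token estimate: ~4 characters per token.
function countTokens(text: string): number {
  return Math.ceil(text.length / 4);
}

// Cut content down to the token budget and mark the cut point.
function truncateToBudget(content: string, maxTokens: number): string {
  if (countTokens(content) <= maxTokens) return content;
  const marker = "\n// [truncated]";
  const keep = maxTokens * 4 - marker.length;
  return content.slice(0, Math.max(0, keep)) + marker;
}
```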

### 8. Result Handling

The context builder returns the assembled context and statistics:

```typescript
interface ContextBuildResult {
  context: LLMContext;
  stats: {
    totalFiles: number;
    changedFiles: number;
    relatedFiles: number;
    totalTokens: number;
    buildTime: number;
    error?: string;
  };
}
```

**Data**: Context build result

### 9. Output Formatting

The command formats the results for display:

1. Summary of context building
2. Statistics about the context
3. Debug information if requested

**Data**: Formatted output

## Data Flow for Specific Commands

### Default Command

The default command follows the full data flow described above:

1. Check for staged changes
2. Build context
3. Display results or dump context to a file

### Init Command

The init command has a simpler data flow:

1. Check if configuration file exists
2. Create configuration file with default values
3. Display success message

## Error Flow

When errors occur, each component applies a consistent error-handling flow that:

1. Catches exceptions
2. Creates typed errors with context
3. Logs errors at the appropriate level
4. Returns error information in the result

**Data**:

* Error type
* Error message
* Error context
* Stack trace (in debug mode)
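The four-step flow above can be sketched with a typed error that carries context. The class name, fields, and helper below are assumptions for illustration, not DeepLint's actual API:

```typescript
// Hypothetical typed error: records which phase failed and extra context.
class ContextBuildError extends Error {
  constructor(
    message: string,
    public readonly phase: string,
    public readonly context: Record<string, unknown> = {}
  ) {
    super(message);
    this.name = "ContextBuildError";
  }
}

// Catch exceptions from a phase and return error info in the result
// instead of letting the exception propagate.
function safeStep<T>(
  phase: string,
  fn: () => T
): { result?: T; error?: ContextBuildError } {
  try {
    return { result: fn() };
  } catch (err) {
    const message = err instanceof Error ? err.message : String(err);
    return { error: new ContextBuildError(message, phase) };
  }
}
```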

## Related Resources

* [Architecture Overview](https://docs.deeplint.com/developer-guide/broken-reference) - High-level architecture and design principles
* [Components](https://docs.deeplint.com/developer-guide/broken-reference) - Detailed component descriptions and relationships
* [Context Builder](https://docs.deeplint.com/developer-guide/overview-1) - Detailed documentation of the Context Builder component

{% hint style="info" %}
The current implementation includes LLM-powered analysis. Auto-fix generation is planned for future implementation.
{% endhint %}
