ClaudeFilePrep
Automate file collection for Claude AI projects with ease.
Pitch

ClaudeFilePrep is a Python utility that automates collecting files from nested directories. Designed for Claude AI projects, it recursively gathers the relevant files so you can spend more time on analysis and less on manual uploads.

Description

ClaudeFilePrep walks a directory tree, copies the files it finds into a single output directory, and skips the folders and file patterns you tell it to ignore. The result is one location from which a whole project can be uploaded to Claude in bulk.

🎯 Problem It Solves

Managing multiple files scattered across various folders can be cumbersome when working with AI tools like Claude. Manually navigating through complex directory structures and uploading files one by one is not only time-consuming but also increases the risk of errors. ClaudeFilePrep automates this tedious process by:

  1. Recursively discovering all relevant files across subdirectories.
  2. Organizing these files in a single location, preserving folder structures or flattening them as needed.
  3. Enabling bulk uploads of multiple files to Claude or similar platforms, saving time and effort.
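
The first step, recursive discovery, can be sketched with the standard library. This is a minimal illustration and not the tool's actual code: `discover_files` is a hypothetical name, and the real implementation may walk the tree differently.

```python
import os
from pathlib import Path


def discover_files(root):
    """Return every file under root as a path relative to root, sorted.

    Hypothetical helper for illustration; not part of file_collector.
    """
    root = Path(root)
    found = []
    # os.walk yields root and then each subdirectory beneath it.
    for current_dir, _subdirs, filenames in os.walk(root):
        for name in filenames:
            found.append((Path(current_dir) / name).relative_to(root))
    return sorted(found)
```

Returning paths relative to the root keeps enough information to either rebuild the folder structure or derive a flattened file name later.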

🚀 Features

  • Recursive File Collection: Effortlessly navigates through subdirectories to find files.
  • Selective Ignoring: Customize which folders and file patterns to skip during the collection process.
  • Two Organization Modes:
    • Hierarchical: Keeps the original folder structure intact.
    • Flattened: Consolidates all files into one directory, renaming them to include path information.
  • Metadata Preservation: Retains essential file metadata, such as timestamps and permissions.
  • Error Handling: Provides detailed reporting to track successful operations and any failures.
  • Customizable Separators: Choose how path components are joined in flattened mode for clearer organization.
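
To make the metadata and error-handling features concrete, here is a rough sketch of a copy loop that keeps timestamps and permission bits and reports failures instead of stopping. `copy_with_report` and the shapes of the two returned lists are assumptions made for this example; the source only shows that `copy_files` returns a `copied, failed` pair.

```python
import shutil
from pathlib import Path


def copy_with_report(pairs):
    """Copy each (source, destination) pair and report the outcome.

    Hypothetical helper for illustration. Returns (copied, failed):
    copied is a list of destination paths, failed is a list of
    (source path, error message) tuples.
    """
    copied, failed = [], []
    for source, destination in pairs:
        try:
            destination = Path(destination)
            destination.parent.mkdir(parents=True, exist_ok=True)
            # copy2 copies the contents plus timestamps and permission bits.
            shutil.copy2(source, destination)
            copied.append(destination)
        except OSError as error:
            # Record the failure and keep going with the remaining files.
            failed.append((Path(source), str(error)))
    return copied, failed
```

Collecting failures in a list rather than raising on the first one lets a long run finish and still tell you exactly which files need attention.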

🎮 Usage

Basic Usage

To copy all files while preserving the folder structure, simply run:

python file_collector.py source_directory

Flatten Directory Structure

To consolidate files into a single directory, pass the --flatten option:

python file_collector.py source_directory --flatten

Custom Output Directory

To specify a custom output directory, use:

python file_collector.py source_directory --output-dir my_output

Custom Separator for Flattened Mode

Customize the separator with:

python file_collector.py source_directory --flatten --separator=-
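
The effect of the separator can be illustrated with a small sketch. `flatten_name` is a hypothetical helper, not a function exported by file_collector, and the underscore default is inferred from the flattened example output in this document rather than confirmed from the tool's code.

```python
from pathlib import PurePosixPath


def flatten_name(relative_path, separator="_"):
    """Join the components of a relative path into one file name.

    Hypothetical helper for illustration; not part of file_collector.
    """
    return separator.join(PurePosixPath(relative_path).parts)


print(flatten_name("data/processed/cleaned_data.csv"))
# → data_processed_cleaned_data.csv
print(flatten_name("data/processed/cleaned_data.csv", separator="-"))
# → data-processed-cleaned_data.csv
print(flatten_name("README.md"))
# → README.md
```

In this sketch, a separator that also appears inside file or folder names can make two different paths map to the same flattened name (for example `a_b/c.txt` and `a/b_c.txt` with an underscore), which is one reason to pick a separator your project does not otherwise use.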

Python API Access

The tool can also be integrated as a Python module:

from file_collector import copy_files

# Preserve directory structure
copied, failed = copy_files("source_directory")

# Flatten directory structure
copied, failed = copy_files("source_directory", flatten=True)

# Custom configuration
copied, failed = copy_files(
    root_dir="source_directory",
    output_dir="custom_output",
    ignore_folders={'.git', 'node_modules'},
    ignore_patterns={'*.pyc', '*.log'},
    flatten=True,
    separator="-"
)

🎯 Use Cases

  1. AI Project Data Preparation: Quickly gather and organize all files necessary for uploading to Claude and streamline training data collection.
  2. Project Organization: Simplify the consolidation of files from complicated directory structures into flat archives.
  3. Backup and Migration: Efficiently collect specific file types for organized backups or seamless transfers between systems.

⚙️ Configuration

Ignored Folders (Default)

  • .git
  • node_modules
  • __pycache__

Ignored File Patterns (Default)

  • *.pyc
  • *.log
  • .DS_Store
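
One plausible way to apply these defaults is shown below: skip a file if any of its parent folders is in the ignore list, or if its name matches an ignore pattern. `is_ignored` and the two constant names are hypothetical, and the use of `fnmatch` for glob-style matching is an assumption; the tool itself may match patterns differently.

```python
from fnmatch import fnmatch
from pathlib import PurePosixPath

# Defaults copied from the lists above; the constant names are hypothetical.
DEFAULT_IGNORE_FOLDERS = frozenset({".git", "node_modules", "__pycache__"})
DEFAULT_IGNORE_PATTERNS = frozenset({"*.pyc", "*.log", ".DS_Store"})


def is_ignored(relative_path,
               ignore_folders=DEFAULT_IGNORE_FOLDERS,
               ignore_patterns=DEFAULT_IGNORE_PATTERNS):
    """Return True if the file at relative_path should be skipped."""
    path = PurePosixPath(relative_path)
    # Any parent directory named in ignore_folders excludes the file.
    if any(part in ignore_folders for part in path.parts[:-1]):
        return True
    # Otherwise compare the file name against each glob-style pattern.
    return any(fnmatch(path.name, pattern) for pattern in ignore_patterns)
```

Under these assumptions, `is_ignored("src/__pycache__/utils.cpython-39.pyc")` is true and `is_ignored("src/utils.py")` is false, which matches the example output further down where the `__pycache__` folder is absent from `result/`.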

📝 Example Output

Normal Mode Output

Given a project structure such as:

my_project/
├── data/
│   ├── raw_data.csv
│   └── processed/
│       ├── cleaned_data.csv
│       └── feature_data.csv
├── notebooks/
│   ├── analysis.ipynb
│   └── visualization.ipynb
├── src/
│   ├── __pycache__/
│   │   └── utils.cpython-39.pyc
│   ├── utils.py
│   └── main.py
└── README.md

Running:

python file_collector.py my_project

Will generate:

result/
├── data/
│   ├── raw_data.csv
│   └── processed/
│       ├── cleaned_data.csv
│       └── feature_data.csv
├── notebooks/
│   ├── analysis.ipynb
│   └── visualization.ipynb
├── src/
│   ├── utils.py
│   └── main.py
└── README.md

Flattened Mode Output

Alternatively, running:

python file_collector.py my_project --flatten

Produces:

result/
├── data_raw_data.csv
├── data_processed_cleaned_data.csv
├── data_processed_feature_data.csv
├── notebooks_analysis.ipynb
├── notebooks_visualization.ipynb
├── src_utils.py
├── src_main.py
└── README.md

🤝 Contributing

Contributions are welcome! You can help by adding features, improving documentation, reporting bugs, or suggesting improvements.

🙏 Acknowledgments

ClaudeFilePrep was inspired by the need to streamline file preparation for AI tools, with the aim of making data scientists' work easier and more efficient.