ClaudeFilePrep is a Python utility that automates the collection of files from nested directories. Designed for Claude AI projects, it recursively gathers the relevant files into one place, so you can upload them in bulk and spend your time on analysis rather than on manual uploads.
Problem It Solves
Managing multiple files scattered across various folders can be cumbersome when working with AI tools like Claude. Manually navigating through complex directory structures and uploading files one by one is not only time-consuming but also increases the risk of errors. ClaudeFilePrep automates this tedious process by:
- Recursively discovering all relevant files across subdirectories.
- Organizing these files in a single location, preserving folder structures or flattening them as needed.
- Enabling bulk uploads of multiple files to Claude or similar platforms, saving time and effort.
Features
- Recursive File Collection: Effortlessly navigates through subdirectories to find files.
- Selective Ignoring: Customize which folders and file patterns to skip during the collection process.
- Two Organization Modes:
  - Hierarchical: Keeps the original folder structure intact.
  - Flattened: Consolidates all files into one directory, renaming them to include path information.
- Metadata Preservation: Retains essential file metadata, such as timestamps and permissions.
- Error Handling: Provides detailed reporting to track successful operations and any failures.
- Customizable Separators: Choose how path components are joined in flattened mode for clearer organization.
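The metadata-preservation and error-reporting features can be illustrated with the standard library alone. The sketch below is not the tool's actual implementation: `copy_with_report` is a hypothetical helper, and it assumes that copying is done with `shutil.copy2`, which copies a file's contents together with its timestamps and permission bits.

```python
import shutil
from pathlib import Path


def copy_with_report(pairs):
    """Copy (source, destination) pairs and return (copied, failed) lists.

    Hypothetical helper for illustration; not part of file_collector.
    """
    copied, failed = [], []
    for src, dst in pairs:
        src, dst = Path(src), Path(dst)
        try:
            # Create the destination folder if it does not exist yet.
            dst.parent.mkdir(parents=True, exist_ok=True)
            # copy2 copies the data plus timestamps and permission bits.
            shutil.copy2(src, dst)
            copied.append(dst)
        except OSError as exc:
            # Record the failure and continue with the remaining files.
            failed.append((src, str(exc)))
    return copied, failed
```

Collecting failures in a list rather than stopping at the first error means that one unreadable file does not abort the whole run.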
Usage
Basic Usage
To copy all files while preserving the folder structure, simply run:
python file_collector.py source_directory
Flatten Directory Structure
To consolidate files into a single directory, use the --flatten option:
python file_collector.py source_directory --flatten
Custom Output Directory
To specify a custom output directory, use:
python file_collector.py source_directory --output-dir my_output
Custom Separator for Flattened Mode
Customize the separator with:
python file_collector.py source_directory --flatten --separator=-
Python API Access
The tool can also be integrated as a Python module:
from file_collector import copy_files

# Preserve directory structure
copied, failed = copy_files("source_directory")

# Flatten directory structure
copied, failed = copy_files("source_directory", flatten=True)

# Custom configuration
copied, failed = copy_files(
    root_dir="source_directory",
    output_dir="custom_output",
    ignore_folders={'.git', 'node_modules'},
    ignore_patterns={'*.pyc', '*.log'},
    flatten=True,
    separator="-",
)
Use Cases
- AI Project Data Preparation: Quickly gather and organize all the files you need to upload to Claude, without collecting them by hand.
- Project Organization: Simplify the consolidation of files from complicated directory structures into flat archives.
- Backup and Migration: Efficiently collect specific file types for organized backups or seamless transfers between systems.
Configuration
Ignored Folders (Default)
.git
node_modules
__pycache__
Ignored File Patterns (Default)
*.pyc
*.log
.DS_Store
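To show how defaults like these can be applied during a recursive scan, here is a minimal sketch built on `os.walk` and `fnmatch`. The function name `iter_files` is hypothetical, and the sketch assumes that patterns are matched against file names only, not full paths; the tool's own matching rules may differ.

```python
import fnmatch
import os

# The default values listed above.
IGNORE_FOLDERS = {".git", "node_modules", "__pycache__"}
IGNORE_PATTERNS = {"*.pyc", "*.log", ".DS_Store"}


def iter_files(root_dir, ignore_folders=IGNORE_FOLDERS,
               ignore_patterns=IGNORE_PATTERNS):
    """Yield paths, relative to root_dir, of files that are not ignored.

    Hypothetical helper for illustration; not part of file_collector.
    """
    for current, dirnames, filenames in os.walk(root_dir):
        # Prune ignored folders in place so os.walk never descends into them.
        dirnames[:] = sorted(d for d in dirnames if d not in ignore_folders)
        for name in sorted(filenames):
            # Skip files whose name matches any ignore pattern.
            if any(fnmatch.fnmatch(name, p) for p in ignore_patterns):
                continue
            yield os.path.relpath(os.path.join(current, name), root_dir)
```

Pruning `dirnames` in place avoids walking large ignored trees such as `node_modules` at all, rather than filtering their contents afterwards.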
Example Output
Normal Mode Output
Given a project structure such as:
my_project/
├── data/
│   ├── raw_data.csv
│   └── processed/
│       ├── cleaned_data.csv
│       └── feature_data.csv
├── notebooks/
│   ├── analysis.ipynb
│   └── visualization.ipynb
├── src/
│   ├── __pycache__/
│   │   └── utils.cpython-39.pyc
│   ├── utils.py
│   └── main.py
└── README.md
Running:
python file_collector.py my_project
Will generate:
result/
├── data/
│   ├── raw_data.csv
│   └── processed/
│       ├── cleaned_data.csv
│       └── feature_data.csv
├── notebooks/
│   ├── analysis.ipynb
│   └── visualization.ipynb
├── src/
│   ├── utils.py
│   └── main.py
└── README.md
Flattened Mode Output
Alternatively, running:
python file_collector.py my_project --flatten
Produces:
result/
├── data_raw_data.csv
├── data_processed_cleaned_data.csv
├── data_processed_feature_data.csv
├── notebooks_analysis.ipynb
├── notebooks_visualization.ipynb
├── src_utils.py
├── src_main.py
└── README.md
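The flattened names in this listing are consistent with joining each file's relative path components using the separator. The sketch below reproduces that naming under this assumption; `flatten_name` is a hypothetical helper, not a function exported by the tool.

```python
from pathlib import PurePosixPath


def flatten_name(relative_path, separator="_"):
    """Join the components of a relative path into a single file name.

    Hypothetical helper for illustration; not part of file_collector.
    """
    return separator.join(PurePosixPath(relative_path).parts)


paths = [
    "data/raw_data.csv",
    "data/processed/cleaned_data.csv",
    "src/utils.py",
    "README.md",
]
for path in paths:
    print(path, "->", flatten_name(path))

# Two different paths can flatten to the same name when the separator
# also occurs inside folder or file names.
same = flatten_name("data/raw_data.csv") == flatten_name("data_raw/data.csv")
print("collision with '_':", same)
```

With this naming scheme, `data/raw_data.csv` and `data_raw/data.csv` would both become `data_raw_data.csv`. Choosing a separator that does not occur in your folder or file names, for example via `--separator`, is one way to avoid such collisions.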
Contributing
Contributions are welcome! Enhance the project by adding new features, improving documentation, reporting bugs, or suggesting improvements.
Acknowledgments
ClaudeFilePrep was inspired by the need to streamline file preparation for AI tools, ultimately aiming to make the lives of data scientists easier and more efficient.