Go Bulk Git Tools
A comprehensive suite of Go command-line tools for bulk Git repository operations with support for GitHub, GitLab, and Gerrit.
Features
- Multi-platform Git hosting support: GitHub, GitLab, and Gerrit
- Cross-provider forking: Fork/clone repositories between different Git hosting providers
- SSH authentication: Full SSH support with automatic key detection and hardware token support
- Intelligent thread pooling: Configurable worker threads with automatic rate limiting detection
- Exponential backoff: Automatic retry with exponential backoff for failed operations
- Archive filtering: Skip archived repositories by default with optional inclusion
- Rich CLI interface: Built with Cobra for comprehensive help and shell completion
- Comprehensive testing: Full test suite with coverage reporting
- Modular design: Reusable components for building additional tools
- Password prompt prevention: Automatic detection and prevention of credential helper prompts
- Gerrit SSH enumeration: SSH-based repository discovery for protected Gerrit instances
SSH Authentication Implementation
Core Features
- Auto-detection of SSH infrastructure:
  - SSH agent socket path (`SSH_AUTH_SOCK`)
  - Common SSH key files (`~/.ssh/id_*`)
  - SSH config file (`~/.ssh/config`)
- Multiple authentication methods:
  - SSH Agent authentication (works with hardware tokens like Secretive)
  - SSH key file authentication with passphrase support
  - Host-specific SSH configurations
- Git SSH wrapper generation:
  - Automatic SSH wrapper script creation
  - Environment variable setup (`GIT_SSH`)
  - Provider-specific SSH options
Provider Integration
- GitHub Provider: SSH authentication with `git@github.com` format, port 22
- GitLab Provider: SSH authentication with `git@gitlab.com` format, port 22
- Gerrit Provider: SSH authentication with custom port (29418), format `ssh://host:29418/repo`
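The URL formats above can be illustrated with a small helper. This is a minimal sketch, not git-bulk's actual code: the function name `sshCloneURL` and the provider labels are assumptions made for the example.

```go
package main

import (
	"fmt"
	"strings"
)

// sshCloneURL builds an SSH clone URL for a provider.
// Hypothetical helper: the name and provider labels are illustrative only.
func sshCloneURL(provider, host, path string) (string, error) {
	path = strings.Trim(path, "/")
	switch provider {
	case "github", "gitlab":
		// scp-like syntax on the default SSH port 22: git@host:org/repo.git
		return fmt.Sprintf("git@%s:%s.git", host, path), nil
	case "gerrit":
		// Gerrit's SSH daemon is reached on port 29418 with a full ssh:// URL.
		return fmt.Sprintf("ssh://%s:29418/%s", host, path), nil
	default:
		return "", fmt.Errorf("unsupported provider %q", provider)
	}
}

func main() {
	for _, tc := range [][3]string{
		{"github", "github.com", "myorg/repo"},
		{"gitlab", "gitlab.com", "mygroup/repo"},
		{"gerrit", "gerrit.example.com", "platform/build"},
	} {
		u, err := sshCloneURL(tc[0], tc[1], tc[2])
		fmt.Println(u, err)
	}
}
```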
SSH Authentication Priority
- SSH Agent (if `SSH_AUTH_SOCK` environment variable exists)
- SSH Key Files (in order of preference):
  - `~/.ssh/id_ed25519`
  - `~/.ssh/id_rsa`
  - `~/.ssh/id_ecdsa`
  - `~/.ssh/id_dsa`
- SSH Config (host-specific configurations from `~/.ssh/config`)
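The priority order above could be expressed roughly as follows. This is a sketch under assumptions, not the project's implementation: `pickSSHAuth` and its return strings are hypothetical, and the environment and file lookups are injected so the logic can be exercised without touching a real home directory.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// keyPreference mirrors the key file order listed above.
var keyPreference = []string{"id_ed25519", "id_rsa", "id_ecdsa", "id_dsa"}

// pickSSHAuth returns a short label for the first usable method.
// Hypothetical helper: getenv and exists are injected for testability.
func pickSSHAuth(home string, getenv func(string) string, exists func(string) bool) string {
	// 1. Prefer a running agent, signalled by SSH_AUTH_SOCK.
	if sock := getenv("SSH_AUTH_SOCK"); sock != "" {
		return "agent:" + sock
	}
	// 2. Otherwise take the first key file that exists, in preference order.
	for _, name := range keyPreference {
		p := filepath.Join(home, ".ssh", name)
		if exists(p) {
			return "key:" + p
		}
	}
	// 3. Finally fall back to host-specific settings in the SSH config.
	if cfg := filepath.Join(home, ".ssh", "config"); exists(cfg) {
		return "config:" + cfg
	}
	return "none"
}

// fileExists reports whether a path can be stat'ed.
func fileExists(p string) bool {
	_, err := os.Stat(p)
	return err == nil
}

func main() {
	home, _ := os.UserHomeDir()
	fmt.Println(pickSSHAuth(home, os.Getenv, fileExists))
}
```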
Password Prompt Prevention
The tool implements comprehensive password prompt detection and prevention:
Detection Capabilities
- Credential Helper Patterns: Detects `osxkeychain`, `manager-core`, password prompts
- SSH Fallback Scenarios: Identifies when SSH auth fails and git falls back to HTTPS
- Terminal Prompt Detection: Recognizes various credential helper activation patterns
Prevention Mechanisms
- Default Protection: `--disable-credential-helpers=true` (default)
- Git Command Enhancement: Adds `-c credential.helper= -c core.askpass=` to git commands
- Environment Variables: Sets `GIT_ASKPASS="", SSH_ASKPASS="", GIT_TERMINAL_PROMPT=0`
- Runtime Monitoring: Analyzes git output for credential helper usage
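As an illustration of the flag and environment mechanisms listed above, the sketch below prepares a git command with the same overrides. The helper names `hardenedGitArgs` and `hardenedEnv` are assumptions for this example and are not taken from the git-bulk source.

```go
package main

import (
	"fmt"
	"os"
	"os/exec"
	"strings"
)

// hardenedGitArgs prepends the config overrides so they apply to any subcommand.
func hardenedGitArgs(args ...string) []string {
	prefix := []string{"-c", "credential.helper=", "-c", "core.askpass="}
	return append(prefix, args...)
}

// hardenedEnv copies base, dropping any existing prompt-related variables,
// then appends the values that disable interactive prompts.
func hardenedEnv(base []string) []string {
	overrides := map[string]string{"GIT_ASKPASS": "", "SSH_ASKPASS": "", "GIT_TERMINAL_PROMPT": "0"}
	out := make([]string, 0, len(base)+len(overrides))
	for _, kv := range base {
		name, _, _ := strings.Cut(kv, "=")
		if _, overridden := overrides[name]; !overridden {
			out = append(out, kv)
		}
	}
	// Fixed order keeps the result deterministic.
	for _, name := range []string{"GIT_ASKPASS", "SSH_ASKPASS", "GIT_TERMINAL_PROMPT"} {
		out = append(out, name+"="+overrides[name])
	}
	return out
}

func main() {
	cmd := exec.Command("git", hardenedGitArgs("clone", "git@github.com:myorg/repo.git")...)
	cmd.Env = hardenedEnv(os.Environ())
	// The command is only printed here, not executed.
	fmt.Println(strings.Join(cmd.Args, " "))
}
```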
User Guidance
Provides specific troubleshooting guidance when credential helpers are detected:
- Exact commands to prevent prompts
- SSH authentication setup guidance
- Configuration validation steps
Installation
go install github.com/ModeSevenIndustrialSolutions/git-bulk/cmd/git-bulk@latest
Or build from source:
git clone https://github.com/ModeSevenIndustrialSolutions/git-bulk
cd git-bulk
make build
# Install to your Go bin path
make install
# Or install system-wide (requires sudo)
make install-system
Quick Start
# Clone all repositories from a GitHub organization
git-bulk clone github.com/myorg --output ./repos
# Fork repositories from GitHub to GitLab (cross-provider)
git-bulk clone --source github.com/sourceorg --target gitlab.com/targetgroup
# SSH is the default
git-bulk clone github.com/myorg --output ./repos
# Validate SSH setup
git-bulk ssh-setup --verbose
Usage
Clone Operations
# Clone all repositories from a GitHub organization
git-bulk clone github.com/myorg --output ./repos
# Clone from GitLab
git-bulk clone gitlab.com/mygroup --output ./repos
# Clone from Gerrit
git-bulk clone https://gerrit.example.com --output ./repos
# SSH is the default
git-bulk clone github.com/myorg --output ./repos
# HTTPS fallback requires interactive approval; pass --allow-https-fallback for non-interactive environments
git-bulk clone github.com/myorg --allow-https-fallback --output ./repos
# Dry run to see what would be cloned
git-bulk clone github.com/myorg --dry-run --verbose
# Limit number of repositories
git-bulk clone github.com/myorg --max-repos 10 --output ./repos
# Include archived repositories (skipped by default)
git-bulk clone github.com/myorg --clone-archived --output ./repos
# Use custom credentials file
git-bulk clone github.com/myorg --credentials-file ./my-credentials --output ./repos
Note: SSH is the default for all clone and fork operations. If an SSH attempt fails, git-bulk will ask you to approve a one-time HTTPS fallback before proceeding, or you can pass --allow-https-fallback to permit non-interactive fallback (useful for CI).
Fork Operations
Same-Provider Forking
# GitHub to GitHub (native fork)
git-bulk clone --source github.com/sourceorg --target github.com/targetorg
# GitLab to GitLab (native fork)
git-bulk clone --source gitlab.com/sourcegroup --target gitlab.com/targetgroup
# Enable sync mode to update existing forks
git-bulk clone --source github.com/sourceorg --target github.com/targetorg --sync
Cross-Provider Forking
# GitHub to GitLab
git-bulk clone --source github.com/sourceorg --target gitlab.com/targetgroup
# GitLab to GitHub
git-bulk clone --source gitlab.com/sourcegroup --target github.com/targetorg
# Gerrit to GitHub
git-bulk clone --source gerrit.example.com --target github.com/targetorg
# Gerrit to GitLab
git-bulk clone --source gerrit.example.com --target gitlab.com/targetgroup
Fork Options
# SSH is the default for fork operations
git-bulk clone --source github.com/sourceorg --target github.com/targetorg
# Fork with custom output directory
git-bulk clone -o ./forks --source github.com/sourceorg --target github.com/targetorg
# Dry run to see what would be forked
git-bulk clone --source github.com/sourceorg --target github.com/targetorg --dry-run
# Fork with verbose output
git-bulk clone --source github.com/sourceorg --target github.com/targetorg --verbose
Note: Fork functionality supports both same-provider and cross-provider operations:
Same-Provider Forking (GitHub→GitHub, GitLab→GitLab):
- Uses native fork APIs for true repository forks
- Supports `--sync` flag to update existing forks
- Maintains fork relationship in the provider
Cross-Provider Forking (GitHub→GitLab, GitLab→GitHub, Gerrit→GitHub/GitLab):
- Creates new repositories and copies all content (branches, tags, history)
- Sets up proper `origin` (target) and `upstream` (source) remotes
- Does not maintain fork relationship (creates independent repositories)
- Gerrit can be used as source but not as target
All fork operations:
- Create repositories in target organization if they don't exist
- Skip existing repositories (unless `--sync` is used for same-provider)
- Clone repositories locally with proper remote configuration
SSH Authentication
The tool provides transparent SSH authentication support that integrates with your existing SSH infrastructure including ssh-agent, GPG, and hardware security modules like Secretive (for Apple Silicon secure enclave).
SSH Setup and Validation
# Basic SSH setup validation
git-bulk ssh-setup
# Detailed SSH setup information
git-bulk ssh-setup --verbose
Using SSH for Operations
# SSH is the default for all clone operations
git-bulk clone github.com/myorg --output ./repos
# SSH works with all supported providers by default
git-bulk clone gitlab.com/mygroup --output ./repos
git-bulk clone https://gerrit.example.com --output ./repos
# Non-interactive HTTPS fallback when SSH fails
git-bulk clone github.com/myorg --allow-https-fallback --output ./repos
# SSH is the default with fork operations
git-bulk clone --source github.com/sourceorg --target gitlab.com/targetgroup
SSH Configuration
The tool automatically detects and uses:
- SSH Agent: Automatically detects `SSH_AUTH_SOCK` environment variable
- SSH Keys: Auto-discovers common SSH key files in `~/.ssh/` (id_rsa, id_ed25519, etc.)
- SSH Config: Reads `~/.ssh/config` for host-specific settings
- Hardware Tokens: Works with hardware security modules and secure enclaves
SSH Provider Support:
- GitHub: Uses standard SSH port 22 with `git@github.com`
- GitLab: Uses standard SSH port 22 with `git@gitlab.com`
- Gerrit: Uses SSH port 29418 with SSH URL format `ssh://host:29418/repo`
SSH Authentication Priority:
- SSH Agent (if available and contains loaded keys)
- SSH key files (with automatic passphrase detection)
- Fallback to HTTPS authentication if SSH fails
Advanced SSH Configuration
For custom SSH configurations, the tool respects standard SSH config files:
# ~/.ssh/config example
Host my-gerrit
    HostName gerrit.company.com
    Port 29418
    User myusername
    IdentityFile ~/.ssh/id_ed25519_work
    ProxyCommand ssh gateway.company.com -W %h:%p

Host github.com
    HostName github.com
    User git
    IdentityFile ~/.ssh/id_ed25519_personal
Configuration
Authentication Credentials
Set authentication tokens via environment variables:
export GITHUB_TOKEN="your_github_token"
export GITLAB_TOKEN="your_gitlab_token"
export GERRIT_USERNAME="your_gerrit_username"
export GERRIT_PASSWORD="your_gerrit_password"
Credentials File Support
You can also store credentials in a file instead of environment variables. The tool will automatically look for credential files in the following locations:
- `.credentials` (current directory)
- `.env` (current directory)
- `~/.config/git-bulk/credentials`
- `~/.git-bulk-credentials`
Credentials file format:
# Git hosting provider tokens
GITHUB_TOKEN="ghp_your_github_token_here"
GITLAB_TOKEN="glpat-your_gitlab_token_here"
# Gerrit credentials
GERRIT_USERNAME="your_username"
GERRIT_PASSWORD="your_password"
# Comments and empty lines are ignored
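A parser for this file format might look like the sketch below. It is an assumption-laden illustration rather than the tool's own parser: `parseCredentials` is a hypothetical name, and the handling of quotes (one surrounding pair of double quotes is stripped) is a guess based on the sample file above.

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// parseCredentials reads KEY="value" lines, skipping blank lines and # comments.
// Hypothetical helper; the quoting rules are an assumption.
func parseCredentials(content string) map[string]string {
	creds := make(map[string]string)
	sc := bufio.NewScanner(strings.NewReader(content))
	for sc.Scan() {
		line := strings.TrimSpace(sc.Text())
		if line == "" || strings.HasPrefix(line, "#") {
			continue
		}
		key, value, ok := strings.Cut(line, "=")
		if !ok {
			continue // not a KEY=value line
		}
		value = strings.TrimSpace(value)
		// Strip one pair of surrounding double quotes, if present.
		if len(value) >= 2 && strings.HasPrefix(value, `"`) && strings.HasSuffix(value, `"`) {
			value = value[1 : len(value)-1]
		}
		creds[strings.TrimSpace(key)] = value
	}
	return creds
}

func main() {
	creds := parseCredentials("# tokens\nGITHUB_TOKEN=\"ghp_example\"\n\nGERRIT_USERNAME=alice\n")
	fmt.Println(len(creds), creds["GERRIT_USERNAME"])
}
```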
Priority order for credentials:
- Command-line flags (`--github-token`, `--gitlab-token`, etc.)
- Environment variables (`GITHUB_TOKEN`, `GITLAB_TOKEN`, etc.)
- Credentials file values
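The precedence above amounts to "first non-empty value wins". A minimal sketch, assuming a hypothetical `resolveCredential` helper that also reports where the value came from:

```go
package main

import "fmt"

// resolveCredential returns the first non-empty value in priority order:
// command-line flag, then environment variable, then credentials file.
// Hypothetical helper; the source labels are illustrative.
func resolveCredential(flagValue, envValue, fileValue string) (value, source string) {
	switch {
	case flagValue != "":
		return flagValue, "flag"
	case envValue != "":
		return envValue, "env"
	case fileValue != "":
		return fileValue, "file"
	}
	return "", "none"
}

func main() {
	v, src := resolveCredential("", "token-from-env", "token-from-file")
	fmt.Println(v, src)
}
```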
Using a custom credentials file:
git-bulk clone github.com/myorg --credentials-file /path/to/my/credentials
View credential status:
git-bulk clone github.com/myorg --dry-run --verbose
# Shows which credentials are available with ✅/❌ indicators
Troubleshooting
Password Prompts During Bulk Operations
If you're experiencing unexpected password prompts during bulk clone operations (especially on macOS), this is typically caused by git credential helpers like `osxkeychain`. The tool prevents these prompts by default, detects when they would have occurred, and prints guidance.
🔍 Automatic Detection and Guidance
git-bulk automatically detects when git attempts to use credential helpers or password authentication during bulk operations and provides specific guidance:
⚠️ CREDENTIAL HELPER DETECTED: Git attempted password authentication
🔍 SSH authentication likely failed, git fell back to HTTPS authentication
💡 SSH FALLBACK DETECTED: Check SSH configuration and credentials
💡 SOLUTION: Use --disable-credential-helpers flag to prevent password prompts:
git-bulk clone git@github.com:myorg/repo.git --disable-credential-helpers
📖 NOTE: Credential helpers are disabled by default since v1.0
🔧 SSH ALTERNATIVE: Fix SSH authentication setup:
git-bulk ssh-setup --verbose
ssh -T git@github.com # Test SSH connectivity
🛡️ Prevention Mechanisms
Default Protection (Recommended):
# Default behavior - credential helpers disabled
git-bulk clone github.com/myorg
# Equivalent to:
git-bulk clone github.com/myorg --disable-credential-helpers=true
Manual Override (if needed):
# If you need interactive authentication for some reason
git-bulk clone github.com/myorg --disable-credential-helpers=false
Root Cause Analysis
Why This Happens:
- SSH authentication fails (invalid keys, ssh-agent issues, etc.)
- Git automatically falls back to HTTPS authentication
- Git credential helpers attempt to prompt for stored credentials
- This causes unexpected password prompts during bulk operations
Common Credential Helper Sources:
- macOS Keychain (`credential.helper=osxkeychain`)
- Windows Credential Manager (`credential.helper=manager-core`)
- Git credential store (`credential.helper=store`)
Detection Patterns
The tool detects various credential helper scenarios:
- Password prompts: `"Password for"`, `"Username for"`
- Credential helpers: `"keychain"`, `"osxkeychain"`, `"manager-core"`
- SSH failures: `"Permission denied (publickey)"`, `"Host key verification failed"`
- Terminal prompts: `"terminal prompts disabled"`, `"could not read Username"`
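One way to apply these patterns is a case-insensitive substring scan over each line of git output. The sketch below is an assumption about how such matching could work, not the tool's detector: the `classify` function, the category labels, and the rule ordering (more specific messages first) are all illustrative choices.

```go
package main

import (
	"fmt"
	"strings"
)

// rule pairs a category label with substrings that indicate it.
type rule struct {
	category string
	needles  []string
}

// rules are checked in order; more specific messages come first so that
// "could not read Username for ..." is not reported as a plain password prompt.
var rules = []rule{
	{"ssh-failure", []string{"permission denied (publickey)", "host key verification failed"}},
	{"terminal-prompt", []string{"terminal prompts disabled", "could not read username"}},
	{"credential-helper", []string{"keychain", "osxkeychain", "manager-core"}},
	{"password-prompt", []string{"password for", "username for"}},
}

// classify returns the first matching category, or "" if nothing matches.
func classify(line string) string {
	lower := strings.ToLower(line)
	for _, r := range rules {
		for _, n := range r.needles {
			if strings.Contains(lower, n) {
				return r.category
			}
		}
	}
	return ""
}

func main() {
	for _, line := range []string{
		"git@github.com: Permission denied (publickey).",
		"Password for 'https://github.com': ",
		"Cloning into 'repo'...",
	} {
		fmt.Printf("%q -> %q\n", line, classify(line))
	}
}
```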
Prevention Measures
- Use SSH authentication (prevents fallback to HTTPS)
- Keep credential helpers disabled (default behavior)
- Validate SSH setup regularly:
  git-bulk ssh-setup --verbose
- Check your git configuration:
  git config --global --list | grep credential
Technical Implementation
The tool implements multiple layers of protection:
- Git command flags: `-c credential.helper= -c core.askpass=`
- Environment variables: `GIT_ASKPASS="", SSH_ASKPASS="", GIT_TERMINAL_PROMPT=0`
- Runtime detection: Monitors git output for credential helper patterns
- User guidance: Provides specific troubleshooting steps when issues are detected
SSH Authentication Details
The tool provides comprehensive SSH authentication support that integrates with your existing SSH infrastructure including ssh-agent, GPG, and hardware security modules like Secretive (for Apple Silicon secure enclave).
SSH Implementation Features
- Auto-detection of SSH infrastructure: SSH agent socket, key files, config files
- Multiple authentication methods: SSH Agent, key files with passphrase support
- Git SSH wrapper generation: Automatic SSH wrapper script creation
- Provider-specific SSH options: Optimized for each Git hosting provider
- Connection testing: Built-in SSH connectivity validation
SSH Troubleshooting
Validate SSH setup:
git-bulk ssh-setup --verbose
Test SSH connectivity manually:
ssh -T git@github.com
ssh -T git@gitlab.com
Check SSH agent:
ssh-add -l
Common SSH Issues:
- Key not loaded: Use `ssh-add ~/.ssh/id_ed25519` to load keys
- Wrong permissions: SSH keys should be `600`, `.ssh` directory should be `700`
- Host key verification: First connection may require host key acceptance
- Hardware tokens: Ensure hardware security modules are properly configured
Common Issues
Rate Limiting:
- The tool automatically detects and handles rate limits
- Use `--verbose` to see rate limiting messages
- Consider using personal access tokens for higher rate limits
Authentication Errors:
- Verify tokens have correct permissions
- Check that tokens aren't expired
- Use `--verbose` to see detailed authentication status
Network Timeouts:
- Adjust timeout settings: `--timeout 60m --clone-timeout 10m`
- Check network connectivity to Git hosting providers
Repository Not Found:
- Verify organization/user names are correct
- Ensure you have access to private repositories
- Check if repositories exist and aren't archived (unless `--clone-archived` is used)
Advanced Features
Cross-Provider Repository Migration
The tool supports cross-provider "fork-like" functionality, allowing users to copy repositories from one Git hosting provider to another while maintaining proper Git history and remote configuration.
Supported Cross-Provider Operations
- GitHub → GitLab: Clone GitHub repos and create new GitLab projects
- GitLab → GitHub: Clone GitLab projects and create new GitHub repos
- Gerrit → GitHub: Clone Gerrit projects and create new GitHub repos
- Gerrit → GitLab: Clone Gerrit projects and create new GitLab projects
Cross-Provider Workflow
- Repository Discovery: List repositories from source provider
- Existence Check: Check if repository exists in target provider
- Repository Creation: Create empty repository in target provider
- Content Transfer: Clone source repository (bare) and push to target
- Local Clone: Clone target repository locally with proper remotes
- Remote Setup: Configure `origin` (target) and `upstream` (source) remotes
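The content-transfer and remote-setup steps above map onto a short sequence of git commands. The sketch below only builds that sequence and does not execute it. It rests on assumptions: `migrationPlan` is a hypothetical helper, and using `git push --mirror` for the push step is a choice made for this example, since the text above says only that the bare clone is pushed to the target.

```go
package main

import (
	"fmt"
	"strings"
)

// migrationPlan lists the git invocations for one repository, in order.
// Hypothetical helper: directories and the --mirror push are assumptions.
func migrationPlan(sourceURL, targetURL, bareDir, localDir string) [][]string {
	return [][]string{
		// Content transfer: bare clone of the source, then push everything to the target.
		{"git", "clone", "--bare", sourceURL, bareDir},
		{"git", "-C", bareDir, "push", "--mirror", targetURL},
		// Local clone: cloning the target makes it the "origin" remote.
		{"git", "clone", targetURL, localDir},
		// Remote setup: the source becomes "upstream".
		{"git", "-C", localDir, "remote", "add", "upstream", sourceURL},
	}
}

func main() {
	plan := migrationPlan(
		"git@github.com:sourceorg/repo.git",
		"git@gitlab.com:targetgroup/repo.git",
		"/tmp/repo.git", "./repos/repo")
	for _, argv := range plan {
		fmt.Println(strings.Join(argv, " "))
	}
}
```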
Remote Configuration
Same-Provider Operations:
- `origin`: Points to fork in target organization
- `upstream`: Points to original repository in source organization
Cross-Provider Operations:
- `origin`: Points to new repository in target provider
- `upstream`: Points to original repository in source provider
Example for GitHub → GitLab:
# origin: git@gitlab.com:targetgroup/repo.git
# upstream: git@github.com:sourceorg/repo.git
Limitations
- Gerrit as Target: Gerrit cannot be used as a target (no CreateRepository support)
- Cross-Provider Sync: `--sync` only works for same-provider operations
- Fork Relationships: Cross-provider operations create independent repositories (not true forks)
Archive Filtering
By default, the tool skips archived/read-only repositories to focus on active development. You can override this behavior with the `--clone-archived` flag.
Archive Detection
- GitHub/GitLab: Repositories with `archived` field set to `true`
- Gerrit: Projects with `state` field set to `READ_ONLY` or `HIDDEN`
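The detection rules above can be sketched as a filter over a repository list. The `repo` struct, its field names, and the `filterActive` helper are assumptions for illustration and do not reflect git-bulk's internal types.

```go
package main

import "fmt"

// repo is a simplified, hypothetical repository record.
type repo struct {
	Name     string
	Provider string // "github", "gitlab", or "gerrit"
	Archived bool   // GitHub/GitLab "archived" field
	State    string // Gerrit "state" field, e.g. ACTIVE, READ_ONLY, HIDDEN
}

// isArchived applies the per-provider rule described above.
func isArchived(r repo) bool {
	if r.Provider == "gerrit" {
		return r.State == "READ_ONLY" || r.State == "HIDDEN"
	}
	return r.Archived
}

// filterActive drops archived repositories unless includeArchived is set,
// which corresponds to passing --clone-archived.
func filterActive(repos []repo, includeArchived bool) []repo {
	var kept []repo
	for _, r := range repos {
		if includeArchived || !isArchived(r) {
			kept = append(kept, r)
		}
	}
	return kept
}

func main() {
	repos := []repo{
		{Name: "api", Provider: "github"},
		{Name: "legacy", Provider: "github", Archived: true},
		{Name: "tools", Provider: "gerrit", State: "READ_ONLY"},
		{Name: "core", Provider: "gerrit", State: "ACTIVE"},
	}
	for _, r := range filterActive(repos, false) {
		fmt.Println(r.Name)
	}
}
```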
Archive Handling Usage
# Default: skip archived repositories
git-bulk clone github.com/myorg
# Include archived repositories
git-bulk clone github.com/myorg --clone-archived
# See which repositories are filtered
git-bulk clone github.com/myorg --verbose --dry-run
Gerrit SSH Enumeration
For Gerrit instances protected by services like Cloudflare, the tool automatically falls back to SSH-based repository enumeration when HTTP/HTTPS API endpoints fail.
SSH Enumeration Features:
- Automatic Fallback: Triggers when HTTP methods fail and SSH is available
- Native Commands: Uses Gerrit's `ls-projects` SSH command
- JSON Support: Attempts JSON format with description, falls back to plain text
- Compatibility: Maintains compatibility with HTTP API responses
SSH Connection Flow:
- HTTP First: Always attempts HTTP methods first for better performance
- SSH Fallback: Only triggers SSH when HTTP fails and SSH is available
- Command Execution: Runs `gerrit ls-projects --format json --description`
- Output Processing: Parses command output into repository objects
- Graceful Degradation: Maintains compatibility with existing workflows
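The output-processing step could be sketched as "try JSON, then fall back to one project name per line". This example assumes the JSON output is an object keyed by project name whose values carry `description` and `state` fields; that shape is an assumption to verify against your Gerrit version. The `project` type and `parseLsProjects` are hypothetical names.

```go
package main

import (
	"encoding/json"
	"fmt"
	"sort"
	"strings"
)

// project is a simplified, hypothetical repository record.
type project struct {
	Name        string
	Description string
	State       string
}

// parseLsProjects decodes ls-projects output. It first tries the assumed JSON
// shape (object keyed by project name) and otherwise treats the output as
// plain text with one project name per line.
func parseLsProjects(out []byte) []project {
	var byName map[string]struct {
		Description string `json:"description"`
		State       string `json:"state"`
	}
	var projects []project
	if err := json.Unmarshal(out, &byName); err == nil {
		for name, info := range byName {
			projects = append(projects, project{Name: name, Description: info.Description, State: info.State})
		}
	} else {
		for _, line := range strings.Split(string(out), "\n") {
			if name := strings.TrimSpace(line); name != "" {
				projects = append(projects, project{Name: name})
			}
		}
	}
	// Map iteration order is random, so sort for stable results.
	sort.Slice(projects, func(i, j int) bool { return projects[i].Name < projects[j].Name })
	return projects
}

func main() {
	jsonOut := []byte(`{"platform/build":{"description":"Build tools","state":"ACTIVE"}}`)
	fmt.Println(parseLsProjects(jsonOut))
	fmt.Println(parseLsProjects([]byte("platform/build\nplatform/docs\n")))
}
```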
Usage:
# Enumerate repositories; SSH enumeration is used automatically when HTTP fails
git-bulk clone gerrit.example.com
# With Gerrit username
git-bulk clone gerrit.example.com --gerrit-user myusername
Implementation Details:
- Uses existing SSH infrastructure from `/internal/ssh/auth.go`
- Establishes SSH connection to Gerrit server (port 29418)
- Runs native Gerrit commands for repository enumeration
- Gracefully handles both authenticated and anonymous access
- Maintains compatibility with HTTP API response format
Architecture
The tool is built around a modular thread pool architecture:
- Worker Pool: Manages concurrent operations with configurable thread count
- Rate Limiting: Automatically detects and handles API rate limits
- Retry Logic: Exponential backoff with configurable retry attempts
- Job Management: Persistent job state for manual retry of failed operations
- Provider Abstraction: Unified interface for different Git hosting providers
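To make the worker pool and retry components concrete, here is a minimal sketch of a fixed-size pool whose jobs are retried with exponential backoff. It is not the project's `worker` package: the names `runPool`, `runWithRetry`, `backoffDelay`, and `errTransient` are assumptions, and rate-limit detection and persistent job state are left out.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
	"time"
)

// errTransient is a sample error used by the demo job.
var errTransient = errors.New("transient failure")

// backoffDelay returns base doubled once per attempt, capped at max.
func backoffDelay(attempt int, base, max time.Duration) time.Duration {
	d := base
	for i := 0; i < attempt; i++ {
		d *= 2
		if d >= max {
			return max
		}
	}
	if d > max {
		return max
	}
	return d
}

// runWithRetry calls job up to attempts times, sleeping between failures.
func runWithRetry(job func() error, attempts int, base, max time.Duration) error {
	var err error
	for i := 0; i < attempts; i++ {
		if err = job(); err == nil {
			return nil
		}
		if i < attempts-1 {
			time.Sleep(backoffDelay(i, base, max))
		}
	}
	return err
}

// runPool runs jobs on a fixed number of workers and returns one result per job.
func runPool(jobs []func() error, workers, attempts int, base, max time.Duration) []error {
	if workers < 1 {
		workers = 1
	}
	results := make([]error, len(jobs))
	indexes := make(chan int)
	var wg sync.WaitGroup
	for w := 0; w < workers; w++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			// Each index is handled by exactly one worker, so writes do not overlap.
			for i := range indexes {
				results[i] = runWithRetry(jobs[i], attempts, base, max)
			}
		}()
	}
	for i := range jobs {
		indexes <- i
	}
	close(indexes)
	wg.Wait()
	return results
}

func main() {
	calls := 0
	flaky := func() error {
		calls++
		if calls < 3 {
			return errTransient
		}
		return nil
	}
	errs := runPool([]func() error{flaky}, 2, 5, time.Millisecond, 10*time.Millisecond)
	fmt.Println(errs[0], calls)
}
```

Capping the delay keeps a long run of failures from producing unbounded sleeps; a production version would usually add jitter and honor context cancellation.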
Provider Interface
The tool uses a unified provider interface that supports:
type Provider interface {
	Name() string
	ParseSource(source string) (*SourceInfo, error)
	GetOrganization(ctx context.Context, orgName string) (*Organization, error)
	ListRepositories(ctx context.Context, orgName string) ([]*Repository, error)
	CreateFork(ctx context.Context, sourceRepo *Repository, targetOrg string) (*Repository, error)
	CreateRepository(ctx context.Context, orgName, repoName, description string, private bool) (*Repository, error)
	SyncRepository(ctx context.Context, repo *Repository) error
	RepositoryExists(ctx context.Context, orgName, repoName string) (bool, error)
}
Development
Dependencies
The project uses the following major dependencies:
- CLI Framework: `github.com/spf13/cobra` for command-line interface
- GitLab API: `gitlab.com/gitlab-org/api/client-go@v0.129.0` for GitLab integration
- GitHub API: `github.com/google/go-github/v53` for GitHub integration
- Rate Limiting: `golang.org/x/time@v0.11.0` for API rate limiting
Running tests
make test
# Run tests with coverage
make test-coverage
# Run integration tests
make test-integration
# Run CLI tests
make cli-test
Building
# Build for current platform
make build
# Build for multiple platforms
make build-all
# Clean build artifacts
make clean
Development workflow
# Set up development environment
make dev-setup
# Full development cycle
make all
# Run linting and security checks
make lint
make security
Testing
The project includes comprehensive testing:
- Unit tests: Located alongside source code in `internal/*/` directories
- Integration tests: Located in `tests/` directory
- Cross-Provider Tests: Validates fork functionality across different providers
- SSH Tests: Validates SSH authentication and connection handling
- Archive Filtering Tests: Validates repository filtering logic
Test Coverage
- ✅ GitHub provider operations (clone, fork, sync)
- ✅ GitLab provider operations (clone, fork, sync)
- ✅ Gerrit provider operations (clone, SSH enumeration)
- ✅ Cross-provider forking (GitHub↔GitLab, Gerrit→GitHub/GitLab)
- ✅ SSH authentication (keys, agent, hardware tokens)
- ✅ Archive filtering (GitHub, GitLab, Gerrit)
- ✅ Rate limiting and error handling
- ✅ Credential management and validation
Run tests with:
# Run all tests
make test
# Run tests with coverage
make test-coverage
# Run integration tests
./tests/test-utils.sh integration
Contributing
When contributing new features or improvements, please:
- Fork the repository
- Create a feature branch
- Add tests for new functionality (see `tests/` directory)
- Ensure all tests pass
- Update documentation as needed
- Submit a pull request
Development Guidelines
- Follow existing code patterns and error handling approaches
- Add comprehensive tests for new features
- Update documentation for user-facing changes
- Ensure backward compatibility when possible
- Follow Go best practices and conventions
Pre-commit Checks
Before submitting changes, ensure:
# All tests pass
make test
# Code is properly formatted
make fmt
# Linting passes
make lint
# Security checks pass
make security
License
Apache-2.0 License - see LICENSE file for details.
Directories
Path | Synopsis |
---|---|
cmd/git-bulk | command |
internal/clone | Package clone provides functionality for cloning and managing Git repositories in bulk. |
internal/config | Package config provides credential management functionality for git-bulk operations. |
internal/provider | Package provider implements Git hosting provider interfaces for GitHub, GitLab, and Gerrit. |
internal/ssh | Package ssh provides transparent SSH authentication support for Git operations |
internal/worker | Package worker provides a thread pool implementation for concurrent execution of jobs. |