dfiles: Sync Specific Files Across Git Repositories

I keep config files, AI agent skills, and workflow templates in multiple repositories, and I often only need a handful of paths from each one. Cloning an entire repository just to copy one file creates unnecessary local noise and maintenance overhead. I built dfiles to solve this with a manifest-driven workflow that syncs only the files and directories I explicitly declare.

Why I Built It

I needed a reliable way to keep AI agent skill files (SKILL.md) and shared dotfiles in sync across machines without pulling in complete repositories. In practice, those files are spread across different repos, and cloning everything just to grab one path is overkill.

I evaluated existing approaches, but each had trade-offs:

  • Git submodules: powerful, but they tightly couple repositories and add operational complexity when I only need a few files.
  • Manual copy-paste: quick at first, but error-prone and not reproducible.
  • Custom shell scripts: flexible, but usually fragile and hard to share across environments.

My goal was simple: one manifest file, one command, and only the paths I declared—nothing more.

What is dfiles?

dfiles is a TypeScript CLI tool that declaratively tracks and syncs specific files or directories from remote Git repositories into local destinations, without cloning full repos. It is published on npm as @madkoo/dfiles, and the source code lives at github.com/madkoo/distributed-files. It is cross-platform and runs with Node.js >= 16.7.0 and Git.

Key Features

  • Manifest-driven (dfiles.json) — commit it, share it, reproduce anywhere
  • Shallow clones (--depth 1) on first sync; incremental git fetch + pull after
  • SHA-256 change detection — files are never unnecessarily recopied
  • Supports both individual files and entire directory trees
  • Branch-aware — each entry pins its own branch
  • Offline status check without hitting the network
  • Private repo support via existing Git auth (SSH, credential manager, PAT)
  • Wildcard source paths (added in v1.1.0)

Installation

Prerequisites

  • Node.js >= 16.7.0
  • Git installed and available on PATH

Install via npm

npm install -g @madkoo/dfiles

Verify the installation:

dfiles --version

Quick Start

These four steps are enough to get up and running.

Step 1: Initialize a Manifest

dfiles init

This creates dfiles.json in the current directory. I treat this file as the single source of truth for everything I sync.

Step 2: Add a Tracked Entry

dfiles add https://github.com/org/repo \
  path/to/file.md \
  ~/.local/destination/file.md \
  --branch main \
  --id my-entry

--id is how you reference an entry later when pulling a subset or removing it. source is the relative path inside the remote repository, and destination supports ~ expansion.

Step 3: Pull from Remotes

dfiles pull

Preview changes without writing files:

dfiles pull --dry-run

Target a single entry:

dfiles pull my-entry

Step 4: Check Sync State (Offline)

dfiles status

This reports each entry as current, outdated, or missing, and it does not require a network call.

Commands Reference

Command                           Description                          Key options
dfiles init [dir]                 Create a manifest file               dir optional; defaults to the current directory
dfiles add <repo> <src> <dest>    Track a file or directory            --branch, --id
dfiles pull [ids...]              Sync tracked entries from remotes    --dry-run; ids... optional to target specific entries
dfiles status                     Check sync state offline             (none)
dfiles list                       List all tracked entries             --json for machine-readable output
dfiles remove <id>                Remove a tracked entry               id required

The Manifest File

Here is an example dfiles.json with entries from different repositories and branches:

{
  "version": 1,
  "entries": [
    {
      "id": "canvas-design-skill",
      "repo": "https://github.com/org/skills-repo",
      "branch": "main",
      "source": "skills/canvas-design/SKILL.md",
      "destination": "~/.claude/skills/canvas-design/SKILL.md"
    },
    {
      "id": "deploy-workflow",
      "repo": "https://github.com/org/templates-repo",
      "branch": "feature-branch",
      "source": "workflows/deploy.yml",
      "destination": "~/.local/workflows/deploy.yml"
    }
  ]
}

The key fields are straightforward:

  • id: stable identifier used by commands like dfiles pull <id> and dfiles remove <id>
  • repo: remote Git repository URL
  • branch: branch to sync for that specific entry
  • source: path inside the remote repository (file, directory, or wildcard path)
  • destination: local output path, including support for ~
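For readers who want to consume the manifest programmatically, its shape can be expressed as a TypeScript interface. This is illustrative, inferred from the example above; the published package may type it differently:

```typescript
// Hypothetical types for dfiles.json, inferred from the example manifest.
interface ManifestEntry {
  id: string;          // stable identifier, used by pull <id> / remove <id>
  repo: string;        // remote Git repository URL
  branch: string;      // branch to sync for this entry
  source: string;      // path inside the remote repository
  destination: string; // local output path (~ expansion supported)
}

interface Manifest {
  version: number;
  entries: ManifestEntry[];
}

const manifest: Manifest = JSON.parse(`{
  "version": 1,
  "entries": [
    {
      "id": "canvas-design-skill",
      "repo": "https://github.com/org/skills-repo",
      "branch": "main",
      "source": "skills/canvas-design/SKILL.md",
      "destination": "~/.claude/skills/canvas-design/SKILL.md"
    }
  ]
}`);

console.log(manifest.entries[0].id); // prints "canvas-design-skill"
```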

Note: dfiles.json is discovered by walking parent directories (similar to how Git discovers .git), so I recommend committing it at the root of your project.
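The parent-directory walk is easy to picture in code. Here is a minimal sketch of the same idea, not the tool's actual implementation:

```typescript
import { existsSync } from "node:fs";
import { dirname, join } from "node:path";

// Walk from startDir up to the filesystem root, returning the first
// dfiles.json found (the same discovery idea Git uses for .git).
function findManifest(startDir: string): string | null {
  let dir = startDir;
  for (;;) {
    const candidate = join(dir, "dfiles.json");
    if (existsSync(candidate)) return candidate;
    const parent = dirname(dir);
    if (parent === dir) return null; // reached the root without a hit
    dir = parent;
  }
}
```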

How It Works Under the Hood

dfiles caches repositories under ~/.dfiles/cache/<SHA-256-of-URL>/. On first sync, it performs a shallow git clone --depth 1 so it does not pull full history. On later syncs, it updates the cached clone using git fetch and git pull. Before copying, it compares file contents using SHA-256 hashes, so unchanged content is never recopied. This works for both single files and full directory trees.
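The caching and diffing scheme described above can be sketched in a few lines. This is an illustration of the idea, assuming the cache layout mentioned earlier; it is not dfiles' source code:

```typescript
import { createHash } from "node:crypto";
import { homedir } from "node:os";
import { join } from "node:path";

// Hex-encoded SHA-256 of arbitrary content.
const sha256 = (data: string | Buffer): string =>
  createHash("sha256").update(data).digest("hex");

// One cache directory per remote, keyed by a hash of its URL,
// so the same repo is only ever cloned once per machine.
function cacheDir(repoUrl: string): string {
  return join(homedir(), ".dfiles", "cache", sha256(repoUrl));
}

// Copy only when the destination is missing or its content differs.
function needsCopy(sourceContent: Buffer, destContent: Buffer | null): boolean {
  return destContent === null || sha256(sourceContent) !== sha256(destContent);
}

console.log(needsCopy(Buffer.from("same"), Buffer.from("same"))); // prints "false"
```

Hashing content rather than comparing timestamps means a re-clone or a touch of the destination file does not trigger a spurious copy.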

Private Repositories

dfiles uses simple-git, which shells out to your system Git and inherits process.env, so it automatically uses whatever Git authentication you already have configured: SSH keys, a credential helper, or a personal access token. No extra authentication layer is needed in dfiles itself.
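The environment-inheritance point is easy to demonstrate. The sketch below uses a made-up DEMO_TOKEN variable to show that child processes spawned from Node see the parent's environment by default, which is why git inherits your credentials when invoked through simple-git:

```typescript
import { execFileSync } from "node:child_process";

// DEMO_TOKEN is a stand-in for illustration, not a real dfiles variable.
process.env.DEMO_TOKEN = "from-parent";

// Spawn a child Node process; with no explicit env option,
// it inherits process.env from this parent.
const out = execFileSync(process.execPath, [
  "-e",
  "console.log(process.env.DEMO_TOKEN)",
]).toString().trim();

console.log(out); // prints "from-parent"
```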

Conclusion

When important files are scattered across repositories, manual synchronization is slow, error-prone, and hard to reproduce. dfiles solves that with one declarative manifest (dfiles.json) and one sync command, while still staying efficient through shallow clones, cache reuse, and hash-based diffing.

If you want to try it, check the source repository at github.com/madkoo/distributed-files and the documentation site at madkoo.github.io/distributed-files. If you run into gaps or have improvements, open an issue or PR.