Getting Started With PicSift: Forensic-Grade Photo Deduplication for Your Workflow

If your photo library has grown past the point where you can remember what you have already imported, edited, or backed up, you have a deduplication problem. The longer it goes unsolved, the worse it gets: storage climbs, backups bloat, and finding the “real” version of a shot becomes a guessing game between five nearly identical files. PicSift is a desktop application built for exactly this: scan a folder, find exact and near-duplicate images (and videos), group them by photo shoot, rename them sequentially, and quarantine the extras, all without permanently deleting a single file until you are ready.

This guide walks through the full workflow: installation and activation, your first scan, understanding duplicate detection, shoot grouping, sequential rename, and the safety model that keeps originals untouched. By the end, you will know enough to point PicSift at your messiest folder and walk away with a clean, chronologically ordered library.

Installation and Activation

PicSift is a Windows desktop application. After purchase, you receive a license key tied to your account. Launch START.bat from the install directory to open the application. On first run, PicSift contacts the activation server to validate your license and register the machine. After activation, the app re-validates weekly with a thirty-day offline grace period — so a brief internet outage will not lock you out.

Tier	Price	PCs	Updates
Starter	$29 one-time	1 PC	1 year
Unlimited	$59 one-time	Unlimited PCs	Lifetime

Both tiers include every feature. They differ only in how many PCs you can activate and how long you receive updates. If you work across a desktop and a laptop, two Starter licenses would cost $58, so Unlimited at $59 is the better value; if you have one machine, Starter is plenty.

Your First Scan: Point, Scan, Review

The simplest workflow is three steps: tell PicSift which folder to scan, let it process, and review the results. By default, PicSift runs in dry-run mode — it shows you what it would do without moving or renaming anything. That means your first scan is risk-free: look at the report, understand the findings, and only execute when you are satisfied.

From the command line, a basic scan looks like this:

python -m picsift --input "D:\Photos\2025" --quarantine "D:\Photos\Quarantine"

PicSift will scan every supported image format in the input directory (JPG, PNG, GIF, BMP, TIFF, WebP, HEIC/HEIF) and output a detailed report. Add --include-videos to extend coverage to MP4, MOV, AVI, MKV, and other video formats (requires FFmpeg on your PATH for keyframe extraction).

Safety first

PicSift never permanently deletes files. Duplicates are moved to a quarantine folder, and a restore.ps1 PowerShell script is generated alongside every quarantine operation so you can undo any move with a single command. Dry-run is the default; you must explicitly pass --run to execute changes.
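The dry-run-by-default and undoable-move ideas can be illustrated with a minimal Python sketch. This is not PicSift's code: the function names and the restore_manifest.json file are hypothetical, and a JSON manifest stands in here for the restore.ps1 script that PicSift actually generates.

```python
import json
import shutil
from pathlib import Path


def quarantine(files: list[Path], quarantine_dir: Path, run: bool = False) -> list[dict]:
    """Plan moves into quarantine; perform them only when run=True."""
    moves = [{"from": str(f), "to": str(quarantine_dir / f.name)} for f in files]
    if not run:
        return moves  # dry run: report the plan and touch nothing
    quarantine_dir.mkdir(parents=True, exist_ok=True)
    for move in moves:
        if Path(move["to"]).exists():
            raise FileExistsError(move["to"])  # never overwrite a quarantined file
        shutil.move(move["from"], move["to"])
    # Record every move so it can be reversed later (hypothetical manifest name).
    (quarantine_dir / "restore_manifest.json").write_text(json.dumps(moves, indent=2))
    return moves


def restore(quarantine_dir: Path) -> None:
    """Undo the moves recorded in the manifest."""
    manifest = quarantine_dir / "restore_manifest.json"
    for move in json.loads(manifest.read_text()):
        shutil.move(move["to"], move["from"])
    manifest.unlink()
```

Calling quarantine() without run=True returns the planned moves and leaves the disk untouched, which mirrors the behaviour of omitting --run.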

How Duplicate Detection Works

Not all duplicates are byte-identical copies. A JPEG exported from Lightroom and the same JPEG re-saved in Preview are visually identical but differ at the byte level. PicSift handles this with a three-tier detection system:

Tier 1 — Exact duplicates. SHA-256 hash match catches byte-identical files instantly. A pixel-hash pass then catches files that are pixel-identical but differ in metadata or encoding (for example, the same image saved as PNG and as lossless WebP).
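As an illustration of the SHA-256 pass only (the pixel-hash pass is not shown), here is a minimal sketch that groups byte-identical files using the Python standard library. The function names are hypothetical, and PicSift's own implementation may differ.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large images are never loaded whole."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def exact_duplicate_groups(root: Path) -> list[list[Path]]:
    """Return groups of two or more files whose bytes are identical."""
    by_hash: dict[str, list[Path]] = defaultdict(list)
    for path in sorted(root.rglob("*")):
        if path.is_file():
            by_hash[sha256_of(path)].append(path)
    return [paths for paths in by_hash.values() if len(paths) > 1]
```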

Tier 2 — Near-duplicates. Perceptual hashing (pHash, dHash, aHash) compares the visual content of images regardless of resolution, compression, or minor edits. PicSift measures the Hamming distance between hashes: the smaller the distance, the more visually similar the images. The default threshold is 6; lower it for stricter matching, or raise it to catch more heavily cropped or filtered versions.
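To make the threshold concrete, the sketch below computes the Hamming distance between two perceptual hashes stored as integers. The function names are hypothetical, and treating a distance equal to the threshold as a match is an assumption; the source does not say whether PicSift's comparison is inclusive.

```python
def hamming_distance(hash_a: int, hash_b: int) -> int:
    """Count the bit positions where two equal-length hashes differ."""
    return bin(hash_a ^ hash_b).count("1")


def is_near_duplicate(hash_a: int, hash_b: int, threshold: int = 6) -> bool:
    """Treat two images as near-duplicates when few hash bits differ.

    Assumption: a distance equal to the threshold still counts as a match.
    """
    return hamming_distance(hash_a, hash_b) <= threshold
```

With a 64-bit hash, a threshold of 6 means at most 6 of the 64 bits may differ before two images stop being considered the same picture.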

Tier 3 — Screenshot detection. Heuristics for border patterns, uniform regions, and aspect ratios flag screenshot-style images and match them against their source photos when possible.

Duplicate groups are clustered using a union-find algorithm, so if image A matches image B and image B matches image C, all three end up in the same group even if A and C are not direct matches.
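The transitive grouping described above can be sketched with a small union-find structure. This is a generic illustration of the technique with hypothetical names, not PicSift's source.

```python
class UnionFind:
    """Minimal disjoint-set structure with path compression."""

    def __init__(self, items):
        self.parent = {item: item for item in items}

    def find(self, item):
        # Walk up to the root, shortening the path as we go.
        while self.parent[item] != item:
            self.parent[item] = self.parent[self.parent[item]]
            item = self.parent[item]
        return item

    def union(self, a, b):
        root_a, root_b = self.find(a), self.find(b)
        if root_a != root_b:
            self.parent[root_b] = root_a


def cluster(items, matching_pairs):
    """Merge pairwise matches into groups, following matches transitively."""
    uf = UnionFind(items)
    for a, b in matching_pairs:
        uf.union(a, b)
    groups = {}
    for item in items:
        groups.setdefault(uf.find(item), []).append(item)
    return list(groups.values())
```

Given the matches A–B and B–C, cluster(["A", "B", "C", "D"], [("A", "B"), ("B", "C")]) puts A, B, and C in one group and leaves D on its own, even though A and C were never compared directly.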

Quality Scoring: PicSift Keeps the Best Copy

Within each duplicate group, PicSift does not pick a keeper at random. It scores every file on a weighted scale:

Resolution (35%): Higher pixel dimensions score higher.
Sharpness (25%): Measured via Laplacian variance — sharper images score higher.
Compression quality (15%): Less-compressed files (higher JPEG quality settings) score higher.
File size efficiency (10%): Balances quality-per-byte.
Metadata completeness (5%): Files with intact EXIF data score higher.
Screenshot penalty (10%): Detected screenshots score lower unless they are the only copy.

The highest-scoring file becomes the keeper; everything else moves to quarantine. This means you consistently end up with the best version of every image, not just the first one PicSift happened to find.
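The weighting above can be sketched as a simple weighted sum. The weights come from the list; everything else is an assumption made for illustration: each component is taken to be already normalized to the range 0 to 1, the screenshot penalty is modelled as a 10% share that only non-screenshots earn, and the function and key names are hypothetical.

```python
# Weights taken from the list above; the key names are hypothetical.
WEIGHTS = {
    "resolution": 0.35,
    "sharpness": 0.25,
    "compression": 0.15,
    "size_efficiency": 0.10,
    "metadata": 0.05,
}
SCREENSHOT_SHARE = 0.10  # assumption: earned in full by non-screenshots only


def quality_score(components: dict, is_screenshot: bool = False) -> float:
    """Combine normalized component scores (each 0..1) into one score."""
    score = sum(weight * components[name] for name, weight in WEIGHTS.items())
    if not is_screenshot:
        score += SCREENSHOT_SHARE
    return score


def pick_keeper(group: dict) -> str:
    """Return the file name with the highest score in a duplicate group.

    `group` maps file name -> (components, is_screenshot). A screenshot that
    is the only copy is still returned, since it has nothing to lose against.
    """
    return max(group, key=lambda name: quality_score(*group[name]))
```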

Shoot Grouping: Keep Sessions Together

If you import photos from multiple sessions into one folder — a weekend trip, a Tuesday headshot session, a birthday party — they all land in chronological chaos. PicSift’s shoot grouping clusters images by visual similarity and capture timestamp so related photos stay together.

The grouper uses perceptual hash distances (pHash and dHash) within a configurable threshold, then sorts groups by the earliest EXIF capture time. The result is a series of coherent “shoots” that respect the way you actually took the photos, even if the files were renamed, re-imported, or scattered across subfolders.
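The ordering step can be sketched as follows, assuming the visual clustering has already produced the shoots and that each photo carries a capture time read from EXIF. The function name and the (file name, capture time) tuple layout are hypothetical, and sorting photos within each shoot is an added assumption.

```python
from datetime import datetime


def order_shoots(shoots: list[list[tuple[str, datetime]]]) -> list[list[tuple[str, datetime]]]:
    """Sort photos within each shoot, then sort shoots by earliest capture time.

    Each photo is a (file_name, capture_time) pair; empty shoots are dropped.
    """
    ordered = [sorted(shoot, key=lambda photo: photo[1]) for shoot in shoots if shoot]
    return sorted(ordered, key=lambda shoot: shoot[0][1])
```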

Shoot grouping is especially useful as a precursor to sequential rename: group first so related photos are clustered, then rename so file names reflect chronological order within each group.

Sequential Rename: Clean, Consistent File Names

After grouping and deduplication, sequential rename assigns clean, ordered names to your keepers. The default pattern is Photo (1).jpg, Photo (2).jpg, and so on, but you can customize the pattern with the --rename-pattern flag:

python -m picsift --input "D:\Photos\2025" --rename --rename-pattern "Trip ({n})" --run

Renaming uses a two-phase approach: files are first renamed to unique temporary names to avoid collisions (critical when Photo (3) needs to become Photo (1)), then renamed again to their final names. Original modification timestamps are preserved where the OS allows, and a restore script is generated so you can revert the rename if needed.
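A minimal sketch of the two-phase idea, using only the Python standard library. The function name, the temporary-name scheme, and the shape of the plan (old name mapped to new name) are assumptions for illustration; restore-script generation is not shown.

```python
import uuid
from pathlib import Path


def two_phase_rename(folder: Path, plan: dict[str, str]) -> None:
    """Apply a rename plan (old name -> new name) without collisions."""
    if len(set(plan.values())) != len(plan):
        raise ValueError("two files map to the same final name")
    for new_name in plan.values():
        # A target that exists and is not itself being renamed would be clobbered.
        if new_name not in plan and (folder / new_name).exists():
            raise FileExistsError(new_name)

    # Phase 1: park every file under a unique temporary name.
    staged = []
    for old_name, new_name in plan.items():
        source = folder / old_name
        temp = folder / f".tmp-{uuid.uuid4().hex}{source.suffix}"
        source.rename(temp)
        staged.append((temp, folder / new_name))

    # Phase 2: every final name is now free, so the moves cannot collide.
    for temp, final in staged:
        temp.rename(final)
```

Because every source is parked first, a plan in which Photo (3).jpg becomes Photo (1).jpg while the old Photo (1).jpg becomes Photo (2).jpg works regardless of the order in which the entries are processed.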

Full workflow in one command

Scan for duplicates, group by shoot, rename keepers, and quarantine extras:

python -m picsift --input "D:\Photos" --quarantine "D:\Quarantine" --rename --rename-pattern "Session ({n})" --include-videos --run

Remove --run for a dry-run preview first.

Reports and Audit Trail

Every PicSift run generates output files you can review before or after execution:

report.csv — Tabular summary of every file, its duplicate group, quality score, and action taken.
groups.json — Machine-readable duplicate group data for scripting or integration.
actions.log — Timestamped log of every move or rename operation.
plan.json — The execution plan PicSift will follow (or followed) during the run.
verification.txt — Post-run integrity check confirming all operations completed.
review.html — Optional visual gallery (--review-html) showing duplicate groups side by side for manual verification.

The audit trail exists so you never have to trust the tool blindly. Review the CSV, spot-check the HTML gallery, and only commit to the quarantine when you are confident in the results. PicSift is built for professionals who need accountability, not just convenience.

Performance and Scale

PicSift uses an SQLite cache to avoid re-hashing files on subsequent runs, parallel workers for hash computation, and lazy imports to keep startup fast. In practice, a folder of roughly 1,600 files scans in under four minutes, and libraries of 10,000 or more images complete in well under an hour. If you run PicSift regularly (weekly imports, monthly archive cleanups), the cache means re-scans are nearly instant for unchanged files.
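One common way to build such a cache is to key each stored hash on the file's path, size, and modification time, and re-hash only when any of them changes. The sketch below assumes that scheme; the table layout and function names are hypothetical, and PicSift's actual cache format is not documented here.

```python
import hashlib
import sqlite3
from pathlib import Path


def open_cache(db_path: str) -> sqlite3.Connection:
    """Open (or create) the hash cache database."""
    conn = sqlite3.connect(db_path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS hashes ("
        "path TEXT PRIMARY KEY, size INTEGER, mtime_ns INTEGER, sha256 TEXT)"
    )
    return conn


def cached_sha256(conn: sqlite3.Connection, path: Path) -> tuple[str, bool]:
    """Return (hash, was_cached); re-hash only if size or mtime changed."""
    stat = path.stat()
    row = conn.execute(
        "SELECT size, mtime_ns, sha256 FROM hashes WHERE path = ?", (str(path),)
    ).fetchone()
    if row and row[0] == stat.st_size and row[1] == stat.st_mtime_ns:
        return row[2], True
    digest = hashlib.sha256(path.read_bytes()).hexdigest()
    conn.execute(
        "INSERT OR REPLACE INTO hashes VALUES (?, ?, ?, ?)",
        (str(path), stat.st_size, stat.st_mtime_ns, digest),
    )
    conn.commit()
    return digest, False
```

On a second run over an unchanged folder, every lookup is a cache hit, so the cost is one stat call and one indexed query per file rather than a full read.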

Where PicSift Fits in the Wigley Studios Ecosystem

PicSift solves a different problem than the rest of the Wigley Studios product line. PromptUI generates UI from text, the UI Kit Generator builds design systems, and Developer Labs offers free browser-based tools for design tokens, API contracts, and mock data. PicSift is for anyone who works with large volumes of media files: photographers, content creators, archivists, and teams managing shared asset libraries. It is a standalone desktop tool, not a browser service, because deduplication requires local file access and the kind of processing power that belongs on your machine.

If you shoot professionally, manage client deliverables, or simply have years of phone backups that have never been sorted, PicSift turns an afternoon of manual comparison into a few minutes of confident automation, with every duplicate safely quarantined, never deleted.

Brandon Wigley