Batch Image Compression: Tips for Processing Images at Scale

Compressing one image at a time is not viable at scale. This guide covers batch workflows, tooling, automation, and CI/CD integration for teams processing large volumes of images.

The scale problem

Manual image compression works for a handful of files. It does not scale to product catalogues with thousands of images, blog platforms where authors upload arbitrary files, or e-commerce sites where new product photography arrives weekly. At scale, compression must be automated, consistent, and integrated into existing workflows.

This guide covers the approaches available — from command-line batch tools to CI/CD integration — with specific advice on format selection, quality settings, and the trade-offs of each approach.

Command-line batch processing

For one-off batch jobs or simple automation scripts, command-line tools are the most practical approach.

Sharp (Node.js) is the most capable and widely-used image processing library for batch work. It wraps libvips, which is significantly faster than ImageMagick for typical image operations. A simple batch script using Sharp:

const sharp = require('sharp');
const glob = require('glob');
const path = require('path');
const fs = require('fs');

async function run() {
  const files = glob.sync('images/input/**/*.{jpg,jpeg,png}');

  for (const file of files) {
    const ext = path.extname(file).toLowerCase();
    const output = file
      .replace('images/input', 'images/output')
      .replace(/\.(jpg|jpeg|png)$/, '.webp');

    // Ensure the mirrored output directory exists
    fs.mkdirSync(path.dirname(output), { recursive: true });

    if (ext === '.png') {
      await sharp(file)
        .webp({ lossless: true })
        .toFile(output);
    } else {
      await sharp(file)
        .webp({ quality: 82 })
        .toFile(output);
    }
  }
}

run();

This converts PNGs to lossless WebP and JPEGs to lossy WebP at quality 82, maintaining the directory structure. Sharp strips EXIF and other metadata from outputs by default; chain .withMetadata() only if you need to preserve it.

Format routing logic

The single most important decision in a batch pipeline is format routing: which output format to use for each input file. A naive approach ("convert everything to WebP") ignores that PNG (transparency, lossless graphics) and JPEG (photographs) call for different treatment.

A robust routing strategy:

  • If the input is PNG with an alpha channel → output WebP lossless (to preserve transparency).
  • If the input is PNG without alpha and has ≤ 256 distinct colours → output WebP lossless (likely a graphic/icon).
  • If the input is PNG without alpha and has many colours → test both WebP lossless and WebP lossy at quality 90; keep smaller (photographs saved as PNG are edge cases).
  • If the input is JPEG → output WebP lossy at quality 82 (or strip EXIF and re-encode JPEG if WebP is not acceptable).
  • Always provide an original-format fallback alongside the WebP variant.
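These rules can be collapsed into a pure decision function. The sketch below assumes your pipeline has already extracted a small metadata object per file (Sharp's metadata() reports format and hasAlpha; the distinct-colour count is something you would have to compute yourself, e.g. by sampling pixels):

```javascript
// Decide the output encoding for one input file, given pre-extracted
// metadata: { format, hasAlpha, colourCount }. Returns Sharp-style
// WebP options, plus a flag for the lossless-vs-lossy size test.
function routeFormat(meta) {
  if (meta.format === 'png') {
    if (meta.hasAlpha) return { webp: { lossless: true } };
    if (meta.colourCount <= 256) return { webp: { lossless: true } };
    // Many-colour PNG: encode both ways and keep the smaller file.
    return { webp: { quality: 90 }, alsoTryLossless: true };
  }
  // JPEG and other photographic inputs: lossy WebP.
  return { webp: { quality: 82 } };
}
```

Keeping the routing decision separate from the encoding step makes it easy to unit-test and to adjust thresholds without touching any I/O code.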

Parallelism and throughput

Image compression is CPU-bound. A single-threaded script processing thousands of images serially is slow. Use worker threads or process-level parallelism:

const { Worker, isMainThread, workerData } = require('worker_threads');
const os = require('os');

if (isMainThread) {
  // Split files across os.cpus().length workers
  const chunks = splitIntoChunks(files, os.cpus().length);
  chunks.forEach(chunk =>
    new Worker(__filename, { workerData: { files: chunk } })
  );
} else {
  // Worker branch: run the Sharp compression loop over workerData.files
}

Sharp itself uses libvips internally, which parallelises within a single image operation. For bulk processing, process-level parallelism (multiple Sharp processes or worker threads) gives the best throughput on multi-core machines.
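The splitIntoChunks helper used in the snippet above is not a built-in; a minimal round-robin version keeps chunk sizes within one item of each other:

```javascript
// Distribute items across n chunks round-robin, so even when
// items.length is not divisible by n, no worker gets a much
// larger share than the others.
function splitIntoChunks(items, n) {
  const chunks = Array.from({ length: n }, () => []);
  items.forEach((item, i) => chunks[i % n].push(item));
  return chunks;
}
```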

CI/CD integration

Images added to a repository or CMS can be automatically optimised on commit or upload. Common integration points:

GitHub Actions: A workflow that runs on push to trigger compression of any new or changed image files:

name: Optimise images
on:
  push:
    paths: ['public/images/**']
jobs:
  compress:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-node@v4
        with: { node-version: '20' }
      - run: npm ci
      - run: node scripts/compress-images.js
      - uses: stefanzweifel/git-auto-commit-action@v5
        with:
          commit_message: 'chore: optimise images'

Upload hooks in CMS platforms: Headless CMS platforms (Contentful, Sanity, Cloudinary) support transformation hooks on upload. Configure them to auto-convert and compress images at ingest time rather than at request time.

CDN-side on-the-fly transformation

Services like Cloudinary, Imgix, Bunny Optimizer, and Vercel's Image Optimization API handle compression and format conversion at the CDN edge. You upload a single master image; the CDN serves the right format, size, and quality for each device/browser based on URL parameters or Accept header negotiation.

Advantages: no pre-processing pipeline, automatic format serving (WebP for supporting browsers, JPEG/PNG for others), responsive image generation without storing every size variant.

Disadvantages: per-transformation costs at high volume, CDN vendor lock-in, and slightly higher first-request latency (the first request triggers transformation; subsequent requests are served from cache).
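The Accept-header negotiation these services perform is straightforward to replicate if you serve images yourself. A deliberately simplified sketch (substring match only, ignoring q-values, which is adequate for image format negotiation):

```javascript
// Pick the best image format a client advertises support for in
// its Accept header; fall back to JPEG, which every browser renders.
function negotiateImageFormat(acceptHeader = '') {
  return acceptHeader.includes('image/webp') ? 'webp' : 'jpeg';
}
```

A request from a WebP-capable browser (Accept: image/webp,image/*) gets the WebP variant; everything else gets the JPEG fallback.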

Quality and consistency

Batch pipelines need quality settings that work well across diverse inputs. A quality setting that is appropriate for studio product photography may over-compress a screenshot. Recommendations:

  • For photographs: WebP quality 80–85. Below 80, compression artefacts become noticeable on some images.
  • For mixed inputs you cannot classify: WebP quality 85 is a safe default that rarely produces visible artefacts.
  • For UI/screenshots/logos: always use lossless. Do not apply lossy compression to these — artefacts at sharp edges are disproportionately obvious.
  • Run a sample of outputs through visual review before deploying. Automated SSIM or PSNR metrics help but do not replace a human spot check.
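One way to enforce consistency is to encode these recommendations as a single preset table that every job reads from; the category names here are illustrative, not a standard taxonomy:

```javascript
// Map an image category to the WebP encoder settings recommended
// above. Categories are pipeline-specific labels you assign during
// classification, not anything the encoder understands.
const QUALITY_PRESETS = {
  photo: { quality: 82 },      // within the 80–85 band for photographs
  unknown: { quality: 85 },    // safe default for unclassified input
  graphic: { lossless: true }, // UI, screenshots, logos
};

function presetFor(category) {
  return QUALITY_PRESETS[category] ?? QUALITY_PRESETS.unknown;
}
```

Centralising the settings means a quality change is one edit, not a hunt through every script that calls the encoder.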

Storing originals

Never discard original files. Store them in a separate location (a cloud storage bucket, a Git LFS store, an archive folder) before or alongside the optimised versions. Compression algorithms improve; you may want to re-process your archive with better tools in two years. If you have only the compressed outputs, you have lost the originals permanently.

For quick, manual batch compression without building a pipeline, compressanimage.com supports dropping multiple files at once and downloading compressed results — useful for ad-hoc batches where automation overhead is not justified.

Tags: batch compression, automation, CI/CD, sharp, imagemin, workflows

Ready to compress your images without losing quality?

Try the free tool →