Skip to content

About the algorithm

Lloyd Konneker edited this page May 30, 2021 · 4 revisions

Disambiguation

These terms from the literature describe what the Resynthesizer can accomplish:

  • inpainting
  • texture synthesis or transfer
  • content-aware fill
  • patch matching
  • reconstruction

Generally, similar algorithms seek to "plausibly reconstruct" an image, in some sense to understand "objects" in the image.

Patch matching

The algorithm does patch matching. A patch is a pixel and a small set of surrounding pixels. The patch need not be square or even contiguous.

The result from a matching is "for" a middle pixel of a patch. That is, the algorithm only changes that one pixel after a matching. A middle pixel is only approximately in the center, since a patch is digital and can be irregular.

The algorithm makes many passes. In this algorithm, during the first pass, the target is not filled completely in, but gradually gets filled in. The empty pixels are processed in random order. In the first pass, a patch is like a shotgun pattern: a small set of nearest pixels, but not necessarily nearby. The middle pixel has no value. A new value for the middle pixel, in the target, is found by searching the corpus for the best matching patch to the target pixel's patch (including its surrounding that may not be in the target area per se.)

In subsequent passes, the target is already filled in, but with imperfect results. But the patches are all regular, contiguous shapes (except for image borders.) Subsequent passes also process pixels at random.

The algorithm as search

The algorithm uses a search heuristic: a new best match for a pixel is often found (in the corpus) at the opposite offset from a pixel (in the target) to a neighbor (in the target) pixel's best match (in the corpus.)

The algorithm is adaptive: it quits when improvement from pass to pass diminishes below a threshold. Also, on any particular patch matching search, the algorithm stops searching when it finds a match whose quality is below a threshold.

For most uses, the algorithm is "search and replace". That is, after every search, it replaces a pixel in the target with the middle pixel of the best match in the corpus. So in subsequent passes, a searchee patch might be different than earlier passes.

The algorithm keeps the coordinates of the best matches in the corpus, but does not return those coordinates as a result. Search without replacement could yield a result: the coordinates of the best matches in the corpus. That would be another mode of operation of the algorithm. This has been partially implemented only in the "falseColorMatch" branch of the repository.

Matching

The matching or comparison is "best fit." Any patch in an image is likely to be unique: not have an exact match elsewhere.

The algorithm uses a special distance measure function, the Cauchy probability distribution function. See Paul Harrison's thesis for a discussion why this might be better than other measures.

A more general discussion of the algorithm as a best-fit algorithm in a multi-dimensional space is in Paul Harrison's thesis.

Memory use and kernels

Unlike many image filters, the Resynthesizer algorithm does not have a "kernel" in the sense of a computation that is scanned in a linear order across the image. Instead the Resynthesizer processes in a random order. A kernel has the property that its memory use pattern plays well with cached memory, especially when images are also "tiled" at a lower level of the image implementation (as is done in GIMP and Gegl.)

Because the Resynthesizer does not have a kernel, it needs a view of the entire image, i.e. it performs best when the entire image fits in memory, especially cache memory. Thus the Resynthesizer algorithm is very memory intensive, and not simple to multi-process. This particular implementation of the Resynthesizer algorithm emphasizes data structures that play well with cached memory, namely interleaving mask bytes with color bytes.

Other patch matching algorithms

There are other patch matching algorithms. One starts by completely filling in the target with random guesses from the corpus. Then the target is processed in a raster scan (scan line) order, again matching patches. A simple distance measure is used between patches. Processing in raster order lets the algorithm reuse some of computations for pixels above and to the left. Completely filling in the initial target lets the algorithm use square, vectorizable patches. The simple distance measure is also vectorizable.

(I briefly explored vectorization of the resynthesizer code. There are some remnants in the repository.)

You may also be interested in 'bidirectional similarity', which uses patch matching (and supplements it.)

More about the algorithm

The algorithm might be better than other algorithms because it uses two kinds of patches: shotgun and regular patches. The shotgun kind of patch is more complicated to implement, but might give better results in the early stages of the algorithm, and the overall result might depend on the quality of the early results.

The algorithm also implements several kinds of ordering or direction for the "random" search. For example it can fill in the target from the outside in, still randomly, but not totally randomly. A user can often choose a direction that gives better apparent results.