CS 180: Intro to Computer Vision and Computational Photography, Fall 2024

Project 1: Colorizing the Prokudin-Gorskii Photo Collection

Aishik Bhattacharyya

Overview

As early as 1907, Sergei Mikhailovich Prokudin-Gorskii had a visionary idea: to record three exposures of every scene onto a glass plate, one each through a red, a green, and a blue filter. He traveled across the Russian Empire taking color photographs, including the only color portrait of Leo Tolstoy. The goal of this project is to take the digitized glass plate images and combine the three exposures into a single RGB color image.

Exhaustive Search Approach (Simple)

To align the three channels, I exhaustively searched over a 15 by 15 pixel window of candidate shifts, using the L2 norm as my scoring metric. For each of the red and green channels, this means 225 candidate shifts: I shift the channel by each candidate, compare it to the blue channel with the L2 norm, and keep the shift with the lowest score.
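The search described above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the names `l2_score` and `align`, the `radius` parameter, and the use of `np.roll` (which wraps pixels around the edges) are all assumptions. A radius of 7 gives offsets from -7 to 7 in each direction, i.e. a 15 by 15 window centered on zero shift.

```python
import numpy as np

def l2_score(a, b):
    """Sum of squared differences between two images; lower is better."""
    diff = a.astype(np.float64) - b.astype(np.float64)
    return float(np.sum(diff ** 2))

def align(channel, reference, radius=7):
    """Hypothetical helper: try every (dy, dx) in a square window and
    return the shift of `channel` that best matches `reference`."""
    best_shift, best_score = (0, 0), np.inf
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            # np.roll wraps around at the borders (an assumption of this sketch)
            shifted = np.roll(channel, (dy, dx), axis=(0, 1))
            score = l2_score(shifted, reference)
            if score < best_score:
                best_score, best_shift = score, (dy, dx)
    return best_shift
```

With the blue channel as the reference, one would call `align(green, blue)` and `align(red, blue)` and then roll each channel by the returned shift before stacking the three into an RGB image.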

This process works in theory when the plates are clean, but the real scans are not. My first implementation kept producing blurry (misaligned) images because border artifacts such as dust and scanning damage were skewing the score. To solve this, I cropped the image’s borders by 15%.
Cathedral (G: [5,2], R: [12,3])
Monastery (G: [-3,2], R: [3,2])
Tobolsk (G: [3,3], R: [6,3])
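The border cropping mentioned above might look something like the sketch below. It assumes the 15% is removed from each of the four sides and that the crop is applied before scoring; the writeup does not say which convention was used, so both choices, along with the names `crop_interior` and `interior_l2`, are hypothetical.

```python
import numpy as np

def crop_interior(img, frac=0.15):
    """Drop `frac` of the height/width from every side (assumed convention)."""
    h, w = img.shape[:2]
    dy, dx = int(round(h * frac)), int(round(w * frac))
    return img[dy:h - dy, dx:w - dx]

def interior_l2(a, b, frac=0.15):
    """L2 score computed only on the cropped interiors, so that dust and
    scan damage near the borders cannot influence the alignment."""
    diff = crop_interior(a, frac).astype(np.float64) - crop_interior(b, frac)
    return float(np.sum(diff ** 2))
```

Scoring only the interior keeps the noisy borders out of the metric while still letting the full-size channels be shifted and stacked afterwards.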

Image Pyramid Approach (Advanced) + Bells & Whistles Auto Cropping

The simple exhaustive approach breaks down when the pixel displacement is larger than the search window. To handle this, I implemented an image pyramid, which represents an image at successively smaller scales and refines the shift estimate from one scale to the next. This is effective because a shift estimated on a downscaled image covers a proportionally larger displacement in the original: a shift of d pixels at half resolution corresponds to 2d pixels at full resolution. The estimate is then carried up to the larger images in a coarse-to-fine strategy, so far fewer shifts need to be considered at the larger scales. To implement this, I reused my code from the simple approach, referred to as the align function from here on.

The user picks how many pyramid levels to recurse through. The base case is when there are no levels left, in which case we just call the basic align function. Otherwise, we scale both images down by half and align them at the smaller scale. We scale the resulting shift back up (doubling it) and shift the original image by that amount. Then we align this shifted image once more to get the final shift at the current pyramid level. This continues until the base case is reached.
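The recursion described above can be sketched as follows. This is an illustration under several assumptions rather than the project's implementation: downscaling is done by averaging 2 by 2 pixel blocks (the original may well have used a library resize), shifts wrap around via `np.roll`, and the names `downscale_half`, `pyramid_align`, and the `refine_radius` parameter are hypothetical.

```python
import numpy as np

def align(channel, reference, radius=7):
    """Exhaustive search for the (dy, dx) shift minimising the L2 score."""
    ref = reference.astype(np.float64)
    best_shift, best_score = (0, 0), np.inf
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            shifted = np.roll(channel, (dy, dx), axis=(0, 1)).astype(np.float64)
            score = float(np.sum((shifted - ref) ** 2))
            if score < best_score:
                best_score, best_shift = score, (dy, dx)
    return best_shift

def downscale_half(img):
    """Halve each dimension by averaging 2x2 blocks (assumed method)."""
    h, w = img.shape[0] // 2 * 2, img.shape[1] // 2 * 2
    blocks = img[:h, :w].astype(np.float64).reshape(h // 2, 2, w // 2, 2)
    return blocks.mean(axis=(1, 3))

def pyramid_align(channel, reference, levels, radius=7, refine_radius=2):
    """Coarse-to-fine alignment; `levels` is the number of halvings."""
    if levels == 0:
        # Base case: plain exhaustive search at the coarsest scale.
        return align(channel, reference, radius)
    # Estimate the shift at half resolution, then scale it back up.
    cy, cx = pyramid_align(downscale_half(channel), downscale_half(reference),
                           levels - 1, radius, refine_radius)
    cy, cx = 2 * cy, 2 * cx
    # Apply the coarse shift, then search a small window for the residual.
    shifted = np.roll(channel, (cy, cx), axis=(0, 1))
    ry, rx = align(shifted, reference, refine_radius)
    return cy + ry, cx + rx
```

Because each level only needs a small refinement window, the total number of full-resolution comparisons stays small even when the true displacement is far larger than the 15 by 15 window of the simple approach.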

I had an issue rendering emir.tif, as seen below. Cropping the image by 15% was not enough, so I implemented auto cropping with a Sobel filter. The Sobel filter is an edge detector that responds where there is a significant change in intensity, which helps locate and deal with the black borders of the image. After implementing the filter, I was able to render emir.tif successfully, as seen below. Note that the rest of the images only had the pyramid algorithm run on them.
Emir.tif without Sobel filter (G: [49,24], R: [112, -1050])
Emir.tif with Sobel filter (G: [49,24], R: [107,40])
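One possible way to auto crop with a Sobel filter is sketched below. The writeup does not give the details of its auto cropping, so everything here is an assumption made for illustration: the Sobel gradients are computed by hand with NumPy slicing (a library routine such as `scipy.ndimage.sobel` could be used instead), the border is taken to be the strongest edge row or column within the outer 10% of each side, and the names `sobel_magnitude`, `auto_crop`, and `search_frac` are hypothetical.

```python
import numpy as np

def sobel_magnitude(img):
    """Gradient magnitude from the 3x3 Sobel kernels, with edge padding."""
    p = np.pad(img.astype(np.float64), 1, mode="edge")
    # Horizontal derivative: right column minus left column, weights 1-2-1.
    gx = (p[:-2, 2:] + 2 * p[1:-1, 2:] + p[2:, 2:]) \
       - (p[:-2, :-2] + 2 * p[1:-1, :-2] + p[2:, :-2])
    # Vertical derivative: bottom row minus top row, weights 1-2-1.
    gy = (p[2:, :-2] + 2 * p[2:, 1:-1] + p[2:, 2:]) \
       - (p[:-2, :-2] + 2 * p[:-2, 1:-1] + p[:-2, 2:])
    return np.hypot(gx, gy)

def auto_crop(img, search_frac=0.1):
    """Crop a 2D image just inside the strongest edge found near each side."""
    mag = sobel_magnitude(img)
    h, w = img.shape
    rows, cols = mag.mean(axis=1), mag.mean(axis=0)
    my, mx = max(1, int(h * search_frac)), max(1, int(w * search_frac))
    top = int(np.argmax(rows[:my])) + 1            # first row past the edge
    bottom = h - my + int(np.argmax(rows[h - my:]))
    left = int(np.argmax(cols[:mx])) + 1
    right = w - mx + int(np.argmax(cols[w - mx:]))
    return img[top:bottom, left:right]
```

The same Sobel edge map could also be used as the input to the alignment score instead of raw intensities; whether the project did this is not stated.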



Church (G: [25,4], R: [58,-4])
Harvesters (G: [60,17], R: [124,14])
Three Generations (G: [53,14], R: [112,11])
Icon (G: [41,17], R: [90,23])
Lady (G: [52,9], R: [120,11])
Melons (G: [82,10], R: [178,13])
Onion Church (G: [52,26], R: [108,36])
Sculpture (G: [33,-11], R: [140,-27])
Self Portrait (G: [79,29], R: [176,37])
Train (G: [42,6], R: [87,32])