We will be playing around with an image from a plenoptic camera and obtaining refocused images.
Also, we will create an all-focused image from this focal stack.
The codes are written in Python >=3.8 with the help of some basic libraries.
In order to run the code, first unzip the codes and run
pip install -r requirements.txt
in the desired Python enviornment.
import numpy as np
import cv2
from einops import rearrange
We will load the image from a plenoptic camera and save it as a formal lightfield format.
Each (u,v)
stands for a pinhole in the camera, where it captures an image (s, t)
.
# Read image
lightfield_image = cv2.imread("./data/chessboard_lightfield.png")
U, V = 16, 16
S, T = lightfield_image.shape[0] // U, lightfield_image.shape[1] // V
lightfield = np.zeros((U, V, S, T, 3), dtype=np.float32)
# Split lightfield image into UxV tiles
for s, t in np.ndindex(S, T):
lightfield[:, :, s, t, :] = lightfield_image[s * U:(s + 1) * U, t * V:(t + 1) * V, :]
The lightfield is a flexible format for multiple uses, but we will visualize it to observe the image for each pinhole view. This can be easily done through a single line code which rearranges the axes.
# Read image
# Select sub-aperture views
sub_aperture_views = rearrange(lightfield, "u v s t c -> (u s) (v t) c")
for u, v in np.ndindex(U, V):
cv2.imwrite(f'./debug/sub_aperture_view_{u}_{v}.png', lightfield[u, v, :, :, :])
cv2.imwrite('2_sub_aperture_views.png', sub_aperture_views)
We can obtain a partially focused image with a lightfield. We can simply take the average of all the sub-aperture views to obtain an image focused on the furthest distance. We can also sample the light rays from a different location to obtain an image focused on nearer regions.
# Mean refocusing
mean_refocusing = np.mean(lightfield, axis=(0, 1))
# Focal-stack generation
depth_list = [0, 0.25, 0.5, 0.75, 1.0, 1.25, 1.5]
D = len(depth_list)
focal_stack = np.zeros((S, T, 3, D), dtype=np.float32)
for d, depth in enumerate(depth_list):
for s, t in tqdm(np.ndindex(S, T), total=S*T, desc=f'Focal stack {d}'):
for u, v in np.ndindex(U, V):
shifted_s, shifted_t = int(min(s + (depth * u), S-1)), int(max(t - (depth * v), 0))
focal_stack[s, t, :, d] += lightfield[u, v, shifted_s, shifted_t, :]
focal_stack[:, :, :, d] /= U * V
focal_stack = np.load('focal_stack.npy', allow_pickle=True)
for d, depth in enumerate(depth_list):
cv2.imwrite(f'3_focal_stack_{depth:.2f}.png', focal_stack[:, :, :, d])
Note that the distnace d
ranges from 0 to 1.25, which is a parameter I had to find by myself.
Also, after carefull inspection of the sub-aperture views,
I concluded that the width/x-axis is inverted and took the negative of depth * v
.
Although we already have a all-focused image from the lightfield, we can also obtain one by merging the focal stack. We can do this by performing a weighted sum over the focal stack, where edges are given a higher weight. We experiment with multiple configurations for obtaining the edges, with different Gaussian filters. The image becomes blurries as the Gaussian filter becomes smoother.
std1, std2 = 1, 5
w_sharpness_stack = np.zeros((S, T, D), dtype=np.float32)
for d, depth in enumerate(depth_list):
luminance_Y = cv2.cvtColor(focal_stack[:, :, :, d], cv2.COLOR_BGR2XYZ)[:,:,1]
low_freq = cv2.GaussianBlur(luminance_Y, (5, 5), std1)
high_freq = luminance_Y - low_freq
w_sharpness_stack[:,:,d] = cv2.GaussianBlur(np.power(high_freq, 2), (5, 5), std2)
weighted_image = np.zeros((S, T, 3), dtype=np.float32)
weighted_depth = np.zeros((S, T, 3), dtype=np.float32)
weight_sum = np.zeros((S, T, 3), dtype=np.float32)
for d, depth in enumerate(depth_list):
weighted_image += np.expand_dims(w_sharpness_stack[:, :, d], axis=2) * focal_stack[:, :, :, d]
weighted_depth += d * np.expand_dims(w_sharpness_stack[:, :, d], axis=2)
weight_sum += np.expand_dims(w_sharpness_stack[:, :, d], axis=2)
all_focus_image = weighted_image / weight_sum
depth_map = weighted_depth / weight_sum
We can also obtain a pseudo depth map from this, which is obtained by bluring the weights with a Gaussian filter. However, the depth near strong edges have erros, which hinders its usage for actual depth estimation.