Analysis of a binary image¶

This tutorial explains how to analyze a 2D binary image by persistent homology. You will learn the following topics in this tutorial.

  1. How to compute a PD (Persistence Diagram) from an image
  2. How to visualize the PD
  3. How to output birth-death pairs in the text format
  4. How to apply basic inverse analysis called birth pixels and death pixels
  5. How to use high-level inverse analysis called optimal volume

The PD is computed from a filtration constructed by the signed distance transform. This means that the black area in the binary image is shrunk and expanded, and HomCloud captures the appearance and disappearance of connected components and holes along the way. Please see Section 2.3 in https://arxiv.org/abs/1706.10082 .
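To make the signed distance transform concrete, here is a minimal brute-force sketch on a toy image. It is not HomCloud's implementation; the function name signed_manhattan_distance is hypothetical, and the sign convention (negative inside the True area, positive outside) is an assumption chosen to match the negative birth times seen later in this tutorial.

```python
import numpy as np


def signed_manhattan_distance(binary):
    """Brute-force signed Manhattan distance for a small boolean array.

    Assumed convention: a True pixel gets minus the distance to the nearest
    False pixel, and a False pixel gets the distance to the nearest True pixel.
    The array must contain both True and False pixels.
    """
    binary = np.asarray(binary, dtype=bool)
    true_points = np.argwhere(binary)
    false_points = np.argwhere(~binary)
    result = np.zeros(binary.shape, dtype=int)
    for y, x in np.ndindex(binary.shape):
        inside = binary[y, x]
        targets = false_points if inside else true_points
        distance = np.abs(targets - (y, x)).sum(axis=1).min()
        result[y, x] = -distance if inside else distance
    return result


# A 5x5 toy image with a 3x3 True square in the middle
toy = np.zeros((5, 5), dtype=bool)
toy[1:4, 1:4] = True
signed = signed_manhattan_distance(toy)
print(signed)
```

Thresholding this array at increasing levels first shows only the deepest interior pixel, then the whole square, and finally the area around it, which is the shrinking and expanding described above.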

How to compute a PD¶

The example image file is binary-image.png. First, we display the image.

In [1]:
# Load a module for displaying an image.
# The IPython package has some useful modules for Jupyter notebooks;
# we import the function "display" and the class "Image" to display an image.
from IPython.display import display, Image
In [2]:
%matplotlib inline
In [3]:
display(Image("binary-image.png"))

The imageio library is used to load the image.

In [4]:
# Import imageio library
import imageio

Load the image. This image has R, G, B components, so we read the data as grayscale by passing mode="L".

In [5]:
image = imageio.v3.imread("binary-image.png", mode="L")
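As a rough illustration of what a grayscale conversion does, the sketch below applies the ITU-R 601-2 luma weights that Pillow documents for its "L" mode. That imageio delegates to Pillow here is an assumption, and the helper luma is hypothetical; the real conversion also rounds to integers.

```python
def luma(r, g, b):
    """Approximate luma of one RGB pixel (weights 0.299, 0.587, 0.114)."""
    return (299 * r + 587 * g + 114 * b) / 1000


black = luma(0, 0, 0)
white = luma(255, 255, 255)
pure_red = luma(255, 0, 0)
print(black, white, pure_red)
```

For a purely black-and-white image the result is simply 0 or 255, but a saturated color such as pure red maps to a fairly dark gray.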

Now we show the histogram of pixel values.

In [6]:
import matplotlib.pyplot as plt
import numpy as np
In [7]:
plt.hist(np.ravel(image), range=(0,256), bins=256); None

The pixel values are 0 or 255, which means that the image is already binarized. HomCloud accepts a boolean array as a binary image, so we need to convert image into a boolean array. NumPy's comparison operators produce such a boolean array. We use 128 as the threshold for binarization. In this tutorial we focus on the black area, so image < 128 is used.

In [8]:
binary_image = image < 128
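The same comparison on a tiny hand-made array shows what the result looks like. The array gray below is a made-up stand-in for the loaded image.

```python
import numpy as np

# A made-up 2x3 grayscale image containing only 0 (black) and 255 (white)
gray = np.array([[0, 255, 0],
                 [255, 255, 0]], dtype=np.uint8)

# True marks the black pixels
mask = gray < 128
print(mask)
print(mask.dtype, mask.sum())
```

Using image >= 128 instead would select the white area.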

Now we compute the PD.

In [9]:
# Load the HomCloud library. "hc" is an alias of homcloud.interface
import homcloud.interface as hc
In [10]:
hc.PDList.from_bitmap_levelset(hc.distance_transform(binary_image, signed=True), save_to="binary-image.pdgm")
Out[10]:
PDList(path=binary-image.pdgm)

hc.PDList.from_bitmap_levelset, applied to the output of hc.distance_transform, computes the PD. A file named binary-image.pdgm is generated; it holds the information about the persistence diagrams. You can specify the file name with the save_to=... argument. signed=True is used to shrink and expand the black area (this option should always be True unless you have a special reason).

The following code loads the file.

In [11]:
pdlist = hc.PDList("binary-image.pdgm")

In fact, the return value of hc.PDList.from_bitmap_levelset is the same as pdlist, so you don't need to load the file in this case. Because computing PDs is relatively expensive, we recommend saving the result into a file and loading it when analyzing.
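The save-then-load recommendation can be wrapped in a small helper. This is a generic sketch, not part of HomCloud: load_or_compute, toy_compute and toy_load are hypothetical names, and the toy functions only write and read a text file so that the example runs on its own.

```python
import tempfile
from pathlib import Path


def load_or_compute(path, compute, load):
    """Run compute(path) only if the file is missing, then return load(path)."""
    path = Path(path)
    if not path.exists():
        compute(path)
    return load(path)


calls = []  # records how often the expensive step actually runs


def toy_compute(path):
    calls.append(path)
    path.write_text("expensive result")


def toy_load(path):
    return path.read_text()


with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp) / "result.txt"
    first = load_or_compute(target, toy_compute, toy_load)
    second = load_or_compute(target, toy_compute, toy_load)

print(first, len(calls))
```

With HomCloud, compute would presumably wrap the from_bitmap_levelset call with save_to set to the path, and load would wrap hc.PDList.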

How to visualize a PD¶

Next, we visualize the 0th PD. We investigate the black connected components in the input image.

In [12]:
pdlist.dth_diagram(0).histogram().plot(colorbar={"type":"log"})

pdlist has the PDs of all degrees. We get the 0th PD by dth_diagram(0), construct a 2D histogram by .histogram(), and plot the histogram by .plot(...). We specify the colorbar by colorbar={"type":"log"}, which means that a log-scale colorbar is used. This is the same procedure as for a point cloud.

Some small points appear in the figure. In fact, we use the Manhattan distance for the computation, so all birth times and death times are integers. The default resolution is therefore too fine for the visualization, and we now adjust it. The arguments x_range=(-20.5, 7.5), x_bins=28 change the range and the number of bins so that each bin is 1x1 and is centered on an integer point.

In [13]:
pdlist.dth_diagram(0).histogram(x_range=(-20.5, 7.5), x_bins=28).plot(colorbar={"type":"log"})

Since the y_range and y_bins arguments are not given, the range and the number of bins on the y-axis are set to the same values as on the x-axis.
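The choice of x_range=(-20.5, 7.5) and x_bins=28 can be checked with a little arithmetic. The sketch below only uses NumPy and assumes the bins are equally spaced over the range.

```python
import numpy as np

x_min, x_max, n_bins = -20.5, 7.5, 28

# 28 equally spaced bins need 29 edges
edges = np.linspace(x_min, x_max, n_bins + 1)
widths = np.diff(edges)
centers = (edges[:-1] + edges[1:]) / 2

print(widths[:3])
print(centers[:3], centers[-3:])
```

Every bin has width 1 and the centers run over the integers from -20 to 7, so each integer birth or death time falls in the middle of its own bin.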

To save the figure into a file, use matplotlib's savefig function.

In [14]:
pdlist.dth_diagram(0).histogram(x_range=(-20.5, 7.5), x_bins=28).plot(colorbar={"type":"log"})
plt.savefig("binary-image-pd0.png")

Inverse analysis (birth pixels and death pixels)¶

You can see the birth times and death times through the attributes births and deaths.

In [15]:
pd = pdlist.dth_diagram(0)
pd.births, pd.deaths
Out[15]:
(array([-19., -18., -18., -18., -18., -18., -18., -18., -18., -16., -16.,
        -14., -14., -14., -14., -14., -14., -14., -14., -14., -14., -14.,
        -13., -11.,  -9.,  -9.,  -8.,  -8.,  -8.,  -7.,  -7.,  -6.,  -6.,
         -6.,  -6.,  -6.,  -6.,  -6.,  -6.,  -6.,  -6.,  -6.,  -6.,  -6.,
         -6.,  -6.,  -7.,  -6.,  -6.,  -5.,  -5.,  -5.,  -5.,  -5.,  -5.,
         -5.,  -6.,  -5., -11.,  -5.,  -5.,  -5.,  -5.,  -5.,  -5.,  -5.,
         -5.,  -5.,  -5.,  -5.,  -5.,  -8.,  -5.,  -5.,  -5.,  -5.,  -5.,
         -5.,  -5.,  -5.,  -5.,  -5.,  -5.,  -6.,  -5.,  -5.,  -5.,  -5.,
         -5.,  -5.,  -5.,  -5.,  -5.,  -5.,  -5.,  -5.,  -5.,  -5.,  -5.,
         -5.,  -5.,  -5.,  -5.,  -5.,  -5.,  -6.,  -6.,  -5.,  -5.,  -6.,
        -15.]),
 array([-18., -17., -17., -17., -17., -17., -17., -17., -17., -15., -15.,
        -13., -13., -13., -13., -13., -13., -13., -13., -13., -13., -13.,
        -12., -10.,  -8.,  -8.,  -7.,  -7.,  -7.,  -6.,  -6.,  -5.,  -5.,
         -5.,  -5.,  -5.,  -5.,  -5.,  -5.,  -5.,  -5.,  -5.,  -5.,  -5.,
         -5.,  -5.,  -5.,  -5.,  -5.,  -4.,  -4.,  -4.,  -4.,  -4.,  -4.,
         -4.,  -4.,  -4.,  -4.,  -4.,  -4.,  -4.,  -4.,  -4.,  -4.,  -4.,
         -4.,  -4.,  -4.,  -4.,  -4.,  -4.,  -4.,  -4.,  -4.,  -4.,  -4.,
         -4.,  -4.,  -4.,  -4.,  -4.,  -4.,  -4.,  -4.,  -4.,  -4.,  -4.,
         -4.,  -4.,  -4.,  -4.,  -4.,  -4.,  -4.,  -4.,  -4.,  -4.,  -4.,
         -4.,  -4.,  -4.,  -4.,  -4.,  -4.,  -4.,  -4.,  -4.,  -4.,   5.,
          6.]))
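Because births and deaths are plain NumPy arrays, one possible way to output the birth-death pairs in text format is numpy.savetxt. The sketch below writes to an in-memory buffer and uses three made-up pairs as stand-ins for pd.births and pd.deaths; passing a file name instead of the buffer would write a file.

```python
import io

import numpy as np

# Stand-ins for pd.births and pd.deaths
births = np.array([-19.0, -18.0, -16.0])
deaths = np.array([-18.0, -17.0, -15.0])

buffer = io.StringIO()
# One "birth death" pair per line
np.savetxt(buffer, np.column_stack([births, deaths]), fmt="%g", header="birth death")
text = buffer.getvalue()
print(text)
```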

From the above PD, you see that many birth-death pairs are concentrated at (-5, -4). Now we try to analyze these pairs. We use birth pixels and death pixels. Please see Section 2.3 in https://arxiv.org/abs/1706.10082 . In this paper, birth/death pixels are called birth/death positions.
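The concentration of pairs can also be read off numerically by counting identical (birth, death) pairs. The sketch below uses a few made-up values as stand-ins for pd.births and pd.deaths.

```python
from collections import Counter

# Stand-ins for pd.births and pd.deaths
births = [-19.0, -6.0, -5.0, -5.0, -5.0]
deaths = [-18.0, -5.0, -4.0, -4.0, -4.0]

counts = Counter(zip(births, deaths))
most_common_pair, multiplicity = counts.most_common(1)[0]
print(most_common_pair, multiplicity)
```

This works because all birth and death times here are integer-valued; for non-integer data, a histogram with suitable bins is the safer tool.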

In HomCloud, we get birth positions through the birth_positions attribute.

In [16]:
pd.birth_positions
Out[16]:
[[57, 121],
 [47, 112],
 [48, 113],
 [49, 114],
 [50, 115],
 [51, 116],
 [52, 117],
 [53, 118],
 [46, 111],
 [73, 111],
 [74, 110],
 [40, 112],
 [83, 113],
 [84, 114],
 [85, 115],
 [86, 116],
 [87, 117],
 [88, 118],
 [89, 119],
 [90, 120],
 [91, 121],
 [92, 122],
 [37, 114],
 [9, 112],
 [75, 98],
 [74, 97],
 [47, 96],
 [50, 94],
 [71, 95],
 [48, 34],
 [68, 93],
 [26, 55],
 [28, 105],
 [36, 30],
 [51, 27],
 [51, 36],
 [52, 26],
 [53, 25],
 [54, 24],
 [55, 23],
 [56, 22],
 [57, 21],
 [58, 20],
 [59, 12],
 [59, 19],
 [58, 11],
 [62, 15],
 [103, 31],
 [104, 32],
 [9, 35],
 [10, 36],
 [11, 37],
 [18, 45],
 [19, 46],
 [20, 48],
 [21, 49],
 [25, 54],
 [25, 103],
 [7, 111],
 [28, 17],
 [30, 58],
 [31, 23],
 [32, 25],
 [27, 15],
 [44, 79],
 [49, 1],
 [50, 3],
 [48, 0],
 [73, 81],
 [75, 82],
 [82, 85],
 [44, 32],
 [84, 86],
 [86, 87],
 [87, 67],
 [87, 88],
 [91, 69],
 [92, 48],
 [93, 70],
 [94, 49],
 [94, 71],
 [94, 95],
 [95, 50],
 [98, 74],
 [96, 51],
 [99, 53],
 [42, 80],
 [102, 79],
 [103, 81],
 [104, 83],
 [105, 84],
 [106, 85],
 [108, 35],
 [109, 36],
 [111, 37],
 [112, 38],
 [112, 64],
 [114, 39],
 [115, 40],
 [116, 67],
 [117, 41],
 [117, 68],
 [118, 42],
 [119, 98],
 [120, 70],
 [127, 105],
 [127, 49],
 [121, 71],
 [124, 73],
 [4, 29],
 [99, 20]]

We can also get death positions by pd.death_positions.

Please note that each coordinate is ordered as (y, x). This follows NumPy's indexing convention, in which the row index comes first.
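A small NumPy example of the (y, x) order, using a made-up array rather than the tutorial image:

```python
import numpy as np

# 3 rows (y direction) and 5 columns (x direction)
grid = np.zeros((3, 5), dtype=bool)

y, x = 1, 4
grid[y, x] = True  # row index first, then column index

positions = np.argwhere(grid).tolist()
print(grid.shape, positions)
```

Plotting libraries usually expect (x, y), so the two values need to be swapped before passing a position to a plotting function.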

The above attributes give the array of birth times, the array of death times, the array of birth positions, and the array of death positions separately. Sometimes you want a single object that holds the birth and death times together with the birth and death positions. The hc.Pair class holds this information, and pd.pairs() returns the list of Pair objects.

In [17]:
pairs = pd.pairs()
In [18]:
pairs[0].birth_time(), pairs[0].death_time()
Out[18]:
(-19.0, -18.0)
In [19]:
pairs[0].birth_position
Out[19]:
[57, 121]
In [20]:
pairs[0].death_position
Out[20]:
[56, 121]

Now we select all (-5, -4) pairs from the list. Python's list comprehension expression is useful for that purpose.

In [21]:
pairs_m5_m4 = [pair for pair in pairs if pair.birth_time() == -5 and pair.death_time() == -4]
In [22]:
pairs_m5_m4
Out[22]:
[Pair(-5.0, -4.0),
 Pair(-5.0, -4.0),
 Pair(-5.0, -4.0),
 Pair(-5.0, -4.0),
 Pair(-5.0, -4.0),
 Pair(-5.0, -4.0),
 Pair(-5.0, -4.0),
 Pair(-5.0, -4.0),
 Pair(-5.0, -4.0),
 Pair(-5.0, -4.0),
 Pair(-5.0, -4.0),
 Pair(-5.0, -4.0),
 Pair(-5.0, -4.0),
 Pair(-5.0, -4.0),
 Pair(-5.0, -4.0),
 Pair(-5.0, -4.0),
 Pair(-5.0, -4.0),
 Pair(-5.0, -4.0),
 Pair(-5.0, -4.0),
 Pair(-5.0, -4.0),
 Pair(-5.0, -4.0),
 Pair(-5.0, -4.0),
 Pair(-5.0, -4.0),
 Pair(-5.0, -4.0),
 Pair(-5.0, -4.0),
 Pair(-5.0, -4.0),
 Pair(-5.0, -4.0),
 Pair(-5.0, -4.0),
 Pair(-5.0, -4.0),
 Pair(-5.0, -4.0),
 Pair(-5.0, -4.0),
 Pair(-5.0, -4.0),
 Pair(-5.0, -4.0),
 Pair(-5.0, -4.0),
 Pair(-5.0, -4.0),
 Pair(-5.0, -4.0),
 Pair(-5.0, -4.0),
 Pair(-5.0, -4.0),
 Pair(-5.0, -4.0),
 Pair(-5.0, -4.0),
 Pair(-5.0, -4.0),
 Pair(-5.0, -4.0),
 Pair(-5.0, -4.0),
 Pair(-5.0, -4.0),
 Pair(-5.0, -4.0),
 Pair(-5.0, -4.0),
 Pair(-5.0, -4.0),
 Pair(-5.0, -4.0),
 Pair(-5.0, -4.0),
 Pair(-5.0, -4.0),
 Pair(-5.0, -4.0),
 Pair(-5.0, -4.0),
 Pair(-5.0, -4.0),
 Pair(-5.0, -4.0)]
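If you need this selection for several (birth, death) values, the list comprehension can be wrapped in a small function. In the sketch below, ToyPair is a hypothetical stand-in for hc.Pair that only mimics the birth_time() and death_time() methods used above, and select_pairs is a hypothetical helper name.

```python
from dataclasses import dataclass


@dataclass
class ToyPair:
    """Hypothetical stand-in for hc.Pair, for illustration only."""
    birth: float
    death: float

    def birth_time(self):
        return self.birth

    def death_time(self):
        return self.death


def select_pairs(pairs, birth, death):
    """Return the pairs whose birth and death times equal the given values."""
    return [p for p in pairs if p.birth_time() == birth and p.death_time() == death]


toy_pairs = [ToyPair(-19.0, -18.0), ToyPair(-5.0, -4.0),
             ToyPair(-6.0, -5.0), ToyPair(-5.0, -4.0)]
selected = select_pairs(toy_pairs, -5, -4)
print(len(selected))
```

Exact comparison with == is safe here only because the times are integer-valued.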

Looks good. We plot the birth positions of these birth-death pairs on the input image. draw_birthdeath_pixels_2d is available for this purpose.

In [23]:
birth_pixels_image = hc.draw_birthdeath_pixels_2d(
    pairs_m5_m4, "binary-image.png", draw_birth=True
)
display(birth_pixels_image)

The first argument is the list of pairs to plot, and the second argument is the background image. draw_birth=True is given to draw birth pixels. If you want to draw death pixels, please give the argument draw_death=True.

Red points are drawn on the black roads. This means that the (-5, -4) pairs correspond to these black road shapes.

Advanced inverse analysis¶

Each pair in the 0th PD corresponds to a connected component (an island), but the birth and death pixels of the pair each show only a single pixel. Although birth/death pixels are easy to use and cheap to compute, they carry little information. We want to see the islands themselves. These regions are called volumes in this tutorial, and HomCloud can compute them.
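To illustrate the difference between a single pixel and a whole island, the sketch below collects the connected component around a starting pixel with a plain breadth-first flood fill. This is not the algorithm HomCloud uses for volumes; the function name component_containing and the use of 4-neighbor connectivity are assumptions made for the illustration.

```python
from collections import deque


def component_containing(binary, start):
    """Return the set of (y, x) pixels 4-connected to start among truthy pixels."""
    height, width = len(binary), len(binary[0])
    if not binary[start[0]][start[1]]:
        return set()
    seen = {start}
    queue = deque([start])
    while queue:
        y, x = queue.popleft()
        for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
            inside = 0 <= ny < height and 0 <= nx < width
            if inside and binary[ny][nx] and (ny, nx) not in seen:
                seen.add((ny, nx))
                queue.append((ny, nx))
    return seen


# A toy image with two islands of 1s
toy_islands = [
    [1, 1, 0, 0],
    [1, 0, 0, 1],
    [0, 0, 1, 1],
]
island = component_containing(toy_islands, (0, 0))
other = component_containing(toy_islands, (2, 3))
print(sorted(island), sorted(other))
```

A single starting pixel, playing the role of a birth pixel, expands into the whole island, which is the kind of region one wants to see.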

The BitmapPHTrees class is available for this. The following Python code computes the information.

In [24]:
hc.BitmapPHTrees.for_bitmap_levelset(hc.distance_transform(binary_image, signed=True), save_to="binary-image-tree.pdgm")
Out[24]:
PDList(path=binary-image-tree.pdgm)

The data is loaded as follows.

In [25]:
phtrees = hc.PDList("binary-image-tree.pdgm").bitmap_phtrees(0)

In this object (an instance of the BitmapPHTrees class), the information is represented by trees, and all birth-death pairs are represented by tree nodes. With the pair_nodes_in_rectangle method we get all nodes corresponding to (-5, -4).

In [26]:
nodes = phtrees.pair_nodes_in_rectangle(-5, -5, -4, -4)

The nodes are visualized by draw_volumes_on_2d_image.

The meaning of the arguments is as follows:

  • nodes - Nodes in BitmapPHTrees
  • "binary-image.png" - The background image
  • color=(255, 0, 0) - The color for drawing the volumes, in (R, G, B) format.
  • alpha=0.5 - Alpha value for drawing the volumes.
  • birth_position=(0, 255, 0) - The birth positions are drawn in green. If this argument is not given, the birth positions are not drawn.
In [27]:
hc.draw_volumes_on_2d_image(nodes, "binary-image.png", color=(255, 0, 0), alpha=0.5, 
                            birth_position=(0, 255, 0))

The volumes are drawn in light red, and the birth positions are drawn in green, as given by birth_position=(0, 255, 0).
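The effect of the alpha value can be sketched with the standard alpha-blending formula, result = (1 - alpha) * background + alpha * color. Whether HomCloud blends in exactly this way is an assumption; the helper blend below is hypothetical.

```python
import numpy as np


def blend(background_rgb, color_rgb, alpha):
    """Blend one RGB color over a background pixel with the given opacity."""
    background = np.asarray(background_rgb, dtype=float)
    color = np.asarray(color_rgb, dtype=float)
    return (1 - alpha) * background + alpha * color


on_white = blend((255, 255, 255), (255, 0, 0), 0.5)
on_black = blend((0, 0, 0), (255, 0, 0), 0.5)
print(on_white, on_black)
```

Under this formula, red at alpha=0.5 gives a light red over white pixels and a dark red over black pixels. A larger alpha makes the overlay more opaque.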

The tutorial ends here. Thank you for your attention.