Analysis of a grayscale image¶
This tutorial explains how to analyze a 2D grayscale image by persistent homology. You learn the following topics in this tutorial. The topics are similar to the analysis of a binary image.
- How to compute a PD (Persistence Diagram) from a image
- How to visualize the PD
- How to output birth-death pairs in the text format
- How to apply basic inverse analysis called birth pixels and death pixels
- How to use high-level inverse analysis called optimal volume
In this tutrial, the input data is text data, but the methods learned in this tutorial is applicable to other types of images.
About input data¶
The data file is grayscale.txt
. Numbers are ordered 200x200. The tutorial analyze the file.
We see the data at first.
import numpy as np # numpy ライブラリの読み込み
import matplotlib.pyplot as plt # 可視化のため matplitlib の読み込み
%matplotlib inline
# Load the data
pict = np.loadtxt("grayscale.txt")
# The shape of the data (200x200)
pict.shape
(200, 200)
pict
array([[-1.73075897, -1.71385852, -1.69163311, ..., -1.82802151, -1.81327553, -1.79317707], [-1.73451332, -1.71938114, -1.69941846, ..., -1.78830493, -1.7756238 , -1.75860343], [-1.7414541 , -1.72962206, -1.71387809, ..., -1.72829107, -1.72211076, -1.71168612], ..., [-0.46998849, -0.46657936, -0.45652418, ..., 0.9402477 , 0.95085668, 0.96084144], [-0.46945536, -0.46409205, -0.44845549, ..., 0.89788258, 0.90681179, 0.9185174 ], [-0.46899705, -0.4622184 , -0.44248941, ..., 0.86943194, 0.87788647, 0.8911879 ]])
This is a 200x200 array.
Before computing a PD, we visualize the data.
plt.imshow(pict, "gray")
<matplotlib.image.AxesImage at 0x7ff871c5b470>
We check the distribution of pixel values.
np.min(pict), np.max(pict), np.mean(pict), np.std(pict)
(-1.8280215091732186, 3.72602997298875, -6.821210263296961e-17, 1.0)
The pixel values is in the range of (-1.8, 3.7), their mean is about 0, and the standard deviation is about 1. The following code displays the histogram of pixel values.
plt.hist(pict.ravel(), range=(-4, 4), bins=128); None
There are some peaks.
How to compute PDs¶
Persistence diagrams can be computable by using homcloud.interface
module. The following
code imports the module and computes PDs.
import homcloud.interface as hc
hc.PDList.from_bitmap_levelset(pict, "superlevel", save_to="grayscale.pdgm")
PDList(path=grayscale.pdgm)
hc.PDList.from_bitmap_levelset
computes levelset persistence. The first argument is an array (picture), the second argument
is "superleve" or "sublevel". The output file name is specified by save_to=...
.
How to visualize the PD¶
We visualize the 0th PD. The PD has the information about connected components or islands. Since we use a superlevel filtration, the peaks of the input can be captured.
The following code loads the PDs (the 0th and 1st PD) and the 0th PD is retrieved by dth_diagram(0)
.
pd = hc.PDList("grayscale.pdgm").dth_diagram(0)
The following code is common to pointclouds.
pd.histogram().plot(colorbar={"type": "log"})
The grids are too file and we feel it is a bit difficult to understand the histogram (The default resolution is 128x128). Now we change the resolution.
pd.histogram(x_bins=64).plot(colorbar={"type": "log"})
In this diagram, birth-death pairs appear in the right-down side. This is because the threshold changes from high to low (this is the definition of the superlevel filtration).
Exercise: Many birth-death pairs are scattered at the region in [0,1.5]x[0,1.5]
. Try to zoom up these region.
How to see birth times and death times¶
You can output pairs in the same way as pointclouds and binary images.
np.vstack([pd.births, pd.deaths]).transpose()
array([[ 3.72049835, 3.7197502 ], [ 3.7163519 , 3.71581448], [ 3.62199957, 3.61398173], [ 1.17646489, 1.17569729], [ 1.17565024, 1.17560317], [ 1.17677567, 1.17489635], [ 1.20250959, 1.17155953], [ 1.16070165, 1.1514168 ], [ 1.1448862 , 1.13935918], [ 1.10316815, 1.10012108], [ 1.09772847, 1.09706687], [ 1.08018771, 1.07677108], [ 1.06793501, 1.06744707], [ 1.10463065, 1.04357627], [ 0.88167134, 0.87714672], [ 0.87779258, 0.87464922], [ 0.87221495, 0.87149484], [ 0.87378846, 0.87081415], [ 0.84911391, 0.79822282], [ 0.78615028, 0.78444829], [ 0.81883079, 0.7472538 ], [ 1.21522446, 0.74060972], [ 0.71483203, 0.71049393], [ 0.76913211, 0.70963995], [ 0.71656008, 0.7091577 ], [ 0.68458576, 0.67847277], [ 0.58538702, 0.584013 ], [ 0.57038128, 0.5610367 ], [ 0.56187367, 0.5578117 ], [ 0.57912716, 0.5437553 ], [ 0.51183726, 0.5114067 ], [ 0.51330706, 0.50602736], [ 0.52295205, 0.50019607], [ 0.43017107, 0.40577379], [ 0.40833399, 0.36977646], [ 0.39775193, 0.35347578], [ 0.93587969, 0.34877654], [ 0.33259517, 0.33257242], [ 0.34220866, 0.32492189], [ 0.32863671, 0.31583634], [ 0.32598309, 0.30275411], [ 0.33318458, 0.30085797], [ 1.29888048, 0.23855504], [ 0.22924011, 0.21786746], [ 0.24130412, 0.2082839 ], [ 0.23668959, 0.19994758], [ 0.21192509, 0.19917353], [ 0.29015713, 0.19807269], [ 0.15068774, 0.06964889], [ 0.06627242, 0.06472002], [ 0.06179495, 0.06084787], [ 1.03477437, 0.04809407], [ 0.03062501, 0.02347062], [ 0.03864399, -0.0397299 ], [-0.10149762, -0.1128864 ], [-0.1240269 , -0.12764439], [-0.16523646, -0.18459246], [-0.18987744, -0.19232931], [-0.18800893, -0.19267836], [-0.14841084, -0.19376773], [-0.09791811, -0.19904108], [-0.23042979, -0.24591754], [-0.32730138, -0.33949227], [-0.37450332, -0.37519826], [-0.39194886, -0.39529467], [-0.34620351, -0.40272859], [-0.40412379, -0.43159528], [-0.41042166, -0.44578582], [-0.46482568, -0.46497538], [-0.29606023, -0.46951701], [-0.41233545, -0.47659197], [-0.46199888, -0.47779584], [-0.47777149, -0.48532429], [-0.51266144, -0.51710601], [-0.51813031, -0.51852311], [-0.51736249, -0.52223673], [-0.59607204, -0.63550974], [-0.56677649, -0.65532262], [-0.80571 , -0.83893901], [-0.74883863, -0.84006845], [-1.13211692, -1.1524443 ], [-1.1729987 , -1.19613943], [-1.214334 , -1.21494912], [-1.2109894 , -1.22049237], [-1.19800118, -1.23013554], [-1.2360744 , -1.25939254], [-1.21375775, -1.26177874], [-1.29852304, -1.2998627 ], [-1.29742081, -1.31664078], [-1.31813159, -1.36734127], [-1.35644958, -1.38143293], [-1.39827581, -1.41087179], [-1.44162259, -1.46342071], [-1.45418752, -1.50165077], [-1.52926759, -1.54008366], [-1.54135963, -1.58319969], [-1.58842435, -1.59080461], [-1.59300271, -1.59844884], [-1.56682931, -1.60183314], [-1.52328 , -1.62831348], [-1.60949457, -1.63901185], [-1.59958114, -1.64762312], [-1.4799324 , -1.65144922], [-1.6797181 , -1.70790571]])
The first column has birth times, and the second column has death times. You see the a birth time in each row is larger than the death time of the same row.
pd.essential_births
array([3.72602997])
essential_births
has birth times of birth-death pairs whose death time is $-\infty$.
Any 0th PD has such a pair because one connected component never die.
Simple inverse analysis (birth pixels, death pixels)¶
Now we investigate the birth-death pairs with $(\textrm{lifetime}) = (\textrm{death}) - (\textrm{birth}) < -0.3$. These pairs may be important since these pairs are further from diagram than other pairs.
You can apply the inverse analysis methods in the same way as binary images.
The following codes pick up all pairs whose lifetime is less than -0.3.
pairs = [pair for pair in pd.pairs() if pair.lifetime() < -0.3]
pairs
[Pair(1.2152244583757872, 0.7406097154584625), Pair(0.9358796930939498, 0.3487765394031042), Pair(1.2988804792640827, 0.2385550375924738), Pair(1.03477437479703, 0.04809406503892041)]
The birth pixels and death pixels are drawn by the following code.
hc.draw_birthdeath_pixels_2d(
pairs, pict, draw_birth=True, draw_death=True, draw_line=True
)
The red points are the birth pixels, the blue points are the death pixels, and green lines connect the corresponding birth pixels and death pixels.
You see that the birth pixels are located on the peaks of white areas. But the brightest area is not marked. What happened? In fact, the brightest area has a birth-deapth pair whose death time is $-\infty$ and such a pair is not handled in HomCloud by default. Therefore the brightest area is not marked.
Advanced inverse analysis¶
BitmapPHTreesPair
is used for advanced inverse analysis (called volume). This class is available for the analysis of
levelset persistence.
BitmapPHTreesPair.from_bitmap_levelset
is used. The first and second arguments are the same as PDList.from_bitmap_levelset
.
save_to=...
specifies the output file name. The extension should be .p2mt
.
hc.BitmapPHTrees.for_bitmap_levelset(pict, "superlevel", save_to="grayscale-tree.pdgm")
PDList(path=grayscale-tree.pdgm)
We can load the data by the following code and get the information of the 0th PD by dim_0_trees()
.
phtrees = hc.PDList("grayscale-tree.pdgm").bitmap_phtrees(0)
In fact, internally, the information is represented by trees and each tree node corresponds to a birth-death pair. I also mention that the information of the 0th PD and (n-1)th PD is computed when your input data is n dimensional data.
If you want to get the (n-1)th PD's information, please use codim_1_trees
method.
All nodes whose lifetime is less than -0.3.
[node for node in phtrees.nodes if node.lifetime() < -0.3]
[BitmapPHTrees.Node(1.2988804792640827, 0.2385550375924738), BitmapPHTrees.Node(1.2152244583757872, 0.7406097154584625), BitmapPHTrees.Node(0.9358796930939498, 0.3487765394031042), BitmapPHTrees.Node(3.72602997298875, -inf), BitmapPHTrees.Node(1.03477437479703, 0.04809406503892041)]
A node appear whose death time is $-\infty$. The volume for the node is the whole image and it is not useful if the volume is visualized. Therefore we remove the node as follows.
nodes = [node for node in phtrees.nodes if node.lifetime() < -0.3 and node.death_time() != -np.inf]
Finally we call the function draw_volumes_on_2d_image
for visualziation. This is the same way as binary images.
hc.draw_volumes_on_2d_image(nodes, pict, color=(255, 0, 0), alpha=0.2, birth_position=(255,0,0), marker_size=3)
This tutorial stops here. Try to analyze your own data.