This notebook demonstrates training data generation for a combined denoising and upsampling task on synthetic 3D data, for the case where corresponding pairs of isotropic low- and high-quality stacks can be acquired. Anisotropic distortions along the Z axis will be simulated for the low-quality stack, such that a CARE model trained on this data can be applied to images with anisotropic resolution along Z.
We will use only a few synthetically generated stacks for training data generation here; in your own application, you should aim to use stacks from different developmental timepoints to ensure a well-trained model.
More documentation is available at http://csbdeep.bioimagecomputing.com/doc/.
from __future__ import print_function, unicode_literals, absolute_import, division
import numpy as np
import matplotlib.pyplot as plt
%matplotlib inline
%config InlineBackend.figure_format = 'retina'
from tifffile import imread
from csbdeep.utils import download_and_extract_zip_file, plot_some, axes_dict
from csbdeep.io import save_training_data
from csbdeep.data import RawData, create_patches
from csbdeep.data.transform import anisotropic_distortions
First we download some example data, consisting of synthetic 3D stacks with membrane-like structures.
download_and_extract_zip_file (
url = 'http://csbdeep.bioimagecomputing.com/example_data/synthetic_upsampling.zip',
targetdir = 'data',
)
Files missing, downloading... extracting... done.
data:
- synthetic_upsampling
- synthetic_upsampling/test_stacks_sub_4
- synthetic_upsampling/test_stacks_sub_4/stack_low_sub_4_03.tif
- synthetic_upsampling/training_stacks
- synthetic_upsampling/training_stacks/high
- synthetic_upsampling/training_stacks/high/stack_01.tif
- synthetic_upsampling/training_stacks/high/stack_02.tif
- synthetic_upsampling/training_stacks/high/stack_00.tif
- synthetic_upsampling/training_stacks/low
- synthetic_upsampling/training_stacks/low/stack_01.tif
- synthetic_upsampling/training_stacks/low/stack_02.tif
- synthetic_upsampling/training_stacks/low/stack_00.tif
We plot XY and XZ slices of a training stack pair:
y = imread('data/synthetic_upsampling/training_stacks/high/stack_00.tif')
x = imread('data/synthetic_upsampling/training_stacks/low/stack_00.tif')
print('image size =', x.shape)
plt.figure(figsize=(16,15))
plot_some(np.stack([x[5],y[5]]),
title_list=[['XY slice (low)','XY slice (high)']],
pmin=2,pmax=99.8);
plt.figure(figsize=(16,15))
plot_some(np.stack([np.moveaxis(x,1,0)[50],np.moveaxis(y,1,0)[50]]),
title_list=[['XZ slice (low)','XZ slice (high)']],
pmin=2,pmax=99.8);
image size = (128, 512, 512)
We first need to create a RawData object, which defines how to get the pairs of low/high SNR stacks and the semantics of each axis (e.g. which one is considered a color channel, etc.).
Here we have two folders "low" and "high", where corresponding low- and high-SNR stacks are TIFF images with identical filenames. For this case, we can simply use RawData.from_folder and set axes = 'ZYX' to indicate the semantic order of the image axes.
raw_data = RawData.from_folder (
basepath = 'data/synthetic_upsampling/training_stacks',
source_dirs = ['low'],
target_dir = 'high',
axes = 'ZYX',
)
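The pairing convention used by RawData.from_folder can be illustrated with a small, self-contained sketch: source and target stacks are matched by identical filename in their respective folders. (This is a simplified illustration of the convention, not the library's implementation; the folder layout below is created on the fly.)

```python
from pathlib import Path
import tempfile

# Create a miniature folder layout like the one above (placeholder files only).
root = Path(tempfile.mkdtemp()) / 'training_stacks'
for sub in ('low', 'high'):
    (root / sub).mkdir(parents=True)
    for i in range(3):
        (root / sub / f'stack_{i:02d}.tif').touch()

# Pair source/target stacks by identical filename.
pairs = [(root / 'low' / p.name, root / 'high' / p.name)
         for p in sorted((root / 'low').glob('*.tif*'))]
for lo, hi in pairs:
    assert lo.name == hi.name and hi.exists()
print(len(pairs))  # 3
```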
Furthermore, if data along the Z axis will later be acquired with reduced resolution, we must define how to modify that axis to mimic a real microscope as closely as possible. To that end, we define a Transform object that takes our RawData as input and returns the modified image. Here, we use anisotropic_distortions to accomplish this.
The most important parameter is the subsampling factor along Z; for example, choose 4 if you plan to later acquire (low-SNR) images with 4-fold reduced axial resolution.
anisotropic_transform = anisotropic_distortions (
subsample = 4,
psf = None,
subsample_axis = 'Z',
yield_target = 'target',
)
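The effect of the transform on the Z axis can be previewed with plain NumPy indexing. This is a deliberately simplified sketch of 4-fold subsampling; the actual anisotropic_distortions transform is more careful (e.g. proper filtering/interpolation, and optional convolution with a PSF via the psf argument):

```python
import numpy as np

# A stack with the same (Z, Y, X) shape as our training data.
stack = np.zeros((128, 512, 512), dtype=np.float32)

# 4-fold subsampling along Z: keep every 4th slice.
subsampled = stack[::4]
print(stack.shape, '->', subsampled.shape)  # (128, 512, 512) -> (32, 512, 512)
```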
From the synthetically undersampled low-quality input stack and its corresponding high-quality stack, we now generate some 3D patches. As a general rule, use a patch size that is a power of two along XYZT, or at least divisible by 8. Typically, the more training stacks you have, the more patches you should use. By default, patches are sampled from non-background regions, i.e. those above a relative threshold; see the documentation of create_patches for details.
Note that the values (X, Y, XY_axes) returned by create_patches are not to be confused with the image axes X and Y. By convention, the variable name X (or x) refers to an input variable for a machine learning model, whereas Y (or y) indicates an output variable.
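The core sampling step can be sketched in plain NumPy before calling the real function. This simplified illustration ignores normalization and the background filter that create_patches applies, and uses random dummy stacks; it only shows that identical patch windows are cut from the source and target stacks:

```python
import numpy as np

rng = np.random.default_rng(0)
x_stack = rng.random((32, 512, 512))  # low-quality stack (Z, Y, X)
y_stack = rng.random((32, 512, 512))  # matching high-quality stack

def sample_patches(x, y, patch_size=(32, 64, 64), n=8):
    # Draw n random corner positions and cut identical patches from x and y.
    X, Y = [], []
    for _ in range(n):
        corner = [rng.integers(0, s - p + 1) for s, p in zip(x.shape, patch_size)]
        sl = tuple(slice(c, c + p) for c, p in zip(corner, patch_size))
        X.append(x[sl]); Y.append(y[sl])
    return np.stack(X), np.stack(Y)

Xp, Yp = sample_patches(x_stack, y_stack)
print(Xp.shape)  # (8, 32, 64, 64)
```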
X, Y, XY_axes = create_patches (
raw_data = raw_data,
patch_size = (32,64,64),
n_patches_per_image = 512,
transforms = [anisotropic_transform],
save_file = 'data/my_training_data.npz',
)
==================================================================
3 raw images x 1 transformations = 3 images
3 images x 512 patches per image = 1536 patches in total
==================================================================
Input data:
data/synthetic_upsampling/training_stacks: target='high', sources=['low'], axes='ZYX', pattern='*.tif*'
==================================================================
Transformations:
1 x Anisotropic distortion (along Z axis)
==================================================================
Patch size:
32 x 64 x 64
==================================================================
100%|██████████| 3/3 [00:26<00:00, 8.75s/it]
Saving data to data/my_training_data.npz.
assert X.shape == Y.shape
print("shape of X,Y =", X.shape)
print("axes of X,Y =", XY_axes)
shape of X,Y = (1536, 1, 32, 64, 64)
axes of X,Y = SCZYX
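To make the reported shape self-describing, the axes string can be zipped with the shape into a name-to-size mapping. This is a small convenience sketch using a placeholder array of the shape printed above; csbdeep.utils.axes_dict offers related functionality:

```python
import numpy as np

X = np.zeros((1536, 1, 32, 64, 64))  # placeholder with the shape reported above
XY_axes = 'SCZYX'

sizes = dict(zip(XY_axes, X.shape))
print(sizes)  # {'S': 1536, 'C': 1, 'Z': 32, 'Y': 64, 'X': 64}
```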
This shows a ZY slice of some of the generated patch pairs (odd rows: source, even rows: target)
for i in range(2):
plt.figure(figsize=(16,2))
sl = slice(8*i, 8*(i+1)), slice(None), slice(None), 0
plot_some(X[sl],Y[sl],title_list=[np.arange(sl[0].start,sl[0].stop)])
plt.show()
None;