wsiprocess package
Subpackages
- wsiprocess.annotationparser package
- Submodules
- wsiprocess.annotationparser.ASAP_parser module
- wsiprocess.annotationparser.GeoJson_parser module
- wsiprocess.annotationparser.NDPView_parser module
- wsiprocess.annotationparser.QuPath_parser module
- wsiprocess.annotationparser.SlideRunner_parser module
SlideRunnerAnnotation
SlideRunnerAnnotation.path
SlideRunnerAnnotation.classes
SlideRunnerAnnotation.mask_coords
SlideRunnerAnnotation.bbox_to_circle()
SlideRunnerAnnotation.parse_mask_coords()
SlideRunnerAnnotation.read_annotations()
SlideRunnerAnnotation.read_classes()
SlideRunnerAnnotation.read_coordinates()
SlideRunnerAnnotation.read_labels()
SlideRunnerAnnotation.read_slides()
SlideRunnerAnnotation.verify_annotations()
- wsiprocess.annotationparser.WSIDissector_parser module
- wsiprocess.annotationparser.parser_utils module
- Module contents
- wsiprocess.converters package
- Submodules
- wsiprocess.converters.base_converter module
- wsiprocess.converters.wsiprocess_to_coco module
SaveTo
ToCOCOConverter
ToCOCOConverter.add_annotation()
ToCOCOConverter.add_categories()
ToCOCOConverter.add_category()
ToCOCOConverter.add_image()
ToCOCOConverter.add_images_and_annotations()
ToCOCOConverter.add_info()
ToCOCOConverter.assert_annotation()
ToCOCOConverter.convert()
ToCOCOConverter.get_base()
ToCOCOConverter.get_ratio()
ToCOCOConverter.getargs()
ToCOCOConverter.make_link_to_images()
ToCOCOConverter.read_annotation()
ToCOCOConverter.save_data()
ToCOCOConverter.set_id()
annotation()
base()
category()
image()
info()
- wsiprocess.converters.wsiprocess_to_voc module
- wsiprocess.converters.wsiprocess_to_yolo module
- Module contents
- wsiprocess.pytorch package
Submodules
wsiprocess.annotation module
Annotation object.
An Annotation object is optional metadata for a slide. It can handle ASAP or WSIViewer style annotations. By adding a parser to annotationparser, you can process annotation data from other annotation tools.
Example
Loading annotation data:
    import wsiprocess as wp
    annotation = wp.annotation("path_to_annotation_file.xml")
Loading annotation data from image:
    import wsiprocess as wp
    annotation = wp.annotation("")
- class wsiprocess.annotation.Annotation(path=False, is_image=False, slide=False)
Bases:
object
- add_class(classes)
- base_mask(cls, mask_height, mask_width)
Masks have the same size as the slide.
Masks are canvases of 0s.
- Parameters:
cls (str) – Class name for each mask.
mask_height (int) – The height of base masks.
mask_width (int) – The width of base masks.
- base_masks(size, wsi_height, wsi_width)
Make base masks.
- Parameters:
size (int) – The long side of masks.
wsi_height (int) – The height of wsi.
wsi_width (int) – The width of wsi.
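As a rough illustration of how size (the long side of the masks) could relate to the slide dimensions, the sketch below scales the slide so that its longer side equals size. The helper name mask_shape and the rounding rule are assumptions made for this example, not the library's implementation.

```python
def mask_shape(size, wsi_height, wsi_width):
    """Return (scale, mask_height, mask_width) so the long side equals `size`.

    Hypothetical helper: the rounding rule is an assumption.
    """
    scale = size / max(wsi_height, wsi_width)
    mask_height = max(1, round(wsi_height * scale))
    mask_width = max(1, round(wsi_width * scale))
    return scale, mask_height, mask_width


scale, mask_height, mask_width = mask_shape(5000, wsi_height=40000, wsi_width=100000)
print(scale, mask_height, mask_width)
```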
- check_classes(annotation_class, rule_class)
- check_memory_consumption(wsi_height, wsi_width)
- dot_to_bbox(width=30, height=False)
Translate dot annotations to bounding boxes.
If len(self.mask_coords[cls][idx]) is 1, the annotation is a dot, and the dot becomes the midpoint of the bounding box.
- Parameters:
width (int) – Width of the translated bounding box.
height (int) – Height of the translated bounding box. If not set, height is equal to width.
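The dot-to-box translation can be sketched as follows. The function name dot_to_bbox_sketch and the [[left, top], [right, bottom]] output layout are assumptions for illustration; only the "dot is the midpoint" rule and the width/height defaults come from the description above.

```python
def dot_to_bbox_sketch(dot, width=30, height=None):
    """Return [[left, top], [right, bottom]] centred on `dot` = (x, y).

    Hypothetical helper; if height is not set, it falls back to width.
    """
    if not height:
        height = width
    x, y = dot
    left = x - width // 2
    top = y - height // 2
    return [[left, top], [left + width, top + height]]


print(dot_to_bbox_sketch((100, 200)))
print(dot_to_bbox_sketch((100, 200), width=30, height=10))
```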
- exclude_coords(rule)
Exclude coordinates following the rule.
- Parameters:
rule (wsiprocess.rule.Rule) – Rule object.
- exclude_masks(rule)
Exclude areas from the base mask following the rule.
- Parameters:
rule (wsiprocess.rule.Rule) – Rule object.
- export_mask(save_to, cls)
Export one binary mask image.
Export mask image with 0 or 1 binaries.
- Parameters:
save_to (str) – Parent directory to save the mask image.
cls (str) – Class name for each mask.
- export_masks(save_to)
Export binary mask images.
Export the mask images for later processing such as segmentation. Exported masks contain binary data (0 or 1).
- Parameters:
save_to (str) – Parent directory to save the mask images.
- export_thumb_mask(cls, save_to='.', size=512)
Export a thumbnail of one of the masks.
For a preliminary check, export a thumbnail of one of the masks.
- Parameters:
cls (str) – Class name for each mask.
save_to (str, optional) – Parent directory to save the thumbnails.
size (int, optional) – Length of the long side of thumbnail.
- export_thumb_masks(save_to='.', size=512)
Export thumbnails of the masks.
For a preliminary check, export thumbnails of the masks.
- Parameters:
save_to (str) – Parent directory to save the thumbnails.
size (int) – Length of the long side of thumbnail.
- fix_mask_size()
- foreground_mask(slide, size=5000, wsi_height=False, wsi_width=False, fn='otsu', min_=30, max_=190)
Make foreground mask.
Make a simple foreground mask, by default with Otsu thresholding.
- Parameters:
slide (wsiprocess.slide.Slide) – Slide object.
size (int or function, optional) – Size of the foreground mask used when computing the Otsu threshold.
wsi_width (int or bool) – Width of the wsi.
wsi_height (int or bool) – Height of the wsi.
fn (str or function, optional) – Binarization method. By default, Otsu thresholding is used.
min_ (int, optional) – Used if fn is “minmax”. The Annotation object defines foreground as the pixels with values between “min_” and “max_”.
max_ (int, optional) – Used if fn is “minmax”. The Annotation object defines foreground as the pixels with values between “min_” and “max_”.
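The “minmax” binarization described above can be sketched in plain Python. The function name minmax_foreground, the nested-list grayscale input, and the inclusive bounds are assumptions for this example; the real method works on the slide image and may treat the bounds differently.

```python
def minmax_foreground(gray, min_=30, max_=190):
    """Return a binary mask: 1 where min_ <= pixel <= max_, else 0.

    Assumption: bounds are inclusive and `gray` is a 2D list of 0-255 values.
    """
    return [[1 if min_ <= px <= max_ else 0 for px in row] for row in gray]


gray = [
    [255, 240, 120],  # bright background, then tissue
    [10, 60, 200],    # very dark pixel, tissue, bright pixel
]
mask = minmax_foreground(gray)
print(mask)
```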
- from_image(mask, cls)
Load mask data from an image.
- Parameters:
mask (numpy.ndarray) – 2D mask image with background as 0, and foreground as 255.
cls (str) – Name of the class of the mask image.
- get_patch_mask(cls, x, y, w, h)
- get_scale(size, wsi_height, wsi_width)
- include_masks(rule)
Merge masks following the rule.
- Parameters:
rule (wsiprocess.rule.Rule) – Rule object.
- main_mask(cls, scale)
- main_masks(size, wsi_height, wsi_width)
Main masks
Write border lines following the rule and fill inside with 255.
- make_masks(slide, rule=False, foreground_fn='otsu', size=5000, min_=30, max_=190)
Make masks from the slide and rule.
Masks are for each class and foreground area.
- Parameters:
slide (wsiprocess.slide.Slide) – Slide object
rule (wsiprocess.rule.Rule, optional) – Rule object.
foreground_fn (str or callable, optional) – One of {otsu, minmax} or a user-specified function.
size (int, optional) – Size of foreground mask.
min_ (int, optional) – Used if foreground_fn is “minmax”. The Annotation object defines foreground as the pixels with values between “min_” and “max_”.
max_ (int, optional) – Used if foreground_fn is “minmax”. The Annotation object defines foreground as the pixels with values between “min_” and “max_”.
- merge_include_coords(rule)
Merge coordinates following the rule.
- Parameters:
rule (wsiprocess.rule.Rule) – Rule object.
- read_annotation(annotation_type=False)
Parse the annotation data.
- Parameters:
annotation_type (str) – If provided, skip the automatic type detection.
- resize_mask(wsi_height, wsi_width, cls)
Resize a mask to the same size as the slide.
- resize_masks(wsi_height, wsi_width)
Resize the masks to the same size as the slide.
- Parameters:
wsi_height (int) – The height of wsi.
wsi_width (int) – The width of wsi.
- set_scale(size, wsi_height, wsi_width)
wsiprocess.cli module
- class wsiprocess.cli.Args(command)
Bases:
object
- add_annotation_args(parser, slide_is_sparse=False)
- add_binarization_method(parser)
- add_on_annotation(parser, slide_is_sparse=False)
- add_on_foreground(parser, slide_is_sparse=False)
- build_args(command)
Base Parser
- fillattr(key: str)
- fillattrs(keys)
- set_base_parser()
- set_classification_args()
Arguments for classification
- set_common_args(parser)
Common Arguments
- set_detection_args()
Arguments for detection tasks.
- set_evaluation_args()
- set_method_args()
- set_segmentation_args()
Arguments for segmentation tasks.
- set_wsi_arg(parser)
- wsiprocess.cli.main(command=None)
- wsiprocess.cli.process_annotation(args, slide, rule)
wsiprocess.converter module
Convert wsiprocess style annotation data to COCO or VOC style.
wsiprocess.error module
Custom errors for wsiprocess.
- exception wsiprocess.error.AnnotationLabelError(message)
Bases:
WsiProcessError
Error of annotations
- Parameters:
message (str) – Message to show in the stdout.
- exception wsiprocess.error.MissCombinationError(message)
Bases:
WsiProcessError
Error of the combination of the method and the annotation file.
- Parameters:
message (str) – Message to show in the stdout.
- exception wsiprocess.error.OnParamError(message)
Bases:
WsiProcessError
Error of on_annotation.
on_annotation must be more than 0 and up to 1.
- Parameters:
message (str) – Message to show in the stdout.
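The constraint stated above (on_annotation must be more than 0 and up to 1) can be sketched as a small check. OnParamErrorSketch is a local stand-in class and check_on_annotation is a hypothetical helper; handling of the dict form ({“label”: value}) is an assumption based on the Patcher parameter description.

```python
class OnParamErrorSketch(Exception):
    """Local stand-in for wsiprocess.error.OnParamError (not the real class)."""


def check_on_annotation(on_annotation):
    """Raise if any ratio falls outside the half-open interval (0, 1]."""
    if isinstance(on_annotation, dict):
        values = list(on_annotation.values())
    else:
        values = [on_annotation]
    for value in values:
        if not 0 < value <= 1:
            raise OnParamErrorSketch(
                f"on_annotation must be more than 0 and up to 1, got {value}")
    return True


print(check_on_annotation(0.5))
print(check_on_annotation({"benign": 1.0, "malignant": 0.3}))
```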
- exception wsiprocess.error.PatchSizeTooSmallError(message)
Bases:
WsiProcessError
Error of the size of patches.
This should perhaps be a warning instead.
- Parameters:
message (str) – Message to show in the stdout.
- exception wsiprocess.error.SizeError(message)
Bases:
WsiProcessError
Error of sizes.
The slide must be larger than the patches, and the patches must be larger than the overlap.
- Parameters:
message (str) – Message to show in the stdout.
- exception wsiprocess.error.SlideLoadError(message)
Bases:
WsiProcessError
Error on loading slides.
- Parameters:
message (str) – Message to show in the stdout.
- exception wsiprocess.error.WsiProcessError(message)
Bases:
Exception
Root error class.
- Parameters:
message (str) – Message to show in the stdout.
wsiprocess.patcher module
Patcher object to extract patches from whole slide images.
- class wsiprocess.patcher.Patcher(slide, method, annotation=False, save_to='.', patch_width=256, patch_height=256, overlap_width=0, overlap_height=0, offset_x=0, offset_y=0, on_foreground=0.5, on_annotation=0.5, ext='jpg', magnification=False, start_sample=False, finished_sample=False, no_patches=False, crop_bbox=False, verbose=False, dryrun=False)
Bases:
object
Patcher object.
- Parameters:
slide (wsiprocess.slide.Slide) – Slide object.
method (str) – Method name to run. One of {“evaluation”, “classification”, “detection”, “segmentation”}. Characters are converted to lowercase.
annotation (wsiprocess.annotation.Annotation, optional) – Annotation object.
save_to (str, optional) – The root of the output directory.
patch_width (int, optional) – The width of the output patches.
patch_height (int, optional) – The height of the output patches.
overlap_width (int, optional) – The width of the overlap areas of patches.
overlap_height (int, optional) – The height of the overlap areas of patches.
offset_x (int, optional) – The offset pixels along the x-axis.
offset_y (int, optional) – The offset pixels along the y-axis.
on_foreground (float, optional) – Ratio of overlap area between patches and foreground area.
on_annotation (float or dict) – Ratio of overlap area between patches and annotation. A dict is also accepted, e.g. {“label”: value}.
magnification (int, optional) – Magnification of output patches.
ext (str, optional) – Extension of extracted patches.
start_sample (bool, optional) – Whether to save sample patches when Patcher starts.
finished_sample (bool, optional) – Whether to save sample patches when Patcher finishes its work.
extract_patches (bool, optional) – Deprecated: Patcher extracts patches unless “no_patches” is set.
no_patches (bool, optional) – If set, Patcher runs without extracting patches or saving them to disk.
verbose (bool, optional) – If set, a progress bar appears when patching.
dryrun (bool, optional) – Only run patching for first 100 patches.
- slide
Slide object.
- Type:
wsiprocess.slide.Slide
- wsi_width
Width of the slide.
- Type:
int
- wsi_height
Height of the slide.
- Type:
int
- filepath
Path to the whole slide image.
- Type:
str
- filestem
Stem of the file name.
- Type:
str
- method
Method name to run. One of {“evaluation”, “classification”, “detection”, “segmentation”}
- Type:
str
- annotation
Annotation object.
- masks
Masks to show the location of classes.
- Type:
dict
- classes
Classes to extract.
- Type:
list
- save_to
The root of the output directory.
- Type:
str
- p_width
The width of the cropped patches. If magnification is not set, it is the same as the width of the output patches.
- Type:
int
- p_height
The height of the cropped patches. If magnification is not set, it is the same as the height of the output patches.
- Type:
int
- p_area
The area of single patch.
- Type:
int
- o_width
The width of the overlap areas of patches.
- Type:
int
- o_height
The height of the overlap areas of patches.
- Type:
int
- offset_x
The offset pixels along the x-axis.
- Type:
int
- offset_y
The offset pixels along the y-axis.
- Type:
int
- on_foreground
Ratio of overlap area between patches and foreground area.
- Type:
float
- on_annotation
Ratio of overlap area between patches and annotation. A dict is also accepted, e.g. {“label”: value}.
- Type:
float or dict
- magnification
Magnification of output patches.
- Type:
int
- p_scale
Ratio of p_width or p_height to output patch size.
- Type:
float
- ext
Extension of extracted patches.
- Type:
str
- start_sample
Whether to save sample patches on Patcher start.
- Type:
bool
- finished_sample
Whether to save sample patches on Patcher finish.
- Type:
bool
- extract_patches
Whether to save patches when Patcher runs.
- Type:
bool
- no_patches
If set, Patcher runs without saving patches.
- Type:
bool
- verbose
If set, a progress bar appears when patching.
- Type:
bool, optional
- dryrun
Only run patching for first 100 patches.
- Type:
bool, optional
- x_lefttop
X-axis offsets of the patches, excluding the right-edge patch.
- Type:
list
- y_lefttop
Y-axis offsets of the patches, excluding the bottom-edge patch.
- Type:
list
- iterator
Offset coordinates of patches.
- Type:
list
- last_x
X-axis offset of the right edge patch.
- Type:
int
- last_y
Y-axis offset of the bottom edge patch.
- Type:
int
- result
Temporary storage for the computed result of patches.
- Type:
dict
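To make the grid attributes above (x_lefttop, y_lefttop, last_x, last_y, iterator) more concrete, here is a minimal sketch of one way such offsets could be derived. The helper grid_offsets and the stepping scheme (step = patch size minus overlap, plus one extra patch aligned to the far edge) are assumptions, not the Patcher's actual code.

```python
def grid_offsets(wsi_length, patch_length, overlap_length=0, offset=0):
    """Return (regular_offsets, last_offset) along one axis.

    Assumed scheme: patches advance by patch_length - overlap_length, and
    one additional patch is aligned to the far edge of the slide.
    """
    step = patch_length - overlap_length
    if step <= 0:
        raise ValueError("patch must be larger than the overlap")
    last = wsi_length - patch_length
    regular = list(range(offset, last, step))
    return regular, last


x_lefttop, last_x = grid_offsets(1000, 256)
y_lefttop, last_y = grid_offsets(600, 256, overlap_length=56)
# Offsets of the regular (non-edge) patches as (x, y) pairs.
iterator = [(x, y) for x in x_lefttop for y in y_lefttop]
print(x_lefttop, last_x)
print(y_lefttop, last_y)
print(len(iterator))
```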
- annotation_cover_patch(coords, x, y)
Check if the annotation is covering the whole patch.
- Parameters:
coords (np.array) – Coordinates of annotations.
x (int) – X coordinate of the top-left corner of the patch.
y (int) – Y coordinate of the top-left corner of the patch.
- Returns:
List of np.int64 values that are the indices of the bounding boxes on the patch.
- Return type:
idx_of_bb_on_patch (list)
- corner_on_patch(coords, x, y)
Check if at least one of the corners is on the patch.
- Parameters:
coords (np.array) – Coordinates of annotations.
x (int) – X coordinate of the top-left corner of the patch.
y (int) – Y coordinate of the top-left corner of the patch.
- Returns:
List of np.int64 values that are the indices of the bounding boxes on the patch.
- Return type:
idx_of_bb_on_patch (list)
- crop_patch(x, y, w=False, h=False)
- find_bbs(x, y, cls)
Find bounding boxes which are on the patch.
- A bounding box with at least one of its corners on the patch is considered to be on the patch.
- ex: annotation.mask_coords[“benign”][0] = [small_x, small_y, large_x, large_y] = [bbleft, bbtop, bbright, bbbottom]
- Parameters:
x (int) – X-axis offset of patch.
y (int) – Y-axis offset of patch.
cls (str) – Class of the patch or the bounding box or the segmented area.
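The corner test and the "annotation covers the whole patch" test described above can be sketched with plain lists in the [bbleft, bbtop, bbright, bbbottom] layout. The function names, the inclusive boundaries, and the explicit p_width/p_height arguments are assumptions for illustration; the real methods operate on numpy arrays held by the Patcher.

```python
def corner_on_patch_sketch(bbox, x, y, p_width, p_height):
    """True if at least one corner of bbox lies inside the patch."""
    left, top, right, bottom = bbox
    corners = [(left, top), (left, bottom), (right, top), (right, bottom)]
    return any(
        x <= cx <= x + p_width and y <= cy <= y + p_height
        for cx, cy in corners
    )


def covers_patch_sketch(bbox, x, y, p_width, p_height):
    """True if bbox encloses the whole patch."""
    left, top, right, bottom = bbox
    return (left <= x and top <= y
            and right >= x + p_width and bottom >= y + p_height)


def find_bbs_sketch(bboxes, x, y, p_width=256, p_height=256):
    """Indices of the bounding boxes that touch the patch at (x, y)."""
    return [
        idx for idx, bbox in enumerate(bboxes)
        if corner_on_patch_sketch(bbox, x, y, p_width, p_height)
        or covers_patch_sketch(bbox, x, y, p_width, p_height)
    ]


bboxes = [
    [10, 10, 50, 50],          # fully inside the patch
    [300, 300, 400, 400],      # outside the patch
    [-100, -100, 500, 500],    # encloses the patch
    [200, 100, 300, 150],      # left corners inside the patch
]
print(find_bbs_sketch(bboxes, x=0, y=0))
```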
- find_masks(x, y, cls)
Get the masked area corresponding to the given patch area.
- Parameters:
x (int) – X-axis offset of a patch.
y (int) – Y-axis offset of a patch.
cls (str) – Class of the patch or the bounding box or the segmented area.
- Returns:
List containing a dict of coords and its class. The coords value is a path to the png image.
- Return type:
masks (list)
- get_iterator(dryrun=False)
- get_mini_patch_parallel(classes=False)
- get_patch(x, y, classes=False)
Extract a single patch.
- Parameters:
x (int) – X-axis offset of a patch.
y (int) – Y-axis offset of a patch.
classes (list) – When method is “classification”, the patch is extracted once per class if it lies on the border of two or more classes. Setting on_annotation=1.0 should prevent the patcher from extracting a single patch for multiple classes.
- get_patch_parallel(classes=False, max_workers=-1)
Run get_patch() in parallel.
- Parameters:
classes (list) – Classes to extract.
max_workers (int) – Workers to run. -1 runs with cores*5 threads.
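A thread-pool pattern matching this description could look like the sketch below. The names resolve_workers, run_parallel, and fake_get_patch are hypothetical; only the "-1 means cores*5 threads" rule comes from the parameter description.

```python
import os
from concurrent.futures import ThreadPoolExecutor


def resolve_workers(max_workers=-1):
    """Translate -1 into cores * 5 threads; other values pass through."""
    if max_workers == -1:
        return (os.cpu_count() or 1) * 5
    return max_workers


def run_parallel(fn, coords, max_workers=-1):
    """Call fn(x, y) for every (x, y) offset using a thread pool."""
    with ThreadPoolExecutor(max_workers=resolve_workers(max_workers)) as executor:
        # executor.map keeps the results in the order of the inputs.
        return list(executor.map(lambda xy: fn(*xy), coords))


def fake_get_patch(x, y):
    """Stand-in for the per-patch work (hypothetical)."""
    return (x, y, "ok")


results = run_parallel(fake_get_patch, [(0, 0), (256, 0), (0, 256)])
print(results)
```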
- get_random_sample(phase, sample_count=1)
Extract random patches to check that the patcher works properly.
- Parameters:
phase (str) – When to check. One of {start, finish}
sample_count (int) – Number of patches to extract.
- patch_on_annotation(cls, x, y)
Check if the patch is on the annotation area of a class.
- Parameters:
cls (str) – Class of the patch or the bounding box or the segmented area.
x (int) – X-axis offset of a patch.
y (int) – Y-axis offset of a patch.
- Returns:
Whether the patch is on the annotation.
- Return type:
(bool)
- patch_on_foreground(x, y)
Check if the patch is on the foreground area.
- Parameters:
x (int) – X-axis offset of a patch.
y (int) – Y-axis offset of a patch.
- Returns:
Whether the patch is on the foreground area.
- Return type:
(bool)
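Both patch_on_annotation and patch_on_foreground compare an overlap ratio against a threshold (on_annotation or on_foreground). A minimal sketch of that comparison, assuming a binary mask stored as a nested list and a "greater than or equal" rule, is shown below. All helper names here are hypothetical.

```python
def overlap_ratio(mask, x, y, w, h):
    """Fraction of the w*h patch at (x, y) whose mask pixels are non-zero."""
    rows = mask[y:y + h]
    covered = sum(1 for row in rows for px in row[x:x + w] if px)
    return covered / (w * h)


def patch_is_on(mask, x, y, w, h, threshold):
    """True if the patch overlaps the mask by at least `threshold`."""
    return overlap_ratio(mask, x, y, w, h) >= threshold


def threshold_for(on_annotation, cls):
    """Support both the float form and the {"label": value} dict form."""
    if isinstance(on_annotation, dict):
        return on_annotation[cls]
    return on_annotation


mask = [
    [1, 1, 0, 0],
    [1, 1, 0, 0],
    [0, 0, 0, 0],
    [0, 0, 0, 1],
]
print(overlap_ratio(mask, 0, 0, 2, 2))
print(patch_is_on(mask, 1, 1, 2, 2, threshold_for({"benign": 0.5}, "benign")))
```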
- remove_dup_in_results()
Remove duplicate results in self.result[“results”]
- save_patch(patch, save_as)
- save_patch_result(x, y, cls)
Save the extracted patch data to result
- Parameters:
x (int) – X-axis offset of patch.
y (int) – Y-axis offset of patch.
cls (str) – Class of the patch or the bounding box or the segmented area.
- save_results()
Save the extraction results.
Saves some metadata along with the patch results.
- set_magnification(slide, magnification)
- side_on_patch(coords, x, y)
Check if at least one of the sides is on the patch.
- Parameters:
coords (np.array) – Coordinates of annotations.
x (int) – X coordinate of the top-left corner of the patch.
y (int) – Y coordinate of the top-left corner of the patch.
- Returns:
List of np.int64 values that are the indices of the bounding boxes on the patch.
- Return type:
idx_of_bb_on_patch (list)
- to_bb(coord)
Convert coordinates to VOC coordinates.
- Parameters:
coord (list) –
List of coordinates stored as below:
[[xOfOneCorner, yOfOneCorner], [xOfApex, yOfApex]]
- Returns:
List of coordinates stored as below:
[[xmin, ymin], [xmin, ymax], [xmax, ymax], [xmax, ymin]]
- Return type:
outer_coord (list)
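The conversion from two opposite corners to the four-corner list can be sketched as follows. The name to_bb_sketch is hypothetical, and taking min/max of each axis (so the input corners may come in any order) is an assumption.

```python
def to_bb_sketch(coord):
    """Convert [[x1, y1], [x2, y2]] into the four corners of the box.

    Output order: [[xmin, ymin], [xmin, ymax], [xmax, ymax], [xmax, ymin]].
    """
    (x1, y1), (x2, y2) = coord
    xmin, xmax = min(x1, x2), max(x1, x2)
    ymin, ymax = min(y1, y2), max(y1, y2)
    return [[xmin, ymin], [xmin, ymax], [xmax, ymax], [xmax, ymin]]


print(to_bb_sketch([[30, 80], [10, 20]]))
```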
wsiprocess.rule module
Object to define rules for extracting patches. A rule should be a JSON file or a dict. The content of rule.json looks like the example below.
Example
The JSON data below defines the following:
- extract the patches of benign and malignant
- benign includes stroma but excludes malignant and uncertain
- malignant means malignant itself but excludes benign
{
    "benign": {
        "includes": [
            "stroma"
        ],
        "excludes": [
            "malignant",
            "uncertain"
        ]
    },
    "malignant": {
        "includes": [],
        "excludes": [
            "benign"
        ]
    }
}
- class wsiprocess.rule.Rule(rule)
Bases:
object
Base class for rule.
- Parameters:
rule (str or dict) – Path to the rule.json or the rule dict.
- classes
List of the classes, e.g. [“benign”, “malignant”]
- Type:
list
- assert_incl_excl(incl_excl: dict)
Assert that the rule has the expected keys.
The expected shape is:
{“includes”: [“stroma”], “excludes”: [“malignant”, “uncertain”]}
- load_rule(rule: dict)
Read the rule file.
Parse the rule file and save as the classes.
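The load-and-check flow described for Rule can be sketched like this. load_rule_sketch and REQUIRED_KEYS are hypothetical names, and requiring both “includes” and “excludes” for every class is an assumption drawn from the expected shape shown above.

```python
import json

REQUIRED_KEYS = {"includes", "excludes"}


def load_rule_sketch(rule):
    """Return the class names of a rule given as a dict or a path to rule.json."""
    if isinstance(rule, str):
        with open(rule) as f:
            rule = json.load(f)
    for cls, incl_excl in rule.items():
        missing = REQUIRED_KEYS - set(incl_excl)
        if missing:
            raise KeyError(f"{cls} is missing keys: {sorted(missing)}")
    return list(rule)


rule = {
    "benign": {"includes": ["stroma"], "excludes": ["malignant", "uncertain"]},
    "malignant": {"includes": [], "excludes": ["benign"]},
}
classes = load_rule_sketch(rule)
print(classes)
```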
wsiprocess.slide module
Slide object to pass to the annotation object and the patcher object. A slide is a whole slide image scanned with a whole slide scanner. You can also manually create a pyramidal TIFF file, which can be handled the same way as scanned digital data, except for the magnification.
- class wsiprocess.slide.Slide(path, backend='openslide')
Bases:
object
Slide object.
- Parameters:
path (str) – Path to the whole slide image file.
backend (str) – Openslide or pyvips.
- path
Path to the whole slide image file.
- Type:
str
- slide
pyvips Image object.
- Type:
pyvips.Image
- wsi_width
Width of slide.
- Type:
int
- wsi_height
Height of slide.
- Type:
int
- crop(x, y, w, h)
- export_thumbnail(save_as='./thumb.png', size=500)
Export thumbnail image.
- Parameters:
save_as (str) – Path to save as the thumbnail image.
size (int, optional) – Size of the exported thumbnail.
- get_thumbnail(size=500)
Get thumbnail image.
- Parameters:
size (int or tuple, optional) – Size of the exported thumbnail.
- load_slide()
wsiprocess.utils module
- wsiprocess.utils.show_bounding_box(patch_path, result_path, save_as)
- wsiprocess.utils.show_mask_on_patch(patch_path, mask_path, save_as)
wsiprocess.verify module
The verification script runs before the patcher starts. The Verify class checks the output directory, annotation files, rule files, etc. It is mainly used by the CLI.
- class wsiprocess.verify.Verify(save_to, filestem, method, start_sample, finished_sample, no_patches, crop_bbox)
Bases:
object
Verification class.
- Parameters:
save_to (str) – The root of the output directory.
filestem (str) – The name of the output directory.
method (str) – Method name to run. One of {“none”, “classification”, “detection”, “segmentation”}
start_sample (bool) – Whether to save sample patches on Patcher start.
finished_sample (bool) – Whether to save sample patches on Patcher finish.
extract_patches (bool) – [Deleted] Whether to save patches when Patcher runs.
no_patches (bool) – If set, Patcher runs without saving patches.
- save_to
The root of the output directory.
- Type:
str
- filestem
The name of the output directory.
- Type:
str
- method
Method name to run. One of {“none”, “classification”, “detection”, “segmentation”}
- Type:
str
- start_sample
Whether to save sample patches on Patcher start.
- Type:
bool
- finished_sample
Whether to save sample patches on Patcher finish.
- Type:
bool
- extract_patches
[Deleted] Whether to save patches when Patcher runs.
- Type:
bool
- no_patches
If set, Patcher runs without saving patches.
- Type:
bool
- magnification(slide, magnification)
Check if the slide has data for the magnification the user specified.
- Parameters:
slide (wp.slide.Slide) – Slide object to check.
magnification (int) – Target magnification value which has to be smaller than the magnification of the slide.
- static make_dir(path)
Make output directory.
- make_dirs()
Ensure the output directories exist for each task.
- on_params(on_annotation, on_foreground)
Verify the ratios of on_annotation and on_foreground.
- Parameters:
on_annotation (float) – Overlap ratio of patches and annotations.
on_foreground (float) – Overlap ratio of patches and foreground area
- Raises:
wsiprocess.error.OnParamError – If the ratios are invalid.
- sizes(wsi_width, wsi_height, offset_x, offset_y, patch_width, patch_height, overlap_width, overlap_height, dot_bbox_width=False, dot_bbox_height=False)
Verify the sizes of the slide, the patch and the overlap area.
- Raises:
wsiprocess.error.SizeError – If the sizes are invalid.
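The size relationship that SizeError guards (slide larger than the patch, patch larger than the overlap) can be sketched as below. SizeErrorSketch is a local stand-in class and check_sizes is a hypothetical helper; the exact comparisons, and the handling of offsets and dot bounding boxes in the real method, are not covered here.

```python
class SizeErrorSketch(Exception):
    """Local stand-in for wsiprocess.error.SizeError (not the real class)."""


def check_sizes(wsi_width, wsi_height, patch_width, patch_height,
                overlap_width=0, overlap_height=0):
    """Raise if the slide, patch, and overlap sizes are inconsistent."""
    if wsi_width < patch_width or wsi_height < patch_height:
        raise SizeErrorSketch("slide must be larger than the patch")
    if patch_width <= overlap_width or patch_height <= overlap_height:
        raise SizeErrorSketch("patch must be larger than the overlap")
    return True


print(check_sizes(10000, 8000, 256, 256, overlap_width=32, overlap_height=32))
```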