Uploading Annotations

In this tutorial, we explore different ways to upload annotations to Remo. We will:

  • add annotations from a file in a format supported by Remo
  • add annotations from code, which makes it possible to upload annotations from any input format
  • introduce the concept of Annotation Sets, for finer control over annotations

We start by creating a dataset and populating it with some images.

%load_ext autoreload
%autoreload 2
import sys
# need to specify path to remo in notebook
local_path_to_repo =  '/home/andrea/Desktop/Projects/repo/remo-python'
sys.path.insert(0, local_path_to_repo)

import remo
import os
import pandas as pd

urls = ['https://remo-scripts.s3-eu-west-1.amazonaws.com/open_images_sample_dataset.zip']
my_dataset = remo.create_dataset(name = 'D1', urls = urls)

((\ (>':') Remo server is running: v0.3.10-77-g3ab8aa16

Add annotations stored in a file supported by Remo

Adding annotations from a file supported by Remo only requires passing the file to the dataset.add_data method and specifying the annotation task.

Remo can automatically parse annotations stored as JSON, CSV or XML in a variety of formats (such as Pascal, COCO and Open Images). You can read more about the file formats supported by Remo in our documentation.

As an example, let's add some annotations for an Object Detection task from a CSV file with encoded classes.

In this case, the annotations are stored in a CSV file in a format already supported by Remo. Class labels were encoded using GoogleKnowledgeGraph. Remo automatically detects the class encoding and translates it into the corresponding labels.
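
To give a sense of what this translation amounts to, here is a minimal sketch in plain Python. It is not Remo's implementation: the lookup table MID_TO_LABEL and the function decode_label are hypothetical names, and the identifier-to-name pairs are illustrative and should be checked against the official class description file.

```python
# Hypothetical lookup table from encoded class identifiers to readable labels.
# The pairs are illustrative only; verify them against the official class list.
MID_TO_LABEL = {
    "/m/01g317": "Person",
    "/m/0bt9lr": "Dog",
}


def decode_label(label_name, table=MID_TO_LABEL):
    """Return the readable label for an encoded class, or the input unchanged."""
    return table.get(label_name, label_name)


# Labels that are not in the table (for example, already decoded ones) pass through.
decoded = [decode_label(name) for name in ["/m/01g317", "/m/0bt9lr", "Fruit"]]
print(decoded)
```

Falling back to the input value keeps files that mix encoded and plain labels usable, which is a design choice of this sketch rather than documented Remo behaviour.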

annotation_files=[os.getcwd() + '/assets/open_sample.csv']

df = pd.read_csv(annotation_files[0])
df.columns

Index(['ImageID', 'Source', 'LabelName', 'Confidence', 'XMin', 'XMax', 'YMin', 'YMax', 'IsOccluded', 'IsTruncated', 'IsGroupOf', 'IsDepiction', 'IsInside'], dtype='object')
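
The column names above match the Open Images box format, in which XMin, XMax, YMin and YMax are stored as fractions of the image width and height (values between 0 and 1). The sketch below shows how such a row could be converted to pixel coordinates. The helper to_pixel_bbox, the sample row and the [xmin, ymin, xmax, ymax] output order are assumptions made for illustration; none of this is needed when the file is passed to add_data.

```python
def to_pixel_bbox(row, width, height):
    """Convert normalised box coordinates to [xmin, ymin, xmax, ymax] in pixels."""
    return [
        round(float(row["XMin"]) * width),
        round(float(row["YMin"]) * height),
        round(float(row["XMax"]) * width),
        round(float(row["YMax"]) * height),
    ]


# Made-up row, using the column names listed above.
sample_row = {"XMin": "0.25", "XMax": "0.75", "YMin": "0.5", "YMax": "1.0"}
pixel_bbox = to_pixel_bbox(sample_row, width=1024, height=678)
print(pixel_bbox)
```

Note that the CSV lists both x values before the y values, so the columns have to be reordered when building a corner-based box.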

my_dataset.add_data(local_files=annotation_files, annotation_task = 'Object detection')

{'files_link_result': {'files uploaded': 0, 'annotations': 9, 'errors': []}}

We can now view annotation statistics, explore the dataset and make further use of Remo.

my_dataset.get_annotation_statistics()

[{'AnnotationSet ID': 19, 'AnnotationSet name': 'Object detection', 'n_images': 9, 'n_classes': 15, 'n_objects': 84, 'top_3_classes': [{'name': 'Fruit', 'count': 27}, {'name': 'Sports equipment', 'count': 12}, {'name': 'Mammal', 'count': 7}], 'creation_date': None, 'last_modified_date': '2020-03-15T20:38:00.140964Z'}]
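
Since the statistics come back as a list of dictionaries, they are easy to post-process. The following sketch builds a one-line summary per annotation set. It works on a hand-copied subset of the output shown above, so it runs without a Remo server; the function name summarise is our own.

```python
# Copy of the relevant fields from the output shown above.
stats = [
    {
        "AnnotationSet ID": 19,
        "AnnotationSet name": "Object detection",
        "n_images": 9,
        "n_classes": 15,
        "n_objects": 84,
        "top_3_classes": [
            {"name": "Fruit", "count": 27},
            {"name": "Sports equipment", "count": 12},
            {"name": "Mammal", "count": 7},
        ],
    }
]


def summarise(stats):
    """Build one readable line per annotation set."""
    lines = []
    for entry in stats:
        top = ", ".join(f"{c['name']} ({c['count']})" for c in entry["top_3_classes"])
        # Guard against empty annotation sets to avoid dividing by zero.
        n_images = entry["n_images"]
        per_image = entry["n_objects"] / n_images if n_images else 0.0
        lines.append(
            f"{entry['AnnotationSet name']}: {entry['n_objects']} objects in "
            f"{n_images} images ({per_image:.1f} per image); top classes: {top}"
        )
    return lines


summary_lines = summarise(stats)
for line in summary_lines:
    print(line)
```

With a live server, the stats variable could instead be assigned from my_dataset.get_annotation_statistics().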

my_dataset.view()

Open http://localhost:8123/datasets/11

dataset_added_annotation.jpeg

Add annotations from code

If your annotations are in a custom format, you can still upload them from code, as long as the task is one of those currently supported by Remo.

As an example, let's add annotations to a specific image from code using the Annotation object and dataset.add_annotations().

This can be useful, for instance, to add model predictions as annotations or to tag specific images. If your input data is in a custom file, you can write a parser that loads the annotations into Annotation objects.

First, let's retrieve one image.

images = my_dataset.images()
my_image = images[1]
print(my_image)
print('Resolution: ', my_image.width, 'x', my_image.height)

Image: 1527 - 000a1249af2bc5f0.jpg Resolution: 1024 x 678

Now we can add annotations using the add_annotations() method of the dataset.

annotations = []

# First object; bbox values are pixel coordinates, presumably [xmin, ymin, xmax, ymax]
annotation = remo.Annotation()
annotation.img_filename = my_image.name
annotation.classes = 'Human hand'
annotation.bbox = [227, 284, 678, 674]
annotations.append(annotation)

# Second object on the same image
annotation = remo.Annotation()
annotation.img_filename = my_image.name
annotation.classes = 'Fashion accessory'
annotation.bbox = [496, 322, 544, 370]
annotations.append(annotation)

# Upload both annotations in one call
my_dataset.add_annotations(annotations)

Progress 100% - 1/1 - elapsed 0:00:01.001000 - speed: 1.00 img / s, ETA: 0:00:00
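
The same pattern extends to annotations stored in a custom file. Below is a minimal sketch of such a parser, assuming a made-up text format with one 'filename,class,xmin,ymin,xmax,ymax' entry per line. The functions parse_custom_csv and to_annotations are hypothetical helpers, not part of Remo. To keep the sketch runnable without a server, SimpleNamespace stands in for remo.Annotation; only the three attributes used above (img_filename, classes, bbox) are set.

```python
import csv
import io
from types import SimpleNamespace


def parse_custom_csv(text):
    """Parse lines of 'filename,class,xmin,ymin,xmax,ymax' into plain dicts."""
    records = []
    for row in csv.reader(io.StringIO(text)):
        if not row:
            continue  # skip blank lines
        filename, class_name, *coords = [field.strip() for field in row]
        records.append(
            {"img_filename": filename, "classes": class_name, "bbox": [int(c) for c in coords]}
        )
    return records


def to_annotations(records, annotation_factory):
    """Create one annotation object per record and fill in its attributes."""
    annotations = []
    for record in records:
        annotation = annotation_factory()
        annotation.img_filename = record["img_filename"]
        annotation.classes = record["classes"]
        annotation.bbox = record["bbox"]
        annotations.append(annotation)
    return annotations


custom_text = """000a1249af2bc5f0.jpg,Human hand,227,284,678,674
000a1249af2bc5f0.jpg,Fashion accessory,496,322,544,370
"""
records = parse_custom_csv(custom_text)

# SimpleNamespace is a stand-in so the sketch runs anywhere. With Remo available:
# my_dataset.add_annotations(to_annotations(records, remo.Annotation))
annotations = to_annotations(records, SimpleNamespace)
print(len(annotations), annotations[0].classes, annotations[0].bbox)
```

Passing the annotation class in as an argument keeps the parsing logic testable on its own, separately from the upload step.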

my_dataset.view_image(my_image.id)

Open http://localhost:8123/image/1527?dataset_id=18

dataset_added_annotation.jpeg

Annotation sets

------------------------ THIS SECTION IS WORK IN PROGRESS ------------------------

Behind the scenes, Remo organises annotations in Annotation sets. An annotation set is simply a collection of annotations belonging to a dataset.

An annotation set is characterised by a task (such as 'Object Detection') and a list of classes, in addition to the annotations themselves.

The advantage of grouping annotations in an Annotation Set is that it allows high-level group operations on all the annotations, such as:

  • grouping classes together
  • deleting objects of specific classes
  • comparing different annotations (such as ground truth vs prediction, or annotations coming from different annotators)
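
To make the first two operations concrete, here is a conceptual sketch in plain Python. It does not use Remo's API; the record layout and the functions merge_classes and drop_classes are invented for illustration and only show what such operations mean on a collection of annotations.

```python
from collections import Counter

# Made-up annotation records: (image filename, class name).
annotations = [
    ("a.jpg", "Man"),
    ("a.jpg", "Woman"),
    ("b.jpg", "Dog"),
    ("b.jpg", "Fruit"),
]


def merge_classes(records, mapping):
    """Rename classes according to mapping, leaving other classes untouched."""
    return [(image, mapping.get(name, name)) for image, name in records]


def drop_classes(records, unwanted):
    """Remove every record whose class is in unwanted."""
    return [(image, name) for image, name in records if name not in unwanted]


# Group 'Man' and 'Woman' under 'Person', then delete all 'Fruit' objects.
merged = merge_classes(annotations, {"Man": "Person", "Woman": "Person"})
cleaned = drop_classes(merged, {"Fruit"})
class_counts = Counter(name for _, name in cleaned)
print(class_counts)
```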

In the examples above, Remo automatically created an annotation set and set it as the default. For more control, it is also possible to manipulate annotation set objects explicitly.

Let's first create an empty annotation set with a predetermined list of classes.

my_classes = ['Airplane', 'Clothing', 'Dog', 'Fashion accessory', 'Food', 'Footwear', 'Fruit', 'Human arm',
              'Human body', 'Human hand', 'Human leg', 'Mammal', 'Man', 'Person', 'Salad', 'Sports equipment',
              'Trousers', 'Woman']

annotation_set = my_dataset.create_annotation_set(annotation_task='Object detection',
                                                  name='Objects',
                                                  classes=my_classes)

We can easily retrieve the annotation sets of a dataset.

my_dataset.annotation_sets()

[Annotation set 15 - 'Object detection', task: Object detection, #classes: 15, Annotation set 16 - 'my_ann_set_2', task: Object detection, #classes: 3]

When a file is added to an existing annotation set, Remo only adds annotations for classes that are already part of that annotation set. Classes can be added to an annotation set, but this has to be done explicitly.
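
Because annotations for other classes are left out, it can be useful to preview which ones would be affected before uploading. The sketch below does this in plain Python on made-up records. The function split_by_allowed_classes is a hypothetical helper, and the filtering rule is only our reading of the behaviour described above, not Remo's actual code.

```python
def split_by_allowed_classes(records, allowed_classes):
    """Separate records into those whose class is in the set and those that are not."""
    allowed = set(allowed_classes)
    kept = [r for r in records if r["classes"] in allowed]
    skipped = [r for r in records if r["classes"] not in allowed]
    return kept, skipped


# Made-up records and a made-up three-class annotation set.
records = [
    {"img_filename": "a.jpg", "classes": "Dog"},
    {"img_filename": "a.jpg", "classes": "Bicycle"},
    {"img_filename": "b.jpg", "classes": "Person"},
]
kept, skipped = split_by_allowed_classes(records, ["Dog", "Person", "Fruit"])
skipped_classes = sorted({r["classes"] for r in skipped})
print(len(kept), "kept; skipped classes:", skipped_classes)
```

Any class that shows up in skipped_classes would need to be added to the annotation set explicitly before its annotations can be uploaded.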