apacker
diff --git a/README.md b/README.md
Lines changed: 3 additions & 2 deletions
diff --git a/introduction_to_amazon_algorithms/object_detection_pascalvoc_coco/object_detection_image_json_format.ipynb b/introduction_to_amazon_algorithms/object_detection_pascalvoc_coco/object_detection_image_json_format.ipynb
Lines changed: 676 additions & 0 deletions
diff --git a/introduction_to_amazon_algorithms/object_detection_pascalvoc_coco/object_detection_recordio_format.ipynb b/introduction_to_amazon_algorithms/object_detection_pascalvoc_coco/object_detection_recordio_format.ipynb
Lines changed: 526 additions & 0 deletions
diff --git a/introduction_to_amazon_algorithms/object_detection_pascalvoc_coco/tools/concat_db.py b/introduction_to_amazon_algorithms/object_detection_pascalvoc_coco/tools/concat_db.py
Lines changed: 127 additions & 0 deletions
@@ -43,6 +43,7 @@ These examples provide quick walkthroughs to get you up and running with Amazon
 - [XGBoost for multi-class classification](introduction_to_amazon_algorithms/xgboost_mnist) uses Amazon SageMaker's implementation of [XGBoost](https://github.com/dmlc/xgboost) to classify handwritten digits from the MNIST dataset as one of the ten digits using a multi-class classifier. Both single machine and distributed use-cases are presented.
 - [DeepAR for time series forecasting](introduction_to_amazon_algorithms/deepar_synthetic) illustrates how to use the Amazon SageMaker DeepAR algorithm for time series forecasting on a synthetically generated data set.
 - [BlazingText Word2Vec](introduction_to_amazon_algorithms/blazingtext_word2vec_text8) generates Word2Vec embeddings from a cleaned text dump of Wikipedia articles using SageMaker's fast and scalable BlazingText implementation.
+- [Object Detection](introduction_to_amazon_algorithms/object_detection_pascalvoc_coco) illustrates how to train an object detector using the Amazon SageMaker Object Detection algorithm with different input formats (RecordIO and image).
 
 ### Scientific Details of Algorithms
 
@@ -65,8 +66,8 @@ These examples that showcase unique functionality available in Amazon SageMaker.
 - [Bring Your Own R Algorithm](advanced_functionality/r_bring_your_own) shows how to bring your own algorithm container to Amazon SageMaker using the R language.
 - [Installing the R Kernel](advanced_functionality/install_r_kernel) shows how to install the R kernel into an Amazon SageMaker Notebook Instance.
 - [Bring Your Own scikit Algorithm](advanced_functionality/scikit_bring_your_own) provides a detailed walkthrough on how to package a scikit learn algorithm for training and production-ready hosting.
-- [Bring Your Own MXNet Model](advanced_functionality/mxnet_mnist_byom) shows how to bring a model trained anywhere using MXNet into Amazon SageMaker
-- [Bring Your Own TensorFlow Model](advanced_functionality/tensorflow_iris_byom) shows how to bring a model trained anywhere using TensorFlow into Amazon SageMaker
+- [Bring Your Own MXNet Model](advanced_functionality/mxnet_mnist_byom) shows how to bring a model trained anywhere using MXNet into Amazon SageMaker.
+- [Bring Your Own TensorFlow Model](advanced_functionality/tensorflow_iris_byom) shows how to bring a model trained anywhere using TensorFlow into Amazon SageMaker.
 
 ### Amazon SageMaker Pre-Built Deep Learning Framework Containers and the Python SDK
 
 
@@ -0,0 +1,127 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+from imdb import Imdb
+import random
+
+class ConcatDB(Imdb):
+    """
+    ConcatDB concatenates multiple imdbs into one larger db.
+    It is useful for combining multiple datasets that share the same classes.
+    Parameters
+    ----------
+    imdbs : Imdb or list of Imdb
+        Imdbs to be concatenated
+    shuffle : bool
+        whether to shuffle the initial list
+    """
+    def __init__(self, imdbs, shuffle):
+        super(ConcatDB, self).__init__('concatdb')
+        if not isinstance(imdbs, list):
+            imdbs = [imdbs]
+        self.imdbs = imdbs
+        self._check_classes()
+        self.image_set_index = self._load_image_set_index(shuffle)
+
+    def _check_classes(self):
+        """
+        check the input imdbs and make sure they all have the same classes
+        """
+        try:
+            self.classes = self.imdbs[0].classes
+            self.num_classes = len(self.classes)
+        except AttributeError:
+            # fine if no classes are provided
+            pass
+
+        if self.num_classes > 0:
+            for db in self.imdbs:
+                assert self.classes == db.classes, "Multiple imdb must have same classes"
+
+    def _load_image_set_index(self, shuffle):
+        """
+        get total number of images, init indices
+
+        Parameters
+        ----------
+        shuffle : bool
+            whether to shuffle the initial indices
+        """
+        self.num_images = 0
+        for db in self.imdbs:
+            self.num_images += db.num_images
+        indices = list(range(self.num_images))
+        if shuffle:
+            random.shuffle(indices)
+        return indices
+
+    def _locate_index(self, index):
+        """
+        given index, find out sub-db and sub-index
+
+        Parameters
+        ----------
+        index : int
+            index of a specific image
+
+        Returns
+        ----------
+        a tuple (sub-db, sub-index)
+        """
+        assert index >= 0 and index < self.num_images, "index out of range"
+        pos = self.image_set_index[index]
+        for k, v in enumerate(self.imdbs):
+            if pos >= v.num_images:
+                pos -= v.num_images
+            else:
+                return (k, pos)
+
+    def image_path_from_index(self, index):
+        """
+        given image index, find out full path
+
+        Parameters
+        ----------
+        index: int
+            index of a specific image
+
+        Returns
+        ----------
+        full path of this image
+        """
+        assert self.image_set_index is not None, "Dataset not initialized"
+        pos = self.image_set_index[index]
+        n_db, n_index = self._locate_index(index)
+        return self.imdbs[n_db].image_path_from_index(n_index)
+
+    def label_from_index(self, index):
+        """
+        given image index, return preprocessed ground-truth
+
+        Parameters
+        ----------
+        index: int
+            index of a specific image
+
+        Returns
+        ----------
+        ground-truths of this image
+        """
+        assert self.image_set_index is not None, "Dataset not initialized"
+        pos = self.image_set_index[index]
+        n_db, n_index = self._locate_index(index)
+        return self.imdbs[n_db].label_from_index(n_index)
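
The index bookkeeping in `ConcatDB` can be tried out without the `imdb` module. The following is a minimal standalone sketch, not part of the change itself: `StubDb`, `build_index`, `locate`, and the dataset names `voc2007`/`voc2012` are hypothetical stand-ins introduced only for illustration. It assumes each sub-database exposes `num_images` and `image_path_from_index`, as the code above relies on, and mirrors the logic of `_load_image_set_index` and `_locate_index`.

```python
import random


class StubDb:
    """Hypothetical stand-in for an Imdb subclass (illustration only)."""

    def __init__(self, name, num_images):
        self.name = name
        self.num_images = num_images

    def image_path_from_index(self, index):
        # Made-up path layout, just so the result is easy to read.
        return "{}/{:06d}.jpg".format(self.name, index)


def build_index(dbs, shuffle, seed=None):
    """Mirror of _load_image_set_index: one global position per image."""
    indices = list(range(sum(db.num_images for db in dbs)))
    if shuffle:
        random.Random(seed).shuffle(indices)
    return indices


def locate(dbs, image_set_index, index):
    """Mirror of _locate_index: map a global index to (sub-db, sub-index)."""
    if not 0 <= index < len(image_set_index):
        raise IndexError("index out of range")
    pos = image_set_index[index]
    for k, db in enumerate(dbs):
        if pos >= db.num_images:
            pos -= db.num_images  # position lies beyond this sub-db, skip it
        else:
            return k, pos
    raise AssertionError("unreachable while the index covers every image")


dbs = [StubDb("voc2007", 3), StubDb("voc2012", 2)]
index = build_index(dbs, shuffle=False)

paths = []
for i in range(len(index)):
    n_db, n_index = locate(dbs, index, i)
    paths.append(dbs[n_db].image_path_from_index(n_index))
print(paths)
```

With `shuffle=False`, global indices 0 to 2 resolve to the first stub database and 3 to 4 to the second. With `shuffle=True`, the same lookup runs over a permuted list, so every image is still visited exactly once but in random order.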