Recognize and Track Persons

You use an instance of the ObjectTracker class to detect, recognize, and track persons in a video stream. The video stream may originate from a camera or a video file. The object tracker can process the video stream either in real time or as fast as possible. The former mode is typically used for a video stream that originates from a camera, while the latter mode can be used to process the content of a video file faster than its normal playback speed.

The following sections explain how to create and use an object tracker.

Initialize the ArgusKit Framework

You must initialize the ArgusKit framework before you call any of its APIs. You do this by calling the ARKInit() function at the beginning of your application.

Note: You may call this function only once per application session.
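A minimal sketch of the initialization; the ArgusKit module name is an assumption here, so use whatever import your integration requires:

import ArgusKit   // assumed module name

// Initialize ArgusKit exactly once, before any other ArgusKit call
ARKInit()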

Create the Object Tracker

Create an instance of an object tracker configuration object that stores all the configuration information an object tracker needs to do its work. Most of the configuration information has sensible default values, but the following information must be explicitly provided: the cloud environment, the cloud user credentials, and the site and source identifiers that identify the camera.

The following code snippet shows how to set up the object tracker configuration and how to initiate the object tracker.

var trackerRef: ARKObjectTrackerRef! = nil

// Select the appropriate cloud environment, e.g. PROD
let envRef = ARKEnvironmentCopyNamed("com.real.PROD")

// Specify the cloud environment login credentials
// NOTE: replace $USER and $PWD with the username and password that you
// NOTE: have received from us.
let userRef = ARKUserCreate($USER, $PWD)

// Set the user directory in which the cloud should store its data
ARKUserSetDirectory(userRef, "test")

let configRef = ARKObjectTrackerConfigurationCreate()!
ARKObjectTrackerConfigurationSetObject(configRef, kARKObjectTrackerConfigurationKey_Environment, envRef)
ARKObjectTrackerConfigurationSetObject(configRef, kARKObjectTrackerConfigurationKey_User, userRef)
ARKObjectTrackerConfigurationSetString(configRef, kARKObjectTrackerConfigurationKey_SiteID, "Building 1")
ARKObjectTrackerConfigurationSetString(configRef, kARKObjectTrackerConfigurationKey_SourceID, "Camera 1")
ARKObjectRelease(userRef)
ARKObjectRelease(envRef)

// Create the object tracker and register the callbacks it should invoke
var callbacks1 = ARKObjectTrackerCallbacks()
callbacks1.context = Unmanaged.passUnretained(self).toOpaque()
callbacks1.willBeginTracking = object_tracker_will_begin_tracking_callback
callbacks1.didEndTracking = object_tracker_did_end_tracking_callback
callbacks1.didCreateTrackingResult = object_tracker_did_create_tracking_result_callback

trackerRef = ARKObjectTrackerCreate(configRef, true, &callbacks1)
ARKObjectRelease(configRef)

This example code selects the desired cloud environment and creates a new user object with the required user identifier and password. It also sets the cloud directory where the cloud-based face recognizer should save recognition-related information.

It then creates an object tracker configuration object and sets the cloud environment, cloud user, and some additional information to help identify the camera. It then sets up the necessary callbacks the object tracker should invoke as it processes the video stream. Finally, the example code creates the actual object tracker object.

Note: The example code assumes that the input video stream originated from a camera. This is why true is passed to the real-time parameter of the ARKObjectTrackerCreate() function.
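If the input video stream originates from a video file instead, you would pass false so that the object tracker processes the stream as fast as possible. A minimal sketch, reusing the configuration and callbacks from above:

// Process a video file as fast as possible instead of in real time
trackerRef = ARKObjectTrackerCreate(configRef, false, &callbacks1)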

Start a Tracking Session

Next, start a new tracking session. All tracking-related activities are done in the context of a tracking session. The object tracker uses a tracking session to maintain the necessary state. The following code snippet shows how to start a new tracking session.

ARKObjectTrackerBeginTracking(trackerRef)

Note: Always end the current tracking session and start a new session if the video source has changed or the resolution or frame rate of the video stream has changed. For example, end the current tracking session and start a new one if the user switched cameras or selected a different capture profile.
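For example, a camera switch could be handled as sketched below; switchToCamera() stands in for your application's own camera-selection code:

// The video source changed, e.g. because the user switched cameras
ARKObjectTrackerEndTracking(trackerRef)
switchToCamera(newCamera)   // hypothetical application-specific code
ARKObjectTrackerBeginTracking(trackerRef)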

The object tracker invokes the begin-tracking callback at the start of a new tracking session. Your application can use this callback as a signal that a new tracking session has started. The following code snippet shows an example of such a callback.

private func object_tracker_will_begin_tracking_callback(_ trackerRef: ARKObjectTrackerRef, _ context: UnsafeMutableRawPointer?)
{
    print("willBeginTracking")
}

Run a Tracking Session

A new video frame object should be created for every decoded video frame and passed to the object tracker. The object tracker in turn runs a face detector on the video frame and triggers face recognitions as needed. It then updates its internal list of tracked objects with the results of the detectors and recognizers.

The object tracker invokes the application-provided callback with the current state of the tracked objects list. The application can inspect this list and trigger actions based on it.

Note: The application should create a copy of the tracked objects list if it wants to retain the data (e.g. if the application wants to process the tracked objects on a different thread).
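One way to do this is sketched below. It assumes that ArgusKit provides an ARKObjectRetain() counterpart to ARKObjectRelease(); this name is an assumption, so verify it against the framework headers or copy the tracked objects list instead:

// Sketch: extend the lifetime of the result so another thread can process it.
// ARKObjectRetain() is an assumed counterpart to ARKObjectRelease().
ARKObjectRetain(resultRef)
DispatchQueue.global().async {
    // Inspect the tracked objects here...
    ARKObjectRelease(resultRef)
}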

The following code snippet shows how to create a video frame object and how to pass it to the object tracker.

let frameRef = ARKVideoFrameCreateWithPixelBuffer(imageBuffer, timestamp, false)!
 
// Pass this video frame to the object tracker
ARKObjectTrackerTrackObjects(trackerRef, frameRef)
ARKObjectRelease(frameRef)

You must provide a timestamp when you create a video frame object. This is typically the presentation timestamp of the video frame. The isSceneChange parameter of the ARKVideoFrameCreateWithPixelBuffer() function should be set to true if the video frame is the first video frame after a scene change. Scene changes in movies are often indicated by a cut transition from one scene to another. The object tracker uses this information to enhance its ability to disambiguate between persons in different scenes.
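For video that arrives as CMSampleBuffer objects (e.g. from an AVFoundation capture session), both inputs can be derived from the sample buffer. A minimal sketch; ARKVideoFrameRef is the assumed name of the video frame type, and that the timestamp is expected in seconds is an assumption here:

import CoreMedia

func makeVideoFrame(from sampleBuffer: CMSampleBuffer) -> ARKVideoFrameRef? {
    guard let imageBuffer = CMSampleBufferGetImageBuffer(sampleBuffer) else { return nil }
    // Presentation timestamp of the frame, converted to seconds (assumed unit)
    let timestamp = CMTimeGetSeconds(CMSampleBufferGetPresentationTimeStamp(sampleBuffer))
    // A camera stream has no scene changes, so pass false (see the note below)
    return ARKVideoFrameCreateWithPixelBuffer(imageBuffer, timestamp, false)
}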

Note: Although a video stream from a camera includes key frames, these key frames do not indicate a scene change for tracking purposes, so the key frames should not be treated as a scene change.

The following code snippet shows an example implementation of the did-create-tracking-result callback that simply prints a description of the new tracking result.

private func object_tracker_did_create_tracking_result_callback(_ trackerRef: ARKObjectTrackerRef, _ resultRef: ARKTrackingResultRef, _ context: UnsafeMutableRawPointer?)
{
    ARKObjectPrintDebugDescription(resultRef)
}
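Because the callbacks are plain function pointers, the context parameter is the only way to reach back into your application object. The sketch below recovers the object that was stored with Unmanaged.passUnretained(self).toOpaque() earlier; MyTrackerClient stands in for your own class:

// Sketch: recover the application object from the callback context.
// MyTrackerClient is a placeholder for your own class.
private func clientFromContext(_ context: UnsafeMutableRawPointer?) -> MyTrackerClient? {
    guard let context = context else { return nil }
    return Unmanaged<MyTrackerClient>.fromOpaque(context).takeUnretainedValue()
}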

The object tracker invokes the end-tracking callback at the end of the tracking session. Your application code can use this callback as a signal that a tracking session has ended. The following code snippet shows an example implementation of such a callback:

private func object_tracker_did_end_tracking_callback(_ trackerRef: ARKObjectTrackerRef, _ context: UnsafeMutableRawPointer?)
{
    // A good place to release any state that was accumulated during the
    // tracking session; this example has nothing to clean up.
}

End a Tracking Session

You inform the object tracker about the end of a tracking session by invoking the ARKObjectTrackerEndTracking() function. This allows the object tracker to clean up its internal state, and it signals that pending callbacks should be executed as soon as possible. The following code snippet shows how to end a tracking session.

ARKObjectTrackerEndTracking(trackerRef)
