Add another camera to DLC and take your data to 3D

How awesome is DeepLabCut? Open-source markerless tracking of videos. So fun! Now let’s take that fun into 3D space!

This project demonstrates how to use DLC to generate three-dimensional feature-tracking data from multiple views of the same scene/behavior. Two (or more) cameras can be placed around an area of interest without the need for a fancy setup (that is, no a priori distance, angle, or focal length needs to be precise). The code here shows how to capture the video (with time-stamped frames). You can then take a ‘wand’ (any object that has two unique points, like a marker or pen) and wave it in front of the cameras, using DLC to track the points. We then have some code to clean up the DLC data, and finally use a software package to create a calibration file. With this, you can capture new videos of behaviors of interest, run them through DLC, and reconstruct them in 3D!

This is written and tested only on Ubuntu. EasyWand is written for MATLAB. The scripts below are written for a Jupyter notebook; we assume you know how to use Jupyter.

Unfamiliar with DeepLabCut?


Scripts should be called in a particular order. We provide a script to capture videos, then jump on over to DLC to track the features of the videos. We show how to get this data into a form suitable for EasyWand, then use EasyWand for calibration, and finally reconstruct the features in 3D space.


Download this zip file and extract it to some location. This has the files necessary to run these scripts. The demo data we have is too large for us to host, but if you want it, email us and we’ll figure something out.

Jupyter is installed as part of Anaconda.

If you don’t have OpenCV installed, pop open the terminal (Ctrl+Alt+T) and enter

sudo apt install python3-opencv

Then start a jupyter notebook

jupyter notebook

Your favorite browser will launch. Navigate to where you extracted the 3DLC_demo folder. Open the video_capture.ipynb file.

1) video_capture.ipynb

Sequentially capture video footage from multiple cameras using OpenCV. Other software can be used to capture footage, but it is critical that each video have the same number of frames. Cameras should ideally be synchronized (or nearly so) to simultaneously capture frames.

This code can be used to record the “wand” video and, later, the “behavior” video you would like to analyze. Carefully set up your cameras so they are well positioned to record the behavior of interest (try to position the cameras so that each feature you would like to analyze is visible to at least two cameras at a time).

We will calibrate the cameras using a wand. The wand should be an object of known length, preferably a length similar to the scale of the behavior. It should have two well-defined small points on its ends (like two different colors of tape wrapped around a marker). For the wand video, record the wand moving around in the overlapping field of view of all cameras. It is advised to keep both points within the view of all cameras, and try not to point the wand directly at a camera (with both points exactly overlapping, the calibration can become confused).

We can take this video, track the two points in DLC, and create a calibration between all cameras. Then, as long as the camera setup doesn’t change in any major way, we can use this calibration to get 3D tracking of new behaviors.

This code can also be used to record “behavior” video, in which we can then track features with DLC. Later we will use the calibration file to take the 2D data from DLC and convert it into 3D space!

Note that there is often a sizable delay (a second or two) between the first and second frames of videos captured with this script (initialization). The exact duration of this delay can be found within the timestamps output file.

  • Output Files
    • a series of ‘.avi’ video files with the “wand” moving around in the overlapping field of view of the cameras
    • a series of ‘.avi’ video files with the “features” from multiple camera angles
    • a ‘timestamps’ ‘.csv’ file that records the time every frame was captured from each camera. All times are relative to the moment when the first frame of the first video was captured. Rows are frames, columns are cameras.
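
The capture loop at the heart of this step can be sketched as below. This is a minimal, hardware-free version (illustrative, not the notebook’s exact code): the camera objects only need a `cv2.VideoCapture`-style `read()` method, so for real recording you would pass `cv2.VideoCapture(index)` objects and `cv2.VideoWriter` writers; the function name `capture` is ours.

```python
# Minimal sketch of the capture loop (illustrative, not the notebook's exact
# code): read one frame from each camera in turn and log per-frame timestamps
# relative to the very first captured frame (rows = frames, columns = cameras).
# Cameras only need a cv2.VideoCapture-style read() method; for real recording,
# pass cv2.VideoCapture(i) objects and cv2.VideoWriter writers.
import csv
import time

def capture(cameras, writers, n_frames, timestamps_path="timestamps.csv"):
    t0 = None
    rows = []
    for _ in range(n_frames):
        stamps = []
        for cam, writer in zip(cameras, writers):
            ok, frame = cam.read()          # sequential grab, camera by camera
            now = time.monotonic()
            if t0 is None:
                t0 = now                    # zero time = first frame, first camera
            stamps.append(round(now - t0, 6))
            if ok:
                writer.write(frame)
        rows.append(stamps)
    with open(timestamps_path, "w", newline="") as f:
        csv.writer(f).writerows(rows)       # the 'timestamps' csv described above
    return rows
```

The initialization delay mentioned above shows up in this file as a gap between the first and second rows of timestamps.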

2) DeepLabCut Tracking of Features in Videos

DeepLabCut will be used to track features in each set of videos. It is preferable to first train a network to track both tips of the “wand” being used. Once the wand data are collected, you can also begin training a network to track the features in the behavior videos.

  • Input Files
    • wand ‘.avi’ video files for each camera angle or
    • feature ‘.avi’ video files for each camera angle
  • Output Files
    • a ‘.csv’ file for each camera angle with wand position data or
    • a ‘.csv’ file for each camera angle with data for each tracked feature

3) format_wand_csv.ipynb

Clean up and reformat DeepLabCut’s wand data for use in camera calibration. It’s unlikely that the wand was perfectly visible in every frame of the wand videos. This script uses DeepLabCut’s “likelihood” metric to judge whether both tips of the wand were clearly identified within each frame of the videos. If the wand is ever not visible within a frame, then that frame will be omitted from the file used for camera calibration.

Note that you have control over the “likelihood” cutoff, but a cutoff >= 0.999999 is recommended.

  • Input Files
    • a ‘.csv’ file for each camera angle with position data for each tracked feature
  • Output Files
    • a “formatted” csv file that will be passed into easyWand5
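
The filtering step can be sketched as below, assuming DLC’s standard ‘.csv’ layout of three header rows (scorer / bodyparts / coords) with an x, y, likelihood triplet per body part. The function name and the side-by-side column ordering of the output are illustrative; consult the easyWand user guide for the exact column order it expects.

```python
# Hypothetical sketch of the likelihood filter: keep only frames in which every
# tracked point passed the cutoff in EVERY camera. Assumes DLC's standard csv
# layout: three header rows (scorer/bodyparts/coords) and x, y, likelihood
# columns per body part. Output column order is illustrative - check the
# easyWand user guide for the exact format it expects.
import pandas as pd

def filter_wand_data(csv_paths, cutoff=0.999999):
    cams = [pd.read_csv(p, header=[0, 1, 2], index_col=0) for p in csv_paths]
    # a frame survives only if every likelihood column in every camera passes
    likelihoods = pd.concat(
        [c.xs("likelihood", axis=1, level=2) for c in cams], axis=1
    )
    keep = likelihoods.ge(cutoff).all(axis=1)
    # drop the likelihood columns and stack the cameras side by side
    xy = [c.drop("likelihood", axis=1, level=2)[keep] for c in cams]
    return pd.concat(xy, axis=1)
```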

4) easyWand5.m

Calculates the Direct Linear Transformation (DLT) coefficients in MATLAB for the cameras based on the formatted wand data. easyWand and its user guide can be found here. We would love to find a Python-based alternative, but for now it’s hard to beat easyWand5’s functionality.

Pull up the easyWand GUI by calling the following within MATLAB from the “demo_3d” folder:

easyWand5

  • Input Files
    • ‘.csv’ data files from DLC, reformatted with format_wand_csv.ipynb, for each camera, from the wand videos
  • Output Files
    • a ‘.csv’ of the DLT coefficients

5) reconstruct_3d.ipynb

Reconstructs 3D data for tracked features using DLT coefficients.
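
The notebook’s internals aren’t reproduced here, but the core triangulation can be sketched. This assumes the conventional 11-parameter DLT layout (coefficients L1..L11 per camera); verify that it matches the coefficient file your easyWand version writes. The function name is ours.

```python
# Sketch of least-squares DLT triangulation: given each camera's 11 DLT
# coefficients and the 2D pixel position of one feature in two or more views,
# solve for its 3D position. The L1..L11 ordering below is the conventional
# 11-parameter DLT layout; verify it matches your easyWand coefficient file.
import numpy as np

def reconstruct_3d(dlt_coefs, points_2d):
    """dlt_coefs: (n_cams, 11) array; points_2d: (n_cams, 2) pixel (u, v)."""
    A, b = [], []
    for (u, v), L in zip(points_2d, dlt_coefs):
        # u = (L1 x + L2 y + L3 z + L4) / (L9 x + L10 y + L11 z + 1); same form for v
        A.append([L[0] - u * L[8], L[1] - u * L[9], L[2] - u * L[10]])
        A.append([L[4] - v * L[8], L[5] - v * L[9], L[6] - v * L[10]])
        b.append(u - L[3])
        b.append(v - L[7])
    xyz, *_ = np.linalg.lstsq(np.asarray(A), np.asarray(b), rcond=None)
    return xyz
```

With more than two cameras the system is overdetermined, and the least-squares solve averages out per-camera tracking noise.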

  • Input Files
    • a ‘csv’ output file from DeepLabCut for every camera
    • a ‘.csv’ of the DLT coefficients
    • (Optional) a ‘timestamps’ ‘.csv’ file as generated by the video_capture script. If a direct path is not given, the script will search the current directory for the most recently generated ‘.csv’ file with the word ‘timestamps’ in its name and load the data from it. If no path is given and no timestamps file is found, the program will assume a frame rate of 20 fps for the animations.
  • Output Files
    • Contained within a folder in the demo_3d directory, you’ll find a ‘.csv’ file for every 3D-tracked body part, along with a couple of animation files (in mp4 and gif formats)
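
The timestamps fallback described in the inputs above can be sketched as follows (the function name is ours, not from the notebook):

```python
# Sketch of the timestamps fallback: use the most recently modified csv whose
# name contains "timestamps", otherwise assume 20 fps for the animations.
import glob
import os

def find_timestamps(directory=".", default_fps=20):
    candidates = glob.glob(os.path.join(directory, "*timestamps*.csv"))
    if not candidates:
        return None, default_fps   # no file found: assume 20 fps
    return max(candidates, key=os.path.getmtime), None
```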

Multiple Cameras

But are multiple cameras necessary? Maybe not. We will test a novel technology and report later this week (hopefully). It promises absolute synchrony across multiple views from a single camera. It utilizes a crazy new technology I’m calling Moment Investigation: Reach Research ORdered Synchrony (or MIRRORS (cough cough))….

ONE Core acknowledgement

Please acknowledge the ONE Core facility in your publications. An appropriate wording would be:

“The Optogenetics and Neural Engineering (ONE) Core at the University of Colorado School of Medicine provided engineering support for this research. The ONE Core is part of the NeuroTechnology Center, funded in part by the School of Medicine and by the National Institute of Neurological Disorders and Stroke of the National Institutes of Health under award number P30NS048154.”