This was a term project for Carnegie Mellon’s 15-112: Fundamentals of Programming
Frolf 112: A Disc Golf Simulator
Prompt
The prompt set almost no requirements beyond a minimum level of complexity. Since I was fascinated by computer vision and frisbee is one of my passions, I set out to combine the two in my project, Frolf 112 (Frisbee golf).
Code Overview
3D graphics
I knew that I wanted to make the game in 3D, and it quickly became apparent that using a library would be much faster than building the graphics myself. After looking around for a bit, I came across Panda3D, which had a decent amount of documentation, and was developed by Carnegie Mellon! It also had the functionality I was looking for, including loading in models, applying hit boxes, and creating animations. After a bit of tinkering, I had it installed and had learned enough of the library to use it as intended.
Computer Vision
It was very important to me that CV be used in the project, both because it excited me and because, as a Frisbee player, I found it difficult to imagine throwing the disc with arrow keys. Ultimately, I used the OpenCV library to build the CV pipeline. It takes in data from the webcam, finds the contour that represents the hand, and uses the centroid of that contour as the hand position. It also compares the size of the hand contour from frame to frame: when the hand suddenly becomes larger, the disc has been thrown (during the throwing motion the player thrusts their hand towards the camera, making it appear larger).
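The centroid and size checks described above can be sketched without any dependencies. This is only an illustration, not the project's code: it treats the whole binary mask as the hand region (in OpenCV the equivalent steps would typically go through `cv2.findContours`, `cv2.contourArea`, and `cv2.moments`), and the function names and the 1.5x growth threshold are assumptions made for the example.

```python
def centroid_and_area(mask):
    """Return ((cx, cy), area) of the nonzero pixels in a binary mask.

    Uses raw image moments: area = m00, centroid = (m10 / m00, m01 / m00).
    Returns (None, 0) when the mask is empty.
    """
    m00 = m10 = m01 = 0
    for y, row in enumerate(mask):
        for x, value in enumerate(row):
            if value:
                m00 += 1
                m10 += x
                m01 += y
    if m00 == 0:
        return None, 0
    return (m10 / m00, m01 / m00), m00


def is_throw(previous_area, current_area, growth=1.5):
    """True when the hand region grew by at least `growth` times in one frame.

    The 1.5 default is a made-up threshold for illustration.
    """
    return previous_area > 0 and current_area >= growth * previous_area


# Two tiny fake frames: a small hand, then a hand thrust towards the camera.
small_hand = [
    [0, 0, 0, 0],
    [0, 1, 1, 0],
    [0, 1, 1, 0],
    [0, 0, 0, 0],
]
large_hand = [[1] * 4 for _ in range(4)]

position, previous_area = centroid_and_area(small_hand)
_, current_area = centroid_and_area(large_hand)
print(position, is_throw(previous_area, current_area))
```

Comparing a ratio rather than an absolute difference keeps the check independent of how far the player stands from the camera.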
A difficulty that arose with this approach is that the hand contouring started by thresholding the input image with color bounds, which breaks down under different lighting, with different skin tones, and when skin-colored items appear elsewhere in the frame. Ultimately I was able to solve this problem by taking inspiration from the green screen: the player simply wore a brightly colored glove that I could more easily select for.
In a future version of the game, it may be possible to semantically segment the image inputs to allow for glove-less play, but that was well beyond the scope of the project.
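To make the color-bounds idea concrete, here is a minimal standard-library sketch of thresholding for a bright glove. It is a stand-in for what OpenCV's `cv2.inRange` does on an HSV image, not the project's actual code; the bounds below are hypothetical values for a bright green glove, and the glove color itself is an assumption.

```python
import colorsys

# Hypothetical HSV bounds (every channel scaled to 0..1) for a bright green glove.
GLOVE_LOWER = (0.25, 0.50, 0.40)
GLOVE_UPPER = (0.45, 1.00, 1.00)


def glove_mask(image, lower=GLOVE_LOWER, upper=GLOVE_UPPER):
    """Return a 0/1 mask marking pixels whose HSV value lies inside the bounds.

    `image` is a list of rows, each row a list of (r, g, b) tuples in 0..255.
    """
    mask = []
    for row in image:
        mask_row = []
        for r, g, b in row:
            hsv = colorsys.rgb_to_hsv(r / 255, g / 255, b / 255)
            inside = all(lo <= c <= hi for lo, c, hi in zip(lower, hsv, upper))
            mask_row.append(1 if inside else 0)
        mask.append(mask_row)
    return mask


frame = [
    [(30, 220, 60), (224, 172, 105)],  # glove green, a skin tone
    [(0, 20, 0), (40, 200, 80)],       # dark green shadow, glove green
]
print(glove_mask(frame))
```

Working in HSV rather than RGB is a common choice here because a saturated glove keeps roughly the same hue as the lighting changes, while the minimum saturation and value bounds reject skin tones and shadows.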
Physics Engine
The Panda3D built-in physics engine was a bit too rudimentary to model something as complex as a spinning, flying disc. Discs are essentially airfoils in that they are affected by lift and drag forces, as well as gravity. The issue arises with the number of states that can change due to the disc's orientation and speed.
The disc's linear speed, roll, and pitch all affect the drag and lift forces in the forward direction, and they also drive lateral motion. Furthermore, the roll and pitch angles tend to flatten out over the course of the flight. This was too much to solve in closed form, so I built my physics engine around the Euler method instead. The initial state comes from the CV input, and at every time step the engine computes the next state from the previous one. The states were:
x, y, and z coordinates
roll and pitch angles
x, y, and z velocities
angular velocities in the roll and pitch directions
I would use these states to determine the forces, and by extension the accelerations in each direction and about each axis. Then I would use the time step to find the n+1 values of all the states, push the result to the animation if it was time for a new frame, and repeat. This is still a simplification of the true physics, but it gets the point across, and short of truly simulating all of the aerodynamics, this type of engine is about all I could accomplish given the processing power I had access to.
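The stepping loop described above can be sketched as an explicit Euler integrator over the ten states. This is a rough illustration under stated assumptions, not the project's engine: the force model and every constant except gravity are hypothetical placeholders chosen only so the example runs, and the names `DiscState`, `accelerations`, `euler_step`, and `simulate` are invented for it.

```python
import math
from dataclasses import dataclass

G = 9.81        # gravitational acceleration, m/s^2
K_DRAG = 0.02   # hypothetical drag constant
K_LIFT = 0.03   # hypothetical lift constant
K_LEVEL = 2.0   # hypothetical pull of roll/pitch back towards flat
K_DAMP = 1.0    # hypothetical damping on the angular rates


@dataclass
class DiscState:
    x: float = 0.0
    y: float = 0.0
    z: float = 0.0
    roll: float = 0.0
    pitch: float = 0.0
    vx: float = 0.0
    vy: float = 0.0
    vz: float = 0.0
    roll_rate: float = 0.0
    pitch_rate: float = 0.0


def accelerations(s):
    """Placeholder force model: quadratic drag, speed-squared lift, gravity."""
    speed = math.sqrt(s.vx ** 2 + s.vy ** 2 + s.vz ** 2)
    lift = K_LIFT * speed ** 2 * math.cos(s.pitch)
    ax = -K_DRAG * speed * s.vx
    ay = -K_DRAG * speed * s.vy + lift * math.sin(s.roll)   # roll pushes sideways
    az = -K_DRAG * speed * s.vz + lift * math.cos(s.roll) - G
    roll_acc = -K_LEVEL * s.roll - K_DAMP * s.roll_rate      # angles flatten out
    pitch_acc = -K_LEVEL * s.pitch - K_DAMP * s.pitch_rate
    return ax, ay, az, roll_acc, pitch_acc


def euler_step(s, dt):
    """Advance every state by one explicit Euler step: next = current + rate * dt."""
    ax, ay, az, roll_acc, pitch_acc = accelerations(s)
    return DiscState(
        x=s.x + s.vx * dt,
        y=s.y + s.vy * dt,
        z=s.z + s.vz * dt,
        roll=s.roll + s.roll_rate * dt,
        pitch=s.pitch + s.pitch_rate * dt,
        vx=s.vx + ax * dt,
        vy=s.vy + ay * dt,
        vz=s.vz + az * dt,
        roll_rate=s.roll_rate + roll_acc * dt,
        pitch_rate=s.pitch_rate + pitch_acc * dt,
    )


def simulate(state, dt=0.01, max_time=20.0):
    """Step until the disc reaches the ground (z <= 0) or max_time elapses."""
    path = [state]
    t = 0.0
    while state.z > 0.0 and t < max_time:
        state = euler_step(state, dt)
        t += dt
        path.append(state)
    return path


throw = DiscState(z=1.0, vx=15.0, roll=0.2)  # hypothetical initial state from the CV input
flight = simulate(throw)
print(len(flight), flight[-1])
```

A renderer would only draw some of the entries in `flight`, since the physics time step can be much smaller than the frame interval.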
Multiprocessing
One of the largest hurdles I ran into was passing information between the CV, physics, and animation pipelines. The issue was that the camera would take a full second to initialize every time it was called upon, which made for a really laggy and frustrating experience. My solution was to initialize a live feed from the camera once and pass frames along when they were needed, rather than requesting that the camera turn on every time a new frame was being rendered. This was accomplished with multiprocessing, which allowed me to run two instances of Python simultaneously and pass data between the two as needed. It worked super well: as far as the camera was concerned, it was just turning on once and sending constant data, and as far as the physics engine was concerned, it always had access to the most recent camera frame.
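One way to get this "open once, always read the latest frame" behavior is a worker process that feeds a one-slot queue. The sketch below is an assumption about how such a setup could look, not the project's code: `camera_worker` and `grab_frames` are hypothetical names, and the fake frames stand in for real webcam reads.

```python
import multiprocessing as mp
import queue
import time


def camera_worker(frames, stop):
    """Stand-in camera process: pay the start-up cost once, then stream frames."""
    time.sleep(0.05)  # pretend this is the slow one-time camera initialization
    frame_id = 0
    while not stop.is_set():
        frame_id += 1
        frame = {"id": frame_id, "pixels": [frame_id] * 4}  # fake image data
        try:
            frames.get_nowait()  # drop a stale frame nobody consumed
        except queue.Empty:
            pass
        try:
            frames.put_nowait(frame)
        except queue.Full:
            pass  # the slot was refilled in the meantime; skip this frame
        time.sleep(0.005)


def grab_frames(count):
    """Start the worker once, then read `count` recent frames from it."""
    frames = mp.Queue(maxsize=1)  # one slot, so readers never see a backlog
    stop = mp.Event()
    worker = mp.Process(target=camera_worker, args=(frames, stop), daemon=True)
    worker.start()
    grabbed = []
    try:
        for _ in range(count):
            grabbed.append(frames.get(timeout=5)["id"])
            time.sleep(0.02)  # the game loop runs slower than the camera
    finally:
        stop.set()
        worker.join(timeout=5)
    return grabbed


if __name__ == "__main__":
    print(grab_frames(3))
```

The one-slot queue means the game loop skips frames rather than falling behind, which matters more for responsiveness than seeing every frame.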
Final Result
In the end I successfully combined my own physics engine, 3D graphics, computer vision, and multiprocessing to get this to work, and it came out pretty well! This project even won me the Mastercard best project award (in a course with about 400 people!).
See for yourself: