I have just finished my first week of the Google Summer of Code (GSOC) program with the CCExtractor organization. This is the first of thirteen weeks in which I will be developing a system to extract hard (burned-in) subtitles from a given video. This adds to the current functionality of CCExtractor, which extracts soft subtitles, i.e. those which are carried in the data structures of the video stream but are not part of the picture itself. The process will involve subtitle text localization followed by optical character recognition (OCR). In layman's terms, this will make the computer understand which letters make up the subtitles, as opposed to the series of pixels it originally sees in the video frame. These recognized letters can then be written to one of many popular subtitle formats, such as the SubRip format (a .srt file).
You can read my project proposal here.
The Setup
I am working from my university accommodation, and my development environment is Linux (Ubuntu 14.04). My setup can be seen in the featured image on this post, with two operating systems running side by side. I write all my code in Ubuntu and keep GitHub or any docs open on the left screen in Windows 10.
A great thing about GSOC is the complete freedom you are provided, as long as you work for the stipulated amount of time. It is like a full-time job, without the fixed hours or travel times. As long as I finish my assigned tasks and work for 40 hours in the week on my project, I can do whatever I want with my spare time. This week has been really fun in that aspect. I have had the perfect blend of working and enjoying my summer at the same time. In fact, if you look at the picture of my setup, you will see League of Legends open in the background (I was about to play a game at the moment of taking the picture). I would often update a pull request and run tests on it, and play a game while the tests ran, and then get back to work immediately after. I never felt short of time to enjoy my personal life while working on the project at the same time. I am working on a computer vision problem which interests me and is fun, while getting enough time to game or go out, and earning $5500 over the summer. What more could I ask for, really?
The Bug Hunt
The first week of the coding period was devoted to fixing existing bugs in the code. The rationale is that all developers work together for a week to fix as many bugs as possible, following which there is a new, improved release of CCExtractor (version 0.81 to be precise). This gives all developers a much more stable version to work with for the remaining part of GSOC.
In addition to fixing bugs, this week also gave me a lot of time to get acquainted with the code in depth. This was particularly useful for me because I need to integrate the pipeline I will develop seamlessly into the normal workflow of the existing program, so I need to know what is happening in the code and which parts I need to use and edit. Fixing bugs is a fantastic way to reach that level of familiarity with the code base.
I was assigned 6 bugs to try and fix during this week, and I managed to fix 3 of them. The bugs assigned to me were:
- Case fixing for teletext subtitles (Fixed)
- Seeking DVD video using the IFO (information) metadata file
- DVB subtitle extraction not working for a Spanish channel
- Issues with timing in ISDB (Brazilian) subtitles (Partially Fixed)
- Missing subtitles in a Korean broadcast
- Very high RAM consumption by the program (Fixed)
I also managed to fix another bug which was reported by David Liontooth of the Red Hen Lab at UCLA and had originated due to a previous pull request of mine. This entire process educated me a bit about how the maintenance stage of the software development life cycle is the one which involves the most effort. It is easy to write hundreds of lines of code and make something work at first glance, but with usage over time, we inevitably discover problems which need to be fixed. And in the open source world, it is good form for the person who wrote the part of the code which is causing problems to be responsible for it and work on a fix.
The Major Fix – Reducing RAM Consumption by 180 MB
In my opinion, the most significant issue which I was able to solve this week was reducing the memory consumption of the program by 180 MB.
A fellow developer had reported very high memory consumption by every instance of CCExtractor in an application he is building. After a lot of analysis, I narrowed the problem down to the fact that around 180 MB was being allocated statically for two months of EPG (Electronic Program Guide) data in every instance, even if it was never used. Once I had this pinpointed, I submitted a fix which changed the allocation to happen only when needed. Easy enough, right? That's what I thought too, and I ran tests on my pull request, which did reduce memory consumption from 200 MB to around 20 MB. Bizarrely, the fix was causing segmentation faults (accessing invalid parts of the memory) for a set of test cases. I had no idea why. I spent the whole next day tracing variable initializations throughout the code which could be causing the issue, and finally saw that in one small line of code the wrong context was being passed to a function, so a required value was never set. You can take a look at the technical details here.
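As an aside, the "allocate only when needed" idea described above can be sketched in C roughly as follows. This is a minimal illustration and not the actual CCExtractor code: the names `epg_store`, `epg_event` and `EPG_MAX_EVENTS`, as well as the sizes, are hypothetical and chosen only to show the pattern.

```c
#include <stdlib.h>

/* Hypothetical capacity, for illustration only. */
#define EPG_MAX_EVENTS 1024

/* Hypothetical EPG record; the real structures differ. */
struct epg_event {
    char title[256];
    long long start_time;
    long long end_time;
};

struct epg_store {
    struct epg_event *events; /* NULL until EPG data is first needed */
    size_t capacity;          /* number of allocated slots */
};

/* Creating the store allocates no event memory at all. */
void epg_store_init(struct epg_store *store)
{
    store->events = NULL;
    store->capacity = 0;
}

/* Allocate the zeroed event table on first use.
 * Returns 0 on success, -1 if the allocation fails. */
int epg_store_ensure(struct epg_store *store)
{
    if (store->events != NULL)
        return 0; /* already allocated, nothing to do */

    store->events = calloc(EPG_MAX_EVENTS, sizeof *store->events);
    if (store->events == NULL)
        return -1;

    store->capacity = EPG_MAX_EVENTS;
    return 0;
}

/* Release the table and return the store to its empty state. */
void epg_store_free(struct epg_store *store)
{
    free(store->events);
    store->events = NULL;
    store->capacity = 0;
}
```

Code that actually needs EPG data would call `epg_store_ensure()` before touching `events`. An instance that never asks for EPG data never pays for the table.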
I managed to fix the issue by passing the correct context, but the baffling question was: why did it work earlier, even when the wrong context was being passed? As it turned out, when the 180 MB of memory was previously being allocated (and set entirely to 0), the value which was never properly initialized ended up being read from one of the 0's in this huge chunk of memory. It therefore looked like a valid value and nothing broke. Once the big allocation was gone, the same read went through an invalid pointer and crashed. This was a perfect example of how a bug may just sit there, hidden and not breaking anything, only to come up later in the development life cycle and completely mess with your mind. But anyway, it was fixed and the program was now running as expected, while being 180 MB lighter on RAM 😀
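To show how a large zeroed allocation can hide a bug like this, here is a deliberately simplified model in C. It is only a sketch under my own assumptions, not the real code path: `demo_ctx`, `demo_setup`, `demo_read` and `buggy_lookup` are all hypothetical names, and where the real program would dereference a bad pointer and crash, the model returns -1 so that it stays safe to run.

```c
#include <stdlib.h>

#define DEMO_SLOTS 16 /* hypothetical table size */

struct demo_ctx {
    int *table; /* zeroed block if allocated up front, NULL otherwise */
};

/* eager != 0 models the old behaviour: a zeroed table for every context.
 * eager == 0 models the fix: nothing is allocated at creation time. */
int demo_create(struct demo_ctx *c, int eager)
{
    c->table = eager ? calloc(DEMO_SLOTS, sizeof *c->table) : NULL;
    return (eager && c->table == NULL) ? -1 : 0;
}

/* Prepares a context for use, allocating its table if it has none. */
int demo_setup(struct demo_ctx *c)
{
    if (c->table == NULL)
        c->table = calloc(DEMO_SLOTS, sizeof *c->table);
    return c->table != NULL ? 0 : -1;
}

/* Returns the slot value. Returns -1 where real code would have
 * dereferenced an invalid pointer and crashed. */
int demo_read(const struct demo_ctx *c, size_t slot)
{
    if (c->table == NULL || slot >= DEMO_SLOTS)
        return -1;
    return c->table[slot];
}

/* The bug: the wrong context is set up, then the right one is read. */
int buggy_lookup(struct demo_ctx *right, struct demo_ctx *wrong)
{
    if (demo_setup(wrong) != 0) /* should have been demo_setup(right) */
        return -1;
    return demo_read(right, 0);
}

void demo_destroy(struct demo_ctx *c)
{
    free(c->table);
    c->table = NULL;
}
```

With eager creation, `buggy_lookup()` reads a 0 from the pre-zeroed table of the right context, so the mistake goes unnoticed. With lazy creation, the right context has no table at all and the same call fails, which is the point at which the hidden bug finally becomes visible.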
An excerpt of the conversation on the GitHub issue page is as follows:-
It made me really happy to be able to help out with a fellow developer's application. After all, that is what open source is all about. We come together, use the cool things that others have developed, and move ahead to create something even better. It may be something as small as a little ping pong game or something as big as a deep neural network API such as Caffe, which drives cutting edge features at even giant companies like Facebook. Every little bit counts, and makes the code that the world uses better.
Moving to Week 2
In week 2, I will actually commence work on my project. I need to work on a frame extraction and pre-processing module for white-colored burned-in subtitles. I look forward to making it happen, and I will post my progress right here. Cheers!