Google Summer of Code, Week 1 – The Bug Hunt

I have just finished my first week of the Google Summer of Code (GSOC) program with the CCExtractor organization. This is the first of thirteen weeks during which I will be developing a system that can extract hard (burned-in) subtitles from a given video. This adds to the current functionality of CCExtractor, which extracts soft subtitles, i.e. those which are part of the data structures of the video stream but are not part of the video itself. The process will involve subtitle text localization followed by optical character recognition (OCR). In layman’s terms, this will make the computer understand which letters are part of the subtitles, as opposed to the series of pixels it originally sees in the video frame. These recognized letters can then be written to one of many popular subtitle formats, such as SubRip (.srt files).
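To make the last step concrete, here is a minimal sketch of writing one recognized subtitle as a SubRip cue. The function names `srt_timestamp` and `write_srt_cue` are made up for illustration and are not CCExtractor's API; the sketch only assumes the usual SubRip layout of an index line, a `HH:MM:SS,mmm --> HH:MM:SS,mmm` time range, the text, and a blank line.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: format milliseconds as an SRT timestamp HH:MM:SS,mmm.
 * The output buffer should hold at least 13 bytes. */
void srt_timestamp(long ms, char *out, size_t n) {
    assert(out != NULL && ms >= 0);
    long h = ms / 3600000;
    long m = (ms / 60000) % 60;
    long s = (ms / 1000) % 60;
    snprintf(out, n, "%02ld:%02ld:%02ld,%03ld", h, m, s, ms % 1000);
}

/* Hypothetical helper: write one cue (index, time range, text, blank line).
 * Returns 0 on success, -1 on invalid input or a write error. */
int write_srt_cue(FILE *fp, int index, long start_ms, long end_ms, const char *text) {
    char start[32], end[32];
    if (fp == NULL || text == NULL || strlen(text) == 0) return -1;
    if (start_ms < 0 || end_ms < start_ms) return -1;
    srt_timestamp(start_ms, start, sizeof start);
    srt_timestamp(end_ms, end, sizeof end);
    return fprintf(fp, "%d\n%s --> %s\n%s\n\n", index, start, end, text) < 0 ? -1 : 0;
}
```

Calling `write_srt_cue(fp, 1, 0, 1500, "Hello")` on an open file would append a cue that starts at zero and lasts one and a half seconds.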

You can read my project proposal here.

The Setup

I am working from my university accommodation, and my development environment is Linux (Ubuntu 14.04). My setup can be seen in the featured image on this post, with two operating systems running side by side. I write all my code in Ubuntu and keep GitHub or any docs open on the left screen in Windows 10.

A great thing about GSOC is the complete freedom you are provided, as long as you work for the stipulated amount of time. It is like a full-time job, without the fixed hours or travel times. As long as I finish my assigned tasks and work for 40 hours in the week on my project, I can do whatever I want with my spare time. This week has been really fun in that aspect. I have had the perfect blend of working and enjoying my summer at the same time. In fact, if you look at the picture of my setup, you will see League of Legends open in the background (I was about to play a game at the moment of taking the picture). I would often update a pull request and run tests on it, and play a game while the tests ran, and then get back to work immediately after. I never felt short of time to enjoy my personal life while working on the project at the same time. I am working on a computer vision problem which interests me and is fun, while getting enough time to game or go out, and earning $5500 over the summer. What more could I ask for, really?

The Bug Hunt

The first week of the coding period was devoted to fixing existing bugs in the code. The rationale is that all developers work together for a week to fix as many bugs as possible, after which there is a new, improved release of CCExtractor (version 0.81, to be precise). This gives all developers a much more stable version to work with for the remainder of GSOC.

In addition to fixing bugs, this week also gave me time to get acquainted with the code in depth. This was particularly useful because I need to integrate the pipeline I will develop seamlessly into the normal workflow of the existing program, so I need to know what is happening in the code and which parts I need to use and edit. Fixing bugs is a fantastic way to achieve this level of familiarity with the code base.

I was assigned 6 bugs to try to fix during this week, and I managed to fix 3 of them. The bugs assigned to me were:

  1. Case fixing for teletext subtitles (Fixed)
  2. Seeking DVD video using the IFO (information) metadata file
  3. DVB subtitle extraction not working for a Spanish channel
  4. Issues with timing in ISDB (Brazilian) subtitles (Partially Fixed)
  5. Missing subtitles in a Korean broadcast
  6. Very high RAM consumption by the program (Fixed)

I also managed to fix another bug which was reported by David Liontooth of the Red Hen Lab at UCLA and had originated due to a previous pull request of mine. This entire process educated me a bit about how the maintenance stage of the software development life cycle is the one which involves the most effort. It is easy to write hundreds of lines of code and make something work at first glance, but with usage over time, we inevitably discover problems which need to be fixed. And in the open source world, it is good form for the person who wrote the part of the code which is causing problems to be responsible for it and work on a fix.

The Major Fix – Reducing RAM Consumption by 180 MB

In my opinion, the most significant issue which I was able to solve this week was reducing the memory consumption of the program by 180 MB.

A fellow developer had reported very high memory consumption by every instance of CCExtractor in an application he is building. After a lot of analysis, I narrowed the problem down to the fact that around 180 MB of space was being allocated statically for two months of EPG (Electronic Program Guide) data in every instance, even if it was never used. Once I had this pinpointed, I submitted a fix which changed the allocation to happen only when needed. Easy enough, right? That’s what I thought too, and I ran tests on my pull request, which did reduce memory consumption from 200 MB to around 20 MB. Bizarrely, the fix was causing segmentation faults (accesses to invalid parts of memory) for a set of test cases. I had no idea why. I spent the whole next day tracing variable initializations throughout the code which could be causing the issue, and finally saw that in one small line of code, the wrong context was being passed to a function, causing a required value to not be set. You can take a look at the technical details here.
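The idea of allocating only when needed can be sketched roughly as follows. This is not the actual CCExtractor change: the struct layout, the names `decoder_ctx`, `ctx_get_epg` and `EPG_MAX_EVENTS`, and the sizes are all assumptions chosen just to show the pattern of replacing a static table with one created on first use.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical size, for illustration only */
#define EPG_MAX_EVENTS 1024

struct epg_event {
    unsigned id;
    char title[64];
};

/* Eager version (what we want to avoid): a static table like
 *   static struct epg_event epg_events[EPG_MAX_EVENTS];
 * reserves its memory in every instance, used or not. */

struct decoder_ctx {
    struct epg_event *epg_events; /* NULL until EPG data is first needed */
    size_t epg_count;
};

void ctx_init(struct decoder_ctx *ctx) {
    assert(ctx != NULL);
    ctx->epg_events = NULL;
    ctx->epg_count = 0;
}

/* Allocate the zeroed table on first use; returns NULL if allocation fails */
struct epg_event *ctx_get_epg(struct decoder_ctx *ctx) {
    assert(ctx != NULL);
    if (ctx->epg_events == NULL) {
        ctx->epg_events = calloc(EPG_MAX_EVENTS, sizeof *ctx->epg_events);
    }
    return ctx->epg_events;
}

void ctx_free(struct decoder_ctx *ctx) {
    assert(ctx != NULL);
    free(ctx->epg_events); /* free(NULL) is a no-op, so this is always safe */
    ctx->epg_events = NULL;
    ctx->epg_count = 0;
}
```

The trade-off is that every code path that used to assume the table exists must now go through the accessor or check for NULL, and a path that skips the check can crash.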

I managed to fix the issue by passing the correct context, but the baffling question was: why did it work earlier, even when the wrong context was passed? As it turned out, when 180 MB of memory was previously being allocated (and set entirely to 0), the value which was never properly initialized because of the wrong context was read through a pointer that happened to land on one of the 0’s in this huge chunk of memory. It therefore got a valid value, and nothing broke. This was a perfect example of how a bug may just sit there, hidden and not breaking anything, and then come up later in the development life cycle and completely mess with your mind. But anyway, it was fixed and the program was now running as expected, while being 180 MB lighter on RAM 😀
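A much-simplified analogue of this kind of masked bug is sketched below. It is not the real CCExtractor code: `table`, `ctx` and `read_flag` are hypothetical names. The point is that an unchecked read through a context's pointer looks fine as long as every context points at zeroed memory, and only fails once that memory stops being allocated up front.

```c
#include <assert.h>
#include <stddef.h>

struct table { int flag; };

/* Hypothetical context: tbl may be NULL once the table is allocated lazily */
struct ctx { struct table *tbl; };

/* An unchecked read such as
 *   int read_flag_unchecked(const struct ctx *c) { return c->tbl->flag; }
 * returns 0 while tbl points at zeroed memory, hiding the fact that the
 * caller passed a context whose table was never set up on purpose. */

/* Checked read: returns 0 and stores the flag on success, -1 if there is no table */
int read_flag(const struct ctx *c, int *out) {
    assert(out != NULL);
    if (c == NULL || c->tbl == NULL) return -1;
    *out = c->tbl->flag;
    return 0;
}
```

With the checked version, the wrong context shows up as an explicit error at the call site instead of as a segmentation fault somewhere later.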

An excerpt of the conversation on the GitHub issue page is as follows:-

[Image: screenshot of the GitHub issue conversation about the memory fix]

It made me really happy to be able to help out with a fellow developer’s application. After all, that is what open source is all about. We come together and use the cool things that others have developed, and move ahead to create something even better. It may be something as small as a ping pong game or something as big as a deep neural network API such as Caffe, which drives cutting-edge features at even giant companies like Facebook. Every little bit counts, and makes the code that the world uses better.

Moving to Week 2

In week 2, I will actually commence work on my project. I need to work on a frame extraction and pre-processing module for white colored burned-in subtitles. I look forward to making it happen, and I will update my progress right here. Cheers!

A Strange Road Accident

I had a strange day.. Finished my last board exam and headed home by an auto-rickshaw.. On the way, a 7-8 year old child was walking on the road with his mother and a friend of hers.. He was listening to music on a mobile phone and ran blindly across the road.. My auto-rickshaw collided with the child, who suffered multiple injuries and lay on the street, unconscious.. A mob gathered around us, and had it not been for me and the auto-wallah rushing him to the hospital (the accident took place right in front of the Civil Hospital in Wardha), the auto-wallah would surely have been beaten brutally and his auto burnt.. We managed to get him to a hospital bed, where he lay helplessly, now awake and crying.. It was clear that he had suffered a break in his knee joint (it looked limp).. A doctor came around 3-4 minutes later.. She looked at the kid and advised his mother to get him admitted.. Shockingly, the mother said that she couldn’t as she was in somewhat of a hurry.. The doctor then went on to stress the fact that the kid’s knee was fractured and he couldn’t possibly walk in that state.. The mother reluctantly agreed to have him admitted after almost everyone present insisted that she do so.. There was a police inspector there taking a statement from another accident victim.. He advised the auto-wallah and the mother to not say that it was a road accident, as it would involve some police paperwork, which he seemed to want to avoid.. All I did was try and console the kid that it was going to be okay.. When I felt that I could do no more, I paid the driver and went home in another auto-rickshaw..
Life is strange sometimes..

A Hero Comes Home

The floodlights were on, the pitch was glistening
The whole world watching, the players listening
To the roar of the crowd as they walked out that night
On what was supposed to be a footballing delight

A hero returned, to his home which once was
He was greeted by cheers and thunderous applause
Thousands of fans, singing his name
It was just like the old days, almost all the same

But this night he was, on the side of the foe
Even if on the night, he was reluctantly so
He lined up against old friends, it was what he had to do
Because the foe kept food on his table you know

The game kicked off, the battle began
The quest was on to see who can
Emerge victorious on the biggest stage
And in doing so write a page
Of the famed pages of footballing history
But just who would win, was a total mystery

The giants clashed, it was an epic in the making
Every player on the pitch knew how vital was taking
Any half-chance which may come his way
To help his team take the next step to a Wembley day

The game went on, the home team went ahead
On the night of champions, first blood was red

Then came the moment, which changed it all
A home player jumped in the air for the ball
But the referee did not like what he saw
And the player was sent off, distraught

The momentum shifted, the foe now attacked
And with a touch of genius, they pulled it back

The fans still roared, singing in the night
As the home team put up, an incredible fight
But in the end, it was to be in vain
A night of magic, turned into one of pain

The clinching moment, was written in the stars
The hero did what he had to do
He latched onto the end of a low pass
And struck a blow into every home fan’s heart

He never celebrated, out of heartfelt respect
For he remembered his roots, where he grew
As a player, and as a man
Showing the world, what he could do

The game ended, the final whistle blew
The home fans still sang, for they knew
Their team had made them all proud
And their voices never ceased to sing aloud

They still cheered for him, their hero of old
Of this magical night, stories will forever be told
He turned back the years, showing his desire
To create magic on the field, for all to admire

The hero was doubtless, the star of the game
Battling with emotion, he walked off the stage
And with his heart pounding, as he did so

He could still hear the fans singing, “Viva Ronaldo”

When India Won The Cricket World Cup

A NATION CELEBRATES…

After a long wait of 28 years and six unsuccessful attempts, the Men in Blue finally got their hands on the holy grail of the cricket world, the World Cup, once again. But it was not a walk in the park for them by any means. They faced a spirited Sri Lankan side that was raring to go and put up a gritty performance.

MATCH REPORT

The toss was won by the Lankans, and unsurprisingly, Sangakkara elected to bat first on a wicket that was expected to deteriorate and slow down by the time the second innings was to be played. But whatever advantage was offered to them by winning the toss was quickly taken away as they were hit by a fantastic opening spell by Zaheer Khan who bowled 3 maiden overs to increase the already overwhelming pressure of the occasion on the Sri Lankans. Added to this was the superb fielding of Yuvraj, Kohli and Raina, who between them saved many a boundary by throwing their bodies around. Finally, the pressure told, and Tharanga edged one to Sehwag at first slip. This was followed by Dilshan getting out to a soft dismissal, trying to sweep a Harbhajan ball that was outside leg stump and only managing to deflect it onto the stumps. Sri Lanka then tried to consolidate their position and work their way to a respectable total. With two calm heads, Sangakkara and Jayawardene at the crease, they started rotating the strike well and putting together a decent partnership. But two runs short of his half century, Sangakkara misjudged an innocuous looking Yuvraj delivery and edged it into the waiting gloves of Dhoni. Now, India had broken into the suspect middle-order of the Lankans, and soon, Samaraweera was given out LBW to Yuvraj after the review and reversal of Simon Taufel’s original decision. But at the other end, Jayawardene was still going strong and he found an able companion in the form of Nuwan Kulasekara, who played a little cameo towards the end of the innings. Much to the delight of the Lankan fans, Jayawardene completed his century and Thisara Perera showed that he can hit the ball too, and the Lankan recovery from a slow start was complete with a blistering Batting Powerplay. The target set for the Indians was a very respectable 275 at a required rate of 5.5 runs per over.

Like that of their counterparts, the Indian innings had a shaky start too. Both teams had finished with their lowest mandatory powerplay scores in the tournament. The first wicket to go down was Sehwag, out LBW to Malinga in a fashion similar to his dismissal in the semis against Pakistan. But the biggest shock of the evening came when the little master, Sachin Tendulkar himself, was caught behind after flashing at a Malinga outswinger, and the crowd was wrapped in deafening silence, with their expectations of a Tendulkar hundred unfulfilled. Although India got off to a bad start, the Delhi boys, Gambhir and Kohli, started to stamp their authority on the game by stringing together a solid stand until the latter was out caught and bowled by Dilshan after a leading edge. Surprisingly, Dhoni himself elected to walk in at number 4 ahead of Yuvraj. But his tactics proved to be spot on, as his partnership with Gambhir became the highest by an Indian pair in a World Cup Final. Just when the chase was going swimmingly, a moment of madness by Gambhir saw him bowled, just three runs short of what would have been a magnificent ton. But if the Lankans thought that there was a turn in their fortunes to come, they were sadly mistaken, because Yuvraj showed no signs of taking his foot off the gas. However hard the Lankans tried, the duo had the answer to every delivery they bowled. Finally, India clinched victory in a fitting manner, with Dhoni himself hitting a huge six to start the celebrations.

AFTERMATH

There was joy in the streets, victory parades were started nationwide, and the country rejoiced in a way never seen before. Fireworks went off all around. A billion people celebrated. The promise had been fulfilled, the tag of favourites had been justified, and the World Cup was brought home…