Things to do in San Francisco – one day

Here’s what I recommended to a friend visiting San Francisco for a day -

  • Visit the Ghirardelli factory – they’re an institution, renowned for their ice cream. Pick up the 80% cocoa dark chocolate. Recommended time: evening, around 7ish.
  • Visit Pier 33 – take a boat to Alcatraz (a “kala pani” style island prison, like the one in the Andaman and Nicobar Islands). Take the audio tour there. Recommended time: 10ish. Very touristy thing to do.
  • Visit Fisherman’s Wharf – again a very touristy thing to do.
  • Take a ferry to Sausalito – tourist town, very scenic. It’s a half-day affair, so do it at your own risk.
  • Visit the Embarcadero – (a boarding point for the Sausalito ferry). Scenic area, right by the water. Also a boarding point for the streetcar. Again very touristy.
  • Take a cable car. Must must do.
  • Visit the crooked street (Lombard St.) – overrated.
  • Visit Twin Peaks late in the evening to get a night-time Marine Drive kind of view of the city. The only difference is that you’re on top of a hill and can see pretty much the entire SF.
  • Visit the Golden Gate Bridge, drive to the other side and go all the way up the Marin Headlands. Must must do.
  • Take the 49 mile scenic drive, which will take you through most of the above stuff - https://en.wikipedia.org/wiki/49-Mile_Scenic_Drive

If you have more than one day to spend in the Bay Area, do SF for one day and the next day drive down to Yosemite. You’ll thank me. Nature and beauty at the best I’ve seen so far in my life.

The 7 minute workout resources

The 12 exercises of the 7-minute workout

For all you busy entrepreneurs out there, do these 7-minute exercises every day to get the benefits of several hours of cardio and lifting weights. Apart from your own body, which provides the resistance, you’ll need a chair and a wall, both of which are easily available. All you need to do is perform each exercise for 30 seconds with a 10 second rest in between. It all lasts about 7 minutes and then you’re done :-)
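
To see how the timing adds up, here is a small arithmetic sketch in Python. It assumes the 10 second rest falls only between exercises (11 rests for 12 exercises); the function and parameter names are made up for illustration.

```python
def workout_duration_s(exercises=12, work_s=30, rest_s=10):
    """Total circuit time in seconds, with rests only between exercises."""
    rests = max(exercises - 1, 0)  # assumption: no rest after the last exercise
    return exercises * work_s + rests * rest_s


total = workout_duration_s()
minutes, seconds = divmod(total, 60)
print(f"{total} seconds = {minutes} min {seconds} s")
```

Under that assumption the circuit comes to 470 seconds, a little under eight minutes.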

Getting your VirtualBox VM to work with Tata Photon data card

As described in one of my previous posts -> Run CentOS VM on your Windows laptop, it is easy to get an Oracle VirtualBox VM to host CentOS on your Windows laptop.

On a Wifi network the VM fires up fine and is able to access the outside world. However, if you’re a road warrior and need to be able to use the VM while being connected through the Tata Photon Plus data card then follow the steps below -

So let’s get the VirtualBox VM Manager and the CentOS VM to work fine in the following scenarios -

  1. Office/home network with wifi connectivity
  2. Tata Photon data card – USB connectivity

For #1 – Office/home network with wifi connectivity 

  1. Setup the eth0 in DHCP mode as described previously

For #2 – Tata Photon data card – USB connectivity 

Setting up Internet connectivity

  1. In Windows, Go to -> Control Panel\Network and Internet\Network Connections
  2. On the Photon Plus network connection, do a Right Click -> Properties -> Sharing
  3. Select “Allow other network users to connect through this …. “
  4. In the drop down select “VirtualBox Host-Only Network”. Hit OK. If you were previously connected, disconnect and connect again.
  5. Open up a command prompt and run “ipconfig”. Note down the IP address that you see against “Ethernet adapter VirtualBox Host-Only Network:”

Setting up VM Networking

  1. In the Oracle VM VirtualBox Manager interface, locate your VM instance.
  2. Open up Network
  3. Adapter 1 is setup for Wifi access. Leave it alone.
  4. Click on Adapter 2, Select the following values
    1. Attached to -> Host Only Adapter
    2. Name -> VirtualBox Host-Only Virtual Adapter
    3. Jot down the MAC address as shown in Advanced
  5. Now you’ve exposed Virtual Adapter #2 to your VM.
  6. Now power up your VM instance
  7. It might take an unusually long time as it might unsuccessfully try to acquire an IP address for eth0, which was set to auto-start previously. Inhale slowly and deeply. Exhale. 50% done.
  8. Log into the instance via the console
  9. Now go to
    /etc/sysconfig/network-scripts/
  10. Fire up your favorite editor (mine’s VI) and punch in the following
    1. DEVICE="eth1"
    2. BOOTPROTO="static"
    3. HWADDR="08:00:27:24:7D:07" #this is the MAC address that you jotted down above in step “VM Networking #4->3”
    4. IPADDR="192.168.137.10" #this is a hard-coded IP address from the same series (just change the last octet) as the one you jotted down in “Internet connectivity #5”
    5. NETMASK="255.255.255.0"
    6. GATEWAY="192.168.137.1" #this is the hard-coded IP address as jotted down in “Internet connectivity #5”
    7. NM_CONTROLLED="yes"
    8. ONBOOT="yes"
    9. TYPE="Ethernet"
    10. UUID="c9a64bf5-fa0c-4d65-8237-a340335b699f"
  11. Save the file as “ifcfg-eth1”
  12. Almost there. About 80% done.
  13. Now open up /etc/resolv.conf
  14. Add “nameserver 8.8.8.8” as the first line in the file. Save.
  15. Now let’s bring up the eth1 interface.
  16. Run the following command -> ifup eth1
  17. If all goes okay, you should instantly come back to the command prompt, without any errors.
  18. Now do a “ping google.com” to see if you are able to access the internet.
  19. From putty you should be able to connect just fine to this server using the IP address as given in step “VM Networking #10->4”
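
If you’d rather not type the ifcfg-eth1 file by hand, the contents from step 10 can be generated. Below is a minimal Python sketch, assuming the host-only adapter’s IP (from ipconfig) is used as the gateway and the VM gets another address in the same /24 series. The function name and the host_number parameter are hypothetical, and the UUID line is left out because it is specific to each machine.

```python
import ipaddress


def build_ifcfg_eth1(host_only_ip, mac, host_number=10, netmask="255.255.255.0"):
    """Return ifcfg-eth1 text for a static address on the host-only network.

    host_only_ip: the address ipconfig shows for the VirtualBox Host-Only
    adapter; it doubles as the VM's gateway.
    host_number: offset of the VM's own address inside that network
    (hypothetical parameter, 10 gives x.x.x.10 on a /24).
    """
    gateway = ipaddress.IPv4Address(host_only_ip)
    network = ipaddress.IPv4Network(f"{host_only_ip}/{netmask}", strict=False)
    vm_ip = network.network_address + host_number
    reserved = (gateway, network.network_address, network.broadcast_address)
    if vm_ip not in network or vm_ip in reserved:
        raise ValueError(f"{vm_ip} is not a usable address in {network}")
    settings = [
        ("DEVICE", "eth1"),
        ("BOOTPROTO", "static"),
        ("HWADDR", mac),          # MAC address shown under Adapter 2 -> Advanced
        ("IPADDR", vm_ip),        # same series as the host-only adapter
        ("NETMASK", netmask),
        ("GATEWAY", gateway),     # the host-only adapter itself
        ("NM_CONTROLLED", "yes"),
        ("ONBOOT", "yes"),
        ("TYPE", "Ethernet"),
    ]
    return "".join(f'{key}="{value}"\n' for key, value in settings)


print(build_ifcfg_eth1("192.168.137.1", "08:00:27:24:7D:07"))
```

Paste the printed text into /etc/sysconfig/network-scripts/ifcfg-eth1, swapping in your own adapter IP and MAC address.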

 

When you’re connected via Wifi connection do remember to turn off eth1 via ifdown eth1 (to bring it back up use ifup eth1).

When you’re connected via Tata Photon Plus connection do remember to turn off eth0 via ifdown eth0 (to bring it back up use ifup eth0).

NodeJS vs. Tornado benchmarking

I ran an Apache Benchmark test on similar NodeJS and Tornado Webserver instances. Here are the results -

ab -n 10000 -c1000 192.168.1.107:8888/

Tornado (disabled console logging via --logging=none)
Server Software:        TornadoServer/2.4.1
Server Hostname:        192.168.1.107
Server Port:            8888

Document Path:          /
Document Length:        12 bytes

Concurrency Level:      1000
Time taken for tests:   145.729 seconds
Complete requests:      10000
Failed requests:        0
Write errors:           0
Total transferred:      1700000 bytes
HTML transferred:       120000 bytes
Requests per second:    68.62 [#/sec] (mean)
Time per request:       14572.901 [ms] (mean)
Time per request:       14.573 [ms] (mean, across all concurrent requests)
Transfer rate:          11.39 [Kbytes/sec] received

Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:        0   14 203.1      1    3012
Processing:    88 14113 3198.7  15748   18858
Waiting:       15 8399 4052.4   9394   15839
Total:         88 14128 3200.1  15749   18859

Percentage of the requests served within a certain time (ms)
  50%  15749
  66%  15780
  75%  15801
  80%  15813
  90%  15839
  95%  18794
  98%  18843
  99%  18854
 100%  18859 (longest request)

NodeJS

Server Software:
Server Hostname:        192.168.1.107
Server Port:            8888

Document Path:          /
Document Length:        12 bytes

Concurrency Level:      1000
Time taken for tests:   124.844 seconds
Complete requests:      10000
Failed requests:        0
Write errors:           0
Total transferred:      1130000 bytes
HTML transferred:       120000 bytes
Requests per second:    80.10 [#/sec] (mean)
Time per request:       12484.435 [ms] (mean)
Time per request:       12.484 [ms] (mean, across all concurrent requests)
Transfer rate:          8.84 [Kbytes/sec] received

Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:        0   12 187.1      1    3009
Processing:   183 11937 3366.1  12766   18837
Waiting:       21 6747 3388.1   6419   12817
Total:        184 11949 3367.8  12766   18839

Percentage of the requests served within a certain time (ms)
  50%  12766
  66%  12792
  75%  12815
  80%  15760
  90%  15812
  95%  15826
  98%  15840
  99%  15842
 100%  18839 (longest request)

 

Observations -

  1. NodeJS sends fewer HTTP response header bytes than Tornado for the same 12-byte response body (see Total transferred: 1,130,000 bytes vs. 1,700,000 bytes)
  2. The benchmark finished faster on NodeJS, at 124.84 seconds as compared to Tornado at 145.73 seconds
  3. NodeJS gives better throughput, at 80 reqs/sec as compared to Tornado at 68 reqs/sec
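
The observations above can be re-derived from the raw ab figures. Here is a rough Python sketch that does the arithmetic; the AbRun class and its field names are made up for illustration, and the numbers are copied from the two reports above.

```python
from dataclasses import dataclass


@dataclass
class AbRun:
    """A few fields copied by hand from an ab report (hypothetical helper)."""
    name: str
    requests: int
    concurrency: int
    seconds: float
    total_bytes: int
    body_bytes: int

    @property
    def requests_per_second(self):
        return self.requests / self.seconds

    @property
    def header_bytes_per_response(self):
        # Everything transferred that was not response body
        return (self.total_bytes - self.body_bytes) / self.requests

    @property
    def mean_time_per_request_ms(self):
        # ab's first "Time per request" line: concurrency * time / requests
        return self.concurrency * self.seconds * 1000 / self.requests


tornado = AbRun("Tornado", 10000, 1000, 145.729, 1700000, 120000)
nodejs = AbRun("NodeJS", 10000, 1000, 124.844, 1130000, 120000)

for run in (tornado, nodejs):
    print(f"{run.name}: {run.requests_per_second:.2f} req/s, "
          f"{run.header_bytes_per_response:.0f} header bytes/response, "
          f"{run.mean_time_per_request_ms:.1f} ms mean per request")
```

This works out to 158 header bytes per response for Tornado against 101 for NodeJS, which accounts for the whole gap in total bytes transferred.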

Run CentOS on your Windows laptop

Here are the steps to run a virtual instance of Linux CentOS on your Windows laptop using Oracle VirtualBox -

Downloading the essentials

  1. Get Oracle VirtualBox from https://www.virtualbox.org/wiki/Downloads
  2. Get CentOS 6.3 ~ 64bit from AOL India servers -> http://centos.aol.in/6.3/isos/x86_64/ (assuming you’re in India, else get your CentOS from another mirror). I had downloaded CentOS minimal which is about 300+ MB.

VM setup

  1. Go ahead and install Oracle VirtualBox. This should be an easy step.
  2. Now it’s time to set up your Linux instance….
  3. Click “New” and go ahead and give the new instance a name.
  4. Select Type: “Linux” and Version: “2.6” (64 bit). Hit Next.
  5. Change recommended memory size to 512 MB.
  6. Select “Create a Virtual Hard Drive now” > Next > Keep VDI selected. Hit Next.
  7. Select “Fixed size” > 2.0 GB. Hit Next.
  8. The wizard should now close.
  9. The instance you just created should show up on the Left hand bar in the “powered off” state.
  10. Now right click on the instance you just created. Hit Start.
  11. When prompted to provide a start disk, select the CentOS 6.3 ISO that you had previously downloaded. Hit Start.
  12. Now follow the instructions to install Linux just like you would on a bare metal box.
  13. If you plan to run a web server or an app server on this CentOS VM then you’ll have to change the networking mode for this VM from NAT to Bridged. This ensures that your VM will get an IP address from the same DHCP source as your Windows laptop.

Making CentOS ready

  1. Now log into your CentOS instance via the console
  2. Once you’re logged in as “root”, run -> ifup eth0. This will bring up your ethernet interface.
  3. Now your instance will have a “real” IP address. To check, run -> ifconfig
  4. Now open up “/etc/sysconfig/network-scripts/ifcfg-eth0” and change ONBOOT to “yes”. This will ensure that you don’t have to perform step #2 above whenever you bring up your VM instance.
  5. From now on you can SSH into your instance (via putty, if you prefer) via the IP address of the machine – as found out in step #3.
  6. Now let’s add an iptables rule to allow HTTP traffic to the VM instance.
  7. Open up the /etc/sysconfig/iptables file and add the following rule
    • -A INPUT -m state --state NEW -m tcp -p tcp --dport 80 -j ACCEPT
    • The above should be added just below this line “-A INPUT -m state --state NEW -m tcp -p tcp --dport 22 -j ACCEPT” in the file
  8. You’re all set. Now go ahead and run -> init 6 for a VM reboot.
  9. Now go ahead and install Apache or any other web server and it will be ready to serve on port 80, since we’ve opened up the port via step #7.
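
The edit in step #7 can also be scripted. Here is a minimal Python sketch that works on the text of the iptables file and drops the port 80 rule right below the port 22 rule. The function name is hypothetical, and actually reading and writing /etc/sysconfig/iptables (as root) is left to you.

```python
SSH_RULE = "-A INPUT -m state --state NEW -m tcp -p tcp --dport 22 -j ACCEPT"
HTTP_RULE = "-A INPUT -m state --state NEW -m tcp -p tcp --dport 80 -j ACCEPT"


def add_http_rule(iptables_text):
    """Return the file text with the port 80 rule placed just below the port 22 rule."""
    lines = iptables_text.splitlines()
    if any(line.strip() == HTTP_RULE for line in lines):
        return iptables_text  # rule already present, nothing to do
    for index, line in enumerate(lines):
        if line.strip() == SSH_RULE:
            lines.insert(index + 1, HTTP_RULE)
            return "\n".join(lines) + "\n"
    raise ValueError("port 22 rule not found; add the port 80 rule by hand")
```

Calling it a second time on its own output changes nothing, so it is safe to re-run.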

 

Hope the above steps help you take the first few steps towards understanding virtualization and running your own virtual Linux instance under Windows.

Celebrating Diwali without firecrackers

While driving back from work this evening I noticed that most shops along the way are all lit up and decorated well for the festive season. It hit me that Diwali is just around the corner. I don’t know when but somewhere along the way I had a random thought of convincing my kids to celebrate Diwali without bursting firecrackers this year. Absolutely no firecrackers. Not even a sparkler.

Since when did this festival of lights turn into a festival of noise and air pollution? Don’t we have enough pollution around us already? I’ve heard some TED talks in which they say this might be the last century for our species. It’s shocking. We’ve already polluted our environment enough to cause global warming, and some say it’s too late now to stop the impending catastrophe.

I don’t want my kids to associate Diwali […] to attempt making an Indian sweet dish.

They said they’ll try convincing their friends to try giving up firecrackers too this Diwali.

Just because it has been done like this for years doesn’t make it right. Don’t follow the herd. Think and do the right thing.

My kids got convinced to do the right thing. What about you?

Happy Diwali.

Update: On Sunday we went ahead and donated the stuff at St. Catherine’s Home, which is an orphanage located on Veera Desai Road in Andheri West, Mumbai. When we reached there the Sisters were kind enough to explain their concept & vision and give us a tour of the facilities. The facility is divided into various cottages. They have a massive campus which is kept tidy and cared for. They also have the facilities to take care of HIV+ kids. The Sisters also told us of success stories of kids who’ve now grown up and moved on and are successful in life. Visiting the orphanage was a leveling experience.

How to reduce your VOD bandwidth bill by up to 30%

If you’re a large media organization that publishes a lot of Video on Demand (VOD) content you may want to read the below post on how to save more than 30% in VOD bandwidth bills without compromising the quality of your videos or degrading the user experience.

Objective – Save bandwidth costs on VOD without changing the video itself

 

Current mechanism - 

This is how all the existing Flash-based video players play out VOD content - when the user starts playing a video, the player starts downloading the entire video in the background. This happens irrespective of whether the user is going to watch the entire video or not. With high-speed internet connectivity these days, downloading the video usually takes much less time than playing it.

For example, say a user is watching a 2 minute video clip, which could easily be 20 MB plus depending on the encoding quality and FPS. As the video starts playing, the player starts downloading the rest of the 20 MB video in the background. What if the user moves on to another page or website after watching only half of the video?

There are two problems with the current approach -

  1. Wasted bandwidth – As you can guess, in the above example if the user only watched half of the video then the bandwidth is wasted downloading the entire clip. This is wasted bandwidth both for the content provider as well as the user. They’re both being charged for the bytes transferred.
  2. Blocked server connection – as the video player is downloading the rest of the video in the background, it results in a blocked server-side connection until the video is fully downloaded.

 

Proposed mechanism - 

Summary – Change the flash player to fetch the video in segments instead of downloading it all in one go.

Say when the user starts playing the 2 minute video, only the next 30 seconds worth of video is pre-fetched. As the user approaches the 15 second viewing mark, the next 15 seconds worth of video is downloaded. A new 15 second video segment is fetched every time the user finishes watching the current 15 second segment. Every time the video player needs to fetch the video, it establishes a new connection and releases it once the segment is downloaded.

In the above mentioned approach, the video is downloaded in chunks depending on how much the user has watched and not all at once.
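
To make the fetch pattern concrete, here is a small Python sketch that lists which part of the video is requested at which playback position, using the 30 second initial prefetch and 15 second segments from the example. It only models the timing. The function name is made up, and how the player actually requests a segment from the server is not covered.

```python
def fetch_schedule(duration_s, initial_prefetch_s=30, segment_s=15):
    """Return (playback_position_s, fetch_start_s, fetch_end_s) tuples.

    The first entry is the prefetch at play start; every later entry is
    triggered when playback reaches the next segment boundary.
    """
    if initial_prefetch_s <= 0 or segment_s <= 0:
        raise ValueError("prefetch and segment lengths must be positive")
    fetched_until = min(initial_prefetch_s, duration_s)
    schedule = [(0, 0, fetched_until)]
    trigger = segment_s
    while fetched_until < duration_s:
        fetch_end = min(fetched_until + segment_s, duration_s)
        schedule.append((trigger, fetched_until, fetch_end))
        fetched_until = fetch_end
        trigger += segment_s
    return schedule


for position, start, end in fetch_schedule(120):
    print(f"at {position:3d}s of playback: fetch {start}s to {end}s")
```

For the 2 minute clip this gives seven separate fetches, the last one triggered at the 90 second mark, so the player always stays 15 to 30 seconds ahead of the viewer.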

There are two advantages with the proposed approach -

  1. Reduction of wasted bandwidth - Say the user watches only 1 minute of the video. The downloaded part of the video would then be approximately 1 minute 15 seconds, or roughly 12 MB. That’s a saving of about 8 MB over the previous scenario, close to 40% less bandwidth consumed for both the user and the video hosting provider.
  2. Better scalability –  every time the video player needs to fetch a segment, it makes a new connection to the server which is released as soon as the segment is downloaded. This mechanism ensures that a connection to the server is not blocked until the entire video is downloaded. This is akin to a connection pool mechanism which frees up the server to serve other connections.
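
The bandwidth saving can be worked out with a few lines of Python. This is a back-of-the-envelope sketch, assuming the video’s bytes are spread evenly over its length and that a new segment is requested only once playback moves past a 15 second boundary; the function names are made up for illustration.

```python
import math


def downloaded_seconds(duration_s, watched_s, initial_prefetch_s=30, segment_s=15):
    """Seconds of video fetched by the time the viewer leaves at watched_s."""
    # Boundaries (15s, 30s, ...) that playback has moved past; each one
    # triggered the fetch of one more segment.
    boundaries_passed = max(math.ceil(watched_s / segment_s) - 1, 0)
    return min(initial_prefetch_s + boundaries_passed * segment_s, duration_s)


def bandwidth_saving(duration_s, size_mb, watched_s):
    """Return (downloaded_mb, saved_mb, saved_fraction) versus a full download."""
    fetched_s = downloaded_seconds(duration_s, watched_s)
    downloaded_mb = size_mb * fetched_s / duration_s  # assumes a constant bitrate
    saved_mb = size_mb - downloaded_mb
    return downloaded_mb, saved_mb, saved_mb / size_mb


downloaded, saved, fraction = bandwidth_saving(duration_s=120, size_mb=20, watched_s=60)
print(f"downloaded {downloaded} MB, saved {saved} MB ({fraction:.1%})")
```

For the 2 minute, 20 MB example with one minute watched, this gives 75 seconds fetched, i.e. 12.5 MB downloaded and 7.5 MB (37.5%) saved, in line with the rough figures above.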

 

Assumptions - 

One of the key assumptions in the above approach is that a user may not watch the full length of the video. The saved bandwidth comes from the part of the video that the user abandons or doesn’t watch. As an experiment, I had instrumented the video player for one of our properties to find out whether people watch VODs in their entirety or abandon them midway. On one of our high traffic properties we found that almost a third of the users don’t end up watching the video till the very end.

 

Actual live results -

I have been lucky enough to try out this theory on heavy-traffic sites. The results have been fantastic. No user complaints about the video experience, and we’ve ended up saving over 30% in costs via reduced bandwidth bills. Want to see it live in action? Check it out on Moneycontrol videos. You’ll notice the experience is pretty seamless.

This is purely a client side solution (and so easy to implement) without you having to change anything on the server-side or in the video itself. It’s been tested and works fine with both Akamai and Tata/Bitgravity CDNs. The approach works for both FLVs as well as (hinted) MP4 video files.

ps: Would you like to know how to modify or write a Flash video player that downloads videos in segments or chunks instead of the entire thing at once? Let me know via the comments. If I get enough requests, I’ll write up a post explaining various approaches on how to do it. Else, I’ll assume you are a smart bunch that knows how to do it.