Pages

What Programming Language Should I Use to Build a Startup?

Often entrepreneurs ask me 'What technology should I build my startup on?' There is no right or wrong answer to this question. It's a decision every company makes for itself, depending on what it's trying to build and the skills of its cofounders. Nonetheless, there are a few rules that one should adhere to. We discuss them in this blog post.

Incident Response Policy

What happens in your company when a production incident occurs? Usually in a typical startup, you will see engineers running around frantically trying to resolve the problem. However, as soon as the incident is resolved, they forget about it and go back to their usual business. A good incident response policy can help bring order into chaos. We provide a sample template in this blog post.

Why Software Deadlines Never Make Sense

We discuss why software deadlines usually don't make sense.

Analyzing Front-End Performance With Just a Browser

We discuss a number of freely available online tools which can be used to analyze bottlenecks in your website.

Why Smaller Businesses Can't Ignore Security and How They Can Achieve It On a Budget

In this article, we show that security is both important and achievable for smaller companies without breaking the bank.

Saturday, December 15, 2012

Mechanical Turk


Many of us have used Amazon Mechanical Turk to solve problems. Here is an interesting story from ChessLab.com about the original Mechanical Turk automaton:



The second game returned would be: 'Bonaparte, Napoleon -- Automaton, The Turk, 0-1, 1809'.  Automaton, The Turk was the first chess playing machine.  Of course, it was operated by a little man hidden in a machine. The Turk was  very famous, touring through the whole Europe for decades.  Apparently, there were multiple hidden Turk operators -- the whole succession of strong chess players, who bought and sold the machine to each other.  Allegedly, in the course of one of the games between Bonaparte and The Turk, Bonaparte had started making illegal chess moves.  In a mechanical sort of gesture, The Turk's hand had corrected the emperor's moves twice.   As soon as Bonaparte had moved the same piece to the same illegal position for the third time, The Turk machine allegedly sent the chess pieces flying off the board...

Thursday, December 13, 2012

Why Work at Cinchcast and BlogTalkRadio


By Rob Blackin (VP of Engineering) and Dr. Aleksandr Yampolskiy (CTO)

We’re a growing team of innovators in design, product, technology and business. We’re passionate about what we are building, and we’re excited about what’s to come. We help companies better connect and communicate with the people that are most important to their business online every day. We’ve cultivated a dynamic and entrepreneurial culture for our employees that you can help shape.

Our Culture

Our work hours are flexible. We love what we do, but we also value our free time (and sleep). We offer competitive benefits and compensation + stock options. We work hard but we also like to have fun around here – our pantry’s loaded, our couches are lounge-worthy and we’re bringing back bocce ball in a big way. We’re based in New York City, but we have members of our team working from all over the world.

Toys

When a team works hard, you need a way to blow off some steam. Programmers especially need to sometimes walk away from the screen and clear their heads in order to find the solution to a tricky problem or just to celebrate getting that complex algorithm to work perfectly. It’s sort of the equivalent of a football player dancing in the end zone. Our team has developed a Foosball ranking system that we use for bragging rights. Nerf darts and Nerf footballs are often seen flying through the air. Bocce ball and an Xbox 360 sit waiting for those who want to partake and, of course, there is plenty of Red Bull.

Work with the best and the brightest

We have some of the best developers you can find anywhere. We have people who routinely present at meetups, are active in the development community and are always keeping up with the latest technology. Many of our developers were CTOs and heads of development organizations in other companies before coming here. Everyone has great experience and ideas. Our whiteboard architecture discussions by the couch are tremendously fun and intellectually stimulating. We use all the hottest technologies: .NET, NodeJS, IronRuby, MongoDB, Redis, HTML5, and Objective-C for iPhone development. Check out our Tech blog (http://tech.cinchcast.com) to see what we work on.

Work in a high-volume environment where scalability and high availability are critical

Our Blogtalkradio.com website gets millions of visitors per day. We are building an infrastructure to enable us to scale another 10x via a new service oriented architecture including a data caching layer, notification queuing and front end donut caching. Our Cinchcast business handles conference calls with more than 1000 participants. There are very few other companies in the world that can handle that volume. Recently, we were featured in VentureBeat for the work we did to scale conference calls: http://venturebeat.com/2012/11/29/cinchcast-uses-the-cloud-to-scale-up-its-self-serve-conference-calls/

Our product makes a difference in people’s lives

Our BlogTalkRadio website gives a voice to people who have something to say.

“BlogTalkRadio has turned my ordinary life into something pretty amazing. I couldn’t be living my dream if it wasn’t for BlogTalkRadio.” Amy McCracken

We get emails from listeners telling us how terrific our product is. We have gotten thank-you emails from people with lots of interesting use cases:

• Interviewing their father about his experience living through the Holocaust shortly before he passed away
• Recording their children’s first words
• People connecting on common subjects such as home schooling and ADHD
• We even received an email from a blind woman who loves the ability to listen to so many talk shows.

Come Join Us!

Our product is Software as a Service with a telephony twist. What we do each day matters to the bottom line of the company. Millions of people will be affected by the code that you write and architecture that you design. Every day we are building and reevaluating and going after markets in an agile way. We are passionate about hiring great people. If you are a great developer or an IT professional, contact us. Have more questions? Email me at alexyampolskiy@blogtalkradio.com

Wednesday, December 12, 2012

Unexpected uses of Blogtalkradio

We've been digging through some of the emails over the past few years, and uncovered emails from BlogTalkRadio listeners who thank us for our product.

 There are a lot of interesting use cases:

• Interview with their father about his experience living through the holocaust shortly before he passed away
• Recording their children’s first words
• People connecting on common subjects such as home schooling and ADHD, etc.
• We even received an email from a blind woman loving the ability to listen to so many talk shows.

 For an engineer, it's great when the work you do matters and makes a difference in the lives of your
 customers.

  

Wednesday, November 28, 2012

CloudBeat 2012 -- The S Factor: Optimizing a Cloud-Based Platform for Scalability and Security

Cinchcast is a cloud-based, enterprise solution for webcasts and conference calls of any size. On a monthly basis, Cinchcast powers 15 million audio streams and attracts over 36 million unique visitors. In this talk, we’ll discuss how Cinchcast development and production environments operate and the role of New Relic in scaling Cinchcast platform to meet event demands. Dr. Yampolskiy will explain how Cinchcast maintains agile release cycles, while monitoring for performance and security issues. He will give some concrete examples of how a drastic drop in page views was discovered through a monitoring tool, or how his team thwarted a DDOS attack through cloud provisioning.

Speaker: Dr. Aleksandr Yampolskiy, CTO, Cinchcast
Moderator: Vanessa Alvarez, Director of Product Marketing, Gridstore

Monday, November 26, 2012

Good Questions to Ask During Technical Architecture Reviews


Here is a list of good technical questions to ask during technical architecture reviews.
If a presenter doesn't know the answers to them, then the product is probably not ready to be built:

  1. Can you draw a systems diagram for me?
  2. How will this work on 4 or more boxes? How will you load balance requests between them?
  3. What's the average latency for a request? What can you cache? (Again, if a person hasn't thought this through, then the system isn't ready.)
  4. How will you test this?
  5. What can fail? How can we build a system so that it degrades gracefully when failures happen?
  6. What are the security risks?

Sunday, November 25, 2012

Cinchcast Architecture - Producing 1,500 Hours Of Audio Every Day

(This article originally appeared on the High Scalability website a few months back: http://highscalability.com/blog/2012/7/16/cinchcast-architecture-producing-1500-hours-of-audio-every-d.html)


Cinchcast provides solutions that allow companies to create, share, measure and monetize audio content to reach and engage the people that are most important to their business. Our technology integrates a conference bridge with live audio streaming to simplify online events and enhance participant engagement. The Cinchcast technology is also used to power Blogtalkradio, the world’s largest audio social network. Today our platform produces and distributes over 1,500 hours of original content every day. In this article, we describe the engineering decisions we have made in order to scale our platform to support this volume of data.

Stats

  • Over 50 million page views a month
  • 50,000 hours of audio content created
  • 15,000,000 media streams       
  • 175,000,000 ad impressions
  • Peak rate of 40,000 concurrent requests per second
  • Many TB/day of data stored in MSSQL, Redis, and ElasticSearch clusters
  • Around 100 hardware nodes in production

Data Centers

  • Production website is run from the data center in Brooklyn. We like to control our own destiny instead of relegating data to the cloud. 
  • Amazon EC2 instances are used mostly for QA and Staging environments.

Hardware

  • About 50 web servers
  • 15 MS SQL database servers
  • 2 Redis NoSQL key-value servers
  • 2 NodeJS servers
  • 2 servers for the ElasticSearch cluster

Dev Tools

  • .NET 4 C# : ASP.NET and MVC3
  • Visual Studio 2010 Team Suite as an IDE
  • StyleCop, ReSharper for enforcing code standards
  • Agile development methodology, with Scrum used for large features and Kanban taskboard for smaller tasks
  • Jenkins + NUnit for testing and continuous integration
  • Sauce On Demand – Selenium for automation testing

Software And Technologies Used

  • Windows Server 2008 R2 x64: Operating System
  • SQL Server 2005 running under Microsoft Windows Server 2008 Web Server
  • Equalizer load balancers: for load balancing
  • Redis: used as the distributed caching layer and as a message pub-sub queue
  • NodeJS: for real-time analytics and updating the studio dashboard
  • ElasticSearch : for show search
  • Sawmill + custom parser scripts: for log analysis

Monitoring

  • NewRelic for performance monitoring 
  • Chartbeat for impact of performance on KPI (conversions, page views)
  • Gomez, WhatsupGold, Nagios for various alerting
  • SQL Monitor: from Red Gate - for SQL Server monitoring

Our Approach

  • “Be brief, be bright, be gone” : Respect another person’s time. Don’t come with problems, come with solutions.
  • Don’t go chasing hot technologies of the day. Instead ‘mitigate your top problems’. We adopt new technologies, but only when the business case requires it. Appetite for production outages decreases significantly when you have millions of users.
  • Achieve “essential”, then worry about “excellent”.
  • Be a “how team” instead of a “no team”.
  • Build security into the software development lifecycle.  You need to train developers on how to write secure software and make it a business priority from the start.

Architecture

  • All JavaScript, CSS and images are cached at the CDN level. The DNS points to a CDN, which passes requests to origin servers. We use Cotendo because it allows us to make L7 routing decisions at the CDN.
  • Separate clusters of web servers serve requests for regular users and requests for ad users, differentiated by a cookie.
  • We are moving towards a service-oriented architecture where key pieces of the system, such as search, authentication and caching, are RESTful services implemented in various languages. These services also provide a caching layer.
  • The Redis NoSQL key-value store (redis.io) is used as a cache layer in front of database calls.
  • ScaleOut is used to maintain session state across a garden of web servers. However, we are considering switching to Redis.
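
The "cache layer before database calls" pattern can be sketched roughly as follows. This is a minimal illustration, not our production code: InMemoryCache is a hypothetical stand-in for a Redis client (a Dictionary keeps the sketch self-contained), and ShowRepository, GetShowTitle and the "show:" key prefix are made-up names.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for a Redis client; a Dictionary keeps the sketch self-contained.
public class InMemoryCache
{
    private readonly Dictionary<string, string> store = new Dictionary<string, string>();

    public bool TryGet(string key, out string value)
    {
        return store.TryGetValue(key, out value);
    }

    public void Set(string key, string value)
    {
        store[key] = value;
    }
}

// Hypothetical repository that checks the cache before calling the database.
public class ShowRepository
{
    private readonly InMemoryCache cache;
    private readonly Func<int, string> loadFromDatabase;

    public int DatabaseCalls { get; private set; }

    public ShowRepository(InMemoryCache cache, Func<int, string> loadFromDatabase)
    {
        this.cache = cache;
        this.loadFromDatabase = loadFromDatabase;
    }

    public string GetShowTitle(int showId)
    {
        string key = "show:" + showId;
        string title;
        if (cache.TryGet(key, out title))
        {
            return title; // cache hit: the database is not touched
        }

        DatabaseCalls++;
        title = loadFromDatabase(showId); // cache miss: fall through to the database
        cache.Set(key, title);            // remember the result for later requests
        return title;
    }
}

public static class CacheAsideDemo
{
    public static void Main()
    {
        // The lambda simulates a database query.
        var repo = new ShowRepository(new InMemoryCache(), id => "Show #" + id);
        Console.WriteLine(repo.GetShowTitle(42));
        Console.WriteLine(repo.GetShowTitle(42));
        Console.WriteLine("Database calls: " + repo.DatabaseCalls); // prints "Database calls: 1"
    }
}
```

A real deployment would also need an expiry or invalidation policy so that cached entries do not go stale; the sketch leaves that out.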


Lessons Learned

  • Text search in a SQL Server database doesn’t work well. It was clogging up the CPU, so we switched to ElasticSearch (a Lucene derivative).
  • The built-in session module by Microsoft is prone to deadlocks, so we ended up replacing it with the AngiesList session module, which stores data in Redis.
  • Logging is key to detecting problems.
  • Reinventing the wheel can be a good thing. For example, initially we used a vendor product for bundling JS/CSS together which started causing performance issues. We then rewrote bundling ourselves, and significantly improved performance of our site.
  • Not all data is relational, so a database isn’t always a good medium. A good analogy is “Imagine you have water flowing down the pipe. The pipe is wide at the top but gets narrow towards the bottom.” The top is the web servers (there are many of them), the bottom is the databases (there are few and they get clogged up).
  • Not using metrics in your development process is like trying to land a plane in a storm with your altimeter not working. Throughout your development process, compute metrics such as site throughput, time to fix Blocker/Critical bugs, code coverage and use them to gauge your performance.


The S Factor: Optimizing a Cloud-Based Platform for Scalability and Security

This Tuesday, I am flying out to the CloudBeat 2012 conference (http://venturebeat.com/events/cloudbeat2012/agenda/) to talk about how we scaled Cinchcast audio technology to handle millions of visitors.

The abstract is below. If you are in San Francisco, do stop by. Thanks to Vanessa Alvarez from Forrester for moderating:

Cinchcast is a cloud-based, enterprise solution for webcasts and conference calls of any size. On a monthly basis, Cinchcast powers 15 million audio streams and attracts over 36 million unique visitors. In this talk, we’ll discuss how Cinchcast development and production environments operate and the role of New Relic in scaling Cinchcast platform to meet event demands. Dr. Yampolskiy will explain how Cinchcast maintains agile release cycles, while monitoring for performance and security issues. He will give some concrete examples of how a drastic drop in page views was discovered through a monitoring tool, or how his team thwarted a DDOS attack through cloud provisioning.

Saturday, November 10, 2012

Startup Exits : a primer

A killer deck by Mark Suster. Thanks to Jatin Shah for pointing it out (cross-posted from http://jatinshah.tumblr.com/post/35405359389/startup-exits-by-mark-suster)


 

Friday, November 9, 2012

Angular

At Cinchcast, we've been investigating the use of Angular for structuring our HTML and JavaScript code. I've got to admit, it's a very clean framework, and the data binding feature is amazing. You no longer have to write jQuery functions and callbacks to update the page; Angular does it all for you.

By the way, we are hiring great engineers in the New York area. If you know .NET and want to work with NodeJS, Redis, MongoDB, Angular, and a slew of other technologies, drop us a note at jobs@cinchcast.com

Incident Response Policy


What happens in your company when a production incident occurs?
Usually in a typical startup, you will see engineers running around frantically trying to resolve the problem. However, as soon as the incident is resolved, they forget about it and go back to their usual business.


A good incident response policy can help bring order into chaos. There are a few best practices that one should keep in mind when production outages occur:

- Having a procedure in place helps reduce the panic. Security incidents should be treated differently than production outages.
- In the report, explain a response timeline and how the problem was discovered.
- An incident report should be written the same day the incident occurred. Otherwise, you risk forgetting what happened.
- It should have concrete follow-up actions, tracked as JIRA tickets. If you don't do this, then engineers will not follow up.
- Put up incident reports in a public location and compute metrics. Are incidents happening less frequently this month than the previous one? Is there any correlation between incidents? Are follow-up actions being addressed?
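
As a rough illustration of the last point, here is a small sketch of the kind of metrics one could compute from a list of incident reports. The Incident class, its fields and the sample data are all made up for the example; in practice the numbers would come from wherever you keep your reports and tickets.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical record of one incident report and its follow-up tickets.
public class Incident
{
    public DateTime OccurredOn;
    public int FollowUpsOpened;
    public int FollowUpsClosed;
}

public static class IncidentMetrics
{
    // Number of incidents that occurred in the given month.
    public static int CountInMonth(IEnumerable<Incident> incidents, int year, int month)
    {
        return incidents.Count(i => i.OccurredOn.Year == year && i.OccurredOn.Month == month);
    }

    // Fraction of follow-up actions that have been closed (1.0 if none were opened).
    public static double FollowUpCompletionRate(IEnumerable<Incident> incidents)
    {
        int opened = incidents.Sum(i => i.FollowUpsOpened);
        int closed = incidents.Sum(i => i.FollowUpsClosed);
        return opened == 0 ? 1.0 : (double)closed / opened;
    }

    public static void Main()
    {
        // Made-up sample data.
        var incidents = new List<Incident>
        {
            new Incident { OccurredOn = new DateTime(2012, 10, 3), FollowUpsOpened = 2, FollowUpsClosed = 2 },
            new Incident { OccurredOn = new DateTime(2012, 11, 8), FollowUpsOpened = 3, FollowUpsClosed = 1 },
            new Incident { OccurredOn = new DateTime(2012, 11, 20), FollowUpsOpened = 1, FollowUpsClosed = 0 }
        };

        Console.WriteLine("October: " + CountInMonth(incidents, 2012, 10));
        Console.WriteLine("November: " + CountInMonth(incidents, 2012, 11));
        Console.WriteLine("Follow-ups closed: " + (FollowUpCompletionRate(incidents) * 100).ToString("0") + "%");
    }
}
```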

Attached is a sample incident response template that I've used.


Incident Analysis Report



Time of Incident: 5:22AM
Time of Recovery: 5:50AM 3/15/12
Date Issue First Identified: 3/15/12
Discovered By: Alex
Incident Report Prepared By: Alex
Date: 3/15/12

I. Description of Incident:

II. AWS Statement
2:40 AM PDT We are investigating connectivity issues for EC2 in the US-EAST-1 region.
3:03 AM PDT Between 2:22 AM and 2:43 AM PDT internet connectivity was impaired in the US-EAST-1 region. Full connectivity has been restored. The service is operating normally.

III. Business Impact:
Frustrated customers because the website ACME was inaccessible.

IV. Security Impact:
There are no known issues related to this subject.

V. Technical Impact:
There weren’t enough servers to handle the load for new customers.

VI. Event Timeline:

5:30AM  All Amazon hosts were inaccessible
5:50AM  Service was restored

VII. Lessons Learned:
- We need to know the business impact for each server on Amazon and put DR policies and procedures in place for outages. We could also leverage the California EC2 region to help with outages that affect only Virginia.



VIII. Action Items:

1. Called EC2 and they are going to alert us of what they find out about the issue (INFRA-123)
2. Identify what we can and can’t do if EC2 goes down (INFRA-345)




Thursday, November 8, 2012

Icecast security


[1]


Icecast is a server program used to stream audio in MP3 or Ogg Vorbis formats, and it is very popular in the Internet radio community. Many CDNs, including Limelight, use it to stream live MP3 streams. I've been browsing the web for typical vulnerabilities afflicting Icecast. It looks like the trend is positive. According to CVEdetails [2], the last vulnerability in the database dates from 2007 and the number of vulnerabilities has been declining:


Vulnerabilities By Year

  Year   Count
  2001   5
  2002   2
  2004   3
  2005   2
  2007   1

Vulnerabilities By Type

  Type                  Count
  Denial of Service     5
  Execute Code          7
  Overflow              7
  Directory Traversal   2
  XSS                   1
  Bypass Something      1

References
[1] Illustration from http://livestream123.com/wp-content/uploads/icecast.jpg
[2] http://www.cvedetails.com/vendor/693/Icecast.html

Cinchcast Connect product

I am very excited that our Tech team has released the Cinchcast Connect product. A common problem in large conference calls with hundreds or thousands of participants is that you do not know who is on the line. In Cinchcast Connect, we implemented a universal PIN:
"Registered participants receive a unique PIN code to access the audio conferencing portion of corporate events hosted on the Cinchcast platform. Event participants do not have to wait on hold to be screened by operators prior to entering events. In addition, for users who may attend multiple corporate events (Employee Town Halls, Team Meetings, Earnings /Analyst Calls), once an individual has registered on the Cinchcast platform, their unique PIN code will always be the same."  [1]

Now you no longer have to guess who is on the call because names of attendees are displayed in our studio.  You will see in real-time the number of listeners on the web and callers on the phone.

Our player is HTML5 compliant and requires no browser plugins, works over regular HTTP port 80 so you don't need to poke holes in a firewall, and requires minimal bandwidth (15x-20x less than a video stream). So it turned out to be a great product.


If you are interested in trying it out, please drop us a line: http://cinchcast.com/contact/



References
[1]  http://cinchcast.com/news/cinchcast-launches-universal-pin-code-access-for-enterprise-conference-calls/




Read more here: http://www.sacbee.com/2012/10/29/4945817/cinchcast-launches-universal-pin.html#storylink=cpy

Branching Strategy

At Cinchcast Tech, we've been spending a lot of time discussing a proper branching strategy for our codebase.

We have dev, qa, and staging branches. All development starts locally and then gets merged into the dev branch. After testing, the QA team can merge it into the qa branch. Finally, when the code is ready to be released, it gets merged into the staging branch.



When we work on new releases, we follow one of two approaches:
1. Release branches. A separate branch is created for each release.
For example, FOO_3_1_2 branch would be created for all work done on release 3.1.2 of the FOO project.

2. Feature branches. A separate branch is created for each large component. Typically these components require isolated testing, and are merged into the main branch only at the end. The naming convention is AY_MODULE, where AY is the initials of the developer and MODULE is the name of the component.

All new branches are created off the staging branch, which should mimic the code that's running in production.
Any urgent hotfixes are typically made directly on the staging branch, and then backported into other branches.
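
To make the two naming conventions concrete, here is a tiny sketch of helper functions that build branch names in the formats described above. The BranchNames class and its method names are hypothetical; they are not part of any tooling we actually use.

```csharp
using System;

// Hypothetical helpers that build branch names following the conventions above.
public static class BranchNames
{
    // Release branch: PROJECT_MAJOR_MINOR_PATCH, e.g. FOO_3_1_2.
    public static string Release(string project, int major, int minor, int patch)
    {
        return string.Format("{0}_{1}_{2}_{3}", project.ToUpperInvariant(), major, minor, patch);
    }

    // Feature branch: INITIALS_MODULE, e.g. AY_MODULE.
    public static string Feature(string initials, string module)
    {
        return initials.ToUpperInvariant() + "_" + module.ToUpperInvariant();
    }

    public static void Main()
    {
        Console.WriteLine(Release("foo", 3, 1, 2));  // prints "FOO_3_1_2"
        Console.WriteLine(Feature("ay", "module"));  // prints "AY_MODULE"
    }
}
```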

Any load testing or security analysis is typically done during the QA stage, when the code has been merged into the qa branch. We have a variety of scanners running 24x7 against our qa and production environments, such as McAfee Secure scanning for dynamic security vulnerabilities and NewRelic continuously checking the performance of the application. If any issues are found, then the code is rolled back and cannot go into production.

Note: We are always looking to hire great software engineers. So if you are one, and are looking for an exciting environment to work at, email us at jobs@cinchcast.com



Wednesday, November 7, 2012

A nice diagram of OpenRTB ecosystem

(from http://www.iab.net/media/file/OpenRTB_API_Specification_Version2.0_FINAL.PDF)

Interesting Stats About My Gmail

GMailMeter (http://www.gmailmeter.com/) is a clever tool, which analyzes your Gmail mailbox for detailed statistics on how you use your email. Hourly and weekly volume, number of words per email, time to respond are all interesting statistics that it measures on a month-to-month basis.

I tried it out and within 30 minutes learned that:

- most of my emails have between 1 and 100 words (I do like to cut right to the point)
- I get a lot of emails (already knew that)
- I respond to 15% of my emails in under 5 minutes (now that's scary)
- only 59% of emails are addressed directly to me
- the number of emails I send spikes after 6pm (logical with two little kids in the house)

Overall, GMailMeter seemed like a very useful tool and I recommend that everyone try it.
Now I just need to figure out what to do with these statistics.

In the past month:


1927 conversations

660 were important
47 have been starred
I have started 20.29% of them
and have replied to 6.12% of the others

2487 emails received

received from 580 people
59.07% were sent directly to me

739 emails sent

to 138 people











Saturday, November 3, 2012

What Technology Stack Should My Startup Use?

Often entrepreneurs ask me 'What technology should I build my startup on?'
There is no right or wrong answer to this question.  It's a decision every company makes for itself, depending on what it's trying to build and the skills of its cofounders.  Nonetheless, there are a few rules that I try to adhere to:

1. Your technology choice doesn't matter much.
For early stage startups, the main goal should be to get their application up and running as soon as possible. Then, they will be able to get customers, funding and hire great engineers. Most languages are similar to one another, and even if you discover that a particular technology choice was a wrong one, you can fix it up in the future. For example, Facebook was written in PHP, then they ran into scalability issues, rewrote parts of the application as services communicating over Thrift messaging protocol, and fixed them.

2. 'Don't chase hot technologies of the day'.
You should use the technology that's right for the job. Often, engineers like to choose a technology just because it's trendy. Guess what: technology trends, just like fashion trends, come and go. For example, NodeJS is very hip right now, but it uses a single thread for computations, which makes it a poor choice for CPU-intensive work.

3. Ask around.
Ask other people what they are doing and why. Did they use MongoDB for storing analytics information, or did they store it in a database? Are they using SOLR or ElasticSearch for real-time search? Experience helps, and many technologists will be happy to lend free advice.

4. Don't use esoteric technologies where little open-source innovation is happening.
By using esoteric technologies, you will have a harder time recruiting engineers. Technologies where lots of open-source innovation is happening (Ruby on Rails, .NET, Java, etc.) are always a good mainstream choice. On the other hand, Pascal, maybe not so much.

Friday, November 2, 2012

Donate to Sandy


Everyone, please go ahead and donate to support the recovery efforts from Hurricane Sandy.
GENERAL
For local Red Cross chapters:
New York
New Jersey
Connecticut
For more from the Salvation Army or to donate, visit  https://donate.salvationarmyusa.org/disaster
For local Salvation Army chapters:
NEW YORK CITY

Thursday, November 1, 2012

Jiro Dreams of Sushi - Quest for Perfection

I just saw "Jiro Dreams of Sushi" on Netflix.


It's a touching documentary about an 85-year-old sushi chef, Jiro Ono, and his quest for the perfect sushi. His hole-in-the-wall restaurant possesses the coveted 3-star Michelin rating because of his attention to detail, love for his work, and constant striving for perfection.


In the movie, the famous food critic Yamamoto describes what it takes to be a great chef.
I believe that the same qualities apply to being a great Computer Scientist or an Entrepreneur:
A great chef generally has the following five attributes.
First, they take their work very seriously and consistently strive to perform at the highest level.
Second, they aspire to continually improve their skills. To be better today than yesterday. To be better tomorrow than today.
Third, cleanliness. If the restaurant doesn’t feel clean, the food isn’t going to taste good.
The fourth attribute is impatience. They are not prone to collaboration. They’re stubborn and insist on having things their own way.
What ties these attributes together is passion. That’s what makes a great chef.

Thursday, October 18, 2012

Building a complete HTML5 + Node app

A good tutorial on building a complete app using the latest HTML5 features

Scalability cube

AKF partners wrote a good blog post about how to visualize IT scalability: http://akfpartners.com/techblog/2008/05/08/splitting-applications-or-services-for-scale/


An application that's monolithic and not scalable starts at the bottom-left corner (0,0,0).
You have three choices as you scale your application:

1. Put the application on more servers behind a load balancer and distribute load evenly across them (horizontal duplication on the X-axis). As Blogtalkradio got more visitors, we kept adding more web servers and databases.
2. Separate the application into components, each of which can run on different servers (Y-axis split by function). At Blogtalkradio, our studio control board is hosted on one set of servers, RSS feeds are produced by another group, and regular visitors go to yet another cluster.
3. Finally, for websites with hypergrowth, you can do sharding (aka Z-axis lookup-oriented splits). Here, you take the user GUID or ID, compute a consistent hash H(.) or a hash modulo N, and send the traffic to the corresponding server. This is nice to implement at the CDN or load balancer level if you can augment them with scripts.
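
A minimal sketch of the third option, using the simpler "hash modulo N" variant rather than a true consistent hash. The choice of FNV-1a as H(.), the ShardRouter class and the server names are assumptions made for the example; any stable hash function would do.

```csharp
using System;
using System.Text;

// Hypothetical Z-axis router: maps a user ID to one of N servers via hash modulo N.
public static class ShardRouter
{
    // FNV-1a 32-bit hash; chosen here because it is short and stable across processes.
    public static uint Fnv1a(string key)
    {
        const uint offsetBasis = 2166136261;
        const uint prime = 16777619;
        uint hash = offsetBasis;
        foreach (byte b in Encoding.UTF8.GetBytes(key))
        {
            hash ^= b;
            hash = unchecked(hash * prime); // wrap around on overflow
        }
        return hash;
    }

    // The same user ID always maps to the same server as long as the server list is unchanged.
    public static string PickServer(string userId, string[] servers)
    {
        if (servers == null || servers.Length == 0)
        {
            throw new ArgumentException("At least one server is required", "servers");
        }
        return servers[Fnv1a(userId) % (uint)servers.Length];
    }

    public static void Main()
    {
        // Made-up server names.
        string[] servers = { "shard-0", "shard-1", "shard-2" };
        foreach (string userId in new[] { "alice", "bob", "carol" })
        {
            Console.WriteLine(userId + " -> " + PickServer(userId, servers));
        }
    }
}
```

Note that with plain modulo N, changing the number of servers remaps most users to a different shard. A consistent hash limits that reshuffling, at the cost of a more involved implementation.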

Monday, October 15, 2012

Developer's toolkit (from founders and funders group)

Interestingly, .NET is more often used than Java.


Wednesday, October 10, 2012

Hot 2012 Technologies


HTML5 Job Trends graph

At Cinchcast Tech, we use 7 out of the 10 hottest technologies listed in the Indeed job trends report (http://www.indeed.com/jobtrends).

1. HTML5  (our players)
2. MongoDB (used for Cinchcast reporting and analytics)
3. iOS (cinchcast iphone app)
4. Android
5. Mobile app (done that)
6. Puppet  
7. Hadoop 

8. jQuery (used throughout)
9. PaaS (we could argue we are platform as a service)
10. Social media (sharing etc.)


We are also adopters of NodeJS, Redis, and ElasticSearch. Now that's a lot of acronyms!



Sunday, September 30, 2012

Impact of page load time on conversions


A nice chart from Kissmetrics about the impact of web page load time on profits.

Wednesday, September 26, 2012

Success

Tuesday, September 18, 2012

Cinchcast Introduces an Innovative Solution for Virtual Events, Webcasts, and Conference Needs

Wednesday, September 12, 2012

Even the small bytes matter


Great post by Cinchcast engineer Enrique Alegretta about how we implemented a framework to minify HTML and inline JavaScript on our Blogtalkradio webpages.
This is cross-referenced from our Tech blog http://tech.cinchcast.com 

Here at Cinchcast we found ourselves thinking about different ways we could reduce our bandwidth usage. Since BlogTalkRadio has a considerable user base, we use a lot of bandwidth in our efforts to provide an enjoyable user experience.
One of the techniques we use to facilitate faster page loads is enabling gzip. This is a common practice and saves lots of bytes simply by compressing the response output. Since we want to continually evolve and have a great infrastructure, we continued to look for more ways to improve response times and page load speed. One particular technique we are using (and the main reason for this blog post) is removing whitespace and minifying inline CSS and inline JavaScript (we’re currently minifying external CSS and JS files thanks to the excellent YUI compressor).
Since we’re developers at heart and don’t like to reinvent the wheel unless we plan on learning more about wheels, we asked ourselves: is there anything out there already that we could use, or at least use as a base for our implementation? The answer was yes and was located at this codeplex project.

Enter WebOptimizer

WebOptimizer.NET is a set of HTTP modules you can add to your website for whitespace removal, inline JavaScript minification and inline CSS minification, among other things. It’s a great project, but since it’s open source, instead of using it as is we took the pieces we wanted and built our own solution (because we like to learn more about wheels XD). This effort led to the birth of the Cinchcast Framework, a common set of tools and functionality which will be shared among the Cinchcast and Blog Talk Radio applications.

The Cinchcast Web Framework

We ended up digging through WebOptimizer, took the pieces we needed and made some changes to them. One key class we took and changed was the BaseFilterStream class. This is an abstract class that provides the base functionality to create FilterStreams, which are assigned to the Response.Filter property in order to modify the response. One of the changes we made was to modify its constructors in order to have the following:
/// <summary>
/// Initializes a new instance of the <see cref="BaseFilterStream"/> class.
/// </summary>
/// <param name="context">The context.</param>
protected BaseFilterStream(HttpContextBase context)
{
    if (context == null)
        throw new ArgumentNullException("context", "The context cannot be null");

    HttpContext = context;
    Sink = context.Response.Filter;
}

/// <summary>
/// Initializes a new instance of the <see cref="BaseFilterStream"/> class.
/// </summary>
/// <param name="context">The context.</param>
/// <param name="previousStream">The previous stream.</param>
protected BaseFilterStream(HttpContextBase context, Stream previousStream)
{
    HttpContext = context;
    Sink = previousStream;
}

/// <summary>
/// Gets the HTTP context.
/// </summary>
/// <value>
/// The HTTP context.
/// </value>
protected HttpContextBase HttpContext
{
    get;
    private set;
}
The reasons for doing this are twofold:
  1. We wanted to have the HttpContext available to all the classes which inherit from this class. We’re using the HttpContextBase abstraction in order to be able to mock it during our unit tests.
  2. We wanted to provide stream chainability to perform several chained actions over the Response output. This is why one constructor takes a reference to the previous stream. Every time the Write method is called, all the actions are performed and a call to Sink.Write is made. When the Sink is the previousStream, that call simply invokes the Write method of another filter stream, which will perform its glorious magic.
We did reuse the RemoveWhiteSpaceFilterStream, MinifyInlineStylesFilterStream and MinifyInlineJavascriptFilterStream, but instead of using the minification tools that WebOptimizer provided, we used Microsoft’s AjaxMinLib for CSS minification and the JSCompressor class from this codeproject article.
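
The chaining idea can be illustrated with a simplified, self-contained sketch. It is not the actual BaseFilterStream: to avoid a dependency on System.Web it drops the HttpContext entirely, and the class names TextFilterStream, StripNewlinesFilter and CollapseSpacesFilter are made up for the example. Each filter transforms what it receives and passes the result to its sink, which may be another filter.

```csharp
using System;
using System.IO;
using System.Text;

// Simplified, hypothetical base class: transforms text and forwards it to the next stream (the sink).
// It assumes each Write call carries whole UTF-8 characters.
public abstract class TextFilterStream : Stream
{
    private readonly Stream sink;

    protected TextFilterStream(Stream sink)
    {
        if (sink == null) throw new ArgumentNullException("sink");
        this.sink = sink;
    }

    protected abstract string Transform(string text);

    public override void Write(byte[] buffer, int offset, int count)
    {
        string text = Encoding.UTF8.GetString(buffer, offset, count);
        byte[] output = Encoding.UTF8.GetBytes(Transform(text));
        sink.Write(output, 0, output.Length); // the sink may itself be another filter
    }

    public override void Flush() { sink.Flush(); }
    public override bool CanRead { get { return false; } }
    public override bool CanSeek { get { return false; } }
    public override bool CanWrite { get { return true; } }
    public override long Length { get { throw new NotSupportedException(); } }
    public override long Position
    {
        get { throw new NotSupportedException(); }
        set { throw new NotSupportedException(); }
    }
    public override int Read(byte[] buffer, int offset, int count) { throw new NotSupportedException(); }
    public override long Seek(long offset, SeekOrigin origin) { throw new NotSupportedException(); }
    public override void SetLength(long value) { throw new NotSupportedException(); }
}

public class StripNewlinesFilter : TextFilterStream
{
    public StripNewlinesFilter(Stream sink) : base(sink) { }

    protected override string Transform(string text)
    {
        return text.Replace("\r", "").Replace("\n", "");
    }
}

public class CollapseSpacesFilter : TextFilterStream
{
    public CollapseSpacesFilter(Stream sink) : base(sink) { }

    protected override string Transform(string text)
    {
        while (text.Contains("  ")) text = text.Replace("  ", " ");
        return text;
    }
}

public static class FilterChainDemo
{
    public static string Run(string html)
    {
        var output = new MemoryStream();
        // Writes go through StripNewlinesFilter first, then CollapseSpacesFilter, then the output.
        Stream chain = new StripNewlinesFilter(new CollapseSpacesFilter(output));
        byte[] bytes = Encoding.UTF8.GetBytes(html);
        chain.Write(bytes, 0, bytes.Length);
        chain.Flush();
        return Encoding.UTF8.GetString(output.ToArray());
    }

    public static void Main()
    {
        Console.WriteLine(Run("<p>\n    Hello   world\n</p>")); // prints "<p> Hello world</p>"
    }
}
```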

Cleaning up the response

We took our base foundation classes to get a Response.Filter, twisted it to make it show what we want, then wrapped this up into an HttpModule we named... yeah, you guessed it, CleanHttpResponseModule or CHRM (pronounced charm).
Adding this module to the website automatically performs whitespace removal, inline JavaScript minification and inline CSS minification. But we need to debug stuff, and if you spend too much time debugging something that is minified you’ll end up like Crazy Dave from PvZ, so we gave the module the ability to deactivate any of those features. You can change this behavior on a page-by-page basis (using the query string) or for the entire application (using the appSettings section).
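
One way such a switch could be resolved is sketched below. This is an assumption about the general shape, not the module's actual code: the setting names (RemoveWhitespace, MinifyInlineJs), the rule that the query string wins over appSettings, and the default of "enabled" are all made up for the example. Plain NameValueCollection objects stand in for Request.QueryString and ConfigurationManager.AppSettings so the sketch runs on its own.

```csharp
using System;
using System.Collections.Specialized;

// Hypothetical feature-toggle lookup: a per-page query string value overrides the application-wide setting.
public static class MinifyToggle
{
    public static bool IsEnabled(string feature, NameValueCollection appSettings, NameValueCollection queryString)
    {
        bool enabled;
        if (bool.TryParse(queryString[feature], out enabled)) return enabled; // page-level override
        if (bool.TryParse(appSettings[feature], out enabled)) return enabled; // application-level setting
        return true; // assumed default: the feature is on unless switched off
    }

    public static void Main()
    {
        var appSettings = new NameValueCollection();
        appSettings["RemoveWhitespace"] = "true";

        var queryString = new NameValueCollection();
        queryString["RemoveWhitespace"] = "false"; // e.g. ?RemoveWhitespace=false while debugging

        Console.WriteLine(IsEnabled("RemoveWhitespace", appSettings, queryString)); // prints "False"
        Console.WriteLine(IsEnabled("MinifyInlineJs", appSettings, queryString));   // prints "True"
    }
}
```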

Conclusion and results

Using this module to its full potential (minifying JS and CSS and removing whitespace) allows us to save between 2KB and 5KB per request. Since our requests are in the millions, multiply that value out and in the end the gains are significant.
This is why even the small bytes matter.
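
To put a rough number on it, here is the back-of-the-envelope arithmetic as a short sketch. The figure of one million requests per day is an assumption picked for illustration, not a measured value, and the calculation uses decimal units (1 GB = 1,000,000 KB).

```csharp
using System;

public static class BandwidthSavings
{
    // Returns saved gigabytes, using 1 GB = 1,000,000 KB (decimal units).
    public static long SavedGigabytes(long requests, long savedKbPerRequest)
    {
        return requests * savedKbPerRequest / 1000000;
    }

    public static void Main()
    {
        long requestsPerDay = 1000000; // assumed volume, not a figure from the post
        Console.WriteLine("Saved per day: {0} GB to {1} GB",
            SavedGigabytes(requestsPerDay, 2),
            SavedGigabytes(requestsPerDay, 5)); // prints "Saved per day: 2 GB to 5 GB"
    }
}
```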