Posts Tagged ‘thrift’

In a bit pipe world, do we need telecom standards?

GSMA, ETSI, etc. have been defining standards for the telecom world for years. However, outside the telecom industry these standards have found little or no adoption. In a world where telecom operators are fast becoming bit pipes, do we really need telecom standards? Why can’t the telecom industry use SIP, WebRTC, REST, etc. just like everybody else?

Current telecom systems assume that calls and SMS need to be billed for. What would happen if the starting point were that data insights, network apps and connectivity are the only things that are billed? Connectivity is likely to become unlimited, or to come with very high limits, over time. New revenue would have to come from selling data insights, either individually with consent or aggregated and anonymised, as well as from apps that run inside the network: on CPEs, DSLAMs, mobile base stations, etc. So for the purpose of this blog let’s focus on a world where calls and SMS can no longer be charged for and connectivity is close to unlimited for most normal use cases.

To move bits fast through a network, you want the fewest possible protocol converters, so using many different standards makes things slow and expensive. Additionally, telecom operators have for years overpaid for lots of standards and their software support without ever using them. Finally, implementing a standard is very costly because often only 20% of the functionality is really used, but the other 80% needs to be there to pass compliance tests.

The nonstandard or empty networking appliance
In a world where software can define networks and any missing functionality is just a networking app away, it would be a lot better to start from an empty networking appliance, i.e. networking hardware without software, and then buy only what you need. If you need a standard, you might want to buy the minimal/light implementation, i.e. the 20%, and see if you can live with it. Chances are you still have too many functions that are not used.

Facebook open sourced its top-of-rack networking solution and, surprise, surprise, the interface is Thrift based. Thrift is used across the other Facebook services as a standard high-throughput interface between software services. Google probably uses Protocol Buffers. Apache Avro would be another openly licensed alternative. So instead of focusing on a public but slow standard, it would be better to standardise on a throughput-optimised interface technology. Inside a telecom operator this would work very efficiently, and for those systems that talk to legacy or outside-world systems, adding a standard is just a networking app away. This would simplify a telecom network substantially, saving enormous costs and accelerating the speed of change, because less code needs to be written and maintained and integrations become easier.

These are all ideas that assume there are actual appliances that are software defined. As soon as general-purpose compute becomes fast enough for heavy data plane traffic, the reality will be software defined networking in a virtualized way, with autoscaling and all the other cloud goodies. However, this reality is still some years off, unfortunately. In the short run, virtualization of the control plane and software defined networking appliances [SDNA] for the data plane is the most realistic option…

Open Source Solution Index from the Big Dotcoms

January 26, 2012

The big names in the dotcom world are busy open sourcing some of their secret sauce. It is very important to become familiar with these often strangely named projects because they are responsible for several competitive advantages. Since the list is growing, please suggest new solutions in the comments section so they can be added.

Google

Facebook

  • Cassandra is a highly scalable, eventually consistent, distributed, structured key-value store
  • Hive is a data warehouse infrastructure that provides data summarization and ad hoc querying.
  • FlashCache is a general purpose writeback block cache for Linux. It was developed as a loadable Linux kernel module, using the Device Mapper and sits below the filesystem.
  • HipHop for PHP transforms PHP source code into highly optimized C++. HipHop offers large performance gains and was developed over the past two years.
  • Open Compute Project is an open hardware project that aims to accelerate data center and server innovation while increasing computing efficiency through collaboration on relevant best practices and technical specifications.
  • Scribe is a scalable service for aggregating log data streamed in real time from a large number of servers.
  • Thrift provides a framework for scalable cross-language services development in C++, Java, Python, PHP, and Ruby.
  • Tornado is a relatively simple, non-blocking web server framework written in Python. It is designed to handle thousands of simultaneous connections, making it ideal for real-time Web services.
  • codemod assists with large-scale codebase refactors that can be partially automated but still require human oversight and occasional intervention.
  • Facebook Animation is a JavaScript library for creating customizable animations using DOM and CSS manipulation.
  • Online Schema Change for MySQL lets you alter large database tables without taking your cluster offline.
  • Phabricator is a collection of web applications which make it easier to write, review, and share source code. It is currently available as an early release and is used by hundreds of Facebook engineers every day.
  • PHPEmbed is a more accessible and simplified API built on top of the PHP SAPI, developed to make embedding PHP truly simple for Facebook's developers (and indeed the world).
  • phpsh provides an interactive shell for PHP that features readline history, tab completion, and quick access to documentation. It is ironically written mostly in Python.
  • Three20 is an Objective-C library for iPhone developers which provides many of the UI elements and data helpers behind Facebook's iPhone application.
  • XHP is a PHP extension which augments the syntax of the language such that XML document fragments become valid expressions.
  • XHProf is a function-level hierarchical profiler for PHP with a simple HTML-based navigational interface.

Twitter

Twitter has open sourced some complete projects (e.g. FlockDB) but mainly contributes extensions to existing projects. For a full list see here.

Yahoo

  • Apache Traffic Server is a fast, scalable and extensible HTTP/1.1 compliant caching proxy server.
  • Hadoop, THE big data solution at the moment, grew out of the Nutch project and was developed largely at Yahoo. Yahoo also actively contributes to several related projects like Avro and Pig.
  • YUI is a free, open source JavaScript and CSS framework for building richly interactive web applications.

LinkedIn

  • Azkaban is a simple batch scheduler for constructing and running Hadoop jobs or other offline processes
  • Bobo is a faceted search implementation written purely in Java, an extension of Apache Lucene
  • Cleo is a flexible library for partial, out-of-order and real-time typeahead search
  • Datafu is a Hadoop library for large-scale data processing
  • Decomposer is for massive matrix decompositions
  • Glu is a deployment automation platform
  • A set of useful gradle plugins
  • Indexing engine for IndexTank and API, BackOffice, Storefront, and Nebulizer for IndexTank  
  • Kafka is a distributed publish/subscribe messaging system
  • Kamikaze is a utility package for performing operations on compressed arrays of sorted integers
  • Krati is a simple persistent data store with very low latency and high throughput
  • Base utilities shared by all LinkedIn open source projects
  • A set of utility classes and wrappers around ZooKeeper
  • Norbert is a library that provides easy cluster management and workload distribution
  • Sensei is a distributed, elastic, realtime, searchable database
  • Voldemort is a distributed key-value storage system
  • Zoie is a real-time search and indexing system built on Apache Lucene

Alternatives to paying millions in software licenses

January 9, 2012

Telecom operators pay millions in software licenses each year. By doing so they are sustaining an industry of “feature loading”. “Feature loading” refers to complex software solutions that add more and more features in order to win RFPs. Most telecom operators use RFPs to compare different software solutions. Whoever has the most features for the lowest price wins the deal. The end result is that telecom software is unnecessarily complex and expensive. Software providers do not want to respond with “not compliant” and prefer to add an extra feature, even if whoever wrote the RFP will never use it.

The likes of Apple have shown us that software is most beautiful when it does very few things very well. The era of mini applications allows users to use special purpose “apps” for each activity. No training required. No heavy investment. No heavy integrations.

Telecom operators should move away from long RFPs in which hundreds of features are compared. Instead they should try to simplify. Why pay millions for a complex system that does too many things in too complex a way? Many large dotcoms have moved away from this type of solution and, to reduce complexity, have used Open Source, built single purpose systems/services, or built generic platforms with plugins.

Examples:

  • Amazon has pioneered Cloud Computing and has created individual single purpose systems or services that are easily accessible via REST or Web Interfaces. Different individual services (e.g. product recommendation, virtual server, virtual storage) get aggregated into complex solutions at the last moment.
  • Google built its Google File System, BigTable, etc. as generic platforms on which hundreds of other services could be easily added.
  • Thousands of dotcoms are using Hadoop, Cassandra, etc. to store data.

Each telecom service needs to be provisioned, rated, charged, billed, monitored, operated, supported, migrated, etc. Because network, IT, communication and services have been mixed into mega-complex architectures, it has become impossible to launch new services in less than 12 months.

Building a Free Telco PaaS

How to do it differently? Is it possible to build a zero-license Telco PaaS that acts like a giant service delivery platform in the Cloud? YES

Operators will need to use Open Source, IaaS and SaaS solutions. IaaS can be delivered cheaply by using Open Source components: KVM for virtualization, OpenNebula for virtual machine and storage management, Hadoop/Cassandra for storage, Open vSwitch for network virtualization, etc. On top of that, PaaS platforms can be built with solutions like WSO2 Stratos. Telecom services like Twilio's, or the private cloud version, RestComm, can be used to allow developers to quickly create VAS. Open Source billing systems have been announced, like Meveo. Online shops can be built with OpenCart. Data warehousing and data analytics can be done with Pentaho or JasperReports. There are hundreds of open source monitoring solutions: Icinga, Nagios, Zenoss, etc. Helpdesk can either be SaaS, like Zendesk, or Open Source, like Request Tracker. CRM can be covered by SugarCRM, and SIP back-office systems by FreeSWITCH.

Operators should start thinking about the Cloud as a way to simplify internal integrations. All back-office systems should be shielded from the outside via easy-to-use interfaces: REST, Thrift/Protocol Buffers, etc. Service-based load balancing should allow service upgrades and rolling migrations without outages. The architecture should be built with Salesforce.com in mind: non-programmers, and even better end-users, should be able to build their own VAS by using drag-and-drop interfaces and combining different service blocks into custom solutions. Plug-ins allow for custom behaviour without cluttering a solution for the rest of the users.
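The rolling migration idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Backend` and `ServicePool` classes, the round-robin policy and the version strings are all hypothetical and do not describe any particular load balancer. One instance at a time is taken out of rotation, upgraded and put back, while the remaining instances keep serving requests.

```python
from dataclasses import dataclass


@dataclass
class Backend:
    """One instance of a service behind the load balancer (hypothetical model)."""
    name: str
    version: str
    in_rotation: bool = True


class ServicePool:
    """Toy round-robin pool; a stand-in for a real service-based load balancer."""

    def __init__(self, backends):
        self.backends = list(backends)
        self._counter = 0

    def pick(self):
        """Return the next backend that is currently in rotation."""
        live = [b for b in self.backends if b.in_rotation]
        if not live:
            raise RuntimeError("no backend in rotation")
        backend = live[self._counter % len(live)]
        self._counter += 1
        return backend

    def rolling_upgrade(self, new_version):
        """Upgrade one backend at a time; return who served traffic meanwhile."""
        served_by = []
        for backend in self.backends:
            backend.in_rotation = False          # drain this instance
            served_by.append(self.pick().name)   # a request arrives mid-upgrade
            backend.version = new_version        # perform the (simulated) upgrade
            backend.in_rotation = True           # put it back into rotation
        return served_by
```

With three backends, `ServicePool(backends).rolling_upgrade("2.0")` leaves every instance on the new version, and each request that arrives during the upgrade is served by an instance other than the one being drained. A pool with a single backend would raise an error instead, which is exactly the outage a rolling migration is meant to avoid.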

Operators should embrace new disruptive technologies to simplify their business, lower their cost structures and be able to launch new services every hour of the day. Large dotcoms are launching new features every day and use A/B testing to validate whether users like them and whether they add to the bottom line. Marketing and product management get a totally different dimension…

The power of binary SIP

December 10, 2010

With the world looking more at XML, SOAP and REST these days, it is perhaps counterintuitive to think binary again. However, with Protocol Buffers [Protobuf], Thrift, Avro and BSON being used by the large dotcoms, thinking binary feels modern again…

How can we apply binary to telecom? Binary SIP?

SIP is a protocol for handling sessions for voice, video and instant messaging. It is a text-based protocol modelled on HTTP, with human-readable headers. For a SIP session to be set up, a lot of communication is required between different parties. What if that communication were replaced by a binary protocol based, for instance, on Protocol Buffers? Google’s Protocol Buffers can dramatically reduce network load and parsing time; Google’s documentation claims messages 3 to 10 times smaller and parsing 20 to 100 times faster than XML.
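To make the size argument concrete, here is a minimal Python sketch that packs the core fields of an INVITE into a fixed binary layout using the standard `struct` module. Everything about the layout is an assumption made for illustration: the `METHOD_INVITE` code, the field order and the helper names are hypothetical, this is not Protocol Buffers' wire format, and the binary form carries fewer headers than the text message, so the comparison is only indicative.

```python
import struct

# A small SIP INVITE in its usual text form (example values for illustration).
TEXT_INVITE = (
    "INVITE sip:bob@example.com SIP/2.0\r\n"
    "Via: SIP/2.0/UDP pc33.example.org;branch=z9hG4bK776asdhds\r\n"
    "Max-Forwards: 70\r\n"
    "To: <sip:bob@example.com>\r\n"
    "From: <sip:alice@example.org>;tag=1928301774\r\n"
    "Call-ID: a84b4c76e66710\r\n"
    "CSeq: 314159 INVITE\r\n"
    "Content-Length: 0\r\n"
    "\r\n"
).encode("ascii")

# Hypothetical numeric code for the method; not part of any standard.
METHOD_INVITE = 1

# Fixed header: method (1 byte), CSeq (4 bytes), Max-Forwards (1 byte).
_HEADER = "!BIB"


def _packed_str(value):
    """Encode a string as a 1-byte length followed by its UTF-8 bytes."""
    data = value.encode("utf-8")
    return struct.pack("!B", len(data)) + data


def encode_binary_invite(to_uri, from_uri, call_id, cseq, max_forwards):
    """Pack the core INVITE fields into the hypothetical binary layout."""
    header = struct.pack(_HEADER, METHOD_INVITE, cseq, max_forwards)
    return header + _packed_str(to_uri) + _packed_str(from_uri) + _packed_str(call_id)


def decode_binary_invite(payload):
    """Reverse of encode_binary_invite; returns the fields as a dict."""
    method, cseq, max_forwards = struct.unpack_from(_HEADER, payload, 0)
    offset = struct.calcsize(_HEADER)
    strings = []
    for _ in range(3):
        (length,) = struct.unpack_from("!B", payload, offset)
        offset += 1
        strings.append(payload[offset:offset + length].decode("utf-8"))
        offset += length
    to_uri, from_uri, call_id = strings
    return {"method": method, "cseq": cseq, "max_forwards": max_forwards,
            "to_uri": to_uri, "from_uri": from_uri, "call_id": call_id}
```

Encoding the same To, From, Call-ID and CSeq values as in `TEXT_INVITE` and comparing `len()` of both byte strings shows the binary form to be considerably shorter, and decoding needs no line splitting or header-name matching.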

What would be the advantages:

  • Latency – faster parsing and less network traffic reduce latency, which is key in real-time communication.
  • Performance – faster parsing and lower load means that more can be done for less. One server can handle more clients.
  • Scalability – distributing the handling of SIP sessions over more machines becomes easier if each transaction can be handled faster.

Disadvantages:

  • No easy debugging – SIP is human readable, hence debugging is “easier”. However, in practice tools could be written that allow binary debugging.
  • Syncing client & server – client and server libraries need to be in sync, otherwise messages cannot be parsed. Protocol Buffers parsers skip fields they do not know, so there is some freedom for an old client to connect to a newer server or vice versa.
  • Firewalls/Existing equipment – a new binary protocol cannot interoperate with existing equipment. A SIP to binary SIP proxy is necessary.
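The client/server syncing point can be illustrated with a simplified tag-length-value encoding in Python. This is a sketch under assumptions: the tag numbers, the `KNOWN_FIELDS_V1` table and the function names are hypothetical, and the layout is deliberately simpler than Protocol Buffers' real wire format. It shows only the principle that a parser which knows each field's length can skip fields it does not recognise.

```python
import struct

# Fields understood by an "old" client (hypothetical tag numbers).
KNOWN_FIELDS_V1 = {1: "call_id", 2: "to_uri"}


def encode_fields(fields):
    """Encode {tag: bytes} as tag (1 byte) + length (2 bytes) + value."""
    out = bytearray()
    for tag, value in fields.items():
        out += struct.pack("!BH", tag, len(value)) + value
    return bytes(out)


def decode_fields(payload, known):
    """Decode known fields by name and silently skip unknown tags."""
    result = {}
    offset = 0
    while offset < len(payload):
        tag, length = struct.unpack_from("!BH", payload, offset)
        offset += struct.calcsize("!BH")
        value = payload[offset:offset + length]
        offset += length
        if tag in known:
            result[known[tag]] = value.decode("utf-8")
        # Unknown tags fall through: the length prefix lets us step over them.
    return result
```

A newer server could add a field with tag 3; an old client calling `decode_fields(message, KNOWN_FIELDS_V1)` would still read `call_id` and `to_uri` and simply ignore the addition.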

It would be interesting to see whether a binary SIP prototype combined with the latest NoSQL data stores can compete with commercial SIP/IMS equipment in scalability, latency and performance.
