I just saw Eric Dishman’s TED session on “Health care should be a team sport“. I love the idea of providing people with chronicle illness the means to be diagnosed and treated remotely and use big data to learn of a large group of patients with similar issues. Personally this would mean that when my sons have breathing problems we would not have to drag them in the middle of the night to a hospital where they are exposed to many viruses. Instead by measuring their oxygen level and listening to their longs a personalized remote diagnose could be made and some nebulizers or other things administered. At scale all equipment would probably cost less than £200 because Maplin already sells the nebulizer and oxygen level meter for a combined £110. Add another £90 at worst for a stethoscope that can be connected via bluetooth to a smartphone. Now via Hangout a doctor could remotely diagnose the results and even in the future a computer programme. All results of millions of patients would be collected in order to improve treatment. So no need for an expensive hospital in London with a receptionist, nurse and doctor dedicating 2 hours. By just avoiding one hospital night, the whole system would be enormously profitable. Additionally Ubuntu’s Juju can be used to set up all the big data and diagnostic software in minutes in any cloud or server on any place in the world. If other open source solutions are used then the total solution would be in reach for any developing country. There are probably more than one developer whose kids are asthmatic, and would happily contribute time. It sounds like an ideal Gates Foundation or Kickstarter project. If you think you can help please reach out to me because this is not work for me, this is personal engagement.
The cloud is revolutionising IT. However there are two sides to every story: the winners and the losers. Who are they going to be and why? If you can’t wait here are the losers: HP, Oracle, Dell, SAP, RedHat, Infosys, VMWare, EMC, Cisco, etc. Survivors: IBM, Accenture, Intel, Apple, etc. Winners: Amazon, Salesforce, Google, CSC, Workday, Canonical, Metaswitch, Microsoft, ARM, ODMs.
Now the question is why and is this list written in stone?
What has cloud changed?
If you are working in a hardware business (storage, networking, etc. is also included) then cloud computing is a value destroyer. You have an organisation that is assuming small, medium and large enterprises have and always will run their own data centre. As such you have been blown out of the water by the fact that cloud has changed this fundamental rule. All of a sudden Amazon, Google and Facebook go and buy specialised webscale hardware from your suppliers, the ODMs. Facebook all of a sudden open sources hardware, networking, rack and data centre designs and makes it that anybody can compete with you. Cloud is all about scale out and open source hence commodity storage, software defined networks and network virtualisation functions are converting your portfolio in commodity products. If you are an enterprise software vendor then you always assumed that companies will buy an instance of your product, customise it and manage it themselves. You did not expect that software can be offered as a service and that one platform can offer individual solutions to millions of enterprises. You also did not expect that software can be sold by the hour instead of licensed forever. If you are an outsourcing company then you assume that companies that have invested in customising Siebel will want you to run this forever and not move to Salesforce.
Reviewing the losers
HP’s Cloud Strategy
HP has been living from printers and hardware. Meg rightfully has taken the decision to separate the cashcow, stop subsidising other less profitable divisions and let it be milked till it dies. The other group will focus on Cloud, Big Data, etc. However HP Cloud is more expensive and slower moving than any of the big three so economies of scale will push it into niche areas or make it die. HP’s OpenStack is a product that came 2-3 years late to the market. A market as we will see later that is about to be commoditised. HP’s Big Data strategy? Overpay for Vertica and Autonomy and focus your marketing around the lawsuits with former owners, not any unique selling proposition. Also Big Data can only be sold if you have an open source solution that people can test. Big Data customers are small startups that quickly have become large dotcoms. Most enterprises would not know what to do with Hadoop even if they could download it for free [YES you can actually download it for free!!!].
Oracle’s Cloud Strategy
Oracle has been denying Cloud existed until their most laggard customer started asking questions. Until very recently you could only buy Oracle databases by the hour from Amazon. Oracle has been milking the enterprise software market for years and paying surprise visits to audit your usage of their database and send you an unexpected bill. Recently they have started to cloud-wash [and Big Data wash] their software portfolio but Salesforce and Workday already are too far ahead to catch them. A good Christmas book Larry could buy from Amazon would be “The Innovator’s Dilemma“.
Dell’s Cloud Strategy
Go to the main Dell page and you will not find the word Big Data or Cloud. I rest my case.
SAP’s Cloud Strategy
Workday is working hard on making SAP irrelevant. Salesforce overtook Siebel. Workday is likely to do the same with SAP. People don’t want to manage their ERP themselves.
RedHat’s Cloud Strategy
[I work for their biggest competitor] RedHat salesperson to its customers: There are three versions. Fedora if you need innovation but don’t want support. CentOS if you want free but no security updates. RHEL is expensive and old but with support. Compare this to Canonical. There is only one Ubuntu, it is innovative, free to use and if you want support you can buy it extra.
For Cloud the story is that RedHat is three times cheaper than VMWare and your old stuff can be made to work as long as you want it according to a prescribed recipe. Compare this with an innovator that wants to completely commoditise OpenStack [ten times cheaper] and bring the most innovative and flexible solution [any SDN, any storage, any hypervisor, etc.] that instantly solves your problems [deploy different flavours of OpenStack in minutes without needing any help].
Infosys or any outsourcing company
If the data centre is going away then the first thing that will go away is that CRM solution we bought in the 90’s from a company that no longer exists.
For the company that brought virtualisation into the enterprise it is hard to admit that by putting a rest API in front of it, you don’t need their solution in each enterprise any more.
Commodity storage means that scale out storage can be offered at a fraction of the price of a regular EMC SAN solution. However the big killer is Amazon’s S3 that can give you unlimited storage in minutes without worries.
A Cisco router is an extremely expensive device that is hard to manage and build on top of proprietary hardware, a proprietary OS and proprietary software. What do you think will happen in a world where cheap ASIC + commodity CPU, general purpose OS and many thousands of network apps from an app store become available? Or worse, a network will no longer need many physical boxes because most of it is virtualised.
What does a cloud loser mean?
A cloud loser means that their existing cash cows will be crunched by disruptive innovations. Does this mean that losers will disappear or can not recuperate? Some might disappear. However if smart executives in these losing companies would be given the freedom to bring to market new solutions that build on top of the new reality then they might come out stronger. IBM has shown they were able to do so many times.
Let’s look at the cloud survivors.
IBM has shown over and over again that it can reinvent itself. It sold its x86 servers in order to show its employees and the world that the future is no longer there. In the past it bought PWC’s consultancy which will keep on reinventing new service offerings for customers that are lost in the cloud.
Just like PWC’s consultancy arm within IBM, Accenture will have consultants that help people make the transition from data centre to the cloud. Accenture will not be leading the revolution but will be a “me-to” player that can put more people faster than others.
X86 is not going to die soon. The cloud just means others will be buying it. Intel will keep on trying to innovate in software and go nowhere [e.g. Intel’s Hadoop was going to eat the world] but at least its processors will keep it above the water.
Apple knows what consumers want but they still need to prove they understand enterprises. Having a locked-in world is fine for consumers but enterprises don’t like it. Either they come up with a creative solution or the billions will not keep on growing.
What does a cloud survivor mean?
A cloud survivor means that the key cash cows will not be killed by the cloud. It does not give a guarantee that the company will grow. It just means that in this revolution, the eye of the tornado rushed over your neighbours house, not yours. You can still have lots of collateral damage…
IaaS = Amazon. No further words needed. Amazon will extend Gov Cloud into Health Cloud, Bank Cloud, Energy Cloud, etc. and remove the main laggard’s argument: “for legal & security reasons I can’t move to the cloud”. Amazon currently has 40-50 Anything-as-a-Service offerings in 36 months they will have 500.
PaaS & SaaS = Salesforce. Salesforce will become more than a CRM on steroids, it will be the world’s business solutions platform. If there is no business solution for it on Salesforce then it is not a business problem worth solving. They are likely to buy competitors like Workday.
Google is the king of the consumer cloud. Google Apps has taken the SME market by storm. Enterprise cloud is not going anywhere soon however. Google was too late with IaaS and is not solving on-premise transitional problems unlike its competitors. With Kubernetes Google will re-educate the current star programmers and over time will revolutionise the way software is written and managed and might win in the long run. Google’s cloud future will be decided in 5-10 years. They invented most of it and showed the world 5 years later in a paper.
CSC has moved away from being a bodyshop to having several strategic important products for cloud orchestration and big data. They have a long-term future focus, employing cloud visionaries like Simon Wardley, that few others match. You don’t win a cloud war in the next quarter. It took Simon 4 years to take Ubuntu from 0% to 70% on public clouds.
What Salesforce did to Oracle’s Siebel, Workday is doing to SAP. Companies that have bought into Salesforce will easily switch to Workday in phase 2.
Since RedHat is probably reading this blog post, I can’t be explicit. But a company of 600 people that controls up to 70% of the operating systems on public clouds, more than 50% of OpenStack, brings out a new server OS every 6 months, a phone OS in the next months, a desktop every 6 months, a complete cloud solution every 6 months, can convert bare-metal into virtual-like cloud resources in minutes, enables anybody to deploy/integrate/scale any software on any cloud or bare-metal server [Intel, IBM Power 8, ARM 64] and is on a mission to completely commoditise cloud infrastructure via open source solutions in 2015 deserves to make it to the list.
Metaswitch has been developing network software for the big network guys for years. These big network guys would put it in a box and sell it extremely expensive. In a world of commodity hardware, open source and scale out, Clearwater and Calico have catapulted Metaswitch to the list of most innovative telecom supplier. Telecom providers will be like cloud providers, they will go to the ODM that really knows how things work and will ignore the OEM that just puts a brand on the box. The Cloud still needs WAN networks. Google Fibre will not rule the world in one day. Telecom operators will have to spend their billions with somebody.
If you are into Windows you will be on Azure and it will be business as usual for Microsoft.
In an ODM dominated world, ARM processors are likely to move from smart phones into network and into cloud.
Nobody knows them but they are the ones designing everybody’s hardware. Over time Amazon, Google and Microsoft might make their own hardware but for the foreseeable future they will keep on buying it “en masse” from ODMs.
What does a cloud winner mean?
Billions and fame for some, large take-overs or IPOs for others. But the cloud war is not over yet. It is not because the first battles were won that enemies can’t invent new weapons or join forces. So the war is not over, it is just beginning. History is written today…
Have you ever counted the number of Linux devices at home or work that haven’t been updated since they came out of the factory? Your cable/fibre/ADSL modem, your WiFi point, television sets, NAS storage, routers/bridges, media centres, etc. Typically this class of devices hosts a proprietary hardware platform, an embedded proprietary Linux and a proprietary application. If you are lucky you are able to log into a web GUI often using the admin/admin credentials and upload a new firmware blob. This firmware blob is frequently hard to locate on hardware supplier’s websites. No wonder the NSA and others love to look into potential firmware bugs. They are the ideal source of undetected wiretapping.
The next IT revolution: micro-servers
The next IT revolution is about to happen however. Those proprietary hardware platforms will soon give room for commodity multi-core processors from ARM, Intel, etc. General purpose operating systems will replace legacy proprietary and embedded predecessors. Proprietary and static single purpose apps will be replaced by marketplaces and multiple apps running on one device. Security updates will be sent regularly. Devices and apps will be easy to manage remotely. The next revolution will be around managing millions of micro-servers and the apps on top of them. These micro-servers will behave like a mix of phone apps, Docker containers, and cloud servers. Managing them will be like managing a “local cloud” sometimes also called fog computing.
Micro-servers and IoT?
Are micro-servers some form of Internet of Things. Yes they can be but not all the time. If you have a smarthub that controls your home or office then it is pure IoT. However if you have a router, firewall, fibre modem, micro-antenna station, etc. then the micro-server will just be an improved version of its predecessor.
Why should you care about micro-servers?
If you are a mobile app developer then the micro-servers revolution will be your next battlefield. Local clouds need “Angry Bird”-like successes.
If you are a telecom or network developer then the next-generation of micro-servers will give you unseen potentials to combine traffic shaping with parental control with QoS with security with …
If you are a VC then micro-server solution providers is the type of startups you want to invest in.
If you are a hardware vendor then this is the type of devices or SoCs you want to build.
If you are a Big Data expert then imagine the new data tsunami these devices will generate.
If you are a machine learning expert then you might want to look at algorithms and models that are easy to execute on constraint devices once they have been trained on potentially thousands of cloud servers and petabytes of data.
If you are a Devop then your next challenge will be managing and operating millions of constraint servers.
If you are a cloud innovator then you are likely to want to look into SaaS and PaaS management solutions for micro-servers.
If you are a service provider then this is the type of solutions you want to have the capabilities to manage at scale and easily integrate with.
If you are a security expert then you should start to think about micro-firewalls, anti-micro-viruses, etc.
If you are a business manager then you should think about how new “mega micro-revenue” streams can be obtained or how disruptive “micro- innovations” can give you a competitive advantage.
If you are an analyst or consultant then you can start predicting the next IT revolution and the billions the market will be worth in 2020.
The next steps…
It is still early days but expect some major announcements around micro-servers in the next months…
Data volumes are growing exponentially. Unstructured data from Twitter, LinkedIn, Mailling Lists, etc. has the potential to transform many industries if it could be combined with structured data. Machine learning, natural language processing, sentiment analysis, etc. everybody talks about them, hardly anybody is really using them at scale. Too many people when they talk about Big Data unfortunately start with the answer and then ask what the problem it. The answer seems to be Hadoop. News flash: Hadoop is not the answer and if you start from the answer to look for problems then you are doing it wrong.
What are Common Data Problems?
Most Big Data problems are about storage and reporting. How do I store all the exponentially growing data in such a way that business managers can get to in seconds when they need it? Ad-hoc reporting, adequate prediction, and making sense of the exponentially growing data stream are the key problems.
Big Data Storage?
Do you have relational data, unstructured data, graph data, etc.? How do you store different types of data and make it available inside an enterprise? The basics for big data storage is cloud storage technology. You want to store any type of data and be able to quickly scale up storage. RedHat did not buy Inktank for $175M because traditional storage has solved all of today’s problems. Premium SAN and other storage technologies are old school. They are too expensive for Big Data. They were designed with the idea that each byte of data is critical for an enterprise. Unfortunately this is no longer the case. You mind loosing transactional sales data. You don’t mind so much loosing sample tweets you bought from Datasift or Apache log files from an internal low-impact server. This is where cloud storage solutions like Inktank’s Ceph allow commodity storage to be built that is reliable, scalable and extremely cost effective. Does this mean you don’t need SANs any more? Wrong again. TV did not kill Radio. Same here.
Cloud storage technologies are needed because each type of data behaves differently. If you have log data that only is appended then HDFS is fine. If you have read-mostly data then a relational database is ideal. If you have write-mostly data then you need to look at NoSQL. If you need heavy read-and-write then you need strong Big Data architecture skills. What is more important: short latency, consistency, reliability, cheap storage, etc.? Each of these means that the solution is different. No latency means in-memory or SSD. Consistency means transactional. Reliability means replication. You can even now find inconsistent databases like BlinkDB. There is no longer one size fits all. Oracle is no longer the answer to everybody’s data questions.
What will companies need? Companies need cloud storage solutions that offer these different storage capabilities like a service. Amazon’s RDS, DynamoDB, S3 and Redshift are examples of what companies need. However companies need more flexibility. They need to be able to migrate their data between public cloud providers to optimise their costs and have added security. They also need to be able to store data in private local clouds or nearby hosted private clouds for latency or regulatory reasons.
The future of ETL & BI
Traditional ETL will see a revolution. ETL never worked. Business managers don’t want to go and ask their IT department to make a change in a star schema in order to import some extra data from the Internet followed by updates to reports and dashboards. Business managers want an easy to use tool that can answer their ad-hoc queries. This is the reason why Tableau Software + Amazon Redshift are growing like crazy. However if your organisation is starting to pump terabytes of data into Redshift, please be warned: The day will come that Amazon sends you a bill that your CxO will not want to pay and he/she will want you to move out of Amazon. What will you do then? Do you have an exit strategy?
The future of ETL and BI will be web tools that any business manager can use to create ad-hoc reports. The Office generation wants to see dynamic HTML5 GUIs that allow them to drag-and-drop data queries into ad-hoc reports and dashboards. If you need training then the tool is too difficult.
These next-generation BI tools will need dynamic back-office solutions that allow storing real-time, graph, blob, historical relational, unstructured, etc. data into a commonly accessible cloud storage solution. Each one will be hosted by a different cloud service but they will all be an API away. Software will be packaged in such a way that it knows how to export its own data. Why do you need to know where Apache stores the access and error logs and in which format? Apache should be able to export whatever interesting information it contains in a standardised way into some deep storage. Machine learning should be used to make decisions on how best to store that data for ad-hoc reporting afterwards. Humans should no longer be involved in this process.
Talking about machine learning. With the volumes of data growing from gigabytes into petabytes, traditional data scientists will not scale. In many companies a data scientist is similar to a report monkey: “Find out why in region X we sold Y% less”, etc. Data scientist should not be synonymous for dynamic report generators. Data scientists should be machine learning experts. They should tell the computer what they want, not how they want it. Today’s data scientists pride themselves they know R, Python, etc. These tools are too low-level to be usable at scale. There are just not enough people in the world to learn R. Data is growing exponentially, R experts at best can grow linear. What we need are machine learning GUI solutions like RapidMiner Studio but supported by Petabyte cloud solutions. A short term solution could be an HTML5 GUI version of RapidMiner Studio that connects to a back-end set of cloud services that use some of the nice Apache Spark extensions for machine learning, streaming, Big Data warehousing/SQL, graph retrieval, etc. or solutions based on Druid.io. For sure there are other solutions possible.
What is important is that companies start realising that data is becoming a strategic weapon. Those companies that are able to collect more of it and convert it into valuable knowledge and wisdom will be tomorrow’s giants. Most average machine learning algorithms become substantially better just by throwing more and more data at them. This means that having a Big Data architecture is not as critical as having the best trained models in the industry and continue to train them. There will be a data divide between the have’s and have-not’s. Google, Facebook, Microsoft and others have been buying any startup that smells like Deep Belief Networks. They have done this with a good reason. They know that tomorrow’s algorithms and models will be more valuable than diamonds and gold. If you want to be one of the have’s then you need to invest in cloud storage now. You need to have massive historical data volumes to train tomorrow’s algorithms and start building the foundations today…
An MIT student recently created a new type of massively distributed database, one that runs on graphical processors instead of CPUs. Mapd, as it has been called, makes use of the immense computational power available in off-the-shelf graphics cards that can be found in any laptop or PC. Mapd is especially suitable for real-time quering, data analysis, machine learning and data visualization. Mapd is probably only one of many databases that will try new hardware configurations to cater for specific application use cases.
Alternative approaches could focus on large sets of cheap mobile processors, Parallella processors, Raspberry PIs, etc. all stitched together. The idea would be to create massive processing clouds based on cheap specialized hardware that could beat traditional CPU Clouds both in price and performance at least for some specific use cases…