Xenomorph Blog

Posts categorized "Database Technology"

Cloudy definitions

Given that I am English and can tend to start many personal introductions with a short conversation about the weather (generally either "awful" or "not bad for this time of year"...), then maybe I should be very receptive to the use of weather-related expressions in technology such as the "cloud". Maybe not, however, since the "cloud" and "cloud computing" have reached that zenith of marketing hype when everyone is talking about a new technology regardless of whether they are sure what it actually is (or might be, or could become...).

Anyway, I finally swallowed my cynicism and on Thursday morning went along to "Migrating Business to the Cloud", an event by Microsoft hosted at Bafta (small venue where the UK deals out its equivalent (?) of the Oscars). The master of ceremonies was Mark Taylor of Microsoft, who gave a general introduction to what Microsoft are doing in the "cloud", and of particular note he described the four types of computing scenarios where cloud computing can optimally be applied:

  • Predictable Bursting - where computing needs come and go in predictable waves of usage/demand
  • Growing Fast - where computing needs are rising exponentially like in a successful internet start-up
  • Unpredictable Bursting - where computing demand comes in unpredictable bursts, such as that associated with say usage of a backup computer centre in disaster recovery
  • On and Off - where you might run a process once a month or at an interval you decide

The above definitions seem ok to me but there is (probably understandably) some overlap in usage cases. The "Growing Fast" case for start-ups is interesting and more of that later.

Mark handed over to David Chappell who gave his perspective on cloud platforms as they are today in the market. David was a very entertaining and knowledgeable speaker, despite wearing a dodgy suit (what happened to those trousers?!) and having a peculiar wide foot stance when speaking. Anyway I digress, on to what he said. David started by saying what the "Cloud" is comprised of:

  • Cloud Applications - basically this is Software as a Service (SaaS) and some current examples of this would be Salesforce.com CRM, Microsoft Exchange Online and Google Apps.
  • Cloud Platforms - a platform for developing cloud applications, with the following characteristics that it:
    • is aimed at developers for creating and running cloud applications, not end consumers
    • provides self-service access to computing resources
    • allows very granular, on-demand allocation of computing resources
    • charges for the consumption of computing resources in a very granular manner

David then explained that due to its ambiguity he disliked the usage of the term "Private Cloud" in the ongoing debate about publicly available cloud services (such as those provided by Amazon, Microsoft and Google) vs. private clouds deployed within private institutions. David said the main difference was that private clouds do not have the economics of public clouds (i.e. pay for what you use only when you need it). That point seemed straightforward, however I would have thought that with a large global organisation with many different departmental computing demands the economics of a private cloud would be similar to a public one.

David then went on to explain that there are two kinds of Cloud Platform:

  • Infrastructure as a Service (IaaS) - this is a cloud platform that provides a developer with a virtual machine (VM) that has (almost) full access within it; put another way the development environment gives the developer total control but with that control comes responsibility.
  • Platform as a Service (PaaS) - this is a cloud platform that runs an application that a developer has created; it is easy to use but has limited control for the developer.

David put forward that there have been only five major software technology platforms over the past 50 years:

  • Mainframe
  • Mini-Computer
  • PC
  • PC-based Server
  • Mobile

He perceives that the Cloud is the 6th major software technology platform, and as such he is extremely enthusiastic about the opportunity and benefits that this presents to the whole of the software industry and its consumers.

David categorised Microsoft's cloud platform as (mostly) PaaS, which had three main components:

  • Windows Azure - the environment for running cloud applications within the platform
  • SQL Azure - relational storage within the platform
  • Windows Azure Platform AppFabric – (David noted the long name and sympathised with trying to name things sensibly) this provides and manages the infrastructure within the platform

He then moved on to describe the main usage scenarios for Windows Azure, for applications that:

  • need massive scale, such as Web 2.0 applications
  • need high reliability
  • have highly variable loading
  • have short or unpredictable lifetimes
  • need parallel processing
  • will either fail fast or scale fast
  • do not fit easily in a single organisation's data centre, such as a joint venture
  • need external storage

David said that in the fail quickly or scale quickly scenario, this was squarely aimed at technology start-ups where using Cloud technologies would effectively increase the frequency at which new ideas could be tried out at less economic cost if they go wrong, but are ready to scale massively if they become the new "Facebook" - so much so that many of the VCs in Silicon Valley are now insisting that start-ups use cloud technology as a condition of funding.

Amazon's Elastic Compute Cloud (Amazon EC2) was the first major commercial cloud platform, and David categorised this as IaaS, where effectively you get a Virtual Machine (VM) environment that provides a lot of control but requires more effort to manage than a PaaS such as Azure.

David said that he was surprised that the Google App Engine, which has Python and now Java as its programming languages, did not come with any traditional relational storage (unlike most other cloud platforms) but on speaking with Google he found that the storage engine and the whole platform is again designed primarily for Web 2.0 apps and as such storage usage was more about retrieving photos, video etc and less about querying across many records.

David was very complimentary about the cloud platform from Salesforce.com called Force.com. He said that the sales pitch from Salesforce.com would be straight to business users, effectively saying that they could build scaleable, resilient applications without involving the IT department and without needing programming expertise. He asked the audience if anyone had used these tools and a few folks confirmed that they were extremely impressed by what the platform offered.

Bob Muglia (President, Server and Business Tools, Microsoft) then gave a quick talk on Microsoft's plans for Azure. He mentioned how Microsoft's new search engine, Bing, was based on several hundred thousand servers running in Azure, but only had a handful of operating staff in contrast with the usual economics (taken from Gartner) that usually 1 operations person was needed for every 50 servers. He emphasised that Microsoft was committed to the further development of "on premises" operating systems but that Microsoft was totally committed to cloud computing, its development and its support.
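To get a feel for what that Gartner ratio implies, here is a rough back-of-envelope sketch in Python. The function name and the figure of 200,000 servers are purely my own illustrative assumptions (Bob only said "several hundred thousand"); the one-person-per-50-servers ratio is the only number taken from the talk.

```python
import math


def ops_staff_needed(servers: int, servers_per_person: int = 50) -> int:
    """Operations headcount implied by a fixed servers-per-person ratio, rounded up."""
    return math.ceil(servers / servers_per_person)


# 200,000 is a hypothetical stand-in for "several hundred thousand" servers
assumed_servers = 200_000
print(ops_staff_needed(assumed_servers))
```

Under those assumptions the traditional ratio would imply operations staff numbering in the thousands, which is the contrast with "a handful" that Bob was drawing.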

He said that some of the tools found in the Microsoft technology suite, such as SQL Reporting Services, are not yet available in the cloud on Azure/SQL Azure (due end of year though) - he said that he hoped that people understood that re-engineering an existing application for the cloud sometimes took time to ensure the scalability and reliability demanded when providing the functionality through the cloud. The vision put forward by Bob for development of cloud applications seemed very compelling, with Microsoft aiming to make things such as enabling resilience for a globally available cloud application as simple as ticking a check-box in Microsoft Visual Studio. He put forward that the major barrier to cloud adoption was the human aspect of trust in moving applications "off premises". He said that he saw a fundamental shift across all industries to cloud development and deployment, but added there may be some areas such as government and finance where this process takes a lot longer.

The event then switched to presentations by EasyJet, RiskMetrics and SeeTheDifference. The head of IT at EasyJet gave his pitch first. His department get an annual budget of 0.75% (small?) of turnover of £2.5bn (larger, so translating to £18.75m) and has around 60 people. He presented how EasyJet has taken an incremental approach to the adoption of cloud computing, utilising both "on-premises" and cloud ("off-premises") technology together (exposing end points of applications into the cloud at first). He advised this approach since it:

  • was a smaller step than full-blown adoption
  • was lower risk
  • demonstrated big value in a short time-frame
  • leveraged the rich functionality available in Azure
  • accelerated acceptance of cloud technology

Dr Rob Fraser of RiskMetrics was next up. He explained that whilst Moore's Law says that computing power doubles every 18 months, the calculations needed for risk management have doubled every six months. This has driven the need for parallel computing to meet this calculation need: RiskMetrics' RiskBurst service uses around 2,500 64-bit Opteron cores in their data centre but combines this with use of Azure to meet the peaks in calculation needed during each day (the similarities with power consumption management were pretty apparent). He said that average CPU consumption was around 18% of peak, hence a combination of both on and off premises compute power was a good solution for them. He mentioned that the management of this hybrid combination of technologies, and in particular being able to show real-time billing for it, was a key area of investment for RiskMetrics.
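To put rough numbers on Rob's comparison, here is a small Python sketch of the two doubling rates. The function name and the three-year horizon are my own illustrative choices, not anything RiskMetrics presented; only the 18-month and six-month doubling periods come from the talk.

```python
def growth_factor(months: float, doubling_period_months: float) -> float:
    """Multiple by which a quantity grows if it doubles once every period."""
    return 2 ** (months / doubling_period_months)


horizon = 36  # months; an illustrative horizon, not a figure from the talk

compute = growth_factor(horizon, 18)  # computing power, doubling every 18 months
demand = growth_factor(horizon, 6)    # risk calculations, doubling every 6 months
gap = demand / compute                # how far demand outruns single-machine gains

print(f"compute x{compute:.0f}, demand x{demand:.0f}, shortfall x{gap:.0f}")
```

Over three years the sketch gives a fourfold gain in computing power against a 64-fold rise in calculations, so the remaining factor of 16 has to come from somewhere else, which is the argument for parallel and burst capacity.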

The final presentation was by SeeTheDifference. The main point of this presentation was that this charitable organisation had zero permanent staff involved in IT, but regardless was able to deliver a very professional, reliable and scaleable website using external consultants to build on Azure.

The final section of the morning was a roundtable discussion with questions from the audience. The EasyJet guy said that the human mindset was key to the adoption of cloud computing. What keeps him awake at night is the thought of what would happen, and how attitudes would change, if any of the cloud infrastructure failed - so far it has experienced 100% up time. Rob of RiskMetrics was concerned about the stability of the platform, trying to ensure that any changes introduced do not damage reliability. He added that he disagreed with Bob Muglia and thought that financial institutions would adopt public clouds quickly – he cited their experience of their revenues now being 90% based on service provision, not on-premises applications. David said that he took some of the comments from Bob to indicate that Microsoft would also offer more of a pure VM (IaaS) soon in addition to the PaaS approach of Azure. David said that trust was the major issue in cloud adoption and he advised an incremental approach so "get your feet wet" then build from there.

On the whole the presentations were good and my knowledge of cloud technology has improved a bit - certainly it is fantastically appealing to develop globally available applications with no scaling, no resilience or data replication issues - it sounds too good to be true which generally means it is, so I guess there is much more work to do in gaining trust and acceptance for this technology. So my (pragmatic?) cynicism remains - but cloudy days are certainly coming and for a change maybe this is something to very much look forward to.

 

Posted by Brian Sentance | 17 May 2010 | 9:37 am


More CEP Events

Sybase have acquired Aleri according to Finextra. It was less than a year ago when the complex event processing (“CEP”) vendors Aleri and Coral8 announced their merger (see press release); there was also a big buzz when Sybase announced a CEP capability based on Coral8 and Streambase decided to offer an Amnesty Program for Aleri-Coral8 Customers (see earlier post 'Merging in public is difficult...'). And only a few months later, Microsoft announced that their CEP Orinoco (now integrated with SQL Server 2008 as StreamInsight) was heading to market (see post 'Microsoft CEP Surfaces as "Orinoco"').

Another sign that CEP is moving more mainstream and that real-time everything is becoming more important? Or a good market for acquisitions?

Posted by Sara Verri | 4 February 2010 | 6:00 pm


Heavyweight Data Management...

...I am very concerned that I have previously missed an important requirement for data management solutions - a heavyweight one judging by this great discussion on one of the Microsoft forums.

Posted by Brian Sentance | 17 July 2009 | 8:17 am


Microsoft CEP Surfaces as "Orinoco"

Seems like Microsoft have now gone public on the Microsoft TechEd site that they have a Complex Event Processing (CEP) engine that will be coming to market shortly (see MagmaSystems blog post). One of my colleagues Mark Woodgate attended a briefing event at Microsoft for this technology back in February this year - here's an extract from some internal notes that Mark made back then:

"Microsoft CEP is very similar to StreamBase conceptually (and not unsurprisingly), in the sense that there are adapters and streams and how you merge and split them via some kind of query language is the same. However, StreamBase uses the StreamSQL which as we have seen is SQL-like in syntax but Microsoft CEP uses LINQ and .NET and although conceptually it is doing the same thing, it does not look the same. StreamBase’s argument was you can be an SQL programmer to use it and don’t need lower-level like .NET; however, it’s not SQL really as it has all these ‘extensions’ you have to learn so using .NET might look more tricky but in fact it makes sense. They don’t have a sexy GUI yet for designing CEP applications like StreamBase but it will be done in Visual Studio 2008.

 

Currently, you build various assemblies (I/O adapters, queries and functions) and then bolt them all together, called ‘binding’ by command line tool. You then deploy the application onto one or more machines using another tool so it’s a manual process right now. They are aware this needs to be made easier and more visual. They are allowing other libraries to be bolted in via the various SDKs so it’s pretty open and flexible. It works well with HPC and clusters/grids (or so they say) and of course can be used with SQL Server. The CEP engine also has a web interface based on SOAP so at least non-Windows based systems can talk to it"

 

The release of this technology will be an interesting addition to the CEP market and to the Microsoft technology stack in general. Assuming performance is at credible levels (i.e. not necessarily leading but not appalling either) it will certainly bring both technical and commercial pressure to bear on existing CEP vendors (see earlier post on Aleri/Coral8) and has the potential to broaden the usage of CEP. Obviously Linux-Lovers (sorry, I didn't mean to be personal...) will not agree with this, but Microsoft is putting together an interesting stack of technology when you see this CEP engine, Microsoft HPC and Microsoft Velocity coming together under .NET.

 

Posted by Brian Sentance | 14 May 2009 | 5:13 pm


High Performance Spreadsheets

Another article about the operational risk generated by the usage of spreadsheets within the financial markets (see earlier posts) appeared in the April issue of Waters Magazine.
 
The article highlights how widely spreadsheets are used within financial institutions and suggests that the current regulatory requirements for more transparency and ad-hoc risk management might push the proliferation of spreadsheets even further. The article also refers to the progress and improvements made by Microsoft in recent versions of Excel to increase the security of spreadsheets.
 
Xenomorph has worked closely with Microsoft on hosting its time series database within SQL Server 2008. The case study we have written together describes how SQL Server 2008 offers integration within Office Excel 2007 so that whilst the spreadsheet is still the end-user viewing tool, operational risk is reduced by engaging Excel 2007 as an analytics and reporting tool and not as a means of storing data.
 
Our TimeScape solution offers more than 700 easy to use add-in functions to Office Excel 2007 and we are currently working on the use of Excel Services, part of Microsoft Office SharePoint Server 2007, to further enhance the centralized approach to spreadsheets.
 
If you are interested in how Xenomorph solves the problem of spreadsheet management, then take a look at our (newly updated) website. Here we explain how to solve the problem and how Xenomorph Spreadsheet Inside technology can bring unstructured spreadsheet data and complex calculation within a centralized data management system, increasing transparency and reducing operational risk.

Posted by Brian Sentance | 8 April 2009 | 2:35 pm


CEP in 2009

Interesting predictions for complex event processing (CEP) in 2009 (click here for link) - sounds like some form of reality is appearing in this area of the market, accelerated by the current financial crisis. Entry of bigger players and usage of LINQ in CEP will be interesting too.

Posted by Brian Sentance | 25 January 2009 | 4:02 pm


Transparency for troubled times

I came across this pair of quotes on a Google search, bringing data management into the context of the current financial crisis:

"Where is the wisdom? Lost in the knowledge. Where is the knowledge? Lost in the information." - T.S. Eliot

"Where is the information? Lost in the data. Where is the data? Lost in the ******* database." - Joe Celko

Here's to hoping that wisdom is not in short supply at the moment...

Posted by Brian Sentance | 6 October 2008 | 7:00 am


Solid State Drives - the promise of a free lunch?

I read an interesting article a few weeks ago on the SQL Server Magazine web-site where the issue of Solid State Drives (SSD) and their potential to impact the future need to tune databases was being discussed.

The article raised the question that as SSD becomes more mainstream, and its capacity increases significantly, then could it eventually eliminate the need for database designers/administrators to have to optimise table structures to deliver acceptable levels of performance?

The argument used was along the lines that with SSD there's less traditional disk i/o going on (making reads a thousand times quicker than hard disks), so query performance levels may just be acceptable by virtue of the SSD memory delivering data quickly to the consumer process. This makes good sense, but also reminds me of previous technology advances in this area such as RAM disks and even paging files, which all promised such things but eventually needed cleverer system infrastructure around them to fulfil an overall business need.

That said, I have absolutely no doubt that SSD will make a significant impact on data storage access times (it has to). However, my guess is that it will just push the problem elsewhere. So, as much as we developers & technicians would like to think that it may deliver us a 'free lunch', I would suggest it's more likely to be a ‘free starter’ and that (sadly) we will continue to have much more work to do to produce the main course and dessert that will keep our customers happy and coming back for more... 

The original SQL Server Magazine article can be found at http://www.sqlmag.com/Articles/ArticleID/100181/100181.html

Posted by Chris Budgen | 24 September 2008 | 2:16 pm


Vhayu and Streambase - positioning clarified?

Partner announcement on Finextra with Vhayu and Streambase coming together:

http://www.finextra.com/fullpr.asp?id=21477

Defining what vendors mean by a "Data Management System" is difficult enough for clients, but in the area of the somewhat fuzzy technology definitions around automated trading it is interesting to see Streambase clarify their offering around CEP (and not database too, which was one of their first messages around bringing real-time and historic data together), and that Vhayu seems to be emphasising its tick database capabilities (and de-emphasising its original perception in the market as a CEP vendor).

Posted by Brian Sentance | 21 May 2008 | 11:29 am


Sun and MySQL - implications for Oracle/SQL Server

Interesting article on Sun's $1billion acquisition of MySQL and how it may affect Oracle and SQL Server:

http://www.sqlmag.com/Articles/ArticleID/98951/98951.html?Ad=1

Posted by Brian Sentance | 30 April 2008 | 12:02 pm


Streaming Blue Genes...

The supercomputer continues to make a come-back - just up on Finextra with TD Bank testing IBM's Blue Gene supercomputers to amalgamate and analyse real-time structured and unstructured data:

http://www.finextra.com/fullstory.asp?id=18293

Posted by Brian Sentance | 2 April 2008 | 4:11 pm


Time Series inside SQL Server

Case study of some of the work we have been doing with Microsoft on hosting our time series storage inside SQL Server has just gone up on their site at:

http://www.microsoft.com/casestudies/casestudy.aspx?casestudyid=4000001637

Posted by Brian Sentance | 19 March 2008 | 1:24 pm


IQ grows revenue for Sybase

Interesting article saying that Sybase IQ revenues were up 70% in 2007, and formed a very significant part of overall revenues:

http://news.yahoo.com/s/cmp/20080301/tc_cmp/206901052

Also mentions Mike Stonebraker with his column-based database start-up, Vertica, and how one of the senior IBM technologists puts forward that the full benefits at the back-end of higher performance are often not seen by the end user, and so the complexity involved in proprietary solutions outweighs the benefit. There is an element of truth in both: a standards-based approach is preferred by most institutions, but I think financial markets are a special case where back-end performance is transparent to the user.

Posted by Brian Sentance | 12 March 2008 | 6:37 pm


Are databases going green?

Just doing a bit of catching up with what is going on with Sybase IQ (1,000 terabyte benchmark sounds impressive) and came across the Wikipedia entry for this tech (http://en.wikipedia.org/wiki/Sybase_IQ) which mentions at the end that IQ's compression ability "...achieves a 90 percent reduction in CO2 emissions".

So now we have a column-oriented high performance database that is doing its bit to save the planet? I think I need to lie down for a bit and think how I can fit the Toyota Prius into our next marketing campaign...

Posted by Brian Sentance | 12 March 2008 | 12:54 pm


Contact us if you have comments. All rights reserved. Trademarks, copyright and legal. Whole site ©1995-2010 Xenomorph Software Ltd. Registered in England and Wales, Reg no: 03235432, Reg at: Waverly House, 7-12 Noel St, London, W1F 8GQ. VAT no: 672584016