Monday, 31 August 2009

NetApp - Cloud of Fog?

Right, so firstly despite what it may look like this isn't a NetApp bashing blog site or month (in fact I'm grumpy with all vendors - and NetApp are a strategic supplier to me), but I'm afraid I'm a bit lost re NetApp's marketing and 'cloud' and really need to ask a public question :-
"Can anybody give me any detail and substance to NetApp's cloud strategy, technology and deployments?"
I ask this genuinely, as despite asking the company directly for over 2yrs, and despite regularly looking myself through their blogs and whitepapers, I'm finding it really hard to locate any information re NetApp and IaaS, PaaS or SaaS usage or technologies (direct or indirect).

Oh of course I've heard the stories re large company X using ### many filers for search or webmail, and other such things - but frankly these strike me as very much a standard use of a technology coupled with significant discounts linked to volume & name etc. Not specifically addressing the changed requirements in a cloud (thinking IaaS) environment (ie individual assets may have less availability req, object protocols often needed, much greater connectivity reqs etc).

What I'm looking for info on is :-
1) A technology & commercial model that is more aligned at a web 2.0/3.0 business, with vast data scale, objects & files, geographical distribution, policy based mngt, compression & dedupe of data, adequate performance, adequate availability of a physical asset but v high availability of the information, and at a 7yr TCO price point substantially less than today's post discount per TB price

2) Any info re an object store capability (although Val Bercovici has already said he's not prepared to pre-announce anything right now)

3) Any info re a direct to market SaaS offering?

4) Any details, info or case studies re genuine cloud companies or web 2.0/3.0 companies that are using or planning to use NetApp in decent sized (ie multi-PB) cloud deployments
I'm wanting to like their 'cloud' technology but frankly speaking at the moment trying to find out about it is like trying to knit fog (sorry couldn't resist)

If any NetApp'er or anybody wants to point me towards some substance and fact then I'd be more than grateful :)

Thursday, 27 August 2009

Steve's IT Rants - Supplier Eraser

Steve Duplessie has a well honed knack of hitting the nail on the head in a style and language that informs, entertains & 'sticks' (if you've seen him present in person you'll know what I mean).

Again I find myself reading his blog at

I started to write a comment reply to his post, but felt it was more than a reply so would draft something here.

Some key points from Steve's blog :-
  • 34% of customers looking to reduce supplier qty
  • Isn't the vendor list being reduced organically through acquisition?
  • Is it good for the market as a whole? I'm not sure it is.
  • It seems it will stifle innovation in exchange for more consistent vendor relationships (consistent does not imply "good," necessarily).
Frankly I'm staggered that it's only 34% and not a much higher number - a lot of people (incorrectly IMHO) associate supplier reduction with cost savings.

Yes in the short-term there can be a reduction in the cost of purchase orders by reducing supplier qtys - with the biggest short-term gain coming from a move to a single supplier for a given area. Clearly 'value', 'quality' and 'sustainability' are often overlooked in such situations, similarly 'flexibility' is often missed and a 'cheap deal' suddenly becomes much more expensive when the full picture of lifetime costs & changes is considered. Again over time we have traditionally seen these 'savings' eroded as both supplier & customer get 'lazy & comfortable' together, with less emphasis on value.

As far as I can see such deals often also ignore the cost of migrating to the new terms, technologies or processes - often the 'adoption' costs can be considerable. Again people often think of deal durations as being a long time, but consider a 3yr deal - 6+ months taken to 'adopt' the deal after signing, and the renewal position will need to be considered at least 12 months before expiry to allow for the annual budget process to permit flexibility in the renewal outcome. So at best I'd say you get 12-18 months of stability (ie potential for 'cost savings') during a 36mth agreement.
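As a rough sketch of the arithmetic above: the function name is my own, and the 12-month 'slow adoption' case is an assumed figure used only to show where the lower end of the 12-18 month range could come from.

```python
def stable_months(term_months, adoption_months, renewal_lead_months):
    """Months left between finishing adoption and starting renewal work."""
    return max(0, term_months - adoption_months - renewal_lead_months)

# 3yr deal, 6 months to adopt, renewal considered 12 months before expiry
best_case = stable_months(36, 6, 12)

# Assumed slower adoption of 12 months eats further into the window
slow_case = stable_months(36, 12, 12)

print(best_case, slow_case)
```

Under these assumptions the 'stable' window is somewhere between a third and a half of the agreement, which is the period in which any claimed savings actually have to be delivered.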

So what do I think are some of the driving factors re supplier reduction :-
  • Reduction in resources & overheads in supply chain functions - thus forcing them to only wish to deal with fewer companies of a larger deal value.
  • The current 'in vogue' supply chain quality processes have a similar 'better fit' with companies of larger scale and with a larger / more regular financial exchange between customer & supplier
  • As a result of above, supply chain are often being rewarded or KPI'd with regards to supplier qty reduction - and personal rewards or KPIs always drive behaviours
  • Customer IT/IS depts are under hard pressure to do more, better, quicker with less - which equals less time & resource to accommodate diversity (product or suppliers)
  • Naturally there is also a major drive from the larger suppliers, with them offering big discounts on deals in order to generate any form of revenue and also elbow out competitors. These 'deals' are increasingly not just technology acquisition price, but are moving to multi-year deals covering tech, services, support & maint etc. It would appear such deals are often 'sold internally' as vendor displacement and are defensive in nature, with the vendor expecting later to cross sell into other areas (sometimes even just to act as a reseller to be able to book revenue), to recoup funds through services or simply to be able to remove a competitor.
Yes the supplier list is being reduced through acquisition, but I've yet to see this reduce the costs to the customer, nor does it always appear to be beneficial when the technology is killed off or vanishes for years.

Similarly I'm yet to be convinced that a reduction in competition is good for the customer or the industry - the customer loses due to less commercial pressure and less reason for suppliers to work with open standards, the industry loses through lack of innovation and enforced marriage acquisition issues.

My conclusion is that I think there is a lot of short term thinking (and savings benefit claiming) going on both in vendors (revenue now at cost of profit now & revenue in future), and customers (in terms of standards, commercials, leverage & choice) - with a considerable risk to all concerned for the mid-term.

Wednesday, 26 August 2009

NotApp or NetApp?

So after years of us asking and waiting it looks like NetApp have finally made a couple of pre-announcements :-
  1. OnTap v8
  2. Object storage
Now I've been a big fan of what NetApp did for storage re:-
  • Storage configuration ease & simplification
  • Single OS/firmware over all products
  • Consistent and compatible capabilities on each product (eg think replication)
  • Their work with Oracle on NFS
  • Use and publication of open storage & system benchmarks
However I also regularly raise concerns re NetApp over :-
  • They are just not a truly global player and struggle with dealing with global companies
  • Roadmap & futures disclosure - as much as I have issues with EMC (and I have many) they do technical contact, futures and strategy briefing much much better
  • OnTap GX - has been in the wings for years, and appears to have been a major drain on their dev resources
  • OnTap constraints not matching the increasing scale of the requirements and/or platforms - eg re aggregate max 16TB etc
  • Poor estate mngt tools - prior to Onaro acquisition these were woeful, and there is still a long way to go for NetApp native tech
  • Too frequent product changes, revisions & models - making interop a pain, and appearing to drive too many codebase versions
  • Poor interface and processes for RFEs (though I've yet to find a storage company that has any worth mentioning)
  • Poor acquisition history re choices & integration execution
  • Lastly, and most importantly, sadly over the last couple of years IMHO they have listened to their own hype too much, and as a consequence have lost touch with the real market prices and are unable to prove the value of their benefits. (Acting very similarly to EMC in the 90s and early 00s)
That all said I still like & recommend their technology! So, knowing above, what are my thoughts re their recent announcements :-

1) Ontap v8
  • DataMotion - looks very interesting, but the devil is in the capability, requirements & constraints details, can anybody provide these yet?
  • Pam-II cards are interesting, and a good way to get overall performance improvements without requiring lots of specific configurations, but value will depend on their € cost, and how to mitigate against the use of the onboard slots (thus reducing either disk loops or network interfaces)
  • NDDC - have read this 4 times and still think it's purely a Prof Srvs play wrapped in words, can anybody correct me with details?
  • I can't find a public document that compares v7.3 with v8.0 7-mode, so very tricky to talk about differences, anybody see a public doc?
  • The last time I saw public docs on Ontap v8.x the major features, benefits and improvements came in v8.1 rather than v8.0 - so I'm also rather keen to see what's being disclosed publicly re comparisons between v7.3, 8.0 & 8.1
  • The PDF published on the NetApp website re 8.0 7-mode makes lots of claims re 'lower TCO', 'increase productivity' etc but I can find nothing about a) what they are comparing to, b) what level of improvement and c) what proves the justification for these statements
2) Object storage
  • Fundamentally this is good news
  • mid/end 2010 will be too late, if it's not Q4 '09 then the momentum will be elsewhere
  • In the object space, the model cares & relies much less on the 'tin' thus the 'OnTap Object' technology will need to exist in a software 'virtual appliance' that can run on commodity hardware (look at the great things done by Caringo in this area)
  • Price point - similarly object storage is expected to have a materially lower price point than SAN or NAS (think DAS price point), very unclear how NetApp will be able to achieve this given their current pricing models
  • There is a lot more to object storage than simply being able to use a REST or SOAP API to R/W objects - look at how long it's taking EMC to get Atmos into shape (and there are some mighty minds on that project)
  • API - if it's not both XAM and EC2 compatible, then frankly don't bother. So when will the API details be published?
So for me these announcements are nice but nothing more than that, until the details are made public and the code goes GD (rather than GA). Of course what I really want to see is a TCO model comparing the before and after pictures :)

[Disclaimer] I am a NetApp customer and have over a PB of their disk platforms and have access to NDA information (that I will not disclose)

Saturday, 22 August 2009

Objects & Metadata

As usual Dave Graham brings up some interesting and worthwhile topics in his blog post here

Now being an ex database programmer ('ex' of anything being of course the very worst and most dangerous type), and of course a storage curmudgeon, I have a passion for the topic of metadata and data. And being somebody having to deal with PBs of object data I naturally have some concerns and views here...

Now normally I agree with Dave on a lot of things - but I have to say I much prefer my scallops to be seared and served on black pudding nice and simply, letting the quality of the flavours shine.

That said I have to agree re his view of being able to segment metadata & object storage models into two areas - but do think there is a place (almost essential IMHO) for both models in the future storage.

We've seen this area tackled by a number of existing technologies re CAS and object stores (Caringo CFS gateway onto Castor object layer is good example) - but are only just starting to see the key new elements test these, namely vast scale (think EBs), geo-dispersal/distribution/replication, low cost.

I do also think it's worth exploring some of the possible types / layers of metadata, for me this breaks into :-
  • System / Infrastructure metadata - the metadata mandated by the storage service subsystem for every application using the service and every object held within the storage service. System metadata is under the exclusive control of the storage service subsystem, although it can be referenced by applications & users. Examples include object ID, creation date, security, hash/checksum, storage service SLA attributes (resilience, performance etc).
  • Application metadata - This is the metadata associated with each object that is controlled and required by the application service(s) utilising the object. There may be multiple sets of application metadata for a single object, each only accessible by the approved application.
  • Object metadata - context & descriptive attributes, object history, related objects, optional user extensible metadata
I would expect all 3 types of metadata to be linked with every object, with at least the 'system metadata' always held locally with the object. The 'application metadata' & 'object metadata' may reside in the storage system, the storage service, the application or any combination. (In this context I refer to the storage system as an object store, and the storage service as being object store + metadata store)
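To make the three layers a little more concrete, here is a minimal sketch of how they might hang together. All class, field and application names are hypothetical, chosen for illustration, and not taken from any vendor's object store.

```python
import hashlib
from dataclasses import dataclass, field
from typing import Dict


@dataclass(frozen=True)
class SystemMetadata:
    """Owned by the storage service; applications may read but not change it."""
    object_id: str
    created: str      # creation date as an ISO string
    checksum: str
    sla: str          # e.g. a resilience / performance class


@dataclass
class StoredObject:
    system: SystemMetadata
    payload: bytes
    # One metadata set per approved application, keyed by application name
    application: Dict[str, Dict[str, str]] = field(default_factory=dict)
    # Context, descriptive and user-extensible attributes
    descriptive: Dict[str, str] = field(default_factory=dict)

    def app_metadata(self, app_name):
        """Return only the set belonging to the named application."""
        return self.application.get(app_name, {})


payload = b"hello"
obj = StoredObject(
    system=SystemMetadata(
        object_id="obj-001",
        created="2009-08-22",
        checksum=hashlib.sha256(payload).hexdigest(),
        sla="gold",
    ),
    payload=payload,
)
obj.application["webmail"] = {"folder": "inbox"}
obj.descriptive["title"] = "greeting"
```

The frozen dataclass is only a stand-in for 'exclusive control by the storage service': an application holding the object can read `obj.system.sla` but an attempt to assign to it fails.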

Some of the metadata relates to the application and infrastructure architecture (eg geo-location information re object distribution & replication) whilst some of the metadata are attribute fields used within the application itself.

Given the above, it should be clear that I certainly agree with an entry Dave made in his blog comments re :-
"interesting note on ownership to which I'd say that there has to be dual ownership, one from the system level (with immutable meta such as creation date, etc.) as well as mutable data (e.g. user generated meta). The meta db then needs to maintain and track 2 different levels. Policy can affect either, fwiw."
So some thoughts about where to locate metadata as it relates to the object :-
  • As referenced above, I believe 'System metadata' must always reside with the object as it is used by the storage service for mngt, manipulation and control of the object itself, and to ensure its resilience & availability.
  • As has been an issue with file-systems for some time, there is always an issue with fragmentation of the underlying persistency layer with vast size differences between objects and metadata when they are tightly coupled
  • As a result of needing to traverse the persistency layer to establish the metadata, there are performance issues associated with metadata embedded within the object layer - move the metadata to a record based system and performance & accessibility can increase dramatically
  • For certain classes of use (eg web 2 etc) it's often the metadata that is accessed, utilised & manipulated several orders of magnitude more often than the objects themselves, thus the above improvements in performance and accessibility of metadata (think SQL query etc) make major differences
  • Clearly if the metadata and objects are held separately the metadata can be delivered to applications without needing to send the objects, similarly the metadata can be distributed separately / in-advance of the object. Thus having major advantages for application scaling and geo-distribution.
  • With the split of persistency location / methods this also allows for security layers to be handled differently for the metadata and the object.
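A minimal sketch of the 'metadata in a record based system' idea from the points above. The dict standing in for the object layer, the table layout and all names are assumptions for illustration only; the point is simply that a metadata query never has to touch the (potentially huge) object payloads.

```python
import sqlite3

# Object layer: a dict standing in for a bulk object store (assumption)
object_store = {
    "obj-001": b"x" * 5_000_000,
    "obj-002": b"y" * 10,
}

# Metadata layer: a record-based store, here an in-memory SQLite database
meta = sqlite3.connect(":memory:")
meta.execute(
    "CREATE TABLE metadata (object_id TEXT, attr_name TEXT, attr_value TEXT)"
)
meta.executemany(
    "INSERT INTO metadata VALUES (?, ?, ?)",
    [
        ("obj-001", "type", "video"),
        ("obj-001", "region", "eu"),
        ("obj-002", "type", "thumbnail"),
        ("obj-002", "region", "eu"),
    ],
)


def find_ids(attr_name, attr_value):
    """Answer a metadata query without reading any object payload."""
    rows = meta.execute(
        "SELECT object_id FROM metadata "
        "WHERE attr_name = ? AND attr_value = ? ORDER BY object_id",
        (attr_name, attr_value),
    )
    return [row[0] for row in rows]


videos = find_ids("type", "video")
in_eu = find_ids("region", "eu")
```

Because the two layers are separate, the metadata table could also be replicated to another site ahead of the objects, or protected with a different security model, without changing the object layer at all.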
This also brings into line a question area I've been working with for over 2 years with object stores - in that what features & functions should live in the application layer and what features and functions should live within the infrastructure (storage service) layer. What areas of metadata are actually data information in their own right, or embedded in the application logic, there appears to be no clear rules or guidelines.

If you like, this could be seen as an argument between IaaS & PaaS - and for sure the only sensible answer for a company right now is IaaS, as PaaS exposes far more of the logic, taxonomy, behaviours, trends and metadata layers to the PaaS provider than is healthy.

There is also an additional interest point re metadata - as we move from the System metadata into the Application & Object metadata, should we consider privacy and encryption of the metadata itself? (assuming that the objects will always be protected appropriately) I could see how this will be a requirement in some multi-tenancy environments and for some metadata elements...

Lastly some more questions :-
  1. How do you cover the topics of backup/recovery of the various metadata elements?
  2. How do you cope with bulk import / export of the various levels of metadata and their logical / context relationships?
  3. What standards will emerge for metadata schema definitions and attributes?
  4. What standards will emerge for policy script language & descriptors that manipulate within the storage systems? (think how to describe an SLA in a programmatic language)
  5. Can security authorisation & permission tokens exist and be enforced in a separate context and control domain to the identities?
Naturally in this we're not covering any of the 'internal' metadata used by the storage system to locate objects, to handle the multiple instances of the same object within a 'object storage service' (resilience, replication etc), to enable sharding / RSE encoding of objects etc that the storage system has to cope with.

Now I'm off for some lunch, I'm hungry and fancy some seafood for some reason ;)




Sunday, 16 August 2009

Storage Security

In this post I'm going to discuss a small area of storage security, specifically the privacy side of the security coin, more specifically array data erase. Right, so in my view security is a very dangerous area for people to wade into, especially storage people.

However when we do dare to wade into this area my feeling is that the storage people often either :-
  1. Totally ignore the topic and hope (the religious Ostrich strategy)
  2. Simply don't understand the topic, and have no idea how to assimilate the vast myriad of actual or hype 'requirements' that impact storage from a security aspect, or frankly don't know who to trust in this area (other than @beaker obviously)
  3. People often select and place security technologies in the wrong areas in the mis-belief that this will help them
Security is always an expensive topic in terms of investment, process and discipline - and generally I'd argue that often a bad security design, or technology, is actually worse (& more expensive) than no security.

However one interesting security technology that I do think has a use in the storage array area is the TCG SSC Opal standards work, which should offer another option in the 'data erase' sector.

With it already often taking "well over a week" to securely erase a current 100TB array, just how long do you think it will take to secure erase a 2PB disk array using current methods, and at what cost? For those companies disposing of, or refreshing, 10s->100s of arrays every year, this is a major & expensive pain.
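For a feel of the scale, a back-of-envelope sketch. It assumes erase time grows linearly with capacity, treats "well over a week" as a 7-day lower bound, and takes 2PB as 2000TB; all three are my simplifying assumptions, not measured figures.

```python
def erase_days(capacity_tb, baseline_tb=100, baseline_days=7):
    """Scale erase time linearly with capacity (simplifying assumption)."""
    return capacity_tb / baseline_tb * baseline_days


days_2pb = erase_days(2000)   # 2PB taken as 2000TB
weeks_2pb = days_2pb / 7

print(days_2pb, weeks_2pb)
```

Even as a lower bound that is months of floor space, power and cooling for an array that is no longer doing any useful work.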

My understanding of one element of TCG SSC Opal is that each individual disk interface uses standards-based encryption techniques and methodologies (AES-128 or AES-256) to encrypt all data stored on that disk. Further this supports multiple storage ranges with each having its own authentication and encryption key. The range start, range length, read/write locks as well as the user read/write access control for each range are configurable by the administrator. Thus to 'erase' the data only the keys need to be revoked and destroyed within the drive.
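The 'erase by destroying the key' idea can be illustrated with a toy model. To be very clear about the assumptions: this is not the Opal command set, the class and method names are invented, and the cipher below is a SHA-256 based keystream used as a stand-in for the drive's AES engine so that the sketch needs nothing beyond the standard library. It should not be used to protect real data.

```python
import hashlib
import os


def _keystream_xor(key, data):
    """Toy stream cipher: XOR data with SHA-256(key || offset) blocks.
    Stand-in for the drive's AES engine; NOT real AES, not for real use."""
    out = bytearray()
    for offset in range(0, len(data), 32):
        block = hashlib.sha256(key + offset.to_bytes(8, "big")).digest()
        chunk = data[offset:offset + 32]
        out.extend(b ^ k for b, k in zip(chunk, block))
    return bytes(out)


class ToyDriveRange:
    """One storage range with its own encryption key (hypothetical model)."""

    def __init__(self):
        self._key = os.urandom(32)
        self._media = b""   # what is physically stored: ciphertext only

    def write(self, plaintext):
        self._media = _keystream_xor(self._key, plaintext)

    def read(self):
        if self._key is None:
            raise PermissionError("key destroyed: range is cryptographically erased")
        return _keystream_xor(self._key, self._media)

    def crypto_erase(self):
        """Destroy the key; the ciphertext remains but can no longer be read."""
        self._key = None


drive_range = ToyDriveRange()
drive_range.write(b"customer records")
before = drive_range.read()
drive_range.crypto_erase()
```

The key point is that `crypto_erase` does the same small amount of work whether the range holds 16 bytes or 2TB, which is why this approach sidesteps the time-to-erase problem.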

Problems addressed
  • Failed / failing disks - Allowing data on failing disks to be 'erased' rapidly as part of the disk swap process.
  • Technology refresh & Array disposal - clearly before an array and its disks can be exited from a company the data on disks must be rendered inaccessible, incurring considerable cost and time. Sometimes this results in physical destruction of the array and disks, preventing any possible credit / resale value.
  • Array relocation - increasingly it's a requirement to secure erase an array prior to moving it to an alternate location within the same company. Again incurring additional cost and time delays for the relocation.
  • Lack of standards - sure there's the U.S. Department of Defense 5220.22-M specification document, but this isn't an international standard, and is open to interpretation.
  • Standards - this will provide an industry standard based method, against which vendors & technologies can be measured and operational practices audited against. Should also help reduce the FUD from technology sales in this area.
  • Scalability - unlike other 'in band' encryption technologies this solution scales linearly and independently with each disk in system, no bottlenecks or SPOFs introduced.
  • Time to erase - now this is a major pain as array capacities grow, particularly in migrations where the data on the array must be securely erased prior to array decommissioning. Hence extending the duration that power/cooling is needed etc. Anything that improves this timeline on-site is a significant benefit.
  • Reliability - strange one I know, but a fair % of enterprises do not allow their failing disks to be returned to the manufacturer under support or warranty, preferring instead to physically destroy such disks and pay for new ones. Thus denying the manufacturer the RCA process and not contributing to the continual improvement process. If the data on the disk is useless (according to an agreed standard, and at no cost/time impact to customer) then these disks may now go back into the RMA processes to the benefit of all.
  • Security - by providing an easy to utilise technology the technology should see increased utilisation and hence an overall improvement in the security of data in this area.
  • Cost reduction - clearly anything that saves time reduces some elements of cost. But this should also make a fair dent in additional technology sales and professional services costs. Similarly, should also reduce the need to physically destroy (disks & arrays) during refresh projects, and thus expand a resale market / opportunity.
The questions I want to know answers to though are :-
  • Why haven't I been hearing about this from array manufacturers in their 'long range' roadmaps?
  • When will we see the first array manufacturer supporting this? (the disk manufacturers are already doing so and shipping products)
  • What will be the cost uplift for this technology in disk & array?
  • When will the first customers mandate this in their storage RFx requirements?
Clearly this is only going to help for 'data at rest', on newer disk drive models & arrays, and not for portable media etc - but it's a step in the right direction.

Yes there is a need for more 'intelligence' in the disk drive firmware to ensure that latency & throughput levels are maintained. Yes there is work on the array needed for mngt control interfaces and KMS relationships etc. But I want to know more and get answers to my questions above :)

Some links for further reading on TCG SSC Opal :-

Now I'm no hippy but...

Whilst wandering through TED videos the other day on my 'Mother TED' application on my Android HTC Magic I found this video from 2005 that I'd not seen before, with Bono making his requests for the 3 wishes that TED granted him :-

Yes it's 27mins long- but spare the time, it's worth it and at least we're able to watch it...

The thing that startled, depressed and really bothered me is how relevant all of the points Bono makes are, and that they are all still (sadly) valid :(

I understand from here that there was some good reasoning made in '05/'06 as to the technical challenges, and why the 3rd wish hasn't yet been granted. However 4 years is a very long time in technology, so I'm intrigued as to what is possible today and tomorrow.

So hopefully ignoring all the 'feel good' bull associated with corporate social responsibility, I've decided to build in a couple of my own questions into the meetings, presentations and pitches that get made to me by large vendors re technology that they want me to approve or purchase. Namely the following :-

1) I want the first content slide presented to me to be a statement from your company re its position, status, plan & contribution to the three wishes stated in Bono's video

2) I want the second content slide presented to me to detail your company's contribution to charities, help and aid, and specific projects as % of revenue and profit.

I used to work in a company where all our purchase request costs were reported internally in terms of the qty of products we needed to sell in order to generate the funds to pay for the request - I wonder how many IT people are able to do this today? I wonder how many would be brave enough to think of them in terms of lives in less fortunate countries?

As I'm prohibited from declaring the IT 'street' prices I'm aware of and then comparing them to aid impacts - I'll just leave it with the questions "is that storage price really good value?" and "why is data on an array invested in more than a life in Africa?"

Saturday, 8 August 2009

Hello Dave - don't get misty eyed

Just a brief post to say hello and welcome to the cloud area to somebody that I have a great amount of time & respect for, so :-
"hello Dave Graham it's great to hear you're taking a key role in the cloud infrastructure arena, please keep to your great ways so far and don't get misty eyed or foggy over matters :)"

To hear from the man himself see here :-

Infrastructure Conferences

Boring I know, but here's the list of infrastructure & storage events that (all being well) I should be attending in the coming months :-

  • CloudCamp London - next one London Sept 24, 2009 6pm to 10 pm
  • IP Expo - London Earls Court Oct 7-8 2009 (thanks StorageZilla)
  • StorageExpo - London Olympia, 14/15 Oct 2009 (one day only)
  • SNW Europe - Frankfurt, Oct 26/27/28 2009
  • EMC Intl Customer Council (invite only) - Prague Nov 2009
  • CloudExpo London
  • Cisco Networkers - Barcelona Jan 25-28 2010
  • Cebit 2010 - Hannover March 2-6 2010 (one day only)
  • VMWorld Europe 2010 - in Oct 2010 in Cannes (if it occurs)

Like most people I find the peer discussion the most useful, but add to that the ability to speak directly and candidly with the relevant empowered decision makers, and the events are a lot more than the 'jolly' some people think of them. Trust me you know my views on business travel by now, and I wouldn't be attending if I didn't think it valuable...

If you're attending any of the above then let me know, hell I might even ask you to buy me a beer! :) If you know of any other good events re data-centre infrastructure or cloud topics then let me know and I might attend and buy you a beer! :)

Friday, 7 August 2009

Video Time

No it's not a StorageRap or WheelCam style video, but rather a simple round up of some infrastructure related videos on the web that I've found interesting.

Firstly, Simon Wardley (Canonical – Ubuntu), talking a lot of common sense (something missing from the 'traditional' major IT vendors) on cloud computing - brilliant, entertaining & very accurate...

Some interesting videos showing the scale & nature of a number of data-centre infrastructure designs and the differences in their approaches :-

Google's Container DCs

Microsoft OS Cloud Windows Azure Data Center

Oracle's Austin Data Center

HP's POD & Next Gen DCs

As usual there definitely isn't a 'one size fits all' answer to the DC of the future, but I can certainly see the use of both 'factory', 'container' and 'traditional' DCs within any large enterprise going forward - will be very interesting to see how the tools, technologies, people & culture adapt to work with each & all of these plus of course the cloud IaaS provided 'virtual DCs' of the future...

Emulex E3S

Ok so the Emulex E3S technology is rather interesting...

Dave Graham started it all off here with these 2 blog posts :-

Chris Evans also raised points here :-

Emulex have now posted a site for information re this at :-

I raised some questions that I'm still not sure I've found the answers to :-

  1. How does the 'adapter' handle and treat mutable / changing blocks? (eg does it write new block object and retire the old or something more optimised (eg a mini delta block))
  2. What are the scale targets for the adapter (or groupings of adapters) re qty objects, capacity abstracted, latency, throughput & cache etc?
  3. What underlying cloud APIs are used? Are these 'pluggable / changeable'? And can multiple be used at once?
  4. What specific encryption & KMS system is used?
  5. How does the adapter work with authentication, authorisation & accounting / billing attributes that may need to be handled re cloud storage?
  6. What policy mngt framework is used to control the behaviours of the adapters? and how does this relate / compete / cooperate with other policy frameworks (eg Atmos's or Symm FAST etc)?
  7. Does the adapter maintain the checksum history of the cloud objects written in order to validate that their retrieval matches? if so where is this data stored and how is it protected / made resilient?
  8. What's the target price point?
  9. What's the availability date?
  10. Who will be the first array manufacturer to include this (or similar) as a BE card? and how will that affect technology capability licensing within that array?
Points added Aug 9th :-
  1. Could this be made to work with other block formats / devices (eg tape emulation)?
  2. Is this partnering with any other object storage formats (eg XAM, Caringo Castor, EMC Centera etc)?

The Power of Bod

So this week my good friend StorageBod shone the lighthouse beam on this said quiet little internet hovel of mine, and suddenly I get a bunch of comments on the blog from people I know and respect - not only do I feel a little less grumpy today, and a bit more pressure to live up to the expectations, but I've also got further respect for the media position Martin holds :)

Of course he did this whilst I was working 18hr days in Istanbul, Turkey reviewing various storage proposals from the usual culprits, and also spending far too much time with lawyers that make me look like a smiling munchkin. All of which means I haven't had time to draft & publish any proper content yet...

So with that in mind I guess it's time for me to update this blog and put a little bit of storage related content into it...