Discussions ranging from enterprise technology, technology architecture, cloud infrastructure, storage, and data-centre infrastructure to TVRs, motorsport and sports in general. Expect general grumpiness, frequent rants, and plenty of complaints & challenges re vendor FUD and hype. Frequently heard shouting "show me the requirements, TCO & ROI"...
Sunday, 10 October 2010
Quick Thought/Rant - Storage Commodities
Now @chuckhollis makes some good comments here about storage being a commodity - something on which I have some views :)
Interestingly through the discussion Chuck still talks about the 'problem statement' from a storage technician perspective (with an introduction of diversity & complexity in the technology mix, which glosses over two major TCO cost elements - complexity & diversity of solution spaces) - until the closing section, which is where I agree with him.
You see, as far as I'm (and my CIO is) concerned storage is a commodity - however it's the physical storage that's currently a total commodity, the logical layer (software) still has a bit of a way to go to become a commodity. I think a lot of the sensible people have made the leap past caring about many of the internal storage service widgets & sprockets - and frankly, no longer care - 80% of storage products in data-centres are capable of supporting 80% of the requirements.
But I 100% agree that people around storage have to change - the processes, the eco-system, the value, the religion, the entrenched conservatism, the lack of transparency, the products, the "can't tell you in advance without a full PoC", the tools, the 'buy our magic beans' culture, the org structure, the "we're special, honest", the cost & value, the people, the sales model, the support model... All of this has to change in order to accept the fact that storage is now a commodity hygiene factor and no longer the king of the castle in the infrastructure space...
I just wish that more of the people involved would realise this!
(Oh and we'll leave the 'shiny new baubles' of mngt tool nirvana for an expensive (for anything re mngt tools is always 10x more expensive than you believe and always 0.3x the value you're told) rant another day...)
That's all for me for now - I'm still around building up a major backlog of rants, but current work (both volume and subject matter) prohibits me from posting much of it right now. Normal service should resume towards the end of the year.
Monday, 3 May 2010
Hey EMC? Rhubarb! I say Rhubarb!
Now contrary to popular belief I don't actually like calling out specific things or people, but I'm afraid I feel compelled to call out something. I shall do this using a recent CloudCamp by-law convention established by @swardley on Feb 8th 2010... For the publication of the "Savings from their IT Investments" press release I hereby shout a claim of "Rhubarb!" firmly in the face of EMC and their PR team.
Now to be clear my call of "Rhubarb!" refers mainly to the specific 3rd paragraph, namely :-
Optimizing performance and cost reduction for Oracle Database 11g deployments with EMC Symmetrix® V-Max™ or EMC CLARiiON® CX-4 networked storage systems and EMC FAST to automatically adjust storage tiering as Oracle workloads change. This results in up to 30 percent lower acquisition cost and up to 45 percent lower operating cost in hardware, power, cooling and management over a three year period.
Additionally @sakacc also makes reference to this in the 5th paragraph of his blog post here re :-
Oh – we also showed how using Fully Automated Storage Tiering, Solid State storage, and deduplication using Data Domain we could lower the acquisition cost by 30%, and the operating costs of Oracle 11g by 45% – while delivering equal or better performance.
I'd like to believe these papers were created by engineers with good intent and sound principles, and I can certainly vouch for Chad having those qualities, and it is positive to see such materials being published, however...
- The actual cost savings percentage values called out in the press release text
- The baseline that has been used for the comparison savings statements above
- Costs of components, technologies & software used - clearly showing the standard FoC and additional cost elements for each option (eg cost of FAST, cost of SSD etc)
- Any form of ROI and TCO models used to underpin the two statements in the document making reference to 'improving TCO'
- The list of assumptions and/or pre-requisites used in the models re savings (re facilities costs, FTE costs, frequency & duration of administration or change tasks, support & deployment operating model RACI etc)
- Savings for other alternatives (eg using thin-provisioning & wide-striping on the vmax rather than FAST, using a CX instead of a VMax, using both a CX & VMax etc)
- Impacts of other technologies (eg Oracle on NFS, Database compression, Flash Cache etc)
- Impacts of the use of two independent arrays and ASM "NORMAL REDUNDANCY" for the replication of data between the arrays (ie rather than the cost of SRDF), and then how the independent operation of FAST on each array may impact performance predictability under disk failure situations. (with different read & write IO profiles potentially driving the independent policy engines into different decisions)
- Some context definitions as to what capacity the paper regards as 'large' Oracle databases (eg 10TB? 50TB? 150TB? etc) (page 5 of pdf)
- How this is impacted by the capacity of the databases being handled by the storage structure?
- Impacts of the rate of database capacity growth on the proposed model (eg a rapidly expanding DB, when the DBs exceed the capacities of each tier etc)
- The RPO & RTO requirements & assumptions for these databases, and impacts / sensitivity of changing them
- The support operational SLA context for the database services (eg permissible time for response, resolution and return to normal operation, performance & risk after an incident)
- Any specific performance related numbers (eg actual response ms times required for SLA, throughput/sec, IOPs etc), and how they change (given we don't know much about the transaction being measured)
- No details to the NFR (performance, resiliency, capacity etc) impacts during the FAST migrations, or indeed how long the migrations took to complete each time
- A slight puzzle for me is the use of Raid 5 3+1 in the vMax, given all the previous statements from EMC re default preference for Raid 6
- On page 30 of the doc the SSD raid type is stated as R5 7+1, versus R5 3+1 for the other disk types (impacts on performance, risk & perf deg during rebuild?) but at the bottom of figure 25 & middle of figure 26 the screen-shots appear to show the SSD as being R5 3+1?
- I'd be interested in seeing how including the pool allowed to use the SSD would change the performance - given a lot of Oracle DB perf issues come from redo log bottlenecks
- The more likely use-case for me would be migration of storage within a single database's data structure (eg some of the database data-files active some inactive etc hence some on SSD, some on FC and some on SATA for the same DB)
- As a minor side point it doesn't mention the version of ASM being used
- Sadly as usual with EMC there are no actual details of the reference benchmarks being used for the workload simulation - really would be good if they published their suite of benchmark tools
- The paper makes reference to "we used an internal EMC performance analysis tool that shows a 'heat map' of the drive utilisation of the array back end" - why isn't this confidence validation view & tool available to all customers as part of all array type standard software? (after all at worst it would be a sales tool to help justify the purchase of the additional FAST software licences???)
Now I'm not saying the reports are wrong, but what I'm rather saying is that I think they are incomplete, and appear to have little or no direct linkage to the claims being made in their name by the PR teams and certainly don't give a full context picture. This additional context & information is needed for enterprises to get sufficient comfort in the technologies to perform their own benefit opportunity & impact assessments.
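To show why the baseline matters so much, here's a rough sketch. Every cost figure in it is invented for illustration and none of it comes from EMC's papers; only the 30% and 45% 'up to' values are taken from the press release text. It applies the same headline savings to two made-up baselines with an identical three-year total but a different capex/opex split.

```python
def three_year_saving(baseline_capex, baseline_opex_per_year,
                      capex_saving=0.30, opex_saving=0.45, years=3):
    """Return (baseline_tco, new_tco, overall_saving_fraction).

    capex_saving / opex_saving default to the 'up to' headline figures
    from the press release; every cost passed in is a made-up example.
    """
    baseline_tco = baseline_capex + baseline_opex_per_year * years
    new_tco = (baseline_capex * (1 - capex_saving)
               + baseline_opex_per_year * (1 - opex_saving) * years)
    return baseline_tco, new_tco, 1 - new_tco / baseline_tco


# Two hypothetical baselines, both with a three-year TCO of 1,000 (in k)
capex_heavy = three_year_saving(baseline_capex=700, baseline_opex_per_year=100)
opex_heavy = three_year_saving(baseline_capex=400, baseline_opex_per_year=200)

for label, (base, new, saving) in [("capex-heavy", capex_heavy),
                                   ("opex-heavy", opex_heavy)]:
    print(f"{label}: baseline {base}k -> {new:.0f}k, overall saving {saving:.1%}")
```

With these invented numbers the overall saving comes out at roughly 34.5% for the capex-heavy baseline and roughly 39% for the opex-heavy one, which is exactly why the baseline, assumptions and cost elements need to be published alongside the headline percentages.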
So marks out of 10? So far 6/10 with a caution for "unjustified PR marketing abuse" - but very willing to review the score upon someone pointing out what I may have missed, a revised draft, justification of claims & clarification...
Sunday, 25 April 2010
Feature stacks and the abuse of language
So the new financial year heralds the season of vendor conferences, and - as night follows day - over the horizon, like the four riders of the apocalypse, approaches the associated marketeering storm that always comes with such conferences.
Sadly one trend I'm seeing more of from the (increasingly desperate?) IT infrastructure industry is aspirational future feature stacking. Where endless features are announced haphazardly into the mix in an attempt to justify new revenue streams; naturally the delivery of these features is in a different year/decade to when they are announced, let alone when any actual benefit might be realised.
Of course the first challenge to this is trying to convince customers re the vital importance of features they haven't heard of before, often for problems they never knew they had. So some use fictional stories in order to try and paint a picture of utopia as a result of paying for their magic liquor, some just plaster the industry with noise, others use new abuses of marketing terms, some use all.
The benefits case is an interesting point in its own right - remember these are the vendors that often still haven't a clue about the TCO or ROI for their products several years after they were announced. Naturally there is little or no mention of the financial costs involved, ingress & egress disruption, organisation & technology process changes, operating model changes, and increasingly, the business process changes needed to use this fictional future widget function.
- 'Cloud' NIST has worked to a certain extent but IT companies have abused the hell out of it.
- 'Virtualisation' has some common understanding in the server world, but as usual the storage world is chaos.
- Now along wanders 'Federation' as the latest word to be put through the hype & definition mangler.
- The specific customer requirements & problems this addresses & justify how
- The use cases this feature / function applies to, and those that it doesn't
- Why & how this feature is different to that vendor's own previous method for solving this problem
- Provide clarity over the non-functional impacts of the feature before, during & after its use - ie impact on resilience, impact on performance, concurrency of usage etc (including providing up-front details of constraints)
- Provide the before & after context of the benefit position, clearly explain the price of the benefit change and any assumptions or prerequisites needed to use the feature
- Provide some form of baseline & target change objective for entire process steps impacted
- Confirm the technology costs and cost metric model for this feature
- Naturally you'll also expect me to require TCO & ROI of the feature, and any changes to the models as a result of this feature
If this sounds overtly negative that isn't the intent. The issue for me is that any 'nirvana function'© is normally only of use if it makes a net positive change to the cost of BAU service or change. In order to prove that we need to understand how it impacts the steps, effort & duration for each item in the transition from 'desire to delivery' (eg when somebody thinks they may need some capacity to when they are able to actually use this). From my experience this sequence involves a mix of commercial, technical, political, emotional & financial steps - similarly very few companies seem to be able to show the steps in this sequence and how their function changes them.
Now I'm very much one for focusing on capabilities and architectures rather than point widget features, but the current trend of announcing aspirations as architectures and then products is a very dangerous and steep curve downhill. Like an iced wedding cake made from cards built on a sandy beach - this obsession with feature stacking promises everything but benefit delivery regularly lasts for only a few minutes before collapsing in an ugly mess.
Are suppliers hoping that by increasingly frequently hyping the shiny shiny baubles of the progressively distant future they will distract us from the factual reality of today? Remember today was the future of yesterday, and how many of the past's 'nirvana functions'© promised by these same If only these vendors spent time & resources making the existing features usable, simplifying the stack, resolving the interop issues, giving clear context and being able to actually justify their claims, rather than building their own independent leaning towers of Pisa from which they can throw mud at each other...
Thursday, 15 April 2010
NotApp takes a byte of objects?
Now eager followers (it's legit to use plural as there are at least 2 of you!) will recall that I commented about NetApp and cloud storage last year here (NotApp or NetApp) and here (NetApp cloud or fog) - so of course I'm rather interested to follow-up and hear how @valb00 carries through with his statement on my blog comments from Aug 2009 of :-
"- Finally we come to the highly anticipated Object Storage question. Without pre-announcing anything, I will divulge that our solution will prove the value of Spinnaker’s scale-out excellence, particularly beyond NAS or even SAN/iSCSI configurations. Priorities of REST, XAM, SOAP and others are really interesting to us at the early (pre-standards) market phase"
I must admit to being a little disappointed by the announcement - much like @StorageBod, I had been allowed to gather the impression that they were much further along with their own internal object work. One assumption would be that what was being alluded to was a whole bag of empty :( Of course another possibility is that the internal work is going fine and this is a stand-alone additional product line?
Either way, the timing of NotApp & ByCost gives me a wry smile given the length of time between the 'object strategy' PR statements and actually starting to do something.... (let alone the GA/GD date of the final solution)
Fundamentally I still have the same questions plus naturally some additional new ones :)
Clearly there are some of the obvious questions :-
- How quickly they make this a native capability of Ontap and not just a standalone product or a bolt-on gateway? (frankly I'm not taking bets on anything earlier than the GD release of v8.3??)
- What pricing model & cost they sell the tech at - the object model will not stand NetApp's traditional COGs, let alone combined COGs of NotApp plus Bycost
- How NetApp intends to handle Bycast as a company? As let's face it, NotApp's acquisition history isn't exactly great, and their software dev trains are rather muddled and overly complex right now
- How will NetApp manage to hold on to the people & desire fuelling the drive and innovation at Bycast? Especially when they are faced with the monolithic wall of spaghetti code that OnTap must be by now...
- How much did NetApp pay for Bycast? and thus how much additional value do they need to return to their shareholders over what period of time?
- Would I have purchased from Bycast before? no. Would I now via NetApp? don't know - far too early to understand
- What's the product costs and the combined/revised TCO model?
- How will NetApp position this pure software only model, that allows for flexibility with hardware (eg server reuse, DAS pricing models, capex risk mitigation with repurposing etc), against their normal hardware + software model?
- When will they include compatibility for the AWS S3 object API standard? As this is most definitely the de facto standard that people are interested in right now..
- What will Netapp do for globally local deployment skills & support?
- "Is this too little too late?" @RandyBias asks here - interesting question! All depends on things like the API model, product cost, time to deliver real integration, where it fits in sales proposition, roadmap integration etc...
- How will NetApp build upon Bycast and what is their 18mth roadmap for the Bycast technology?
- What difference will being part of NetApp make to Bycast? and how will this improve their products and services?
- How will NotApp adapt the waffle maker to be able to efficiently cope with the metadata needed in an object platform?
- How does it relate to OnTap 8 distributed file-system mngt? is this helping in-fill minds, technology, issues in that space?
- Does NetApp have the suitable culture to be able to connect and deliver in this space? interesting... in the enterprise market for internal object stores - maybe... for the web 2.0 uber scale developer led object stores then no... They certainly are not the driving culture innovator they once were, just look how hard EMC have found this area with Atmush and the squillions of $s & good minds that they've poured into CIB & Atmush so far... (and that's not too bad a product - some material API standards issues but mainly internal culture, sales & cost issues...). Now given that NetApp are nowadays more like EMC in the 90s than any other company I've ever met (ie complacent, out of touch, expensive, slow to react, storage only player, rhubarb for ears etc - but interestingly still better NAS than the rest) - how on earth will NotApp's sales-force get their heads around selling something at much lower costs, higher value and address the margin cannibalisation directly?
- Will NetApp want to get into the IaaS/SaaS market directly by offering an object store service directly to compete with AWS & EMC etc? and if so how will they handle the 'competing with their own customer' bit?
- How will the competition react?
- Who will look to snap-up the other software only storage cloud players out there?
- Will NetApp now finally stop calling 'any shared bit of tin' a cloud and use the term with a bit more respect?
- Prior to this I was aware of them, but they weren't somebody I was actively engaged in discussions with
- I think it's good that they are working with the SNIA CDMI standards (I'm guessing this is where the acquisition discussions may have started from)
- David Slik's blog here seems to have plenty of good content in it
- Bycast certainly seem to have a bunch of happy customers so far
- The fact that it already supports multiple types & status of target media is very positive, as is the support of running under VMWare (and hence being hardware agnostic)
- The data on the website is rather light on specific numbers (volume, qty scale, performance etc) and details on policy mngt & metadata
- One thing that annoys me, is that to find any information out (documentation, technical, support etc) it would appear I have to register (and wait for an email of the document and for the inevitable sales droid to try and contact me) - big hint, you want me to look at your company & product? Make it easy! (especially when I tried it the system crashed with Siebel OnDemand errors all over the place)
- Clearly the devil is in the details, I'll wait to find out more over time
Wednesday, 14 April 2010
Large slices of pie do choke you!
Now anybody who's spent time working with me on my companies' global storage BOMs will understand that this is a major issue for me, and not something that is getting any easier. The issue is a complex one :-
- The €/Per GB ratio becomes more attractive the larger the capacity within an array (as the chassis, interfaces, controllers & software overheads get amortised over a larger capacity) - however of course the actual capex & opex costs continue to be very sizeable and tricky to explain (ie "why are we buying 32TB of disk for this 2TB database??")
- As the GB/drive ratio increases, the IOPS per individual drive stays relatively consistent - thus the IOPS/GB ratio is on a slow decline, and thus performance management is an ever more complex & visible topic
- IT mngt have been (incorrectly) conditioned by various consultants & manufacturers that 'capacity utilisation' is the key KPI (as opposed to the correct measure of "TCO per GB utilised")
- DC efficiency & floor-space density are driving greater spindles per disk shelf = more GB per shelf
- Arrays are designed to be changed physically in certain unit sizes, often 2 or 4 shelves at a time
- As spindle sizes wend their merry way up in capacity the minimum quantity of spindles doesn't get any less, thus the capacity steps get bigger
- Software licences are often either managed / controlled by the physical capacity installed in the array, or in some random unit of capacity licences key combination - these do not change re spindle sizes
- Naturally this additional capacity isn't 'equally usable' within the array - thus a classic approach has been to either 'short stroke' the spindles or to use the surplus for low IO activity. However in order to achieve this you either have to have good archiving and ILM, or need to invest in other (relatively sub-optimal to application ILM) technology licences such as FAST v2.
- Of course these sizes & capacities differ by vendor so trying to normalise BOM sizes between vendors becomes an art rather than science
- Inevitably it means that the entry level capacity of arrays is going up, and that the sensible upgrade steps are similarly going up in capacity.
- We are going to have to spend more time re-educating management that "TCO per GB utilised" is the correct measure
- Vendors are going to have to get much better at sizing software & functionality licensing so that it much more closely matches the unit of granularity required by the customer
- All elements of array deployment, configuration, management, performance and usage must be moved from physical (ie spindle size related) to logical constructs (ie independent of disk size)
- Of course SNIA could also do something actually useful for the customer (for a change), and set a standard for measuring and discussing storage capacities - not as hard as it might appear as most enterprises will already have some form of waterfall chart or layer model to navigate between 'marketing GB' through at least 5 layers to 'application data GB'
- Naturally the strong drive to shared infrastructure and enterprise procurement models (as opposed to 'per project based accounting') combined with internal service opex recharging within the enterprise estate will also help to make the costs appear linear to the business internal customer (but not the company as a whole)
- The real part though will be a vendor that combines a technical s/ware & h/ware architecture with a commercial licence & cost model that actually scales from small to large - and no I don't mean leasing or other financial jiggery pokery
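As a back-of-envelope illustration of two of the points above (the falling IOPS/GB ratio, and why "TCO per GB utilised" beats 'capacity utilisation' as a measure), here's a minimal sketch. All the drive sizes, IOPS figures, costs and utilisation levels are assumptions made up for the example, not vendor data.

```python
def iops_per_gb(drive_capacity_gb, iops_per_drive):
    """IOPS density of a single spindle."""
    return iops_per_drive / drive_capacity_gb


def tco_per_gb_utilised(total_tco, installed_gb, utilisation):
    """TCO divided by the capacity actually holding data.

    utilisation is the fraction (0, 1] of installed capacity in use.
    """
    if not 0 < utilisation <= 1:
        raise ValueError("utilisation must be in (0, 1]")
    return total_tco / (installed_gb * utilisation)


# Hypothetical drive generations: capacity grows, per-drive IOPS stays flat
ASSUMED_IOPS_PER_DRIVE = 180
for capacity_gb in (146, 300, 450, 600):
    density = iops_per_gb(capacity_gb, ASSUMED_IOPS_PER_DRIVE)
    print(f"{capacity_gb} GB drive: {density:.2f} IOPS/GB")

# Two hypothetical arrays: B looks worse on 'capacity utilisation' (50% vs 80%)
# but is cheaper per GB actually used.
array_a = tco_per_gb_utilised(total_tco=500_000, installed_gb=50_000, utilisation=0.80)
array_b = tco_per_gb_utilised(total_tco=300_000, installed_gb=60_000, utilisation=0.50)
print(f"A: {array_a:.2f} per GB utilised, B: {array_b:.2f} per GB utilised")
```

In this made-up case the array with the 'worse' utilisation figure is the cheaper one per GB utilised, which is the conversation that has to be had with management.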
Sunday, 6 December 2009
UK iSMF - The hidden jewel
by senior mngt for issues they didn't create. These men promptly
escaped from a maximum security site to the UK underground.
Today, still wanted by the vendors, they survive as admins of fortune.
can find them - maybe you can contact : The (storage)A-Team.
The best way to contact this shadowy group is via the "UK Independent Storage Management Forum" - with an online presence at http://www.linkedin.com/groupInvitation?groupID=166408
If you're in the UK, a storage customer, and want to meet like minded storage admin peeps then sign the NDA, request to join and get ready to contribute on a 2-3 times per year basis...
If you're a vendor or reseller then you certainly can't do sales / marketing pitches - but, if you're lucky, maybe you can engage with the team to review your product, give feedback and sort your RFE & roadmaps...
[Disclaimer - the forum used to have facilities & logistics funding (repairs to the team van, all the cigars & gold chains costs you know!) and assistance from EMC (I'm currently unsure) - but certainly no sales pitch or content control]
[Double disclaimer - my terrible time & diary mngt has meant I've been a very underground figure for too long in this forum]
Monday, 23 November 2009
Storage - LUN Sizing & Standards
- Do you have standard sizes for LUNs in your storage estate? If so :-
- What was the rationale behind having the standard?
- What is the actual sizing & layout standard?
- What was the rationale behind the actual size & layout chosen?
- Does it vary by storage type / location / product?
- Does it vary by application / dataset use?
- If you don't have standard LUN sizes :-
- Have you noticed any operational / support impacts or complexities?
- What benefits have you seen?
- How does your device layout & LUN sizing impact, or is impacted by :-
- Data replication strategies? (eg application, host agent, VM/FS, pathing, SAN, array etc)
- Procurement & purchasing models? (eg large pre-provisioned boxes, small boxes with chunk growth, ad-hoc project based etc)
- Do you envisage this situation changing in light of technologies such as :-
- Thin provisioning
- Wide striping
- Automated lun/sub-lun tiering (eg FAST, TSM etc)
- Space reclamation (eg ZPR etc)
- VM/FS improvements
- Some form of improved & useful SRM tools (yes I realise this is a big wish)
Would welcome thoughts and comments?
Friday, 30 October 2009
RFEs - My informal tracking list
Ok so further to my previous blog entry http://grumpystorage.blogspot.com/2009/10/rfe-for-rfes.html , and in the spirit of sharing, here's my draft list of 'non-NDA' RFEs :-
· Ability to terminate multiple VSANs onto a single array FE port directly without needing to use IVR and/or dedicated array FE ports per VSAN
· XML interface for customer retrieval of EOL/EOSL information, eg XML to complement :-
NetApp http://now.netapp.com/NOW/products/eoa/
EMC
"PowerLink -> Home > Support > Interoperability and Product Lifecycle Information > Release and End of Life Dates"
"PowerLink -> Home > Support > Interoperability and Product Lifecycle Information > Documentum and Former Legato Product Information"
· XML interface & client for 3rd party interop matrix status and updates, eg to complement :-
NetApp http://now.netapp.com/matrix/ http://now.netapp.com/NOW/products/interoperability/
EMC
"PowerLink -> Home > Support > Interoperability and Product Lifecycle Information > Interoperability Matrices"
HDS http://www.hds.com/corporate/resources.html
· API / XML driven interface for licenses :-
a) To be able to remotely programmatically determine each & every licensed feature installed / not installed within the device
b) To be able to remotely programmatically determine the license metric and its usage for each & every licensed feature installed / not installed within the device
· API / XML driven interface for environmental impact
a) To be able to remotely programmatically determine the real time energy consumption by the asset (power etc)
b) To be able to remotely programmatically determine the high, low & average energy consumption by the asset (power etc) over a given period
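To make the licence RFE a bit more concrete, here's a minimal sketch of what a customer-side check might look like. The XML payload is entirely hypothetical: as far as I know no vendor publishes this schema, and the element names, attribute names and values are invented purely for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical payload: element and attribute names are invented
# to illustrate the RFE, not taken from any real array or API.
SAMPLE_LICENCE_XML = """
<device serial="EXAMPLE-0001">
  <licence feature="replication" installed="true" metric="TB" entitled="50" used="32"/>
  <licence feature="thin-provisioning" installed="true" metric="TB" entitled="100" used="100"/>
  <licence feature="tiering" installed="false" metric="TB" entitled="0" used="0"/>
</device>
"""


def parse_licences(xml_text):
    """Return a list of dicts, one per licensed feature."""
    root = ET.fromstring(xml_text.strip())
    licences = []
    for node in root.findall("licence"):
        licences.append({
            "feature": node.get("feature"),
            "installed": node.get("installed") == "true",
            "metric": node.get("metric"),
            "entitled": float(node.get("entitled")),
            "used": float(node.get("used")),
        })
    return licences


def at_or_over_limit(licences):
    """Names of installed features whose usage has reached the entitlement."""
    return [lic["feature"] for lic in licences
            if lic["installed"] and lic["used"] >= lic["entitled"]]


licences = parse_licences(SAMPLE_LICENCE_XML)
print(at_or_over_limit(licences))
```

The same shape would presumably work for the environmental interface, with power readings in place of licence metrics.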
IBM
· Would be very interesting to use the SVC to perform Raid5/6 over multiple arrays - to remove the array enclosure as an SPOF in a data-centre
Microsoft
· SQL Server formally & fully supporting NAS for its database storage
· Formally & fully supporting 3rd party backup/recovery tools for its database products
Zimbra
· Support of object storage (Castor, S3, Atmos etc)
· zBackup to operate multi-threaded and support for backup / restore of 10TB database messagestores
Caringo
· Full support for running Castor as a VMWare guest image
· Castor AWS S3 compatible API option
· Castor support for Zimbra, Sharepoint & Exchange+Enterprise Vault
· TCG SSC Oasis support
EMC
· Centera usability & readability of config reports
· SRDF & Mirrorview replication inbuilt cross compatibility
· Ability for SRDF sync replication with delayed update at remote site
· Atmos support for Zimbra, Sharepoint & Exchange+Enterprise Vault
· Atmos support for an AWS S3 compatible API option
· Networker full support of Atmos as target
· TCG SSC Oasis support
· Greater qty of Ethernet ports on NAS devices
· EMC Powerlink to present at least the same deployment statistics and uptime information as NetApp NOW site
a) Enhancing the current information presented at :-
"PowerLink -> Home > Support > Interoperability and Product Lifecycle Information > Storage Target Revisions and Adoption Rates"
Release Date
Number of customer systems currently running this release
Number of customer sites currently running this release
Average run-time days per system
Qty of issues, by severity
c) To make the following additional information also available via an XML interface
Product major & minor (eg DMX & 4), Code Name, Code Rev, Release Date, # or % product running this release, # or % sites running this release, Average up-time, Target / Recommended Release
HDS
· To have an equivalent of EMC PowerLink or NetApp NOW for self-support and information access
· API reporting of hot-spare qtys correctly on USP range
· API alert reporting for service processor utilisation on AMS range
· Ability for sync replication with delayed update at remote site
· TCG SSC Oasis support
· Support for Thin-Provisioning in USP-1100
NetApp
· TCG SSC Oasis support
· Release the software only variant of OnTap as a commercial product (capacity limited if needs be)
· Greater qty of Ethernet ports on NAS devices
· Much larger aggregate & flexvol capacity sizes without having to upgrade to v8
· Support for OnTap v7.3 & v8 on older generation equipment and also for newer products (eg 2020 & 2050)
· Support for native OnTap 'in box' tiering of a file-system over multiple disk types (eg to support FAN)
Cisco
· 'terminal release' concept for SANos (check current name) - where support partners must converge upon a certain release variant within X months (to aid interop)
Twidroid
· To have a "eMail tweet" (inc URL to tweet) that integrates with your google mail account
· To have a "Reply All" option against each tweet
· TwidroidPro posts to be 'sent from' TwidroidPro rather than just 'Twidroid'
· To correctly support Android's 'select & hold' ability to correct default spell-check suggestions
TweetDeck
· eMail tweet to include URL to tweet
· Option to save state of currently 'in memory' tweets upon shutdown, and reload upon restart (I'd pay for this feature alone)
· UK spellchecker
· Ability to set frequency of refresh of searches (in similar fashion to DMs, Mentions etc)
Snarfer
· An update to code?
· Better support for database store to prevent / fix corruption