Oct 23

It’s not often the EMC markitecture spiel breaks from the well-worn rut. But Frank Slootman talking straight about the newly formed business unit at EMC is refreshing after years of EMC changing the deduplication party line, spinning vague technical slants on product lines, siloed sales teams pushing all kit in all places, and generally confusing customers about their capabilities with deduplication.

In 2007, the stories of ‘our deduplication is in development’ were trumped in 2008 by the Quantum partnership announcement, and once again by the Data Domain acquisition in 2009. Finally, someone has a leadership position and is potentially leading a breakdown of siloed product lines (I like to hope). Frank may not have a lot to lose, but regardless I like his style and hope it works out for the sake of a lot of good technologies and the people who build them.

Some key aspects I happen to agree with from the recent interview:

  1. ‘Let Disk Library be a VTL’ – Hallelujah. Franken-VTL will be no more. Bolting an inline deduplication device behind a high-performance VTL with 4 times the throughput just didn’t make sense. This will also clarify for existing CDL/EDL customers that there are 2 simple choices: for a high-speed non-deduplicated VTL, use the EDL/FalconSTOR; for deduplicated disk, use Data Domain.
  2. ‘Data Domain plays in the core, Avamar at the edge’- Absolutely. Both are fine products but have a different role and capability in the enterprise. Avamar can be beautiful for remote sites and relatively small backup payloads and supports a wide range of topologies for replication. Data Domain integrates with existing software infrastructure and scales in a completely different way (more suitable to the enterprise core).
  3. A hard line regarding Commvault – Agree. In fact, for a company that has grown into the enterprise market with a solid product known to actually work as advertised, I’m skeptical of the aggressive play to take down the target-based deduplication market with a software-only deduplication architecture. Sure, it works, but just because my car is drivable doesn’t put me in pole position for Indy. Where are the performance specifications and benchmarks, where are the design guidelines, and most importantly, how do you break the cardinal rule against mixing heavy I/O and compute workload on media servers and magically whip the technologies that have struggled to scale deduplication compute and metadata into the enterprise after 6-10 years of R&D and field experience? Also, why bilk customers who want to write to a non-Commvault disk device, when Symantec already tried that and royally enraged their customer base?
  4. An integrated line of business for backup solutions – Makes sense. When you see Avamar being positioned in the field for enterprise core backup, 3D3000s pushed in one account, EDLs in another, Networker’s solid development roadmap vs. perpetual field support challenges, and Data Protection Advisor being pimped like a utility but not strategically positioned, you really have to wonder: with all the guns blazing, is anyone really steering the boat here?

Deduplication Straight Talk from inside the Sausage Factory…IBM- time to take some notes on this one.

Oct 07

NetApp has been fairly quiet after the drama of the EMC/Data Domain acquisition died down. Quite a contrast to Quantum, who hammered the market with an incredible amount of positive PR about product enhancements and strategic positions to market.

NTAP Deduplication Rap Video

So instead of playing it straight to the technical crowd, NetApp is shifting focus to the hip-hop consumers of enterprise storage with a music video, a move I both admire and hope to emulate in my own job.

Arguably, NetApp has quite a bit to boast about when it comes to primary storage/NAS deduplication. They’ve built it, deployed it successfully, and are the lead player in the market. But as players go, they certainly don’t have as much play in the secondary/backup deduplication market. Their entry to market with deduplication was late, then they bid to take over Data Domain and lost the ‘cold-hard-cash’ war with EMC. For NetApp’s sake, I do hope for the best for them in gaining market share, now that the little giant (Data Domain) and the gorilla (EMC) are joining forces and, from some perspectives, making big plans to corner the market.

Jul 21

5 years ago, your average data center hosting 10,000 servers looked like a linoleum-covered football field, was wicked cold, yet had plenty of aisles where you could break a sweat due to warm air-current hot spots. Early in the century, air-convection planning and floor-space reconfigurations were considered ‘green’ strategies to reduce environmental costs.

Flash forward to, uh, now. For those who haven’t seen the video of Google data centers, it is an interesting glimpse into the world of massive-scale distributed computing on the cheap and the green.

First, some observations:

What you see:

  • Lots of modular components: starting with containers, basic power and plumbing conduits, heat-exchangers, and lots of open-case x86 kit crammed into high-density confined spaces.
  • An obsessive design focus on minimizing the Power Usage Effectiveness ratio
  • Centralized cooling and power infrastructure and many non-descript containers in a warehouse rack configuration
  • Vertical distribution of gear and workload (way beyond the scale of your average data center rack)
  • As noted in the exceptionally dry video dialogue, 45000 servers across 45 containers (1000 servers per container), in a single warehouse type facility
  • (now, close your eyes and imagine Google running on over 450,000 servers worldwide and tell me if you’re going to sign up for a Yahoo account tomorrow)
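
For readers who have not met the Power Usage Effectiveness ratio mentioned in the list above, here is a minimal sketch of the arithmetic. PUE is total facility power divided by IT equipment power, so 1.0 would mean every watt goes to compute. The container and server counts come from the video; the per-server wattage and the PUE values in the usage lines are purely hypothetical assumptions for illustration, not Google's figures.

```python
def pue(total_facility_kw: float, it_equipment_kw: float) -> float:
    """Power Usage Effectiveness: total facility power / IT equipment power."""
    if it_equipment_kw <= 0:
        raise ValueError("IT equipment power must be positive")
    return total_facility_kw / it_equipment_kw


def overhead_kw(it_equipment_kw: float, pue_value: float) -> float:
    """Power spent on everything except IT gear (cooling, power losses, lighting)."""
    return it_equipment_kw * (pue_value - 1)


# Figures quoted in the video: 45 containers, 1,000 servers each.
CONTAINERS = 45
SERVERS_PER_CONTAINER = 1000
total_servers = CONTAINERS * SERVERS_PER_CONTAINER

# Hypothetical assumption: 250 W per server (not stated in the source).
WATTS_PER_SERVER = 250
it_load_kw = total_servers * WATTS_PER_SERVER / 1000

if __name__ == "__main__":
    print(f"servers: {total_servers}, assumed IT load: {it_load_kw} kW")
    # Compare two made-up PUE values to see how overhead scales with the ratio.
    for assumed_pue in (2.0, 1.25):
        print(f"PUE {assumed_pue}: overhead {overhead_kw(it_load_kw, assumed_pue)} kW")
```

The point of the sketch is only that overhead scales with (PUE - 1), which is why shaving the ratio matters so much at this server count.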

What don’t you see?

  • Raised floors, specialized rooms or floors built for data center infrastructure
  • The classic air conditioned hockey-rink full of heat generating equipment
  • Large volumes of ambient space being cooled
  • Centralized UPS / Power Fail-Safe equipment
  • Cooling or power machines competing for compute space
  • Vendor labels (notice, no labels whatsoever)
  • Specialized hardware – in fact the hardware is engineered to be dead simple and uniform (which is relatively easy to do if your business runs massively distributed custom application code)

If you’re watching the Google drip-feed of information about their data centers, the innovation is absolutely incredible. In the latest news about the chiller-less data center in Belgium, you see geography, climatology, IT infrastructure, and geographic load-balancing strategy all in play, with a goal to load-balance work away from Belgium on hot days, and not rely on a single chiller. That’s seriously more innovative than creating warm/cold air convection cells in your raised-floor meat-locker. Maybe my earth sciences degree will come in handy for IT work after all…

Next, if you take a look at the major IT infrastructure vendors, we now have a horse race to emulate big modularized x86 kit in a box, ready for rapid deployment and consumption. Sun arguably pioneered the concept several years ago with the modular data center, and now we have pods, pods, and more pods.

In the larger picture, these innovations seem to validate Friedman’s semi-futuristic speculation in Hot, Flat, & Crowded about the convergence between the power grid and IT infrastructure. Granted, it will take years, if not decades, to purge legacy systems and applications, but for your average enterprise planning on ‘massively virtualized’, it isn’t a far stretch of the imagination to see where centralized infrastructure could possibly take hold for large parts of IT infrastructure. Absolutely, there are scores of dependencies and assumptions which must be true for this to work.

The Google drip feed also sheds light on why cloud computing is more than just a link to the next best data center. As a buddy of mine put it, ‘maybe the people who want to build their own cloud are missing the point’.

Jul 08

The biggest drama in the 2007-2009 storage industry is over. NetApp loses Data Domain.

This is very bad news for Quantum and FalconSTOR, leaves NetApp in a less than desirable position to market, and puts EMC in a strong technology position.  With a capital infusion of 2.3 billion, EMC is going to lead aggressively and move more kit than we can collectively imagine with the Data Domain acquisition.

For Quantum, their main channel was EMC through a not too shabby product line up (if you exclude the Franken-EDL with Deduplication), which extended through the Dell relationship to the mid-market, with a collective wide and deep market reach. That’s going away.

For FalconSTOR, the EMC relationship presumably ends with EDL. EMC still pushes EDL because it’s a solid product and does a great job at being a non-deduplicated VTL. But nonetheless, deduplication for backup data storage on disk is usually a deal-maker when you do the capacity sizing and are getting 8:1 deduplication or better.
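
To see why the ratio tends to be a deal-maker, here is a rough sketch of the capacity-sizing arithmetic. It assumes the simplest possible model, where physical disk needed is logical backup data divided by the deduplication ratio; the 80 TB figure is a made-up example, and real sizing would also account for retention, change rate, and compression.

```python
def physical_tb(logical_tb: float, dedup_ratio: float) -> float:
    """Disk needed to hold `logical_tb` of backup data at a given dedup ratio (N:1)."""
    if dedup_ratio <= 0:
        raise ValueError("dedup ratio must be positive")
    return logical_tb / dedup_ratio


def savings_fraction(dedup_ratio: float) -> float:
    """Fraction of raw capacity avoided at a given dedup ratio (N:1)."""
    if dedup_ratio <= 0:
        raise ValueError("dedup ratio must be positive")
    return 1 - 1 / dedup_ratio


if __name__ == "__main__":
    logical = 80.0  # hypothetical: 80 TB of retained backup data
    for ratio in (1, 8):
        print(f"{ratio}:1 -> {physical_tb(logical, ratio)} TB on disk, "
              f"{savings_fraction(ratio):.1%} saved")
```

Under this simple model, 8:1 means buying one eighth of the disk a non-deduplicated VTL would need for the same retained data.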

For NetApp, it’s back to square one with a kind of late-to-market VTL+deduplication offering that has yet to even scratch the powerful legacy of the NAS product lineup.

For EMC, things just got a lot more interesting. EMC has acquired arguably the most ‘proven’ deduplication feature set in the field (if you count instances and years in production). While some real engineering will be required for Data Domain deduplication to legitimately play in the primary storage space, the collective deduplication, virtualization, and security capabilities of EMC (Data Domain, VMware, RSA) position EMC with a mad toolkit for the cloud storage game. And in the meantime, a simple and proven product offering for VTL/NAS deduplication in the small-to-mid-enterprise market is ready to roll.

Plus, EMC has done a great job of acquiring companies and not screwing them up over the last several years. That is of course if you look past the Data General acquisition and the 2nd-class citizenry of the CLARiiON line, which persists to this day as a result of an ego-driven acquisition. I’d speculate EMC plays somewhat softer and smarter these days, and is going to make this work and take advantage of the investment.

Jul 01

NetApp’s initial attempt to acquire Data Domain shocked most people in the market. Insiders tell me even at the highest levels in Data Domain and NetApp engineering, no one had any inkling the boards were inking contracts. That makes a lot of sense given the years of blood-war between both companies and EMC. So when EMC finally did find out, the results were awkward and hostile. Never a good combination, but it’s a great way to keep the PR firms busy.

With the dramatic overtures by Joe Tucci in the aggressive courting of Data Domain, with all-cash offers, on top of the cool 100M antes to the original offer, and even bigger cash offers, you start to wonder if this is love or desperation at work. And in the end, Data Domain still holds out for the fiscally less enticing, yet rumored to be contractually binding, marriage to NetApp.

What does this signify?

Validation of Data Domain, and the first big move to commoditize the deduplication market. Sure, deduplication is a feature applied to primary and secondary disk architectures, but the bigger story is how deduplication is part of the gateway to the next big wave of IT infrastructure. Right now deduplication is fragmented and promoted by vendors all with their own secret sauce. Everyone is special. Everyone is different. In the future, you’ll be buying this feature set from 1 of a handful of market leaders. Same old story for a very disruptive technology feature.

Tucci doth protest too much?

The EMC story for deduplication is fragmented. You have a stalled relationship with FalconSTOR, after mammoth generation-1 VTL deployments from 2003 on, yet no adoption of FalconSTOR deduplication software. You next have Avamar, which is a separate platform specifically engineered for backup. Then you have Centera (which is not performing deduplication, but single-instance store of file objects), with industry rumors the platform is approaching end of development/life. Overall there is a boat-load of non-deduplicated kit spread around the world and no smooth transition for legacy customers.

Then there were about 12 months of EMC telling customers deduplication is ‘in development’. Next you have the Quantum partnership, which is in pole position for EMC deduplication / VTL offerings. Finally you have the hostile bid to take over Data Domain. This doesn’t exactly send good vibes to Quantum, even if you are the CEO.

EMC can spin just about anything out of ether, but my take is they desperately need a new technology base for deduplication and this is much more than picking up market share in the mid-market. EMC knows market share and can market anything, anywhere. We all know that. And if you look at other major EMC acquisitions (such as RSA, VMware, etc.), none of these moves were marriages of convenience. EMC is playing strategically and working to fill a technology gap. Even though they have quite a few variations on the deduplication theme, there’s still a gap.

Will the Data Domain & NetApp love last?

You never really know, but my personal theory is this is a mix of practical business sense and high-level hatred of EMC. Quite a few of Data Domain’s original crew came from NetApp. The culture of strong product engineering and R&D is pervasive. Sure, Data Domain reminds me of EMC too, but who can blame a publicly traded company that IPO’d as we careened into the worldwide economic meltdown of 2008 for being aggressive? The NetApp deduplication / VTL offering was late to market and, in spite of the company’s reputation for making outstanding and easy-to-use products, late in this game means you lose. Plus, an injection of aggressive won’t hurt NetApp’s position in the backup infrastructure market.

The most promising aspect is that big company R&D budgeting, strong synergy in engineering disciplines, and the cornering of the NAS primary and secondary storage market will drive out price and flatten the market for deduplicated disk. NetApp already has deduplication in production for NAS filers, now they’re gaining thousands of clients and a proven architecture for backup infrastructure. Plus, this may help Data Domain scale out of 35 usable TB per instance faster than planned.
