
Leo Leung
Let's tee it up with our first question to the crowd http://www.via-cc.at... - are you ready for "always on"?

Leo Leung
@gregorygbishop - thoughts?

Leo Leung
We're joined by @gregorygbishop , who ran the huge Time Warner infrastructure

Dave Vellante
the issue of course is always recovery - this is what makes "enterprise ready" very difficult - all the processes and procedures that are in place create terribly cemented infrastructure - but it typically works

John Furrier
. @lleung here at #bigdataSV #strataconf the question of scale with hardware keeps coming up - how to scale #bigdata

Greg Bishop
At TWC, I supported a mail system with 20M mailboxes, and over 10B objects in storage

Leo Leung
@dvellante yes and no - recovery is relevant, but so is running continuously even with failures

Stuart Miniman
the old way was to harden every piece of infra, the new way is distributed systems. Big hurdle around applications making this change, infrastructure is getting there faster.

Joseph B George (JBG)
Definitely a common theme that I hear from customers these days

John Furrier
hadoop is still useful for petabyte processing. and yes many companies do have that problem. spark seems cool only if you can figure out how to keep it running at scale - what should folks do for this

Greg Bishop
Recovery isn't the issue - the issue is figuring out how not to need recovery

Leo Leung
@gregorygbishop - please elucidate

Joseph B George (JBG)
and as tech evolves (a la Hadoop 2.0 and features like erasure coding), scaling becomes ever more interesting

Greg Bishop
When making a system "always on" and "at scale," one must assume that the system always has something in a failed state.

Leo Leung
@dvellante - the old way of downtime or system slowdown while you recover is no longer valid

Andrew Reichman
with massive data sets it's just not viable to think that you can have primary running with copies to somewhere else that you would recover to when things break- it just takes too long to move the data and build out a new envr.- you need HA

Greg Bishop
So in the old sense, the system is always 'in recovery'

Andrew Reichman
But building HA requires deep integration with the apps that use the data, technology to keep multiple sites synchronized and double huge infr

Joseph B George (JBG)
I'm seeing more people put more thought into things like fault domains - embracing that downtime will happen and planning with it in mind

Andrew Reichman
@gregorygbishop exactly- instead of recovery being a declared event when things hit the fan, it's more of a constant scenario that you're mitigating in smaller, non-disruptive ways

Leo Leung
@reichmanIT - @gregorygbishop - do you agree with the notion of deep integration or is the infrastructure smarter?

Lacee
@jbgeorge fault domain seems to be a common issue I hear from customers #realdatastories

Joseph B George (JBG)
Fail fast, right :)

Leo Leung
There's definitely a law of large numbers effect - 1,000's of disks, 1,000's of nodes, things will fail
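
The "things will fail" point can be made concrete with a rough calculation. This is only a sketch: the 2% annual failure rate and 24-hour repair time are illustrative assumptions, not figures from the chat, and failures are assumed to be independent.

```python
HOURS_PER_YEAR = 8760


def expected_failures_per_year(n_devices, annual_failure_rate):
    """Average number of device failures per year across the fleet."""
    return n_devices * annual_failure_rate


def prob_any_down(n_devices, annual_failure_rate, repair_hours):
    """Chance that at least one device is failed at a given moment.

    Assumes independent failures (an idealization) and that each failed
    device stays down for repair_hours before it is replaced.
    """
    # Fraction of the year a single device spends failed.
    unavailability = annual_failure_rate * repair_hours / HOURS_PER_YEAR
    return 1 - (1 - unavailability) ** n_devices


if __name__ == "__main__":
    # Assumed inputs: 2% annual failure rate, 24 hours to replace a device.
    for n in (10, 1_000, 10_000, 100_000):
        print(n,
              expected_failures_per_year(n, 0.02),
              round(prob_any_down(n, 0.02, 24), 3))
```

Under these assumptions, the chance of something being down at any moment climbs steadily with fleet size, which is the sense in which a large system always has something in a failed state.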

Joseph B George (JBG)
I will also say that as the infrastructure is evolving - esp as we are looking at networking beyond 10GbE - the infrastructure design gets more interesting

Greg Bishop
I'm not sure that deep integration with the infrastructure is required to support the resiliency required for 'always on'

Andrew Reichman
@gregorygbishop depends on who's talking- if it's infr team they will say deep integration. if it's app team, they will say that they can control dumb infr with their smart software

Joseph B George (JBG)
back in 100Mb times, it was a source that had to be "designed around" - that is changing

Leo Leung
@gregorygbishop - certainly, our prescription is a different kind of infrastructure - "distributed" is one piece @stu

John Furrier
polarization with apps at scale (bus applications) and infra at scale (infra software) - lots of innovation at the infra

Leo Leung
@gregorygbishop - given "continuous recovery" what do you do differently from before?

John Furrier
. @lleung this brings up the notion of hw as a service - consumption has to be easy to stand up and provision for app scale world - I'm interested in what solutions are out there

Ariana Gradow
The value and the ability to build something that can scale is now a necessity

Joseph B George (JBG)
I know @zehicle has been talking about this for many years

Greg Bishop
At TWC, the mail application interfaces with the storage infrastructure using a standard web interface, but has no concept of how the infrastructure keeps data available

Joseph B George (JBG)
I actually WOULD say it is HWaaS - the tools behind can give it that level of delivery

Greg Bishop
So, the goal was to make the infrastructure smart, not HA in the traditional sense, as the application sees no 'failover'
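
One way to picture an application that "sees no failover" is a storage layer that hides replica failures behind a plain put/get interface. The sketch below is a toy in-memory model, not a description of the TWC system; the `Replica` and `ObjectStore` names are hypothetical.

```python
class Replica:
    """Toy in-memory storage node that can be marked as failed."""

    def __init__(self, name):
        self.name = name
        self.data = {}
        self.healthy = True

    def put(self, key, value):
        if not self.healthy:
            raise ConnectionError(f"{self.name} is down")
        self.data[key] = value

    def get(self, key):
        if not self.healthy:
            raise ConnectionError(f"{self.name} is down")
        return self.data[key]


class ObjectStore:
    """Hypothetical facade: the caller only ever sees put() and get()."""

    def __init__(self, replicas):
        self.replicas = replicas

    def put(self, key, value):
        written = 0
        for replica in self.replicas:
            try:
                replica.put(key, value)
                written += 1
            except ConnectionError:
                continue  # a failed node is treated as normal, not an event
        if written == 0:
            raise RuntimeError("no replica accepted the write")

    def get(self, key):
        for replica in self.replicas:
            try:
                return replica.get(key)
            except (ConnectionError, KeyError):
                continue  # skip nodes that are down or missed the write
        raise KeyError(key)


if __name__ == "__main__":
    store = ObjectStore([Replica("a"), Replica("b"), Replica("c")])
    store.put("msg-1", "hello")
    store.replicas[0].healthy = False  # a node fails
    print(store.get("msg-1"))          # the caller's code does not change
```

The caller's code is identical whether zero or two nodes are down; dealing with the failed node is the storage layer's job. A real system would also re-replicate data onto healthy nodes in the background, which this sketch omits.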

Joseph B George (JBG)
totally agree on containers @furrier

Leo Leung
@gregorygbishop - cool - my point these days is traditional notions of failover, recovery, availability... need an update

Joseph B George (JBG)
In that vein, we are seeing more and more HP customers start looking at infra closer - purpose built vs general purpose - getting great results

Andrew Reichman
@gregorygbishop decoupled architecture allows each piece to scale indefinitely and not break the others so long as everybody is reliable and speaking a language the others understand

Joseph B George (JBG)
The recently announced HP Big Data Ref Arch is a good example

Leo Leung
@reichmanIT - that's what i mean by service oriented vs. "as a service" - probably need a longer piece on that

Dave Vellante
@gregorygbishop how utopian - that would be a computer industry first!

Andrew Reichman
agree- as a service just means that someone else is doing it. service oriented means that separate domains have rules of engagement wherever they might live and whoever might have built them

Dave Vellante
you have to think about "disposable infrastructure" but imo if you ignore recovery you are a foolish practitioner - remember - even google has to recover from tape at times

Ariana Gradow
.@lleung Cloud scalability and performance should be at the heart of every successful internet venture.