RedefineBigData

EMC’s Ask Me Anything
EMC’s Dean of Big Data, Bill Schmarzo, will answer all your Big Data questions! Ask him anything
John Furrier
Q9: Can you really stand up analytics #datalake solution in one week? Explain this please http://www.via-cc.at...

Dean of Big Data
Wow, I really don't think so. What are you standing it up to do? What's the targeted use case? How do you get Business buy-in? Need to answer those questions first
John Furrier
This is what EMC was promoting in the release today?
Dean of Big Data
I mean, you can stand up one in a week, but it takes much longer to make sure that you're focused on the 4 Ms of Big Data..."Make Me More Money"
priya joseph
A week is too long for the money spent on my source of truth, aka data lake
Rodrigo Gazzaneo
@schmarzo narrow down the case to standing up the infrastructure and it may be possible. See the reference below: http://reflectionsbl...
Jean-Luc Chatelain
So the answer is yes, in fact the setup of a lake is a day, BUT you need the data source connectors and that can take a while. Just a data point, but we see about 6 weeks from go to actionable outcomes when we take the bite-size approach.
Dean of Big Data
Yes, we can stand up a data lake in a week. We've pre-engineered all the different data lake components. Business value then comes next and takes a bit more time
Jay Livens
@schmarzo I love the 4Ms! Awesome. That is the real challenge with all of this.
Jean-Luc Chatelain
BTW it is the time to result that counts, as opposed to the time to set up the lake. It is about catching fish, not just putting water in the pond!
John Furrier
Getting some analytics up would be easy: if access to the data is available, then running some baseline analytics is possible, but it would take a bit more to really jam on something.
Dave Vellante
standing up a data lake in-and-of-itself isn't of much value unless time-to-insights is part of that equation
Jay Livens
@schmarzo Yes, we are back to analytics. We can stand it up, but how do we generate value?
Ashish Sahni
@schmarzo this was the slide that I removed from the deck...
Dean of Big Data
@InformationCTO Agreed. We can stand one up quickly, but then you need to focus on what you want to accomplish with the data lake. That's where we focus in EMC Global Services
Dean of Big Data
Many of our clients start with a Vision Workshop to help them identify where and how to start their big data journey from a business transformation perspective; gives us the focus and priority to ensure success
Dean of Big Data
@JLivens And it's the most fun! The 4 Ms are where we capture the imagination of our clients and help them focus on driving success. The technology follows after that
Jay Livens
@schmarzo Makes sense, but doesn't IT need to hire people with new skillsets to take advantage? (Speaking as a geek with an undergrad degree in econometrics.)
Jay Livens
The other thing is that Big Data models cannot be static and so will likely change over time, hence the need for modelling expertise...
Dean of Big Data
@JLivens ...or you create teams that couple the data scientist with the Business SME. That's been working great for us!
Rodrigo Gazzaneo
@JLivens I love the Undecided / Motivated / Ready curve to illustrate the maturity level in development
Dean of Big Data
@dvellante And creating value out of the data lake is really the most fun; it's why my job is so enjoyable and fascinating!
Jay Livens
@schmarzo Love that idea - coupling business and data scientist knowledge is the right strategy.
Jean-Luc Chatelain
Ditto for us at Accenture: start with a workshop, understand business goals, current state and the execution steps of the journey
Dean of Big Data
@vGazza Agree! The Undecided/Motivated/Ready maturity is a great way for organizations to determine what they should do next
Ted Bardasz
It is all about time to value, and the 4 M's, but what we've seen in use is that the time to value is hampered by the IT activities around initially creating the environment and then support mods against hypothesis analysis.
Ted Bardasz
That's the value of the FBDL: to leverage Federation assets so the path to the 4 Ms is paved.
John Furrier
Q3: Ok here comes the hard question. why data lake as a term???? vs data ocean or data stream.. http://www.via-cc.at...

Dean of Big Data
As you know, I'm not into "term wars." It's more about the functionality
Dean of Big Data
But I was thinking on my job this morning that lakes, reservoirs, oceans all suffer from the same problem, there can be multiple of them
John Furrier
then you agree data ocean is better :-)
Jean-Luc Chatelain
Well, you asked... Marketing and hype. Whatever you call it, it is a place where raw, un-tortured data can be captured at a good TCO price point and where data scientists AND business analysts can JOINTLY do information discovery
Dean of Big Data
So how about "Data Earth" as the single data lake to avoid the data lake silo and proliferation problem? Whatcha think?
John Furrier
data universe - but watch out for the "black hole of data"
Dean of Big Data
I like "Data Universe"!! And I like the idea of Black Holes. John, you might be on to something!
Jean-Luc Chatelain
A < insert favorite analogy > is a place where you can find what new questions to ask of your data, as opposed to verifying answers to known questions (BI)
John Furrier
gr8 to have @InformationCTO here sharing his knowledge
Dean of Big Data
@InformationCTO A place that supports data discovery, data enrichment, data munging and both predictive and prescriptive analytics
Rhonda Edwards
I prefer Data-Lake as it connotes something controllable and known, albeit vast, while refreshing itself via streams and purging via outlets.
Dean of Big Data
Totally agree! @InformationCTO adding great value to the conversation!
Jean-Luc Chatelain
The real point of interest is that data lakes/oceans allow a significant reduction in spend on EDW (storage) while enabling much better analytics driving actionable outcomes.
John Furrier
@rhondanet I spoke with Inhi on @theCUBE and oceans are more dynamic, real-time, unpredictable, like "streaming and in-memory apps" - retail
Jean-Luc Chatelain
My 2 cents is that too many analogies confuse the business customers... They just "got" what lakes mean in concept, so why talk of oceans, swamps or other ponds unless it is about click baiting and getting time on the Cube :-)
Dean of Big Data
@InformationCTO Also means the separation of the EDW and the Analytics Environment; un-handcuff the data science team from being dependent on the EDW for their data
Jean-Luc Chatelain
@schmarzo yes, at the end of the day BI/EDW is to make the bean counters happy (necessary) but NOT to effect real change and improve outcomes
Dean of Big Data
Not sure on Ocean as there are multiple Oceans. Like Universe better (but then again, that might be my Business Objects heritage showing through)
Dean of Big Data
@InformationCTO "at the end of the day BI/EDW is to make the bean counters happy (necessary) but NOT to effect real change and improve outcomes." I love that!
Jean-Luc Chatelain
You heard me say it last time, I think. The black hole is where privacy/governance/security has fallen (in the #bigdata universe)!
Dean of Big Data
@InformationCTO Data Lake 2.0 MUST address privacy/governance/security or we'll never get past the data lake being anything more than a technology stack
Crowd Captain
My ship only works on the Ocean
John Furrier
Q4: Please explain this trend outlined in this slide? What does this mean for businesses as they transform to fully digital #DigitalUniverse http://www.via-cc.at...

Dean of Big Data
Think Business Metamorphosis where organizations are moving to a "Business-as-a-Service" model and the importance of data + analytics to support that
Dean of Big Data
I took my Big Data MBA class through an exercise where we picked a company and contemplated the data and analytic requirements to convert them to Business-as-a-service #BigDataMBA
Jean-Luc Chatelain
All this means that no business can avoid taking an IPE (information-powered enterprise) journey. The journey is not easy because there are lots of shiny objects on the way, but it is a journey that must be taken anyway.
Jean-Luc Chatelain
And like any journey, you need good Sherpas to help you along the way.
Dean of Big Data
@InformationCTO Amen. It also means that tomorrow's business leaders can't delegate data and analytics to IT; they must be engaged!
John Furrier
Jean Luc: like the farming equipment in the pic; loved your wife driving one this weekend for spring work on the farm!!
Jean-Luc Chatelain
@schmarzo The #1 person who can make the journey successful is the CEO... anything else is bound to fail. The lead reason is that the journey will force business process changes, and unless the boss says so, people will not change biz as usual
Dean of Big Data
@InformationCTO I've seen CIOs and IT Directors grabbing the #BigData mantle and reaching across the aisle to bring the business into the fold. Most CEOs are too busy just trying to run the business.
Jean-Luc Chatelain
@schmarzo yes, there are some, but across the G2000 they are few and far between. The good CEOs, though, are the ones who can see over the horizon and accept that to avoid obsolescence they must be data-led. It is a behavioral change.