John Furrier
Q2: Why did data warehousing fail?
Dean of Big Data
great topic. Just getting ready to publish a blog on this very topic and the learnings for a Data Lake world
John Furrier
Data Warehouses that don't embrace the future will die
John Furrier
@schmarzo a warehouse can't float on a lake or ocean
Dean of Big Data
data warehouse concept will live, but the underlying infrastructure could change dramatically!
Robert Novak
I think data warehousing didn't fail, it just didn't scale. Lots of applications for DW and the ones that integrate with the modern data ecosystem can survive and transform.
John Furrier
adding new data sources is hard; lots of incomplete data makes for shit results
Robert Novak
And lost/missing/siloed data makes for incomplete results and lost efficiencies of scale. People are still learning how to find and use data.
John Furrier
data warehouses lacked the compute power both on prem and in the cloud to analyze petabytes of customer data
Robert Novak
I was talking to a @ciscodc customer who'd acquired a high-data-volume company and knew they wanted to use that data but had no idea how. Probably not uncommon among M&A situations.
John Furrier
@gallifreyan the elephant in the room you won't hear about at the O'Reilly Strata show is the nuances of integration; integration is the new "bar" that has been raised
Robert Novak
I wonder if the smaller conferences (from @USENIX and such) will pick up more of the hands-on integration technology discussions.
Dean of Big Data
I think data warehousing failed in at least one area - data proliferation with data silos.
Muddu Sudhakar
Data warehouses did not scale to large environments; it was hard to extract data to run algorithms on it.
Ash Parikh
great topic - it's not about data warehousing vs. big data - you need both - one answers what did I sell to my customers in the past - the other answers what I will be able to sell in the future - it's about making the best of both worlds IMHO
Ash Parikh
stats say that 70% of big data and data lake projects will remain at experimentation or fail - the issue is that data management for big data is an afterthought or simply overlooked - data management is foundational whether data is big or small
Ash Parikh
here is an article I wrote recently that covers the big data journey - would love your thoughts - http://www.computerw...
Ash Parikh
it's about using the right tool for the right job - we see customers investing separately in data warehouses and in big data infrastructures at the same time - again, to answer different questions - successful ones think data management first
Muddu Sudhakar
Most Hadoop ecosystem products are features of a platform product, or part of the technology for applications leveraging big data
Ash Parikh
@smuddu let me know if i interpreted your response correctly - stand-alone data preparation is not enough - it is a key capability in a comprehensive platform that provides end-to-end data management for big data and data lakes - thoughts?
Muddu Sudhakar
@parikhash @furrier data prep is a feature in an end-to-end platform product. Most Hadoop vendors' functionality seems like a set of features, and it will be the customer's responsibility to integrate these to solve their problems.
Ash Parikh
@smuddu thanks - i think you will find this article fun to read - http://www.computerw...
John Furrier
@parikhash you should post that on SiliconANGLE.com
Rishi Yadav
what I see at clients is that they used data warehousing because they did not have an alternative. Who wants to deal with the mess of creating cubes when you can do everything in memory?
Dave Vellante
EDW & BI = too rigid. they became insights for a few and those insights weren't operationalized at scale
Dave Vellante
certainly EDW failed to live up to its vision and promise of a 360 degree view of customers in near real time
Muddu Sudhakar
@dvellante John Furrier Hadoop & BI = too rigid + too complex. Need packaged solutions/apps which can hide the complexity of Hadoop/Spark and take away the need for consulting/PS services
Rishi Yadav
I am biased, but in reality most of the work is happening in integrating data sources. Hadoop/Spark etc. just work. The challenge is in connectors. Let's take the case of deployments on AWS: consulting is not needed to get started, but to reduce latency
Dmitry Golubev
EDWs are failing mainly due to complexity. Too many interfaces between systems, and business logic is hidden in apps. The result is difficult impact assessments and painful changes. Big Data is even worse in this sense.
Bharath Aleti
We just have to look at the past: databases were ubiquitous because users had a whole slew of apps that could leverage the underlying data infra. Big Data requires a similar vertical ecosystem, to avoid the pain cited by @parikhash
Annika Jimenez
Data Warehousing is too BI-centric, too rigid for new data ingest, too ETL-dependent, too difficult to enable access, too expensive. The Big Data arena has shifted to agility in discovery, rapid access enablement, and data/insights-enabled apps.