unifydata

Talking Data Hub for Analytics
In the renaissance of analytics and AI, the storage industry is holding enterprises back with legacy architectures and silos, like data lakes. Join us as we discuss what it takes to move the storage industry forward and how a data hub lays the foundation for a modern architecture.
Peter Burris
From the “Open Letter to the Storage Industry”: “Data lake is dying. It was built on the obsolete premise that all unstructured data is meant to be stored.” How are you evolving your management of analytics data? https://www.crowdcha...

Roy Kim
Anyone here using data lake in their organization?
David Floyer
Traditional data lakes and data warehouses have had poor returns on investment - they only make people smarter. AI is driving up the value, as inference engines can make processes smarter. Data Hubs are a step on the way!
John Furrier
I have been on record since that term came out (took tons of heat for it) that data lakes aren't the silver bullet; in fact, I dislike the term. Data is fluid and the ocean metaphor is better bc
John Furrier
@dfloyer I agree with David Floyer on this. I would add further that imho data needs many hubs and data buses
Roy Kim
@dfloyer, how did you measure return on investment? I heard that data lakes give poor ROI.
Roy Kim
@furrier, seems like most organizations are using data lakes to store all their unstructured data.
Brian Schwarz
Current data centers have too many different architectures for storing data. The public cloud simplified it by narrowing the choices down dramatically. The same has to happen for on-prem, which is why new architectures like Data Hub have to come to fruition.
John Furrier
@purerkim They are, and they mostly turn into data swamps bc they have to optimize on keeping the data moving, when the better architecture is to let the apps drive the data movement. The result is that extra time is wasted managing data quality.
vaughn stewart
Data lakes ultimately are proprietary with 'openness' enabled via ETL and/or gateways. They don't work with the speed at which data is being created and analyzed across a multitude of platforms
David Floyer
AI and Cloud change the return on Data - by integrating data hubs and moving code to the data, much richer real-time analysis can occur. This enables smarter application decisions in real-time.
Brian Schwarz
it also strikes me that most on-prem apps today are built to run on files, but cloud apps are built to run on objects -- that has to be unified at some point
ArmiB
@dfloyer what role does flash play here?
David Floyer
@purerkim The simplest measurement of ROI is to ask CIOs what the comparative return on data lakes and data warehouses is. They will usually roll their eyes!
Brian Schwarz
@vStewed past data lakes are also too static, and you can't easily inject new analytics apps into your pipeline. One more reason to separate compute (into containers or VMs) and enable a fast, scalable storage tier -- that is a data hub
Roy Kim
@dfloyer, there was a rush to store all the data in a data lake. can't get them out fast enough now! slow data = poor ROI!
David Floyer
@ARmiBanaria Flash plays a critical role in increasing the amount of data processed. Well-designed flash systems have much lower latency and much higher aggregate throughput. This leads to higher-value real-time applications.
vaughn stewart
@TheSchwarzBwthU spot on - one's AI & Analytics architectures should be built on the same principles whether on-prem, hybrid or in the cloud.
vaughn stewart
@dfloyer Two fundamental problems with classic analytics and AI data storage strategies are 1) they were designed on slow disk and networking technologies and 2) single use case centricity, expandable via gateways or ETL.
Neil Raden
@dfloyer Data Lakes are filling the role of the landing zone for further processing of data. To DW for that SVOT need and to support ML. Unfortunate name, Data Lake.
Roy Kim
@NeilRaden , Agreed. "Lake" is a poor choice for name..
Roy Kim
@NeilRaden, it's too close to being a "swamp"!
Dave Vellante
the whole concept of data lake is flawed imo...
Peter Burris
How would you like to see storage industry partnerships (e.g., application relationships) evolve to advance storage technology? https://www.crowdcha...

Roy Kim
Would love to see more integration of the deep learning stack (#TensorFlow, #PyTorch) with storage somehow
Roy Kim
@plburris, #AI, #deeplearning ecosystem is talking about this but no one has proposed anything
Brian Schwarz
More investment in converged infrastructure systems that use the same building blocks. That is the way to get both simplicity of management and flexibility to accommodate all different types of workloads
Brian Schwarz
@purerkim saw that Facebook made a public push on PyTorch this week - will be good to get more standardization on AI SW toolkits
vaughn stewart
it's not just a storage vendor responsibility - AI & Analytics SW vendors as well as the pubcloud need to embrace open data sharing - standardizing on protocols, security models and APIs.
Roy Kim
@TheSchwarzBwthU , that's right. #pytorch 1.0 just released yesterday. for all the rage in AI, it shows how young the industry is. 1.0 just released!
David Floyer
Storage is a vital part of systems. DevOps is moving the responsibility of operations from people to the applications. Storage vendors need to provide the tools to developers for managing data within the DevOps frameworks!
vaughn stewart
as an example, S3 is not a universal API. Behavior varies from implementation to implementation - which hinders unification efforts.
Brian Schwarz
@vStewed great point Vaughn, it would be nice if app vendors spent more time thinking about the foundation they run on
Roy Kim
@vStewed, right on! I don't think app vendors will want to do this by themselves
Amy Rushall
More heart and less attack! Let's work together to #unifydata
Roy Kim
@TheSchwarzBwthU , i wonder if we need the public cloud to push the entire SW ecosystem
vaughn stewart
credit to analytics vendors like @hortonworks, @Splunk, @Vertica and others who embrace disaggregated architectures and object stores. More need to follow suit.
Brian Schwarz
@vStewed you're right about the S3 nuances, but object protocols are becoming more standardized than they used to be, which is a step in the right direction. Would be good for @AWS to open source some parts of it to further increase compatibility
Neil Raden
@vStewed In the past, the data disciplines (storage, ingest, quality, transform, integrate, etc.) were a separate industry from analytics and intelligence. No more. Orgs want continuous frictionless flow from data creation to insight.
Roy Kim
@vStewed Definitely a good start. Need more app vendors to embrace modern, disaggregated architectures. Looks like @AWS and other pubcloud vendors are doing it
vaughn stewart
I'm bullish on S3. Maybe @SNIA should partner with @AWS in order to increase interoperability across vendors.
Peter Burris
Does your organization plan to invest in AI practices in the next year? If so, what is the use case? https://www.crowdcha...

Dave Vellante
yes. searching, identifying and capturing high value content and segments buried in videos
Brian Schwarz
Is there any industry that isn't going to be impacted by AI?
Amy Rushall
for those looking for strategy assistance I highly recommend this AI Business Strategy course I completed in the Spring from MIT Sloan CSAIL. https://getsmarter.m...
Roy Kim
@plburris, I think #US organizations are falling behind in #AI race. See what other countries are doing with AI. Self-driving drones: https://www.engadget...
Brian Schwarz
I talked to a utility company that wanted to use drones to fly over 10,000 miles of electric transmission wires to check on tree cover
David Floyer
My personal work is developing AR technology to help victims of abuse recover!
Roy Kim
@purerkim for goodness' sake, if farmers are using AI, what industry can't use AI?
Mark May
It's impossible to not invest in AI and stay ahead of the curve. We just need to get past all the AI washing...
vaughn stewart
@cincystorage I'm more in favor of getting past legacy thinking. New roles like Chief Data Officer and Data Architects need to continue to be added to organizations. They are the change agents tasked with planning for the future.