eweekchat

Trends in Data Orchestration
JOIN US: This is a chat-based conversation about what we're seeing in the organization of all that data we're collecting. Data orchestration--using Kubernetes or other platforms--is a key topic right about now. We'll have expert guest hosts!
Chris Preimesberger
Q1: How is your company using data orchestration? What business values are you deriving?
Sam Lakkundi
A1. At @BMCSoftware, we use #dataorchestration to bring our #data closer to compute across clusters, regions, clouds, and countries #eweekchat
Chris Preimesberger
Hi Sam, and welcome. What advantages are there in bringing compute closer to the data store itself?
Scality
A1. I can speak for a couple of our customers and open source users. We see people orchestrating data between legacy and modern applications. For example, a media company with a legacy FTP-based app is orchestrating data up to cloud services.
Sean Knapp
A1. At @ascend_io, #dataorchestration is at the heart of our product, and what we also use internally to collect data across disparate systems & clouds, perform complex transformations & modeling (for #analytics and #datascience), and even automate #datagovernance.
Alex Ma
Bringing the data from the datastore closer to the compute, or local to the compute, has a number of advantages. It reduces network latency, increases throughput for the application itself and minimizes the overhead of communicating with the datastore.
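To make the locality point concrete, here is a minimal Python sketch of a read-through cache placed next to the compute. The `RemoteStore` and `LocalCache` classes, the file name, and the simulated latency are all hypothetical illustrations, not any vendor's API.

```python
import time


class RemoteStore:
    """Hypothetical stand-in for a remote datastore with per-request latency."""

    def __init__(self, data, latency_s=0.01):
        self._data = dict(data)
        self._latency_s = latency_s
        self.requests = 0  # number of round trips made to the remote store

    def read(self, key):
        self.requests += 1
        time.sleep(self._latency_s)  # simulate one network round trip
        return self._data[key]


class LocalCache:
    """Read-through cache that lives next to the compute."""

    def __init__(self, store):
        self._store = store
        self._cache = {}

    def read(self, key):
        # Only the first read of a key pays the remote round trip.
        if key not in self._cache:
            self._cache[key] = self._store.read(key)
        return self._cache[key]


store = RemoteStore({"events/2020-01.parquet": b"rows"})
cache = LocalCache(store)
for _ in range(100):
    cache.read("events/2020-01.parquet")
print(store.requests)  # 1
```

In this toy setup, 100 reads cost one remote round trip instead of 100, which is where the latency and throughput gains described above come from.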
Dipti Borkar
A1. At @alluxio, we focus on bringing #dataorchestration to our users, for hybrid cloud and multi cloud environments. Data orchestration accelerates compute on remote data and unifies data access.
Sam Lakkundi
Hi Chris, data orchestration in general simplifies data ecosystems, which have grown more complex with new frameworks, cloud adoption/migration, and the rise of data-driven, data-demanding applications #eweekchat
Scality
A1. Business value of #dataorchestration can range from delivering new services to business continuity / DR.
Dipti Borkar
A1. Users like DBS Bank and Walmart have data on-prem. Accessing this data in other remote environments can be a big challenge, and these users and others use @alluxio 's "Zero-copy" bursting approach to bring data to the cloud to accelerate #bigdata compute
Eric Kavanagh on #DMRadio
@dborkar I love @Alluxio's approach of creating a virtual cache of data that connects on-prem environments to each other, and to the cloud. As such, it enables high-powered analytics on data that's scattered across multiple locations. #DataOrchestration #eWeekChat
Dipti Borkar
A2. Great question @editingwhiz! At @alluxio #dataorchestration = Seamless data caching + transparent virtual data access + data migration + intelligent cataloging + transformations to change the shape of data and make it compute-friendly.
Dipti Borkar
We often hear that #datawrangling can be #complex and #inefficient. Users are looking for automation, and seamless access to data wherever they need it. @alluxio does this, not by creating more copies of data, but by making data available via caching & a global unified namespace
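A rough Python sketch of the "global unified namespace" idea follows. It is an illustration under stated assumptions, not Alluxio's actual API: the `UnifiedNamespace` class, the mount prefixes, and the dict-based backends are all hypothetical. Each backend is mounted under a path prefix, and reads are resolved to the owning backend, so callers use one path scheme without duplicating whole datasets.

```python
class UnifiedNamespace:
    """Hypothetical single path space over several backing stores."""

    def __init__(self):
        self._mounts = {}  # mount prefix -> backing store (a plain dict here)
        self._cache = {}   # paths already fetched through the namespace

    def mount(self, prefix, backend):
        self._mounts[prefix.rstrip("/") + "/"] = backend

    def _resolve(self, path):
        # The longest matching mount prefix wins.
        for prefix in sorted(self._mounts, key=len, reverse=True):
            if path.startswith(prefix):
                return self._mounts[prefix], path[len(prefix):]
        raise FileNotFoundError(path)

    def read(self, path):
        if path not in self._cache:
            backend, relative = self._resolve(path)
            self._cache[path] = backend[relative]
        return self._cache[path]


on_prem = {"sales/2019.csv": "id,amount\n1,10\n"}
cloud = {"clicks/today.json": '{"clicks": 3}'}

ns = UnifiedNamespace()
ns.mount("/onprem", on_prem)
ns.mount("/cloud", cloud)
print(ns.read("/onprem/sales/2019.csv"))
print(ns.read("/cloud/clicks/today.json"))
```

The caller never needs to know which store holds a given path; only the mount table does.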
Dipti Borkar
DBS Bank uses this "zero-copy" bursting approach to expand their compute capacity no matter where data is. The truth is that a lot of enterprise data still remains on-prem, but data centers are running out of capacity and cannot keep up with demand.
Chris Preimesberger
Hi Dipti and welcome! Where, generally, are your customers maintaining their caches--in data centers still, or are they moving these to the cloud--hybrid, public or private?
Dipti Borkar
Thank you for having me! Alluxio is typically co-located with compute and often remote from the data, which may be on-prem or in a different data center.
coshiro
A1: AtScale’s platform is designed to orchestrate data where it lies in response to data usage. We have a concept called Autonomous Data Engineering which uses some ML to understand how to present data and ultimately how to optimize the data for better data analytics.
coshiro
Our customers derive the benefit of enhanced performance against their data while leveraging all the data regardless of where it lies. This leads to better historical trending, outlier identification and multi-platform support.
Eric Kavanagh on #DMRadio
Cool! So do you create a marshaling area of data, like a cache? Or if not, how exactly does the data get provisioned from where it lives, to where the user needs it for analysis or reporting? Can you also enable OLTP applications with this method? #eWeekChat
Chris Preimesberger
Thanks and welcome Chris! Can you explain a bit more about the use of ML in this context?
coshiro
@eric_kavanagh - mainly an OLAP platform. Our Acceleration Structures are not cache. They are not transient. Instead they are heuristically defined and maintained by AtScale.
coshiro
AtScale is tracking queries and query patterns to ‘learn’ and ‘anticipate’ future behavior.
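As a loose illustration of tracking query patterns to anticipate future behavior, the Python sketch below counts how often each grouping is requested and proposes the recurring ones as candidates for pre-aggregation. This is not AtScale's method: the `QueryPatternTracker` class, the `min_hits` threshold, and the table and column names are hypothetical.

```python
from collections import Counter


class QueryPatternTracker:
    """Hypothetical tracker that counts (table, group-by columns) patterns."""

    def __init__(self):
        self._counts = Counter()

    def record(self, table, group_by):
        # Sort the columns so their order does not create distinct patterns.
        self._counts[(table, tuple(sorted(group_by)))] += 1

    def suggest(self, min_hits=2):
        # Patterns seen at least min_hits times, most frequent first.
        return [
            pattern
            for pattern, hits in self._counts.most_common()
            if hits >= min_hits
        ]


tracker = QueryPatternTracker()
tracker.record("sales", ["region", "month"])
tracker.record("sales", ["month", "region"])
tracker.record("sales", ["product"])
print(tracker.suggest())  # [('sales', ('month', 'region'))]
```

A real system would presumably weigh cost, freshness, and storage as well; the counter only shows the "learn from usage" step.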
Chris Preimesberger
Our guest experts today are:
Eric Kavanagh, CEO of The Bloor Group; Wally MacDermid, VP of Cloud for Scality; Dipti Borkar, VP of Products at Alluxio; Christopher Merz, Principal Technologist at NetApp; ...
Eric Kavanagh on #DMRadio
Excited to join this #eWeekChat today! All about a super-hot topic: #DataOrchestration! Lots of very cool vendors in this space now @streamsets @openprisetech @geminidata @informatica @ascend_io @Boomi and many others!
Scality
Hello @editingwhiz! Thank you for the opportunity to participate and contribute/learn with such a great panel.
John Furrier
Great lineup of experts, let the knowledge sharing begin :-) cc @dvellante @stu
Sean Knapp
@editingwhiz A4: to further #DataOrchestration, we need to continue to make our systems increasingly intelligent. We have more #DataEngineers, creating more #DataPipelines, producing more data products than ever before. #DataOrchestration must understand the intent of the developer.
John Furrier
I like where you're going with this ..I call it "data coding" or DataOps .."data as code"
Sean Knapp
To accomplish this, #DataOrchestration systems need to move to Declarative models, understand both code and data. Easy litmus test: can your Data Orchestration tool guarantee that the data in your lake/warehouse/etc is exactly in sync with the code in your repo? ;-)
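One way to picture that litmus test is the small Python sketch below: the pipeline is a declarative spec, the materialized data records a fingerprint of the spec it was built from, and a reconcile step rebuilds only when the two disagree. The spec fields (`column`, `multiply_by`), the `Materialized` class, and `reconcile` are assumptions made up for illustration, not any product's interface.

```python
import hashlib
import json


def fingerprint(spec):
    """Stable hash of a declarative pipeline spec."""
    canonical = json.dumps(spec, sort_keys=True)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


class Materialized:
    """Hypothetical output dataset that remembers which spec built it."""

    def __init__(self):
        self.rows = None
        self.built_from = None  # fingerprint of the spec that produced rows


def reconcile(spec, source_rows, target):
    """Rebuild target only if it is out of sync with spec. Returns True if rebuilt."""
    wanted = fingerprint(spec)
    if target.built_from == wanted:
        return False  # data already matches the code
    column, factor = spec["column"], spec["multiply_by"]
    target.rows = [{**row, column: row[column] * factor} for row in source_rows]
    target.built_from = wanted
    return True


source = [{"amount": 1}, {"amount": 2}]
target = Materialized()
spec = {"column": "amount", "multiply_by": 100}

first = reconcile(spec, source, target)   # builds: nothing materialized yet
second = reconcile(spec, source, target)  # no-op: spec unchanged
spec["multiply_by"] = 10                  # the "code" changes
third = reconcile(spec, source, target)   # rebuilds from the new spec
print(first, second, third, target.rows)
```

This sketch only detects changes to the spec; a fuller check would also fingerprint the source data so that upstream changes trigger a rebuild.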
John Furrier
How do you see this working with pure public cloud and through a hybrid environment?
Sean Knapp
@furrier exactly! Similar to #DevOps, #DataOps is about enabling more people, to do more (with data), faster, and safely. To do that, you need smarter systems that modularize, understand, automate, and adapt on behalf of the developer & user.
Sean Knapp
@furrier another great question (#DataOrchestration & #DataOps in hybrid env). I think this is where intelligent systems shine. You can, and often should, have hybrid environments. An intelligent orchestration system should know what data is needed where, and deliver it for you.
Sean Knapp
Most of our customers end up with many data sets spread across clouds or hybrid environments, and use the @ascend_io platform to not only move that data around, but perform extremely complex transformations along the way, all with unified lineage tracking & audit-ability.
Chris Preimesberger
We're getting down to the final minutes of our chat today. What a fountain of great info we unearthed!
Sam Lakkundi
Agreed, this is a great, lively discussion
Chris Preimesberger
Before we get to final takeaways, I want to let everybody know that this topic is to be continued on live radio tomorrow! https://dmradio.biz/archives/2958 Eric and Sean, with us today, will join me and Haoyuan Li of Alluxio on the air!
Dipti Borkar
A4. In fact, the interest in #DataOrchestration has exploded over the past 6 months. At @alluxio 's first #DataOrchestration Summit, we received an unbelievable 400 attendees. Interest is very high; adoption will follow, and then will come mainstream adoption.
Sean Knapp
@editingwhiz A5: Digital Transformation turned every company into a Software company, and every Software company must become a Data company. The past 10+ years have focused on how we store, process, or move data; more of it, and faster. The next 10 is about doing more with it!
Chris Preimesberger
Thanks, Sean. See you on the radio tomorrow!
Sean Knapp
Looking forward to it Chris!
Eric Kavanagh on #DMRadio
Ah, yes, shame on me for not cross-promoting! Join us tomorrow at 3 ET on #DMRadio as we tackle the innovation cycle of #DataOrchestration! http://bit.ly/2voEPRM We'll interview several guests and broadcast coast-to-coast reaching 1 million listeners. #eWeekChat
Dipti Borkar
A5. @editingwhiz It is very important for data platform leaders to understand what their core challenge is. There are many tools out there, and each is a little different. Once you understand the goal (e.g., bursting compute to cloud or better performance), pick the right tool.
Dipti Borkar
Look beyond the hype and solve the core problem.
Dipti Borkar
I'll also say one more thing. STOP MAKING MORE COPIES OF DATA :-) Copying data creates more problems than it solves. Sorry about the CAPS.
Christopher Merz
Amen to that! Meta-data copies only ;)