BigDataNYC

Hadoop for the Enterprise
Is Hadoop Enterprise Ready?
#BigDataNYC: Exclusive event at the Hilton with @theCUBE covering Hadoop World with the people behind big data tech
Michael Hiskey
@MapR has done the most to think in terms of "Enterprise Readiness", but it seems the biggest contingent of the #Hadoop community is primarily locked in on "openness" as the most important principle... the opposite of the @MapR approach.
Jeff Kelly
in fairness, MapR does not open source its code but makes its platform API-compatible - this gets back to the question about the better model for Hadoop
Jeff Kelly and actually MapR is supporting open source Apache Drill
Jeff Frick And I presume most enterprises care about getting their problems solved with a viable solution, and they're willing to pay for support, training, and other traditional services
Jim Walker
au contraire... the open community has a HUGE focus on enterprise readiness. they just do all their work in the open. open is a development vehicle as well.
John Furrier can the community get there fast enough, and are people going "lone wolf" with their own stuff too soon? CIOs want to know
Jim Walker the fastest path to innovation is the open community. Hive 0.12 delivered 420 JIRA tickets (4 mos). HBase 0.96, over 2,000 (~12 mos). HUGE
John Furrier
MapR is doing some interesting things for enterprise-ready Hadoop building on the open source, and @hortonworks is delivering pure open source
John Furrier
I don't see MapR as lock-in... customers want functionality and agility - lock-in is contextual to the solution and the ability to change, imho
Jack Norris
MapR includes all of the open source components plus architectural innovations.
Jack Norris
MapR believes point-in-time, consistent snapshots are required for data protection, and mirroring (WAN replication, not data copy) is required for DR.
Jeff Kelly WAN replication def important as enterprises expand deployments - @WANdisco doing interesting things here, as well as @MapR
Dave Vellante who does point-in-time consistent snapshots - anyone other than MapR?
Jack Norris @dvellante, yes - EMC, NetApp, anyone serious about data protection.
Dr.Cos @WANdisco isn't doing snapshotting; our technology is based on coordinating operations
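[Editor's note: as background for the snapshot thread above, stock HDFS gained directory-level, read-only snapshots around Hadoop 2.1, which is the baseline the vendors here are comparing themselves against. A rough sketch of that admin workflow (the path and snapshot names are hypothetical, and it requires a running cluster):

```shell
# One-time admin step: mark a directory as snapshottable
hdfs dfsadmin -allowSnapshot /data/warehouse

# Take a named point-in-time snapshot; blocks are shared, not copied
hdfs dfs -createSnapshot /data/warehouse before-etl-run

# Snapshots appear under the hidden, read-only .snapshot directory
hdfs dfs -ls /data/warehouse/.snapshot/before-etl-run

# Recover a file by copying it back out of the snapshot
hdfs dfs -cp /data/warehouse/.snapshot/before-etl-run/part-00000 /data/warehouse/

# Remove the snapshot when it is no longer needed
hdfs dfs -deleteSnapshot /data/warehouse before-etl-run
```

The debate in the thread is about what this does not cover - cluster-wide consistency and WAN replication for DR - which is where the vendor offerings differ.]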
Jeff Kelly
#Hadoop is ready for some enterprises, but not all - maybe the better question: is YOUR enterprise ready for Hadoop?
Michael Hiskey
That's a great way to put it, Jeff.
Jeff Frick
And if not, What's holding it up? What's the gate? What has to happen to open the door?
Jeff Kelly skills shortage, robust back-up and recovery capabilities, easier-to-use admin tooling - based on what I hear from the @wikibon community
Jack Norris
@MapR focus is on production success; we have a large number of proof points of traditional enterprises using Hadoop successfully - customers with 1000s of nodes and dozens of use cases.
Jack Norris
Great list Jeff. Most IT organizations don't want to become Hadoop experts, they want Hadoop to look and behave like their current enterprise tools and apps.
Michael Hiskey
Enterprises are looking for a slow transition to #Hadoop as the de facto "Data Lake" (to use the @Hortonworks term); this will enable them to keep the current stable of #BI tools up and running with minimal disruption to business users
Dave Vellante
@furrier says Data Lake is 'too small' - he uses 'Data Ocean' :-)
Michael Hiskey
How about a "data reservoir" ?
Jeff Frick
Funny how we've moved from the land (warehouse) to sea #StrataConf #Hadoop #DataOcean
Jack Norris
The question for Hadoop distributions in the data lake use case is: what is the time frame for the data lake? For an organization to have a long-term data lake, you need Hadoop to provide full HA, data protection, and DR features
Dave Vellante
Agree or disagree? Hadoop is absolutely ready for enterprise dev and small projects but not for large scale, big ETL-type day-after-day production
Michael Hiskey
I've spoken to clients that have moved 1000s of ETL jobs into #Hadoop, are doing day-to-day production there, and are quite happy with it.
Dave Vellante so you feel hadoop has adequate automation, operational tools, reference architectures and maturity to be called "enterprise ready?"
Jeff Kelly
disagree... but you need to have the skills in-house to keep Hadoop up and humming. Not many enterprises do yet.
Michael Hiskey
Not #Hadoop by itself - but the ecosystem conquers all. Combining an effective SQL-on-Hadoop tool (like @Kognitio) as a platform between HDFS and business apps enables easy integration
Michael Hiskey
Difference - enterprise ready for test? Yes. Will everyone move their database of record to #Hadoop tomorrow? No.
Dr.Cos
No way it is fully ready yet. Still requires a lot of in-house expertise for tweaking of the components, etc.
Jack Norris
Definitely agree, and @MapR has the customers to prove this, both in terms of large-scale ETL and data warehouse offload and large-scale operations (petabytes of data) where Hadoop is the system of record
Dave Vellante
OK... all the big enterprise whales, when they entered the Hadoop market, said "our strategy is we're going to make Hadoop enterprise ready" - what about Hadoop needs enterprise readying?
Michael Hiskey
That was the supposition, but did they get off track and start to make it about a business instead? The market has become so much more of a traditional commercial battle between vendors, with the new twist of who will get bought
Dave Vellante I remember Amr at Cloudera telling me "we know something about making hadoop ent ready"
Dave Vellante The 451 put out some research saying administration tooling and performance top the gap list, followed by reliability, SQL support, and backup & recovery
Dave Vellante also that, but development tools and authentication and access control are not far behind... of course it's all behind a paywall so I can't see the full results
Jeff Kelly
the big things I hear from the Wikibon community are continuous availability, better security controls and easier-to-use management/monitoring capabilities.
Jeff Frick
watching LinkedIn, still seems like a pretty severe skills shortage
Dave Vellante there's no question this is the case (skills gap) - hadoop is a complicated situation for many / most shops
Jeff Kelly
I recall @merv saying at #hadoopsummit that security was the biggest obstacle to enterprise adoption http://www.youtube.com/watch?v=FeWaeKKa4n4
Michael Hiskey
Business, within the enterprise, doesn't care if data sits on #Hadoop or a toaster oven, they just want to do something new and interesting - generally along the lines of advanced/predictive #Analytics - to learn something NEW
Jeff Frick
Expectations for next week? Moving the ball in yards, or chunks?
Jeff Kelly
#Hadoop 2.0 and YARN are moving the ball down the field, to borrow a football analogy - lots of work left to do, but YARN provides the foundation for making Hadoop a multi-purpose #BigData platform
Jim Walker
YARN is fairly mature, hence the community's GA moniker. i'm interested in the next wave of innovation - processing models that can be built IN hadoop now.
Jeff Kelly what do you think are some of the more innovative apps we're going to see being built in the near term thanks to YARN?
Jim Walker when existing ISVs pick up on it and start using YARN as an OS - lots of what we do today.
Jeff Kelly by more work to be done, I'm referring to adding streaming analytics and moving Hive off MapReduce - both coming, but YARN today is still a major development.
Stuart Miniman
#Hadoop isn't typically put on an enterprise SAN - so who owns the infrastructure for big data applications?
Jeff Kelly
good question - storage guys may not want to touch this.
Ercan Yilmaz
organizational issues are real; they cause friction and slow adoption
Ercan Yilmaz
the exploratory nature of big data analytics is also something new and needs to be addressed by traditional business case/governance processes
John Furrier
I like this question from Stu - if it's distributed and virtual, does it matter, or will compliance reqs change the game here?
Michael Hiskey
#Hadoop should be that data center of the universe; it needs tools around it to get back the ACID compliance the #EDW has had forever - not all of this will be open & free, but the skills to manage it need to be commodity-available
Michael Hiskey
A big benefit to the enterprise for #Hadoop is that it helps them avoid vendor lock-in and getting squeezed come software license renewal time...
Jeff Kelly
yes, as long as you need highly trained, expensive Hadoop admins there will be a limit to the enterprises that want to adopt Hadoop
Dave Vellante
what % of hadoop workloads need ACID compliance?
Ercan Yilmaz
neither HDFS nor HBase is a replacement for the EDW, hence ACID compliance is probably not a requirement
Jeff Kelly not today, but isn't that part of the long-term vision - for Hadoop to serve as the data lake with all data services, incl. EDW, on top?
John Furrier
Questions on #hadoop in the enterprise: on a scale of 1-10 (10 being fully baked), how far along is hadoop on the compliance features needed to be fully enterprise-compatible? Areas that are good, bad, and ugly?
Jack Norris
We have customers today that rank it as a 10. Ad platforms running their business on Hadoop (billions of events per day), banks performing fraud detection, retailers using it for online recommendations - the list is long.