datasciencedevops

DevOps in Data Science
Discuss how developers are bringing DevOps practices into the data-science pipeline.
jameskobielus
http://www.via-cc.at...

jameskobielus
Anybody have any thoughts on this?
Peter Burris
devops is the translation of "lean" into the IT world.
jameskobielus
@plburris Really? I consider "DevOps" as the industrialization of the app development/release pipeline.
Peter Burris
Lots of ways to "industrialize" appdev/release pipeline.
Kirk Borne
DevOps is defined by experts in the field, which isn't me, but I see it as Agile, fail-fast, tight coupling between dev and ops teams, focused on rapid POVs (Proofs of Value) to meet end-user requirements
George Gilbert
how about the intersection of application development, ITOps, and programmable infrastructure
John Furrier
practice of continuous innovation and improvement where each iteration builds on the previous one to create world-leading technology
Peter Burris
but only a few focus on "reduce waste according to customer value."
John Furrier
#devops is modern software delivery methodology
jameskobielus
@KirkDBorne Tight coupling of Dev & Ops, yes. With a strong emphasis on seamless automated handoff of production-ready code.
John Furrier
#devops is providing the speed, quality and innovation in software delivery that is demanded by the business
John Furrier
#DevOps is the union of people, process and tools to enable continuous delivery of value to end users
jameskobielus
@ggilbert41 Programmable infrastructure? Perhaps. Policy-driven release governance over the lifecycle.
jameskobielus
@furrier I'm not sure that continual refinement of the app is central to DevOps. It's more a matter of continuous automation in the dev, test, deployment, and assessment pipeline.
Kirk Borne
Shout out to my @BoozAllen #DevOps colleagues who published the Enterprise DevOps Playbook: https://www.boozalle...
jameskobielus
Next question coming. Refresh your browsers.
jameskobielus
@KirkDBorne That's a good one. Thanks for sharing it!
jameskobielus
@KirkDBorne That's a good one. It's about bringing industrial-grade automation, predictability, and speed to the Dev-to-Ops handoff.
David Floyer
It's about bringing the world of Data Science into the world of integration with systems of record and the requirements of high availability, regression testing, provenance, reproducibility, & predictable performance: a collision of worlds!
John Furrier
@dfloyer Yes David Floyer is here
jameskobielus
http://www.via-cc.at...

John Furrier
This is a loaded question, but I'd say it depends on where the conversation is started: in the organization or at the C-level
John Furrier
it doesn't matter where the initiative starts but where it ends. Adoption should yield results
Kirk Borne
Okay, I am going to cheat. Here are the results of a recent survey on DevOps adoption: https://betanews.com...
John Furrier
I still think the chasm is being crossed as we speak, and #devops pioneers still view devops as devops; the mainstream calls it #cloudOps
jameskobielus
@KirkDBorne Great article. Thanks for sharing, Kirk.
John Furrier
A company putting out a manifesto doesn't make it #devops; real agility and proof points win the day
John Furrier
35% of some projects proves my thought on chasm crossing: #cloudops is here, which is #devops made easy. More automation required
Peter Burris
Like most complex, social changes: It's selective.
jameskobielus
I have not seen any research pointing to DevOps adoption rates in enterprise data science, but the research cited here gives numbers on DBA adoption of DevOps, which is interesting.
jameskobielus
Question #4 up above.
John Furrier
#devops challenge is scaling it beyond the small number of teams and projects
Peter Burris
Where Agile is the dev process and the ops process is less driven by hardware (i.e., cloud), you're more likely to find devops.
John Furrier
main comment from #devops pros is: "devops is never finished"
John Furrier
I think incentives across siloed executive leadership are the largest inhibitors to the DevOps transformation; once executive incentives motivate collaboration over siloed transformations, then you win
Kirk Borne
There are various definitions of DataOps, but I prefer this one: #DevOps for #DataScience = #DataOps (IMHO). It's about #DataProduct design, development, deployment lifecycle.
John Furrier
This might also be a good place to talk about value stream automation
David Floyer
Only when development perceives that Ops is useful in getting code out faster.
jameskobielus
@furrier Another key challenge is scaling to handle the growing range of artifacts--code snippets, statistical models, metadata, schemas, etc.--in a complex app-dev pipeline.
Kirk Borne
The rate of change (actually, rate of acceleration) in digital transformation is really high and is jerking biz around (including tipping points and future shock): https://www.amazon.c... by @csurdak
Peter Burris
@KirkDBorne We equate digibiz = differential use of data. As more firms institutionalize work around data assets, more digibiz -- which amplifies the role of data assets.
Peter Burris
@KirkDBorne Hence, the acceleration.
jameskobielus
http://www.via-cc.at...

Peter Burris
I know it sounds recursive, but devops is going to need a LOT of data science-like stuff to reach full potential.
John Furrier
#devops enablers are integrated "open toolchains"
John Furrier
#devops enabler #2: use of analytics and cognitive/deep learning
John Furrier
#devops enabler #3: microservices and container adoption
jameskobielus
I'll jump in here: source-control repository, data lake, and integrated collaboration environment that spans the entire pipeline.
John Furrier
Just did a #crowdchatstorm in a thread. Take that, @pmarca
jameskobielus
@furrier The integrated toolchain needs to be embedded within the integrated DevOps collaboration environment that I alluded to.
jameskobielus
Question 6 coming.
Kirk Borne
#Microservices come to mind. Also, APIs... here are just a few: https://twitter.com/...
Kirk Borne
#Containers have really had a huge positive impact on productivity for #MachineLearning #DataScience teams in my organization
Kirk Borne
#DataLakes done right will definitely be a big plus in breaking down data silos, enabling rapid innovation, and zero-day discovery from new data sources: https://twitter.com/...
Peter Burris
@KirkDBorne How? I can see why it might, but can you offer specifics?
jameskobielus
@plburris Right. Automated ML-driven code-gen tools are coming along fast and furious. Microsoft, for example, is making great strides to use ML for rapid app development and iteration.
Kirk Borne
@plburris Schema-on-read is fast. Schema-on-write requires months of data modeling, design, development, testing,... That is DevOps in the database-build phase, yes, but I am not seeing too much of it.
Kirk Borne
See my article "Mining the #BigData Wheel" that mentions fast schema-on-read day-zero analytics here: https://mapr.com/blo... at @MapR #DataScience
jameskobielus
@KirkDBorne You are quite right. Precious little DevOps drives the database modeling process.
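
A rough sketch of the schema-on-read versus schema-on-write contrast discussed in this thread, in Python. Everything here is an illustrative assumption rather than anything a participant posted: the sample records, the CLICK_SCHEMA mapping, and the helper names read_with_schema and write_with_schema are all hypothetical. The idea is only that schema-on-read stores raw records untouched and applies types when a question is asked, while schema-on-write refuses records that do not match a model agreed on beforehand.

```python
import json

# Hypothetical raw records landed in a data lake as-is, with no upfront modeling.
RAW_LINES = [
    '{"user": "a", "clicks": "3", "ts": "2017-06-01"}',
    '{"user": "b", "clicks": 5, "device": "mobile"}',
    'not json at all',
]

# Hypothetical schema: the fields and types one particular analysis needs.
CLICK_SCHEMA = {"user": str, "clicks": int}


def read_with_schema(lines, schema):
    """Schema-on-read: parse raw lines and coerce fields at query time."""
    for line in lines:
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip malformed rows instead of failing the whole read
        if not isinstance(record, dict):
            continue
        row = {}
        for field, cast in schema.items():
            value = record.get(field)
            try:
                row[field] = cast(value) if value is not None else None
            except (TypeError, ValueError):
                row[field] = None  # value present but not coercible
        yield row


def write_with_schema(record, schema):
    """Schema-on-write: reject a record up front unless it matches the schema."""
    for field, expected in schema.items():
        if not isinstance(record.get(field), expected):
            raise ValueError(f"field {field!r} must be {expected.__name__}")
    return record


# Reading tolerates extra fields, string-typed numbers, and junk lines.
rows = list(read_with_schema(RAW_LINES, CLICK_SCHEMA))
```

With the sample data, the read path yields two usable rows on day zero, whereas write_with_schema would reject the first record because its clicks value arrives as a string. That strictness is the part that tends to need modeling and testing effort before any data is loaded.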