RNFutureTense

Data Science 101
Discuss basics of data science & predictive modeling tools during #CSCTechCom
CSC ResearchNetwork
What do you all think about the CrowdChat? Was this helpful/useful? Want to see more of these types of interactions with CSC's ResearchNetwork?
Max Hemingway
the crowdchat is very useful. Great conversation and links shared. Yes to more opportunities to interact.
Kyle Zellman
I loved it. Great way to direct others to materials that let them look deeper into concepts of interest to them.
craig newsome
Great way to read what everyone else is thinking
John Furrier
excellent group of high quality people and having LinkedIn users combined with Twitter made it feel unique. Thanks for hosting this
CSC ResearchNetwork
Why would you want to detect potentially disruptive technologies? How can you use disruptions from the past to predict future disruptions? #csctechcom
Jerry Overton
You can get in front of change and lead it.
Theyaa Matti
You can prepare your workforce for the future disruptions so you can get ahead of competitors.
John Furrier
previous disruptions can show the kinds of orders of magnitude impact and that it's hard to know where it will go until you get something out there
Max Hemingway
some trends repeat via improvements on the original or evolution of things. Results of previous disruptions can help predict what could happen next.
Alan Janke
the problem with using past disruptions to predict future ones is that disruptions are accelerating, both in number and in speed of adoption (or disruption)...
Sean Kery
The advent of big data is a current disrupter, and social media is the same. These are changing the future in ways not thought of 25 years ago. They will fade into a new stable "normal" with time. What's next?
James G Hayes
#CSCTechCom Question - why use ipython vs. SOLR or RapidMiner for semantic analysis? What is the differentiator?
Jerry Overton
To be honest, much of the decision is based on what the technologist is familiar with.
Theyaa Matti
One benefit of using a programming language rather than a package like RapidMiner is knowing the underlying processes rather than depending on preconfigured ones.
Theyaa Matti
The other benefit is that you can run Python code from the command line, or from anywhere, rather than installing a packaged product like RapidMiner.
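To illustrate the point about running Python code from the command line, here is a minimal sketch. It only counts frequent terms, which is far simpler than real semantic analysis, and the function names and built-in sample text are assumptions made up for illustration.

```python
import argparse
import re
from collections import Counter

# Made-up sample text, used only when no file is given on the command line.
SAMPLE_TEXT = "Data science uses data. Predictive modeling uses data and models."


def top_terms(text, n=5):
    """Return the n most frequent lowercase word tokens in text."""
    words = re.findall(r"[a-z']+", text.lower())
    return Counter(words).most_common(n)


def main(argv=None):
    parser = argparse.ArgumentParser(
        description="Print the most frequent terms in a text file."
    )
    parser.add_argument(
        "path", nargs="?",
        help="text file to analyze; a built-in sample is used if omitted",
    )
    parser.add_argument("-n", type=int, default=5,
                        help="number of terms to print")
    args = parser.parse_args(argv)

    if args.path is None:
        text = SAMPLE_TEXT
    else:
        with open(args.path, encoding="utf-8") as handle:
            text = handle.read()

    for term, count in top_terms(text, args.n):
        print(f"{count}\t{term}")


if __name__ == "__main__":
    main()
```

Saved under a hypothetical name such as top_terms.py, it could be run as `python top_terms.py notes.txt -n 10`, where notes.txt stands for any text file.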
James G Hayes
Good point. Just as an fyi - SOLR has a documented API in multiple programming languages as well (e.g., SOLRJ for Java)
Kyle Zellman
you can also use various tools on the same project. For example, I might use python to gather and prepare the data and rapidminer to run the model.
James G Hayes
The challenge with ipython seems to be just to get it configured to work. Too many pieces.
Theyaa Matti
Thank you James for introducing SOLR. Through my experience I always choose the right tool for the analysis, not the right analysis for the tool.
Theyaa Matti
Some packages are heavy on processing and memory. So, the best solution would be to select the most appropriate tool that keeps you the most productive.
Michael Doane
All of the above. RapidMiner has an extension that takes full advantage of R. You can code in R and then incorporate the script within a RapidMiner process. All R libraries are available and fully supported in RM Studio. Best of both worlds.
Kyle Zellman
awesome point Michael. Extensions allow you to combine capability with comfort in your choice of tools.
Theyaa Matti
Michael, are those extensions available for the community edition?
Michael Doane
yes. Available in the RapidMiner Marketplace.
Theyaa Matti
Thank you Michael, here is the link to the R extension http://rapidminer.co...
Giuseppe Taibi
Hello everybody. My name is Giuseppe Taibi and I am the Chief Product Officer of RapidMiner.
Sue Cronizer
Thanks so much for joining the crowdchat Giuseppe!
Giuseppe Taibi
Happy to be here!
Doug Austin
Giuseppe check your PMs here please
Jerry Overton
Hey Giuseppe, can you tell us where we can get info on the RapidMiner roadmap? What's on the horizon for RapidMiner?
Fitz Stewart
Welcome Giuseppe
Giuseppe Taibi
@JerryAOverton just sent you a Private Message
Max Hemingway
Great Data Science conversations over lunch #CSCtechcom
John Furrier
what were the topics
Max Hemingway
Lunch discussions on the use of Data Science and Social Media, and also a great discussion on User Interfaces #CSCTechCom
CSC ResearchNetwork
Post some of the thoughts, discussions here so that everyone can see 'em!
Jerry Overton
At my lunch table, we talked about striking the right balance between discussing concepts and working through demonstrations.
John Furrier
@JerryAOverton the topic seems to be around experimentation of developing and implementing "data centric" programming
Jerry Overton
Short video on how to use the RapidMiner cross validation operator: https://www.youtube....
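For readers who want the idea behind cross validation outside the tool, here is a minimal sketch in plain Python. It is not the RapidMiner operator itself: the majority-class "model", the function names, and the toy labels are all assumptions chosen to keep the example short. The data is split into k folds, each fold is held out once for testing while the rest is used for training, and the per-fold scores are averaged.

```python
import random
from collections import Counter


def k_fold_indices(n_rows, k, seed=0):
    """Shuffle row indices and split them into k nearly equal folds."""
    indices = list(range(n_rows))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


def cross_validate(labels, k=5, seed=0):
    """Return the per-fold accuracy of a majority-class baseline model."""
    folds = k_fold_indices(len(labels), k, seed)
    accuracies = []
    for held_out in folds:
        held_out_set = set(held_out)
        train = [labels[i] for i in range(len(labels))
                 if i not in held_out_set]
        # "Training" the baseline: remember the most common training label.
        majority = Counter(train).most_common(1)[0][0]
        # "Testing": count held-out rows whose label matches the prediction.
        correct = sum(1 for i in held_out if labels[i] == majority)
        accuracies.append(correct / len(held_out))
    return accuracies


if __name__ == "__main__":
    toy_labels = ["yes"] * 14 + ["no"] * 6  # made-up labels for illustration
    scores = cross_validate(toy_labels, k=5)
    print("fold accuracies:", scores)
    print("mean accuracy:", sum(scores) / len(scores))
```

A real model would replace the majority-class step; the fold handling stays the same.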
Max Hemingway
Are there any other good places to look for relevant RapidMiner training sets?
Giuseppe Taibi
Check out the data sets at https://sites.google.... They accompany the book Data Mining for the Masses, available in hardcopy and PDF from http://rapidminer.co...
Data Mining for the Masses
Michael Doane
Thanks Jerry. Searching YouTube produces some amazing video tutorials for RapidMiner that are produced and posted by the RapidMiner community
Chris Baker
Open data sets
http://snap.stanford...
Note especially the Networks with ground-truth for testing community detection algorithms:
http://snap.stanford...
John Furrier
Open data is the big trend I see sweeping the market over the next 24 months. Data virtualization will be an enabler of this.
Matt
Can we get the R code sample (Netflix) and the data sets? Are they on C3?
Theyaa Matti
Here is the link to the R sample code https://c3.csc.com/d...
Theyaa Matti
For more information and setup instructions, please visit the FutureTense Group https://c3.csc.com/g... under C3 and look under documents
Matt
@TheyaaMatti Thanks, but wanted to get hold of the Netflix one
Kyle Zellman
you can also grab google trends data all the way back to 2004 for anything at google.com/trends...
Google Trends
Explore Google trending search topics with Google Trends.
Kyle Zellman
the gear at the top will allow you to dl as csv. You can process the data in excel then.
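As an alternative to a spreadsheet, a downloaded CSV can also be processed with a few lines of Python. The sketch below assumes a simple two-column layout (week, interest) and uses made-up numbers; a real Google Trends export may have a different header or extra lines, so the column names here are assumptions.

```python
import csv
import io

# Made-up sample in an ASSUMED two-column layout; a real export may differ.
SAMPLE_CSV = """week,interest
2004-01-04,10
2004-01-11,12
2004-01-18,15
2004-01-25,11
"""


def load_interest(csv_text):
    """Parse CSV text into a list of (week, interest) pairs."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [(row["week"], int(row["interest"])) for row in reader]


def moving_average(values, window):
    """Return the simple moving average over each full window of values."""
    return [
        sum(values[i:i + window]) / window
        for i in range(len(values) - window + 1)
    ]


if __name__ == "__main__":
    rows = load_interest(SAMPLE_CSV)
    interest = [value for _, value in rows]
    peak_week, peak_value = max(rows, key=lambda row: row[1])
    print("peak week:", peak_week, "interest:", peak_value)
    print("2-week moving average:", moving_average(interest, 2))
```

To use a downloaded file instead of the sample, read its contents with open(...) and pass the text to load_interest, adjusting the column names to match the actual header.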
Jerry Overton
What would be awesome is if we could actually simulate and generate disruptions. Any ideas on how to do that?
Max Hemingway
firstly need a good hypothesis of what you are trying to achieve. There are many areas to generate disruptions.
Jerry Overton
Agreed. Guess I should be taking my own "doing data science" advice -- always start with a hypothesis.
Max Hemingway
Use examples from ServiceMesh as a disruptive tech to look at a possible model.
John Furrier
I think disruption is about experimentation with an eye towards agile/flexible iteration. Data is the key to measuring the efforts
James G Hayes
RapidMiner demo went too fast... not useful for those of us trying to replicate the model in the tool.
Steven Melanson
Come to the back of the room and I'll give you a hand (now or after the talk)
John Furrier
was there a screenshot of the demo? I missed it
Jerry Overton
Going through the demo again. Let us know if there is something you want more detail about.
Michael Doane
Anyone can feel free to reach out to me as well and I can help with any questions after the event.