Premium Training Sessions

- Taught by World-Class Data Scientists -

Learn the latest data science concepts, tools and techniques from the best. Forge a connection with these rockstars from industry and academia, who are passionate about molding the next generation of data scientists.

Premium Training Includes

Form a working relationship with some of the world’s top data scientists for follow up questions and advice.

Additionally, your ticket includes access to 50+ talks and workshops.

High quality recordings of each session, exclusively available to premium training attendees.

Equivalent training at other conferences costs much more.

Professionally prepared learning materials, custom tailored to each course.

Opportunities to connect with other ambitious like-minded data scientists.


Premium Training Disclaimer: There will be a total of 7 premium training sessions. Sessions will take place on Friday, 9/23, and Saturday, 9/24. Sessions will run from 9am – 1pm and 2pm – 6pm. Certain sessions will run simultaneously, so pre-registration for sessions will be required and will take place closer to the event.

Bayesian Statistics in Python with Renowned Author & Professor Allen Downey

Bayesian statistical methods are powerful, versatile tools that provide guidance for decision-making under uncertainty. Data scientists who know how to apply Bayesian methods can craft practical solutions to important problems in science, engineering, and business. But for many people it’s hard to get started, and this workshop will help remove the roadblocks imposed by unintuitive mathematical representations. Allen Downey is one of the world’s foremost experts, and attending one of his workshops is a highly rewarding experience.

Why Allen Downey?

  • Allen Downey epitomizes the level of instructor we seek to conduct our premium workshops. He is a highly accomplished professor and the author of several widely read books. Not only is he one of the foremost experts in Bayesian statistics, but his easy manner also makes him a workshop favorite.
  • Allen Downey is a professor at Olin College of Engineering. He uses material from his book, Think Bayes (O’Reilly Media 2013). Prof. Downey has more than 15 years of teaching experience, and has offered successful workshops at PyCon and other professional conferences.
  • Prof. Downey has a Ph.D. in Computer Science from U.C. Berkeley, and an M.S. and a B.S. in Engineering from MIT. Before joining Olin College, he taught at Colby College and Wellesley College. In 2009-10 he was a Visiting Scientist at Google, Inc. He is the author of several books, including Think Python, Think Stats, and Think Complexity, published by O’Reilly Media.

Why Bayesian?

Bayesian statistical methods are powerful, versatile tools that provide guidance for decision-making under uncertainty. Data scientists who know how to apply Bayesian methods can craft practical solutions to important problems in science, engineering, and business. But for many people it’s hard to get started—in most books and classes the important ideas are hidden by unintuitive mathematical representations.

Abstract

  • Bayesian statistical methods are powerful, versatile tools that provide guidance for decision-making under uncertainty. Data scientists who know how to apply Bayesian methods can craft practical solutions to important problems in science, engineering, and business. But for many people it’s hard to get started—in most books and classes the important ideas are hidden by unintuitive mathematical representations.
  • Our workshops take a computational approach, so we avoid the math and get straight to the fundamental ideas. We focus on realistic problems with real-world complexity. And participants work hands-on, so they learn practical tools as well as the theory.
  • The workshops use Python, so participants should be familiar with basic Python, including classes and methods. No statistics background is required.
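To give a flavor of the computational approach described above, here is a minimal sketch of a single Bayesian update in plain Python. It is not code from the workshop repository: the function names, the two-bowl setup, and the cookie counts are assumptions chosen for illustration.

```python
# Illustrative sketch only: names and numbers are hypothetical.

def normalize(dist):
    """Scale the values of a dict so that they sum to 1."""
    total = sum(dist.values())
    return {hypo: p / total for hypo, p in dist.items()}


def update(prior, likelihood, data):
    """Multiply each prior probability by the likelihood of the data, then normalize."""
    unnormalized = {hypo: p * likelihood(data, hypo) for hypo, p in prior.items()}
    return normalize(unnormalized)


# Two bowls of cookies with different mixes (assumed counts).
BOWLS = {
    "Bowl 1": {"vanilla": 30, "chocolate": 10},
    "Bowl 2": {"vanilla": 20, "chocolate": 20},
}


def cookie_likelihood(flavor, bowl):
    """Probability of drawing the given flavor from the given bowl."""
    mix = BOWLS[bowl]
    return mix[flavor] / sum(mix.values())


# Before seeing any data, both bowls are equally likely.
prior = {"Bowl 1": 0.5, "Bowl 2": 0.5}

# We draw one cookie and it is vanilla; update our belief about which bowl it came from.
posterior = update(prior, cookie_likelihood, "vanilla")
print(posterior)
```

The whole calculation is a loop over hypotheses followed by a normalization, which is the sense in which a computational treatment can stand in for the usual algebra.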

Prerequisites

  • Participants should be familiar with Python, but no statistical background is required. Each participant should have a laptop with a Python environment that includes the SciPy stack (details below).
  • If possible, I encourage participants to read Chapter 1 of Think Bayes before the workshops. Electronic versions of the book are available from thinkbayes.com
  • Code for the workshops is in a repository on GitHub. If you have a Git client installed, you should be able to download it by running:
  • git clone https://github.com/AllenDowney/BayesMadeSimple.git
  • It should create a directory named BayesMadeSimple. Otherwise you can download the repository in this zip file:
  • https://github.com/AllenDowney/BayesMadeSimple/archive/master.zip
  • To do the exercises, you need Python 2.x or 3.x with NumPy, SciPy, and matplotlib. To test your environment:
  • cd BayesMadeSimple
  • python install_test.py
  • A window should appear with a graph of a normal distribution. If so, you have everything you need for the workshops.
  • Otherwise, I highly recommend installing Anaconda. By default it contains everything you need for the workshops; it is easy to install on Windows, Mac, and Linux; and because it does a user-level install, it will not interfere with other Python installations.
  • If you have any problems, please let me know before the workshops. We will not have time on the day to debug environments.
  • However, if you are not able to get your environment set up ahead of time, please come to the workshops anyway. At the beginning I will pair up participants to work together. As long as each pair has a working environment, we will be all set.
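If the test window does not appear, it can help to see which of the required packages is missing. The following is a minimal sketch of such a check, assuming only the package names listed above; it is not the repository's install_test.py, and the function name check_environment is hypothetical.

```python
import importlib

# Packages the prerequisites above say the exercises need.
REQUIRED = ["numpy", "scipy", "matplotlib"]


def check_environment(modules=REQUIRED):
    """Try to import each module; map its name to a version string, or None if missing."""
    report = {}
    for name in modules:
        try:
            module = importlib.import_module(name)
            report[name] = getattr(module, "__version__", "unknown version")
        except ImportError:
            report[name] = None
    return report


# Print one line per package so a missing one is easy to spot.
for name, version in check_environment().items():
    status = version if version is not None else "MISSING"
    print(f"{name}: {status}")
```

Any package reported as missing can then be installed on its own, or the whole stack can be obtained through Anaconda as recommended above.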

Intro to Machine Learning with Harvard Lecturer Rahul Dave

Machine learning is often billed as the part of artificial intelligence that actually works. As it quietly reshapes the world around us, it has become a must-have skill for anyone interested in predictive analytics. Rahul Dave is a master at breaking down this complex subject into quick and intuitive takeaways.

Why Rahul Dave?

  • When Rahul Dave gives one of his machine learning workshops at ODSC, there never fails to be a line out the door. In addition to being a lecturer at Harvard University, Rahul is a highly accomplished data scientist. He helped create the largest astrophysics time series database to name but one of his many accomplishments.
  • Rahul Dave is a lecturer at Harvard University and partner at LxPrior, a small Data Science consultancy. LxPrior offers its clients data analysis services as well as data science training. Rahul trained as an astrophysicist, doing research on dark energy, and worked at the University of Pennsylvania, NASA’s Astrophysics Data System, as well as at Harvard University. As a computational scientist, he has developed time series databases, semantic search engines, and techniques for classifying astronomical objects.
  • He was one of the people behind Harvard’s Data Science course CS109, and Harvard Library’s Data Science Training For Librarians course. This year he is teaching courses in computer science and stochastic methods to scientists and engineers.

Why Machine Learning?

  • Machine learning is often billed as the part of artificial intelligence that actually works. As it quietly reshapes the world around us, it has become a must-have skill for anyone interested in predictive analytics.
  • There is huge demand for machine learning specialists and the fundamental techniques you will learn will prove invaluable in your job and career. As more organizations become aware of the predictive breakthroughs machine learning can achieve, this skill set is sure to remain white hot.

Abstract

  • After this workshop, you will be able to learn other machine learning models independently.
  • The workshop will start with the basic concepts of machine learning, including modeling, model selection, overfitting, and validation, using the Python package scikit-learn.
  • The basic concepts also include complexity control and validation, all in the context of a regression model. This model will be used to introduce the scikit-learn estimator interface. The curriculum will then cover the core ideas behind machine learning through a classification model, introducing the concepts of maximizing likelihood and minimizing cost.
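The ideas listed above (a regression model, complexity control, and validation through the scikit-learn estimator interface) can be sketched roughly as follows. This is an illustrative example, not workshop material: the synthetic sine-wave data, the polynomial degrees, and the split size are all assumptions.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures

# Synthetic data (assumed for illustration): a noisy sine wave.
rng = np.random.RandomState(0)
x = np.sort(rng.uniform(0, 1, 60))
y = np.sin(2 * np.pi * x) + rng.normal(0, 0.2, size=60)
X = x.reshape(-1, 1)  # scikit-learn expects a 2-D feature matrix

# Hold out part of the data so each model is judged on points it has not seen.
X_train, X_val, y_train, y_val = train_test_split(
    X, y, test_size=0.3, random_state=0
)

# Polynomial degree is the complexity knob; every estimator exposes fit() and score().
scores = {}
for degree in (1, 3, 15):
    model = make_pipeline(PolynomialFeatures(degree), LinearRegression())
    model.fit(X_train, y_train)
    scores[degree] = (
        model.score(X_train, y_train),  # R^2 on the training set
        model.score(X_val, y_val),      # R^2 on the validation set
    )

for degree, (train_r2, val_r2) in scores.items():
    print(f"degree {degree:2d}: train R^2 = {train_r2:.3f}, validation R^2 = {val_r2:.3f}")
```

Comparing the two columns is the point of the exercise: a model whose training score keeps rising while its validation score stalls or falls is typically overfitting.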

Prerequisites

TBD

Programming with Data: Python and Pandas with Accomplished Data Scientist Daniel Gerlanc

The great debate around “can you be a data scientist without being a coder” still goes on. However, the consensus among employers is no, and one of the most popular responses to that question is Python. As a financial quant and highly sought-after data science consultant, Daniel Gerlanc has the experience to quickly make you a practical and productive data scientist with Python.

Why Daniel Gerlanc?

  • Daniel Gerlanc is a highly respected former hedge fund quant and much sought-after data scientist. He has a well-earned reputation for helping companies improve their modeling techniques and unblock critical issues. His workshop is a shortened version of one he has delivered internally to top hedge funds and Fortune 100 companies.
  • Daniel Gerlanc has worked as a data scientist for over 10 years. He spent 5 years as a quantitative analyst with two Boston hedge funds before starting Enplus Advisors Inc, a predictive analytics consultancy, in 2011. At Enplus, he works with clients in different industries to improve existing analytic processes and develop new ones. He has coauthored several open source R packages, published in peer-reviewed journals, and is active in local predictive analytics groups. He is a graduate of Williams College.

Why Python & Pandas?

  • The great debate around “can you be a data scientist without being a coder” still goes on. However, the consensus among employers is no, and one of the most popular responses to that question is Python.
  • Python is considered one of the best and most adaptable languages for data analysis. Relative to other languages it has a gentler learning curve, and with its plethora of libraries, including pandas and scikit-learn (machine learning), you can be doing meaningful data science within hours.

Abstract

  • Whether in R, MATLAB, Stata, or Python, modern data analysis, for many researchers, requires some kind of programming. The preponderance of tools and specialized languages for data analysis suggests that general purpose programming languages like C and Java do not readily address the needs of data scientists; something more is needed.
  • In this workshop, you will learn how to accelerate your data analyses using the Python language and Pandas, a library specifically designed for interactive data analysis. Pandas is a large library so we will focus on its core functionality, specifically, loading, filtering, grouping, and transforming data. Having completed this workshop, you will understand the fundamentals of Pandas, be aware of common pitfalls, and be ready to perform your own analyses.
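The four core operations named above (loading, filtering, grouping, and transforming) might look like this in Pandas. This is a small sketch rather than workshop material; the inline CSV and its column names are hypothetical.

```python
import io

import pandas as pd

# Hypothetical data, inlined so the example is self-contained.
CSV = """city,year,sales
Boston,2015,100
Boston,2016,120
Chicago,2015,80
Chicago,2016,60
"""

# Loading: read_csv accepts a file path or any file-like object.
df = pd.read_csv(io.StringIO(CSV))

# Filtering: a boolean mask keeps only the rows where the condition holds.
recent = df[df["year"] == 2016]

# Grouping: split the rows by city, then aggregate each group's sales.
totals = df.groupby("city")["sales"].sum()

# Transforming: compute a per-group value aligned with the original rows.
df["share_of_city"] = df["sales"] / df.groupby("city")["sales"].transform("sum")

print(recent)
print(totals)
print(df)
```

The difference between the last two steps is a common stumbling block: an aggregation returns one row per group, while transform returns one value per original row.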

Prerequisites

TBD

Intermediate Machine Learning with Harvard Lecturer Rahul Dave

There is huge demand for machine learning specialists and the fundamental techniques you will learn will prove invaluable in your job and career. As more organizations become aware of the predictive breakthroughs machine learning can achieve, this skill set is sure to remain white hot. If you already have some ML or predictive experience this workshop will help you advance your skills to a new level.

Why Rahul Dave?

  • When Rahul Dave gives one of his machine learning workshops at ODSC, there never fails to be a line out the door. In addition to being a lecturer at Harvard University, Rahul is a highly accomplished data scientist. He helped create the largest astrophysics time series database to name but one of his many accomplishments.
  • Rahul Dave is a lecturer at Harvard University and partner at LxPrior, a small Data Science consultancy. LxPrior offers its clients data analysis services as well as data science training. Rahul trained as an astrophysicist, doing research on dark energy, and worked at the University of Pennsylvania, NASA’s Astrophysics Data System, as well as at Harvard University. As a computational scientist, he has developed time series databases, semantic search engines, and techniques for classifying astronomical objects.
  • He was one of the people behind Harvard’s Data Science course CS109, and Harvard Library’s Data Science Training For Librarians course. This year he is teaching courses in computer science and stochastic methods to scientists and engineers.

Why Machine Learning?

  • Machine learning is often billed as the part of artificial intelligence that actually works. As it quietly reshapes the world around us, it has become a must-have skill for anyone interested in predictive analytics.
  • There is huge demand for machine learning specialists and the fundamental techniques you will learn will prove invaluable in your job and career. As more organizations become aware of the predictive breakthroughs machine learning can achieve, this skill set is sure to remain white hot.

Abstract

  • After this workshop you will be able to go out and work with a team on real-world machine learning problems, and to learn even more complex models and techniques, such as pipelining and decision theory, independently.
  • Building on Basic Machine Learning, in the intermediate course we will delve deeper into concepts of feature selection and the curse of dimensionality; ROC curves and model selection. In the workshop we will go over various kinds of classification models, regression, and the combining of models together into ensembles.
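As a rough illustration of one topic mentioned above, a ROC curve can be computed by sweeping a decision threshold over a classifier's scores. The sketch below uses plain Python and made-up labels and scores; the function names are hypothetical, and in practice a library routine would normally be used instead.

```python
def roc_points(labels, scores):
    """Return (false positive rate, true positive rate) pairs, one per threshold."""
    positives = sum(labels)
    negatives = len(labels) - positives
    points = [(0.0, 0.0)]  # a threshold above every score predicts no positives
    for threshold in sorted(set(scores), reverse=True):
        tp = sum(1 for label, s in zip(labels, scores) if s >= threshold and label == 1)
        fp = sum(1 for label, s in zip(labels, scores) if s >= threshold and label == 0)
        points.append((fp / negatives, tp / positives))
    return points


def auc(points):
    """Area under the curve by the trapezoid rule."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2
    return area


# Made-up example: 1 marks a positive case; scores are a classifier's confidence.
labels = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]

points = roc_points(labels, scores)
print(points)
print(auc(points))
```

An area of 1.0 would mean every positive case is scored above every negative one, while 0.5 is what random scoring would give, which is why the area is often used to compare candidate models.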

Prerequisites

TBD

Hadoop with Big Data Expert Marty Lurie

Hadoop is arguably the most important open source platform in big data. Knowing your way around this ubiquitous platform is an essential skill for anyone who wants to be taken seriously as a big data practitioner. Marty Lurie has a highly successful track record as a big data specialist and is well positioned to provide you with not only the fundamentals but also key insight in using Hadoop for big data science.

Why Marty Lurie?

  • Marty Lurie is second to none when it comes to getting you up to speed on arguably big data’s most important platform – Hadoop.
  • Marty is a highly experienced and accomplished big data specialist, and he is well placed to provide not only the fundamentals but also key insight with this hands-on Hadoop workshop.

Why Hadoop?

  • Hadoop is arguably the most important open source platform in big data. Knowing your way around this ubiquitous platform is an essential skill for anyone who wants to be taken seriously as a big data practitioner.
  • Learning this platform is sure to open many more career opportunities in addition to helping you run predictive analytics on large volumes of data. Hadoop continues to evolve in line with the massive demand for predictive analytics at scale, thus ensuring this skillset will be in demand for years to come.

Abstract

  • This session will introduce you to several use cases for Hadoop. Through these use cases you will understand the capabilities of the major Hadoop projects. Think of it like "speed-dating" with the different components. Newcomers will understand the range of capabilities within Hadoop. Those with experience will gain exposure to projects they haven't worked with before. Prerequisite skills: ability to enter a command at a bash-shell prompt, or watch the person next to you enter the command.
  • Which use cases? Political analysis (was Abe Lincoln a "team-player"?), NYSE stock ticker analysis including using an analytics application, Labor census analysis and data warehouse offloading, web site interaction profiling at www.flra.gov, next-best-offer with machine learning, mining social media, and real time geolocation.
  • Which Hadoop projects? HDFS, Streaming MapReduce, Hive, Impala, Spark, Search, Sqoop, Flume, Kafka, Oozie, HBase, and MLlib. You will see sample code from each project. We will also run some industry standard Hadoop benchmarks.
  • The workshop is based on my tutorial, but with a number of enhancements:
  • http://www.ibm.com/developerworks/data/library/techarticle/dm-1209hadoopbigdata/
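For readers new to the Streaming MapReduce project listed above, the sketch below imitates its data flow locally in Python: a mapper emits tab-separated key/value lines, the lines are sorted by key, and a reducer aggregates each run of identical keys. It is not taken from the tutorial; the word-count task and the input text are assumptions, and a real job would read standard input and write standard output on a cluster.

```python
from itertools import groupby


def mapper(lines):
    """Emit one 'word<TAB>1' record per word, as a streaming mapper would print."""
    for line in lines:
        for word in line.lower().split():
            yield f"{word}\t1"


def reducer(sorted_records):
    """Sum the counts for each run of identical keys in the sorted records."""
    keyed = (record.split("\t") for record in sorted_records)
    for word, group in groupby(keyed, key=lambda pair: pair[0]):
        yield word, sum(int(count) for _, count in group)


# Hypothetical input; on a cluster this would be file contents stored in HDFS.
text = [
    "four score and seven years ago",
    "our fathers brought forth",
    "four score",
]

# Sorting stands in for the shuffle phase that runs between map and reduce.
shuffled = sorted(mapper(text))
counts = dict(reducer(shuffled))
print(counts)
```

Because the reducer only ever sees its input sorted by key, it can process one key at a time without holding the whole dataset in memory, which is what lets the same pattern scale across many machines.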

Prerequisites

TBD

Pricing tiers: Super Early Bird (Sold Out!), Early Bird (Sold Out!), Regular (Sold Out!), Door Price

Student

*** Must show valid student ID during registration ***

  • One Day Pass: $118
  • Three Day Pass: $509
  • Premium Training: $764

Attendee

  • One Day Pass: $139
  • Three Day Pass: $599
  • Premium Training: $899

Group

*** Group discounts are available for parties of 5 or more. ***

  • Super Early Bird, Early Bird, Regular, and Door Price: Sold Out!

Features listed for each pass

  • Access to Talks
  • Access to Workshops
  • Access to Exhibition Floor
  • Access to Career Fair
  • Access to Recorded Talks
  • Premium Training Recordings
  • Access to Networking Events
  • Premium Training (More info: http://bit.ly/BDFPT1)