You can sign up for our Sunday Intensive beginner-level R classes on the
NYC Data Science Academy meetup page, or email vivian.zhang@supstat.com for more info.

Brief: The course (which meets on five Sundays) starts from the basics,
introducing the building blocks used for programming in R and building
intuition for writing clean and robust code. We will then move on to
data analysis, applications of statistical techniques, and graphing.

Dates: Nov 10th, Nov 17th, Nov 24th, Dec 1st, Dec 8th (five Sundays)

Time: 12:00pm to 4:00pm

Instructors:
Scott Kostyshak (Data Scientist @ Supstat Inc, 5th-year Econ PhD at Princeton Univ.)
Vivian Zhang (CTO @ Supstat Inc, Master's degrees in Computer Science and Statistics)

Cost:
Individual: $110/class
For group (5 or more people) and enterprise pricing, please email vivian.zhang@supstat.com

Course Outline:

(Content may be adjusted depending on how the classes progress.)

Basics (6 hours)
Abstract: This unit covers the fundamentals. Students will learn the characteristics of R, where to find resources, and the basics of programming in R.
Case and Exercise: Use R to solve selected Project Euler problems.

* How to learn R
* How to get help
* R language resources and books
* RStudio
* Packages
* Workspace
* Custom Startup Items
* Batch Mode
* Data Objects
* Custom Functions
* Control statements
* Vectorized operations

Data (2 hours)

Abstract: This unit explains the various ways to read data into R. Participants will learn the basic web knowledge needed for web crawling, connect to a database and retrieve data with SQL statements, and read data from local files such as Excel spreadsheets.
Case studies and exercises: Crawl data from the Douban website and write a custom function.

* Web data capture
* API data source
* Connect to the database
* Local files
* Other data sources
* Data Export

Data Collation (3 hours)

Abstract: This unit shows how to use R for all kinds of data conversion and manipulation, especially string processing.
Case studies and exercises: Find a QQ group (QQ is the most widely used instant messaging tool), then discuss research options based on its text features.

* Data sorting
* Merge Data
* Summarizing data
* Reshaping data
* Taking a subset of data
* String manipulation
* Date operations

Data Visualization (3 hours)

Abstract: This unit covers two advanced graphics packages, lattice and ggplot2, and the various visualization methods for exploring data.
Case and Exercise: Use graphics to describe the movie, text, and other data from the earlier units.

* Histogram
* Dot plot
* Bar chart
* Line chart
* Pie chart
* Box plot
* Scatter plot
* Matrix-related plots
* Map

Elementary Statistical Methods (5 hours)
Abstract: This unit introduces statistical analysis and regression analysis in R. Students will learn the meaning and role of basic statistical models.
Case and Exercise: Use regression to predict commodity prices; simulate the winner of a casino game.

* Descriptive Statistics
* Statistical Distributions
* Frequency and contingency tables
* Correlation
* T test
* Non-parametric statistics
* Linear Regression
* Regression Diagnostics
* Robust Regression
* Nonlinear regression
* Principal Component Analysis
* Logistic Regression
* Statistical Simulation

Introductory Data Mining (Selected Topics)

Abstract: This unit explains how to use R packages and functions for data mining. Students will learn two kinds of mining methods: supervised learning and unsupervised learning.
Case and Exercise: Use R to take part in a Kaggle data mining competition.

* General mining process
* The rattle package
* Hierarchical clustering
* K-means clustering
* Decision trees
* Backpropagation (BP) neural network

What does SupStat offer?
Our services include consulting on statistical methods, software training on statistical computing and data analysis (mainly R), statistical graphics and data visualization, and statistical reports. We have offices in Beijing, Shanghai, and New York. Our team includes Kagglers ranked in the top 0.1%. (www.kaggle.com hosts excellent data mining competitions and gathers more than 100K data scientists.) For business inquiries, please email:
vivian.zhang@supstat.com

  • Edward Olanow

    This class sounds very appealing. Are computers provided or do I need to bring a laptop?

    • vivian.stanford@gmail.com

      You need to bring your laptop and follow us along.