Data Science Courses
Data Science using SAS and Python
"Data Science is not a Tool/Software, it is a Subject. Most of us are putting it off and only concentrate on Tools/Software which leads to failure and loss of interest towards Data Science."
Data Science is the study of data. We believe data is everything in today's world that can be treated as 'information' about different aspects of a business. In this course, we cover all the components of Data Science: Data Management, Programming, Visualization, and Analytics. All of these components are equally important to becoming a successful Data Scientist.
This course focuses mainly on SAS and Python. It starts with programming in SAS and Python, followed by analytical and visualization techniques. It includes multiple hands-on exercises and project work in the domains of marketing, banking, finance, risk management, retail, clinical, and manufacturing/inventory management.
The ultimate purpose of this course is to cover the basic and advanced predictive modeling techniques that are a must in today's competitive market. It covers the programming, visualization, and analytical models to be implemented in Python and SAS.
Our Faculty has 9+ years of experience in the field of Data and Analytics, Data Science, Training, and Analytical application development.
Our Faculty is a certified Predictive Modeler and Programmer from SAS Institute Inc.
Our Faculty has worked with SAS (all analytical modules), Python (Data Science and Data Management libraries), and Big Data (Hadoop and Hive) during his 9+ years of tenure.
He has successfully implemented analytics and developed analytical applications in the fields of AML, Risk Management, Healthcare Fraud, Retail, and Market Research.
He is a former employee of TCS and SAS Institute Inc.
Introduction to Data Science:
- Welcome and general discussion of expectations from the course
- Definition of Data
- Difference between data management and data analytics
- Data Science components
Programming using SAS:
- Base SAS Overview
- Data Step and Proc Step processing
- Concept of PDV, Input Buffer
- Concept of SAS library and SAS Catalog
- Variable Types in SAS
- Reading Data stored external to SAS
- Importing Data by using Proc Import
- Data Step SAS statements
- SAS Functions
- Appending and Merging using SAS
- SAS Procedures like PROC MEANS, PROC UNIVARIATE, PROC APPEND, PROC FREQ, and PROC EXPORT
- SAS SQL
- SAS Macros
Programming using Python:
- Python Overview
- Python Data Types
- Python operations on numbers and strings, including logical and arithmetic operators
- Python Strings
- Python Lists
- Python Tuples
- Python Dictionaries
- FOR and WHILE loops
- IF/ELIF/ELSE in Python
- Data Manipulation using NumPy and pandas
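As a flavour of the Python basics listed above, here is a minimal sketch that combines a list of tuples, a dictionary, a FOR loop, and an IF/ELSE branch. The customer names, amounts, and the threshold of 100 are made up for illustration.

```python
# Hypothetical records: (customer name, purchase amount). Values are made up.
purchases = [("Asha", 120.0), ("Ben", 45.5), ("Asha", 30.0), ("Chen", 99.9)]

# Dictionary mapping each customer name to total spend
totals = {}
for name, amount in purchases:  # FOR loop with tuple unpacking
    totals[name] = totals.get(name, 0.0) + amount

# IF/ELSE: label each customer using an arbitrary threshold of 100
labels = {}
for name, total in totals.items():
    if total >= 100:
        labels[name] = "high"
    else:
        labels[name] = "low"

print(totals)
print(labels)
```

The same grouping could be done with a pandas group-by once the data is in a DataFrame; the plain-Python version only shows what happens underneath.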
Statistics using SAS and Python
- Levels of Measurement and Variable types
- Descriptive Statistics and Picturing Distributions
- Confidence Interval for the Mean
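To illustrate descriptive statistics and a confidence interval for the mean, here is a minimal sketch using only the Python standard library. The sample values are made up, and the interval uses a normal (z) approximation; for a small sample like this one, a t critical value would normally be used instead.

```python
import math
import statistics
from statistics import NormalDist

# Hypothetical sample of measurements (made-up values)
sample = [498.2, 501.5, 499.8, 502.1, 500.4, 497.9, 500.9, 499.2]

n = len(sample)
mean = statistics.mean(sample)
median = statistics.median(sample)
sd = statistics.stdev(sample)      # sample standard deviation (n - 1 in the denominator)
se = sd / math.sqrt(n)             # standard error of the mean

# 95% confidence interval using the normal approximation
z = NormalDist().inv_cdf(0.975)
ci = (mean - z * se, mean + z * se)

print("mean:", round(mean, 2), "median:", round(median, 2), "sd:", round(sd, 2))
print("approximate 95% CI:", tuple(round(v, 2) for v in ci))
```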
Hypothesis Testing and ANOVA using SAS and Python
- One Sample t-test of comparing means
- Two Sample t-test of comparing means
- One Way ANOVA
- Assumptions of ANOVA Modeling
- n-way ANOVA
- ANOVA Post Hoc Studies
Exploratory Data Analysis using SAS and Python
- Data Exploration by using Scatter Plots
- Pearson and Spearman Correlations
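The difference between the two correlations can be sketched in a few lines of plain Python: Pearson measures linear association, while Spearman is Pearson applied to ranks. The function names below are hypothetical helpers written for this illustration, and the data is made up.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result

def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))

# Made-up data: y grows with x, but not linearly
x = [1, 2, 3, 4, 5]
y = [1, 4, 9, 16, 25]
print("Pearson:", pearson(x, y))
print("Spearman:", spearman(x, y))
```

Because y always increases with x, the Spearman value is 1 even though the relationship is curved, while the Pearson value falls slightly below 1.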
Linear Regression using SAS and Python
- Fit Simple Linear Regression Model
- Assumptions of Linear Regression Model
- Analyze the output of the Linear Regression
- Producing Predicted Values
- Difference between Simple Linear Regression and Multiple Linear Regression Models
- Fit Multiple Linear Regression Model
- Stepwise Regression/Model Selection Techniques
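Fitting a simple linear regression and producing predicted values can be sketched with the ordinary least squares formulas. This is a minimal illustration in plain Python; the function names and the data are hypothetical, and multiple regression would normally be fitted with a statistics library.

```python
def fit_simple_linear_regression(x, y):
    """Return (intercept, slope) of the least squares line y = intercept + slope * x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope

def predict(intercept, slope, x_new):
    """Predicted value for a new x."""
    return intercept + slope * x_new

def r_squared(x, y, intercept, slope):
    """Share of the variation in y explained by the fitted line."""
    mean_y = sum(y) / len(y)
    ss_total = sum((b - mean_y) ** 2 for b in y)
    ss_error = sum((b - predict(intercept, slope, a)) ** 2 for a, b in zip(x, y))
    return 1 - ss_error / ss_total

# Made-up data that lies exactly on the line y = 1 + 2x
x = [1, 2, 3, 4]
y = [3, 5, 7, 9]
intercept, slope = fit_simple_linear_regression(x, y)
print("intercept:", intercept, "slope:", slope)
print("prediction at x = 5:", predict(intercept, slope, 5))
```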
Regression Diagnostics using SAS and Python
- Residual Analysis
- Influential Observations
- Difference between Influential Observations and Outliers
- Collinearity Diagnostics
Categorical Data Analysis using SAS and Python
- Examining Distributions
- Test of Associations by using the chi-square test
- Fisher's Exact p-values for Pearson Chi-square test
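The chi-square test of association compares the observed counts in a table with the counts expected if the two variables were independent. Here is a minimal sketch in plain Python; the function names and the 2x2 counts are made up for illustration, and the p-value shortcut shown applies only when there is 1 degree of freedom. In practice a library routine such as scipy.stats.chi2_contingency would usually be used.

```python
import math

def chi_square_statistic(table):
    """Return (Pearson chi-square statistic, degrees of freedom) for a table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / grand_total
            statistic += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, df

def p_value_one_df(statistic):
    """Upper-tail p-value of a chi-square statistic with exactly 1 degree of freedom."""
    return math.erfc(math.sqrt(statistic / 2))

# Made-up 2x2 table: rows are two groups, columns are outcome yes / no
table = [[10, 20],
         [20, 10]]
statistic, df = chi_square_statistic(table)
print("chi-square:", round(statistic, 3), "df:", df)
print("p-value:", round(p_value_one_df(statistic), 4))
```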
Logistic Regression using SAS and Python
- Odds and Odds Ratio
- Simple Logistic Regression
- Multiple Logistic Regression with categorical predictors
- Analyze the output of Logistic Regression
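Odds and odds ratios are the building blocks of logistic regression, which models the log of the odds as a linear function of the predictors. The sketch below shows the arithmetic only; the function names and the counts are hypothetical.

```python
import math

def odds(p):
    """Odds of an event that has probability p."""
    return p / (1 - p)

def odds_ratio(events_a, non_events_a, events_b, non_events_b):
    """Odds of the event in group A divided by the odds in group B."""
    return (events_a / non_events_a) / (events_b / non_events_b)

def logistic(z):
    """Inverse of the log-odds (logit): maps any real number to a probability."""
    return 1 / (1 + math.exp(-z))

# Made-up counts: 30 of 40 in group A had the event, 10 of 40 in group B
print("odds in A:", odds(30 / 40))
print("odds ratio A vs B:", odds_ratio(30, 10, 10, 30))

# Round trip: probability -> log-odds -> probability
p = 0.8
print("recovered probability:", logistic(math.log(odds(p))))
```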
Measure Model Performance using SAS and Python
- Apply the principles of honest assessment to model performance measurement
- Rare event adjustments
- Assess classifier performance using the confusion matrix
- Model selection and validation using training and validation data
- Create and interpret graphs (ROC, lift, and gains charts) for model comparison and selection
- Establish effective decision cut-off values for scoring
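To show how a decision cut-off feeds into the confusion matrix, here is a minimal sketch in plain Python. The actual labels and predicted probabilities are made up, and the function names are hypothetical.

```python
def confusion_counts(actual, predicted_prob, cutoff=0.5):
    """Return (tp, fp, tn, fn) after classifying as 1 when probability >= cutoff."""
    tp = fp = tn = fn = 0
    for y, p in zip(actual, predicted_prob):
        predicted = 1 if p >= cutoff else 0
        if predicted == 1 and y == 1:
            tp += 1
        elif predicted == 1 and y == 0:
            fp += 1
        elif predicted == 0 and y == 0:
            tn += 1
        else:
            fn += 1
    return tp, fp, tn, fn

def metrics(tp, fp, tn, fn):
    """Accuracy, sensitivity (true positive rate), and specificity (true negative rate)."""
    return {
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
    }

# Made-up validation data: 1 = event, 0 = non-event
actual = [1, 0, 1, 1, 0, 0, 1, 0]
predicted_prob = [0.9, 0.4, 0.6, 0.3, 0.2, 0.7, 0.8, 0.1]

for cutoff in (0.5, 0.25):
    counts = confusion_counts(actual, predicted_prob, cutoff)
    print("cutoff", cutoff, "->", counts, metrics(*counts))
```

Lowering the cut-off catches more events (higher sensitivity) at the cost of more false alarms (lower specificity), which is the trade-off an ROC curve plots across all cut-offs.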
Decision Tree Modeling and XGBOOST using SAS and Python
Decision Tree Modeling
- Introduction to Decision Tree Modeling
- Model essentials for decision tree models
- Decision Tree model development by using CHAID, Entropy/Information gain and Gini
- Decision Tree Model tuning
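The Gini and Entropy/Information gain criteria mentioned above both score how mixed the class labels in a node are. Here is a minimal sketch in plain Python; the function names and labels are hypothetical, and a real tree algorithm evaluates many candidate splits in this way.

```python
import math
from collections import Counter

def gini(labels):
    """Gini impurity: 0 for a pure node, higher for more mixed nodes."""
    n = len(labels)
    return 1 - sum((count / n) ** 2 for count in Counter(labels).values())

def entropy(labels):
    """Entropy in bits: 0 for a pure node, 1 for an even two-class mix."""
    n = len(labels)
    return -sum((count / n) * math.log2(count / n) for count in Counter(labels).values())

def information_gain(parent, left, right):
    """Reduction in entropy from splitting parent into left and right child nodes."""
    n = len(parent)
    weighted_child_entropy = (len(left) / n) * entropy(left) + (len(right) / n) * entropy(right)
    return entropy(parent) - weighted_child_entropy

# Made-up node with an even mix of two classes, split perfectly into two pure children
parent = ["yes", "yes", "no", "no"]
left, right = ["yes", "yes"], ["no", "no"]
print("Gini of parent:", gini(parent))
print("Entropy of parent:", entropy(parent))
print("Information gain of the split:", information_gain(parent, left, right))
```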
Gradient Boosting (XGBOOST)
- Introduction to Boosting
- Example of Boosting
- Regression Decision Tree
- Gradient Boosted Trees Regression
Project 1: Portfolio Data Analysis
Domain: Risk Management
Problem Statement: As an analyst, you need to advise your client on which mutual fund risk category to invest in.
Topics: Descriptive Analytics, Distributions, and Visualization
Project 2: Effectiveness of Production Process
Domain: Manufacturing/Inventory Management
Problem Statement: As a manager/supervisor of a company, you need to measure the effectiveness of the production of cereal boxes. The aim is to analyze whether or not the cereal boxes' weight is as per company specifications.
Topics: Hypothesis Testing (One-Sample tests)
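A one-sample t-test for this kind of problem compares the sample mean with the specified value. The sketch below computes only the t statistic and degrees of freedom in plain Python; the box weights and the specification of 368 grams are made up for illustration. A library routine such as scipy.stats.ttest_1samp would also return the p-value.

```python
import math
import statistics

def one_sample_t(sample, hypothesized_mean):
    """Return (t statistic, degrees of freedom) for a one-sample t-test."""
    n = len(sample)
    mean = statistics.mean(sample)
    sd = statistics.stdev(sample)  # sample standard deviation
    t = (mean - hypothesized_mean) / (sd / math.sqrt(n))
    return t, n - 1

# Hypothetical box weights in grams and a hypothetical specification of 368 g
weights = [366.2, 369.1, 367.5, 370.3, 365.8, 368.9, 367.0, 369.6]
t, df = one_sample_t(weights, 368)
print("t statistic:", round(t, 3), "with", df, "degrees of freedom")
```

A t statistic close to zero means the sample mean is close to the specification relative to the sampling variability.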
Project 3: Product Assortment Strategy in Retail Store
As a regional sales manager of a company, you need to compare mean sales between two types of product display in a retail store.
The aim is to decide whether or not the Promotional display of the product is more effective than the Normal display of the product.
This helps management decide on the display location in a store that will maximize sales.
Topics: Hypothesis Testing (Two-Sample tests)
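Comparing mean sales between two displays is a two-sample t-test. The sketch below uses Welch's version, which does not assume equal variances; that choice is an assumption made here, not something the project specifies. The sales figures and the function name are made up. The library equivalent is scipy.stats.ttest_ind with equal_var=False.

```python
import math
import statistics

def welch_t(sample_a, sample_b):
    """Return (t statistic, approximate degrees of freedom) for Welch's t-test."""
    n_a, n_b = len(sample_a), len(sample_b)
    # Squared standard error of each sample mean
    se2_a = statistics.variance(sample_a) / n_a
    se2_b = statistics.variance(sample_b) / n_b
    t = (statistics.mean(sample_a) - statistics.mean(sample_b)) / math.sqrt(se2_a + se2_b)
    # Welch-Satterthwaite approximation for the degrees of freedom
    df = (se2_a + se2_b) ** 2 / (se2_a ** 2 / (n_a - 1) + se2_b ** 2 / (n_b - 1))
    return t, df

# Hypothetical weekly sales under each display type (made-up values)
promotional = [72, 65, 80, 77, 69, 74]
normal = [58, 62, 55, 67, 60, 63]
t, df = welch_t(promotional, normal)
print("t statistic:", round(t, 3), "approximate df:", round(df, 1))
```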
Project 4: Drug Analysis
Problem Statement: Before launching a new drug in the market, you need to analyze the effect of the drug and its different doses on blood pressure.
Topics: Analysis of Variance (ANOVA Models)
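One-way ANOVA asks whether the group means differ by more than the variation within the groups would explain. Here is a minimal sketch of the F statistic in plain Python; the dose groups and readings are made up, and the function name is hypothetical. The library equivalent is scipy.stats.f_oneway, which also returns the p-value.

```python
def one_way_anova_f(groups):
    """Return (F statistic, between-group df, within-group df) for one-way ANOVA."""
    all_values = [value for group in groups for value in group]
    grand_mean = sum(all_values) / len(all_values)
    # Variation of the group means around the grand mean
    ss_between = sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups)
    # Variation of the observations around their own group mean
    ss_within = sum((value - sum(g) / len(g)) ** 2 for g in groups for value in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f = (ss_between / df_between) / (ss_within / df_within)
    return f, df_between, df_within

# Hypothetical change in blood pressure for three dose levels (made-up values)
low_dose = [1, 2, 3]
medium_dose = [2, 3, 4]
high_dose = [5, 6, 7]
f, df_between, df_within = one_way_anova_f([low_dose, medium_dose, high_dose])
print("F statistic:", f, "df:", (df_between, df_within))
```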
Project 5: Analyzing consumption of oxygen in the human body during running
In exercise physiology, an objective measure of aerobic fitness is how effectively the body can absorb and use oxygen during a 1.5-mile run.
Factors affecting oxygen_consumption include run_time, age, gender, run_pulse, rest_pulse, and so on.
The aim is to identify the key factors affecting oxygen_consumption during a run.
Topics: Analysis of Variance (EDA and Linear Regression Models)
Project 6: Titanic Event Analysis
Domain: Event Analysis
On the 14th of April, the Titanic hit an iceberg and sank. There were 1517 fatalities across different age groups, classes (1, 2, and 3), and genders.
The objective is to measure how these factors are associated with the survival status of passengers.
Topics: Odds, Odds Ratio, Chi-Square tests, Ordinal associations, and Logistic Regression Model
Project 7: Marketing Campaign for a Bank
A target marketing campaign for a bank was undertaken to identify a segment of customers who are likely to respond to an insurance product.
Here, the target variable is whether or not the customer bought the insurance product. It depends on factors such as product usage in the last three months, demographics, transaction patterns (deposit amount, checking account), bank branch, residential information (urban or rural), and so on.
Topics: Classification, Categorical Data Analysis, Logistic regression, Decision Tree and Gradient Boosting (XGBOOST)
Why should I learn Data Science from Techinfoplace Ltd.?
- Techinfoplace offers exclusive Data Science online/classroom courses for professionals/freshers who want to expand their knowledge base and start a career in this field.
- A personal mentor to track your progress
- Techinfoplace online/classroom sessions are conducted by experienced professionals in this field.
- Real-time exercises, assignments, and projects
- Study materials and reference books for every topic
- 24/7 learning support
- A large community of like-minded learners
What are the different modes of training that Techinfoplace provides?
Who can attend this course?
- Anyone who wants to play with data can enter this field.
What kind of projects are included as part of the training?
- This course is known for its projects. It is purely job-oriented training.
- You will work on highly exciting projects in the domains of high technology, Retail, Banking, Marketing, Clinical, Manufacturing, and so on.
- After completing the projects successfully, your skills will be equal to 6 months of rigorous industry experience in each topic.
- You will also encounter different interview questions for each topic, which will help in cracking Data Science interviews.
Is there any discount for this course?
- Yes. The fee mentioned for this course is for one candidate. If you come with at least 3 enrollments, a discount of 20% on the base price is given to that group.
In this course, you will be trained in both the Subject and the Tools/Software, so that you stand out in today's competitive market.