Data science

Classes are held three times a week, two hours per session. Graduates of the course receive a certificate.

About the Course

Module 1: Python for Data Science

Duration: 3 weeks

Overview

Data science is a fast-growing field that organizations use to make data-driven decisions. Data scientists take on many roles as they work with data and derive value from it. The Python programming language is an indispensable tool for practitioners and a must-know for every aspiring data scientist. It offers a fast, reliable, cross-platform, and mature environment for data analysis, machine learning, and algorithmic problem solving.

What You’ll Learn

By the end of this module, you will have learned:

       How to work with Python interactively in web notebooks

       The essentials of Python scripting

       Key concepts necessary to enter the world of Data Science via Python

Why Learn Python?

Python is one of the most popular languages in data science and can be used for data analysis, manipulation, and visualization. It has access to many data science libraries, which makes it well suited to developing applications and implementing algorithms.

Python is a flexible, powerful open-source language that is easy to learn and use, and it has powerful libraries for data manipulation and analysis. For over a decade, Python has been used in scientific computing and highly quantitative domains such as finance, oil and gas, physics, and signal processing. It continues to be a favorite of data scientists, who use it to build machine learning applications and perform other scientific computations. Its readable syntax and the absence of a separate compilation step can shorten development time, and its built-in debugger simplifies debugging. Python has become one of the most preferred languages for data analytics, and growing search interest suggests it is a must-have skill for professionals in the field.

Which companies use Python?

Many of the biggest and most popular companies use Python. Some of them are:

 

       Google, NASA, Amazon

       Social networking sites like Instagram, Reddit, Quora, etc

       Media streaming companies like Netflix and Spotify

       Rideshare companies like Uber and Lyft

“Python has been an important part of Google since the beginning and remains so as the system grows and evolves. Today dozens of Google engineers use Python, and we are looking for more people with skills in this language.”  - Peter Norvig, Director of Research at Google Inc.

Course Outline

Chapter 1: Introduction to Python

Goal: In this chapter, you will learn about the basic concepts of Python

Topics:

       An Overview of Python

o   Need for Programming

o   Advantages of Programming

o   About Python

o   Organizations using Python

o   Python Applications in Various Domains

o   Python Installation

o   Starting Python

o   Using the interpreter

o   Running a Python script

o   Python scripts on Unix/Windows

o   Using the editor

       Getting Started

o   Using variables

o   Built-in functions

o   Operands and Expressions

o   Strings

o   Numbers

o   Converting among types

o   Writing to the screen

o   Command line parameters

       Flow Control

o   About flow control

o   White space

o   Conditional expressions

o   Relational and Boolean operators

o   While loops

o   Alternate loop exits

Hands-on Labs

       Creating “Hello World” code

       Variables

       Demonstrating Conditional Statements

       Demonstrating Loops
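The labs above can be sketched in a few lines. This is a minimal illustration rather than the official lab material; the names `greeting`, `classify`, and `countdown` are made up for the example.

```python
# "Hello World" stored in a variable and written to the screen
greeting = "Hello World"
print(greeting)

# Converting among types: str -> int -> float -> str
count = int("3")
ratio = float(count) / 2
label = str(ratio)


def classify(n):
    """Conditional statement using relational and Boolean operators."""
    if n < 0:
        return "negative"
    elif n == 0:
        return "zero"
    elif n % 2 == 0 and n > 0:
        return "positive even"
    else:
        return "positive odd"


def countdown(start, stop_at):
    """While loop with an alternate exit: stop early when stop_at is reached."""
    seen = []
    while start > 0:
        if start == stop_at:
            break  # alternate loop exit
        seen.append(start)
        start -= 1
    return seen


print(classify(4))
print(countdown(5, 2))
```

Note that the indentation (white space) is part of the syntax: it is what marks the body of each `if`, `while`, and `def`.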

 

Chapter 2: Sequences, Arrays, Dictionaries, and Sets

Goal: In this chapter, you will learn the different types of sequence structures and their usage, execute sequence operations, and perform operations on arrays, dictionaries, and sets.

Topics:

       About sequences

       Lists and list methods

       Tuples

       Indexing and slicing

       Iterating through a sequence

       Sequence functions, keywords, and operators

       List comprehensions

       Generator Expressions

       Nested sequences

       Working with Dictionaries

       Working with Sets

 

Hands-on Labs

       Tuple - properties, related operations, compared with a list

       List - properties, related operations

       Dictionary - properties, related operations

       Set - properties, related operations
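As a rough sketch of what these labs cover, the snippet below contrasts the four container types. All variable names and sample values are illustrative assumptions.

```python
# Tuple: ordered and immutable
point = (3, 4)
try:
    point[0] = 10
    tuple_is_immutable = False
except TypeError:
    tuple_is_immutable = True

# List: ordered and mutable
scores = [70, 85, 90]
scores.append(60)
scores.sort()
top_two = scores[-2:]  # slicing with a negative index

# List comprehension and generator expression
squares = [n * n for n in range(5)]
total = sum(n * n for n in range(5))

# Dictionary: maps keys to values
ages = {"Ali": 30, "Leyla": 25}
ages["Murad"] = 41
names_sorted = sorted(ages)  # iterating a dict yields its keys

# Set: unordered collection of unique items
a = {1, 2, 3}
b = {3, 4}
common = a & b  # intersection
either = a | b  # union
```

The main difference between a tuple and a list is mutability: the assignment to `point[0]` raises `TypeError`, while the list can be appended to and sorted in place.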

 

Chapter 3: Functions

Goal: In this chapter, you will learn about different types of Functions

Topics:

       User-Defined Functions

       Defining functions

       Concept of Return Statement

       Concept of __name__ == "__main__"

       Function Parameters

       Different Types of Arguments

       Global Variables

       Global Keyword

       Variable Scope and Returning Values

       Lambda Functions

       Various Built-In Functions

       Nested functions
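Several of the topics above fit into one short script. This is a sketch under assumed names (`increment`, `describe`, `make_multiplier`, `main`), not a prescribed solution.

```python
counter = 0  # module-level (global) variable


def increment():
    """Rebinding a global from inside a function needs the global keyword."""
    global counter
    counter += 1
    return counter


def describe(name, greeting="Hello", *args, **kwargs):
    """Positional, default, variable-length, and keyword arguments."""
    extras = len(args) + len(kwargs)
    return f"{greeting}, {name} (+{extras} extra)"


def make_multiplier(factor):
    """Nested function: multiply() remembers factor from the enclosing scope."""
    def multiply(x):
        return x * factor
    return multiply


square = lambda x: x * x  # lambda (anonymous) function
double = make_multiplier(2)


def main():
    print(describe("Ada"))
    print(describe("Ada", "Hi", 1, 2, debug=True))
    print(double(21), square(5), increment())


if __name__ == "__main__":
    # Runs only when the file is executed directly, not when it is imported
    main()
```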

 

Chapter 5: Errors and Exception Handling

Goal: In this chapter, you will learn how to address exceptions in code, the types of errors, and how to handle them.

Topics:

       Syntax errors

       Exceptions

       Using try/except/else/finally

       Handling multiple exceptions

       Ignoring exceptions
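The following sketch shows how the clauses fit together. The helper names `safe_divide` and `parse_int` are hypothetical and exist only for this illustration.

```python
from contextlib import suppress


def safe_divide(a, b):
    """Divide a by b; return (result, log) with result None on failure."""
    log = []
    try:
        result = a / b
    except ZeroDivisionError:
        log.append("division by zero")
        result = None
    except TypeError as exc:
        log.append(f"bad operand: {type(exc).__name__}")
        result = None
    else:
        log.append("ok")    # runs only if no exception was raised
    finally:
        log.append("done")  # always runs
    return result, log


def parse_int(text, default=0):
    """Handle multiple exception types with a single except clause."""
    try:
        return int(text)
    except (ValueError, TypeError):
        return default


# Ignoring an exception on purpose
settings = {}
with suppress(KeyError):
    del settings["missing"]
```

Python spells the handler clause `except`; there is no `catch` keyword.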

 

Chapter 6: Modules and packages

Goal: In this chapter, you will learn how to create generic Python scripts and extract/filter content using regex.

Topics:

o   Standard Libraries

o   Packages and Import Statements

o   Reload Function

o   Important Modules in Python

o   Packages and name resolution

o   Naming conventions

o   Using imports

 

Chapter 7: Classes

Goal: In this chapter, you will learn about various Object-Oriented concepts such as Abstraction, Inheritance, Polymorphism, Overloading, Constructor, and so on

Topics:

       Defining classes

       Introduction to Object-Oriented Concepts

       Built-In Class Attributes

       Public, Protected and Private Attributes, and Methods

       Class Variable and Instance Variable

       Constructor and Destructor

       Decorator in Python

       Core Object-Oriented Principles

       Inheritance and Its Types

       Method Resolution Order

       Overloading

       Overriding

       Getter and Setter Methods

       Inheritance-In-Class Case Study
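A compact sketch of several of these ideas follows. The `Employee` and `Manager` classes are invented for the example and are not taken from the in-class case study.

```python
class Employee:
    headcount = 0  # class variable, shared by all instances

    def __init__(self, name, salary):  # constructor
        self.name = name               # instance variable
        self._salary = salary          # "protected" by naming convention
        Employee.headcount += 1

    @property
    def salary(self):                  # getter
        return self._salary

    @salary.setter
    def salary(self, value):           # setter with validation
        if value < 0:
            raise ValueError("salary must be non-negative")
        self._salary = value

    def describe(self):
        return f"{self.name} earns {self.salary}"


class Manager(Employee):               # single inheritance
    def __init__(self, name, salary, reports):
        super().__init__(name, salary)
        self.reports = reports

    def describe(self):                # overriding the parent method
        return f"{super().describe()} and manages {len(self.reports)} report(s)"

    def __len__(self):                 # special method: lets len(manager) work
        return len(self.reports)


ada = Employee("Ada", 100)
grace = Manager("Grace", 200, [ada])
print(grace.describe())
print([cls.__name__ for cls in Manager.__mro__])  # method resolution order
```

`@property` is itself a decorator, so the getter and setter also illustrate the decorator topic.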

Module 2: Analytics with Python

Duration: 2 weeks

Overview

Learn advanced Python skills for data analysis and visualizations.

This course shows data scientists how to use Python for exploratory data analysis and complex visualizations. You’ll learn about essential numerical and data analysis libraries such as NumPy and pandas, as well as visualization tools like Matplotlib and Seaborn.

Course Outline

Chapter 1: Introduction to NumPy

Goal: In this chapter, you will learn about the basics of Data Analysis using two essential libraries: NumPy and Pandas. You will also understand the concept of file handling using the NumPy library.
Topics:

       Basics of Data Analysis

       NumPy - Arrays

       Operations on Arrays

       Indexing Slicing and Iterating

       NumPy Array Attributes

       Matrix Product

       NumPy Functions

       Functions

       Array Manipulation

       File Handling Using NumPy

Chapter 2: Data Manipulation using pandas

Goal: In this chapter, you will gain in-depth knowledge about analyzing datasets and data manipulation using Pandas.
Topics:

       Introduction to pandas

       Data structures in pandas

       Series

       Data Frames

       Importing and Exporting Files in Python

       Basic Functionalities of a Data Object

       Merging of Data Objects

       Concatenation of Data Objects

       Types of Joins on Data Objects

       Data Cleaning using pandas

       Exploring Datasets

       Analysing a dataset

Chapter 3: Data Visualization using Matplotlib

Goal: In this chapter, you will learn Data Visualization using Matplotlib.
Topics:

       Why Data Visualization?

       Matplotlib Library

       Line Plots

       Multiline Plots

       Bar Plot

       Histogram

       Pie Chart

       Scatter Plot

       Boxplot

       Saving Charts

       Customizing Visualizations

       Saving Plots

       Grids

       Subplots

       Rendering

 

 

Module Project:

Project 1:

Preparing an analytical report based on available data to help producers of educational programs effectively build a strategy for updating and improving courses.

 

Project 2:

Preparing an analytical report for the HR department. Based on the analytics, it is necessary to draw up recommendations for the HR department on recruitment strategy and interaction with employees.

Module 3: Statistics for Data Science

Duration: 3 weeks

Overview

The self-paced Statistics module is designed to give a data scientist a solid foundation in the core concepts. The workings of data science are explained in detail in terms of statistics and probability. Data and its types are discussed along with different kinds of sampling procedures.

Other essential concepts of Statistics (statistical inference, testing, clustering) are emphasized here as well since that’s a very important part of being a Data Scientist.

Module Objectives

After the completion of this course, you should be able to:

       Analyze different types of data

       Master different sampling techniques

       Illustrate Descriptive statistics

       Apply probabilistic approach to solve real life complex problems

       Explain and derive Bayesian inference

       Understand Clustering techniques

       Understand Regression modelling

       Master Hypothesis

       Illustrate Testing the data

Why learn Statistics?

Statistics and its methods are the backbone of data science, used to "understand, analyze and predict actual phenomena". Machine learning employs techniques and theories drawn from statistics and probability. This Statistics Essentials for Analytics course gives you the essential statistics required for analytics and data science and explains the mechanics of popular machine learning algorithms such as K-means clustering and regression. It also introduces hypothesis testing and its methods, enabling you to test an alternative hypothesis.

Chapter 1: Understanding the Data

Learning Objectives:

At the end of this module, you should be able to:

       Understand various data types

       Learn various variable types

       List the uses of variable types

       Explain Population and Sample

       Discuss sampling techniques

       Understand Data representation

Topics:

       Introduction to Data Types

       Numerical parameters to represent data

       Mean

       Mode

       Median

       Sensitivity

       Information Gain

       Entropy

       Statistical parameters to represent data

Hands-on Labs

       Estimating mean, median and mode using Python

       Calculating Information Gain and Entropy
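Both labs can be approached with the standard library alone. The sketch below uses made-up data; `entropy` and `information_gain` are illustrative helper names, and entropy is measured in bits (base-2 logarithm).

```python
import math
import statistics
from collections import Counter

values = [2, 3, 3, 5, 7, 10]
mean = statistics.mean(values)
median = statistics.median(values)
mode = statistics.mode(values)


def entropy(labels):
    """Shannon entropy, in bits, of a list of class labels."""
    total = len(labels)
    return -sum((count / total) * math.log2(count / total)
                for count in Counter(labels).values())


def information_gain(parent, children):
    """Parent entropy minus the size-weighted entropy of the child groups."""
    total = len(parent)
    weighted = sum(len(child) / total * entropy(child) for child in children)
    return entropy(parent) - weighted


parent = ["yes", "yes", "no", "no"]
perfect_split = [["yes", "yes"], ["no", "no"]]
gain = information_gain(parent, perfect_split)
```

A split that separates the classes perfectly removes all uncertainty, so the gain equals the parent's entropy (1 bit for a balanced two-class set).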

 

Chapter 2: Probability

Learning Objectives:

At the end of this module, you should be able to:

       Understand rules of probability

       Learn about dependent and independent events

       Implement conditional, marginal, and joint probability using Bayes' theorem

       Discuss probability distribution

       Explain Central Limit Theorem

Topics:

       Uses of probability

       Need of probability

       Bayesian Inference

       Density Concepts

       Normal Distribution Curve

Hands-on Labs

       Calculating probability using python

       Conditional, Joint and Marginal Probability using Python

       Plotting a Normal distribution curve
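A minimal sketch of these labs using only the standard library. The probabilities are invented for illustration, and the actual plotting step is left out: the `xs` and `density` lists are simply the values a plotting library such as Matplotlib could draw.

```python
from statistics import NormalDist

# Events: A = "has the condition", B = "test is positive" (made-up numbers)
p_a = 0.01              # marginal P(A)
p_b_given_a = 0.95      # conditional P(B | A)
p_b_given_not_a = 0.05  # conditional P(B | not A)

p_a_and_b = p_b_given_a * p_a                  # joint P(A and B)
p_b = p_a_and_b + p_b_given_not_a * (1 - p_a)  # marginal P(B), total probability
p_a_given_b = p_a_and_b / p_b                  # Bayes' theorem: P(A | B)

# Normal distribution curve: x values and their densities
standard = NormalDist(mu=0, sigma=1)
xs = [x / 10 for x in range(-30, 31)]
density = [standard.pdf(x) for x in xs]

# Probability mass within one standard deviation of the mean
within_one_sigma = standard.cdf(1) - standard.cdf(-1)
```

Even with a fairly accurate test, `p_a_given_b` stays low here because the condition is rare, which is the classic motivation for Bayesian reasoning.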

Chapter 3: Statistical Inference

Learning Objectives: In this module, you will learn about different statistical techniques and terminologies used in data analysis.

At the end of this module, you should be able to:

       Understand concept of point estimation using confidence margin

       Draw meaningful inferences using margin of error

       Explore hypothesis testing and its different levels

Topics:

       What is Statistical Inference?

       Terminologies of Statistics

       Point Estimation

       Confidence Margin

       Hypothesis Testing

       Levels of Hypothesis Testing

Hands-on Labs

       Calculating and generalizing point estimates using python
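One way to picture a point estimate and its margin of error is sketched below. The sample is invented, and the interval uses the normal (z) critical value as a simplifying assumption; for a sample this small a t critical value would usually be more appropriate.

```python
import math
import statistics
from statistics import NormalDist

sample = [12.1, 11.8, 12.4, 12.0, 11.9, 12.3, 12.2, 11.7]  # made-up measurements

# The sample mean is the point estimate of the population mean
point_estimate = statistics.mean(sample)
std_error = statistics.stdev(sample) / math.sqrt(len(sample))

confidence = 0.95
z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)  # two-sided critical value
margin_of_error = z * std_error
interval = (point_estimate - margin_of_error, point_estimate + margin_of_error)

print(f"estimate {point_estimate:.2f} +/- {margin_of_error:.2f}")
```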

 

Chapter 4: Testing the Data

Learning Objectives:

At the end of this module, you should be able to:

       Understand Parametric and Non-parametric Testing

       Learn various types of parametric testing

       Discuss experimental designing

       Explain a/b testing

Topics:

       Parametric Test

       Parametric Test Types

       Non-Parametric Test

       Experimental Designing

       A/B testing

Hands-on Labs

       Perform p test and t tests in Python

       A/B testing in Python
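As an illustration of the A/B testing lab, the sketch below runs a two-sided two-proportion z-test with the standard library. The function name `ab_test` and the conversion counts are assumptions; the lab itself may use a different test or a statistics library.

```python
import math
from statistics import NormalDist


def ab_test(conv_a, n_a, conv_b, n_b):
    """Two-sided two-proportion z-test; returns (z statistic, p-value)."""
    p_a, p_b = conv_a / n_a, conv_b / n_b
    pooled = (conv_a + conv_b) / (n_a + n_b)  # rate under the null hypothesis
    std_error = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / std_error
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value


# Variant A: 200 conversions out of 2000 visitors; variant B: 260 out of 2000
z, p_value = ab_test(conv_a=200, n_a=2000, conv_b=260, n_b=2000)
significant = p_value < 0.05
```

A p-value below the chosen significance level (0.05 here) leads to rejecting the null hypothesis that both variants convert at the same rate.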

 

Chapter 5: Data Clustering

Learning Objectives:

At the end of this module, you should be able to:

       Understand concept of association and dependence

       Explain causation and correlation

       Learn the concept of covariance

       Discuss Simpson’s paradox

       Illustrate Clustering Techniques
 

Topics:

       Association and Dependence

       Causation and Correlation

       Covariance

       Simpson’s Paradox

       Clustering Techniques

Hands-on Labs

       Correlation and Covariance in Python

       Hierarchical clustering in Python

       K means clustering in Python
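For the first lab, covariance and Pearson correlation can be written out directly from their definitions. The data and function names below are illustrative assumptions.

```python
import math


def covariance(xs, ys):
    """Sample covariance (divides by n - 1)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (n - 1)


def correlation(xs, ys):
    """Pearson correlation: covariance rescaled to the range [-1, 1]."""
    return covariance(xs, ys) / math.sqrt(covariance(xs, xs) * covariance(ys, ys))


hours = [1, 2, 3, 4, 5]
score = [52, 55, 61, 64, 68]

cov = covariance(hours, score)
corr = correlation(hours, score)
```

Covariance depends on the units of both variables, while correlation is unit-free, which is why correlation is easier to compare across datasets. Neither one establishes causation.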

 

Chapter 6: Regression Modelling

Learning Objectives:

At the end of this module, you should be able to:

       Understand the concept of Linear Regression

       Explain Logistic Regression

       Implement WOE

       Differentiate between heteroscedasticity and homoscedasticity

       Learn concept of residual analysis

Topics:

       Linear and Logistic Regression Techniques

       Problem of Collinearity

       WOE and IV

       Residual Analysis

       Heteroscedasticity

       Homoscedasticity

Hands-on Labs

       Perform Linear and Logistic Regression in Python

       Analyze the residuals using Python
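A simple linear regression and its residuals can be computed by hand, as sketched below with invented data. The helper `fit_line` is hypothetical; the lab may rely on a library instead.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    intercept = mean_y - slope * mean_x
    return intercept, slope


xs = [1, 2, 3, 4, 5]
ys = [2.1, 3.9, 6.2, 7.8, 10.0]

intercept, slope = fit_line(xs, ys)

# Residual = observed value minus fitted value
residuals = [y - (intercept + slope * x) for x, y in zip(xs, ys)]
```

With an intercept in the model the residuals sum to zero. Plotting them against the fitted values is a common check: a roughly constant spread suggests homoscedasticity, while a funnel shape suggests heteroscedasticity.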

 


 

Module 4: Data Science

Duration: 4 weeks

Chapter 1: Introduction to Data Science

Learning Objectives:

Get an introduction to Data Science module and see how Data Science helps to analyze large and unstructured data with different tools.

Topics:

       What is Data Science?

       What does Data Science involve?

       Era of Data Science

       Business Intelligence vs Data Science

       Life cycle of Data Science

       Tools of Data Science

       Introduction to Big Data and Hadoop

       Introduction to R

       Introduction to Spark

       Introduction to Machine Learning

Hands-on Labs

       No lab

 

Chapter 2: Introduction to Machine Learning

Learning Objectives:

Get an introduction to Machine Learning as part of this chapter. You will discuss the various categories of Machine Learning and implement Supervised Learning Algorithms

Topics:

       Python Revision (NumPy, pandas, scikit-learn, Matplotlib)

       What is Machine Learning?

       Machine Learning Use-Cases

       Machine Learning Process Flow

       Machine Learning Categories

       Linear Regression

       Logistic Regression

       Gradient descent

Hands-on Labs

       Implementing Linear Regression model

       Implementing Logistic Regression model
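Gradient descent, listed among the topics above, can be sketched for a one-variable linear regression as follows. The learning rate, epoch count, and data are assumptions chosen so that this small example converges.

```python
def gradient_descent(xs, ys, learning_rate=0.05, epochs=2000):
    """Fit y = w * x + b by minimising mean squared error."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        errors = [(w * x + b) - y for x, y in zip(xs, ys)]
        # Partial derivatives of the mean squared error
        grad_w = 2 / n * sum(e * x for e, x in zip(errors, xs))
        grad_b = 2 / n * sum(errors)
        # Step against the gradient
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b


xs = [0, 1, 2, 3, 4]
ys = [1, 3, 5, 7, 9]  # generated from y = 2x + 1

w, b = gradient_descent(xs, ys)
```

If the learning rate is too large the updates overshoot and diverge; if it is too small, many more epochs are needed.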

 

Chapter 3: Supervised Learning

Learning Objectives:

In this chapter, you will learn Supervised Learning Techniques and their implementation, for example, Decision Trees, Random Forest Classifier etc.

Topics:

       What is Classification and what are its use cases?

       What is a Decision Tree?

       Algorithm for Decision Tree Induction

       Creating a Perfect Decision Tree

       Confusion Matrix

       What is Random Forest?

       What is Naïve Bayes?

       How does Naïve Bayes work?

       Implementing Naïve Bayes Classifier

       What is Support Vector Machine?

       How does a Support Vector Machine work?

       Hyperparameter Optimization

       Grid Search vs Random Search

       Implementation of Support Vector Machine for Classification

Hands-on Labs

       Implementing Decision Tree model

       Implementing Random Forest model

       Implementing Naïve Bayes model

       Implementing Support Vector Machine

       Implementation of Naïve Bayes, SVM

 

Chapter 4: Dimensionality Reduction

Learning Objectives:

In this Data Science with Python Training module, you will learn about the impact of dimensions within data. You will be taught to perform factor analysis using PCA and compress dimensions. Also, you will be developing an LDA model.

Topics:

       Introduction to Dimensionality

       Why Dimensionality Reduction

       PCA

       Factor Analysis

       Scaling dimensional model

       LDA

Hands-on Labs

       PCA

       Scaling

 

Chapter 5: Unsupervised Learning

Learning Objectives:

Learn about Unsupervised Learning and the various types of clustering that can be used to analyze the data.

Topics:

       What is Clustering & its Use Cases?

       What is K-means Clustering?

       How does K-means algorithm work?

       How to do optimal clustering

       What is C-means Clustering?

       What is Hierarchical Clustering?

       How does Hierarchical Clustering work?

Hands-on Labs

       Implementing K-means Clustering

       Implementing C-means Clustering

       Implementing Hierarchical Clustering
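The K-means lab boils down to alternating an assignment step and an update step. The sketch below is a plain-Python version on 2-D points; seeding the centroids with the first `k` points is a simplifying assumption, and library implementations typically use smarter initialisation.

```python
def kmeans(points, k, iterations=10):
    """Plain k-means on 2-D points; the first k points seed the centroids."""
    centroids = list(points[:k])
    clusters = []
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid
        clusters = [[] for _ in range(k)]
        for p in points:
            distances = [(p[0] - c[0]) ** 2 + (p[1] - c[1]) ** 2 for c in centroids]
            clusters[distances.index(min(distances))].append(p)
        # Update step: move each centroid to the mean of its cluster
        for i, cluster in enumerate(clusters):
            if cluster:  # keep the old centroid if a cluster is empty
                centroids[i] = (sum(p[0] for p in cluster) / len(cluster),
                                sum(p[1] for p in cluster) / len(cluster))
    return centroids, clusters


points = [(1, 1), (9, 9), (1, 2), (2, 1), (8, 9), (9, 8)]
centroids, clusters = kmeans(points, k=2)
```

Choosing `k` is a separate problem; a common approach is to compare the within-cluster distances for several values of `k`.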

 

Chapter 6: Model Selection and Boosting

Learning Objectives:

In this module, you will learn how to select one model over another. You will also learn about boosting and its importance in machine learning, including how to combine weak learners into a stronger one.

Topics:

       What is Model Selection?

       The need for Model Selection

       Cross-Validation

       What is Boosting?

       How do Boosting Algorithms work?

       Types of Boosting Algorithms

       Adaptive Boosting

Hands-on Labs

       Cross Validation

       AdaBoost
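The idea behind k-fold cross-validation is easiest to see in the index bookkeeping. The generator below is a hypothetical helper written for illustration; it does not shuffle the data, which real workflows usually do first.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_indices, test_indices) for k consecutive folds."""
    # Spread the remainder over the first folds so sizes differ by at most one
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = list(range(start, start + size))
        train = [i for i in range(n_samples) if i < start or i >= start + size]
        yield train, test
        start += size


folds = list(k_fold_indices(n_samples=10, k=3))
for train, test in folds:
    print("train:", train, "test:", test)
```

Each sample appears in exactly one test fold, so every observation is used for evaluation once and for training `k - 1` times.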

 

Module Projects

Module 5: Natural Language Processing

Duration: 3 weeks

About the Course

This Python NLP course is for anyone who works with data and text, has a good analytical background, and has had a little exposure to the Python programming language. It is designed to help you understand the critical concepts and techniques used in natural language processing with Python. You will be able to build your own machine learning model for text classification. Towards the end of the course, we discuss various practical use cases of NLP in Python to enhance your learning experience.

Why learn Natural Language Processing or NLP?

Natural Language Processing (or Text Analytics/Text Mining) applies analytic tools to learn from collections of text data, such as social media, books, newspapers, and emails. The goal is similar to a human learning by reading such material. However, automated algorithms can learn from massive amounts of text, far more than a human can read. NLP is driving a new wave of chatbots and virtual assistants that let a single system answer queries from millions of users.

NLP is a branch of artificial intelligence that has many important implications on the ways that computers and humans interact. Human language, developed over thousands and thousands of years, has become a nuanced form of communication that carries a wealth of information that often transcends the words alone. NLP will become an important technology in bridging the gap between human communication and digital data.

Course Outline

Chapter 1: Introduction to Text Mining and NLP

Goal:

In this module, you will learn about text mining and the ways of extracting and reading data from some common file types including NLTK corpora

Topics:

       Overview of Text Mining

       Need of Text Mining

       Natural Language Processing (NLP) in Text Mining

       Applications of Text Mining

       OS Module

       Reading, Writing to text and word files

       Setting the NLTK Environment

       Accessing the NLTK Corpora

Hands-on Labs

       No lab

 

Chapter 2: Extracting, Cleaning and Pre-processing Text

Learning Objectives:

This module will help you understand some ways of text extraction and cleaning using NLTK

Topics:

       Tokenization

       Frequency Distribution

       Different Types of Tokenizers

       Bigrams, Trigrams & Ngrams

       Stemming

       Lemmatization

       Stopwords

       POS Tagging

       Named Entity Recognition
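To give a feel for the first few topics, here is a sketch that uses only the standard library instead of NLTK. The regular expression, the tiny stopword list, and the helper names are assumptions for illustration; NLTK provides its own tokenizers, stopword lists, and frequency tools.

```python
import re
from collections import Counter

STOPWORDS = {"the", "a", "is", "on", "and"}  # tiny illustrative list


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def ngrams(tokens, n):
    """Consecutive n-token tuples: bigrams for n=2, trigrams for n=3."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


text = "The cat sat on the mat and the cat slept."
tokens = tokenize(text)
content_words = [t for t in tokens if t not in STOPWORDS]  # stopword removal
frequency = Counter(content_words)                         # frequency distribution
bigrams = ngrams(content_words, 2)
```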

Hands-on Labs

       No lab

 

Chapter 3: Analyzing Sentence Structure

Goal:

In this Module, you will learn how to analyze a sentence structure using a group of words to create phrases and sentences using NLP and the rules of English grammar

Topics:

       Syntax Trees

       Chunking

       Chinking

       Context Free Grammars (CFG)

       Automating Text Paraphrasing

Hands-on Labs

       No Lab

 

Chapter 4: Text Classification

Goal:

In this chapter, you will explore text classification, vectorization techniques and processing using scikit-learn.

Topics:

       Machine Learning: Brush Up

       Bag of Words

       Count Vectorizer

       Term Frequency (TF)

       Inverse Document Frequency (IDF)

       Converting text to features and labels

       Multinomial Naive Bayes Classifier

       Leveraging Confusion Matrix
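Bag of words and TF-IDF can be sketched by hand before moving to scikit-learn. The documents and helper names below are invented, and the IDF is the plain unsmoothed form log(N / df); library vectorizers usually add smoothing and normalisation, so their numbers will differ.

```python
import math
from collections import Counter

docs = [
    "great movie great cast",
    "boring movie",
    "great story",
]
tokenized = [doc.split() for doc in docs]
vocabulary = sorted({word for doc in tokenized for word in doc})


def bag_of_words(tokens):
    """Count vector over the shared vocabulary."""
    counts = Counter(tokens)
    return [counts[word] for word in vocabulary]


def tf_idf(tokens):
    """Term frequency times unsmoothed inverse document frequency."""
    counts = Counter(tokens)
    vector = []
    for word in vocabulary:
        tf = counts[word] / len(tokens)
        df = sum(1 for doc in tokenized if word in doc)
        vector.append(tf * math.log(len(tokenized) / df))
    return vector


count_vectors = [bag_of_words(doc) for doc in tokenized]
tfidf_vectors = [tf_idf(doc) for doc in tokenized]
```

Words that occur in many documents get a low IDF, so TF-IDF down-weights them relative to raw counts.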

Hands-on Labs

       No Lab

 

Module Project

In this project, you will perform sentiment classification on a movie rating dataset.

At the end of this module, you should be able to:

       Implement all the text processing techniques starting with tokenization

       Express your end-to-end work on Text Mining

       Implement Machine Learning along with Text Processing

 


 

Data Science Capstone Project

Auto Insurance Case Study

Learning Objectives:

The capstone project will provide you with a business case. You will solve it by applying the skills you’ve learned in the courses of the master’s program. The capstone project requires the following skills:

Data Exploration

       Checking Data Size

       Note the important features

Data Wrangling

       Handling Imbalanced Data

       MetaData Creation

       Statistics on the Data

       Identify Missing Variable

       Rectify Missing Variable

       One Hot Encoding

       Scaling: Standard Scaler & Min Max Scaler

Data Exploration

       Data Visualization

Machine Learning

       PCA

       Logistic Regression

       Generating F1 Score Metric

       Linear SVC Classifier

       XGBoost Classifier

       AdaBoost Classifier
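The F1 score mentioned above combines precision and recall, which matters when the data is imbalanced. The sketch below computes it from scratch on invented labels; `classification_metrics` is a hypothetical helper, and a library metric would normally be used in the project.

```python
def classification_metrics(y_true, y_pred, positive=1):
    """Return (precision, recall, f1) for the given positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    # F1 is the harmonic mean of precision and recall
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]

precision, recall, f1 = classification_metrics(y_true, y_pred)
```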


Contact us directly: (+994 51) 433 64 51

  • Address
  • Cəfər Cabbarlı küç. 609, Bakı / Globus Center

  • © 2014-2023 Orient Academy
