Introduction to AI, ML, DL, NLP, and CV and their types
Setting up the development environment (Python, Jupyter Notebook, libraries: NumPy, Pandas, Scikit-learn, TensorFlow, PyTorch, OpenCV, NLTK, etc.)
Overview of the workflow and common techniques
Definition of data science and its role in various industries.
Explanation of the data science lifecycle and its key stages.
Overview of the different types of data: structured, unstructured, and semi-structured.
Discussion of the importance of data collection, data quality, and data preprocessing.
Introduction to Pandas, a Python library for data manipulation and analysis.
Overview of NumPy, a fundamental package for scientific computing with Python.
Explanation of key data structures in Pandas: Series and DataFrame
Hands-on exploration of data using Pandas to summarize, filter, and transform data.
Data cleaning techniques, handling missing values, and dealing with outliers.
Statistical analysis of data using NumPy functions.
Introduction to data visualization and its importance in data analysis.
Overview of Matplotlib, a popular plotting library in Python.
Exploring different types of plots: line plots, scatter plots, bar plots, histograms, etc.
Customizing plots with labels, titles, colors, and styles.
Introduction to Seaborn, a Python data visualization library based on Matplotlib.
Advanced plotting techniques with Seaborn: heatmaps, pair plots, and categorical plots.
Introduction to Plotly, an interactive plotting library for creating web-based visualizations.
Creating interactive and dynamic visualizations with Plotly
Hands-on: Instagram Reach Analysis
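Illustrative sketch for the data cleaning topics above (handling missing values and outliers with Pandas). This is not official course material: the column names and values are invented, and median imputation plus the 1.5 * IQR rule are just one reasonable choice among several.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data: column names and values are invented for illustration.
df = pd.DataFrame({
    "impressions": [125.0, 135.0, np.nan, 128.0, 5000.0, 131.0],
    "source": ["home", "explore", "home", None, "hashtag", "home"],
})

# Fill missing numbers with the column median and missing labels with a placeholder.
df["impressions"] = df["impressions"].fillna(df["impressions"].median())
df["source"] = df["source"].fillna("unknown")

# Flag outliers with the 1.5 * IQR rule and keep the remaining rows.
q1, q3 = df["impressions"].quantile([0.25, 0.75])
iqr = q3 - q1
is_outlier = (df["impressions"] < q1 - 1.5 * iqr) | (df["impressions"] > q3 + 1.5 * iqr)
cleaned = df[~is_outlier]

print(cleaned.describe())
```

Dropping flagged rows is only one option; capping the values or inspecting them by hand is often preferable on real data.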
Introduction to Data Engineering: Data cleaning, transformation, and integration
Data cleaning and handling missing values: Imputation, deletion, and outlier treatment
Feature Engineering techniques: Creating new features, handling date and time variables, and encoding categorical variables
Data Scaling and Normalization: Standardization, min-max scaling, etc.
Dealing with categorical variables: One-hot encoding, label encoding, etc.
Introduction to web scraping: Tools, libraries, and ethical considerations
Scraping data from websites using libraries like BeautifulSoup and requests: HTML parsing, locating elements, and extracting data
Handling different types of data on websites: Tables, forms, etc.
Storing scraped data in appropriate formats: CSV, JSON, or databases
Hands-on: Scraping Data from Static / Dynamic Websites
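Illustrative sketch for the scaling and encoding topics listed earlier in this module. This is not official course material: the data frame is invented, and the arithmetic is written out with NumPy and Pandas so each transformation is visible.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data, invented for illustration.
df = pd.DataFrame({
    "age": [20.0, 30.0, 40.0, 50.0],
    "city": ["Pune", "Delhi", "Pune", "Mumbai"],
})

age = df["age"].to_numpy()

# Min-max scaling maps the values onto the range [0, 1].
age_minmax = (age - age.min()) / (age.max() - age.min())

# Standardization gives zero mean and unit variance (population standard deviation).
age_standard = (age - age.mean()) / age.std()

# One-hot encoding: one indicator column per category.
one_hot = pd.get_dummies(df["city"], prefix="city")

# Label encoding: one integer code per category, in sorted order.
labels, categories = pd.factorize(df["city"], sort=True)

print(age_minmax, age_standard, one_hot, labels, list(categories), sep="\n")
```

In a real pipeline the scaling parameters would be computed on the training split only and then reused on the test split.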
Introduction to Regression: Definition, types, and use cases
Linear Regression: Theory, cost function, gradient descent, and assumptions
Polynomial Regression: Adding polynomial terms, degree selection, and overfitting
Lasso and Ridge Regression: Regularization techniques for controlling model complexity
Evaluation metrics for regression models: Mean Squared Error (MSE), R-squared, and Mean Absolute Error (MAE)
Hands-On - Real Time Project
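Illustrative sketch for the linear regression topics above (cost function and gradient descent). This is not official course material: the data is synthetic, and the learning rate and iteration count are assumptions chosen for this toy problem.

```python
import numpy as np

# Synthetic data from y = 3x + 2 plus a little noise (values invented for illustration).
rng = np.random.default_rng(0)
x = rng.uniform(0, 1, size=200)
y = 3.0 * x + 2.0 + rng.normal(0, 0.05, size=200)

w, b = 0.0, 0.0          # slope and intercept, initialised at zero
learning_rate = 0.5      # assumed value, suitable for this toy data only

for _ in range(2000):
    error = (w * x + b) - y
    # Gradients of the mean squared error with respect to w and b.
    grad_w = 2.0 * np.mean(error * x)
    grad_b = 2.0 * np.mean(error)
    w -= learning_rate * grad_w
    b -= learning_rate * grad_b

mse = np.mean(((w * x + b) - y) ** 2)
print(f"w={w:.3f}, b={b:.3f}, MSE={mse:.5f}")
```

The fitted slope and intercept should land close to the values used to generate the data.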
Introduction to Classification: Definition, types, and use cases
Logistic Regression: Theory, logistic function, binary and multiclass classification
Decision Trees: Construction, splitting criteria, pruning, and visualization
Random Forests: Ensemble learning, bagging, and feature importance
Evaluation metrics for classification models: Accuracy, Precision, Recall, F1-score, and ROC curves
Implementation of classification models using scikit-learn library
Hands-On - Heart Disease Detection & Food Order Prediction
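Illustrative sketch for the classification metrics listed above. This is not official course material: the labels are invented, and the metrics are computed by hand from the confusion-matrix counts for the binary case.

```python
# Hypothetical true labels and predictions for a binary classifier.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]

pairs = list(zip(y_true, y_pred))
tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives

accuracy = (tp + tn) / len(pairs)
precision = tp / (tp + fp)
recall = tp / (tp + fn)
f1 = 2 * precision * recall / (precision + recall)

print(accuracy, precision, recall, f1)
```

On imbalanced data, accuracy alone can look good while recall on the minority class is poor, which is why precision, recall, and F1 are reported alongside it.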
Support Vector Machines (SVM): Study SVM theory, different kernel functions (linear, polynomial, radial basis function), and the margin concept. Implement SVM classification and regression, and evaluate the models.
K-Nearest Neighbors (KNN): Understand the KNN algorithm, distance metrics, and the concept of K in KNN. Implement KNN classification and regression, and evaluate the models.
Naive Bayes: Learn about the Naive Bayes algorithm, conditional probability, and Bayes' theorem. Implement Naive Bayes classification, and evaluate the model's performance.
Hands-On - Contact Tracing & Sarcasm Detection
AdaBoost: Boosting technique, weak learners, and iterative weight adjustment
Gradient Boosting (XGBoost): Gradient boosting algorithm, regularization, and hyperparameter tuning
Evaluation and fine-tuning of ensemble models: Cross-validation, grid search, and model selection
Handling imbalanced datasets: Techniques for dealing with class imbalance, such as oversampling and undersampling
Hands-On - Medical Insurance Price Prediction
Introduction to Clustering: Definition, types, and use cases
K-means Clustering: Algorithm steps, initialization methods, and elbow method for determining the number of clusters
DBSCAN (Density-Based Spatial Clustering of Applications with Noise): Core points, density reachability, and epsilon-neighborhoods
Evaluation of clustering algorithms: Silhouette score, cohesion, and separation metrics
Introduction to Dimensionality Reduction: Curse of dimensionality, feature extraction, and feature selection
Principal Component Analysis (PCA): Eigenvectors, eigenvalues, variance explained, and dimensionality reduction
Implementation of PCA using scikit-learn library
Hyperparameter tuning using GridSearchCV and RandomizedSearchCV
Model selection and comparison
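Illustrative sketch for the PCA topics above (eigenvectors, eigenvalues, and variance explained). This is not official course material: the data is synthetic, and the decomposition is written with NumPy so each step is visible rather than using a library PCA class.

```python
import numpy as np

# Synthetic 2-D data stretched along the direction (1, 1), invented for illustration.
rng = np.random.default_rng(0)
t = rng.normal(0, 3.0, size=500)
X = np.column_stack([t, t]) + rng.normal(0, 0.3, size=(500, 2))

# 1. Centre the data, 2. compute the covariance matrix, 3. eigendecompose it.
X_centered = X - X.mean(axis=0)
cov = np.cov(X_centered, rowvar=False)
eigenvalues, eigenvectors = np.linalg.eigh(cov)

# Sort components by decreasing eigenvalue (eigh returns them in ascending order).
order = np.argsort(eigenvalues)[::-1]
eigenvalues, eigenvectors = eigenvalues[order], eigenvectors[:, order]

explained_ratio = eigenvalues / eigenvalues.sum()

# Project onto the first principal component to reduce 2 dimensions to 1.
X_reduced = X_centered @ eigenvectors[:, :1]

print(explained_ratio, X_reduced.shape)
```

Because the data varies almost entirely along one direction, the first component should account for nearly all of the variance.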
Text Preprocessing: Learn about tokenization, stemming, lemmatization, stop word removal, and other techniques for text preprocessing.
Text Representation: Explore techniques such as Bag-of-Words (BoW), TF-IDF, and word embeddings (e.g., Word2Vec, GloVe) for representing text data.
Sentiment Analysis: Study sentiment analysis techniques, build a sentiment analysis model using supervised learning, and evaluate its performance.
Hands-On - Real Time Sentiment Analysis
Introduction to Recommendation Systems: Understand the concept of recommendation systems, different types (collaborative filtering, content-based, hybrid), and evaluation metrics.
Collaborative Filtering: Explore collaborative filtering techniques, including user-based and item-based approaches, and implement a collaborative filtering model.
Content-Based Filtering: Study content-based filtering methods, such as TF-IDF and cosine similarity, and build a content-based recommendation system.
Deployment and Future Directions: Discuss the deployment of recommendation systems and explore advanced topics in NLP and recommendation systems.
Hands-On - News Recommendation System
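Illustrative sketch for the content-based filtering topic above (TF-IDF with cosine similarity). This is not official course material: the headlines and identifiers are invented, and the unsmoothed TF-IDF formula used here is one textbook variant, so library implementations may give different numbers.

```python
import math
from collections import Counter

# Hypothetical headlines; the ids and texts are invented for illustration.
articles = {
    "a1": "stock market rallies as tech shares rise",
    "a2": "tech shares lead stock market gains",
    "a3": "local team wins football championship",
}

tokenized = {key: text.split() for key, text in articles.items()}
n_docs = len(tokenized)
# Document frequency: in how many articles each term appears.
doc_freq = Counter(term for tokens in tokenized.values() for term in set(tokens))

def tfidf(tokens):
    """Term frequency times log inverse document frequency (unsmoothed)."""
    counts = Counter(tokens)
    return {term: (count / len(tokens)) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()}

def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(u[term] * v[term] for term in u.keys() & v.keys())
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

vectors = {key: tfidf(tokens) for key, tokens in tokenized.items()}

# Recommend the article most similar to the one the user just read ("a1").
scores = {key: cosine(vectors["a1"], vec) for key, vec in vectors.items() if key != "a1"}
recommended = max(scores, key=scores.get)
print(recommended, scores)
```

The article that shares vocabulary with the one just read scores highest, while an article with no terms in common scores zero.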
Introduction to Reinforcement Learning: Agent, environment, state, action, and reward
Markov Decision Processes (MDP): Markov property, transition probabilities, and value functions
Q-Learning algorithm: Exploration vs. exploitation, Q-table, and learning rate
Hands-On - Reinforcement Learning projects and exercises
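Illustrative sketch for the Q-learning topics above (Q-table, learning rate, exploration vs. exploitation). This is not official course material: the five-state corridor environment, the reward, and all hyperparameter values are assumptions made up for this example.

```python
import random

# Hypothetical environment: a corridor of 5 states; reaching the last one ends the episode.
N_STATES = 5
ACTIONS = [0, 1]                       # 0 = move left, 1 = move right
alpha, gamma, epsilon = 0.5, 0.9, 0.2  # learning rate, discount, exploration rate (assumed)

def step(state, action):
    """Return (next_state, reward, done) for the toy corridor."""
    next_state = max(0, state - 1) if action == 0 else state + 1
    done = next_state == N_STATES - 1
    return next_state, (1.0 if done else 0.0), done

random.seed(0)
q_table = [[0.0, 0.0] for _ in range(N_STATES)]

for _ in range(500):
    state, done = 0, False
    while not done:
        if random.random() < epsilon:
            action = random.choice(ACTIONS)          # explore
        else:
            best = max(q_table[state])               # exploit, breaking ties at random
            action = random.choice([a for a in ACTIONS if q_table[state][a] == best])
        next_state, reward, done = step(state, action)
        # Q-learning update: move the estimate towards reward + discounted best next value.
        target = reward + (0.0 if done else gamma * max(q_table[next_state]))
        q_table[state][action] += alpha * (target - q_table[state][action])
        state = next_state

greedy_policy = [max(ACTIONS, key=lambda a: q_table[s][a]) for s in range(N_STATES - 1)]
print(greedy_policy)
```

After training, the greedy policy should choose "right" in every non-terminal state, since that is the shortest path to the reward.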
Introduction
Natural Language Processing
Transformers, what can they do?
How do Transformers work?
Encoder Models
Decoder Models
Sequence-to-Sequence Models
Bias and Limitations
Mastering NLP
Summary
Introduction to audio data
Audio classification with a pipeline
Automatic speech recognition with a pipeline
Audio generation with a pipeline
Hands-on exercise
Hands-On - Speech-to-speech translation
Imaging
Imaging in Real-life
What Is Computer Vision
Pre-processing for Computer Vision Tasks
Applications of Computer Vision
Feature Description
Real-world Applications of Feature Extraction in Computer Vision
Feature Matching
Hands-On - Real-Time Detection
Introduction to Flask / Streamlit / Gradio web framework
Creating a Flask / Streamlit / Gradio application for ML model deployment
Integrating data preprocessing and ML model
Designing a user-friendly web interface
Deployment using AWS / PythonAnywhere / Streamlit Cloud / Spaces