Search results for Engineering

Frontmatter
Parteek Bhatia, Thapar University, India
Book:

Machine Learning with Python

Published online:

22 February 2025

Print publication:

31 May 2025, pp i-iv
- Chapter

5 - Simple Linear Regression
Parteek Bhatia, Thapar University, India
Book:

Machine Learning with Python

Published online:

22 February 2025

Print publication:

31 May 2025, pp 187-240
- Chapter
Summary

Chapter Objectives
• To understand the need for simple linear regression.
• To comprehend the concept of hypothesis and parameters of simple linear regression.
• To understand mathematical modeling of cost function and its minimization.
• To understand the importance and different steps of the gradient descent algorithm.
• To comprehend the mathematical modeling of the gradient descent algorithm.
• To understand the role of learning rate α.
5.1 Introduction to Simple Linear Regression
As discussed in earlier chapters, regression predicts a continuous value or real-valued output. This chapter will discuss how regression works (from a mathematical aspect) to predict the continuous value for the given dataset. Our first learning algorithm is simple linear regression. In this section, we will discuss the fundamental concepts and mathematical modeling of simple linear regression.
We usually have a dependent variable having a continuous value whose value we wish to predict based on one or more independent variables. If we have only one independent or input variable, this situation is known as simple linear regression (also called univariate regression). If we have multiple independent or input variables, it is known as multiple linear regression or multivariate regression.
Linear regression can be used to study patterns in many real-life scenarios. Consider a research lab where a researcher wants to understand how the stipend is affected by years of experience; in simple words, we wish to predict the stipend from the researcher's years of experience. Machine learning (ML) is about learning from past experiences or data. Thus, to predict a researcher's stipend, we have to collect data about past researchers, specifically their stipend and experience.
Supervised learning models need a dataset called a training set. We will use the dataset of researchers’ stipends and their corresponding years of experience given in Table 5.1 to train the model, and our job will be to build an ML model that learns from this data and then predicts the stipend of a researcher based on their experience. Here, the stipend is the dependent or output variable because it depends on the researcher's years of experience, and years of experience is the independent or input variable. Since there is only one input variable, we will use simple linear regression to build the ML model.
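The chapter objectives above mention a hypothesis, a cost function, and gradient descent with a learning rate α. A minimal sketch of how these pieces could fit together is given below. The data values are hypothetical (they are not the values of Table 5.1), and the parameter names b0 and b1, the halved mean squared error, and the chosen α and step count are assumptions made for illustration.

```python
# Hypothetical training set (NOT Table 5.1): stipend = 20 + 5 * years.
experience = [1.0, 2.0, 3.0, 4.0, 5.0]
stipend = [25.0, 30.0, 35.0, 40.0, 45.0]


def predict(b0, b1, x):
    """Hypothesis of simple linear regression: a straight line."""
    return b0 + b1 * x


def cost(b0, b1, xs, ys):
    """Halved mean squared error between predictions and targets."""
    m = len(xs)
    return sum((predict(b0, b1, x) - y) ** 2 for x, y in zip(xs, ys)) / (2 * m)


def gradient_descent(xs, ys, alpha=0.05, steps=5000):
    """Repeatedly move both parameters against the gradient of the cost."""
    b0, b1 = 0.0, 0.0
    m = len(xs)
    for _ in range(steps):
        errors = [predict(b0, b1, x) - y for x, y in zip(xs, ys)]
        grad0 = sum(errors) / m
        grad1 = sum(e * x for e, x in zip(errors, xs)) / m
        # Simultaneous update: both gradients use the old parameter values.
        b0 -= alpha * grad0
        b1 -= alpha * grad1
    return b0, b1


b0, b1 = gradient_descent(experience, stipend)
print("intercept:", round(b0, 3), "slope:", round(b1, 3))
print("cost at the fitted line:", cost(b0, b1, experience, stipend))
print("predicted stipend for 6 years:", round(predict(b0, b1, 6.0), 3))
```

A learning rate that is too large would make the updates overshoot and diverge, while a very small one would need many more steps; the value used here is simply one that converges on this toy data.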

17 - Implementation of the ANN
Parteek Bhatia, Thapar University, India
Book:

Machine Learning with Python

Published online:

22 February 2025

Print publication:

31 May 2025, pp 865-888
- Chapter
Summary

Chapter Objectives
• To understand the process of implementation of the artificial neural network (ANN).
• To understand the role of Keras and its different modules in building the ANN.
• To understand the syntax for adding input layer, hidden layers, and output layer to ANN.
• To perform a compilation of the ANN model.
• To fit the ANN model on the training dataset.
• To make predictions with a trained ANN model.
• To evaluate the performance of the ANN classifier by using confusion matrix, precision, and recall.
17.1 Building Artificial Neural Network for Cancer Detection
Machine learning (ML) can play a crucial role in cancer detection. In this chapter, we will build a neural network for cancer detection by using a breast cancer dataset.
You can download this dataset by using the following link.
https://www.kaggle.com/datasets/uciml/breast-cancer-wisconsin-data
The acquired data contains the records of cancer patients in the United States. These records were created by Dr William H. Wolberg and others at the University of Wisconsin, USA. The whole data has 32 columns along with 569 rows. The prominent attributes are the radius, texture, perimeter, smoothness, concavity, symmetry, area, compactness, concave points, and the fractal dimension of the tumor. A snapshot of the dataset is depicted in Figure 17.1.
The dataset has a diagnosis column used as an output variable, while the remaining variables will be used as input data. The class attribute diagnosis has two classes, i.e., malignant identified as M and benign identified as B. Thus, it will be a binary classifier.
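As a rough illustration of separating the diagnosis column from the inputs, the sketch below encodes M as 1 and B as 0 using plain Python. The three records, their feature names, and their values are hypothetical stand-ins rather than rows from the real dataset, and the 1/0 encoding is an assumption, not necessarily the one used in the chapter.

```python
# Hypothetical records with illustrative values; the real dataset has
# many more columns and 569 rows.
records = [
    {"radius": 18.2, "texture": 11.0, "diagnosis": "M"},
    {"radius": 12.1, "texture": 17.5, "diagnosis": "B"},
    {"radius": 20.4, "texture": 21.3, "diagnosis": "M"},
]

# Assumed encoding: malignant -> 1, benign -> 0.
LABELS = {"M": 1, "B": 0}


def split_inputs_and_target(rows, target="diagnosis"):
    """Return (X, y): input feature lists and encoded class labels."""
    X = [[value for key, value in row.items() if key != target] for row in rows]
    y = [LABELS[row[target]] for row in rows]
    return X, y


X, y = split_inputs_and_target(records)
print(X)
print(y)
```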
The code and dataset used in this chapter are also available at the following link.
https://github.com/bhatiaparteek/ml_with_python/tree/main/Chapter_17_ANN
To build ANN over this dataset, the whole procedure can be divided into three sub-parts below.
i. Loading the dataset and performing pre-processing of data
ii. Building the artificial neural network (ANN)
iii. Making predictions and performing the validations
Let us perform all these operations by following a step-by-step approach.
17.2 Loading the Dataset and Pre-processing
In this step, we will perform tasks of loading the dataset and pre-processing.
17.2.1 Step 1: Importing the Libraries
To perform this task, we need to import two libraries, i.e., Pandas and NumPy, as shown in code snippet 1. NumPy facilitates mathematical operations, while Pandas specializes in loading and extracting datasets.

15 - Implementation of Association Mining
Parteek Bhatia, Thapar University, India
Book:

Machine Learning with Python

Published online:

22 February 2025

Print publication:

31 May 2025, pp 807-820
- Chapter
Summary

Chapter Objectives
• To implement the Apriori algorithm for transaction dataset.
• To prepare the dataset in the form of a transactions list for its processing.
• To learn parameter tuning of the Apriori algorithm.
• To understand and analyze the results produced by the model.
15.1 Building Association Mining Model
In this chapter, we will implement the Apriori algorithm in Python to solve a business problem. Association rule mining is one of the most popular machine learning applications, which is often used by supermarket chains and retail outlets to find the relation between the sales of item X and item Y. It is often called a “market basket” analysis. Discovering associations among attributes can lead to fact-based marketing strategies for store floor plans, special discounts, coupon offerings, product clustering, and catalog design to identify items that need to be put in combo packs.
Let us solve the problem of one such retail store by implementing the Apriori algorithm in Python.
Problem statement: Consider a supermarket store selling 16 products. To better understand the association between the sales of different items, the store manager decides to perform a market-basket analysis through the Apriori algorithm. The goal is to find association mining rules to improve the store's sales. The list of 16 items in-store is shown in Table 15.1, and the manager is analyzing 25 transactions of the sale of these items as given in Table 15.2. The number of items and transactions in a real application will be larger. In Table 15.2, each row in the table represents one transaction, i.e., the items bought by one customer.
In this dataset, the manager is interested in finding association rules that should have a minimum of 25% support and 70% confidence.
The implementation of association mining can be broken down into multiple steps. These steps are described below:
Step 1: Importing libraries and loading the dataset—We must import the required libraries for model building into the environment, and then we must load the required dataset.
Step 2: Making transactions—Apriori takes an input as a set of transactions in the desired structure; thus, we need to prepare a set of transactions containing the combination of items.
Step 3: Building the model—An Apriori model will be trained on the lists of transactions, and rules will be generated based on some key parameters (support and confidence).
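To make the support and confidence thresholds concrete, the sketch below computes both measures directly for two candidate rules. It does not implement the Apriori algorithm itself, and the four transactions are hypothetical; they are not taken from Table 15.2. Only the 25% and 70% thresholds come from the problem statement.

```python
# Hypothetical transactions (not Table 15.2); each set is one basket.
transactions = [
    {"bread", "milk"},
    {"bread", "butter", "milk"},
    {"bread", "butter"},
    {"milk"},
]

MIN_SUPPORT = 0.25     # thresholds stated in the problem statement
MIN_CONFIDENCE = 0.70


def support(itemset, baskets):
    """Fraction of baskets that contain every item of the itemset."""
    itemset = set(itemset)
    return sum(itemset <= basket for basket in baskets) / len(baskets)


def confidence(antecedent, consequent, baskets):
    """support(antecedent and consequent) / support(antecedent)."""
    both = set(antecedent) | set(consequent)
    return support(both, baskets) / support(antecedent, baskets)


def passes(antecedent, consequent, baskets):
    """True if the rule antecedent -> consequent meets both thresholds."""
    both = set(antecedent) | set(consequent)
    return (support(both, baskets) >= MIN_SUPPORT
            and confidence(antecedent, consequent, baskets) >= MIN_CONFIDENCE)


print("butter -> bread:", passes({"butter"}, {"bread"}, transactions))
print("bread -> milk:", passes({"bread"}, {"milk"}, transactions))
```

In this toy data, every basket with butter also contains bread, so the first rule has a confidence of 1.0, whereas only two of the three bread baskets contain milk, which keeps the second rule below the 70% threshold.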

4 - Implementing Data Pre-processing in Python
Parteek Bhatia, Thapar University, India
Book:

Machine Learning with Python

Published online:

22 February 2025

Print publication:

31 May 2025, pp 155-186
- Chapter
Summary

Chapter Objectives
• To understand the need for importing libraries like NumPy, Pandas, Matplotlib, and scikit-learn.
• To learn the steps to import dataset.
• To understand the process for handling missing values.
• To discuss the steps for handling categorical data.
• To understand the need and process of splitting the dataset into training and testing datasets.
• To discuss the steps to perform feature scaling by using normalization and standardization.
Machine learning (ML) algorithms work on cleaned data. Usually, the data we collect for building ML models suffers from noise, missing values, inconsistent data types, and different data scales. This makes pre-processing a very important phase in preparing the data for building ML models. In short, data pre-processing refers to a set of transformations applied to the data, before it is fed to the ML algorithm, to make it fit for that algorithm. It generally involves the following steps:
Step 1—Importing libraries: It involves importing the necessary libraries that are required to carry out the subsequent data manipulation and cleaning tasks.
Step 2—Loading the dataset: The dataset that needs to be pre-processed must be loaded.
Step 3—Handling the missing values: Datasets often contain missing or null values; these values need to be handled appropriately.
Step 4—Handling the categorical data: In the data pre-processing phase, it is crucial to address categorical attributes that often contain multiple categories. Handling categorical data becomes an important step to ensure proper treatment and transformation of these attributes.
Step 5—Splitting the dataset into training and testing datasets: Training and testing is the most important part of ML; thus, we need to split the dataset into training and testing subsets before building the ML models.
Step 6—Feature scaling: In datasets, the range of data often varies, or data is often of different scales. Thus, feature scaling needs to be done to ensure uniformity in results.
It is important to note that not all of these steps are always necessary; depending on the nature of the dataset, some of them may be skipped when building the model. In the coming sections, we will discuss why these steps are needed and how to perform them in Python.
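Steps 3 and 6 can be sketched with the standard library alone, as shown below. Filling missing values with the column mean is only one possible strategy, the use of the population standard deviation is an assumption, and the sample column is hypothetical; the chapter itself may rely on library helpers instead.

```python
from statistics import mean, pstdev


def impute_mean(values):
    """Replace missing entries (None) with the mean of the known entries."""
    known = [v for v in values if v is not None]
    fill = mean(known)
    return [fill if v is None else v for v in values]


def normalize(values):
    """Min-max normalization to [0, 1]; assumes the values are not all equal."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]


def standardize(values):
    """Rescale to zero mean and unit (population) standard deviation."""
    mu, sigma = mean(values), pstdev(values)
    return [(v - mu) / sigma for v in values]


ages = [20, None, 40, 60]          # hypothetical column with one missing value
filled = impute_mean(ages)
normalized = normalize(filled)
standardized = standardize(filled)
print(filled)
print(normalized)
print([round(v, 3) for v in standardized])
```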

20 - Recurrent Neural Network
Parteek Bhatia, Thapar University, India
Book:

Machine Learning with Python

Published online:

22 February 2025

Print publication:

31 May 2025, pp 969-1006
- Chapter
Summary

Chapter Objectives
• To discuss the relationship between neural networks and the human brain.
• To learn the way to extend artificial neural network (ANN) to recurrent neural network (RNN).
• To know the limitations of feed-forward networks.
• To understand the working principle of RNNs.
• To understand the mathematical modeling of RNN.
• To know the limitations of RNNs.
• To get familiar with issues of vanishing and exploding gradients.
• To comprehend the concept of long short-term memory (LSTM).
• To understand the differences between RNN cells and LSTM cells.
• To understand the role of the input gate, forget gate, cell state, and output gate in the working of LSTM.
• To learn about the applications of RNN and LSTM.
20.1 Neural Networks and Human Brain
The inspiration to build artificial neural networks (ANNs) has come from the deep desire to simulate the working of the human brain. As our understanding of the human brain is getting enriched and improved daily due to the ongoing research on this topic, researchers are also improving the ANNs accordingly. One such recent addition is the recurrent neural networks (RNNs) and the long short-term memory (LSTM). In this chapter, we will discuss these developments in a simplified manner so that our readers can understand this amazing technology and use it for their development projects.
Sections 20.1 and 20.2 draw their inspiration from the blog titled “The Ultimate Guide to Recurrent Neural Networks (RNN)” published by SuperDataScience Team. Most of the figures used in these sections are adapted from this blog.
For detailed information on the blog, please refer to the “Additional Resources” section of this chapter.
Research on the human brain gives us the idea that the human brain has three main parts: cerebrum, cerebellum, and brainstem, as shown in Figure 20.1. Let us briefly discuss the main functions of these parts and their components to understand the association between neural networks and the human brain.
Cerebrum
From Figure 20.1, we can understand that the cerebrum has four lobes. These are as follows.

Preface
Parteek Bhatia, Thapar University, India
Book:

Machine Learning with Python

Published online:

22 February 2025

Print publication:

31 May 2025, pp lvii-lviii
- Chapter
Summary

Welcome to the World of Machine Learning!
In this ever-evolving era of technology, the field of machine learning (ML) stands at the forefront of innovation, promising unprecedented insights and solutions to complex problems. This textbook provides a comprehensive understanding of the theoretical foundations of ML algorithms, coupled with practical implementation using Python, designed to be a companion for both beginners venturing into the field and seasoned practitioners seeking to deepen their understanding.
Why Machine Learning?
ML is not just a technical endeavor; it is a transformational force shaping the future of how we interact with data. From enhancing search engines and revolutionizing social media to powering self-driving cars and advancing artificial intelligence (AI), the applications of ML are boundless. ML has paved the way for developing sophisticated chatbots, generative AI models, and AI copilots. This book aims to demystify the intricate concepts of ML, making them accessible to learners of all backgrounds.
What This Book Offers
- Foundations: We begin with the fundamentals, laying a solid groundwork to help you grasp the core concepts of ML.
- “Learning by Doing” Approach: This book adopts a “learning by doing” approach, offering step-by-step coding instructions for various ML techniques. The aim is to empower you with the knowledge and skills to implement these principles effectively.
- Real-World Applications: Understanding theory is essential, but applying it to real-world scenarios is where the true power of ML unfolds. This book bridges theory and application, ensuring a holistic learning experience.
For Whom This Book Is Intended
Whether you are a student exploring the realms of ML for the first time, a data professional aiming to expand your skill set, or a business leader seeking insights into how ML can elevate your organization, this book is crafted with you in mind.
How to Use This Book
Feel free to navigate the chapters based on your familiarity with the subject. If you’re a beginner, start from the beginning, and if you’re seeking advanced knowledge, dive into specific sections of interest. Explore GitHub resources that provide access to datasets, sample code, and examples used in this book for hands-on learning. For enhanced visual understanding, this book is complemented by an online video course available at learncompscience.com, allowing you to reinforce your knowledge through engaging video sessions.

Acknowledgments
Parteek Bhatia, Thapar University, India
Book:

Machine Learning with Python

Published online:

22 February 2025

Print publication:

31 May 2025, pp lix-lxii
- Chapter

19 - Implementation of Convolutional Neural Network
Parteek Bhatia, Thapar University, India
Book:

Machine Learning with Python

Published online:

22 February 2025

Print publication:

31 May 2025, pp 937-968
- Chapter
Summary

Chapter Objectives
• To build various image classifiers by using the convolutional neural network (CNN).
• To perform image augmentation for extending the dataset.
• To import the Keras library and packages.
• To initialize the CNN model.
• To add convolution layer, pooling, flattening, and full connection operations to build the CNN.
• To perform a compilation of the CNN model.
• To get the predictions from the trained model.
• To improve the accuracy of the model by adding more convolutional and max pooling layers.
19.1 Building Image Classifier with CNN
The convolutional neural network (CNN) is the perfect choice for building an image classifier system. In this chapter, we will implement various CNN classifiers in Python by considering various case studies.
19.2 Dog–Cat Classifier
Consider an image classification problem that involves identifying photos as containing either a cat or a dog. Our task is to develop a CNN model that can identify the object in the image as a dog or a cat. The CNN model will be trained on a dataset of images containing a dog or a cat, and the trained model will then be used for classification. Once the model is built, it can serve as a template for other image classifiers: retrained on a different dataset, the same architecture can predict the classes of that dataset. For example, we can use the same model to classify tumors by training it on medical images.
Preparing the Dataset
To build a dog–cat classifier, we will use 11000 manually annotated photos of dogs and cats, where 5500 photos are of dogs, and the remaining ones are of cats. This is the partial dataset that has been created from the dogs-vs-cats dataset available at Kaggle. The link for this dataset is given below.
https://www.kaggle.com/competitions/dogs-vs-cats/data
This full dataset contains 37500 images (24000 in the train and 13500 in the test folders), which must be manually segregated into dog and cat folders. Since the dataset is huge and segregation is time-consuming, we took 10000 images (5000 images of dogs and 5000 images of cats) in the train folder and 1000 images in the test folder (500 images of dogs and 500 images of cats).
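The segregation into dog and cat folders could be scripted along the following lines. This is a sketch under a stated assumption: that each training image is named with its class as a prefix, such as dog.12.jpg or cat.7.jpg. The folder names and the helper functions are hypothetical and are not taken from the chapter.

```python
from pathlib import Path
import shutil


def label_from_filename(name):
    """Return 'dog' or 'cat' for names such as 'dog.12.jpg' (assumed convention)."""
    label = name.split(".")[0].lower()
    if label not in {"dog", "cat"}:
        raise ValueError(f"cannot infer class from {name!r}")
    return label


def segregate(source_dir, target_dir):
    """Copy every .jpg in source_dir into target_dir/dogs or target_dir/cats."""
    source, target = Path(source_dir), Path(target_dir)
    counts = {"dog": 0, "cat": 0}
    for image in sorted(source.glob("*.jpg")):
        label = label_from_filename(image.name)
        destination = target / f"{label}s"
        destination.mkdir(parents=True, exist_ok=True)
        shutil.copy2(image, destination / image.name)
        counts[label] += 1
    return counts


# Example call with hypothetical paths:
# counts = segregate("train", "dataset/training_set")
```

Copying rather than moving keeps the downloaded folder intact, which makes it easier to rebuild a differently sized subset later.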

12 - Clustering
Parteek Bhatia, Thapar University, India
Book:

Machine Learning with Python

Published online:

22 February 2025

Print publication:

31 May 2025, pp 611-698
- Chapter
Summary

Chapter Objectives
• To define clustering, explain its applications and features.
• To explain various proximity measures for data clustering.
• To discuss various clustering techniques.
• To explain the working principle of the k-means clustering algorithm.
• To discuss hierarchical clustering and its types.
• To discuss agglomerative and divisive clustering techniques.
• To describe the concept of the DBSCAN algorithm.
12.1 Introduction to Clustering
In machine learning (ML), labeling the data is one of the crucial tasks. But sometimes, we do not have the labeled data. Even though the data is not labeled, we can still analyze it using clustering techniques. As you know, algorithms in ML are broadly classified as supervised and unsupervised techniques. In the case of supervised learning, the input data points/examples are labeled, while in unsupervised learning, the input data points/examples are not labeled. Cluster analysis, also called clustering, comes under unsupervised learning. Here, the input data points are not labeled. Clustering is the most popular technique in unsupervised learning.
Clustering is defined as grouping the input data points into various clusters/groups based on their similarity.
A cluster contains objects that are more similar to each other. In other words, during cluster analysis, the data is grouped into classes or clusters, so that records within a cluster (intra-cluster) have high similarity with one another but have high dissimilarities in comparison to objects in other clusters (inter-cluster).
The clustering algorithm aims to minimize the intra-cluster distance and maximize the inter-cluster distance, as shown in Figure 12.1.
An example of clustering is shown in Figure 12.2. The input records have different shapes, but we only have the input data, without any shape labels. The clustering algorithm takes the dimensions of each object and its color as the input features and gathers records whose features are highly similar into a single cluster. In this case, we get three clusters representing three types of records, i.e., the rhombuses, circles, and triangles are clustered separately, as shown in Figure 12.2.
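The idea of small intra-cluster distance and large inter-cluster distance can be checked numerically on a toy example. In the sketch below, the two clusters of 2-D points are hypothetical, and measuring the inter-cluster distance between centroids with the Euclidean metric is one common choice assumed here, not necessarily the measure used in the chapter.

```python
from itertools import combinations
from math import dist

# Hypothetical clusters of 2-D points.
clusters = {
    "a": [(1.0, 1.0), (1.0, 2.0), (2.0, 1.0)],
    "b": [(8.0, 8.0), (9.0, 8.0), (8.0, 9.0)],
}


def mean_intra_cluster_distance(points):
    """Average Euclidean distance over all pairs inside one cluster."""
    pairs = list(combinations(points, 2))
    return sum(dist(p, q) for p, q in pairs) / len(pairs)


def centroid(points):
    """Coordinate-wise mean of the points."""
    return tuple(sum(coords) / len(points) for coords in zip(*points))


def inter_cluster_distance(points_a, points_b):
    """Euclidean distance between the two cluster centroids."""
    return dist(centroid(points_a), centroid(points_b))


intra_a = mean_intra_cluster_distance(clusters["a"])
intra_b = mean_intra_cluster_distance(clusters["b"])
inter_ab = inter_cluster_distance(clusters["a"], clusters["b"])
print(round(intra_a, 3), round(intra_b, 3), round(inter_ab, 3))
```

A good grouping shows the pattern seen here: both intra-cluster values are much smaller than the distance between the clusters.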

18 - Deep Learning and Convolutional Neural Network
Parteek Bhatia, Thapar University, India
Book:

Machine Learning with Python

Published online:

22 February 2025

Print publication:

31 May 2025, pp 889-936
- Chapter
Summary

Chapter Objectives
• To know the limitations of traditional neural networks for image recognition.
• To understand the working principles of convolutional neural network (CNN).
• To understand the architecture of CNN.
• To know the importance of convolution layer, max pooling, flattening, and full connection layer of CNN model.
• To understand the process of training a CNN model.
• To decide the optimal number of epochs to train a neural network.
18.1 Image Recognition
With the convolutional neural network (CNN), image recognition has advanced far beyond what we once imagined. Let us consider the comic scene shown in Figure 18.1, which provides interesting insights into the development of image recognition and depicts a scenario that was plausible a decade ago. Here, a manager asks his computer programmer to “Develop an app which can check whether the user is in a national park or not, when he clicks some photo!” Since this is an easy and feasible task, the programmer responds that it will take only a few hours. The manager's curiosity grows, and he further asks the programmer to check whether the image is of a bird or not. Surprisingly, the programmer responds, “I need a research team and five years for this task.”
This surprised the manager as he expected it to be an easy task. But the programmer who has a knack in the field knows that it is one of the complex problems to be addressed in computer science.
In the last decade, we have provided solutions to many complex problems in the field of computers. But, for the last 50 years, we have been struggling to solve the problems in image recognition. However, thanks to the efforts of researchers and computer scientists across the globe, we can solve these problems now. Even a three-year-old child can identify a bird's photo, but identifying a way by which computers can do the same task was not a cake-walk; hence, it took almost 50 years!
We have finally found a promising approach for object recognition using deep CNN in recent years. In this chapter, we will discuss the working principle and concepts of CNN, a deep neural network approach to solving the problem of image recognition.

7 - Multiple Linear Regression and Polynomial Linear Regression
Parteek Bhatia, Thapar University, India
Book:

Machine Learning with Python

Published online:

22 February 2025

Print publication:

31 May 2025, pp 267-312
- Chapter
Summary

Chapter Objectives
• To comprehend the concept of multiple linear regression.
• To understand the process of handling nominal or categorical attributes and the concept of label encoding, one-hot encoding, and dummy variables.
• To build the multiple regression model.
• To understand the need, concept, and the process to calculate the P-value.
• To comprehend various variable selection methods.
• To comprehend the concept of polynomial linear regression.
• To understand the importance of the degree of independent variables.
7.1 Introduction to Multiple Linear Regression
In simple linear regression, we have one dependent variable and only one independent variable. As we discussed in the previous chapter, the stipend of a research scholar is dependent on his years of research experience.
But most of the time, the dependent variable is influenced by more than one independent variable. When the dependent variable depends on more than one independent variable, it is known as multiple linear regression.
Figure 7.1 indicates the difference between simple and multiple regressions mathematically.
In Figure 7.1(b), b0 is the constant, while x1, x2, ..., xn are the independent variables on which the dependent variable y depends. Mathematically, multiple linear regression has the same form as simple linear regression. The major difference is that multiple linear regression has several independent variables, x1 to xn, instead of the single independent variable x1 of simple linear regression. It is also important to note that b1 to bn are the coefficients of these independent variables, respectively.
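The relationship between the two models can be written as a single prediction function, y = b0 + b1·x1 + ... + bn·xn, where simple linear regression is the case n = 1. The sketch below assumes this standard linear form; the coefficient and feature values are hypothetical.

```python
def predict(b0, coefficients, features):
    """Linear prediction: b0 + b1*x1 + ... + bn*xn."""
    if len(coefficients) != len(features):
        raise ValueError("need exactly one coefficient per feature")
    return b0 + sum(b * x for b, x in zip(coefficients, features))


# Simple linear regression: a single independent variable x1.
y_simple = predict(2.0, [3.0], [4.0])

# Multiple linear regression: three independent variables x1, x2, x3.
y_multi = predict(2.0, [3.0, 0.5, -1.0], [4.0, 10.0, 2.0])

print(y_simple, y_multi)
```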
The price prediction of a house can be viewed as a multiple linear regression problem, where factors such as plot size, number of bedrooms, and location play a significant role in determining the price. Unlike simple linear regression, which relies solely on plot size, multiple linear regression considers various features to accurately estimate the house price.
Let us understand this concept further with a real-life example. Consider a case where a venture capitalist has hired a data scientist to analyze the data of different companies and predict their profit. Identifying the company with the maximum predicted profit will help the venture capitalist select the company to invest in to earn the maximum return.

21 - Implementation of Recurrent Neural Network
Parteek Bhatia, Thapar University, India
Book:

Machine Learning with Python

Published online:

22 February 2025

Print publication:

31 May 2025, pp 1007-1040
- Chapter
Summary

Chapter Objectives
• To understand the conceptual framework for the implementation of recurrent neural network (RNN) using the long short-term memory model.
• To perform data pre-processing on the time series data.
• To install and import TensorFlow, Keras, and other desired packages.
• To build the architecture of RNN.
• To learn the procedure for compiling RNN.
• To perform a fit operation on the compiled RNN model.
• To prepare the test dataset in desired data structure.
• To perform visualization and analysis of the results.
21.1 Introduction
In recent years, the recurrent neural network (RNN) has become one of the most prominent neural networks for predicting values based on time series data. A time series is a collection of data points ordered at even intervals in time. Hence, time series analysis is useful when we want to study how some parameter changes over time. In RNNs, the output from the previous step is fed into the input of the current step. Thus, an RNN could predict the next letter of a word, the next word of a sentence, diesel prices, stock prices, or the weather. In all these tasks, there is a need to remember the previous values, and consequently, the output at time t+1 depends on the output at time t. In this chapter, we will implement an RNN to predict the trend of diesel prices.
We will implement an RNN through long short-term memory (LSTM) units. We will use stacked layers of LSTM because plain RNNs suffer from the issues of vanishing and exploding gradients, whereas LSTM units contain gates through which they can regulate the flow of information. Because of this added advantage, LSTMs are commonly used for implementing RNNs.
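The way an earlier step influences a later one can be sketched with a single-unit plain RNN cell, shown below. This is not an LSTM and not the chapter's model; the tanh activation and the weight values w_x, w_h, and b are assumptions chosen only to show the recurrence.

```python
from math import tanh


def rnn_forward(inputs, w_x=0.5, w_h=0.8, b=0.0, h0=0.0):
    """Run a one-unit plain RNN: h_t = tanh(w_x * x_t + w_h * h_(t-1) + b)."""
    h = h0
    states = []
    for x in inputs:
        # The previous hidden state h feeds into the current step.
        h = tanh(w_x * x + w_h * h + b)
        states.append(h)
    return states


# The input is non-zero only at the first step, yet later states stay non-zero
# because each step carries information forward from the previous one.
with_signal = rnn_forward([1.0, 0.0, 0.0])
without_signal = rnn_forward([0.0, 0.0, 0.0])
print([round(h, 4) for h in with_signal])
print(without_signal)
```

The steadily shrinking states in the first run also hint at why information from distant steps fades in a plain RNN, which is the weakness that LSTM gates are designed to address.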
21.2 Implementation of RNN in Python
Problem Statement: In this experiment, our primary objective is to predict the trend of diesel prices in Delhi, i.e., we wish to predict whether diesel prices in Delhi will follow an upward or downward trend in the near future. It is difficult to predict the exact future price of diesel on a particular day, as our dataset for this problem is quite small. Hence, using an RNN model (implemented through stacked layers of LSTM), we will predict an approximate value of future diesel prices in Delhi, which is sufficient to identify the trend (upward or downward).
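One common way to frame such a price series for a recurrent model, assumed here rather than taken from the chapter, is to cut it into fixed-length windows of past prices, each paired with the price that follows. The sketch below shows this framing together with a simple upward or downward comparison; the prices, the window length, and the predicted value are all hypothetical.

```python
def make_windows(series, window):
    """Pair each run of `window` past values with the value that follows it."""
    X, y = [], []
    for i in range(window, len(series)):
        X.append(series[i - window:i])
        y.append(series[i])
    return X, y


def trend(last_known, predicted):
    """Label the direction of a predicted price relative to the last known one."""
    if predicted > last_known:
        return "upward"
    if predicted < last_known:
        return "downward"
    return "flat"


prices = [70.0, 70.5, 71.0, 70.8, 71.5, 72.0]   # hypothetical daily prices
X, y = make_windows(prices, window=3)
print(X)
print(y)
print(trend(prices[-1], 72.4))                  # 72.4 is a made-up prediction
```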

11 - Implementation of Classification Algorithms
Parteek Bhatia, Thapar University, India
Book:

Machine Learning with Python

Published online:

22 February 2025

Print publication:

31 May 2025, pp 533-610
- Chapter
Summary

Chapter Objectives
• To understand the different steps for implementing classification algorithms in Python.
• To import the desired libraries and dataset.
• To split the dataset into training and testing datasets.
• To perform feature scaling on the data.
• To build the different classification models and predict the test set results.
• To make the confusion matrix for the result analysis.
• To visualize the training and testing set result.
• To implement important classification algorithms like decision tree, random forest, naive Bayes, k-NN, logistic regression, and support vector machine.
11.1 Introduction to Classification Algorithms and Steps for Its Implementation
In this chapter, we will implement several classification algorithms of machine learning (ML) in Python. For this purpose, we will consider the Purchase Alexa Dataset, which contains multiple users’ information and their decision to buy Alexa, recorded as “YES” or “NO”. The dataset features several independent columns and a labeled class attribute. Our main objective is to predict whether a user will buy Alexa or not (class attribute). The class predictions are based on the input attributes composed of several independent columns.
To better understand, we will use the same dataset, i.e., Purchase Alexa Dataset, to implement all the classification algorithms such as decision tree, random forest, naive Bayes, k-NN, logistic regression, and support vector machine (SVM). This will allow us to compare the results and the working of different classification algorithms. We will start by creating a pre-processing data template for our dataset, which shall remain common for all the classification models. Later, we will implement a classification model for each algorithm mentioned above. Then the performance of the model under study would be analyzed using the confusion matrix and the performance metrics. Finally, the training and testing results would be visualized graphically to better understand the working of classifiers.
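The confusion matrix and the metrics derived from it can be sketched for the “YES”/“NO” class attribute as follows. The actual and predicted labels below are hypothetical and do not come from the Purchase Alexa Dataset, and treating “YES” as the positive class is an assumption.

```python
def confusion_counts(actual, predicted, positive="YES"):
    """Return (tp, fp, fn, tn) for a binary classification result."""
    tp = fp = fn = tn = 0
    for a, p in zip(actual, predicted):
        if p == positive and a == positive:
            tp += 1
        elif p == positive:
            fp += 1
        elif a == positive:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


# Hypothetical test-set labels and classifier outputs.
actual = ["YES", "NO", "YES", "NO", "YES", "NO"]
predicted = ["YES", "NO", "NO", "NO", "YES", "YES"]

tp, fp, fn, tn = confusion_counts(actual, predicted)
accuracy = (tp + tn) / (tp + fp + fn + tn)
precision = tp / (tp + fp)   # of the predicted YES, how many were right
recall = tp / (tp + fn)      # of the actual YES, how many were found
print(tp, fp, fn, tn)
print(round(accuracy, 3), round(precision, 3), round(recall, 3))
```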
A stepwise approach to implementing a classification model in Python is as follows:
Step 1: Importing libraries—Importing of libraries is required to perform various functions and steps related to data pre-processing and building the classifiers.
Step 2: Loading dataset—Download and load the desired dataset to Spyder.
Step 3: Splitting the dataset into training and testing datasets—Splitting the loaded dataset into training and testing subsets.
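Steps 1 to 3 above can be sketched as follows. This is a minimal illustration under stated assumptions, not the chapter's own code: it uses only NumPy, replaces the actual dataset file with synthetic data, and the column names (`age`, `salary`), the labeling rule, and the helper `split_train_test` are all hypothetical.

```python
# Step 1: import the library used in this sketch.
import numpy as np

# Step 2 (stand-in): build a small synthetic dataset instead of loading a file.
# The columns "age" and "salary" and the labeling rule are assumptions.
rng = np.random.default_rng(seed=0)
n_rows = 40
age = rng.integers(18, 60, size=n_rows)
salary = rng.integers(20_000, 150_000, size=n_rows)
X = np.column_stack([age, salary]).astype(float)
y = np.where(salary > 80_000, "YES", "NO")


# Step 3: split the rows into training and testing subsets.
def split_train_test(X, y, test_fraction=0.25, seed=0):
    """Shuffle row indices and hold out a fraction of the rows for testing."""
    order = np.random.default_rng(seed).permutation(len(X))
    n_test = int(round(len(X) * test_fraction))
    test_idx, train_idx = order[:n_test], order[n_test:]
    return X[train_idx], X[test_idx], y[train_idx], y[test_idx]


X_train, X_test, y_train, y_test = split_train_test(X, y)
print(X_train.shape, X_test.shape)
```

In practice, a library routine such as scikit-learn's `train_test_split` does the same job. The hand-written helper is used here only to keep the sketch self-contained.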

Contents
Parteek Bhatia, Thapar University, India
Book:

Machine Learning with Python

Published online:

22 February 2025

Print publication:

31 May 2025, pp vii-xxii
- Chapter

6 - Implementing Simple Linear Regression
Parteek Bhatia, Thapar University, India
Book:

Machine Learning with Python

Published online:

22 February 2025

Print publication:

31 May 2025, pp 241-266
- Chapter
Summary

Chapter Objectives
• To import the necessary libraries and load the dataset.
• To split the dataset into training and testing datasets.
• To build the simple linear model and make predictions.
• To visualize the training set and testing set results.
• To calculate mean absolute error, mean squared error, and root mean squared error.
In this chapter, we are going to implement simple linear regression in Python. To do so, we will analyze how the stipend of a researcher is related to their years of research experience. Our aim is to predict the stipend of a researcher based on their research experience.
Problem Statement and Dataset
To perform this task, we will consider a dataset consisting of two attributes: ResearchExperience and Stipend. The dataset has 30 observations from which to draw the correlation between research experience and the corresponding stipend. A research institute aims to find this correlation. It will help the management offer an appropriate stipend to new research scholars based on their years of research experience, rather than deciding arbitrarily. The natural expectation is that the stipend rises with research experience: the higher the experience, the higher the stipend. We will use a simple linear regression model to solve this problem.
Let us quickly refresh our concepts of a simple linear regression model. We know that simple linear regression fits the best straight line to describe the relationship between research experience and stipend. Though the dataset is quite simple, it has great business value, as the resulting model will help the institute predict the stipend of a researcher based on their experience. Using this model, the stipend of new researchers can be predicted easily, which also promotes transparency in the management. Here, ResearchExperience (the independent variable) will be our X and act as the horizontal axis, whereas Stipend (the dependent variable to be predicted) will be Y and act as the vertical axis, as shown in Figure 6.1.
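The workflow described above (fit a straight line, predict on held-out data, then compute the error measures) can be sketched as below. This is a minimal sketch under stated assumptions: the 30 observations are synthetic, the generating coefficients are made up, and the fit uses NumPy's `polyfit` rather than whichever library the chapter itself uses.

```python
import numpy as np

# Synthetic stand-in for the 30-row dataset (values are assumptions).
rng = np.random.default_rng(seed=1)
experience = rng.uniform(1, 10, size=30)  # X: years of research experience
stipend = 20_000 + 3_000 * experience + rng.normal(0, 1_500, size=30)  # Y

# Hold out the last 10 observations for testing.
X_train, X_test = experience[:20], experience[20:]
y_train, y_test = stipend[:20], stipend[20:]

# Fit the best straight line Y = intercept + slope * X by least squares.
slope, intercept = np.polyfit(X_train, y_train, deg=1)

# Predict the stipend for the test observations.
y_pred = intercept + slope * X_test

# Error measures named in the chapter objectives.
errors = y_test - y_pred
mae = np.mean(np.abs(errors))  # mean absolute error
mse = np.mean(errors ** 2)     # mean squared error
rmse = np.sqrt(mse)            # root mean squared error

print(f"slope={slope:.1f}, intercept={intercept:.1f}")
print(f"MAE={mae:.1f}, MSE={mse:.1f}, RMSE={rmse:.1f}")
```

RMSE is expressed in the same units as the stipend, which usually makes it easier to interpret than MSE.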

Foreword
- By Irad Ben-Gal
Parteek Bhatia, Thapar University, India
Book:

Machine Learning with Python

Published online:

22 February 2025

Print publication:

31 May 2025, pp liii-liv
- Chapter
Summary

In the ever-evolving landscape of technology, machine learning (ML) has emerged as a transformative force, reshaping industries and empowering innovations across the globe. Its profound impact on diverse fields, from healthcare and finance to autonomous vehicles and natural language processing, makes it an indispensable skill for the modern-day learner and practitioner.
Machine Learning with Python: Principles and Practical Techniques is a comprehensive and timely book that navigates the intricacies of this fascinating field. Written by Dr Parteek Bhatia, an expert in the domain, this book aims to provide a solid foundation for understanding the principles of ML and to equip readers with the practical techniques to tackle real-world challenges.
As we embark on this enlightening journey, the author starts by laying a solid groundwork, demystifying the fundamental concepts of ML. He skilfully explains the various learning paradigms, such as supervised, unsupervised, and reinforcement learning, and elucidates the key algorithms underpinning each. From linear regression and decision trees to support vector machines and deep neural networks, the book offers a comprehensive exposition that caters to readers with varying levels of expertise.
A distinguishing feature of this book is its strong emphasis on Python, a versatile, open-source, and widely used programming language in the realm of ML. By utilizing Python, readers are empowered with a practical and accessible toolset, facilitating the implementation of, and experimentation with, the various algorithms discussed in the text. The author provides hands-on examples and code snippets, fostering a learning-by-doing approach that ensures readers can confidently harness the power of ML in their projects.
Whether you are a seasoned data scientist seeking to enhance your skillset or a curious beginner eager to dive into the world of ML, this book offers something for everyone. Its clear and concise presentation, enriched with illustrative examples and practical exercises, makes it an invaluable resource for self-study or classroom instruction.
Machine Learning with Python: Principles and Practical Techniques is a must-have addition to the library of any aspiring or seasoned ML practitioner. Its comprehensive coverage of key concepts, coupled with the practical implementation guidance using Python, makes it a timeless reference in the fast-paced landscape of ML.

8 - Implementation of Multiple Linear Regression and Polynomial Linear Regression
Parteek Bhatia, Thapar University, India
Book:

Machine Learning with Python

Published online:

22 February 2025

Print publication:

31 May 2025, pp 313-358
- Chapter
Summary

Chapter Objectives
• To conduct a case study on predicting a company's profits using multiple linear regression.
• To learn about the process of importing libraries and the dataset.
• To perform encoding of the dataset.
• To divide the dataset into training and testing datasets.
• To fit multiple linear regression to the training set.
• To predict the class attribute for the test dataset.
• To perform model evaluation.
• To implement a case study of rank salary to build a polynomial linear regression.
• To compare the performance of polynomial linear regression and simple linear regression.
• To implement a case study of a chemistry lab experiment to build a polynomial linear regression.
8.1 Introduction to Multiple Linear Regression
In the previous chapter, we covered the concepts of multiple linear regression and polynomial linear regression. There, we defined regression as the prediction of continuous values; in multiple linear regression, there are multiple independent variables and a single dependent variable. If Y is a dependent variable whose value depends on the independent variables X1, X2, X3, …, Xn, then mathematically, multiple linear regression is represented by Equation (8.1).
$$Y = b_0 + b_1 X_1 + b_2 X_2 + \cdots + b_n X_n \tag{8.1}$$
where b0 is the constant (intercept) and b1, b2, …, bn are the coefficients. Each coefficient represents the change in the dependent variable per unit change in the corresponding independent variable, with the other variables held fixed. Let us move ahead and implement a multiple linear regression model in Python.
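To make the roles of the constant b0 and the coefficients b1, …, bn concrete, here is a minimal sketch that recovers them by ordinary least squares with NumPy. The data is synthetic and noise-free, and the "true" coefficient values are assumptions chosen for illustration. The chapter's own implementation may use a different library.

```python
import numpy as np

# Synthetic, noise-free data with three independent variables (assumed values).
rng = np.random.default_rng(seed=2)
X = rng.uniform(0, 100, size=(50, 3))
true_b0 = 5.0
true_b = np.array([0.5, -1.2, 2.0])
Y = true_b0 + X @ true_b  # Y = b0 + b1*X1 + b2*X2 + b3*X3

# Prepend a column of ones so the constant b0 is estimated with the coefficients.
design = np.column_stack([np.ones(len(X)), X])

# Solve the least-squares problem design @ coef ≈ Y.
coef, *_ = np.linalg.lstsq(design, Y, rcond=None)
b0, b = coef[0], coef[1:]

print("b0:", round(float(b0), 4))
print("b1..b3:", np.round(b, 4))
```

Because the data contains no noise, the estimated values match the ones used to generate it. With real data they would only approximate the underlying relationship.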
8.1.1 Understanding Dataset and Problem Statement
Let us consider the dataset of 50 advertisement agencies in India. A few instances of this dataset are shown in Table 8.1. The goal is to predict the agency's profit based on four parameters: Print Media Expenses, Social Media Expenses, Outdoor AD Expenses, and City or Location. All the figures in this dataset are in INR (Indian rupees).
The dataset used in this chapter is derived from the SuperDataScience dataset by Kirill Eremenko and his team. A big thanks to SuperDataScience for granting us permission to use and reference the dataset in this book. You can find it at https://www.superdatascience.com/pages/machine-learning.
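Since City or Location is a categorical attribute, it has to be converted to numbers before a linear model can use it. The following is a minimal sketch of one common option, one-hot (dummy) encoding, using NumPy only. The city names are hypothetical placeholders rather than values from the dataset, and dropping one dummy column is a general precaution against redundant columns, not a step confirmed by the text above.

```python
import numpy as np

# Hypothetical city values; the actual dataset's categories may differ.
cities = np.array(["Delhi", "Mumbai", "Chennai", "Mumbai", "Delhi"])

# Map each distinct city to an integer code (categories are sorted alphabetically).
categories, codes = np.unique(cities, return_inverse=True)

# One row per observation, one column per category, with a single 1 per row.
one_hot = np.eye(len(categories))[codes]

# Drop the first dummy column: with an intercept in the model, the full set of
# dummy columns would be linearly dependent (they always sum to 1).
dummies = one_hot[:, 1:]

print(list(categories))
print(dummies)
```

The dropped category (here the alphabetically first one) becomes the baseline against which the remaining city coefficients are interpreted.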

List of Tables
Parteek Bhatia, Thapar University, India
Book:

Machine Learning with Python

Published online:

22 February 2025

Print publication:

31 May 2025, pp xlvii-lii
- Chapter

Foreword
- By Ravi Shankar
Parteek Bhatia, Thapar University, India
Book:

Machine Learning with Python

Published online:

22 February 2025

Print publication:

31 May 2025, pp lv-lvi
- Chapter
Summary

Machine learning (ML) is one of the fastest-emerging knowledge areas transforming the world. It is a dynamic and transformative field that has the potential to reshape the way we interact with technology and the world around us. In the book Machine Learning with Python: Principles and Practical Techniques, author Parteek Bhatia offers an insightful and hands-on approach to demystifying this complex subject.
In a world where ML plays an increasingly central role in our lives, understanding its principles and practical applications is essential. In this book, numerous algorithms are explained in detail, assuming no prior knowledge on the reader's part. All these algorithms are widely used in practice and in research. This book is an invaluable resource for beginners and those looking to deepen their knowledge in the field.
Parteek Bhatia takes the reader on an engaging journey, starting from the basics and gradually building up to more advanced concepts. What sets this book apart is its focus on practicality. The book covers various ML techniques, from data pre-processing and regression to classification, clustering, and association mining. Each concept is illuminated with detailed Python implementations, allowing you to see firsthand how these algorithms work and how they can be applied to real-world problems. It also delves into more advanced topics like artificial neural networks, deep learning, convolutional neural networks, recurrent neural networks, and genetic algorithms. This comprehensive approach equips you with the tools and knowledge to tackle complex ML challenges.
Parteek Bhatia's passion for the subject and dedication to making it accessible shine through every chapter. This book is not just a collection of information; it is a learning adventure. It takes the reader from a beginner to a confident practitioner, ready to take on the exciting and ever-evolving world of ML.
Whether you are a student, a professional, or simply someone eager to explore the possibilities of ML, this book will be your trusted guide. Machine Learning with Python is your gateway to unlocking the potential of this fascinating field.
I commend Parteek Bhatia for his commitment to creating this educational masterpiece. As you embark on your journey through the captivating realm of ML, I encourage you to embrace the concepts, put them into practice, and let your curiosity and creativity flourish.

209705 results in Engineering
