Question
SMS Spam Classification: Detecting Unwanted Messages
Life Cycle of the Project
Steps to be Performed
Introduction
Problem Statement
Data Checks to Perform
Data Cleaning
EDA
Text Preprocessing
Model Training
Evaluation
Conclusion
Author Message
Introduction
This Kaggle notebook presents a step-by-step guide to building an SMS spam classification model using the SMS Spam Collection dataset. By the end of this notebook, you'll have a working classifier that helps filter out unwanted messages, making your text messaging experience smoother and safer.
Problem Statement
The primary goal of this notebook is to develop a predictive model that accurately classifies incoming SMS messages as either ham (legitimate) or spam. We will use the SMS Spam Collection dataset, which consists of SMS messages tagged with their respective labels.
Data Checks to Perform
Import Necessary Libraries
# Importing necessary libraries
import numpy as np # For numerical operations
import pandas as pd # For data manipulation and analysis
import matplotlib.pyplot as plt # For data visualization
%matplotlib inline
# Importing WordCloud for text visualization
from wordcloud import WordCloud
# Importing NLTK for natural language processing
import nltk
from nltk.corpus import stopwords # For stopwords
# Downloading NLTK data
nltk.download('stopwords') # Downloading stopwords data
nltk.download('punkt') # Downloading tokenizer data
/opt/conda/lib/python.../site-packages/scipy/__init__.py: UserWarning: A NumPy version >=... and <... is required for this version of SciPy (detected version ...)
  warnings.warn(f"A NumPy version >={np_minversion} and <{np_maxversion}"
[nltk_data] Downloading package stopwords to /usr/share/nltk_data...
[nltk_data]   Package stopwords is already up-to-date!
[nltk_data] Downloading package punkt to /usr/share/nltk_data...
[nltk_data]   Package punkt is already up-to-date!
True
Load the Data
df = pd.read_csv('/kaggle/input/sms-spam-collection-dataset/spam.csv', encoding='latin-1')
styled_df = df.head()
styled_df = styled_df.style.set_table_styles([
    {"selector": "th", "props": [("color", "black"), ("background-color", "#FFCC")]}
])
styled_df
   v1    v2  Unnamed: 2  Unnamed: 3  Unnamed: 4
0  ham   Go until jurong point, crazy.. Available only in bugis n great world la e buffet... Cine there got amore wat...  nan  nan  nan
1  ham   Ok lar... Joking wif u oni...  nan  nan  nan
2  spam  Free entry in a wkly comp to win FA Cup final tkts st May Text FA to to receive entry questionstd txt rateT&Cs apply overs  nan  nan  nan
3  ham   U dun say so early hor... U c already then say...  nan  nan  nan
4  ham   Nah I don't think he goes to usf, he lives around here though  nan  nan  nan
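The notebook does not say why an explicit encoding is passed when the CSV is loaded. A likely reason (an assumption, not something the notebook confirms) is that the file contains bytes that are not valid UTF-8. The sketch below illustrates the general problem with a made-up message rather than the real file; `try_decode` is a hypothetical helper written only for this illustration.

```python
from typing import Optional

# Hypothetical message, not taken from the dataset. Encoded as Latin-1,
# the pound sign becomes the single byte 0xA3, which is not valid UTF-8 on its own.
raw = "Win £1000 cash".encode("latin-1")


def try_decode(data: bytes, encoding: str) -> Optional[str]:
    """Return the decoded text, or None if the bytes are invalid in that encoding."""
    try:
        return data.decode(encoding)
    except UnicodeDecodeError:
        return None


utf8_result = try_decode(raw, "utf-8")      # fails: 0xA3 cannot start a UTF-8 sequence
latin1_result = try_decode(raw, "latin-1")  # succeeds: Latin-1 maps every byte to a character

print(utf8_result)
print(latin1_result)
```

Because Latin-1 assigns a character to every possible byte value, decoding with it never raises an error, although it may show the wrong characters if the file was actually written in a different encoding.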
Data Cleaning
Data Info
df.info()
RangeIndex: ... entries, 0 to ...
Data columns (total 5 columns):
 #   Column      Non-Null Count  Dtype
 0   v1          ... non-null    object
 1   v2          ... non-null    object
 2   Unnamed: 2  ... non-null    object
 3   Unnamed: 3  ... non-null    object
 4   Unnamed: 4  ... non-null    object
dtypes: object(5)
memory usage: ... KB
Drop the Columns
df.drop(columns=['Unnamed: 2', 'Unnamed: 3', 'Unnamed: 4'], inplace=True)
styled_df = df.head().style
# Modify the color and background color of the table headers (th)
styled_df.set_table_styles([
    {"selector": "th", "props": [("color", "Black"), ("background-color", "#FFCC"), ("font-weight", "bold")]}
])
   v1    v2
0  ham   Go until jurong point, crazy.. Available only in bugis n great world la e buffet... Cine there got amore wat...
1  ham   Ok lar... Joking wif u oni...
2  spam  Free entry in a wkly comp to win FA Cup final tkts st May Text FA to to receive entry questionstd txt rateT&Cs apply overs
3  ham   U dun say so early hor... U c already then say...
4  ham   Nah I don't think he goes to usf, he lives around here though
Rename the Column
# Rename the columns
df.rename(columns={'v1': 'target', 'v2': 'text'}, inplace=True)
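The drop and rename steps above can be tried on a small stand-in table. This is a minimal sketch, assuming the raw columns are named `v1`, `v2`, `Unnamed: 2`, `Unnamed: 3` and `Unnamed: 4`; the three rows in `toy` are invented for illustration and are not the real data.

```python
import pandas as pd

# Hypothetical three-row stand-in for the loaded CSV (not the real data).
toy = pd.DataFrame({
    "v1": ["ham", "spam", "ham"],
    "v2": ["Ok lar...", "Free entry...", "See you soon"],
    "Unnamed: 2": [None, None, None],
    "Unnamed: 3": [None, None, None],
    "Unnamed: 4": [None, None, None],
})

# Count missing values per column before deciding what to drop.
missing = toy.isna().sum()

# Non-mutating equivalents of the notebook's inplace=True calls:
# drop the empty columns, then give the remaining two descriptive names.
cleaned = (
    toy.drop(columns=["Unnamed: 2", "Unnamed: 3", "Unnamed: 4"])
       .rename(columns={"v1": "target", "v2": "text"})
)

print(missing.to_dict())
print(list(cleaned.columns))
```

Chaining calls that return a new DataFrame leaves `toy` untouched, which makes it easier to re-run a cell without hitting a "column not found" error on the second run. The notebook's `inplace=True` version modifies `df` directly instead.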
Convert the target variable
from sklearn.preprocessing import LabelEncoder
encoder = LabelEncoder()
df['target'] = encoder.fit_transform(df['target'])
styled_df = df.head().style
# Modify the color and background color of the table headers (th)
styled_df.set_table_styles([
    {"selector": "th", "props": [("color", "Black"), ("background-color", "#FFCC"), ("font-weight", "bold")]}
])
   target  text
0  0       Go until jurong point, crazy.. Available only in bugis n great world la e buffet... Cine there got amore wat...
1  0       Ok lar... Joking wif u oni...
2  1       Free entry in a wkly comp to win FA Cup final t
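To see what the label-encoding step does to the target column, here is a minimal pure-Python sketch of the same idea. It is not scikit-learn's implementation; `label_encode` is a hypothetical helper that mirrors how `LabelEncoder` numbers classes in sorted order, so `ham` becomes 0 and `spam` becomes 1.

```python
def label_encode(labels):
    """Map each distinct label to an integer, numbering classes in sorted order."""
    classes = sorted(set(labels))
    index = {label: i for i, label in enumerate(classes)}
    return classes, [index[label] for label in labels]


# Labels of the five rows shown in the notebook's preview table.
classes, encoded = label_encode(["ham", "ham", "spam", "ham", "ham"])

print(classes)
print(encoded)
```

With the real encoder, the fitted `encoder.classes_` attribute holds the same sorted class list, which is a quick way to confirm which integer stands for spam.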
Can you explain this code?