Answered step by step
Verified Expert Solution
Link Copied!

Question

1 Approved Answer

Missing feature values need to be addressed prior to the model development phase of the CRISP - DM methodology to avoid training on incomplete data.

Missing feature values need to be addressed prior to the model development phase of the CRISP-DM methodology to avoid training on incomplete data. This task assesses your ability to navigate the complexities of constructing a data pipeline to transform partially erroneous raw data to knowledge and evaluate the effectiveness of the proposed solution. The task will require you to identify a suitable publicly available dataset/s, for which there is previous research that addresses the missing feature value problem. Propose an approach to use the Nave Bayes classifier to address the missing feature value problem in the context of categorical feature values. Implement this approach on a publicly available dataset/s and report its performance in the context of a classification problem. Compare the effectiveness of this approach against a baseline imputation approach that uses the mode value, on two different machine learning models. Justify and discuss your findings using appropriate metrics and relate your findings to previous research on data imputation using the same dataset/s. Present your findings in a written report and a video presentation. State and motivate any assumptions or scope adopted during the task. Based on this question....Please write a two page report using these this literature. Use in text citation and also pull references. Chronic kidney disease (CKD) is a major burden on the healthcare system because of its increasing prevalence, high risk of progression to end-stage renal disease, and poor morbidity and mortality prognosis. It is rapidly becoming a global health crisis. Unhealthy dietary habits and insufficient water consumption are significant contributors to this disease. Without kidneys, a person can only live for 18 days on average, requiring kidney transplantation and dialysis. It is critical to have reliable techniques at predicting CKD in its early stages. Machine learning (ML) techniques are excellent in predicting CKD. The current study offers a methodology for predicting CKD status using clinical data, which incorporates data preprocessing, a technique for managing missing values, data aggregation, and feature extraction. A number of physiological variables, as well as ML techniques such as logistic regression (LR), decision tree (DT) classification, and K-nearest neighbor (KNN), were used in this work to train three distinct models for reliable prediction. The LR classification method was found to be the most accurate in this role, with an accuracy of about 97 percent in this study. The dataset that was used in the creation of the technique was the CKD dataset, which was made available to the public. Compared to prior research, the accuracy rate of the models employed in this study is considerably greater, implying that they are more trustworthy than the models used in previous studies as well. A large number of model comparisons have shown their resilience, and the scheme may be inferred from the study's results.
Go to:
1. Introduction
Chronic kidney disease (CKD) is a major public health concern around the world, with negative outcomes such as renal failure, cardiovascular disease, and early death [1].According to a 2010 study by the Global Burden of Disease Study (GBDS), chronic kidney disease (CKD) was listed as the 18th leading cause of mortality worldwide, up from 27th in 1990[2]. Chronic kidney disease affects over 500 million people worldwide [3,4], with a disproportionately high burden in developing countries, particularly South Asia and sub-Saharan Africa [5]. According to a 2015 study, there were 110 million people with CKD in high-income nations (men 48.3 million, women 61.7 million), but 387.5 million in low- and middle-income countries [6].
Bangladesh is a densely populated developing country in Southeast Asia where chronic kidney disease is on the rise year after year. The overall population of CKD is estimated to be 14 percent in a global study of six areas, including Bangladesh [7]. Another study discovered a 26% prevalence of chronic kidney disease among urban Dhaka residents over 30 years old [8], while another researcher discovered a 13% prevalence of chronic kidney disease among urban Dhaka residents over 15 years old [9]. In 2013, a community-based prevalence study in Bangladesh revealed that one-third of rural residents were at risk of developing CKD, which was generally misdiagnosed at the time [10]. The observed variations in CKD prevalence between Bangladeshi groups, on the other hand, could be explained by a number of factors, including the cross-sectional research design with a small sample size, the study period, and the geographic distribution of urban and rural areas. According to one study, the prevalence of CKD varies by age group, gender, socioeconomic status, and geographic region [7]. Chronic kidney disease (CKD) patients are more prone to developing end-stage renal disease (ESRD), which demands expensive treatment methods like di

Step by Step Solution

There are 3 Steps involved in it

Step: 1

blur-text-image

Get Instant Access to Expert-Tailored Solutions

See step-by-step solutions with expert insights and AI powered tools for academic success

Step: 2

blur-text-image

Step: 3

blur-text-image

Ace Your Homework with AI

Get the answers you need in no time with our AI-driven, step-by-step assistance

Get Started

Recommended Textbook for

Essential SQLAlchemy Mapping Python To Databases

Authors: Myers, Jason Myers

2nd Edition

1491916567, 9781491916568

More Books

Students also viewed these Databases questions