Answered step by step

Verified Expert Solution

Link Copied!

Question

1 Approved Answer

Posted on Sep 10, 2024

hello, I need help , when i try to call my pipline i get an error: from pyspark.ml . feature import VectorAssembler, OneHotEncoder, StringIndexer from

hello, I need help $,$ when i try to call my pipline i get an error:

from pyspark.ml $.$ feature import VectorAssembler, OneHotEncoder, StringIndexer

from pyspark.ml $.$ evaluation import MulticlassClassificationEvaluator

from pyspark.ml $.$ classification import LogisticRegression, DecisionTreeClassifier

from pyspark.ml import Pipeline

# create lists of feature names

num $_$ features $= ["$ age $",$ "hypertension","heart $_$ disease", "avg $_$ glucose $_$ level", "bmi" $]$

cat $_$ features $= [$ col for col in stroke $_$ df $.$ columns if col not in num $_$ features $+ ["$ stroke $"]]$

# create lists of integer $-$ encoded and one $-$ hot encoded features

ix $_$ features $= [$ col $+ "_$ ix $"$ for col in cat $_$ features $]$

vec $_$ features $= [$ col $+ "_$ vec" for col in cat $_$ features $]$

# create StringIndexer, OneHotEncoder, and VectorAssembler objects

indexer $=$ StringIndexer $($ inputCols $=$ cat $_$ features, outputCols $=$ ix $_$ features, handleInvalid $=$ "skip" $)$

encoder $=$ OneHotEncoder $($ inputCols $=$ ix $_$ features, outputCols $=$ vec $_$ features $)$

assembler $=$ VectorAssembler $($ inputCols $=$ num $_$ features $+$ vec $_$ features, outputCol $=$ "features" $)$

# create pipeline

pipeline $=$ Pipeline $($ stages $= [$ indexer $,$ encoder, assembler $])$

# fit pipeline to data

train $=$ pipeline.fit $($ stroke $_$ df $) .$ transform $($ stroke $_$ df $)$

# persist train DataFrame

train.persist $()$

# display first $10$ rows of features and stroke columns of train

train.select $("$ features $",$ "stroke" $) .$ show $(10,$ truncate $=$ False $)$

train $=$ train.withColumn $("$ label $",$ train $["$ stroke $"])$ Applying the Model to New Data : data $= {$

"gender": $["$ Female $",$ "Female", "Male", "Male" $],$

"age": $[42.0, 64.0, 37.0, 72.0],$

"hypertension": $[1, 1, 0, 0],$

"heart $_$ disease": $[0, 1, 0, 1],$

"ever $_$ married": $["$ No $",$ "Yes", "Yes", $"$ No $"],$

"work $_$ type": $["$ Private $",$ "Self $-$ employed", "Private", "Govt $_$ job" $],$

"Residence $_$ type": $["$ Urban $",$ "Rural", "Rural", "Urban" $],$

"avg $_$ glucose $_$ level": $[182.1, 175.5, 79.2, 125.7],$

"bmi": $[26.5, 32.5, 15.4, 19.4],$

"smoking $_$ status": $["$ smokes $",$ "formerly smoked", "unknown", "never smoked" $]$

$}$

new $_$ data $=$ pd $.$ DataFrame $($ data $)$

new $_$ data

processed $_$ new $_$ data $=$ pipeline.fit $($ new $_$ data $) .$ transform $($ new $_$ data $)$ : AttributeError: 'DataFrame' object has no attribute $'_$ jdf $'$

~ $/ .$ ipykernel $/ 16918 /$ command $- 472670917524769 - 290429957$ in $? ()$

$- - - - > 1$ processed $_$ new $_$ data $=$ pipeline.fit $($ new $_$ data $) .$ transform $($ new $_$ data $)$

$/$ databricks $/$ python $/$ lib $/$ python $3.10 /$ site $-$ packages $/$ pandas $/$ core $/$ generic $.$ py in $? ($ self $,$ name $)$

$5898$ and name not in self. $_$ accessors

$5899$ and self. $_$ info $_$ axis. $_$ can $_$ hold $_$ identifiers $_$ and $_$ holds $_$ name $($ name $)$

$5900)$ :

$5901$ return self $[$ name $]$

$- > 5902$ return object. $$ getattribute $($ self $,$ name $)$

Step by Step Solution

There are 3 Steps involved in it

Step: 1

Get Instant Access with AI-Powered Solutions

See step-by-step solutions with expert insights and AI powered tools for academic success

Step: 2

Step: 3

Ace Your Homework with AI

Get the answers you need in no time with our AI-driven, step-by-step assistance

Get Started

Students also viewed these Databases questions

Question

Explain theories of moral development including those of Kohlberg, Gilligan, Nucci, and Haidt, and discuss how teachers can deal with one moral challenge for studentscheating.

Answered: 1 week ago

Question

★★★★★

The ABC Partnership is to be liquidated and you have been hired to prepare a Schedule of Cash Payments for the partnership. Partners Andie, Becka, and Candice share income and losses in the ratio of...

Answered: 1 week ago

Question

★★★★★

Daley Company prepared the following aging of receivables analysis at December 31. accounts receivable $665,000 total $415,000 - zero 1 to 30- $109,000 31 to 60- $55,000 61 to 90- $37,000 over to 90-...

Answered: 1 week ago

Question

★★★★★

A model of sheared fluid flow flow rates are increased, most flows undergo a transition from 'laminar' flow ooth flow parallel to the boundaries ) to 'turbulent '(chaotic ) flow . figure 16 (a ) of...

Answered: 1 week ago

Question

★★★★★

state the current situation of Learnenow inc and describe the team's goals and the desired outcome should be?

Answered: 1 week ago

Question

★★★★★

A collector found a bone near a former Mayan territory. After careful research it has been determined that the bone lost about 8% of its Carbon-14. Approximately how old is the bone? Could it have...

Answered: 1 week ago

Question

★★★★★

Sweet Memories Inc (for Q11-13) Sweet Memories Inc is a company that produces Customized Family Portraits. Preparing a Family Portrait involves 3-steps: Collect the Images and Stories, Edit, and...

Answered: 1 week ago

Question

★★★★★

5. An outside manufacturer has offered to produce 86,000 Daks and ship them directly to Andretti's customers. If Andretti Company accepts this offer, the facilities that it uses to produce Daks would...

Answered: 1 week ago

Question

★★★★★

Howard is in the 33% tax bracket, so his after-tax profit is 67% of his before-tax profit. He invests $10,000 in a certain stock, whose annual return is normally distributed with mean 5% and standard...

Answered: 1 week ago

Previous Question Next Question