
Question


Target: Split the train dataset into two datasets for validation.

Main Points:
1. You need to split the 4-week train dataset into three datasets for validation: 3 weeks for valid_train, one week for valid_test, and some of the aids truncated from valid_test as valid_test_label.
2. As with the original datasets, valid_train and valid_test have entirely different sessions, while valid_test and valid_test_label have exactly the same sessions.
3. All the aids in the test and test_label datasets should be included in the train dataset, so you need to ensure that all the aids in valid_test and valid_test_label are in valid_train. (Delete the aids in valid_test and valid_test_label that are not in valid_train.)
4. Tips: You can split valid_train and valid_test by timestamp and truncate the event list for each session in valid_test randomly to achieve your target.
5. The timestamp to split valid_train and valid_test is 1661119200.
"""

# Add your code here
timestamp_to_split = 1661119200
valid_train = train.filter(train['timestamp'] < timestamp_to_split)
valid_test = train.filter(train['timestamp'] >= timestamp_to_split)
valid_test = valid_test.width_column(
    'aids',
    valid_test['aids'].apply(lambda aids: aids.sample(fractions=0.8) if len(aids) > 1 else aids)
)

valid_test_label = valid_test.width_column(
    'aids',
    valid_test['aids'].apply(lambda aids: aids[:1] if len(aids) > 1 else aids)
)
valid_train_aids = valid_train['aids'].flatten().unique()
valid_test = valid_test.filter(
    valid_test['aids'].apply(lambda aids: all(aid in valid_train_aids for aid in aids))
)

return valid_train, valid_test, valid_test_label
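The five main points in the question can also be sketched in plain Python, independent of any dataframe library. This is only an illustration under stated assumptions, not the expected solution: the input layout (an iterable of `(session, aid, ts)` tuples), the function name `split_for_validation`, and the choice to assign each session by its first event are all hypothetical. Sessions that end up with fewer than two known aids are dropped so that valid_test and valid_test_label keep exactly the same sessions.

```python
import random
from collections import defaultdict

SPLIT_TS = 1661119200  # split timestamp given in the task


def split_for_validation(events, split_ts=SPLIT_TS, seed=0):
    """Split events into valid_train, valid_test and valid_test_label.

    `events` is assumed to be an iterable of (session, aid, ts) tuples.
    Each result maps a session id to its list of aids in time order.
    """
    rng = random.Random(seed)

    # Group events by session so each session can be assigned as a whole.
    by_session = defaultdict(list)
    for session, aid, ts in events:
        by_session[session].append((ts, aid))

    valid_train, held_out = {}, {}
    for session, rows in by_session.items():
        rows.sort()
        if rows[0][0] < split_ts:
            # Session starts before the split: keep only its pre-split events.
            valid_train[session] = [aid for ts, aid in rows if ts < split_ts]
        else:
            # Session starts at or after the split: hold it out for validation.
            held_out[session] = [aid for _, aid in rows]

    # Point 3: every aid in the validation sets must appear in valid_train.
    train_aids = {aid for aids in valid_train.values() for aid in aids}

    valid_test, valid_test_label = {}, {}
    for session, aids in held_out.items():
        aids = [aid for aid in aids if aid in train_aids]
        if len(aids) < 2:
            continue  # need at least one visible aid and one label aid
        # Point 4: truncate at a random position; the tail becomes the label.
        cut = rng.randint(1, len(aids) - 1)
        valid_test[session] = aids[:cut]
        valid_test_label[session] = aids[cut:]

    return valid_train, valid_test, valid_test_label
```

Assigning whole sessions by their start time is one way to guarantee that valid_train and valid_test share no sessions; a row-level filter on the timestamp alone would let a session that spans the split appear in both.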

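# word2vec model
import os
from gensim.models import Word2Vec  # assumed: Word2Vec here is gensim's class

def W2V(sentences, mode):
    # Pick the model directory for the requested mode
    if mode == 'test':
        model_path = './model/test/'
    elif mode == 'valid':
        model_path = './model/valid/'
    else:
        raise Exception('Wrong mode')

    # Reuse a saved model if one exists; otherwise train a new one
    files = os.listdir(model_path)
    if 'word2vec.model' in files:
        w2v = Word2Vec.load(model_path + 'word2vec.model')
        print('Word2Vec Loaded ')
    else:
        print('Word2Vec Start Training ')
"""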
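The snippet above stops at the start of the training branch. A minimal sketch of how a load-or-train pattern might be completed with gensim is shown here. The helper names `model_dir_for` and `load_or_train_w2v`, the hyperparameter values, and the conversion of aids to strings are assumptions for illustration and do not come from the original code.

```python
import os

MODEL_FILE = "word2vec.model"  # file name used in the original snippet


def model_dir_for(mode):
    """Map a mode to its model directory, mirroring the original W2V function."""
    dirs = {"test": "./model/test/", "valid": "./model/valid/"}
    if mode not in dirs:
        raise ValueError(f"Wrong mode: {mode!r}")
    return dirs[mode]


def load_or_train_w2v(sentences, mode):
    """Load a saved Word2Vec model for `mode`, or train and save a new one."""
    # Imported lazily so the path helper works without gensim installed.
    from gensim.models import Word2Vec

    model_dir = model_dir_for(mode)
    os.makedirs(model_dir, exist_ok=True)
    path = os.path.join(model_dir, MODEL_FILE)

    if os.path.exists(path):
        return Word2Vec.load(path)

    # Each session's aid list is treated as one "sentence" of string tokens.
    corpus = [[str(aid) for aid in session] for session in sentences]
    # Hyperparameters below are placeholders, not values from the source.
    model = Word2Vec(sentences=corpus, vector_size=32, window=5,
                     min_count=1, workers=4)
    model.save(path)
    return model
```

Calling `load_or_train_w2v(sessions, "valid")` a second time would then load the saved file instead of retraining, which is the behaviour the original `if 'word2vec.model' in files` check appears to aim for.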
