Question
Target: Split the train dataset into two datasets for validation.
Main Points:
1. You need to split the 4-week train dataset into three datasets for validation: 3 weeks for valid_train, one week for valid_test, and some of the aids truncated from valid_test as valid_test_label.
2. As with the original datasets, valid_train and valid_test have completely different sessions, while valid_test and valid_test_label have exactly the same sessions.
3. All the aids in the test and test_label datasets should be included in the train dataset, so you need to ensure that all the aids in valid_test and valid_test_label are also in valid_train. (Delete the aids in valid_test and valid_test_label that are not in valid_train.)
4. Tip: You can split valid_train and valid_test by timestamp, then randomly truncate the event list of each session in valid_test.
5. The timestamp used to split valid_train and valid_test is 1661119200.
"""
# Add your code here
timestamp_to_split = 1661119200
valid_train = train.filter(train['timestamp'] < timestamp_to_split)
valid_test = train.filter(train['timestamp'] >= timestamp_to_split)
valid_test = valid_test.width_column(
    'aids',
    valid_test['aids'].apply(lambda aids: aids.sample(fractions=0.8) if len(aids) > 1 else aids)
)
valid_test_label = valid_test.width_column(
    'aids',
    valid_test['aids'].apply(lambda aids: aids[:1] if len(aids) > 1 else aids)
)
valid_train_aids = valid_train['aids'].flatten().unique()
valid_test = valid_test.filter(
    valid_test['aids'].apply(lambda aids: all(aid in valid_train_aids for aid in aids))
)
return valid_train, valid_test, valid_test_label
# word2vec model
def W2V(sentences, mode):
    if mode == 'test':
        model_path = './model/test/'
    elif mode == 'valid':
        model_path = './model/valid/'
    else:
        raise Exception('Wrong mode')
    files = os.listdir(model_path)
    if 'word2vec.model' in files:
        w2v = Word2Vec.load(model_path + 'word2vec.model')
        print('Word2Vec Loaded ')
    else:
        print('Word2Vec Start Training ')
        """
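The five main points of the splitting task can be sketched in plain Python as follows. This is only a minimal illustration under stated assumptions, not the expected solution: the question does not show how `train` is stored, so the sketch assumes a hypothetical layout of `{session_id: [(aid, ts), ...]}` with events sorted by timestamp. The function name `split_for_validation` and the `seed` parameter are likewise invented for the example.

```python
import random

SPLIT_TS = 1661119200  # split timestamp given in the question


def split_for_validation(train, split_ts=SPLIT_TS, seed=0):
    """Return (valid_train, valid_test, valid_test_label).

    Assumption (hypothetical layout): `train` maps each session id to a
    list of (aid, ts) tuples sorted by ts.
    """
    rng = random.Random(seed)

    # Assign each whole session to one side by its first event, so that
    # valid_train and the test-side datasets share no session ids.
    valid_train, later = {}, {}
    for session, events in train.items():
        if not events:
            continue
        if events[0][1] < split_ts:
            # Keep only events before the split in valid_train.
            valid_train[session] = [e for e in events if e[1] < split_ts]
        else:
            later[session] = events

    # Only aids seen in valid_train may appear on the test side.
    known_aids = {aid for events in valid_train.values() for aid, _ in events}

    valid_test, valid_test_label = {}, {}
    for session, events in later.items():
        events = [e for e in events if e[0] in known_aids]
        if len(events) < 2:
            continue  # cannot cut into a non-empty input and a non-empty label
        cut = rng.randint(1, len(events) - 1)  # random truncation point
        valid_test[session] = events[:cut]
        valid_test_label[session] = events[cut:]
    return valid_train, valid_test, valid_test_label
```

Because the cut is applied once per session and both halves are stored under the same key, valid_test and valid_test_label end up with identical session ids. Dropping sessions with fewer than two remaining events is a design choice of this sketch; the question does not say how such sessions should be handled.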