
Question


I have this code, which I run in a PySpark notebook. How do I display the original log entries?

And what is regexp_extract doing?

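import re
from pyspark.sql.types import *
from pyspark.sql.functions import *

inputPath = "/databricks-datasets/sample_logs/"
df = sqlContext.read.text(inputPath)  # one row per log line, in a column named "value"

# Column 1: the bracketed timestamp, extracted by regex and parsed to a timestamp.
# Column 2: the 9th space-separated field of the line (split on a single space;
# an empty pattern would split the line into individual characters).
converted = df.select(
    unix_timestamp(regexp_extract(df["value"], r".+\[(.+) -", 1), "dd/MMM/yyyy:HH:mm:ss")
        .cast(TimestampType()),
    split(df["value"], " ")[8])

display(converted.take(10))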

Step by Step Solution
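
To see what regexp_extract is doing, it may help to replay the same pattern on one line in plain Python, outside Spark. The sketch below is only an illustration: the log line is a made-up example in Apache access-log style (the real files in the dataset may look different), and Python's re and datetime modules stand in for Spark's regexp_extract and unix_timestamp.

```python
import re
from datetime import datetime

# Hypothetical log line (an assumption, not taken from the dataset).
line = '10.0.0.1 - - [21/Jul/2014:09:00:00 -0800] "GET /index.html HTTP/1.1" 200 1234'

# Same pattern as in the question:
#   .+\[     everything up to and including the "["
#   (.+) -   capture group 1: the text that follows, up to the last " -"
# regexp_extract(col, pattern, 1) returns capture group 1.
match = re.search(r".+\[(.+) -", line)
raw_ts = match.group(1)

# Python counterpart of the Spark format string "dd/MMM/yyyy:HH:mm:ss".
parsed = datetime.strptime(raw_ts, "%d/%b/%Y:%H:%M:%S")

# Splitting on a single space puts the HTTP status code at index 8.
status = line.split(" ")[8]

print(raw_ts)
print(parsed)
print(status)
```

So, for a line shaped like this one, group 1 is the date and time without the UTC offset, which unix_timestamp then parses, and index 8 of the split is the status code. In Spark, regexp_extract returns an empty string when the pattern does not match. As for the original log entries: they are the unparsed lines in the value column of df, so display(df) in a Databricks notebook, or df.show(10, truncate=False) elsewhere, should show them as they were read.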

