Question
I have this code, which I run in a PySpark notebook. How do I display the original log entries?
And what is regexp_extract doing?
import re
from pyspark.sql.types import *
from pyspark.sql.functions import *

inputPath = "/databricks-datasets/sample_logs/"
df = sqlContext.read.text(inputPath)
converted = df.select(
    unix_timestamp(regexp_extract(df["value"], r".+\[(.+) -", 1), "dd/MMM/yyyy:HH:mm:ss")
        .cast(TimestampType()),
    split(df["value"], " ")[8])
display(converted.take(10))
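To see what the regexp_extract call is picking out, here is a minimal sketch in plain Python that applies the same pattern to a single line, without Spark. The sample log line is hypothetical (Apache access-log style, not taken from the dataset), and the helper name extract_group is made up for illustration. It assumes the fields are separated by single spaces and returns an empty string when nothing matches, which is what regexp_extract does as well.

```python
import re
from datetime import datetime

# Hypothetical log line in Apache access-log style; not taken from the dataset.
sample_line = '10.0.0.1 - - [21/Jun/2014:10:00:00 -0700] "GET /index.html HTTP/1.1" 200 1234'

# Same pattern as in the question: skip everything up to "[", then capture
# the text that follows, stopping before the last " -" on the line.
PATTERN = r".+\[(.+) -"


def extract_group(line, pattern=PATTERN, group=1):
    """Return the requested capture group, or "" when the pattern does not match."""
    match = re.search(pattern, line)
    return match.group(group) if match else ""


# Group 1 is the date and time inside the brackets, without the UTC offset.
timestamp_text = extract_group(sample_line)

# Python's "%d/%b/%Y:%H:%M:%S" plays the role of "dd/MMM/yyyy:HH:mm:ss" in the question.
parsed = datetime.strptime(timestamp_text, "%d/%b/%Y:%H:%M:%S")

# Splitting on single spaces and taking index 8 gives the HTTP status field.
status = sample_line.split(" ")[8]

print(timestamp_text)  # 21/Jun/2014:10:00:00
print(parsed)          # 2014-06-21 10:00:00
print(status)          # 200
```

In the notebook itself, the unmodified log lines are in the value column of df, so something like display(df) or df.show(10, truncate=False) should show the original entries; adding df["value"] to the select keeps the raw line next to the extracted columns.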