
Question

In this assignment, your task is to design an Earthquake Analyzer using Bogazici University Kandilli Observatory and Earthquake Research Institute data given in a .txt file.

Step by step, we will develop the following functionalities:

  • Reading and processing earthquake data from a .txt file,
  • Finding the distribution of depths and dates of the earthquakes as a 1D histogram,
  • Counting regional earthquakes in the format of a 2D grid-histogram, and returning the most active and passive regions,
  • Sorting the earthquakes according to their magnitude and retrieving the top-k largest earthquakes in magnitude and in depth.

Part I: Reading the Earthquake Data (20 pts)

The data for the PS is provided in the "input.txt" file. It includes the following information in each line:

  • Date: Date of the earthquake
  • Time: Time of the earthquake
  • Latit (Latitude): The angular distance of a place north or south of the earth's equator, or of the equator of a celestial object, usually expressed in degrees and minutes. Mathematically, you can think of this as the y-coordinate in a Cartesian coordinate system.
  • Long (Longitude): The angular distance of a place east or west of the Greenwich meridian, or west of the standard meridian of a celestial object, usually expressed in degrees and minutes. Mathematically, you can think of this as the x-coordinate in a Cartesian coordinate system.
  • Depth (km): The depth of the earthquake origin.
  • MD, ML, and Mw: Different types of magnitude for earthquake data.
  • Region: The geographical name of the origin.
  • Method: Observatory method, either quick (immediate) or corrected (refined) results.

Our goal is to read, extract, and store this data in memory for further processing. We will be using the Richter magnitude, a.k.a. local magnitude (ML), which is the most common measure to characterize the relative size of an earthquake recorded by seismographs.

Its scale is logarithmic, so each unit represents a ten-fold increase in the amplitude of the seismic waves. For example, a magnitude 4.0 earthquake produces seismic waves with 10 times the amplitude of a magnitude 3.0 one. Furthermore, we will ignore the Method column, since most of the methods are quick.
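The ten-fold rule can be written as a one-line calculation. The following is a minimal sketch; the function name `amplitude_ratio` is a hypothetical helper and not part of the assignment.

```python
def amplitude_ratio(m_big: float, m_small: float) -> float:
    """Ratio of seismic wave amplitudes implied by two magnitudes.

    Each whole unit on the logarithmic scale is a factor of 10.
    """
    return 10.0 ** (m_big - m_small)


# A 4.0 event compared with a 3.0 event, and a 5.0 event compared with a 3.0 event.
print(amplitude_ratio(4.0, 3.0))
print(amplitude_ratio(5.0, 3.0))
```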

We will be indexing the earthquake data according to the lines in the input file. Note that the first two lines in the input file are for the heading, and the rest represents earthquake entries, one entry per line. We need to be able to retrieve the information about an earthquake based on its index. For example, for index 63 (64th entry, line 66 if we start from line 1), we expect to see the following line:

 

MAG: 4.2 DEPTH: 5.0km REGION: SAGPAZAR-BAYAT DATE/TIME: 2021-12-17 12:28:32 LAT: 40.4855 LON: 34.20677

Note that we do not tell you what data structure to use here. It is up to you how to store this data as long as you can retrieve it based on this indexing and print it as you see in the example.
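One possible way to store the entries is a plain list of small records, so that the index of an earthquake is simply its position in the list. The sketch below is only an illustration under some assumptions: the names `Quake`, `parse_line`, `load_quakes`, and `describe` are hypothetical, the columns are assumed to be whitespace-separated in the order listed above, and the Method column is assumed to be either a single word such as "Quick" or a "REVISE.. (date time)" note, as in the data sample of the question.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Iterable, List, Optional


@dataclass
class Quake:
    when: datetime
    lat: float
    lon: float
    depth_km: float
    ml: Optional[float]  # None when the file shows the placeholder "-.-"
    region: str


def parse_line(line: str) -> Quake:
    tokens = line.split()
    date, time, lat, lon, depth, _md, ml, _mw = tokens[:8]
    rest = tokens[8:]  # region words followed by the Method column
    # Assumption: Method is one trailing word, or starts at a "REVISE.." token.
    cut = next((i for i, t in enumerate(rest) if t.startswith("REVISE")),
               len(rest) - 1)
    return Quake(
        when=datetime.strptime(f"{date} {time}", "%Y.%m.%d %H:%M:%S"),
        lat=float(lat),
        lon=float(lon),
        depth_km=float(depth),
        ml=None if ml == "-.-" else float(ml),
        region=" ".join(rest[:cut]),
    )


def load_quakes(lines: Iterable[str]) -> List[Quake]:
    # The first two lines are the heading; blank lines are skipped.
    return [parse_line(row) for row in list(lines)[2:] if row.strip()]


def describe(q: Quake) -> str:
    return (f"MAG: {q.ml} DEPTH: {q.depth_km}km REGION: {q.region} "
            f"DATE/TIME: {q.when} LAT: {q.lat} LON: {q.lon}")
```

With the file opened as `f`, `quakes = load_quakes(f)` followed by `print(describe(quakes[63]))` would print the entry at index 63. Because latitude and longitude are stored as floats here, trailing zeros from the file are not preserved; keeping them as strings is an equally valid choice.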

Part II: 1D Histogram (The Most Active Date) (20 pts)

In this part, you are going to compute a 1D histogram from the date data. Each entry carries a date made up of a day, a month, and a year; the dates in this text are written as dd.mm.yyyy, while the sample file stores them as yyyy.mm.dd. The data covers the earthquakes between 02.12.2021 and 18.12.2021 (both included). Including the starting and the ending dates, our data covers 17 days in total.

Your first task is to count how many earthquakes occurred per day and find the most active day in this interval. To build the histogram, you will allocate a bin for each day, i.e. the bin size will be one day. In the next step, you will count the number of earthquakes for each day. You can start from a list of zeros where the first index corresponds to the number of earthquakes that happened on 02.12.2021, the second index corresponds to 03.12.2021, and so on. After counting all of the entries in the input txt, the resulting histogram should be as follows:

 

[3, 56, 45, 53, 53, 33, 53, 44, 37, 36, 25, 37, 35, 32, 51, 41, 42]

The most active day is in the second bin (index 1) with 56 earthquakes, which corresponds to the date 2021-12-03.

Please note that using Python's datetime library is allowed. It lets you subtract one date object from another, a computation you implemented yourself in the midterm. It is handy when you have data spanning different months. For example, you can calculate the difference like this:

 

import datetime

# create a datetime object for the date 17.05.2020
x1 = datetime.datetime(2020, 5, 17)
# create a datetime object for the date 19.06.2020
x2 = datetime.datetime(2020, 6, 19)
# calculate the difference between x1 and x2 and print it in days
print((x2 - x1).days)

Hint: This may be useful for your histogram computation while calculating the index for a specific date.

Final remark: Your code should work with ANY date/time range, not just the dates in the given example.
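The counting described in this part can be sketched as follows, assuming the dates have already been parsed into `datetime.date` objects. The function names `daily_histogram` and `most_active_day` are hypothetical. The first bin is taken from the earliest date in the data rather than hard-coded, so the same code handles any date range.

```python
from datetime import date, timedelta
from typing import List, Tuple


def daily_histogram(days: List[date]) -> Tuple[List[int], date]:
    """Return one count per calendar day plus the date of the first bin."""
    first, last = min(days), max(days)
    counts = [0] * ((last - first).days + 1)  # both end dates included
    for d in days:
        counts[(d - first).days] += 1  # day offset doubles as the bin index
    return counts, first


def most_active_day(counts: List[int], first: date) -> Tuple[date, int]:
    """Return the date and count of the fullest bin (earliest one on ties)."""
    best = max(range(len(counts)), key=counts.__getitem__)
    return first + timedelta(days=best), counts[best]


# Tiny made-up example that crosses a month boundary.
days = [date(2021, 11, 30), date(2021, 12, 2), date(2021, 12, 2), date(2021, 12, 1)]
counts, first = daily_histogram(days)
print(counts, most_active_day(counts, first))
```

Days without any earthquake still get a bin with a count of zero, which keeps the index-to-date mapping consistent.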

Here is a sample of the input.txt data:

Date Time Latit(N) Long(E) Depth(km) MD ML Mw Region Method
---------- -------- -------- ------- ---------- ------------ ----------- -------
2021.12.18 22:27:57 36.2620 28.9352 10.5 -.- 3.1 3.2 AKDENIZ Quick
2021.12.18 21:53:20 35.0258 25.7753 5.0 -.- 2.7 -.- GIRIT ADASI ACIKLARI (AKDENIZ) Quick
2021.12.18 21:26:37 36.9778 27.7713 3.8 -.- 1.6 -.- GOKOVA KORFEZI (AKDENIZ) Quick
2021.12.18 20:35:30 37.8640 35.1760 5.4 -.- 1.5 -.- PINARBASI-CAMARDI (NIGDE) Quick
2021.12.18 19:23:36 37.8505 26.7575 7.5 -.- 2.0 -.- EGE DENIZI Quick
2021.12.18 18:27:00 37.7900 32.0702 4.3 -.- 1.4 -.- YATAGAN-MERAM (KONYA) Quick
2021.12.18 18:19:10 41.1435 43.9102 9.5 -.- 2.1 -.- ERMENISTAN Quick
2021.12.18 17:19:05 39.8660 41.8363 15.0 -.- 1.6 -.- KAYABASI-KOPRUKOY (ERZURUM) Quick
2021.12.18 16:35:51 37.0740 28.4037 0.0 -.- 1.7 -.- YESILOVA-ULA (MUGLA) Quick
2021.12.18 16:21:42 38.9238 33.6222 4.0 -.- 1.6 -.- FADILLI-SEREFLIKOCHISAR (ANKARA) Quick
2021.12.18 15:57:50 36.2732 33.7483 0.0 -.- 1.6 -.- AKDERE-SILIFKE (MERSIN) Quick
2021.12.18 14:45:02 37.0588 28.2883 0.0 -.- 1.3 -.- KUYUCAK-(MUGLA) Quick
2021.12.18 03:39:29 37.1152 27.6445 4.9 -.- 1.5 -.- MUMCULAR-BODRUM (MUGLA) Quick
2021.12.18 02:37:38 36.0638 27.5588 8.0 -.- 2.9 -.- AKDENIZ REVISE01 (2021.12.18 02:37:38)

I managed to create a regex for all the possible inputs they might give:

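import re

# One capture group per column; MD, ML and Mw may be the placeholder "-.-".
# The final alternative no longer demands extra leading whitespace, so rows with single spaces also match.
ptn = re.compile(
    r"^([\d.]+)\s+([\d:]+)\s+([\d.]+)\s+([\d.]+)\s+([\d.]+)\s+([\d.]+|-\.-)\s+([\d.]+|-\.-)\s+([\d.]+|-\.-)\s+([\w\s()-]+?)\s+([\w]+[\d]+\s+\([^)]*\)|[a-zA-Z]+)$")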
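One way to sanity-check a pattern like this is to run it over a few rows from the sample and inspect the captured groups. The sketch below uses a copy of the pattern whose final alternative is written as `[a-zA-Z]+` (no additional leading whitespace), so single-spaced rows match as well. The names `ROW` and `fields` are hypothetical.

```python
import re

# Groups: date, time, lat, lon, depth, MD, ML, Mw, region, method.
ROW = re.compile(
    r"^([\d.]+)\s+([\d:]+)\s+([\d.]+)\s+([\d.]+)\s+([\d.]+)"
    r"\s+([\d.]+|-\.-)\s+([\d.]+|-\.-)\s+([\d.]+|-\.-)"
    r"\s+([\w\s()-]+?)\s+([\w]+[\d]+\s+\([^)]*\)|[a-zA-Z]+)$"
)


def fields(line: str):
    """Return the ten captured columns, or None if the line is not a data row."""
    m = ROW.match(line.strip())
    return m.groups() if m else None


rows = [
    "Date Time Latit(N) Long(E) Depth(km) MD ML Mw Region Method",
    "2021.12.18 21:53:20 35.0258 25.7753 5.0 -.- 2.7 -.- GIRIT ADASI ACIKLARI (AKDENIZ) Quick",
    "2021.12.18 02:37:38 36.0638 27.5588 8.0 -.- 2.9 -.- AKDENIZ REVISE01 (2021.12.18 02:37:38)",
]
for row in rows:
    print(fields(row))
```

The heading line yields `None`, which gives a simple way to skip non-data lines. Region names containing characters outside `[\w\s()-]` would not match this pattern, so it is worth checking the full input file for such rows.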

Thank you very much in advance.
