Answered step by step
Verified Expert Solution
Link Copied!

Question

1 Approved Answer

1 Introduction Images in this specification are from Carl Kingsford s lectures on KD - Trees You have now seen several ADTs in Python that

1 Introduction
Images in this specification are from Carl Kingsfords lectures on KD-Trees
You have now seen several ADTs in Python that manage collections of ob- jects, like stacks, trees and queues. A common question that we need to ask of collections is as follows: what value in this collection is most similar to item X? This is called a nearest neighbour search. In this assignment, you will implement an abstract data structure called a kd-tree (short for k-dimensional tree) that efficiently supports this kind of a search.
2 KD-Trees: A primer
Figure 1. A KD-Tree where K =2. At left, an illustration of the way the tree partitions 2-dimensional space. At right, the nodes in the KD-tree.
A kd-tree is a form of a binary search tree that stores points in k-dimensional space. You can use a kd-tree to store a collection of points on a plane or in
1
higher-dimensional space. In this assignment, youll be building a kd-tree to store data about municipal trees and to quickly find trees based on their lo- cation. As in your labs, locations will be defined as longitude and latitude coordinates on a map.
At the root of a kd-tree, the x-value in the coordinate pair is used to sepa- rate nodes at the next level. All of the nodes to the right of the root have a larger or equal x-value than the node at the root and all nodes to the left of the root have a smaller x-value than the node at the root.
At the next level in the kd-tree it is the y-value that is used to separate nodes. In Figure 1, the nodes to the left of the one labelled (70,70) all have a y-value less than 70. The nodes to the right (if there were any) would all have a y-value greater than or equal to 70.
Because of the way data is organized within a kd-tree, we can efficiently query whether a given point exists within it as follows. Given a coordinate pair (X,Y), we start at the root of the tree. If the root node contains (X,Y), we know the value exists in the tree. However, if the x-value is strictly less than the x-value of the root node, we will search for (X,Y) in the left subtree. Otherwise, we will descend into the right subtree. When we descend a level, we will again ask if the node we are exploring contains (X,Y). If not, we will compare the y-value in our coordinate pair to the y-value at this node. If the y-value in the coordinate pair is strictly less than the y-value of the node, we will search for (X,Y) in the left subtree and otherwise, we will descend into the right subtree. We will continue this process, alternating between x and y comparisons at each level, until we reach the leaves of the kd-tree. We will either find the value we seek there or fail, in which case we know the value does not exist in our collection.
Your assignment will be to a KDTree class in Python that allows clients to build kd-trees and perform some lookup operations with them. We will specifi- cally encode operations to assess if a value exists in the tree as well as operations to locate points that are similar, or near a given query. This search for points that are similar to a query is called a nearest neighbour search. We will describe each operation in more detail below.
3 Step 1: Tree Construction
Constructing a KD-tree involves recursively partitioning the space in which our data points lie into smaller regions. These algorithm below assumes our data contains points in a 2-dimensional space and that the root of our KD-tree is anchored at a depth of 0. When you build your own KD-tree, your data will consist of longitude and latitudes that are associated with municipal trees. The first coordinate (x) will be the latitude, and the second (y) the longitude. The construction process (which you will encode in the build tree(data, depth)
2
method) follows:
4
If there is only one point in the input data, we will create a single leaf (or KDNode) to represent this point and return it. This will be the root of our tree, and were done.
Otherwise and if depth is even, we will split the data into two subsets based on the value of the x-coordinate. Call P1 the set of points with x-coordinates strictly less than the median value, and let P2 be all the remaining points.
If instead the depth is odd, we will split the list of input points into two subsets based on the value of the y-coordinate. Call P1 the set of points with y-coordinates strictly less than the median value, and let P2 be all the remaining points.
We will now recursively call build tree with input parameters P1 and depth+1(i.e. build tree(P 1, depth +1), and we will assign the KDNode returned from this recursive call to the variable vleft.
We will also recursively build tree with input parameters P2 and depth+1(i.e. call build tree(P 2, depth +1), and will assign the KDNode returned from this recursive call to a variable vright.
Finally, we will manufacture a KDNode to store the data point associated with the median and make vright the right child of this node and vleft the left

Step by Step Solution

There are 3 Steps involved in it

Step: 1

blur-text-image

Get Instant Access to Expert-Tailored Solutions

See step-by-step solutions with expert insights and AI powered tools for academic success

Step: 2

blur-text-image

Step: 3

blur-text-image

Ace Your Homework with AI

Get the answers you need in no time with our AI-driven, step-by-step assistance

Get Started

Recommended Textbook for

Linked Data A Geographic Perspective

Authors: Glen Hart, Catherine Dolbear

1st Edition

1000218910, 9781000218916

More Books

Students also viewed these Databases questions

Question

How did the authors address the fallacy of homogeneity?

Answered: 1 week ago