Question

1 Approved Answer

Posted on Sep 28, 2024

Kindly explain the attached image. Thank you.

176 CHAPTER 8 Workload Management in the Data Warehouse

weight to the already voluminous data warehouse from a workload perspective, causing overwhelming workloads and underperforming systems. Distributing the workload does not improve scalability and reduce workload, as anyone would anticipate since each distribution comes with a limited scalability.

New workloads and Big Data

Big Data brings about a new definition to the world of workloads. Apart from traditional challenges that exist in the world of data, the volume, velocity, variety, complexity, and ambiguous nature of Big Data creates a new class of challenges and issues. The key set of challenges and issues that we need to understand regarding data in the Big Data world include:

- Data does not have a finite architecture and can have multiple formats.
- Data is self-contained and needs several external business rules to be created to interpret and process the data.
- Data has a minimal or zero concept of referential integrity.
- Data is not relational.
- Data needs more analytical processing.
- Data depends on metadata for creating context.
- Data has no specificity with volume or complexity.
- Data is semi-structured or unstructured.
- Data needs multiple cycles of processing, but each cycle needs to be processed in one pass due to the size of the data.
- Data needs business rules for processing like we handle structured data today, but these rules need to be created in a rules engine architecture rather than the database or the ETL tool.
- Data needs more governance than data in the database.
- Data has no defined quality.

Big Data workloads

Workload management as it pertains to Big Data is completely different from traditional data and its management. The major areas where workload definitions are important to understand for design and processing efficiency include:

- Data is file based for acquisition and storage: whether you choose Hadoop, NoSQL, or any other technique, most of the Big Data is file based. The underlying reason for choosing file-based management is the ease of management of files, replication, and ability to store any format of data for processing.
- Data processing will happen in three steps:
  1. Discovery: in this step the data is analyzed and categorized.
  2. Analysis: in this step the data is associated with master data and metadata.
  3. Analytics: in this step the data is converted to metrics and structured
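
To make the three processing steps in the excerpt more concrete, here is a minimal Python sketch of how they might chain together. It is only an illustration under stated assumptions, not the book's method: the function names (`discover`, `analyze`, `to_metrics`), the `MASTER_DATA` lookup, and the sample records are all hypothetical and do not come from the text.

```python
import json
from collections import Counter

# Hypothetical master data (an assumption for this sketch): customer id -> region.
MASTER_DATA = {"c1": "EU", "c2": "US"}


def discover(raw_lines):
    """Step 1 (discovery): analyze each raw line and categorize it."""
    categorized = []
    for line in raw_lines:
        try:
            payload = json.loads(line)
        except json.JSONDecodeError:
            payload = None
        if isinstance(payload, dict):
            categorized.append({"kind": "semi-structured", "payload": payload})
        else:
            # Anything that is not a JSON object is treated as free text.
            categorized.append({"kind": "unstructured", "payload": line})
    return categorized


def analyze(categorized):
    """Step 2 (analysis): associate records with master data."""
    enriched = []
    for record in categorized:
        if record["kind"] != "semi-structured":
            continue  # free text would need its own rules; skipped in this sketch
        customer = record["payload"].get("customer")
        enriched.append({
            "customer": customer,
            "region": MASTER_DATA.get(customer, "unknown"),
            "amount": record["payload"].get("amount", 0),
        })
    return enriched


def to_metrics(enriched):
    """Step 3 (analytics): convert the enriched records into a metric."""
    totals = Counter()
    for row in enriched:
        totals[row["region"]] += row["amount"]
    return dict(totals)


# Made-up sample input: a file-like list of lines in mixed formats.
raw = [
    '{"customer": "c1", "amount": 10}',
    "free text note",
    '{"customer": "c2", "amount": 5}',
    '{"customer": "c1", "amount": 2}',
]
metrics = to_metrics(analyze(discover(raw)))
print(metrics)
```

Each function makes a single pass over its input, which loosely mirrors the excerpt's point that every processing cycle has to complete in one pass because of the data size.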