Answered step by step
Verified Expert Solution
Question
1 Approved Answer
Please answer with the vlookup formula needed for the Customer Status in Part 3 and formulas needed for Parts 4-6 (ignore parts 1-2). Part 1:
Please answer with the vlookup formula needed for the "Customer Status" in Part 3 and formulas needed for Parts 4-6 (ignore parts 1-2).
Part 1: ETL with the OrderId field Let's decide that the rule is to leave OrderID in Source 1 alone and remove the "A" from Source 2. Try this: 1) Click on the "Full Set" tab. 2) Click on cell B2. 3) Press "=" to start a formula, switch to the Source 1 tab, and click on A2 there. 4) Press Enter and you'll see the OrderId from Source 1. 5) Copy that formula down to cell B30 on the Full Set tab (you'll see it's labeled "Data From Source 1"). 6) Now click on cell B31. 7) Press "=" to start a formula, and type "=RIGHT('Source 2'!A2, LEN('Source 2'!A2)-1) 8) Press Enter and you'll see the OrderId from Source 2 without the leading A 9) Copy that formula down to cell B59 on the Full Set tab (the part labeled Data From Source 2). Dissecting the formula: RIGHT(value, n) is an Excel function that takes the right n characters of value. So RIGHT("HELLO", 2) will return "LO". LEN(value) returns the number of characters contained in value. So LEN(123) and LEN("DOG) both return 3. SO LEN('Source 2'!A2)-1 looks at the length of the cell A2 and returns everything except the first character. Here's an example: Let's say the cell contains A12345. The LENgth is 6, so length -1 is 5. Now if you take the right 5 characters of A12345 you get only 12345. So you've transformed your data into a new format! Part 2: ETL with the Customer State/Province field Now let's look at the Customer State/Province field. Our rule will be that state and provinces (for Canada) names will be displayed using their abbreviation (i.e., PA instead of Pennsylvania, ON instead of Ontario). To do this, we can use the "State/Province Lookup" table that has been created in the "Lookups worksheet. Take a quick look at that table and then follow the instructions below: 1) Click on the "Full Set" tab. 2) Click on cell E2. 3) Press "=" to start a formula, switch to the Source 1 tab, and click on D2 there. 4) Press Enter and you'll see the Orderlo from Source 1. 5) Copy that formula down to cell E30 on the Full Set tab (you'll see it's labeled "Data From Source 1"). 6) Now click on cell E31. 7) Press "=" to start a formula, and type VLOOKUP('Source 2'!E2, Lookups!$A$3:$B$62,2,FALSE) 8) Press Enter and you'll see the state abbreviation from Source 2 ("KS") instead of the full name ("Kansas") 9) Copy that formula down to cell B59 on the Full Set tab (the part labeled Data From Source 2). Dissecting the formula: VLOOKUP(lookup_value, table array, column index, range lookup) is an Excel function that will match a value with another value in a separate table. So lookup_value" is value that you're looking for. So in this case Excel will look for the value contained in cell E2 in the Source 2 worksheet. In this case, that value is "Kansas". And table array" is the table where you're going to do your search. The table is from A3 to B62 on the "Lookups worksheet. Notice that the first column of that table is in alphabetical order. That is what it uses to find a match; if the first column isn't in alphabetical order (or ascending numerical order) the function won't work. Also, you need to use the dollar signs to keep the cell references from changing when you copy the formula to the other cells on the Full Set worksheet. In other words, you're lookup value keeps changing, but your lookup table is always the same. The parameter "column_index indicates column number with the value that is returned. Notice that column 2 has all of the state abbreviations. Finally, "range_lookup" is TRUE if we are looking for approximate matches and FALSE if we are looking for exact matches. Unless you have a good reason to do so, always use FALSE. Part 3: Finish the process Perform the ETL process on the rest of the fields in the Full Set worksheet: Customer Full Name* Customer City Customer Status* Order Date Product ID Product Unit Price Quantity Discount Full Price Extended Price Total Discount* * These are fields with inconsistent data between the fields. In most cases you'll just be copying the data from each worksheet without transformation (like you did in the first five steps in Parts 1 and 2). For example, Order Date is represented in the same way in Source 1 and Source 2. However, in other cases, such as Customer Full Name, Customer Status, and Total Discount, you'll need to transform the data. You may transform either Source 1 or Source 2, depending on the transformation rule you create. However, in each case you need to create and document a rule for each field, and apply that rule to your data. Here is a summary of the remaining inconsistencies: Source 1 Field Customer Full Name as one field Source 2 Field Customer First Name and Customer Last Name as separate fields Customer Status as 1, 2, and 3. 3 is the best. Customer Status as "Silver," "Gold," and "Platinum." Platinum is the best. Total Discount included Total Discount not computed Formula for combining Customer First Name and Customer Last Name ... CONCATENATE(value1, value2...): Combines the values in two or more cells Example: CONCATENATE(A1,", HELLO") will append the string", HELLO" to the end of whatever is in cell A1 Use a nested if statement to change customer status from numeric number to word description. To prevent error from happening, if the customer status is not 1, 2, and 3, "Invalid Status will be displayed to that cell. Part 4: Add the "Credit Line" information from "Source 3" to "Full Set" We are changing the original credit line policy so that a minimum credit line of $2,000 is being established, even if the customer had an original credit line of $0, it is now changed to $2,000. Create a new column called "New Credit Line" in Source 3 worksheet to reflect these changes. Use the VLOOKUP() function to put this updated information into the "Full Set" worksheet. You will notice that even if you do it correctly, there will be some errors (N/A" values). Question: Which customer(s) lacks usable data when you apply the VLOOKUP() function? Explain what is causing this problem? [5 points) After answering the question above, make the necessary change to the Source 3 worksheet to correct the issue so that usable Credit Line data appears for all the customers. Part 5: Add the "Missed Payments" information from "Source 3 to "Full Set We have observed an anomaly in the "Missed Payment" field. In the Source 3 worksheet, currently if a customer has no missed payments, their value for that field is NONE. We want to add a new column called "Missed Payments 2" to convert the string value of "NONE" to a numeric value of 0. Use the IE() function to do this. Once you do the transformation, use the VLOOKUP() function to bring the data in the Missed Payments 2" column into the "Full Set worksheet. Question: Write the data transformation rule for the missed payment field (list the syntax of the IFO function and explain the criteria you used to transform the data). [5 points] Part 6 Add the "Country" information from "Source 3 to "Full Set" We have observed a lack of consistency in the "Country"field. Notice that "United States" string is represented in several different ways. Choose one of those representations and transform the remaining "Country data so that the value for the United States is consistent across all customers. Create "Country 2" column and use it to hold the transformed data. Use an IF() statement to transform the data. Then use the VLOOKUP() function to bring the data into the "Full Set" worksheet. Question: Write the data transformation rule for the country field (list the syntax of the IF() function and explain the criteria you used to transform the data). [5 points] PA 12 PA PA Customer Order Name Customer City 10248 Suzan Plock Pittsburgh 10248 Suzan Plock Pittsburgh 10248 Suzan Plock Pittsburgh 10249 Allan Strate Miami 10249 Allan Strate Miami 10250 Elnora Willison Philadelphia 10250 Elnora Willison Philadelphia 10250 Elnora Willison Philadelphia 10251 Daniela Becknell Tacoma 10251 Daniela Becknell Tacoma 10251 Daniela Becknell Tacoma 10252 Cathrine Delamater Philadelphia 10252 Cathrine Delamater Philadelphia 10252 Cathrine Delamater Philadelphia 10253 Leota Vonderheide Philadelphia 10253 Leota Vonderheide Philadelphia 10253 Leota Vonderheide Philadelphia 10254 Tyrone Hine Pittsburgh 10254 Tyrone Hine Pittsburgh 10254 Tyrone Hine Pittsburgh 10255 Christin Tillinghast Tacoma 10255 Christin Tillinghast Tacoma 10255 Christin Tillinghast Tacoma 10255 Christin Tillinghast Tacoma 10256 Kisha Grauer Salt Lake City 10256 Kisha Grauer Salt Lake City 10257 Darryl Manuelito Charleston 10257 Darryl Manuelito Charleston 10257 Darryl Manuelito Charleston Customer Customer State/Province Status Silver Silver Silver FL Platinum FL Platinum PA Platinum Platinum PA Platinum WA Platinum WA Platinum WA Platinum Platinum PA Platinum Platinum PA Platinum PA Platinum PA Platinum PA Gold Gold PA Gold WA Gold WA Gold WA Gold WA Gold UT Silver Silver SC Silver Silver SC Silver PA Order Product Unit Discoun Date ID Product Price Quantity t Jan-11 11 Queso Cabrales $14.00 $0.00 Jan-11 42 Singaporean Hokkien Fried Mee $9.80 10 $0.00 Jan-11 72 Mozzarella di Giovanni $34.80 $0.00 Jan-11 51 Manjimup Dried Apples $42.40 40 $0.00 Jan-11 14 Tofu $18.60 $0.00 Jan-11 51 Manjimup Dried Apples $42.40 35 $0.15 Jan-11 41 Jack's New England Clam Chowc $7.70 10 $0.00 Jan-11 65 Louisiana Fiery Hot Pepper Sauc $16.80 15 $0.15 Jan-11 65 Louisiana Fiery Hot Pepper Sauc $16.80 $0.00 Jan-11 22 Gustaf's Knckebrd $16.80 $0.05 Jan-11 57 Ravioli Angelo $15.60 $0.05 Jan-11 33 Geitost $2.00 25 $0.05 Jan-11 20 Sir Rodney's Marmalade $64.80 $0.05 Jan-11 60 Camembert Pierrot $27.20 $0.00 Jan-11 49 Maxilaku $16.00 $0.00 Jan-11 39 Chartreuse verte $14.40 $0.00 Jan-11 31 Gorgonzola Telino $10.00 $0.00 Jan-11 55 Pt chinois $19.20 $0.15 Jan-11 74 Longlife Tofu $8.00 $0.00 Jan-11 24 Guaran Fantstica $3.60 15 $0.15 Jan-11 36 Inlagd Sill $15.20 $0.00 Jan-11 59 Raclette Courdavault $44.00 $0.00 Jan-11 16 Pavlova $13.90 $0.00 Jan-11 2 Chang $15.20 $0.00 Jan-11 53 Perth Pasties $26.20 $0.15 Jan-11 77 Original Frankfurter grne Soe $10.40 $0.00 Jan-11 27 Schoggi Schokolade $35.10 $0.00 Jan-11 39 Chartreuse verte $14.40 $0.00 Jan-11 77 Original Frankfurter grne Soe $10.40 15 $0.00 Extended Total Full Price Price Discoun $168.00 $168.00 $0.00 $98.00 $98.00 $0.001 $174.00 $174.00 $0.00 $1,696.00 $1,696.00 $0.00 $167.40 $167.40 $0.00 $1,484.00 $1,261.40 $222.60 $77.00 $77.00 $0.00 $252.00 $214.20 $37.80 $336.00 $336.00 $0.00 $100.80 $95.76 $5.04 $234.00 $234.00 $222.30 $11.70 $50.00 $47.50 $2.50 $2,592.00 $2,462.40 $129.60 $1,088.00 $1,088.00 $0.00 $640.00 $640.00 $0.00 $604.80 $604.80 $0.00 $200.00 $200.00 $0.00 $403.20 $342.72 $60.48 $168.00 $168.00 $0.00 $54.00 $45.90 $8.10 $380.00 $380.00 $0.00 $1,320.00 $1,320.00 $0.00 $486.50 $486.50 $486.50 $70 $0.00 $304.00 $304.00 $0.00 $393.00 $393.00 $0.00 $124.80 $124.80 $0.00 $877.50 $877.50 $0.00 $86.40 $86.40 $0.00 $156.00 $156.00 $0.00 PA 30 15 UT SC OrderID Customer First Na Customer Last Na Customer City Customer State/Province Customer Status Order Date Product ID Product Unit Price Quantity Discount Full Prict Extende Total Discount A10258 Milagros Fehrenbach Kansas City Kansas Jan-11 2 Chang $15.20 $0.20 $760.00 $608.00 A10258 Milagros Fehrenbach Kansas City Kansas Jan-11 32 Mascarpone Fabioli $25.60 $0.20 $153.60 $122.88 A10258 Milagros Fehrenbach Kansas City Kansas Jan-11 5 Chef Anton's Gumbo Mix $17.00 $0.20 $1,105.00 $884.00 A10259 Julio Willard Greensboro North Carolina Jan-11 21 Sir Rodney's Scones $8.00 $0.00 $80.00 $80.00 (Notice that this A10259 Julio Willard Greensboro North Carolina Jan-11 37 Gravad lax $20.80 $0.00 $20.80 $20.80 data is missing! A10260 Roslyn Plott Colombus Ohio Jan-11 70 Outback Lager $12.00 $0.25 $252.00 $189.00 A10260 Roslyn Plott Colombus Ohio Jan-11 57 Ravioli Angelo $15.60 $0.00 $780.00 $780.00 But also notice A10260 Roslyn Plott Colombus Ohio Jan-11 62 Tarte au sucre $39.40 $0.25 $591.00 $443.25 it is Full Price - A10260 Roslyn Plott Colombus Ohio Jan-11 41 Jack's New England Clam Chowc $7.70 $0.25 $123.20 $92.40 Extended Price A10261 Eve Haak Detroit Michigan Jan-11 35 Steeleye Stout $14.40 $0.00 $288.00 $288.00 A10261 Eve Haak Detroit Michigan Jan-11 21 Sir Rodney's Scones $8.00 $0.00 $160.00 $160.00 A10262 Max Kindle Colombus Ohio Jan-11 5 Chef Anton's Gumbo Mix $17.00 $0.20 $204.00 $163.20 A10262 Max Kindle Colombus Ohio Jan-11 7 Uncle Bob's Organic Dried Pears $24.00 $0.00 $360.00 $360.00 A10262 Max Kindle Colombus Ohio Jan-11 56 Gnocchi di nonna Alice $30.40 $0.00 $60.80 $60.80 A10263 Roxie Prewitt Provo Utah Jan-11 16 Pavlova $13.90 $0.25 $834.00 $625.50 A10263 Roxie Prewitt Provo Utah Jan-11 30 Nord-Ost Matjeshering $20.70 $0.25 $1,242.00 $931.50 A10263 Roxie Prewitt Provo Utah Jan-11 74 Longlife Tofu $8.00 $0.25 $288.00 $216.00 A10263 Roxie Prewitt Provo Utah Jan-11 24 Guaran Fantstica $3.60 $0.00 $100.80 $100.80 A10264 Erik Vire Philadelphia Pennsylvania Jan-11 2 Chang $15.20 $0.00 $532.00 $532.00 A10264 Erik Vire Philadelphia Pennsylvania Jan-11 41 Jack's New England Clam Chowc $7.70 $0.15 $192.50 $163.62 A10265 Lonnie Ludy Kansas City Missouri Jan-11 17 Alice Mutton $31.20 $0.00 $936.00 $936.00 A10265 Lonnie Ludy Kansas City Missouri Jan-11 70 Outback Lager $12.00 $0.00 $240.00 $240.00 A10266 Jamie Woodbridge Seattle Washington Jan-11 12 Queso Manchego La Pastora $30.40 $0.05 $364.80 $346.56 A10267 Eleanor Aburto Toronto Ontario Jan-11 76 Lakkalikri $14.40 $0.15 $216.00 $183.60 A10267 Eleanor Aburto Toronto Ontario Jan-11 40 Boston Crab Meat $14.70 $0.00 $735.00 $735.00 A10267 Eleanor Aburto Toronto Ontario Jan-11 59 Raclette Courdavault $44.00 $0.15 $3,080.00 $2,618.00 A10268 Odessa Smith Providence Rhode Island Jan-11 29 Thringer Rostbratwurst $99.00 $0.00 $990.00 $990.00 A10268 Odessa Smith Providence Rhode Island Jan-11 72 Mozzarella di Giovanni $27.80 $0.00 $111.20 $111.20 A10269 Jamie Sabat Vancover British Colombia Jan-11 72 Mozzarella di Giovanni $27.80 $0.05 $556.00 $528.20 A10269 Jamie Sabat Vancover British Colombia Jan-11 33 Geitost 60 $0.05 $120.00 $114.00 wwNNE WNNNNNNNNNNN wwwwwwNNN $2.00 168 0 ototoOOOO 174 1696 167.4 1184 168 98 174 1696 167.4 1261.4 ........ 222.6 15 37.8 PA WA 20 WA 0.05 15 5.04 11.7 WA PA 2.5 129.6 Data from Source 1 252 336 100.8 234 50 2592 1088 640 604.8 200 403.2 168 Silver Silver Silver Platinum Platinum Platinum Platinum Platinum Platinum Platinum Platinum Platinum Platinum Platinum Platinum Platinum Platinum Gold Gold Gold Gold Gold Gold Gold Silver Silver Silver Silver Silver 0 0 0 60.48 0 8.1 10248 Suzan Plock 10248 Suzan Plock 10248 Suzan Plock 10249 Allan Strate 10249 Allan Strate 10250 Elnora Willison 10250 Elnora Willison 10250 Elnora Willison 10251 Daniela Becknell 10251 Daniela Becknell 10251 Daniela Becknell 10252 Cathrine Delamater 10252 Cathrine Delamater 10252 Cathrine Delamater 10253 Lepta Vonderhelde 10253 Leota Vonderheide 10253 Leota Vonderheide 10254 Tyrone Hine 10254 Tyrone Hine 10254 Tyrone Hine 10255 Christin Tillinghast 10255 Christin Tillinghast 10255 Christin Tillinghast 10255 Christin Tillinghast 10256 Kisha Grauer 10256 Kisha Grauer 10257 Darryl Manuelito 10257 Darryl Manuelito 10257 Darryl Manuelito 10258 Milagros Fehrenbach 10258 Milagros Fehrenbach 10258 Milagros Fehrenbach 10259 Julio Willard 10259 Julio Willard 10260 Roslyn Plott 10260 Roslyn Plott 10260 Roslyn Plott 10260 Roslyn Plott 10261 Eve Haak 10261 Eve Haak 10262 Max Kindle 10262 Max Kindle 10262 Max Kindle 10263 Roxe Prewitt 10263 Roxie Prewitt 10263 Roxie Prewitt 10263 Roxie Prewitt 10264 Erik Vire 10264 Erik Vire 10265 Lonnie Ludy 10265 Lonnie Ludy 10266 Jamie Woodbridge WA WA WA WA Pittsburgh Pittsburgh Pittsburgh Miami Miami Philadelphia Philadelphia Philadelphia Tacoma Tacoma Tacoma Philadelphia Philadelphia Philadelphia Philadelphia Philadelphia Philadelphia Pittsburgh Pittsburgh Pittsburgh Tacoma Tacoma Tacoma Tacoma Salt Lake City Salt Lake City Charleston Charleston Charleston Kansas City Kansas City Kansas City Greensboro Greensboro Colombus Colombus Colombus Colombus Detroit Detroit Colombus Colombus Colombus Provo Provo Provo Provo Philadelphia Philadelphia Kansas City Kansas City Seattle Jan-11 Jan-11 Jan-11 Jan-11 Jan-11 Jan-11 Jan-11 Jan-11 Jan-11 Jan-11 Jan-11 Jan-11 Jan-11 Jan-11 Jan-11 Jan-11 Jan-11 Jan-11 Jan-11 Jan-11 Jan-11 Jan-11 Jan-11 Jan-11 Jan-11 Jan-11 Jan-11 Jan-11 Jan-11 Jan-11 Jan-11 Jan-11 Jan-11 Jan-11 Jan-11 Jan-11 Jan-11 Jan-11 Jan-11 Jan-11 Jan-11 Jan-11 Jan-11 Jan-11 Jan-11 Jan-11 Jan-11 Jan-11 Jan-11 Jan-11 Jan-11 Jan-11 11 Queso C 14 42 Singapor 9.8 72 Mozzarel 34.8 51 Manjimu 42.4 14 Tofu 51 Manjimu 12.4 41 Jack's Ne 7.7 65 Louisiana 16.8 65 Louisiana 16.8 22 Gustaf's 16.8 57 Ravioli A 15.6 33 Geitost 2 20 Sir Rodn 64.8 6 0 Camemt 27.2 49 Maxilaku 16 39 Chartreu 14.4 31 Gorgonzi 10 55 Pat chir 19.2 74 Longlife 8 24 Guaran 3.6 36 Inlagd Si15.2 59 Radette 44 16 Pavlova 13.9 2 Chang 15.2 53 Perth Pa 26.2 77 Original 10.4 27 Scheggi 35.1 39 Chartreu 14.4 7 7 Original 10.4 2 Chang 15.2 32 Mascarp: 25.6 5 Chef Ant 17 21 Sir Rodn8 37 Gravad la 20.8 70 Outback 12 57 Ravioli A 15.6 62 Tarte au 39.4 41 Jack's Ne 7.7 35 Steeleye 14.4 21 Sir Rodn8 5 Chef Ant 17 7 Unde Bo24 56 Gnocchi 30.4 16 Pavlova 13.9 30 Nord-Ost 20.7 74 Longlife 8 24 Guaran 3.6 2 Chang 15.2 41 Jack's Ne 7.7 17 Alice Mul 31.2 70 Outback 12 12 Queso M 30.4 380 1320 486.5 304 393 124.8 877.5 86.4 55888222225 214.2 336 95.76 222.3 47.5 2462.4 10BB 640 604.8 200 342.72 16B 45.9 380 1320 486.5 304 393 124.8 877.5 86.4 156 60B 122.88 884 BO 20.8 189 780 443.25 92.4 2BB 160 163.2 156 EL....................... 152 30.72 221 0 760 153.6 1105 80 20.8 252 780 0 0 0.25 N OH 15 591 0.25 147.75 30.8 MI Data from Source 123.2 288 160 204 OH 0.2 0 40.8 D 360 360 0 0.25 0.25 60.8 834 208.5 310.5 0.25 0 UT 0 60.B 625.5 931.5 216 100.B 532 163.62 936 240 346.56 1242 298 100.8 532 192.5 936 240 364.8 0 0.15 28.88 MO MO 0 18.24 WA 0.05 AL MB MD MN State and Province Lookup State/Province Abbreviation Alabama Alaska AK Alberta Arizona Arkansas British Columbia California Colorado Connecticut Delaware Florida Georgia Hawaii Idaho Illinois Indiana Iowa Kansas Kentucky Louisiana Maine Manitoba Maryland Massachusetts MA Michigan MI Minnesota Mississippi Missouri MO Montana Nebraska Nevada New Brunswick NB New Hampshire New Jersey New Mexico NM New York Newfoundland and Labrador North Carolina North Dakota Nova Scotia Ohio Oklahoma Ontario Oregon Pennsylvania Prince Edward Island Quebec Rhode Island Saskatchewan South Carolina South Dakota Tennessee Texas Utah MT NE NH NY NL NC 2 Z 8 8@EX 5 Customer Name Original Credit Line Missed Payment Country Allan Strate $10,000.00 1 USA Cathrine Delamater $20,000.00 3 USA Christin Tillinghast $8,000.00 5 United States Daniela Becknell $10,000.00 NONE United States Darryl Manuelito $10,000.00 NONE US Eleanor Aburto $10,000.00 NONE Canada Elnora Willison $20,000.00 3 US Erik Vire $0.00 NONE US Eve Haak $5,000.00 NONE United States Jamie Sabat $5,000.00 NONE Canada Jamie Woodbridge $5,000.00 5 USA Julio Willard $10,000.00 NONE USA Kisha Grauer $7,500.00 NONE USA Leota Vonderheide $7,500.00 3 USA Lonnie Ludy $10,000.00 2 USA Max Kindel $10,000.00 NONE USA Milagros Fehrenbac $0.00 NONE USA Odessa Smith $20,000.00 NONE USA Roslyn Plott $20,000.00 NONE US Roxie Prewitt $20,000.00 3 US Suzan Plock $7,500.00 NONE US Tyrone Hine $20,000.00 2 US Part 1: ETL with the OrderId field Let's decide that the rule is to leave OrderID in Source 1 alone and remove the "A" from Source 2. Try this: 1) Click on the "Full Set" tab. 2) Click on cell B2. 3) Press "=" to start a formula, switch to the Source 1 tab, and click on A2 there. 4) Press Enter and you'll see the OrderId from Source 1. 5) Copy that formula down to cell B30 on the Full Set tab (you'll see it's labeled "Data From Source 1"). 6) Now click on cell B31. 7) Press "=" to start a formula, and type "=RIGHT('Source 2'!A2, LEN('Source 2'!A2)-1) 8) Press Enter and you'll see the OrderId from Source 2 without the leading A 9) Copy that formula down to cell B59 on the Full Set tab (the part labeled Data From Source 2). Dissecting the formula: RIGHT(value, n) is an Excel function that takes the right n characters of value. So RIGHT("HELLO", 2) will return "LO". LEN(value) returns the number of characters contained in value. So LEN(123) and LEN("DOG) both return 3. SO LEN('Source 2'!A2)-1 looks at the length of the cell A2 and returns everything except the first character. Here's an example: Let's say the cell contains A12345. The LENgth is 6, so length -1 is 5. Now if you take the right 5 characters of A12345 you get only 12345. So you've transformed your data into a new format! Part 2: ETL with the Customer State/Province field Now let's look at the Customer State/Province field. Our rule will be that state and provinces (for Canada) names will be displayed using their abbreviation (i.e., PA instead of Pennsylvania, ON instead of Ontario). To do this, we can use the "State/Province Lookup" table that has been created in the "Lookups worksheet. Take a quick look at that table and then follow the instructions below: 1) Click on the "Full Set" tab. 2) Click on cell E2. 3) Press "=" to start a formula, switch to the Source 1 tab, and click on D2 there. 4) Press Enter and you'll see the Orderlo from Source 1. 5) Copy that formula down to cell E30 on the Full Set tab (you'll see it's labeled "Data From Source 1"). 6) Now click on cell E31. 7) Press "=" to start a formula, and type VLOOKUP('Source 2'!E2, Lookups!$A$3:$B$62,2,FALSE) 8) Press Enter and you'll see the state abbreviation from Source 2 ("KS") instead of the full name ("Kansas") 9) Copy that formula down to cell B59 on the Full Set tab (the part labeled Data From Source 2). Dissecting the formula: VLOOKUP(lookup_value, table array, column index, range lookup) is an Excel function that will match a value with another value in a separate table. So lookup_value" is value that you're looking for. So in this case Excel will look for the value contained in cell E2 in the Source 2 worksheet. In this case, that value is "Kansas". And table array" is the table where you're going to do your search. The table is from A3 to B62 on the "Lookups worksheet. Notice that the first column of that table is in alphabetical order. That is what it uses to find a match; if the first column isn't in alphabetical order (or ascending numerical order) the function won't work. Also, you need to use the dollar signs to keep the cell references from changing when you copy the formula to the other cells on the Full Set worksheet. In other words, you're lookup value keeps changing, but your lookup table is always the same. The parameter "column_index indicates column number with the value that is returned. Notice that column 2 has all of the state abbreviations. Finally, "range_lookup" is TRUE if we are looking for approximate matches and FALSE if we are looking for exact matches. Unless you have a good reason to do so, always use FALSE. Part 3: Finish the process Perform the ETL process on the rest of the fields in the Full Set worksheet: Customer Full Name* Customer City Customer Status* Order Date Product ID Product Unit Price Quantity Discount Full Price Extended Price Total Discount* * These are fields with inconsistent data between the fields. In most cases you'll just be copying the data from each worksheet without transformation (like you did in the first five steps in Parts 1 and 2). For example, Order Date is represented in the same way in Source 1 and Source 2. However, in other cases, such as Customer Full Name, Customer Status, and Total Discount, you'll need to transform the data. You may transform either Source 1 or Source 2, depending on the transformation rule you create. However, in each case you need to create and document a rule for each field, and apply that rule to your data. Here is a summary of the remaining inconsistencies: Source 1 Field Customer Full Name as one field Source 2 Field Customer First Name and Customer Last Name as separate fields Customer Status as 1, 2, and 3. 3 is the best. Customer Status as "Silver," "Gold," and "Platinum." Platinum is the best. Total Discount included Total Discount not computed Formula for combining Customer First Name and Customer Last Name ... CONCATENATE(value1, value2...): Combines the values in two or more cells Example: CONCATENATE(A1,", HELLO") will append the string", HELLO" to the end of whatever is in cell A1 Use a nested if statement to change customer status from numeric number to word description. To prevent error from happening, if the customer status is not 1, 2, and 3, "Invalid Status will be displayed to that cell. Part 4: Add the "Credit Line" information from "Source 3" to "Full Set" We are changing the original credit line policy so that a minimum credit line of $2,000 is being established, even if the customer had an original credit line of $0, it is now changed to $2,000. Create a new column called "New Credit Line" in Source 3 worksheet to reflect these changes. Use the VLOOKUP() function to put this updated information into the "Full Set" worksheet. You will notice that even if you do it correctly, there will be some errors (N/A" values). Question: Which customer(s) lacks usable data when you apply the VLOOKUP() function? Explain what is causing this problem? [5 points) After answering the question above, make the necessary change to the Source 3 worksheet to correct the issue so that usable Credit Line data appears for all the customers. Part 5: Add the "Missed Payments" information from "Source 3 to "Full Set We have observed an anomaly in the "Missed Payment" field. In the Source 3 worksheet, currently if a customer has no missed payments, their value for that field is NONE. We want to add a new column called "Missed Payments 2" to convert the string value of "NONE" to a numeric value of 0. Use the IE() function to do this. Once you do the transformation, use the VLOOKUP() function to bring the data in the Missed Payments 2" column into the "Full Set worksheet. Question: Write the data transformation rule for the missed payment field (list the syntax of the IFO function and explain the criteria you used to transform the data). [5 points] Part 6 Add the "Country" information from "Source 3 to "Full Set" We have observed a lack of consistency in the "Country"field. Notice that "United States" string is represented in several different ways. Choose one of those representations and transform the remaining "Country data so that the value for the United States is consistent across all customers. Create "Country 2" column and use it to hold the transformed data. Use an IF() statement to transform the data. Then use the VLOOKUP() function to bring the data into the "Full Set" worksheet. Question: Write the data transformation rule for the country field (list the syntax of the IF() function and explain the criteria you used to transform the data). [5 points] PA 12 PA PA Customer Order Name Customer City 10248 Suzan Plock Pittsburgh 10248 Suzan Plock Pittsburgh 10248 Suzan Plock Pittsburgh 10249 Allan Strate Miami 10249 Allan Strate Miami 10250 Elnora Willison Philadelphia 10250 Elnora Willison Philadelphia 10250 Elnora Willison Philadelphia 10251 Daniela Becknell Tacoma 10251 Daniela Becknell Tacoma 10251 Daniela Becknell Tacoma 10252 Cathrine Delamater Philadelphia 10252 Cathrine Delamater Philadelphia 10252 Cathrine Delamater Philadelphia 10253 Leota Vonderheide Philadelphia 10253 Leota Vonderheide Philadelphia 10253 Leota Vonderheide Philadelphia 10254 Tyrone Hine Pittsburgh 10254 Tyrone Hine Pittsburgh 10254 Tyrone Hine Pittsburgh 10255 Christin Tillinghast Tacoma 10255 Christin Tillinghast Tacoma 10255 Christin Tillinghast Tacoma 10255 Christin Tillinghast Tacoma 10256 Kisha Grauer Salt Lake City 10256 Kisha Grauer Salt Lake City 10257 Darryl Manuelito Charleston 10257 Darryl Manuelito Charleston 10257 Darryl Manuelito Charleston Customer Customer State/Province Status Silver Silver Silver FL Platinum FL Platinum PA Platinum Platinum PA Platinum WA Platinum WA Platinum WA Platinum Platinum PA Platinum Platinum PA Platinum PA Platinum PA Platinum PA Gold Gold PA Gold WA Gold WA Gold WA Gold WA Gold UT Silver Silver SC Silver Silver SC Silver PA Order Product Unit Discoun Date ID Product Price Quantity t Jan-11 11 Queso Cabrales $14.00 $0.00 Jan-11 42 Singaporean Hokkien Fried Mee $9.80 10 $0.00 Jan-11 72 Mozzarella di Giovanni $34.80 $0.00 Jan-11 51 Manjimup Dried Apples $42.40 40 $0.00 Jan-11 14 Tofu $18.60 $0.00 Jan-11 51 Manjimup Dried Apples $42.40 35 $0.15 Jan-11 41 Jack's New England Clam Chowc $7.70 10 $0.00 Jan-11 65 Louisiana Fiery Hot Pepper Sauc $16.80 15 $0.15 Jan-11 65 Louisiana Fiery Hot Pepper Sauc $16.80 $0.00 Jan-11 22 Gustaf's Knckebrd $16.80 $0.05 Jan-11 57 Ravioli Angelo $15.60 $0.05 Jan-11 33 Geitost $2.00 25 $0.05 Jan-11 20 Sir Rodney's Marmalade $64.80 $0.05 Jan-11 60 Camembert Pierrot $27.20 $0.00 Jan-11 49 Maxilaku $16.00 $0.00 Jan-11 39 Chartreuse verte $14.40 $0.00 Jan-11 31 Gorgonzola Telino $10.00 $0.00 Jan-11 55 Pt chinois $19.20 $0.15 Jan-11 74 Longlife Tofu $8.00 $0.00 Jan-11 24 Guaran Fantstica $3.60 15 $0.15 Jan-11 36 Inlagd Sill $15.20 $0.00 Jan-11 59 Raclette Courdavault $44.00 $0.00 Jan-11 16 Pavlova $13.90 $0.00 Jan-11 2 Chang $15.20 $0.00 Jan-11 53 Perth Pasties $26.20 $0.15 Jan-11 77 Original Frankfurter grne Soe $10.40 $0.00 Jan-11 27 Schoggi Schokolade $35.10 $0.00 Jan-11 39 Chartreuse verte $14.40 $0.00 Jan-11 77 Original Frankfurter grne Soe $10.40 15 $0.00 Extended Total Full Price Price Discoun $168.00 $168.00 $0.00 $98.00 $98.00 $0.001 $174.00 $174.00 $0.00 $1,696.00 $1,696.00 $0.00 $167.40 $167.40 $0.00 $1,484.00 $1,261.40 $222.60 $77.00 $77.00 $0.00 $252.00 $214.20 $37.80 $336.00 $336.00 $0.00 $100.80 $95.76 $5.04 $234.00 $234.00 $222.30 $11.70 $50.00 $47.50 $2.50 $2,592.00 $2,462.40 $129.60 $1,088.00 $1,088.00 $0.00 $640.00 $640.00 $0.00 $604.80 $604.80 $0.00 $200.00 $200.00 $0.00 $403.20 $342.72 $60.48 $168.00 $168.00 $0.00 $54.00 $45.90 $8.10 $380.00 $380.00 $0.00 $1,320.00 $1,320.00 $0.00 $486.50 $486.50 $486.50 $70 $0.00 $304.00 $304.00 $0.00 $393.00 $393.00 $0.00 $124.80 $124.80 $0.00 $877.50 $877.50 $0.00 $86.40 $86.40 $0.00 $156.00 $156.00 $0.00 PA 30 15 UT SC OrderID Customer First Na Customer Last Na Customer City Customer State/Province Customer Status Order Date Product ID Product Unit Price Quantity Discount Full Prict Extende Total Discount A10258 Milagros Fehrenbach Kansas City Kansas Jan-11 2 Chang $15.20 $0.20 $760.00 $608.00 A10258 Milagros Fehrenbach Kansas City Kansas Jan-11 32 Mascarpone Fabioli $25.60 $0.20 $153.60 $122.88 A10258 Milagros Fehrenbach Kansas City Kansas Jan-11 5 Chef Anton's Gumbo Mix $17.00 $0.20 $1,105.00 $884.00 A10259 Julio Willard Greensboro North Carolina Jan-11 21 Sir Rodney's Scones $8.00 $0.00 $80.00 $80.00 (Notice that this A10259 Julio Willard Greensboro North Carolina Jan-11 37 Gravad lax $20.80 $0.00 $20.80 $20.80 data is missing! A10260 Roslyn Plott Colombus Ohio Jan-11 70 Outback Lager $12.00 $0.25 $252.00 $189.00 A10260 Roslyn Plott Colombus Ohio Jan-11 57 Ravioli Angelo $15.60 $0.00 $780.00 $780.00 But also notice A10260 Roslyn Plott Colombus Ohio Jan-11 62 Tarte au sucre $39.40 $0.25 $591.00 $443.25 it is Full Price - A10260 Roslyn Plott Colombus Ohio Jan-11 41 Jack's New England Clam Chowc $7.70 $0.25 $123.20 $92.40 Extended Price A10261 Eve Haak Detroit Michigan Jan-11 35 Steeleye Stout $14.40 $0.00 $288.00 $288.00 A10261 Eve Haak Detroit Michigan Jan-11 21 Sir Rodney's Scones $8.00 $0.00 $160.00 $160.00 A10262 Max Kindle Colombus Ohio Jan-11 5 Chef Anton's Gumbo Mix $17.00 $0.20 $204.00 $163.20 A10262 Max Kindle Colombus Ohio Jan-11 7 Uncle Bob's Organic Dried Pears $24.00 $0.00 $360.00 $360.00 A10262 Max Kindle Colombus Ohio Jan-11 56 Gnocchi di nonna Alice $30.40 $0.00 $60.80 $60.80 A10263 Roxie Prewitt Provo Utah Jan-11 16 Pavlova $13.90 $0.25 $834.00 $625.50 A10263 Roxie Prewitt Provo Utah Jan-11 30 Nord-Ost Matjeshering $20.70 $0.25 $1,242.00 $931.50 A10263 Roxie Prewitt Provo Utah Jan-11 74 Longlife Tofu $8.00 $0.25 $288.00 $216.00 A10263 Roxie Prewitt Provo Utah Jan-11 24 Guaran Fantstica $3.60 $0.00 $100.80 $100.80 A10264 Erik Vire Philadelphia Pennsylvania Jan-11 2 Chang $15.20 $0.00 $532.00 $532.00 A10264 Erik Vire Philadelphia Pennsylvania Jan-11 41 Jack's New England Clam Chowc $7.70 $0.15 $192.50 $163.62 A10265 Lonnie Ludy Kansas City Missouri Jan-11 17 Alice Mutton $31.20 $0.00 $936.00 $936.00 A10265 Lonnie Ludy Kansas City Missouri Jan-11 70 Outback Lager $12.00 $0.00 $240.00 $240.00 A10266 Jamie Woodbridge Seattle Washington Jan-11 12 Queso Manchego La Pastora $30.40 $0.05 $364.80 $346.56 A10267 Eleanor Aburto Toronto Ontario Jan-11 76 Lakkalikri $14.40 $0.15 $216.00 $183.60 A10267 Eleanor Aburto Toronto Ontario Jan-11 40 Boston Crab Meat $14.70 $0.00 $735.00 $735.00 A10267 Eleanor Aburto Toronto Ontario Jan-11 59 Raclette Courdavault $44.00 $0.15 $3,080.00 $2,618.00 A10268 Odessa Smith Providence Rhode Island Jan-11 29 Thringer Rostbratwurst $99.00 $0.00 $990.00 $990.00 A10268 Odessa Smith Providence Rhode Island Jan-11 72 Mozzarella di Giovanni $27.80 $0.00 $111.20 $111.20 A10269 Jamie Sabat Vancover British Colombia Jan-11 72 Mozzarella di Giovanni $27.80 $0.05 $556.00 $528.20 A10269 Jamie Sabat Vancover British Colombia Jan-11 33 Geitost 60 $0.05 $120.00 $114.00 wwNNE WNNNNNNNNNNN wwwwwwNNN $2.00 168 0 ototoOOOO 174 1696 167.4 1184 168 98 174 1696 167.4 1261.4 ........ 222.6 15 37.8 PA WA 20 WA 0.05 15 5.04 11.7 WA PA 2.5 129.6 Data from Source 1 252 336 100.8 234 50 2592 1088 640 604.8 200 403.2 168 Silver Silver Silver Platinum Platinum Platinum Platinum Platinum Platinum Platinum Platinum Platinum Platinum Platinum Platinum Platinum Platinum Gold Gold Gold Gold Gold Gold Gold Silver Silver Silver Silver Silver 0 0 0 60.48 0 8.1 10248 Suzan Plock 10248 Suzan Plock 10248 Suzan Plock 10249 Allan Strate 10249 Allan Strate 10250 Elnora Willison 10250 Elnora Willison 10250 Elnora Willison 10251 Daniela Becknell 10251 Daniela Becknell 10251 Daniela Becknell 10252 Cathrine Delamater 10252 Cathrine Delamater 10252 Cathrine Delamater 10253 Lepta Vonderhelde 10253 Leota Vonderheide 10253 Leota Vonderheide 10254 Tyrone Hine 10254 Tyrone Hine 10254 Tyrone Hine 10255 Christin Tillinghast 10255 Christin Tillinghast 10255 Christin Tillinghast 10255 Christin Tillinghast 10256 Kisha Grauer 10256 Kisha Grauer 10257 Darryl Manuelito 10257 Darryl Manuelito 10257 Darryl Manuelito 10258 Milagros Fehrenbach 10258 Milagros Fehrenbach 10258 Milagros Fehrenbach 10259 Julio Willard 10259 Julio Willard 10260 Roslyn Plott 10260 Roslyn Plott 10260 Roslyn Plott 10260 Roslyn Plott 10261 Eve Haak 10261 Eve Haak 10262 Max Kindle 10262 Max Kindle 10262 Max Kindle 10263 Roxe Prewitt 10263 Roxie Prewitt 10263 Roxie Prewitt 10263 Roxie Prewitt 10264 Erik Vire 10264 Erik Vire 10265 Lonnie Ludy 10265 Lonnie Ludy 10266 Jamie Woodbridge WA WA WA WA Pittsburgh Pittsburgh Pittsburgh Miami Miami Philadelphia Philadelphia Philadelphia Tacoma Tacoma Tacoma Philadelphia Philadelphia Philadelphia Philadelphia Philadelphia Philadelphia Pittsburgh Pittsburgh Pittsburgh Tacoma Tacoma Tacoma Tacoma Salt Lake City Salt Lake City Charleston Charleston Charleston Kansas City Kansas City Kansas City Greensboro Greensboro Colombus Colombus Colombus Colombus Detroit Detroit Colombus Colombus Colombus Provo Provo Provo Provo Philadelphia Philadelphia Kansas City Kansas City Seattle Jan-11 Jan-11 Jan-11 Jan-11 Jan-11 Jan-11 Jan-11 Jan-11 Jan-11 Jan-11 Jan-11 Jan-11 Jan-11 Jan-11 Jan-11 Jan-11 Jan-11 Jan-11 Jan-11 Jan-11 Jan-11 Jan-11 Jan-11 Jan-11 Jan-11 Jan-11 Jan-11 Jan-11 Jan-11 Jan-11 Jan-11 Jan-11 Jan-11 Jan-11 Jan-11 Jan-11 Jan-11 Jan-11 Jan-11 Jan-11 Jan-11 Jan-11 Jan-11 Jan-11 Jan-11 Jan-11 Jan-11 Jan-11 Jan-11 Jan-11 Jan-11 Jan-11 11 Queso C 14 42 Singapor 9.8 72 Mozzarel 34.8 51 Manjimu 42.4 14 Tofu 51 Manjimu 12.4 41 Jack's Ne 7.7 65 Louisiana 16.8 65 Louisiana 16.8 22 Gustaf's 16.8 57 Ravioli A 15.6 33 Geitost 2 20 Sir Rodn 64.8 6 0 Camemt 27.2 49 Maxilaku 16 39 Chartreu 14.4 31 Gorgonzi 10 55 Pat chir 19.2 74 Longlife 8 24 Guaran 3.6 36 Inlagd Si15.2 59 Radette 44 16 Pavlova 13.9 2 Chang 15.2 53 Perth Pa 26.2 77 Original 10.4 27 Scheggi 35.1 39 Chartreu 14.4 7 7 Original 10.4 2 Chang 15.2 32 Mascarp: 25.6 5 Chef Ant 17 21 Sir Rodn8 37 Gravad la 20.8 70 Outback 12 57 Ravioli A 15.6 62 Tarte au 39.4 41 Jack's Ne 7.7 35 Steeleye 14.4 21 Sir Rodn8 5 Chef Ant 17 7 Unde Bo24 56 Gnocchi 30.4 16 Pavlova 13.9 30 Nord-Ost 20.7 74 Longlife 8 24 Guaran 3.6 2 Chang 15.2 41 Jack's Ne 7.7 17 Alice Mul 31.2 70 Outback 12 12 Queso M 30.4 380 1320 486.5 304 393 124.8 877.5 86.4 55888222225 214.2 336 95.76 222.3 47.5 2462.4 10BB 640 604.8 200 342.72 16B 45.9 380 1320 486.5 304 393 124.8 877.5 86.4 156 60B 122.88 884 BO 20.8 189 780 443.25 92.4 2BB 160 163.2 156 EL....................... 152 30.72 221 0 760 153.6 1105 80 20.8 252 780 0 0 0.25 N OH 15 591 0.25 147.75 30.8 MI Data from Source 123.2 288 160 204 OH 0.2 0 40.8 D 360 360 0 0.25 0.25 60.8 834 208.5 310.5 0.25 0 UT 0 60.B 625.5 931.5 216 100.B 532 163.62 936 240 346.56 1242 298 100.8 532 192.5 936 240 364.8 0 0.15 28.88 MO MO 0 18.24 WA 0.05 AL MB MD MN State and Province Lookup State/Province Abbreviation Alabama Alaska AK Alberta Arizona Arkansas British Columbia California Colorado Connecticut Delaware Florida Georgia Hawaii Idaho Illinois Indiana Iowa Kansas Kentucky Louisiana Maine Manitoba Maryland Massachusetts MA Michigan MI Minnesota Mississippi Missouri MO Montana Nebraska Nevada New Brunswick NB New Hampshire New Jersey New Mexico NM New York Newfoundland and Labrador North Carolina North Dakota Nova Scotia Ohio Oklahoma Ontario Oregon Pennsylvania Prince Edward Island Quebec Rhode Island Saskatchewan South Carolina South Dakota Tennessee Texas Utah MT NE NH NY NL NC 2 Z 8 8@EX 5 Customer Name Original Credit Line Missed Payment Country Allan Strate $10,000.00 1 USA Cathrine Delamater $20,000.00 3 USA Christin Tillinghast $8,000.00 5 United States Daniela Becknell $10,000.00 NONE United States Darryl Manuelito $10,000.00 NONE US Eleanor Aburto $10,000.00 NONE Canada Elnora Willison $20,000.00 3 US Erik Vire $0.00 NONE US Eve Haak $5,000.00 NONE United States Jamie Sabat $5,000.00 NONE Canada Jamie Woodbridge $5,000.00 5 USA Julio Willard $10,000.00 NONE USA Kisha Grauer $7,500.00 NONE USA Leota Vonderheide $7,500.00 3 USA Lonnie Ludy $10,000.00 2 USA Max Kindel $10,000.00 NONE USA Milagros Fehrenbac $0.00 NONE USA Odessa Smith $20,000.00 NONE USA Roslyn Plott $20,000.00 NONE US Roxie Prewitt $20,000.00 3 US Suzan Plock $7,500.00 NONE US Tyrone Hine $20,000.00 2 USStep by Step Solution
There are 3 Steps involved in it
Step: 1
Get Instant Access to Expert-Tailored Solutions
See step-by-step solutions with expert insights and AI powered tools for academic success
Step: 2
Step: 3
Ace Your Homework with AI
Get the answers you need in no time with our AI-driven, step-by-step assistance
Get Started