
Question


import pandas as pd
from pandas import ExcelWriter
from pandas import ExcelFile
import os.path as op
from openpyxl import workbook
import re

def extract_export_columns(df, list_of_columns, file_path):
    column_df = df[list_of_columns]
    column_df.to_csv(file_path, index=False, sep="|")

# Original file
input_base_path = 'C:/Users/somedoc input'
main_df_data_file = pd.read_csv(op.join(input_base_path, 'som_excel_doc.csv'))

#Filter for tailnumbers
tail_numbers = main_df_data_file['UK 1 - ZM135'] <= 30
main_df_data_file[tail_numbers]

#iterate over list
#number_filter = main_df_data_file.Updated.isin(["15"])
#main_df_data_file[number_filter]
#print(number_filter)
#for row in main_df_data_file.values:
    #for value in row:
    #    print(value)
    #print(row) # to check the condition

# Product of code
output_base_path = r'C:\Users\some_doc output'
extract_export_columns(main_df_data_file, ['Updated 28 Feb 18 Tail #'], op.join(output_base_path, 'USN_example3.txt'))

Above is my current code. I have a giant Excel file with thousands of entries and, worse, the columns and rows are not neat: each cell can hold several pieces of data. What I've noticed is that a number called 'tail #' is missing in some of them. What I want to do is search each cell for that number: if the cell has it, copy that cell; if it does not, move on to the next column in the row. Then repeat that for all cells.

There is a giant header, but I removed it with formatting when I converted the file to CSV. This is also why I am searching for a number: there are several headers. For example, there are year headers such as 2010, followed by several empty columns until the next one, maybe 10 columns later. Also, please note that under each year header there are several columns of data per row, separated by two columns with no info.

The info in a cell looks like this: '13|something something|some more words'. If it has a number, as you see here, I want to copy it. The numbers seem to range from 0 to no greater than 30. Lastly, I'm trying to write this using pandas, but I may need a more manual approach because isin and iloc were not working.
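One possible way to do the cell-by-cell search described above is sketched here. It rests on assumptions that are not confirmed by the question: that a tail number is a one- or two-digit number from 0 to 30 at the very start of a cell, followed either by a '|' or by the end of the cell. The names tail_number and collect_tail_cells are hypothetical helpers made up for this illustration, not part of pandas.

```python
import re

import pandas as pd

# Assumption: the tail number is 1-2 digits at the start of the cell,
# followed by a '|' separator or by the end of the cell.
TAIL_RE = re.compile(r"^\s*(\d{1,2})\s*(?:\||$)")


def tail_number(cell):
    """Return the leading number of a cell as an int if it is 0..30, else None."""
    if not isinstance(cell, str):
        return None  # empty cells arrive as NaN or None, not as text
    match = TAIL_RE.match(cell)
    if match is None:
        return None  # e.g. a year header such as '2010' or plain words
    number = int(match.group(1))
    return number if number <= 30 else None


def collect_tail_cells(df):
    """Walk every row and column, keeping each cell that starts with a tail number."""
    records = []
    for row_index, row in df.iterrows():
        for column, cell in row.items():
            number = tail_number(cell)
            if number is not None:
                records.append(
                    {"row": row_index, "column": column,
                     "tail_number": number, "cell": cell}
                )
    return pd.DataFrame(records, columns=["row", "column", "tail_number", "cell"])


# Small made-up sample standing in for the real CSV
sample = pd.DataFrame(
    {
        "a": ["13|something something|some more words", None, "2010"],
        "b": [None, "words only", "7|foo|bar"],
        "c": ["45|too big", "30|ok", None],
    }
)
found = collect_tail_cells(sample)
print(found)
```

With the real file, reading it as text, for example pd.read_csv(path, dtype=str, header=None), should keep every cell as a string so the pattern can be applied to it. The resulting frame can then be written out with to_csv(file_path, index=False, sep="|") as in the original function.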
