Question
import pandas as pd from pandas import ExcelWriter from pandas import ExcelFile import os.path as op from openpyxl import workbook import re def extract_export_columns(df, list_of_columns,
import pandas as pd from pandas import ExcelWriter from pandas import ExcelFile import os.path as op from openpyxl import workbook import re
def extract_export_columns(df, list_of_columns, file_path): column_df = df[list_of_columns] column_df.to_csv(file_path, index=False, sep="|") #Orrginal file input_base_path = 'C:/Users/somedoc input' main_df_data_file = pd.read_csv(op.join(input_base_path, 'som_excel_doc.csv '))
#Filter for tailnumbers tail_numbers = main_df_data_file['UK 1 - ZM135'] <= 30 main_df_data_file[tail_numbers] #iterate over list #number_filter = main_df_data_file.Updated.isin(["15"]) #main_df_data_file[number_filter] #print(number_filter) #for row in main_df_data_file.values: #for value in row: # print(value) #print(row) # to check the condition
# Product of code output_base_path = r'C:\Users\some_doc output' extract_export_columns(main_df_data_file, ['Updated 28 Feb 18 Tail #'], op.join(output_base_path, 'USN_example3.txt'))
Above is my current code. I have a giant excel with thousands of entries and worse, the column/rows are not neat. There are several pieces of data in each column cell per row. What I've noticed is that a number called 'tail #' is missing in some of them. What I want to do is search for that number, if it has it then copy that cell, if it does not then go to the next column in the row. Then repeat that for all cells. There is a giant header, but when I transformed it into CSV, I removed that with formatting. This is also why I am looking for a number because there are several headers. for example, years that say like 2010 but then several empty columns till the next one maybe 10 cols later. Also please not that under this header of years are several columns of data per row that are separated by two columns with no info. Also, the info in a column looks like this, '13|something something|some more words'. If it has a number as you see, I want to copy it. The numbers seem to range from 0 to no greater than 30. Lastly, I'm trying to write this using pandas but I may need a more manual way to do things because using isin, and iloc was not working.
Step by Step Solution
There are 3 Steps involved in it
Step: 1
Get Instant Access to Expert-Tailored Solutions
See step-by-step solutions with expert insights and AI powered tools for academic success
Step: 2
Step: 3
Ace Your Homework with AI
Get the answers you need in no time with our AI-driven, step-by-step assistance
Get Started