Data mining subject
1. Summarize the article.
2. What is the data size?
3. Which records were applied?
4. What techniques are used?
5. Explain the results.

EMPIRICAL STUDY ON SELECTION OF TEAM MEMBERS FOR SOFTWARE PROJECTS: DATA MINING APPROACH

SANGITA GUPTA (1), SUMA V (2)
(1) Jain University, Bangalore (2) Dayananda Sagar Institute, Bangalore, India

Abstract

One of the essential requisites of any software industry is the development of customer satisfied products. However, accomplishing the aforesaid business objective depends upon the depth of quality of the product that is engineered in the organization. Thus, generation of high quality depends upon process, which in turn depends upon the people. The existing scenario in IT industries demands deploying the right personnel to achieve desirable quality in the product through the existing process. The goal of this paper is to identify the criteria which will be used in industrial practice to select members of a software project team, and to look for relationships between these criteria and project success. Using semi-structured interviews and qualitative methods for data analysis and synthesis, a set of team building criteria was identified from project managers in industry. The findings show that the consistent use of the set of criteria correlated significantly with project success, and the criteria related to human factors present strong correlations with software quality and thereby project success. This knowledge enables decision making for project managers in allocation of the right personnel to realize the desired level.

I. INTRODUCTION

The main objective of any software company is to provide quality software to its customers. Even the best software system is bound to fail without the right people working on it. One of the ways to achieve the highest level of quality in a software system is to discover knowledge for deploying project personnel to a project by predicting their performance. The knowledge is hidden in the data set and is extractable through data mining techniques. The present paper is designed to justify the capabilities of data mining techniques in the context of software success by offering a data mining model for software companies to select the right personnel for their project. In this research, the classification method is used to evaluate project members' performance. By this task we extract knowledge that describes project members' performance in the current project. It helps earlier identification of parameters related to the human component, resulting in better software quality and thereby project success.

Software engineering is a discipline that aims at producing high quality software through a systematic, well disciplined approach to software development. It involves methods, tools, best practices and standards to achieve its objective [5]. However, software engineering is not only about tools and methods but also about the human aspect involved in working on it. Even the best software system cannot be developed without the correct team members. Therefore the human aspect in software engineering, which is an important basis for software quality, needs more understanding and deeper investigation. To achieve high quality software, it is essential to extract knowledge from the large dataset related to project members. The main purpose of knowledge infrastructure for project management is to provide information from past experience of the organization to improve the execution of new projects. To achieve this objective, the knowledge infrastructure has to compile and organize empirical data which is present in the systems and is available for use by project managers [6]. Consequently, the key elements of building knowledge infrastructure are collecting and organizing the knowledge, making it available through models, and reusing it to improve the execution of projects.

In software development the main components can be broadly classified as human aspect (people) and processes. Though processes have been well organized and developed, the human aspect is still at a preliminary stage of study. Without deep consideration of the human aspect of software engineering, even the best of processes will not give the desired quality. To accomplish the above said objective of software quality, organizations are now looking deeply into human aspects using various techniques, some of them non-parametric and some parametric data mining methods. Data mining techniques in software engineering have proved to be important tools for decision making by management. The data collected require a proper method of extracting knowledge from large repositories for better decision making. Knowledge discovery in databases (KDD), often called data mining, aims at the discovery of useful information from large collections of data. The main functions of data mining are applying various methods and algorithms in order to discover and extract patterns of stored data [2]. Data mining and knowledge discovery applications have got a rich focus due to their significance in decision making, and have become an essential component in various organizations. Data mining techniques have been introduced into new fields of Statistics, Databases, Machine Learning and Pattern Recognition. There is increasing research interest in using data mining in every aspect of technology.

Data mining is concerned with developing methods that discover knowledge from data originating from empirical environments [2]. Data mining uses many techniques such as Decision Trees, Neural Networks, Naive Bayes, K-Nearest Neighbour, and many others. Using these techniques, many kinds of knowledge can be discovered, such as association rules, classifications and clustering. The discovered knowledge can be used for prediction in diverse applications [2]. Section II has more references on applications.

The main objective of this paper is to use data mining methodologies to predict project members' performance for a particular project. Data mining provides many tasks that could be used to study project members' performance. In this research, the classification task is used to evaluate project members' performance. There are many approaches that are used for data classification; the decision tree method is used here. Information such as college percentile, experience, domain knowledge assessment, communication skills, reasoning skills and time efficiency was collected from the project management system for prediction of performance for that project.

The organization of the paper is as follows. Section II specifies the related work in the domains of data mining. Section III provides the research methodology followed during this investigation. Section IV presents the research work and technique development details. Section V indicates the results obtained by the classification technique for effective project management. Section VI summarizes and concludes the paper.

II. BACKGROUND AND RELATED WORK

The increasing demand for software has led to continual research in the areas of quality assurance and effective project management [6]. Data mining and pattern recognition techniques have proven to be among the established techniques for effective project management. Data mining has been used for many aspects of software projects, such as defect management, test analysis and code optimization [4]. The authors in [8] have used data mining for bug report classification using text data mining. Data mining has also been used by the authors in [9] for other domains such as educational databases. The authors in [7] have developed a data mining framework based on decision trees and association rules to generate useful rules for personnel selection and retention based on several attributes of employees in high technology industry.

Data mining, also popularly known as Knowledge Discovery in Databases, refers to extracting or mining knowledge from large amounts of data. The sequence of steps identified in extracting knowledge from data is shown in Figure 1.

Figure 1. Research Methodology

Various algorithms and techniques such as Classification, Clustering, Regression, Artificial Intelligence, Neural Networks, Association Rules, Decision Trees, Genetic Algorithms and the Nearest Neighbour method are used for the data mining process [2]. Our techniques and methods in data mining need brief mention for better understanding.

Classification is the most commonly applied data mining technique, which employs a set of pre-classified examples to develop a model that can classify the population of records at large. This approach frequently employs decision tree or neural network based classification algorithms. The classifier training algorithm uses these pre-classified examples to determine the set of parameters required for proper discrimination. The algorithm then encodes these parameters into a model called a classifier. The authors will be using a decision tree for their research work.

A decision tree [...] represents a choice between a number of alternatives [...] take actions. From this node, users split each node recursively according to the decision tree learning algorithm. The final result is a decision tree in which each branch represents a possible scenario of decision and its outcome. Decision trees are tree-shaped structures that represent sets of decisions. These decisions generate rules for the classification of a dataset. Specific decision tree methods include Classification and Regression Trees (CART) and Chi-Square Automatic Interaction Detection (CHAID). The authors in [1] have done a comparative study of the methods. The authors in [11] have developed many decision tree algorithms such as ID3 and C5. The authors in [3] have investigated an incremental method for finding the next node of the decision tree.

The Decision Trees algorithm is a classification algorithm used for predictive modeling of multivariate attributes. For discrete attributes, the algorithm makes predictions based on the existing relationships between input columns in a dataset. It uses the values, known as states, of those columns to predict the states of a column that you designate as predictable. Specifically, the algorithm identifies the input columns that are correlated with the predictable column. The authors in [10] have used a knowledge-based decision tree algorithm which uses feature selection to guide the selection of the most useful attributes. In this study we have developed an algorithm to find the attributes using an incremental method according to their mapping with performance. Thereafter a decision tree was constructed based on the derived knowledge.

III. RESEARCH METHODOLOGY

This research focuses upon the selection of project personnel using a classification technique for effective project management, thereby resulting in good software quality. In order to achieve the aforementioned objective, a deep investigation is carried out upon similar non-critical projects from software industries to get the parameters for selection of project personnel. Empirical data analysis includes application of decision tree techniques to predict the efficiency of each project member for further deployment. The observational results indicate that though most software companies lay a lot of emphasis on the general percentile aggregate, other factors like domain specific knowledge and reasoning skills play a significant role for best performance in the organization. The most important factor which was analyzed was programming skills. Hereby the selection criteria have to be reframed, giving more weightage to aspects like programming skills, depth in domain knowledge and reasoning skills rather than aggregate percentile.

Data mining and pattern recognition are gaining popularity because of their potential to enhance our understanding and to identify, extract and evaluate variables related to any process. By means of this classification method on multivariate attributes, it was found that factors like project members' programming skills, reasoning, knowledge assessment and other attributes were highly correlated with the project member's performance, rather than GPA.

Data selection and transformation: In this step only those fields were selected which were required for data mining. A few derived variables were selected, while some of the information for the variables was extracted from the database. All the predictor and response variables which were derived from the database are given in Table I for reference. The huge data collected was thereby sampled and analyzed. Therefore, this work is directed towards formulation of a hypothesis for selection criteria of project personnel, which has further impact on the software quality of software projects. Modes of data collection include interactions with the project development team and human resource management.

Data preparation: The training data set used in this study was obtained from software companies of Bangalore. Initially the size of the data is 40. In this step data stored in different tables was joined in a single table; after the joining process, errors were removed.

IV. RESEARCH WORK

PROPOSED TECHNIQUE: In our research work, a data mining technique is used which is based on the Attribute Selection Measure function (heuristic) in the C4.5 algorithm [10]. The drawback of the C4.5 heuristic function (gain ratio) is that, if the split information approaches zero, the ratio becomes unstable. In the proposed technique, for the split criteria we have considered the maximum occurrences of each attribute value and then calculated the average maximum occurrences of the combination of each category attribute; thus the split information never reaches zero, and more importance is given to realistic attributes and accurate results. The algorithm is as follows.

Step 1. Let D, the data partition, be a training set of class-labelled tuples. Suppose the class label attribute has m distinct values defining m distinct classes, Ci (for i = 1, 2, ..., m). Let Ci,D be the set of tuples of class Ci in D, and let |D| and |Ci,D| denote the number of tuples in D and Ci,D respectively. Suppose attribute A on partition D has distinct values a1, a2, ..., av, as observed from the training data.

Step 2. Calculate the Attribute Selection Measurement Function (ASMF) for that attribute. The steps for calculating this function are as follows:
2.1 Number of occurrences of each attribute value.
2.2 Occurrences of each category attribute.

Step 3. Compute the average maximum occurrence for each attribute, which denotes the ASMF: [...] (ai, Ci,D) [...]
3.1 Maximum occurrence of combination of each category, and repeat Step 2 [...] is maximum.

Step 4. Then, on the basis of the sorted values of ASMF, we divide the given training set into subsets and move to another level of the tree.

Step 5. Then we repeat the same steps on each new subset iteratively and derive a decision tree.

V. EMPIRICAL DATA ANALYSIS USING DECISION TREE ALGORITHM

Data was collected from a software company in Bangalore. It was preprocessed and prepared for analysis. It was subjected to the data mining technique and the [...] related to the hypothesis. Data preparation is shown in Table I, which lists the project personnel's attributes selected for analysis, and Table II, which holds the data with values of the attributes mentioned in Table I.

TABLE I. SOFTWARE PROJECT PERSONNEL RELATED [...]

The domain values for some of the variables were defined for the present investigation as follows. All attribute marks are normalized out of 10.

GPA: previous institution marks.
DKA: Domain Knowledge Assessment marks; performance in the domain knowledge assessment of the company.
PS: programming skills; results obtained by taking an internal assessment on programming concepts.
CS: communication skills; results obtained by seminar presentation of the employee. Seminar performance is evaluated into the following classes. Poor: presentation and communication skill is low. Average: either presentation is fine or communication skill is average. Good: both presentation and communication skill is good.
RS: reasoning skills; reasoning skills performance.
GP: General Proficiency performance; overall performance from the previous project.
TE: time efficiency of the employee.
P: performance.

The results obtained by the algorithm can be put in a confusion matrix. A confusion matrix is a table that shows the results of the classification experiment. The term (ai, Ci,D) is calculated by dividing the number of occurrences of ai in Ci,D [...]. The best attribute is one for which good maps to good performance. Therefore the ideal matrix should be as [...]. The ASMF is ideally 3, when all good perform good, all average perform average and all poor perform poor.

[Confusion matrices for GPA and PS]

Therefore ASMF(GPA) = 3/8 + 6/18 + 5/15, reported as 1.07. Similarly for PS. Computing for all attributes we get the following.

TABLE III. ASMF SCORE FOR ALL ATTRIBUTES
[Table III: ASMF scores for all attributes]

Based on this computation (Table III) we can derive a decision tree with PS, which has the highest ASMF, as the root node and the other attributes further down in their order. One classification rule can be generated for each path from each terminal node to the root node. A pruning technique was executed by removing nodes with less than the desired number of objects, and after the tree pruning process we have the following rules.

RULE 1: If (PS = GOOD) and (GPA = GOOD or AVERAGE) and (RS = GOOD or AVERAGE) and (DKA = GOOD or AVERAGE) and (CS = GOOD or AVERAGE) then P = GOOD
RULE 2: If (PS = AVERAGE) and (GPA = AVERAGE or GOOD) and (RS = GOOD or AVERAGE) and [...] then P = GOOD
RULE 3: If (PS = GOOD) and (GPA = AVERAGE or POOR) and (RS = AVERAGE) and (DKA = AVERAGE) and (CS = GOOD or AVERAGE) then P = AVERAGE
RULE 4: If [...] AVERAGE) and (DKA = AVERAGE) and [...]
RULE 5: If (PS = POOR) then P = POOR

The implementation of the above study was done in TreePlan software, also called DTREG, and the results of classification by the author and the tool were matching. Table IV shows the results obtained from the software tool.

TABLE IV. IMPLEMENTATION RESULT
[Table IV: tool output; the analysis finished on 20 Jun 2013]

The data set of 40 employees used in this study, which was obtained from a software company in Bangalore, was the basis for our classification technique. The result and rules obtained can classify project members into three classes of performance: good (should be deployed), average (can be deployed with training) and poor (should not be deployed).

Further scope of this research work is using other classification techniques and doing a comparative study. We can experiment on different sets of attributes and find the most promising selection of criteria.

VI. CONCLUSION

In this paper, the classification task is used on the project members' database to predict the project members' performance. There are many approaches that are used for data classification; the decision tree method is used here. The resulting decision tree provides a representation of the concept that appeals to humans because it renders the classification process self evident. These parameters were collected from the employee's current project. It was noted that though the GPA [...] skills, domain knowledge and reasoning skills played [...]
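
The attribute ranking described in Step 2 to Step 4 can be sketched in Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the exact ASMF formula is not recoverable from the text, so the sketch assumes that the score of an attribute is the sum, over the levels GOOD, AVERAGE and POOR, of the share of employees at that level whose performance P has the same level. That reading matches the statement that the ideal ASMF is 3. The function names `asmf` and `rank_attributes` and the six sample records are hypothetical and are not data from the paper.

```python
from collections import Counter

# The three levels used for both the attributes and the performance class P.
LEVELS = ("GOOD", "AVERAGE", "POOR")


def asmf(attribute_values, performance):
    """Assumed ASMF: for each level, the fraction of records at that level
    whose performance has the same level, summed over the three levels."""
    totals = Counter(attribute_values)
    agree = Counter(a for a, p in zip(attribute_values, performance) if a == p)
    # Levels that never occur contribute nothing (avoids division by zero).
    return sum(agree[level] / totals[level] for level in LEVELS if totals[level])


def rank_attributes(records, target="P"):
    """Score every attribute against the target and sort by descending score."""
    performance = [r[target] for r in records]
    names = [name for name in records[0] if name != target]
    scores = {name: asmf([r[name] for r in records], performance) for name in names}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical records, for illustration only.
records = [
    {"PS": "GOOD", "GPA": "GOOD", "P": "GOOD"},
    {"PS": "GOOD", "GPA": "AVERAGE", "P": "GOOD"},
    {"PS": "AVERAGE", "GPA": "GOOD", "P": "AVERAGE"},
    {"PS": "AVERAGE", "GPA": "POOR", "P": "AVERAGE"},
    {"PS": "POOR", "GPA": "AVERAGE", "P": "POOR"},
    {"PS": "POOR", "GPA": "GOOD", "P": "AVERAGE"},
]

ranking = rank_attributes(records)
print(ranking)  # PS ranks first here, so it would become the root node
```

Under this assumption, the attribute at the head of the ranking plays the role of the root node, and the same ranking would be repeated on each subset to build the lower levels of the tree, as in Step 5.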

comparative pruning process we have the following rules  Study  We can experiment on different set of astributes and find the most promising selection RULE 1 if (PS  O000 ) and (GPA  GOOD  of criteria   AVERAGE ) and (BS   GOOD  or  AVERAGE  and (DKA   OOOD  of 'AVFRRAGE ) and (CS  GOODP  of VI  CONCLUSION AVERAGE ) then P GOOD  RULE I If (PS   AVERAGE ) and (GPA  AVERAGE  of In this paper, the classification task is used ow project  GOOD ) and (RS  GOOD  of  AVERAGE') mand member's database to predict the project menber's P 6000 there are many appecoches that are used for data RIZ  3 If aPS GOOD ) and (GPA  AVFRCEGE  of classification, the decision tree method is used here   POOK' and (RS   AVLRAGE ) and (DKA   AVELAGEL  The resulting decision tree provides a representation land (CS  GOOOF or AVERAGE) then P AVERAGE  of the concept those appoals to human bocause it renders the classification process self evident  These Rta  A if pararserers were collected from the employer's  AVERACE  land (DKA AVEKAGE ) and current project  It was noted that thoagh the GPA RULE  5 Ir(PS    OOR ) then P POOR  skills, donain knowledge and reasaing skills played
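The attribute selection steps (Steps 1 to 5) in the paper text above can be illustrated with a small sketch. This is not the authors' implementation: the paper's formula is incomplete in this copy, so the sketch assumes that ASMF for an attribute is the sum, over each distinct attribute value, of the share of tuples that fall in that value's most frequent performance class. Under that assumption an attribute with three values scores 3 when every value maps cleanly to one class, which matches the "ideal" value the paper mentions. The function names and the toy records are hypothetical.

```python
from collections import Counter, defaultdict


def asmf(values, labels):
    """Assumed ASMF: for each distinct attribute value, take the fraction of
    its tuples that belong to its most common class, then sum the fractions."""
    per_value = defaultdict(Counter)
    for value, label in zip(values, labels):
        per_value[value][label] += 1  # occurrences of (value, class) pairs
    score = 0.0
    for class_counts in per_value.values():
        total = sum(class_counts.values())  # occurrences of this value
        score += max(class_counts.values()) / total  # maximum occurrence share
    return score


def rank_attributes(rows, attributes, target):
    """Score every candidate attribute and sort from highest to lowest ASMF."""
    labels = [row[target] for row in rows]
    scores = {a: asmf([row[a] for row in rows], labels) for a in attributes}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical toy records, not taken from the paper's 40-employee data set.
rows = [
    {"PS": "GOOD", "GPA": "GOOD", "P": "GOOD"},
    {"PS": "GOOD", "GPA": "AVERAGE", "P": "GOOD"},
    {"PS": "AVERAGE", "GPA": "GOOD", "P": "AVERAGE"},
    {"PS": "AVERAGE", "GPA": "POOR", "P": "AVERAGE"},
    {"PS": "POOR", "GPA": "AVERAGE", "P": "POOR"},
    {"PS": "POOR", "GPA": "GOOD", "P": "POOR"},
]

ranking = rank_attributes(rows, ["PS", "GPA"], "P")
for name, score in ranking:
    print(f"{name}: {score:.2f}")  # PS ranks first with a score of 3.00
```

In this toy data PS separates the performance classes perfectly, so it would be chosen as the root node, and the same ranking would then be repeated on each subset to grow the tree.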

