Data Mining and Warehousing
To know technical edge with Ashok
This is useful for all engineering students who have this subject,
especially IT and CSE students of Andhra University and its affiliated
colleges. It is good for revision just one day before the exam and for
night-out students.
Ashok Sandhyala
23-Jan-15
Dm and dw
Data Mining and Warehousing
Assignment – 1
1. Define Data Mining?
• Data mining is the extraction of useful information from a large database.
• Data mining is the extraction of knowledge from a large database.
• Data mining is the extraction of interesting patterns from a large database.
• Data mining is popularly known as knowledge discovery in databases (KDD).
Data mining is the process of collecting, searching through and analyzing a large amount of data in a database.
Data mining is the practice of examining large pre-existing databases in order to generate new information.
2. Explain the knowledge discovery in database with diagram?
• Data mining is simply an essential step in the process of knowledge discovery in databases.
• Knowledge discovery as a process is depicted in the figure and consists of an iterative sequence of the following steps:
1) Data cleaning: to remove noise and inconsistent data.
2) Data integration: where multiple data sources may be combined.
3) Data selection: where data relevant to the analysis task are retrieved from the database.
4) Data transformation: where data are transformed or consolidated into forms appropriate for mining, by performing summary or aggregation operations, for instance.
5) Data mining: an essential process where intelligent methods are applied in order to extract data patterns.
6) Pattern evaluation: to identify the truly interesting patterns.
• Knowledge presentation: the mined knowledge is presented to the user.
• The data mining step may interact with the user or a knowledge base.
• Interesting patterns are presented to the user, and may be stored as new knowledge in the knowledge base.
• Data mining is only one step in the entire process, but an essential one, since it uncovers hidden patterns for evaluation.
• Data mining is a step in the knowledge discovery process.
• The term data mining is becoming more popular than the longer term KDD.
• Data mining is the process of discovering interesting knowledge from large amounts of data stored in DBs, DWs, or other information repositories.
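The six steps listed above can be sketched as a small pipeline. This is only an illustrative toy, not a standard implementation: the sales records, the function names, and the choice of "item pairs bought together" as the pattern are all assumptions made for the example.

```python
from collections import Counter
from itertools import combinations

# Hypothetical sales records from two sources: (customer, item, quantity).
store_a = [("c1", "milk", 1), ("c1", "bread", 2), ("c2", "milk", 1), ("c2", None, 1)]
store_b = [("c3", "milk", 1), ("c3", "pen", 1), ("c3", "bread", -5), ("c2", "bread", 1)]

def clean(records):
    # 1) Data cleaning: drop rows with a missing item or a non-positive quantity.
    return [r for r in records if r[1] is not None and r[2] > 0]

def integrate(*sources):
    # 2) Data integration: combine several sources into one list.
    return [r for source in sources for r in source]

def select(records):
    # 3) Data selection: keep only the fields relevant to the task.
    return [(customer, item) for customer, item, _ in records]

def transform(pairs):
    # 4) Data transformation: consolidate rows into one basket per customer.
    baskets = {}
    for customer, item in pairs:
        baskets.setdefault(customer, set()).add(item)
    return baskets

def mine(baskets):
    # 5) Data mining: count how many baskets contain each pair of items.
    counts = Counter()
    for items in baskets.values():
        counts.update(combinations(sorted(items), 2))
    return counts

def evaluate(counts, n_baskets, min_support):
    # 6) Pattern evaluation: keep pairs whose support meets the threshold.
    return {pair: n / n_baskets
            for pair, n in counts.items()
            if n / n_baskets >= min_support}

baskets = transform(select(integrate(clean(store_a), clean(store_b))))
patterns = evaluate(mine(baskets), len(baskets), min_support=0.5)
print(patterns)
```

Each function maps to one numbered step, so the iterative nature of the process can be imitated by re-running later steps with a different threshold or selection.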
3. Explain a typical Data mining Architecture?
DB, DW or other information repository:-
• This is one or a set of databases, DWs, spreadsheets, or other kinds of information repositories.
• Data cleaning and data integration techniques may be performed on the data.
DB or DW server:-
• It is responsible for fetching the relevant data based on the user's data mining request.
Knowledge base:-
• It is the domain knowledge
• used to guide the search, or evaluate the interestingness of resulting patterns.
• Such knowledge can include concept hierarchies, used to organize attributes or attribute values into different levels of abstraction.
• Knowledge such as user beliefs can be used to assess a pattern's interestingness.
• Other examples are interestingness constraints or thresholds, and metadata.
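A concept hierarchy of the kind the knowledge base holds can be pictured as a chain of mappings from low-level values to more general ones. The sketch below assumes a hypothetical "location" attribute with the levels city, country and continent; the dictionaries and function names are made up for illustration.

```python
# Hypothetical concept hierarchy for a "location" attribute:
# city -> country -> continent.
CITY_TO_COUNTRY = {"Visakhapatnam": "India", "Chennai": "India", "Toronto": "Canada"}
COUNTRY_TO_CONTINENT = {"India": "Asia", "Canada": "North America"}

def generalize(city, level):
    """Return the value of `city` at the requested abstraction level."""
    if level == "city":
        return city
    country = CITY_TO_COUNTRY[city]
    if level == "country":
        return country
    if level == "continent":
        return COUNTRY_TO_CONTINENT[country]
    raise ValueError(f"unknown level: {level}")

def count_by_level(cities, level):
    """Count records after rolling each city up to `level`."""
    counts = {}
    for city in cities:
        key = generalize(city, level)
        counts[key] = counts.get(key, 0) + 1
    return counts

sales_locations = ["Visakhapatnam", "Chennai", "Toronto", "Chennai"]
by_country = count_by_level(sales_locations, "country")
by_continent = count_by_level(sales_locations, "continent")
print(by_country)
print(by_continent)
```

The same raw records give different summaries depending on the level chosen, which is how attribute values are organized into different levels of abstraction.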
Data mining engine:-
• It is essential to the data mining system.
• It consists of a set of functional modules for tasks such as characterization, association, classification, cluster analysis, and evolution and deviation analysis.
Pattern evaluation module:-
• It employs interestingness measures
• to focus the search on interesting patterns.
• It uses interestingness thresholds to filter out discovered patterns.
• It may be integrated with the mining module.
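As a rough illustration of filtering patterns with interestingness thresholds, the sketch below uses support and confidence of association rules as the measures. The notes do not name specific measures, so that choice, along with the transactions, rules and function names, is an assumption of this example.

```python
def support(transactions, itemset):
    """Fraction of transactions containing every item in `itemset`."""
    hits = sum(1 for t in transactions if itemset <= t)
    return hits / len(transactions)

def confidence(transactions, antecedent, consequent):
    """Estimated P(consequent | antecedent); 0.0 if the antecedent never occurs."""
    base = support(transactions, antecedent)
    if base == 0:
        return 0.0
    return support(transactions, antecedent | consequent) / base

def filter_rules(transactions, rules, min_support, min_confidence):
    """Keep only the rules that meet both thresholds."""
    kept = []
    for antecedent, consequent in rules:
        s = support(transactions, antecedent | consequent)
        c = confidence(transactions, antecedent, consequent)
        if s >= min_support and c >= min_confidence:
            kept.append((antecedent, consequent, s, c))
    return kept

# Hypothetical market-basket transactions and candidate rules.
transactions = [
    frozenset({"milk", "bread"}),
    frozenset({"milk", "bread", "butter"}),
    frozenset({"milk"}),
    frozenset({"bread", "butter"}),
]
candidate_rules = [
    (frozenset({"milk"}), frozenset({"bread"})),
    (frozenset({"butter"}), frozenset({"bread"})),
    (frozenset({"milk"}), frozenset({"butter"})),
]
interesting = filter_rules(transactions, candidate_rules,
                           min_support=0.5, min_confidence=0.6)
for antecedent, consequent, s, c in interesting:
    print(sorted(antecedent), "->", sorted(consequent), round(s, 2), round(c, 2))
```

Raising either threshold shrinks the set of patterns reported, which is the filtering role this module plays.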
Graphical user interface:-
• It communicates between users and the data mining system.
• It allows the user to interact with the system by specifying a data mining query or task.
• It provides information to help focus the search, and supports exploratory data mining based on the intermediate data mining results.
From a data warehouse perspective:-
• Data mining can be viewed as an advanced stage of OLAP.
• Data mining involves an integration of techniques from multiple disciplines, such as:
1. DB technologies
2. Statistics
3. Machine learning
4. High-performance computing
5. Pattern recognition
6. Neural networks
7. Data visualization
8. Information retrieval
9. Image and signal processing
10. Spatial data analysis
• Emphasis is placed on efficient and scalable DM techniques for large DBs.
• So DM is considered one of the most important frontiers in DB systems and the information industry.
4. Explain relational database, transactional DB, DW with diagrams?
5. Explain advanced database systems and advanced applications?
6. Explain Classification of Data Mining Systems?
DM is an interdisciplinary field.
DM, as a confluence of multiple disciplines, includes DB systems, statistics, machine learning, visualization, and information science.
DM systems can be categorized according to various criteria:
1. Classification according to the kinds of DB mined
2. Classification according to the kinds of knowledge mined
3. Classification according to the kinds of techniques utilized
4. Classification according to the kinds of applications adopted
1. Classification according to the kinds of DB mined:-
• DM systems can be classified according to the kinds of DB mined.
• DB systems themselves can be classified according to different criteria, such as data models, types of data, or applications involved.
• Each criterion may require its own DM technique.
• If classifying according to data models, we may have a relational, transactional, object-oriented, object-relational, or data warehouse mining system.
• If classifying according to the special types of data handled, we may have a spatial, time-series, text, multimedia, or WWW mining system.
2. Classification according to the kinds of knowledge mined:-
• DM systems can be categorized according to the kinds of knowledge they mine,
• that is, based on DM functionalities.
• DM functionalities: characterization, discrimination, association, classification, clustering, outlier analysis, evolution analysis.
• A comprehensive DM system usually provides multiple and/or integrated DM functionalities.
• Moreover, DM systems can be distinguished based on the granularity or level of abstraction of the knowledge mined.
• These include: generalized knowledge (at a high level of abstraction),
primitive-level knowledge (at a raw data level),
knowledge at multiple levels (considering several levels of abstraction).
• An advanced DM system should facilitate the discovery of knowledge at multiple levels of abstraction.
• DM systems can also be categorized as those that mine data regularities vs. those that mine data irregularities.
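To make the regularities vs. irregularities distinction concrete, here is a minimal sketch of the "irregularities" side: flagging values that lie far from the rest. The z-score rule, the threshold of 2.0 and the sales figures are assumptions chosen for illustration; real outlier analysis uses a range of techniques.

```python
from statistics import mean, pstdev

def find_irregularities(values, z_threshold=2.0):
    """Return the values whose z-score magnitude exceeds `z_threshold`."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        # All values are identical, so nothing stands out.
        return []
    return [v for v in values if abs(v - mu) / sigma > z_threshold]

# Hypothetical daily sales figures with one unusual day.
daily_sales = [100, 102, 98, 101, 99, 100, 250, 97]
outliers = find_irregularities(daily_sales)
print(outliers)
```

A system that mines regularities would instead report what the typical days have in common, for example frequent patterns or a trend.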
3. Classification according to the kinds of techniques utilized:-
DM systems can be categorized according to the underlying DM techniques employed.
Techniques can be described according to the degree of user interaction involved.
E.g. autonomous systems, interactive exploratory systems, query-driven systems.
Techniques can also be described according to the methods of data analysis employed.
E.g. DB-oriented or DW-oriented techniques, statistics, visualization, neural networks, machine learning, pattern recognition.
A sophisticated data mining system will often adopt multiple DM techniques or work out an effective, integrated technique.
4. Classification according to the applications adopted:-
DM systems can be categorized according to the applications they adopt.
E.g. DM systems could be tailored specifically for finance, telecommunications, DNA, stock markets, e-mail.
Different applications often require the integration of application-specific methods.
Therefore, a general all-purpose DM system may not fit domain-specific mining tasks.
7. Explain Major issues in Data mining system?
• Major issues in data mining concern:
1. Mining methodology
2. User interaction
3. Performance
4. Diverse data types
1. Mining methodology and user interaction issues:-
• These issues reflect:
• The kinds of knowledge mined
• The ability to mine knowledge at multiple granularities
• The use of domain knowledge
• Ad hoc DM
• Knowledge visualization
a. Mining different kinds of knowledge in DM:-
• Since different users can be interested in different kinds of knowledge,
• DM should cover a wide spectrum of data analysis and knowledge discovery tasks, including data characterization, discrimination, association, classification, clustering, trend and deviation analysis, and similarity analysis.