DM and DW
- 1. Data Mining and Warehousing
To know technical edge with Ashok
This is useful for all engineering students who have this subject,
especially IT & CSE students of Andhra University and its affiliated colleges.
It is good for students revising just one day before the exam and for
night-out students.
Ashok Sandhyala
23-Jan-15
- 3. Data Mining and Warehousing
Assignment – 1
1. Define Data Mining?
Data mining is the extraction of useful information from a large database.
Data mining is the extraction of knowledge from a large database.
Data mining is the extraction of interesting patterns from a large database.
Data mining is popularly known as knowledge discovery in databases (KDD).
Data mining is the process of collecting, searching through, and analyzing a large amount of data in a database.
Data mining is the practice of examining large pre-existing databases in order to generate new information.
2. Explain knowledge discovery in databases with a diagram?
Data mining is simply an essential step in the process of knowledge discovery in databases.
Knowledge discovery as a process is depicted in the figure and consists of an iterative sequence of the following steps:
1) Data cleaning: To remove noise and inconsistent data.
2) Data integration: Where multiple data sources may be combined.
3) Data selection: Where data relevant to the analysis task are retrieved from the database.
4) Data transformation: Where data are transformed or consolidated into forms appropriate for mining, by performing summary or aggregation operations, for instance.
5) Data mining: An essential process where intelligent methods are applied in order to extract data patterns.
6) Pattern evaluation: To identify the truly interesting patterns.
7) Knowledge presentation: To present the mined knowledge to the user.
The data mining step may interact with the user or a knowledge base.
- 4. Interesting patterns are presented to the user and may be stored as new knowledge in the knowledge base.
Data mining is only one step in the entire process, but an essential one, since it uncovers hidden patterns for evaluation.
Data mining is a step in the knowledge discovery process.
The term data mining is becoming more popular than the longer term KDD.
Data mining is the process of discovering interesting knowledge from large amounts of data stored either in databases, data warehouses, or other information repositories.
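The KDD steps described above can be sketched as a chain of small functions. This is only a minimal illustration on made-up data: the record layout, the function names, and the `min_support` threshold are assumptions chosen for the example, and the "mining" step is reduced to simple counting.

```python
from collections import Counter

# Hypothetical toy data: two sources of (customer, item, quantity) records.
store_a = [("c1", "milk", 2), ("c1", "bread", 1), ("c2", "milk", None)]
store_b = [("c2", "bread", 1), ("c3", "milk", 1), ("c3", "bread", 3), ("c3", "pen", 1)]

def clean(records):
    """Data cleaning: drop records that contain a missing value."""
    return [r for r in records if all(v is not None for v in r)]

def integrate(*sources):
    """Data integration: combine multiple data sources into one list."""
    return [r for source in sources for r in source]

def select(records, items):
    """Data selection: keep only records relevant to the analysis task."""
    return [r for r in records if r[1] in items]

def transform(records):
    """Data transformation: aggregate records into one item set per customer."""
    baskets = {}
    for customer, item, _quantity in records:
        baskets.setdefault(customer, set()).add(item)
    return baskets

def mine(baskets):
    """Data mining: count how many baskets contain each item."""
    return Counter(item for items in baskets.values() for item in items)

def evaluate(patterns, n_baskets, min_support):
    """Pattern evaluation: keep patterns whose support reaches the threshold."""
    return {item: count / n_baskets
            for item, count in patterns.items()
            if count / n_baskets >= min_support}

def present(patterns):
    """Presentation: format the surviving patterns for the user."""
    return [f"{item}: support {support:.2f}" for item, support in sorted(patterns.items())]

data = integrate(clean(store_a), clean(store_b))
data = select(data, items={"milk", "bread"})
baskets = transform(data)
interesting = evaluate(mine(baskets), len(baskets), min_support=0.7)
report = present(interesting)
print(report)
```

With this toy data, bread appears in all three baskets and milk in two of three, so only bread passes the 0.7 threshold. In a real process the steps are iterative, so the output of evaluation may send you back to selection or transformation.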
- 5. 3. Explain a typical Data mining Architecture?
DB, DW, or other information repository:-
This is one or a set of databases, data warehouses, spreadsheets, or other kinds of information repositories.
Data cleaning and data integration techniques may be performed on the data.
DB or DW server:-
It is responsible for fetching the relevant data based on the user's data mining request.
Knowledge base:-
It is the domain knowledge used to guide the search or to evaluate the interestingness of resulting patterns.
Such knowledge can include concept hierarchies, used to organize attributes or attribute values into different levels of abstraction.
Knowledge such as user beliefs can also be included; it can be used to assess a pattern's interestingness.
Other examples are interestingness constraints or thresholds, and metadata.
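A concept hierarchy, as mentioned above, maps attribute values to more general values. The following is a small sketch using an assumed city, state, country hierarchy for a hypothetical "location" attribute; the `PARENT` table and the `generalize` helper are illustrative names, not part of any particular system.

```python
# Hypothetical concept hierarchy for a "location" attribute:
# city -> state -> country (low level of abstraction to high).
PARENT = {
    "Visakhapatnam": "Andhra Pradesh",
    "Vijayawada": "Andhra Pradesh",
    "Chennai": "Tamil Nadu",
    "Andhra Pradesh": "India",
    "Tamil Nadu": "India",
}

def generalize(value, levels=1):
    """Climb the hierarchy by `levels` steps, stopping at the top."""
    for _ in range(levels):
        if value not in PARENT:
            break  # already at the most general level
        value = PARENT[value]
    return value

cities = ["Visakhapatnam", "Chennai", "Vijayawada"]
states = [generalize(c) for c in cities]               # one level up
countries = [generalize(c, levels=2) for c in cities]  # two levels up
print(states)
print(countries)
```

Mining the same data at the city, state, or country level is what "different levels of abstraction" means in practice.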
- 6. Data mining engine:-
It is essential to the data mining system.
It consists of a set of functional modules for tasks such as characterization, association, classification, cluster analysis, and evolution and deviation analysis.
Pattern evaluation module:-
It employs interestingness measures to focus the search on interesting patterns.
It uses interestingness thresholds to filter out discovered patterns.
It may be integrated with the mining module.
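To make the idea of filtering by interestingness thresholds concrete, here is a minimal sketch. The notes do not name specific measures; support and confidence of association rules are used here as an assumed, commonly used pair, and the `Rule` class and the rule values are made up for illustration.

```python
from dataclasses import dataclass

@dataclass
class Rule:
    """A hypothetical association rule with two interestingness measures."""
    antecedent: str
    consequent: str
    support: float     # fraction of transactions containing both sides
    confidence: float  # fraction of antecedent transactions that also contain the consequent

def filter_interesting(rules, min_support, min_confidence):
    """Keep only rules that meet both interestingness thresholds."""
    return [r for r in rules
            if r.support >= min_support and r.confidence >= min_confidence]

# Made-up discovered patterns.
rules = [
    Rule("bread", "milk", support=0.40, confidence=0.80),
    Rule("pen", "milk", support=0.02, confidence=0.90),   # too rare
    Rule("milk", "eggs", support=0.30, confidence=0.35),  # too unreliable
]

kept = filter_interesting(rules, min_support=0.10, min_confidence=0.60)
for r in kept:
    print(f"{r.antecedent} => {r.consequent}")
```

Applying the thresholds inside the mining step, rather than afterwards, is one way the evaluation module can be integrated with the mining module so that the search is confined to interesting patterns.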
Graphical user interface:-
It communicates between users and the data mining system.
It allows the user to interact with the system by specifying a data mining request.
It provides information to help focus the search and supports exploratory data mining based on the intermediate data mining results.
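The way these components work together can be sketched in a few lines. This is a toy arrangement under stated assumptions, not a real system: the repository contents, the `min_share` threshold, the request format, and all function names are hypothetical.

```python
from collections import Counter

# Repository: hypothetical toy table of (region, product) sales rows.
REPOSITORY = [
    ("south", "tea"), ("south", "tea"), ("south", "coffee"),
    ("north", "tea"), ("north", "coffee"), ("north", "coffee"),
]

# Knowledge base: domain knowledge, here just one interestingness threshold.
KNOWLEDGE_BASE = {"min_share": 0.5}

def server_fetch(request):
    """DB/DW server: fetch only the data relevant to the user's request."""
    return [product for region, product in REPOSITORY if region == request["region"]]

def characterize(products):
    """One functional module of the mining engine: share of each product."""
    counts = Counter(products)
    total = len(products)
    return {product: count / total for product, count in counts.items()}

# Data mining engine: a set of functional modules, looked up by task name.
ENGINE_MODULES = {"characterization": characterize}

def evaluate_patterns(patterns):
    """Pattern evaluation module: filter using the knowledge-base threshold."""
    threshold = KNOWLEDGE_BASE["min_share"]
    return {p: share for p, share in patterns.items() if share >= threshold}

def user_interface(request):
    """User interface: accept a request and return the evaluated patterns."""
    data = server_fetch(request)
    patterns = ENGINE_MODULES[request["task"]](data)
    return evaluate_patterns(patterns)

result = user_interface({"region": "south", "task": "characterization"})
print(result)
```

Keeping each component behind its own function mirrors the layered architecture: a new mining task only needs a new entry in `ENGINE_MODULES`.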
From a data warehouse perspective:-
Data mining can be viewed as an advanced stage of OLAP.
Data mining involves an integration of techniques from multiple disciplines, such as:
1. DB technologies
2. Statistics
3. Machine learning
4. High-performance computing
5. Pattern recognition
6. Neural networks
7. Data visualization
8. Information retrieval
9. Image and signal processing
10. Spatial data analysis
Emphasis is placed on efficient and scalable DM techniques for large DBs.
So DM is considered one of the most important frontiers in DB systems and the information industry.
- 7. 4. Explain relational databases, transactional databases, and data warehouses with diagrams?
5. Explain advanced database systems and advanced applications?
6. Explain Classification of Data Mining Systems?
DM is an interdisciplinary field.
DM, as a confluence of multiple disciplines, includes DB systems, statistics, machine learning, visualization, and information science.
DM systems can be categorized according to various criteria:
1. Classification according to the kinds of DB mined
2. Classification according to the kinds of knowledge mined
3. Classification according to the kinds of techniques utilized
4. Classification according to the kinds of applications adopted
1. Classification according to the kinds of DB mined:-
DM systems can be classified according to the kinds of DB mined.
DB systems themselves can be classified according to different criteria,
- 8. such as data models, types of data, or applications involved.
Each criterion may require its own DM technique.
If classifying according to data models, we may have a relational, transactional, object-oriented, object-relational, or data warehouse mining system.
If classifying according to the special types of data handled, we may have a spatial, time-series, text, multimedia, or WWW mining system.
2. Classification according to the kinds of knowledge mined:-
DM systems can be categorized according to the kinds of knowledge they mine, that is, based on DM functionalities such as characterization, discrimination, association, classification, clustering, outlier analysis, and evolution analysis.
A comprehensive DM system usually provides multiple and/or integrated DM functionalities.
Moreover, DM systems can be distinguished based on the granularity or level of abstraction of the knowledge mined, including:
Generalized knowledge (at a high level of abstraction)
Primitive-level knowledge (at a raw data level)
Knowledge at multiple levels (considering several levels of abstraction)
An advanced DM system should facilitate the discovery of knowledge at multiple levels of abstraction.
DM systems can also be categorized as those that mine data regularities versus those that mine data irregularities.
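The contrast between mining regularities and mining irregularities can be illustrated on a single list of numbers. This is a minimal sketch under an assumed rule: a value counts as irregular when it lies more than `z` standard deviations from the mean. The readings, the function name, and the cut-off of 2.0 are all chosen only for the example.

```python
from statistics import mean, pstdev

# Made-up sensor readings; one value is clearly unusual.
readings = [10, 11, 9, 10, 12, 10, 11, 50]

def split_regular_irregular(values, z=2.0):
    """Split values into those near the mean and those far from it."""
    mu = mean(values)
    sigma = pstdev(values)  # population standard deviation
    regular = [v for v in values if abs(v - mu) <= z * sigma]
    irregular = [v for v in values if abs(v - mu) > z * sigma]
    return regular, irregular

regular, irregular = split_regular_irregular(readings)
print("regular:", regular)
print("irregular:", irregular)
```

A system focused on regularities would summarize the first list, while one focused on irregularities (for example, fraud or fault detection) would report the second.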
3. Classification according to the kinds of techniques utilized:-
DM systems can be categorized according to the underlying DM techniques employed.
Techniques can be described according to the degree of user interaction involved.
E.g., autonomous systems, interactive exploratory systems, query-driven systems.
Techniques can also be described according to the methods of data analysis employed.
E.g., DB-oriented or DW-oriented techniques, statistics, visualization, neural networks, machine learning, pattern recognition.
A sophisticated data mining system will often adopt multiple DM techniques or work out an effective, integrated technique.
4. Classification according to the applications adopted:-
DM systems can be categorized according to the applications they adopt.
- 9. E.g., DM systems could be tailored specifically for finance, telecommunications, DNA, stock markets, or e-mail.
Different applications often require the integration of application-specific methods.
Therefore, a general, all-purpose DM system may not fit domain-specific mining tasks.
7. Explain Major issues in Data mining system?
Major issues in data mining concern:
1. Mining methodology
2. User interaction
3. Performance
4. Diverse data types
1. Mining methodology and user interaction issues:-
These issues reflect:
The kinds of knowledge mined
The ability to mine knowledge at multiple granularities
The use of domain knowledge
Ad hoc DM
Knowledge visualization
a. Mining different kinds of knowledge in DM:-
Since different users can be interested in different kinds of knowledge,
DM should cover a wide spectrum of data analysis and knowledge discovery tasks,
including data characterization, discrimination, association, classification, clustering,
trend and deviation analysis, and similarity analysis.