Bigdata netezza-ppt-apr2013-bhawani nandan prasad
- 1. © 2013 AgreeYa Solutions. All rights reserved.1
www.agreeya.com
Netezza Overview
Netezza Architecture
Netezza Performance Tuning
Netezza Admin
April 10, 2013 – BHAWANI NANDAN PRASAD – BI Practice Head
SMP – IIM Calcutta, MBA – Stratford University USA, B.E. (IT)
- 2.
Agenda
• Netezza Architecture
• Netezza Connectivity
• NZSQL
• Data Types in Netezza
• Metadata Tables
• Types of Joins in Netezza
• Data Loading and Unloading in Netezza
• Data Distribution in Netezza
• Transactions in Netezza
• GROOM/Reclaim Process in Netezza
• Zone Maps in Netezza
• GENERATE STATISTICS in Netezza
- 3.
Netezza Architecture
- 4.
Netezza Architecture
- 5.
Data Stream Processing in Netezza
- 6.
Netezza Connectivity
- 7.
NZSQL
• Command-line utility for interacting with a Netezza database
• Useful for writing multi-line queries and executing them for analysis or
reporting purposes
• Setting up the environment is a prerequisite for starting nzsql
• Logging into nzsql opens the pg.log file and starts capturing all
activities performed by the user on the corresponding DB
- 8.
Data Types in Netezza
DATATYPE   DESCRIPTION                                                        SIZE (bytes)
BOOL       boolean, 'true'/'false'                                            1
BPCHAR     char(length), blank-padded string, fixed storage length            VAR
CHAR       single character                                                   1
DATE       ANSI SQL date                                                      4
FLOAT4     single-precision floating point number, 4-byte storage             4
FLOAT8     double-precision floating point number, 8-byte storage             8
INT1       -128 to 127, 1-byte storage                                        1
INT2       -32,768 to 32,767, 2-byte storage                                  2
INT4       -2 billion to 2 billion integer, 4-byte storage                    4
INT8       ~18 digit integer, 8-byte storage                                  8
INTERVAL   @ <number> <units>, time interval                                  12
NCHAR      nchar                                                              VAR
NUMERIC    numeric(precision, decimal), arbitrary precision number            19
NVARCHAR   nvarchar                                                           VAR
TIME       hh:mm:ss, ANSI SQL time                                            8
TIMESTAMP  date and time                                                      8
TIMETZ     hh:mm:ss with time zone, ANSI SQL time                             12
VARCHAR    varchar(length), non-blank-padded string, variable storage length  VAR
- 9.
Metadata Tables in Netezza
• Like any other database, Netezza provides metadata tables and
views that give information about objects
• Some of the frequently used metadata views are:
System Table Name     Usage
_V_OBJECTS            Displays information about different objects such as tables, views, external tables, synonyms and more
_V_TABLE              Displays information about the tables present in Netezza
_V_VIEW               Displays information about the views present in Netezza
_V_RELATION_COLUMN    Displays information about the columns present in Netezza tables
- 10.
Types of Joins in Netezza
• Netezza internally considers join methods in the following order:
– Hash Join (in memory)
– Hash Join (on disk)
– Sort Merge Join
– Nested Loop Join
– Cross Join
• Depending on how the joined tables are distributed, a join is carried out in one of three main ways:
– Co-located join (matching rows already sit on the same data slice)
– Redistribution of data
– Broadcasting of data
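The three ways of carrying out a join can be illustrated with a toy simulation. This is only a sketch under stated assumptions: the four data slices, the modulo "hash", and the customers/orders tables are all hypothetical and are not Netezza's actual hashing or planner logic.

```python
from collections import defaultdict

NUM_SLICES = 4  # hypothetical number of data slices


def distribute(rows, key_index):
    """Assign each row to a slice by a toy hash (modulo) of its key column."""
    slices = defaultdict(list)
    for row in rows:
        slices[row[key_index] % NUM_SLICES].append(row)
    return slices


def local_join(cust_slices, order_slices):
    """Join customers to orders slice by slice, with no data movement."""
    return [(c, o)
            for s in range(NUM_SLICES)
            for c in cust_slices.get(s, [])
            for o in order_slices.get(s, [])
            if c[0] == o[1]]


customers = [(cid, f"cust{cid}") for cid in range(8)]        # (customer_id, name)
orders = [(oid, (oid // 2) % 8) for oid in range(100, 116)]  # (order_id, customer_id)

cust_by_id = distribute(customers, 0)

# 1. Co-located: both tables are distributed on customer_id.
colocated = local_join(cust_by_id, distribute(orders, 1))

# 2. Orders distributed on order_id: a purely local join misses matches ...
orders_by_oid = distribute(orders, 0)
no_movement = local_join(cust_by_id, orders_by_oid)
# ... until the orders are redistributed on the join key.
all_orders = [o for rows in orders_by_oid.values() for o in rows]
redistributed = local_join(cust_by_id, distribute(all_orders, 1))

# 3. Broadcast: copy the small customers table to every slice.
broadcast = local_join({s: customers for s in range(NUM_SLICES)}, orders_by_oid)

print(len(colocated), len(no_movement), len(redistributed), len(broadcast))
```

In this model the co-located case needs no data movement at all, which is why joining tables on their shared distribution key is generally the cheapest option.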
- 11.
Data Loading and Unloading in Netezza
• NZLOAD (only loading)
• EXTERNAL TABLES (both loading and unloading)
• CTAS (CREATE TABLE AS) (both loading and unloading)
• nzsql with the -o option (only unloading)
- 12.
Data Distribution in Netezza
• Key factor in boosting performance to a great extent
• Backbone of the MPP architecture
• Specified with the DISTRIBUTE ON clause of the CREATE TABLE
statement
• Three options:
– DISTRIBUTE ON (column name);
– DISTRIBUTE ON RANDOM;
– No DISTRIBUTE ON clause (Netezza falls back to a default key, normally the first column)
• Very useful both when loading data into tables and when fetching data from them
- 13.
Selecting a distribution key
• Columns with many distinct values
• Column or columns based on the selection set
• As few columns as possible
• Tables that are joined together should be distributed on the same key
• DO NOT use Boolean keys
• Check the distribution of data in the table
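The advice on distinct values and Boolean keys can be checked with a small simulation. This is a sketch, not Netezza's real hash function: the eight data slices, the MD5-based hash, and the `order_id` / `is_active` columns are assumptions made purely for illustration.

```python
import hashlib
from collections import Counter

NUM_SLICES = 8  # hypothetical number of data slices


def slice_counts(values, num_slices=NUM_SLICES):
    """Count rows per data slice when rows are hash-distributed on `values`."""
    counts = Counter({s: 0 for s in range(num_slices)})
    for v in values:
        digest = hashlib.md5(str(v).encode()).digest()
        counts[int.from_bytes(digest[:4], "big") % num_slices] += 1
    return counts


def skew(counts):
    """Largest slice divided by the average slice; 1.0 means perfectly even."""
    avg = sum(counts.values()) / len(counts)
    return max(counts.values()) / avg


rows = 10_000
order_ids = range(rows)                         # many distinct values
is_active = [i % 2 == 0 for i in range(rows)]   # Boolean: only two distinct values

order_skew = skew(slice_counts(order_ids))
bool_skew = skew(slice_counts(is_active))
print("order_id skew :", round(order_skew, 2))
print("is_active skew:", round(bool_skew, 2))
```

A key with only two distinct values can occupy at most two slices, so the remaining slices sit idle while the loaded ones do all the work; a high-cardinality key spreads the rows almost evenly.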
- 15.
Single Redistribute
- 16.
Double Redistribute
- 18.
Transactions in Netezza
• Three hidden columns support transactions in Netezza:
– Createxid
– Deletexid
– Rowid
• Values in these columns change as transactions modify the rows
• These hidden columns are present in every Netezza table
• They are also used to track deleted records in many cases
- 19.
Transactions in Netezza contd..
- 20.
Aborted Transaction in Netezza
- 21.
Locking, Concurrency and Isolation
• Netezza implements serializable transaction isolation for the highest level
of consistency
• Uses multi-versioning and serialization dependency checking
• Users cannot explicitly lock a table in Netezza
• UPDATE works differently in Netezza: the old row is logically deleted and a new row version is inserted
- 22.
GROOM/Reclaim in Netezza
• Logically deleted records remain on disk in Netezza in the following
cases:
– DELETE
– UPDATE
– Failed INSERT or aborted nzload operation
– Failed UPDATE operation
• Logically deleted records in Netezza:
– Occupy extra disk space
– Add extra time to full table scans
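How logically deleted records build up, and what a groom-style cleanup reclaims, can be sketched with the hidden createxid and deletexid columns. This is a simplified toy model, not Netezza's storage engine: the `ToyTable` class, its transaction counter, and the choice to reuse the rowid for an updated row are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class RowVersion:
    rowid: int
    createxid: int
    deletexid: int  # 0 means the version is not deleted
    data: dict


class ToyTable:
    """Hypothetical table that only ever marks rows as deleted."""

    def __init__(self):
        self.versions = []
        self.next_xid = 1
        self.next_rowid = 1

    def _begin(self):
        xid = self.next_xid
        self.next_xid += 1
        return xid

    def visible(self):
        return [v for v in self.versions if v.deletexid == 0]

    def insert(self, data):
        self.versions.append(RowVersion(self.next_rowid, self._begin(), 0, dict(data)))
        self.next_rowid += 1

    def delete(self, match):
        xid = self._begin()
        for v in self.visible():
            if match(v.data):
                v.deletexid = xid  # logical delete: the version stays on disk

    def update(self, match, changes):
        xid = self._begin()
        for v in self.visible():
            if match(v.data):
                v.deletexid = xid  # old version is logically deleted
                # assumption: the new version keeps the same rowid
                self.versions.append(RowVersion(v.rowid, xid, 0, {**v.data, **changes}))

    def groom(self):
        """Physically drop logically deleted versions; return how many were reclaimed."""
        before = len(self.versions)
        self.versions = self.visible()
        return before - len(self.versions)


t = ToyTable()
t.insert({"id": 1, "qty": 5})
t.insert({"id": 2, "qty": 7})
t.update(lambda d: d["id"] == 1, {"qty": 6})
t.delete(lambda d: d["id"] == 2)

physical_before = len(t.versions)
visible_before = len(t.visible())
reclaimed = t.groom()
physical_after = len(t.versions)
print(physical_before, visible_before, reclaimed, physical_after)
```

Until the cleanup runs, a full scan in this model still has to step over every dead version, which mirrors the extra disk space and scan time listed above.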
- 23.
GROOM/Reclaim contd..
• The GROOM/RECLAIM process recovers this unused disk space in
Netezza
• The GROOM command supports operations on:
– A single table
– All tables in one database
– All tables in all databases
• Benefits of GROOM:
– Permits shared access to the target table
– Can be interrupted without leaving the target table locked
– Refreshes materialized views created on the base table
• Syntax:
- 24.
Zone Maps in Netezza
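• Zone maps play a role similar to indexes in other databases: they record the minimum and maximum column values per storage extent so that scans can skip extents that cannot match
• Created on integer, date and timestamp fields
• Created and refreshed automatically during:
– GENERATE STATISTICS
– NZLOAD
– INSERT or UPDATE
– GROOM operation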
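The idea behind zone maps, per-extent minimum and maximum values used to skip storage that cannot satisfy a range predicate, can be sketched as follows. The extent size of four rows and the integer-encoded `order_dates` column are hypothetical; real extents are far larger and are managed by the appliance itself.

```python
EXTENT_ROWS = 4  # hypothetical rows per extent


def build_zone_map(values, extent_rows=EXTENT_ROWS):
    """Return (start, end, min, max) for each consecutive extent of values."""
    zone_map = []
    for start in range(0, len(values), extent_rows):
        chunk = values[start:start + extent_rows]
        zone_map.append((start, start + len(chunk), min(chunk), max(chunk)))
    return zone_map


def scan_range(values, zone_map, low, high):
    """Return matching values and the number of extents actually read."""
    hits, extents_read = [], 0
    for start, end, zmin, zmax in zone_map:
        if zmax < low or zmin > high:
            continue  # the extent cannot contain a match, so skip it
        extents_read += 1
        hits.extend(v for v in values[start:end] if low <= v <= high)
    return hits, extents_read


order_dates = [20130101, 20130102, 20130103, 20130104,
               20130201, 20130202, 20130203, 20130204,
               20130301, 20130302, 20130303, 20130304]

zm = build_zone_map(order_dates)
hits, extents_read = scan_range(order_dates, zm, 20130202, 20130203)
print(hits, f"read {extents_read} of {len(zm)} extents")
```

The skipping only pays off when the data is roughly ordered on the filtered column, as it is here; with randomly ordered values every extent's range would overlap the predicate.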
- 25.
GENERATE STATISTICS in Netezza
• The Netezza optimizer relies on GENERATE STATISTICS to gather
statistics about tables
• GENERATE STATISTICS collects statistics about each table's
columns:
– Minimum and maximum values on character data
– Maximum and average length on varchar
– NULL counts
– Updates the system catalog
• Statistics can be generated at three levels:
– Database level
– Table level
– Column level
• Can also be generated using the NzAdmin tool
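The kind of per-column summary listed above can be illustrated with a few lines of Python. This is only a toy illustration of the quantities involved (min/max, NULL counts, string lengths); the `column_stats` helper and the sample columns are hypothetical and say nothing about how Netezza computes or stores its statistics.

```python
def column_stats(values):
    """Summarise one column: NULL count, min/max, and string lengths if textual."""
    non_null = [v for v in values if v is not None]
    stats = {
        "nulls": len(values) - len(non_null),
        "min": min(non_null) if non_null else None,
        "max": max(non_null) if non_null else None,
    }
    if non_null and all(isinstance(v, str) for v in non_null):
        lengths = [len(v) for v in non_null]
        stats["max_len"] = max(lengths)
        stats["avg_len"] = sum(lengths) / len(lengths)
    return stats


# Hypothetical table held column by column.
table = {
    "city": ["Pune", None, "Delhi", "Agra"],
    "qty": [3, 9, None, 1],
}

table_stats = {name: column_stats(col) for name, col in table.items()}
for name, stats in table_stats.items():
    print(name, stats)
```

An optimizer can use figures like these to estimate how many rows a predicate will return, which is why stale values lead to poor plans.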
- 26.
GENERATE STATISTICS contd..
• The Netezza system automatically generates two basic statistics, the table row count and
min-max values for character columns, while doing:
– INSERT
– UPDATE
– CTAS (GENERATE STATISTICS is run automatically if the row count is >=
10k)
– nzload
– GROOM
– TRUNCATE TABLE
• It is important to generate statistics for:
- 27.
SPU Failover Activity
Disk timing: identifies the SPU that is showing slow
performance
Step 1) Pause the system
• $ nzsystem pause
Step 2) Confirm that the system is paused
• $ nzstate
Step 3) Fail over the SPU
• $ nzspu failover -id <SPU ID>
Step 4) Resume the system
• $ nzsystem resume
- 28.
Genstats Command
Generates statistics on any database table(s) for
which the statistics are not currently 100% "up-to-date".
The optimizer uses statistics to guide its decisions on
how best to execute a query. The more reliable and
up-to-date the statistics are, the more accurate the
optimizer's decisions are likely to be.
- 29.
Backup & Restore
Types of backup:
Full backup
Differential backup
Incremental differential backup
Cumulative differential backup
Elaborative Example
- 30.
Back up Command
The backup command/scripts are used for backing up tables from NPS.
The backup command / nz_backup script must be run locally (on the NPS host being backed
up).
The command/scripts process a single table, multiple tables, or an entire database.
The data format can be one of:
ascii -- which is very portable.
binary -- which is Netezza's compressed/internal format; it is
much faster and results in significantly smaller backup sets.
gzip -- ascii, which is gzip'ed on the NPS host.
The data is written to (or read from) disk files or named pipes.
If pipes are used, another application is needed to consume the data.
These scripts concern themselves only with the DATA itself. When backing up
a table, the DDL is not included.
- 31.
Back up Command Examples
Full backup:
• /nz/kit/bin/nzbackup -db CIDB_PRD -dir /back_folder
• nohup nzbackup -db CIDB_PRD -u admin -dir /back_folder
Differential backup:
• /nz/kit/bin/nzbackup -db CIDB_PRD -u admin -dir /back_folder -differential -v
Schema-only backup:
• nohup nzbackup -db CIDB_PRD -u admin -dir /back_folder -schema-only
- 32.
Restore Command
The restore command/scripts are used to restore tables to NPS.
The restore command / nz_restore script must be run locally (on the NPS host being restored).
The command/scripts process a single table, multiple tables, or an entire database.
The data format can be one of:
ascii -- which is very portable.
binary -- which is Netezza's compressed/internal format; it is
much faster and results in significantly smaller backup sets.
gzip -- ascii, which is gzip'ed on the NPS host.
The data is written to (or read from) disk files or named pipes.
If pipes are used, another application is used to produce the data.
These scripts concern themselves only with the DATA itself. When backing up
a table, the DDL is not included.
- 33.
Restore Command
Syntax: nzrestore [-db database] [-dir directory]
        [-connector name] [-connectorArgs] [-schema only]
        [-users] [-v] [-rev] [-h] [-increment] [-mode]
        [-backupset ID] [-lockdb]
Here,
-dir specifies the backup root directory when using the file system connector
-connector specifies the connector type: File System, Veritas or Tivoli
• NOTE: if -connector is omitted, it defaults to the File System connector
-connectorArgs specifies:
• DATASTORE_SERVER and DATASTORE_POLICY when using the Veritas
connector
• The TSM password when using the Tivoli connector
• These may optionally be specified as environment variables
-increment: if omitted, defaults to the full backup
-mode specifies REST/NEXT mode
-lockdb specifies whether the database is locked during the restore [TRUE/FALSE]
- 34.
Tape Back up Command
Similar to the backup command; the syntax is also similar.
The command differs only in the destination, which is "tape" instead of
a file location as in a normal backup.
Example:
nzbackup -v -db EDW_STANDBY -dir
    /migration/TF12_EDW_STANDBY_Tape_Backup/tape1
    /migration/TF12_EDW_STANDBY_Tape_Backup/tape2
    /migration/TF12_EDW_STANDBY_Tape_Backup/tape3
    /migration/TF12_EDW_STANDBY_Tape_Backup/tape4 -streams 4
- 35.
Netezza Performance Server
- 36.
Defaults in Netezza
Default Users
nz (Linux OS user)
admin (NPS database super-user with full access to all
system functions and objects)
root (Linux root user)
System Defaults
system database
public group
5480 – ODBC port
Note:
A newly created user is added to the public group by default.
A user cannot be removed from the public group.
Groups, users and databases share a common namespace, so group names, user names and
database names must all be unique.
- 37.
Managing Users
By default, a user has access only to system views, which allow them to retrieve a list of
database objects.
SQL for creating a user:
CREATE USER user_name WITH PASSWORD 'string' [options]
SQL for altering user credentials/privileges:
ALTER USER user_name WITH [options]
SQL for deleting a user:
DROP USER user_name
Note: options can be:
Row limit, group name, validity, session timeout, query timeout, default priority, maximum
priority, resource group
nzsql command to list users:
• SYSTEM(ADMIN) => \du
nzsql command to list a user's permissions:
• SYSTEM(ADMIN) => \dpu
- 38.
Managing Groups
By default, the group that exists is the public group.
By default, a user is added to the public group.
SQL for creating a group:
CREATE GROUP group_name [WITH options]
SQL for altering group credentials/privileges:
ALTER GROUP group_name [ADD|OWNER|RENAME|WITH]
SQL for deleting a group:
DROP GROUP group_name
Note: options can be:
Row limit, session timeout, query timeout, default priority, maximum priority, resource limit, user
names
nzsql command to list groups:
• SYSTEM(ADMIN) => \dg
nzsql command to list a group's permissions:
• SYSTEM(ADMIN) => \dpg
- 39.
Permissions
Types of Permission :
Object Permissions [ 11 nos. ]:
• List, Select
• Insert, Delete, Update
• Alter, Drop, Truncate
• Lock, Abort, Load, Genstat
Admin Permissions [ 13 nos. ]:
• Database, Temporary Table, External Table , System Table, view
• User, Group
• Create, Backup, Restore, Reclaim
• Hardware, system
Scope of Permission:
Applies only to object permissions, in two classes:
• Local scope: applies when logged into a particular database
• Global scope: applies when logged into the system database
By default, admin permissions are global in nature
- 40.
Object Permissions
Object permissions granted in the system database are inherited by all other databases
• i.e. they have global scope
Object permissions granted within a database are local to that database
• i.e. they have local scope
Object permissions are additive in nature
• i.e. the effective permissions on an object
• = user permissions + group permissions + public permissions
SQL for granting an object permission:
GRANT object_permission ON object TO {PUBLIC | GROUP group_name | user_name} [WITH GRANT
OPTION]
SQL for revoking an object permission:
REVOKE object_permission ON object FROM {PUBLIC | GROUP group_name | user_name}
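The scope and additive rules for object permissions can be modelled as a simple set union. This is a sketch of the rules as stated on this slide, not of Netezza's catalog: the grant tuples, the user `asha`, the group `analysts` and the databases `SALES` and `HR` are all hypothetical.

```python
def effective_permissions(user, groups, current_db, grants):
    """Union of grants to the user, the user's groups and PUBLIC.

    A grant recorded in the SYSTEM database has global scope; any other
    grant applies only while connected to that database (local scope).
    """
    grantees = {user, "PUBLIC", *groups}
    return {
        perm
        for database, grantee, perm in grants
        if grantee in grantees and database in ("SYSTEM", current_db)
    }


# (database where granted, grantee, permission): hypothetical sample data
grants = [
    ("SYSTEM", "PUBLIC", "LIST"),
    ("SYSTEM", "analysts", "SELECT"),
    ("SALES", "asha", "INSERT"),
    ("HR", "asha", "DELETE"),
]

in_sales = effective_permissions("asha", ["analysts"], "SALES", grants)
in_hr = effective_permissions("asha", ["analysts"], "HR", grants)
print(sorted(in_sales))
print(sorted(in_hr))
```

Because the rule is a union, removing one grant does not take a permission away if the user still receives it through a group or through PUBLIC.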
- 41.
Admin Permissions
Admin permissions are global in scope
SQL for granting an admin permission:
GRANT admin_permission TO {PUBLIC | GROUP group_name | user_name} [
WITH GRANT OPTION ]
SQL for revoking an admin permission:
REVOKE admin_permission FROM {PUBLIC | GROUP group_name | user_name}
nzsql command to list a user's permissions:
SYSTEM(ADMIN) => \dpu
nzsql command to list a group's permissions:
SYSTEM(ADMIN) => \dpg
- 42.
Listing All Permissions to User/Group
- 43.
Viewing The Distribution & Skew
In CLI
on Linux prompt
• $ nz_skew
on nzsql prompt
• nzsql => SELECT datasliceid, COUNT(datasliceid) AS "ROWS"
           FROM MB_STU_PRE
           GROUP BY datasliceid
           ORDER BY "ROWS";
In GUI
– In nzAdmin –> Tools –> Table Skews
NOTE:
To change the distribution key, use CREATE TABLE table_name AS (select clause) with the new distribution
key.
If no distribution clause is specified in the CTAS, the parent table's distribution key column is used
by default.
The default threshold for displaying table skew is 100 MB.
- 44.
Log Files in Netezza
All log files in Netezza are in the directory:
/nz/kit/log/
The logs created include:
alcapp, alcloader, waitForAlcapp
backupsvr, bnrmgr, restoresvr
bootsvr, dbos
clientmgr, eventmgr, sessionmgr, sysmgr
fcommrtx
gencErrors, hostStatsGen
loadmgr, nzloadTmpLogs
plans, planshist, postgres
sendMail
ssgdba
startupsvr, statsSvr
- 45.
Priority
Priorities are of two kinds:
Job priority
Session priority
Priority values are defined for a user, a group, or as the system default.
The system determines which priority value to use when the user connects to the host
and executes SQL commands.
The possible priorities are critical, high, normal, low, or none.
Two more exist, SYSTEM CRITICAL (highest) and SYSTEM BACKGROUND (lowest), which are not visible to
users.
The default priority for groups and the system is none.
If priorities are not set, user sessions run at normal priority.
- 46.
Priority (Contd…)
The syntax to set system defaults, including priority, is:
SET SYSTEM DEFAULT
– [SESSIONTIMEOUT | ROWSETLIMIT | QUERYTIMEOUT] TO [number |
UNLIMITED]
– [DEFPRIORITY | MAXPRIORITY] TO [CRITICAL | HIGH | NORMAL | LOW |
NONE]
The syntax to show the system default priorities is:
SHOW SYSTEM DEFAULT MAXPRIORITY;
SHOW SYSTEM DEFAULT DEFPRIORITY;
The syntax to create a group and set its default priority is:
CREATE GROUP group_name WITH DEFPRIORITY TO HIGH;
The syntax to create a user and set its default priority is:
CREATE USER user_name WITH DEFPRIORITY TO CRITICAL;
- 47.
Priority (Contd…)
The syntax to change the priority of a session:
ALTER SESSION [<session_id>] SET PRIORITY TO <priority>;
Example:
– nzsql=> ALTER SESSION 21664 SET PRIORITY TO HIGH;
The syntax to change the priority of a session using nzsession:
nzsession priority -high -u nz -pw password -id 21664
- 48.
Migration
Migration activities are classified by type as:
Data migration
Environment migration
Code migration
- 49.
Data Migration
Data migration is done in two different ways:
nz_migrate utility –
Syntax: nz_migrate -shost <name/IP> -thost <name/IP>
        -sdb <dbname> -tdb <dbname>
        [optional args]
This script must be invoked from the 'source' machine.
Optionally, this script can automatically create the target database and objects
via the options
• -CreateTargetTable
• -CreateTargetDatabase
External tables –
The database/table is unloaded to an external table on the source
The external table is loaded back into a database/table on the target
- 50.
Environment Migration
Environment Migration:
Objects created by a developer in the personal database EDW_UT are moved to
the development database EDW_SIT
All the objects present in the development database EDW_SIT are moved to
the production database EDW_PROD
This is done through the customized script
"export/home/nz/psm/scripts/dlc/object.in.work.prod.bash"
NOTE:
– There is also a script
"export/home/nz/psm/scripts/dlc/object.in.work.create.bash",
which promotes objects from the personal database EDW_UT to EDW_SIT
for testing purposes
- 51.
Code Migration
Code migration:
This is a manual method in which the DDL of objects such as
tables and views is obtained using nz_ddl_table, nz_ddl_view, etc.,
and the objects are created from that DDL.
The privileges (ACL) of each object are obtained through the
nz_get_acl utility and used to reproduce the ACL on the newly
created object.
This is done through a customized script.
- 52.
Events and Alerts
By default there are a total of 40 events for which alerts are
raised. They include:
1. CPUcoresOK_em_NzCS
2. CPUcoresReduced_em_NzCS
3. HostNoLongerOnline
4. HostNotOnline
5. MemFlt_rc_NzCS
6. NzDAC_QDR_fault_em_NzCS
7. Regen_em_NzCS
8. RunAwayQuery_TF12
9. RunAway_rc_NzCS
10. RunAway_rc_monitor
11. SystemOnline
12. coreRequest_em_NzCS
13. dFPGA_em_NzCS
14. dFPGA_em_NzCS_r
15. diskFull_8x_em_NzCS
16. diskFull_90_em_NzCS
17. diskFull_95_em_NzCS
18. histCapture_em_NzCS
19. histLoad_em_NzCS
20. hwFlt_FanOrPwr_em_NzCS
- 53.
Monitoring & Gathering Scripts
Monitoring of the system is done through customized/system scripts, which
execute daily in the following ways:
Hourly performance data for each server is generated and mailed
Consolidated reports on various daily DBA activities (such as backup, restore,
genstats, reclaim, etc.) for all the servers are generated and mailed
A complete health report is generated by nz_health and mailed
A report of all log activity is generated and mailed
SPU performance is checked by the disk_timing script every 8 hours
- 54.
Thank You
BHAWANI NANDAN PRASAD
BI & Analytics Practice Head
Bhawani.prasad@agreeya.net
+91 9717570222