What is Information Integration Solutions’ definition of Grid Computing?
A group of like servers running the same operating system
High-speed private network between nodes (1 Gb)
More than 3 physical nodes
–3 or fewer can be managed using static configuration files
–More than 3 is unmanageable (too many combinations)
Resource Manager is used to manage node allocation
Dynamic configuration files are created at run time
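The idea of a configuration file created at run time can be sketched as follows. This is a minimal illustration, assuming the resource manager has already returned the list of allocated hosts. The function name `build_config`, the host names, and the disk paths are hypothetical, and the output only mimics the general layout of a parallel configuration file (one logical node entry per partition); it is not the product's own generator.

```python
def build_config(hosts, partitions_per_host, disk="/data", scratch="/scratch"):
    """Return configuration text with one logical node per (host, partition)."""
    if partitions_per_host < 1:
        raise ValueError("partitions_per_host must be at least 1")
    entries = []
    for host in hosts:
        for part in range(1, partitions_per_host + 1):
            name = f"{host}_{part}"  # hypothetical logical node naming scheme
            entries.append(
                f'  node "{name}"\n'
                "  {\n"
                f'    fastname "{host}"\n'
                '    pools ""\n'
                f'    resource disk "{disk}" {{pools ""}}\n'
                f'    resource scratchdisk "{scratch}" {{pools ""}}\n'
                "  }"
            )
    return "{\n" + "\n".join(entries) + "\n}\n"


if __name__ == "__main__":
    # Two allocated hosts with two partitions each give four logical nodes.
    print(build_config(["compute1", "compute2"], partitions_per_host=2))
```

Because the file is derived from whatever hosts were allocated for this run, nobody has to maintain a static file for every possible combination of nodes.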
Why Information Integration Grid?
1. Commodity Hardware
Low Cost
High Availability
2. Software Scalability
Larger Data sources
Faster Run times
3. Utility Usage / Consolidation
Move away from servers dedicated to a single project
Data Analysis / Data Cleansing / Data Integration on a shared pool of resources
Resource Manager Responsibility
Node allocation / holding of resources until the task completes
Nodes that are in use can’t be used again until the previously assigned task completes
No additional nodes available: jobs are queued
Queued jobs are released using a FIFO method
Must have enough nodes to run the maximum number of concurrent job streams
Smart usage/allocation by type of job
Define queues based on priority
Servers that are down or powered off are not assigned
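The allocation rules above (hold nodes until the task completes, queue jobs when no nodes are free, release in FIFO order, skip servers that are down) can be sketched as a toy allocator. The class and method names are hypothetical and do not correspond to any particular resource manager product; priority queues and allocation by job type are left out.

```python
from collections import deque


class ResourceManager:
    """Toy FIFO node allocator (illustrative only)."""

    def __init__(self, nodes):
        self.free = list(nodes)   # nodes that are up and unassigned
        self.held = {}            # job name -> nodes held until it completes
        self.queue = deque()      # waiting (job, nodes_needed) pairs, FIFO

    def submit(self, job, needed):
        """Queue a job; it starts at once if it is first and nodes are free."""
        self.queue.append((job, needed))
        self._dispatch()

    def complete(self, job):
        """Return the job's nodes to the pool and start any waiting jobs."""
        self.free.extend(self.held.pop(job))
        self._dispatch()

    def mark_down(self, node):
        """Remove a failed or powered-off node from the assignable pool."""
        if node in self.free:
            self.free.remove(node)

    def _dispatch(self):
        # Strict FIFO: only the job at the head of the queue may start.
        while self.queue and self.queue[0][1] <= len(self.free):
            job, needed = self.queue.popleft()
            self.held[job] = [self.free.pop(0) for _ in range(needed)]


if __name__ == "__main__":
    rm = ResourceManager(["n1", "n2", "n3"])
    rm.submit("A", 2)   # starts immediately
    rm.submit("B", 2)   # only one node is free, so B waits
    rm.complete("A")    # A's nodes are released and B starts
    print(rm.held)
```

With strict FIFO, a job that asks for more nodes than the grid will ever have blocks everything behind it, which is one reason the pool has to be sized for the maximum number of concurrent job streams.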
Resource Manager Responsibility
License restrictor (nodes allocated by type of task)
–ProfileStage: 4 nodes
–QualityStage: 6 nodes
–DataStage: 10 nodes
•Based on time of day
From a shared pool of nodes
License for 8 nodes, but 48 nodes available (other applications)
Assigned to logical, not physical, nodes
Each Resource Manager manages these tasks differently
Real time during the day, batch nightly or on weekends
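The license restriction can be pictured as a per-task-type cap applied on top of the shared pool. The sketch below reuses the node counts listed above (4, 6, 10) and the 48-node pool; the `LicenseLimiter` class is hypothetical, and the time-of-day variation is not modeled. A real resource manager would express these limits in its own configuration.

```python
LICENSE_CAPS = {"ProfileStage": 4, "QualityStage": 6, "DataStage": 10}


class LicenseLimiter:
    """Grant logical nodes per task type without exceeding the licensed cap."""

    def __init__(self, caps, pool_size):
        self.caps = dict(caps)
        self.pool_free = pool_size               # shared pool, all task types
        self.in_use = {task: 0 for task in caps}

    def request(self, task_type, nodes):
        """Return True and reserve the nodes if both cap and pool allow it."""
        within_license = self.in_use[task_type] + nodes <= self.caps[task_type]
        within_pool = nodes <= self.pool_free
        if nodes < 1 or not (within_license and within_pool):
            return False
        self.in_use[task_type] += nodes
        self.pool_free -= nodes
        return True

    def release(self, task_type, nodes):
        """Give nodes back to the pool once the task completes."""
        if nodes > self.in_use[task_type]:
            raise ValueError("releasing more nodes than are in use")
        self.in_use[task_type] -= nodes
        self.pool_free += nodes


if __name__ == "__main__":
    limiter = LicenseLimiter(LICENSE_CAPS, pool_size=48)
    print(limiter.request("DataStage", 8))   # True: within the cap of 10
    print(limiter.request("DataStage", 3))   # False: would reach 11
```

The cap counts logical nodes granted to a task type, so it does not matter which physical servers in the pool end up doing the work.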
Correct Node Allocation
Each Job Sequence or Job should maximize utilization on a single node before using multiple nodes
–Maximize on single node first using
•$APT_GRID_PARTITIONS (per node or server)
–Then add more nodes using
•$APT_GRID_COMPUTENODES (servers)
Most Jobs or Job Sequences (80%+) should use a single node and partition, determined by:
Run time (i.e., less than 5 minutes)
Data volume (fewer than a few thousand rows)
Concurrency of jobs from a single job sequence
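The "fill one node first, then add nodes" rule can be sketched as a small helper that picks values for `$APT_GRID_PARTITIONS` and `$APT_GRID_COMPUTENODES`. Everything except those two variable names is an assumption made for illustration: the function `grid_settings`, the 5,000-row cutoff standing in for "a few thousand rows", the per-node partition limit of 4, and the idea that the caller already knows how many partitions the job would like. Job concurrency within a sequence is not modeled.

```python
import math


def grid_settings(expected_minutes, expected_rows, desired_partitions,
                  max_partitions_per_node=4, small_rows=5000, small_minutes=5):
    """Pick compute-node and partition counts for one job (illustrative)."""
    if expected_minutes < small_minutes or expected_rows < small_rows:
        # Small or short job: a single node and a single partition.
        nodes, partitions = 1, 1
    else:
        # Fill a single node first, then add nodes for the remainder.
        partitions = max(1, min(desired_partitions, max_partitions_per_node))
        nodes = max(1, math.ceil(desired_partitions / partitions))
    return {
        "APT_GRID_COMPUTENODES": str(nodes),
        "APT_GRID_PARTITIONS": str(partitions),
    }


if __name__ == "__main__":
    print(grid_settings(2, 1_000_000, 8))    # short job stays on 1 x 1
    print(grid_settings(30, 1_000_000, 8))   # 2 nodes x 4 partitions
```

The total degree of parallelism in this sketch is the product of the two values, so a job that wants 8 partitions gets them from 2 nodes rather than 8.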
Information Integration High Availability Grid
Enough compute nodes to provide service when one or more nodes fail
Compute node failure
–Just restart Job(s), based on availability
–Must be designed for restart
Frontend node failure
–Heartbeat failover of services
–Restart jobs
SAN failure (I/O node for larger grids)
–Similar to Frontend node failure
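Restart after a compute node failure can be sketched as a retry on the next available node. The `NodeFailure` exception and the `run_with_restart` function are hypothetical names for illustration. The sketch assumes the job is safe to rerun from the start, which is what "designed for restart" requires of the job itself.

```python
class NodeFailure(Exception):
    """Raised by a job when the compute node it runs on fails."""


def run_with_restart(job, nodes, max_attempts=3):
    """Run job(node), restarting on the next node after each node failure."""
    failed = []
    for node in nodes[:max_attempts]:
        try:
            return job(node)
        except NodeFailure:
            failed.append(node)  # remember the bad node and try the next one
    raise RuntimeError(f"job failed on nodes: {failed}")


if __name__ == "__main__":
    def sample_job(node):
        # Simulated job: the first node is broken, the others work.
        if node == "n1":
            raise NodeFailure(node)
        return f"done on {node}"

    print(run_with_restart(sample_job, ["n1", "n2", "n3"]))
```

The retry only helps if spare compute nodes exist, which is why the grid needs enough nodes to keep providing service when some fail.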