Help:KMigration


Kogence compute nodes are failure and preemption tolerant. All Kogence nodes run on cost-effective infrastructure procured through real-time bidding on the spot market. In general, such infrastructure can be preempted if another market participant is willing to pay a higher price. Kogence infrastructure is preemption tolerant: kMigration technology saves the state of your simulation, informs the scheduler about the pending move, and moves your workload to a new node, where your simulation restarts from the exact same state as before.
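The save, move, and resume sequence can be illustrated with a small application-level sketch. This is not how kMigration itself is implemented (kMigration preserves the state of the running job without changes to your code); it is only a toy analogy, and the names `step`, `run`, `save_state`, `restore_state` and `run_with_migration` are hypothetical. It shows the property that matters: a run interrupted and resumed from saved state ends in the same state as an uninterrupted run.

```python
import pickle


def step(state):
    """Advance a toy simulation by one iteration (hypothetical workload)."""
    return {"i": state["i"] + 1, "total": state["total"] + state["i"] ** 2}


def run(state, n_steps):
    """Run n_steps iterations starting from the given state."""
    for _ in range(n_steps):
        state = step(state)
    return state


def save_state(state):
    """Serialize the simulation state so it can be shipped to another node."""
    return pickle.dumps(state)


def restore_state(blob):
    """Rebuild the simulation state from its serialized form."""
    return pickle.loads(blob)


def run_with_migration(n_steps, preempt_at):
    """Run n_steps in total, with a simulated preemption after preempt_at steps."""
    state = run({"i": 0, "total": 0}, preempt_at)  # work done on the old node
    blob = save_state(state)                       # state is saved before the move
    # Notifying the scheduler and transferring the blob are elided here.
    state = restore_state(blob)                    # state is restored on the new node
    return run(state, n_steps - preempt_at)        # work continues where it stopped


if __name__ == "__main__":
    uninterrupted = run({"i": 0, "total": 0}, 10)
    migrated = run_with_migration(10, preempt_at=4)
    print(migrated == uninterrupted)
```

The comparison at the end checks that the migrated run and the uninterrupted run finish in identical states, which is the guarantee described above.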

kMigration technology is key to predictive scaling through kScaling. The two technologies work in sync to create a new HPC experience for the end user. In a traditional HPC environment, when you submit a job to a cluster, you need to know a priori what resources your job will need. Suppose you submit a job to a queue of 100 GB nodes and, after crunching data for several hours, it runs out of memory. The job simply crashes. You need to resubmit it and start all the work from the beginning, wasting hours of computing. And how can you reliably predict how much memory your job will need when it is processing real-time data? kMigration technology does not let your job crash. A new node with more memory is added to the cluster automatically, your job is migrated to the new node (preserving its entire CPU and memory state), and the old node is then terminated.
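The scale-up decision described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the actual kScaling policy: the function name `choose_node`, the 90% memory threshold, and the list of available node sizes are all hypothetical.

```python
def choose_node(current_gb, used_gb, sizes_gb, threshold=0.9):
    """Pick the node size a job should migrate to.

    Returns None while memory use is below threshold * current_gb.
    Otherwise returns the smallest available size larger than the
    current node. The threshold and sizes are illustrative assumptions.
    """
    if used_gb < threshold * current_gb:
        return None  # enough headroom, stay on the current node
    larger = sorted(s for s in sizes_gb if s > current_gb)
    if not larger:
        raise RuntimeError("no larger node size available")
    return larger[0]


if __name__ == "__main__":
    sizes = [100, 200, 400]  # hypothetical node memory sizes in GB
    print(choose_node(100, 50, sizes))  # job fits comfortably
    print(choose_node(100, 95, sizes))  # job is close to the 100 GB limit
```

In this sketch a job using 95 GB on a 100 GB node would be moved to the 200 GB size before it runs out of memory, rather than crashing and being resubmitted from the beginning.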