Channel: Welcome to the StarNet Knowledge Base

Job Scheduling

Job scheduling takes the start data that the user sends in the API call and merges it with a template string returned by the job scheduling function. Instead of starting a session immediately, the FastX server executes the interpolated string and returns any output to the user. This lets admins configure FastX to start sessions through job schedulers such as SLURM, MOAB, or LSF.
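The merge step described above can be pictured with a small sketch. This is only an illustration of the idea, not FastX's own template engine: the `$command` placeholder syntax, the `command` field in the start data, and the `bsub` template are all assumptions made for the example.

```python
from string import Template


def build_scheduler_command(template: str, start_data: dict) -> str:
    """Merge start data into a template string (illustrative only)."""
    # safe_substitute leaves unknown placeholders untouched instead of raising.
    return Template(template).safe_substitute(start_data)


# Hypothetical start data and template; real FastX field names may differ.
start_data = {"command": "xterm"}
template = "bsub $command"

# The interpolated string is what would be executed instead of the start command.
print(build_scheduler_command(template, start_data))
```

In this sketch the scheduler, not FastX, decides where the resulting command actually runs.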

Note that in FastX terminology, Job Scheduling and Load Balancing are two different actions.

  • Job Scheduling is the act of creating a template string that is executed in place of the start command.
  • Load Balancing is the act of choosing which server executes the start data.

When integrating LSF (or an equivalent scheduler), the template function (job scheduling) is executed on the serverId returned by load balancing. In this case, LSF decides where the session is finally started.

Built In Schedulers

FastX ships with several built-in Job Schedulers that cover standard load balancing algorithms. They are selected under:

Admin > Sessions > Launcher > Load Balancing

  • Fewest Sessions — launch new sessions on the launcher with lowest number of sessions
  • Most Available Memory — launch new sessions on a launcher with the most free RAM
  • Round Robin — evenly distribute the launcher sessions
  • Custom — Use a custom Job Scheduler

Custom Schedulers

Admins can create custom scheduling scripts. In FastX 4, the launcher script is a custom executable script that the admin writes. To enable it, set the following environment variables in $FX_CONFIG_DIR/fastx.env:

  • LOAD_BALANCER=custom — set the load balancer to custom
  • LOAD_BALANCER_SCRIPT=/path/to/your/script — custom script to use for load balancing

Output

The script must write the ID of the chosen node (nodeID) to stdout.

Script Input

The load balancer script receives the following input on stdin as a JSON string.

{
   "nodes": [<node1>, <node2>, ... <nodeN>],
   "params": <input object>
}
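As a minimal sketch of this contract, the script below reads the JSON input from a stream and writes one node id to another. It assumes each node carries an `id` field (as in the Node Object section) and that the id is written followed by a newline; the "first node" policy and the sample ids `node-a` and `node-b` are placeholders for illustration only.

```python
import io
import json
import sys


def choose_node(payload: dict) -> str:
    """Return the id of the first node; a placeholder policy."""
    nodes = payload.get("nodes", [])
    if not nodes:
        raise ValueError("no nodes in load balancer input")
    return nodes[0]["id"]


def main(stdin, stdout) -> None:
    payload = json.load(stdin)                 # one JSON document arrives on stdin
    stdout.write(choose_node(payload) + "\n")  # the chosen node id goes to stdout


# Demo with in-memory streams; a deployed script would call
# main(sys.stdin, sys.stdout) instead.
sample = io.StringIO('{"nodes": [{"id": "node-a"}, {"id": "node-b"}], "params": {}}')
out = io.StringIO()
main(sample, out)
print(out.getvalue(), end="")
```

Keeping the selection logic in `choose_node` makes it easy to swap in a different policy without touching the stdin/stdout handling.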

Node Object

The node object is a JSON object describing a system that has a launcher service running on it. Each node has many parameters; the relevant ones are listed below.

{
  "id": "node_id",
  "nodeData": {
    "health": <health>,
    "services": [<service1>, <service2>, ... <serviceN>],
    "sessions": [<session1>, <session2>, ... <sessionN>]
  },
  "metadata": {}
}
  • health — node health data
  • services — list of services on the node
  • sessions — array of session data objects
  • metadata — metadata object. Object parameters are set by setting the METADATA_<NAME> environment variable in the environment and then restarting the service
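The fields above are enough to sketch a simple selection policy. The function below picks the node with the fewest entries in `nodeData.sessions`, optionally restricted to nodes whose `metadata` matches some required values. It is an illustration, not the implementation behind the built-in Fewest Sessions option. The `gpu` metadata key is hypothetical, and how a METADATA_<NAME> variable maps to a key name in `metadata` is not specified here, so check a real node object before relying on it.

```python
def fewest_sessions(nodes, required_metadata=None):
    """Return the id of the matching node with the fewest sessions, or None."""
    required = required_metadata or {}
    # Keep only nodes whose metadata contains every required key/value pair.
    candidates = [
        n for n in nodes
        if all(n.get("metadata", {}).get(k) == v for k, v in required.items())
    ]
    if not candidates:
        return None
    # Session count is the length of the nodeData.sessions array.
    best = min(candidates, key=lambda n: len(n.get("nodeData", {}).get("sessions", [])))
    return best["id"]


# Hypothetical sample nodes; only the fields used by the policy are filled in.
nodes = [
    {"id": "node-a", "nodeData": {"sessions": [{}, {}]}, "metadata": {"gpu": "yes"}},
    {"id": "node-b", "nodeData": {"sessions": [{}]}, "metadata": {}},
    {"id": "node-c", "nodeData": {"sessions": [{}, {}, {}]}, "metadata": {"gpu": "yes"}},
]

print(fewest_sessions(nodes))                  # node-b
print(fewest_sessions(nodes, {"gpu": "yes"}))  # node-a
```

Returning None when no node matches leaves it to the caller to decide whether to fall back to an unfiltered choice or to fail.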

Params Object

Pending Sessions

Sessions that have been scheduled but have not yet connected back to the launcher are considered pending. Users cannot connect to a pending session. However, a limited number of maintenance and information-gathering actions (get info, terminate, purge) are available to a user on a pending session.

