The Secondary Pool

Version: Deadline 6 and later

Overview

In this article I'd like to draw attention to an older feature that is still worth noting, especially if you use a Job Scheduling Order that starts with "Pool". That includes Deadline's default order, "Pool, Priority, First-In-First-Out". The feature, found in a Job's settings, is the Secondary Pool.

A Brief Review of Pools

Pools provide a way of influencing the order in which Deadline Slaves consider Jobs. The influence of Pools depends on the Job Scheduling Order that has been configured for the Repository.

Why have Pools, and why are they considered before a Job's Priority? It is not uncommon for an organization to be handling multiple projects at once or have multiple departments that compete for the fixed compute resources of a farm. It's also not uncommon for a project or a department to have escalated urgency and therefore submit Jobs with a high Priority setting. If Priority is the first consideration in the scheduling order, it's quite possible to completely starve out other projects or departments from using the farm. I'm sure I'm not the only person who has experienced a "priority escalation war" between projects that were in a hurry to get work done.

By organizing Slaves into Pools, and by making Pools the first consideration in the scheduling order, it's possible to ensure that each project or department gets first billing on a subset of the Slaves. Priority wars can still happen, but they are limited to the scope of a Pool, making them more of a tribal war than a global war.

In Deadline, each Slave may be assigned an ordered list of Pools via Tools -> Manage Pools... from the menu in Deadline Monitor. And each Job may be associated with exactly one Pool (or "none") at the time of submission. The Pool and Secondary Pool settings are typically found together, whether in an integrated submitter or in a typical Monitor submitter.

The Pool settings for a Job may be viewed and altered via the Job Properties dialog.

The Job Scheduling Order, set via Tools -> Configure Repository Options... (Job Settings, Job Scheduling tab), will then determine how Pools figure into the order that Slaves consider Jobs. In the case of Deadline's default order of "Pool, Priority, First-In-First-Out", a Slave will consider each Pool in its list of Pools, in order, looking for Jobs that have a matching Pool setting. If multiple Jobs are found in the same Pool, it will go on to consider Priority, and finally the time of submission.
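To make the default order concrete, here is a minimal Python sketch of how a single Slave might choose among queued Jobs under "Pool, Priority, First-In-First-Out". It is a toy model and not Deadline's implementation: the Job class, the pick_job function, and the integer submission times are all hypothetical, and I'm assuming that a larger Priority number means a more urgent Job.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Job:
    name: str
    pool: str        # the Job's (primary) Pool setting
    priority: int    # assumption: a higher number means more urgent
    submitted: int   # assumption: submission time as a sortable integer


def pick_job(slave_pools: Sequence[str], queue: Sequence[Job]) -> Optional[Job]:
    """Return the Job this toy Slave would consider first, or None."""
    for pool in slave_pools:  # 1. Pools, in the Slave's own order
        candidates = [job for job in queue if job.pool == pool]
        if candidates:
            # 2. highest Priority wins; 3. earliest submission breaks ties
            return min(candidates, key=lambda job: (-job.priority, job.submitted))
    return None


queue = [
    Job("b_urgent", pool="project_b", priority=90, submitted=1),
    Job("a_late", pool="project_a", priority=50, submitted=3),
    Job("a_early", pool="project_a", priority=50, submitted=2),
]

print(pick_job(["project_a", "project_b"], queue).name)  # a_early
print(pick_job(["project_b", "project_a"], queue).name)  # b_urgent
```

The first Slave ignores the higher Priority of b_urgent because project_a comes first in its Pool list. Priority and submission time only come into play among Jobs that share a Pool.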

Sharing the Farm

Pools can be aligned to a business dimension like projects or departments or whatever makes the most sense for your use case. In this article, I'll align Pools to projects.

How might we organize the Pools to share the farm among multiple projects? One approach is to create a Pool for each project, and then use cascading Pool ordering across the Slaves so that each project has a subset of Slaves where its Pool is listed first, another subset where its Pool is listed second, and so forth.

For a simple example, suppose we have two projects, "Project A" and "Project B", and two Slaves, "Slave-01" and "Slave-02". To cascade the Slave usage, we would create a Pool for each project and then cascade the Pool order among the Slaves:

Slave-01: project_a, project_b
Slave-02: project_b, project_a

This means, under the default scheduling order, that Slave-01 will always pick up Jobs with a Pool setting of project_a before Jobs with a setting of project_b, regardless of the Jobs' Priority settings. It will only consider Jobs with a Pool setting of project_b if it finds there is no available work among Jobs with a setting of project_a. Just the opposite is true for Slave-02. As a result, each project gets priority on one of the Slaves but can still use the other Slave when it would otherwise be idle.

To further demonstrate cascading, let's look at three projects and three Slaves:

Slave-01: project_a, project_b, project_c
Slave-02: project_b, project_c, project_a
Slave-03: project_c, project_a, project_b

Notice that the Pool order of each Slave is essentially just left-shifted from the one above it, with the first Pool wrapping around to the end. In this way, each project gets first priority on one of the Slaves, second priority on one of the Slaves, and third priority on one of the Slaves. As more Slaves are added, we may simply continue the left-shift. As long as the Slaves significantly outnumber the projects, nobody should be bothered by a little rounding:

Slave-04: project_a, project_b, project_c
Slave-05: project_b, project_c, project_a
Slave-06: project_c, project_a, project_b
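The left-shift pattern is mechanical enough to generate. The following Python sketch builds the cascaded Pool order for any list of Slaves and Pools. The cascade_pools function is a hypothetical helper of my own; it only computes and prints the lists and does not talk to Deadline in any way.

```python
from typing import Dict, List


def cascade_pools(slaves: List[str], pools: List[str]) -> Dict[str, List[str]]:
    """Give each Slave the Pool list rotated left one more step than the last."""
    if not pools:
        return {slave: [] for slave in slaves}
    assignments = {}
    for index, slave in enumerate(slaves):
        shift = index % len(pools)  # wrap around once every Pool has led
        assignments[slave] = pools[shift:] + pools[:shift]
    return assignments


slaves = [f"Slave-{n:02d}" for n in range(1, 7)]
pools = ["project_a", "project_b", "project_c"]

for slave, order in cascade_pools(slaves, pools).items():
    print(f"{slave}: {', '.join(order)}")
```

Run as written, this prints the same six assignments listed above. Adding or removing a project changes every Slave's list, which the function recomputes instantly, but the result would still have to be applied to each Slave.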

Well, this seems like a fair way to distribute the Pools across the Slaves so that each project gets its share of farm resources. But it may be apparent that this could become an administration headache if there were dozens of projects, especially as projects come and go, requiring a complete redo of all the Slaves' Pool assignments.

The Secondary Pool

The Secondary Pool feature (see Secondary Pools and Job Scheduling) was among the Job scheduling enhancements introduced in Deadline 6.1 (January 2014). Prior to Deadline 6.1, a Job could only be associated with a single Pool. Deadline 6.1 added the ability to also associate a Job with a Secondary Pool. Let's look at how the Secondary Pool works.

When a Slave is scanning the queue for work, it will work through its Pools list, in order, looking for Jobs with a (primary) Pool setting that matches the Pool being considered by the Slave. If no Jobs are found, it will repeat the process, this time matching each Pool in its list against the Secondary Pool setting of each Job. This can potentially reduce administration overhead by eliminating, or at least reducing, the need for cascading:

Slave-01: project_a, pool_all
Slave-02: project_b, pool_all
Slave-03: project_c, pool_all
Slave-04: project_a, pool_all
Slave-05: project_b, pool_all
Slave-06: project_c, pool_all

Suppose there is a Job "X" in the queue with its Pool set to project_a and its Secondary Pool set to pool_all. Suppose also that the two Slaves that list the project_a pool are busy. Finally, suppose that Slave-02 has scanned the Jobs in the queue looking at their Pool setting but has found nothing. In this case it will re-scan the Jobs in the queue, this time examining their Secondary Pool, and it will now find Job "X" and begin working on it.
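The two-pass scan described above can be sketched in a few lines of Python. As before, this is a toy model of the behavior as I've described it, not Deadline's code: the Job class and pick_job function are hypothetical, an empty string stands in for "no Secondary Pool", and a larger Priority number is assumed to be more urgent. The model also ignores whether other Slaves are busy and only answers what one idle Slave would pick.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Job:
    name: str
    pool: str            # the Job's (primary) Pool setting
    secondary_pool: str  # "" stands in for "no Secondary Pool"
    priority: int        # assumption: a higher number means more urgent
    submitted: int       # assumption: submission time as a sortable integer


def pick_job(slave_pools: Sequence[str], queue: Sequence[Job]) -> Optional[Job]:
    """First pass matches on Pool; second pass matches on Secondary Pool."""
    for attribute in ("pool", "secondary_pool"):
        for pool in slave_pools:  # the Slave's Pools, in order
            candidates = [job for job in queue if getattr(job, attribute) == pool]
            if candidates:
                return min(candidates, key=lambda job: (-job.priority, job.submitted))
    return None


slave_02 = ["project_b", "pool_all"]
job_x = Job("X", pool="project_a", secondary_pool="pool_all", priority=80, submitted=1)
job_y = Job("Y", pool="project_b", secondary_pool="", priority=10, submitted=2)

print(pick_job(slave_02, [job_x]).name)         # X
print(pick_job(slave_02, [job_x, job_y]).name)  # Y
```

With only Job "X" in the queue, the first pass finds nothing for Slave-02 and the second pass picks "X" through pool_all. Once a project_b Job is queued, the first pass wins, even though "X" has the higher Priority.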

However, a potential weakness is that work picked up through the Secondary Pool will again be dominated by high-Priority Jobs, because every project spills over into the same pool_all Pool. This may be acceptable, because the above configuration means that every project gets first billing on at least two Slaves, and only after that does it come down to a Job's Priority setting.

The influence of the Priority setting on the Secondary Pool could be diluted by cascading the Secondary Pools:

Slave-01: project_a, spillover_1, spillover_2
Slave-02: project_b, spillover_1, spillover_2
Slave-03: project_c, spillover_1, spillover_2
Slave-04: project_a, spillover_2, spillover_1
Slave-05: project_b, spillover_2, spillover_1
Slave-06: project_c, spillover_2, spillover_1

The Secondary Pool of a Job could then be set to either spillover_1 or spillover_2. Given four projects, A-D, a strategy might be to assign spillover_1 as the Secondary Pool for projects A-B and spillover_2 for projects C-D. The advantage, compared to a full cascade of the (primary) Pools, is that the Secondary Pools can be more generic and fewer in number while still achieving a similar result.
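If you wanted to derive that project-to-spillover mapping rather than maintain it by hand, one option is to split the project list into contiguous groups. The Python sketch below is only an illustration of that idea; assign_secondary_pools is a hypothetical helper, and how the resulting Secondary Pool actually gets set on each Job at submission is left out.

```python
from typing import Dict, List


def assign_secondary_pools(projects: List[str], spillovers: List[str]) -> Dict[str, str]:
    """Split projects into contiguous groups, one group per spillover Pool."""
    if not spillovers:
        raise ValueError("at least one spillover Pool is required")
    return {
        # integer arithmetic maps the first share of projects to the first
        # spillover Pool, the next share to the second, and so on
        project: spillovers[index * len(spillovers) // len(projects)]
        for index, project in enumerate(projects)
    }


projects = ["project_a", "project_b", "project_c", "project_d"]
mapping = assign_secondary_pools(projects, ["spillover_1", "spillover_2"])

for project, secondary_pool in mapping.items():
    print(f"{project} -> {secondary_pool}")
```

For the four projects above, this reproduces the split described in the text: projects A-B map to spillover_1 and projects C-D map to spillover_2.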

Ultimately, the aim of the Secondary Pool is to provide another means to manage farm utilization while also reducing administration overhead. If your farm uses a Job Scheduling Order in which Pool is listed first, it's worth considering whether the Secondary Pool feature may be of benefit.