How to Write a Manual Test Case

What Is a Test Case?

A test case is a written description of the steps to be taken in order to validate a behavior.  The test case should contain all of the information and instructions the tester needs.

Components of a Test Case

Summary

  • Summarize the purpose of the test
  • Keep the summary under 50 characters

Instructions

  • Put each step on a separate line
  • Provide a high-level summary of what needs to be done in each step
  • Do not specify OS-specific commands

Host

  • Specify the host for each step: server, sched, MoM, comm, or client

User

  • If relevant, specify whether to run the command as root or as an unprivileged user
  • For example, there is usually no need to specify that a hook or qmgr command must be run as root. However, specify this when you are checking user-related behavior.

Pre- and post-conditions

  • Specify these if there are any beyond the common configuration settings

Expected outcome

  • Be clear about what you expect.
  • Specify exactly what should happen to an attribute, resource, etc.
  • Do not use raw snapshots of qstat, pbsnodes, or log output as the expected outcome

Cleanup

  • Specify any steps needed to return the system to its vanilla state


Guidelines

  • Add a summary of fewer than 50 characters to the test
  • Provide a unique test name.
  • Refer to the PBS guides for any upgrade or installation steps. DO NOT add upgrade or installation steps to the test itself. Also DO NOT reference a specific section of the PBS guides, as section numbering may change in the future.


Make sure that each step is unambiguous.

Notes on Test Execution

  • Refer to the related documentation. For example, if testing on a special branch, refer to the branch-specific documentation and EDD along with the generic PBS guides.
  • Sometimes new features are not documented in time. If you cannot find the information in the PBS guides, refer to the respective EDD. You may also have to follow up with the project team that delivered the feature on the status of documentation and previous testing.
  • If tests fail due to missing documentation or product behavior, link the relevant PBS documentation and product bugs to those tests.

Examples ### most of these examples do not follow our recommendations; we need better ones

  • Job Placement Should Respect ncpus on Nodes Marked Exclusive

Prerequisites

Create 2 vnodes, each with 2 ncpus and sharing=default_excl

Test

  1. Submit 1 job requesting 1 ncpus and place=excl
  2. Submit another job requesting 3 chunks of 1 ncpus each and place=excl
  3. Check that the 1st job is running and the 2nd is queued with the following job comment (see the command sketch below):
    Insufficient amount of resource: ncpus (R: 3 A: 2 T: 4)
    Note: A (available) is 2 rather than 3, since the vnode on which the first job is running is allocated exclusively.
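
A minimal command sketch of these steps, assuming a bash shell on the client host and illustrative sleep jobs (the vnode setup from the prerequisites is already in place):

# Step 1: one job requesting 1 ncpus with exclusive placement
JOB1=$(qsub -l select=1:ncpus=1 -l place=excl -- /bin/sleep 300)

# Step 2: one job requesting 3 chunks of 1 ncpus each, exclusive placement
JOB2=$(qsub -l select=3:ncpus=1 -l place=excl -- /bin/sleep 300)

# Step 3: first job running, second queued with the expected ncpus comment
qstat -f "$JOB1" | grep job_state                 # expect job_state = R
qstat -f "$JOB2" | grep -E "job_state|comment"    # expect Q and the comment above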

Cleanup

Remove the vnodes and reset the configuration to the default


  • Calendaring When opt_backfill_fuzzy Set to "low"

Prerequisites

Set 2 ncpus on the node

Test

  1. Set the sched attribute opt_backfill_fuzzy to low
  2. Set the server's backfill_depth = 2
  3. Set strict_ordering to true for the scheduler
  4. Submit a job consuming 1 ncpu with a walltime of 60 secs
  5. Submit a reservation that starts after 60 secs and consumes both ncpus
  6. Submit a job with a walltime of 120 secs
  7. Verify that this job is calendared to start after the reservation end time by checking its estimated start time (see the command sketch below).
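
A minimal command sketch of these steps, assuming a bash shell, GNU date, sleep-based jobs, and a default scheduler named "default" (older versions may omit the scheduler name in qmgr):

# Prerequisite: 2 ncpus on the execution host (replace <mom_host>)
qmgr -c "set node <mom_host> resources_available.ncpus = 2"

# Steps 1-3
qmgr -c "set sched default opt_backfill_fuzzy = low"
qmgr -c "set server backfill_depth = 2"
# strict_ordering is usually enabled in $PBS_HOME/sched_priv/sched_config
# (strict_ordering: True ALL), followed by a HUP of the scheduler

# Step 4: 1-ncpu job with a 60-second walltime
JOB1=$(qsub -l select=1:ncpus=1 -l walltime=00:01:00 -- /bin/sleep 60)

# Step 5: reservation starting roughly a minute from now, taking both ncpus
# (-R has minute resolution, so the start time is approximate)
pbs_rsub -R $(date -d "+2 minutes" +%H%M) -D 00:10:00 -l select=1:ncpus=2

# Steps 6-7: the second job should be calendared after the reservation ends
JOB2=$(qsub -l select=1:ncpus=1 -l walltime=00:02:00 -- /bin/sleep 120)
qstat -f "$JOB2" | grep estimated.start_time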

Cleanup

Set ncpus to default and unset backfill_depth, opt_backfill_fuzzy, and strict_ordering.


  • sched_preempt_enforce_resumption is true and Job is topjob_ineligible ###

Prerequisites

Set 2 ncpus on the node

Test

  1. Set the server's backfill_depth = 2
  2. Set the sched attribute sched_preempt_enforce_resumption = true
  3. Submit a job consuming 2 ncpus with a walltime of 2 min
  4. Set topjob_ineligible=true on the running job via qalter
  5. Create a high-priority queue
  6. Submit 2 jobs to the high-priority queue, each with a walltime of 1 min
  7. Verify that preempted jobs are calendared even when topjob_ineligible=true is set, i.e., the preempted jobs have an estimated start time (see the command sketch below).
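
A minimal command sketch of these steps; the queue name highp, its priority, the resource requests of the high-priority jobs, and the sleep commands are assumptions:

# Steps 1-2
qmgr -c "set server backfill_depth = 2"
qmgr -c "set sched default sched_preempt_enforce_resumption = true"

# Steps 3-4: fill both ncpus, then mark the running job topjob_ineligible
JOB1=$(qsub -l select=1:ncpus=2 -l walltime=00:02:00 -- /bin/sleep 120)
qalter -W topjob_ineligible=true "$JOB1"

# Steps 5-6: high-priority queue and two 1-minute jobs
qmgr -c "create queue highp queue_type=execution, enabled=true, started=true, priority=200"
qsub -q highp -l select=1:ncpus=1 -l walltime=00:01:00 -- /bin/sleep 60
qsub -q highp -l select=1:ncpus=1 -l walltime=00:01:00 -- /bin/sleep 60

# Step 7: the preempted job should still be calendared
qstat -f "$JOB1" | grep estimated.start_time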

Cleanup

Reset to default configuration


  • Check Performance with Various opt_backfill_fuzzy Values

Test

  1. Submit a reservation of 30 secs that recurs every minute for a whole day (-r "FREQ=MINUTELY;COUNT=3600")
  2. Configure the daemons:
    1. Set the server's backfill_depth = 1000
    2. Set ncpus=2 on the MoM
    3. Set sched_cycle_length to 600 on the scheduler
    4. Set strict_ordering to true for the scheduler
  3. Create 3 queues: q1, q2, and q3
  4. Set backfill_depth to 1000 on queues q1 and q2
  5. Submit jobs with walltimes of 10 seconds, as follows:
    1. 1000 jobs to q1
    2. 1000 jobs to q2
    3. 1000 jobs to q3
    4. 2000 jobs to the default workq
  6. Set the opt_backfill_fuzzy scheduler attribute to each of off, low, medium, and high, and for each value do one of the following:
    1. Re-install PBS and rerun all the above steps. (We recommend this method, since it is easier and faster than the next one.)
    2. OR run a scheduling cycle with the new value. Note that setting the attribute does not initiate a new scheduling cycle, so wait until the current scheduling cycle is over and read the data from the next one. Collecting more than one scheduling cycle per value helps.
  7. Run pbs_loganalyzer on sched_log/<date> (see the command sketch below)
  8. Compare the following values from the pbs_loganalyzer output for each opt_backfill_fuzzy setting:
    cycle_duration, num_jobs_calendared, num_jobs_considered, time_to_calendar
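
A minimal sketch of the submission loops and the log analysis, assuming a bash shell and sleep-based jobs; the pbs_loganalyzer option shown is an assumption, so confirm it with --help first:

# Step 5: 1000 ten-second jobs to each of q1, q2, q3 and 2000 to workq
for q in q1 q2 q3; do
    for i in $(seq 1 1000); do
        qsub -q "$q" -l walltime=00:00:10 -- /bin/sleep 10
    done
done
for i in $(seq 1 2000); do
    qsub -q workq -l walltime=00:00:10 -- /bin/sleep 10
done

# Step 6: switch the fuzzy level (repeat for low, medium, and high)
qmgr -c "set sched default opt_backfill_fuzzy = off"

# Step 7: analyze the scheduler log for the day
pbs_loganalyzer --help                             # confirm the scheduler-log option name
pbs_loganalyzer -s $PBS_HOME/sched_logs/<date>     # assumed form; adjust per --help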


  • Create a Hook with Event execjob_epilogue

Prerequisites

Have the hook script test.py copied from XYZ location ###

  1. Create a hook with event execjob_epilogue and import test.py
  2. Submit a job and make sure it is running
  3. Verify that the hook does not update the job's Variable_List with "BONJOUR=Mounsieur Shlomi" while the job is running
  4. Verify that resources_available.file = 1gb is not set on the nodes while the job is running
  5. Wait till the job finishes
  6. Verify that the hook has updated the following node attribute:
    1. added resources_available.file = 1gb
  7. Check the Variable_List of the finished job; BONJOUR=Mounsieur Shlomi has been appended to the list
  8. Look for the following messages in the mom_logs of both nodes (see the command sketch after the messages):

Hook;pbs_python;printing pbs.event() values ---------------------->

Hook;pbs_python;event is EXECJOB_EPILOGUE

Hook;pbs_python;hook_name is test

Hook;pbs_python;hook_type is site

Hook;pbs_python;requestor is pbs_mom

Hook;pbs_python;requestor_host is (hostname)
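
A minimal command sketch of these steps; the path to test.py is an assumption, and reading the finished job's Variable_List assumes job_history_enable is set on the server:

# Step 1: create the hook and import the script
qmgr -c "create hook test event=execjob_epilogue"
qmgr -c "import hook test application/x-python default /path/to/test.py"

# Steps 2-4: submit a job and check while it is running
JOB=$(qsub -- /bin/sleep 60)
qstat -f "$JOB" | grep -A1 Variable_List          # no BONJOUR entry yet
pbsnodes -av | grep resources_available.file      # not set yet

# Steps 6-8: after the job finishes
pbsnodes -av | grep resources_available.file      # expect file = 1gb
qstat -fx "$JOB" | grep -A1 Variable_List         # expect BONJOUR=Mounsieur Shlomi appended
grep EXECJOB_EPILOGUE $PBS_HOME/mom_logs/<date>   # plus the other messages above, on both nodes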

Cleanup

Delete the hook and reset to default configuration


  • Creating a Hook as Ordinary User Throws Error

Test
 

  1. As an unprivileged user, attempt to create a hook with each of the following events:
    1. execjob_begin
    2. execjob_prologue
    3. execjob_epilogue
    4. execjob_preterm
    5. execjob_end
    6. exechost_periodic

Expected Result

All of the above commands fail with the error "(user)@(fqdn hostname) is unauthorized to access hooks data from server" (see the command sketch below).
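
A minimal sketch for the first event, run as an ordinary (non-root, non-manager) user; the hook name h1 is an assumption, and the same command is repeated with each event in the list:

qmgr -c "create hook h1 event=execjob_begin"
# Expected: qmgr rejects the request with the "unauthorized to access hooks
# data from server" error and no hook is created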


  • PBS Failover Configuration with PBS_PUBLIC_HOSTNAME

Prerequisites

Six nodes, where node1 is the primary server and node2 is the secondary server with failover configured. Node3 and node6 are MoM-only. Node4 and node5 are comm+MoM. Set the following values in pbs.conf on each node:

Node1 - Primary server node with comm
PBS_PUBLIC_HOST_NAME=node1
PBS_LEAF_NAME=node1

Node2 - Secondary server node with comm
PBS_PUBLIC_HOST_NAME=node2
PBS_LEAF_NAME=node2

Node3 - Mom only
PBS_PRIMARY=node1
PBS_SECONDARY=node2
PBS_LEAF_NAME=node3
PBS_LEAF_ROUTERS=node1,node2,node4,node5

Node4 - Comm + Mom
PBS_PRIMARY=node1
PBS_SECONDARY=node2
PBS_LEAF_NAME=node4
PBS_COMM_ROUTERS=node1,node2


Node5 - Comm + Mom
PBS_LEAF_NAME=node5
PBS_LEAF_ROUTERS=node4
PBS_COMM_ROUTERS=node1,node2,node4
PBS_PRIMARY=node1
PBS_SECONDARY=node2


Node6 - Mom only
PBS_PRIMARY=node1
PBS_SECONDARY=node2
PBS_LEAF_NAME=node6
PBS_LEAF_ROUTERS=node1,node2,node4,node5

Test
 

  1. Submit 3 jobs, each asking for 2 chunks of 1 ncpus and place=scatter (see the command sketch after this list)
  2. Verify that all jobs are running
  3. Bring the primary down and wait for the secondary to take over (approx. 30s)
  4. Delete a job; the other 2 jobs continue to run
  5. Bring the primary up again
  6. Verify that the 2 jobs are still running
  7. Delete another job; now only one job is running
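
A minimal command sketch of steps 1-3, run from a client; the sleep length and the use of plain qterm to stop the primary are assumptions:

# Step 1: three jobs, each 2 chunks of 1 ncpus, scatter placement
for i in 1 2 3; do
    qsub -l select=2:ncpus=1 -l place=scatter -- /bin/sleep 600
done

# Step 2: all three jobs should be running
qstat -a

# Step 3: on node1, stop the primary server; the secondary takes over in ~30s
qterm
sleep 30
qstat -a          # now served by the secondary on node2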


  • Verify that Jobs Run Fine After Upgrade

Prerequisites

  • Install an old version of PBS
  • Create 2 queues
  • Create a queuejob hook and import queuejob.py
  • Create an execjob_epilogue hook and import epi.py
  • Add the following custom resources (see the command sketch after this list):
    • A type=float
    • B type=string_array
  • Set resources_available.A=4.5 and resources_available.B="AA,BB,CC" on the server
  • Add resources A and B to sched_config
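
A minimal command sketch of the custom-resource and hook prerequisites on the old installation; the hook names qjhook and epihook and the script paths are assumptions:

# Custom resources
qmgr -c "create resource A type=float"
qmgr -c "create resource B type=string_array"
qmgr -c "set server resources_available.A = 4.5"
qmgr -c 'set server resources_available.B = "AA,BB,CC"'

# Hooks
qmgr -c "create hook qjhook event=queuejob"
qmgr -c "import hook qjhook application/x-python default /path/to/queuejob.py"
qmgr -c "create hook epihook event=execjob_epilogue"
qmgr -c "import hook epihook application/x-python default /path/to/epi.py"

# Add A and B to the "resources:" line in $PBS_HOME/sched_priv/sched_config,
# then HUP the scheduler so it rereads the file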

Test

  1. Submit a few jobs asking for resources A and B
  2. Verify that some are queued and some are running, depending on the number of ncpus
  3. Upgrade to the new version of PBS following the steps from the upgrade guide. Requeue the jobs.
  4. After the upgrade, verify that the jobs from the old server continue to run
  5. Verify that the hooks have executed by looking at the server_log and mom_log
  6. Wait for the jobs to finish and submit new jobs to the cluster
  7. Verify that the new jobs are running/queued successfully


  • Job Continues to Run During Failover

Prerequisites

Configure failover between node1 and node2 where node1 is primary and node2 is secondary.

Test

  1. Submit a long job and verify that it is running
  2. Bring the primary down using qterm
  3. Wait till the secondary takes over (approx. 30s)
  4. Make sure the job continues to run on the secondary
  5. Delete the job and submit another job with a walltime of 120s
  6. Bring the primary back up
  7. Verify that the second job is still running
  8. Wait for the second job to finish (approx. 2 min) and make sure its exit status is 0 (see the command sketch below)
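
A minimal command sketch of steps 1-5 and 8; the sleep lengths are assumptions, and reading the exit status of a finished job assumes job_history_enable is set:

# Steps 1-4: long job, then bring the primary down
JOB1=$(qsub -l walltime=01:00:00 -- /bin/sleep 3600)
qstat -f "$JOB1" | grep job_state      # expect R
qterm                                  # run on the primary server host
sleep 30
qstat -f "$JOB1" | grep job_state      # answered by the secondary; still R

# Step 5: delete the first job and submit a 120-second job
qdel "$JOB1"
JOB2=$(qsub -l walltime=00:02:00 -- /bin/sleep 120)

# Step 8: after the second job finishes
qstat -fx "$JOB2" | grep Exit_status   # expect Exit_status = 0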


  • When a Job Is Not Running, the Job Comment Gives the Reason

Prerequisites

2 ncpus set on execution host

Test

  1. Submit 1 job with 1 ncpus and place=excl and verify that it is running
  2. Also verify that the node state is job-exclusive
  3. Submit another 1-ncpu job
  4. Verify that the job is queued with the job comment below (see the command sketch after the expected result)

Expected Result

For PBS <= 10.x: "not enough free vnodes available"

For PBS > 10.x: "cannot run job: Insufficient amount of resource: ncpus (R:1 A:0 T:2)"
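
A minimal command sketch; the sleep length is an assumption and the expected comment depends on the PBS version as noted above:

# Steps 1-2: exclusive job, then check the node state
JOB1=$(qsub -l select=1:ncpus=1 -l place=excl -- /bin/sleep 300)
qstat -f "$JOB1" | grep job_state                 # expect R
pbsnodes -av | grep "state ="                     # expect state = job-exclusive

# Steps 3-4: second job stays queued with the version-specific comment
JOB2=$(qsub -l select=1:ncpus=1 -- /bin/sleep 300)
qstat -f "$JOB2" | grep -E "job_state|comment"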


Questions

Do we have a test suite tool for returning the system to its vanilla state?

What's the relationship between a "test case", a PTL test, and a test suite?


