Versions Compared

Key

  • This line was added.
  • This line was removed.
  • Formatting was changed.
Page Properties
Target release1.0
Epic

Jira Legacy
serverJIRA (pbspro.atlassian.net)
serverId32008a99-7831-3ff8-9638-3db0cd01164d
keyPP-624

Document status
Status
titleDRAFT
Document owner
Designer

Ram Pranesh

Suresh Thelkar (Deactivated)

Arun Grover (Deactivated)

Vinod (Deactivated)

Developers

Ram Pranesh

Suresh Thelkar (Deactivated)

Arun Grover (Deactivated)

Vinod (Deactivated)

QA<TBD>

Goals

Implementation of DRMAAv2 Specification

Background and strategic fit

The Distributed Resource Management Application API Version 2 (DRMAA) specification defines an interface for tightly coupled, but still portable access to the majority of DRM systems. PBS is Distributed Resource Management system(DRMS).  Standardized access to PBS product through a DRMAAv2 implementation(API). High-level API designers, meta-scheduler architects and end users can rely on DRMAAv2 implementations for a unified access to execution resources.

DRMAAv2 Specification (Distributed Resource Management Application API 2)

Overview

As we all know open standards are taking major role in the exponential growth in many areas. DRMAA2 is such an open standard. It is the successor of the wide-spread DRMAA. DRMAA is generally used for submitting jobs (or creating job work-flows) into a compute cluster by using a cluster resource management system like PBS Professional. Unlike DRMAA with DRMAA2 you can not only submit jobs, you can also get cluster related information, like getting host names, types and status information or insight about queues configured in the resource management system. You can also monitor jobs not submitted within the application. Overall it covers many more use cases than the old DRMAA standard. 

...

https://www.ogf.org/documents/GFD.231.pdf

Assumptions


Requirements

The implementation of DRMAA2 specification for PBS Pro is broadly divided into the following requirements/user stories.

#User StoryImportanceNotes
1

Jira Legacy
serverJIRA (pbspro.atlassian.net)
columnskey,summary,type,created,updated,due,assignee,reporter,priority,status,resolution
serverId32008a99-7831-3ff8-9638-3db0cd01164d
keyPP-628


2

Jira Legacy
serverJIRA (pbspro.atlassian.net)
columnskey,summary,type,created,updated,due,assignee,reporter,priority,status,resolution
serverId32008a99-7831-3ff8-9638-3db0cd01164d
keyPP-634



3

Jira Legacy
serverJIRA (pbspro.atlassian.net)
columnskey,summary,type,created,updated,due,assignee,reporter,priority,status,resolution
serverId32008a99-7831-3ff8-9638-3db0cd01164d
keyPP-631



4

Jira Legacy
serverJIRA (pbspro.atlassian.net)
columnskey,summary,type,created,updated,due,assignee,reporter,priority,status,resolution
serverId32008a99-7831-3ff8-9638-3db0cd01164d
keyPP-630



5

Jira Legacy
serverJIRA (pbspro.atlassian.net)
columnskey,summary,type,created,updated,due,assignee,reporter,priority,status,resolution
serverId32008a99-7831-3ff8-9638-3db0cd01164d
keyPP-629



6

Jira Legacy
serverJIRA (pbspro.atlassian.net)
columnskey,summary,type,created,updated,due,assignee,reporter,priority,status,resolution
serverId32008a99-7831-3ff8-9638-3db0cd01164d
keyPP-632



7

Jira Legacy
serverJIRA (pbspro.atlassian.net)
columnskey,summary,type,created,updated,due,assignee,reporter,priority,status,resolution
serverId32008a99-7831-3ff8-9638-3db0cd01164d
keyPP-633



8

Jira Legacy
serverJIRA (pbspro.atlassian.net)
columnskey,summary,type,created,updated,due,assignee,reporter,priority,status,resolution
serverId32008a99-7831-3ff8-9638-3db0cd01164d
keyPP-643



9

Jira Legacy
serverJIRA (pbspro.atlassian.net)
columnskey,summary,type,created,updated,due,assignee,reporter,priority,status,resolution
serverId32008a99-7831-3ff8-9638-3db0cd01164d
keyPP-626



10

Jira Legacy
serverJIRA (pbspro.atlassian.net)
columnskey,summary,type,created,updated,due,assignee,reporter,priority,status,resolution
serverId32008a99-7831-3ff8-9638-3db0cd01164d
keyPP-644



11

Jira Legacy
serverJIRA (pbspro.atlassian.net)
columnskey,summary,type,created,updated,due,assignee,reporter,priority,status,resolution
serverId32008a99-7831-3ff8-9638-3db0cd01164d
keyPP-636



Architecture and design

Application stack



DRMAA2 Layout

PDF
nameDRMAAv2_Layout.pdf

Domain model diagram

Image Modified


ConnectionPool is set active connection with PBSPro. In multi-threaded application each thread uses one connection. ConnectionPool Is Member of DRMSystem

Overview of Object mapping


PDF
nameDRMAA2Overview.pdf

...

Refer DRMAAv2 document for description of Objects

Few important Sequence diagram

 

Image Modified


Image Modified

Source Directory structure in github


├── ...

├──
pbspro


│   ├── src

│ ├──
├── lib

...


│ --
configure.ac Existing configure.ac is updated with DRMAAv2 configs
 

Directory structure in target

├── ...

├──
$(PBS_EXEC)
│   ├── lib

...

│   ├── include
│   ├── ├── drmaav2.h                                           

Dependent packages

PBSPro - libdrmaav2.so needs libpbs.so library for IFL calls

...

doxygen - for generating docs



Compiling DRMAA2 applications for PBS Pro

It is assumed that libdrmaa2.so or libdrmaa2.a is installed as follows prior to compiling a DRMAA2 application.

...

export LD_LIBRARY_PATH=/usr/lib/libdrmaa2.so ./<drmaa2_app>

Sample Example

Code Block
languagecpp
themeEclipse
titleExample1: DRMAA2 Simple Application
linenumberstrue
 	drmaa2_jsession js; /* Job session object*/		
    drmaa2_jtemplate jt; /* Job session template*/
    drmaa2_j j;  /* Job object*/
    ...
    ...
	...
    /* Create and open job session */
    js = drmaa2_create_jsession("testsession", DRMAA2_UNSET_STRING);
    if (js == NULL) {
   		 ...
       return;
    }
    ...
    ...
	...

    /* Create job template */
    jt = drmaa2_jtemplate_create();
    jt->remoteCommand = strdup("/bin/date");
    j = drmaa2_jsession_run_job(js, jt);

    ...
	...
    drmaa2_j_wait_terminated(j, DRMAA2_INFINITE_TIME);
    drmaa2_jtemplate_free(&jt);
    drmaa2_destroy_jsession("testsession");
    ...
    ...
	...
    drmaa2_j_free(&j);
    drmaa2_jsession_free(&js);
    ...
    ...
	...
Code Block
languagecpp
themeEclipse
titleExample2: DRMAA2 Advance Application
linenumberstrue
 	...
    ...
	...
    drmaa2_jtemplate        jt = drmaa2_jtemplate_create(); /* job template object */
    drmaa2_rtemplate        rt = drmaa2_rtemplate_create(); /* reservation template object */
    drmaa2_string_list      cl = drmaa2_list_create(DRMAA2_STRINGLIST, NULL); / *create list of strings */
    drmaa2_dict            env = drmaa2_dict_create(NULL); /* create a dictionary */

    drmaa2_jsession js = drmaa2_create_jsession("myjsession", NULL); /* open sessions to DRM system */
    if (js == NULL) {
        ...
        ...
        return;
    }
    drmaa2_rsession rs = drmaa2_create_rsession("myrsession", NULL); /* create a reservation session */
    if (rs == NULL)
    {
        ...
        ...
        return;
    }
    drmaa2_msession ms = drmaa2_open_msession(NULL); 					 /*create and open monitoring session */

    ml = drmaa2_msession_get_all_machines(ms, DRMAA2_UNSET_LIST);        /* determine name of first machine */
    if (drmaa2_list_size(ml) < 1) {
    	...
        return;
    }
    m = (drmaa2_machineinfo)drmaa2_list_get(ml, 0);
    drmaa2_list_add(cl, m->name);

    rt->maxSlots = 4;                                                    /* perform advance reservation */
    ...
    rt->machineOS=DRMAA2_LINUX;
    rt->candidateMachines = cl;
    r = drmaa2_rsession_request_reservation(rs, rt);

    jt->remoteCommand = strdup("/bin/date");                             /* submit job */
    jt->reservationId = drmaa2_r_get_id(r);
    drmaa2_dict_set(env, "PBS_SCP", "/usr/bin/scp");
    jt->jobEnvironment = env;
    j = drmaa2_jsession_run_job(js, jt);

    drmaa2_j_wait_terminated(j, DRMAA2_INFINITE_TIME);                  /* Wait for termination and print exit status */
    ji = drmaa2_j_get_info(j);
    ...
	...
	...

    /* close sessions, cleanup */
    drmaa2_jtemplate_free(&jt);
    drmaa2_rtemplate_free(&rt);
    drmaa2_jinfo_free(&ji);
    drmaa2_j_free(&j);
    drmaa2_r_free(&r);

    drmaa2_close_msession(ms);
    drmaa2_close_rsession(rs);
    drmaa2_close_jsession(js);
    drmaa2_msession_free(&ms);
    drmaa2_rsession_free(&rs);
    drmaa2_jsession_free(&js);

    ...
    ...

Questions

Question

Outcome


Not Doing


References