Saturday, July 25, 2015

Introducing diksha -- An AWS Lambda Function Scheduler


diksha is a scalable scheduler that can be used to schedule AWS Lambda functions. It is available here.

The 30 second pitch

1. diksha schedules in a cron-like manner.
2. diksha lets the end user optionally specify the number of executions and the start and end time of those executions
3. diksha plays nicely with the security model of AWS
4. diksha scales on the cloud.
5. diksha is command line driven
6. diksha is open sourced under the friendly Apache 2.0 License.
7. diksha just requires Java 7 and a couple of jars...everything else is on the cloud

Sounds interesting? Read on...

The two minute tour


diksha has two components: a server side and a client side. The client side is command-line driven.

A quintessential use of the command line is scheduling a function (Lambda) to be executed.

java -jar diksha-client-<SNAPSHOT>.jar  -cf "cool|L|arn:aws:lambda:us-east-1:123456789012:function:echocool"
creates an alias called "cool" for the Lambda function arn:aws:lambda:us-east-1:123456789012:function:echocool


java -jar diksha-client-<SNAPSHOT>.jar -ef "cool|somecontext|0 */1 * * * *|10"

implies

execute the Lambda ("L") function echocool (arn:aws:lambda:us-east-1:123456789012:function:echocool), passing the context ("somecontext"), every minute (cron expression: 0 */1 * * * *), 10 times

More expressive

"cool|somecontext|0 */1 * * * *|10|25.07.2015T14:32:00-0700|25.07.2015T14:35:00-0700"



Function Alias    Alias of the Lambda function to execute
Context           Context to be passed to the Lambda function
CronExpression    Defines how often the function should be executed
RepeatTimes       How many times the function should be called
StartTime         When execution of the function should start
EndTime           When execution of the function should automatically stop
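To make the field order concrete, here is a minimal sketch that splits such an argument into its six parts. ExecutionSpec and its parse method are hypothetical names used for illustration and are not part of diksha. The sketch also assumes that only the trailing fields (RepeatTimes, StartTime, EndTime) may be left out.

```java
// Hypothetical holder for the pipe-separated -ef argument; not a class from diksha itself.
final class ExecutionSpec {
    final String functionAlias;
    final String context;
    final String cronExpression;
    final int repeatTimes;   // -1 when omitted (a convention of this sketch only)
    final String startTime;  // null when omitted
    final String endTime;    // null when omitted

    private ExecutionSpec(String functionAlias, String context, String cronExpression,
                          int repeatTimes, String startTime, String endTime) {
        this.functionAlias = functionAlias;
        this.context = context;
        this.cronExpression = cronExpression;
        this.repeatTimes = repeatTimes;
        this.startTime = startTime;
        this.endTime = endTime;
    }

    // Splits on '|' and maps the pieces onto the fields in the order listed above.
    static ExecutionSpec parse(String spec) {
        String[] parts = spec.split("\\|", -1);
        if (parts.length < 3 || parts.length > 6) {
            throw new IllegalArgumentException("expected 3 to 6 '|'-separated fields: " + spec);
        }
        int repeat = parts.length > 3 ? Integer.parseInt(parts[3].trim()) : -1;
        String start = parts.length > 4 ? parts[4].trim() : null;
        String end = parts.length > 5 ? parts[5].trim() : null;
        return new ExecutionSpec(parts[0].trim(), parts[1], parts[2].trim(), repeat, start, end);
    }

    public static void main(String[] args) {
        ExecutionSpec s = parse("cool|somecontext|0 */1 * * * *|10"
                + "|25.07.2015T14:32:00-0700|25.07.2015T14:35:00-0700");
        System.out.println(s.functionAlias + " / " + s.cronExpression + " / " + s.repeatTimes
                + " / " + s.startTime + " / " + s.endTime);
    }
}
```

Keeping StartTime and EndTime as plain strings here is deliberate: the sketch only shows the layout of the argument, not how diksha interprets the values.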


As this example shows, RepeatTimes can conflict with the combination of StartTime and EndTime because of the CronExpression. The CronExpression says "run this every minute". RepeatTimes says "do this 10 times". Under normal circumstances, therefore, this function needs to run for at least 10 minutes (in practice a little longer). However, the difference between StartTime and EndTime is only 3 minutes, which is not enough time for the function to run the full 10 cycles. diksha will terminate the function at EndTime. Please note that StartTime and EndTime must exactly match the SimpleDateFormat pattern dd.MM.yyyy'T'HH:mm:ssZ.
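The arithmetic behind this conflict can be checked before submitting a schedule. The following is a rough sketch, not diksha's own logic: ScheduleWindow and its methods are hypothetical names, and the interval between runs is passed in by hand instead of being derived from the CronExpression.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Locale;

// Hypothetical helper for checking a StartTime/EndTime window; not part of diksha.
final class ScheduleWindow {
    // Pattern quoted in the post; Z accepts RFC 822 offsets such as -0700.
    static final String PATTERN = "dd.MM.yyyy'T'HH:mm:ssZ";

    // Parses a timestamp strictly and returns epoch milliseconds.
    static long toMillis(String text) {
        SimpleDateFormat format = new SimpleDateFormat(PATTERN, Locale.US);
        format.setLenient(false); // reject impossible dates instead of rolling them over
        try {
            return format.parse(text).getTime();
        } catch (ParseException e) {
            throw new IllegalArgumentException("not in " + PATTERN + " form: " + text, e);
        }
    }

    // Whole minutes between start and end.
    static long windowMinutes(String start, String end) {
        return (toMillis(end) - toMillis(start)) / 60000L;
    }

    // True when repeatTimes runs, spaced intervalMinutes apart, need more time than the window offers.
    static boolean windowTooShort(String start, String end, int repeatTimes, int intervalMinutes) {
        return (long) repeatTimes * intervalMinutes > windowMinutes(start, end);
    }

    public static void main(String[] args) {
        String start = "25.07.2015T14:32:00-0700";
        String end = "25.07.2015T14:35:00-0700";
        System.out.println("window in minutes: " + windowMinutes(start, end));
        System.out.println("too short for 10 runs: " + windowTooShort(start, end, 10, 1));
    }
}
```

For the example above the window is 3 minutes while 10 runs at one per minute need about 10, so the check reports a conflict and EndTime is what ends the schedule.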

More Details 

The execution of the command line gives an executionId associated with the schedule such as "81cad398-feb0-4e74-95dd-40101ea33ca7".

To see into the execution of a schedule, use --list-status-execution <executionId> or -lse <executionId>

clientID : 0a2a5de3-599f-4dd8-b4b9-303176e36d09
     Launch Parameters
           Function: (arn:aws:lambda:us-east-1:123456789012:function:echocool) with context = somecontext
                 CronExpression  : 0 */2 * * * *
                 RepeatTimes     : 10
                 StartTimeDate   : null
                 EndTimeDate     : null
      Current State
            status of loop       : FINISH
            # of times executed  : 10
            Last Executed @      : Thu Jul 30 06:01:59 PDT 2015
            Next Proposed Time @ : Thu Jul 30 06:04:00 PDT 2015

                  ActivityTaskCompleted       Thu Jul 30 05:44:01 PDT 2015    
                  ActivityTaskCompleted       Thu Jul 30 05:46:00 PDT 2015    
                  ActivityTaskCompleted       Thu Jul 30 05:48:02 PDT 2015    
                  ActivityTaskCompleted       Thu Jul 30 05:50:03 PDT 2015    
                  ActivityTaskCompleted       Thu Jul 30 05:52:00 PDT 2015    
                  ActivityTaskCompleted       Thu Jul 30 05:54:01 PDT 2015    
                  ActivityTaskCompleted       Thu Jul 30 05:56:01 PDT 2015    
                  ActivityTaskCompleted       Thu Jul 30 05:58:02 PDT 2015    
                  ActivityTaskCompleted       Thu Jul 30 06:00:01 PDT 2015    
                  ActivityTaskCompleted       Thu Jul 30 06:02:00 PDT 2015    




To cancel a current execution, use --cancel-execution <executionId>|reason or -cane <executionId>|reason

Cancellation is done by diksha on a best-effort basis and is NOT immediate. More details on the actual functioning of diksha will come later.


Some Sustainability Tips

While the previous two sections were essentially a whirlwind tour of diksha, this section goes further into how to use diksha sustainably.

Creating a function alias is done through --create-function or -cf

-cf "cool|L|arn:aws:lambda:us-east-1:123456789012:function:echocool" creates a friendly function alias called cool pointing to the actual Lambda function.

-cf "supercool|L|arn:aws:lambda:us-east-1:123456789012:function:echocool" creates another function alias called supercool
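Since an alias definition is just three pipe-separated values, it is easy to sanity-check one before handing it to -cf. The sketch below is illustrative only: FunctionAliasSpec is a hypothetical class, and the ARN prefix check is an assumption about what a reasonable Lambda ARN looks like, not a rule enforced by diksha.

```java
// Hypothetical parser for the alias|type|arn argument of -cf; not diksha's own code.
final class FunctionAliasSpec {
    final String alias;
    final String type;
    final String arn;

    private FunctionAliasSpec(String alias, String type, String arn) {
        this.alias = alias;
        this.type = type;
        this.arn = arn;
    }

    static FunctionAliasSpec parse(String spec) {
        String[] parts = spec.split("\\|", -1);
        if (parts.length != 3) {
            throw new IllegalArgumentException("expected alias|type|arn: " + spec);
        }
        String alias = parts[0].trim();
        String type = parts[1].trim();
        String arn = parts[2].trim();
        if (alias.isEmpty()) {
            throw new IllegalArgumentException("alias must not be empty");
        }
        // "L" is the only type shown in the post; it marks a Lambda function.
        if ("L".equals(type) && !arn.startsWith("arn:aws:lambda:")) {
            throw new IllegalArgumentException("not a Lambda ARN: " + arn);
        }
        return new FunctionAliasSpec(alias, type, arn);
    }

    public static void main(String[] args) {
        FunctionAliasSpec cool =
                parse("cool|L|arn:aws:lambda:us-east-1:123456789012:function:echocool");
        FunctionAliasSpec superCool =
                parse("supercool|L|arn:aws:lambda:us-east-1:123456789012:function:echocool");
        // Two different aliases may point at the same function.
        System.out.println(cool.alias + " and " + superCool.alias
                + " share a target: " + cool.arn.equals(superCool.arn));
    }
}
```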

Creating a predefined job is done through --create-job or -cj

 -cj "runcooljobeverymin|cool|contextmin|0 0-59 * * * *|2"

creates a job named runcooljobeverymin that runs the function alias cool with a context of contextmin every minute, 2 times
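This job writes the minute field as 0-59, while the earlier examples used */1; both mean "every minute". A small sketch shows why, assuming the six-field layout used throughout this post, with seconds first and minutes second. MinuteField is a hypothetical helper, and it handles only the *, a-b, single-value and /step forms, not comma-separated lists.

```java
import java.util.SortedSet;
import java.util.TreeSet;

// Hypothetical helper that expands the minute field of a six-field cron expression.
final class MinuteField {
    // Returns the minutes of the hour on which the expression fires.
    static SortedSet<Integer> minutes(String cronExpression) {
        String[] fields = cronExpression.trim().split("\\s+");
        if (fields.length != 6) {
            throw new IllegalArgumentException("expected 6 fields: " + cronExpression);
        }
        return expand(fields[1], 0, 59); // field 0 is seconds, field 1 is minutes
    }

    // Expands one field of the form "*", "a", "a-b", optionally followed by "/step".
    static SortedSet<Integer> expand(String field, int min, int max) {
        int step = 1;
        String range = field;
        int slash = field.indexOf('/');
        if (slash >= 0) {
            step = Integer.parseInt(field.substring(slash + 1));
            range = field.substring(0, slash);
        }
        int lo = min;
        int hi = max;
        if (!range.equals("*")) {
            int dash = range.indexOf('-');
            if (dash >= 0) {
                lo = Integer.parseInt(range.substring(0, dash));
                hi = Integer.parseInt(range.substring(dash + 1));
            } else {
                lo = Integer.parseInt(range);
                hi = slash >= 0 ? max : lo; // "5/10" runs from 5 to the end of the range
            }
        }
        if (step <= 0 || lo < min || hi > max || lo > hi) {
            throw new IllegalArgumentException("bad field: " + field);
        }
        SortedSet<Integer> result = new TreeSet<Integer>();
        for (int value = lo; value <= hi; value += step) {
            result.add(value);
        }
        return result;
    }

    public static void main(String[] args) {
        boolean same = minutes("0 0-59 * * * *").equals(minutes("0 */1 * * * *"));
        System.out.println("0-59 and */1 fire on the same minutes: " + same);
    }
}
```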

Running a predefined job is done through --execute-job or -ej

-ej "runcooljobeverymin"

A little bit of setup upfront saves a lot of downstream effort. There is currently no validation, so it is possible to overwrite both a function alias and a job with something completely different.


User Scalability


diksha is designed to handle many executions, each of which may run completely independently of the others in terms of its parameters. For example, you can have several executions running every hour, certain executions running every week, and others running monthly or quarterly.

-laes or --list-active-jobs lists all the active jobs on the scheduler

     CronExpression Loop Count   Next Scheduled Time                                     ExecutionId
       0 */3 * * * *       2     2015-08-01T03:18:00.000Z     1b7bba8a-8181-481d-86e0-aa6a35ab0da7
         0 0 * * * *       1     2015-08-01T04:00:00.000Z     287a2fe5-66a9-4acd-bbe1-57b6c143248a
       0 */1 * * * *       3     2015-08-01T03:16:00.000Z     74f0ff3b-c8f1-452f-8cac-fe0146af2e2e
       0 */5 * * * *       2     2015-08-01T03:20:00.000Z     ad49eed1-db56-42d4-986d-bb95500243d8
       0 */2 * * * *       2     2015-08-01T03:16:00.000Z     bfc3beb2-1b09-41ab-8191-eacdaa5d7e2c



After some time

      CronExpression Loop Count   Next Scheduled Time                                     ExecutionId
       0 */3 * * * *       4     2015-08-01T03:21:00.000Z     1b7bba8a-8181-481d-86e0-aa6a35ab0da7
         0 0 * * * *       1     2015-08-01T04:00:00.000Z     287a2fe5-66a9-4acd-bbe1-57b6c143248a
       0 */1 * * * *       6     2015-08-01T03:19:00.000Z     74f0ff3b-c8f1-452f-8cac-fe0146af2e2e
       0 */5 * * * *       2     2015-08-01T03:20:00.000Z     ad49eed1-db56-42d4-986d-bb95500243d8
       0 */2 * * * *       4     2015-08-01T03:20:00.000Z     bfc3beb2-1b09-41ab-8191-eacdaa5d7e2c
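Comparing two such listings by hand gets tedious as the number of executions grows. As a rough sketch, the rows can be parsed and the loop counts compared per ExecutionId. ActiveJobRow is a hypothetical class, and it assumes each row holds exactly a six-field cron expression followed by the loop count, the next scheduled time and the ExecutionId, as in the output above.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical parser for one row of the active-jobs listing; not part of diksha.
final class ActiveJobRow {
    final String cronExpression;
    final int loopCount;
    final String nextScheduledTime;
    final String executionId;

    private ActiveJobRow(String cronExpression, int loopCount,
                         String nextScheduledTime, String executionId) {
        this.cronExpression = cronExpression;
        this.loopCount = loopCount;
        this.nextScheduledTime = nextScheduledTime;
        this.executionId = executionId;
    }

    // Splits on whitespace: six cron tokens, then loop count, next time and id.
    static ActiveJobRow parse(String line) {
        String[] tokens = line.trim().split("\\s+");
        if (tokens.length != 9) {
            throw new IllegalArgumentException("expected 9 tokens: " + line);
        }
        StringBuilder cron = new StringBuilder();
        for (int i = 0; i < 6; i++) {
            if (i > 0) {
                cron.append(' ');
            }
            cron.append(tokens[i]);
        }
        return new ActiveJobRow(cron.toString(), Integer.parseInt(tokens[6]), tokens[7], tokens[8]);
    }

    // Loop-count increase per ExecutionId between an earlier and a later listing.
    static Map<String, Integer> progress(String[] before, String[] after) {
        Map<String, Integer> earlier = new LinkedHashMap<>();
        for (String line : before) {
            ActiveJobRow row = parse(line);
            earlier.put(row.executionId, row.loopCount);
        }
        Map<String, Integer> delta = new LinkedHashMap<>();
        for (String line : after) {
            ActiveJobRow row = parse(line);
            Integer old = earlier.get(row.executionId);
            if (old != null) {
                delta.put(row.executionId, row.loopCount - old);
            }
        }
        return delta;
    }

    public static void main(String[] args) {
        String[] before = {
            "0 */3 * * * *  2  2015-08-01T03:18:00.000Z  1b7bba8a-8181-481d-86e0-aa6a35ab0da7",
            "0 */1 * * * *  3  2015-08-01T03:16:00.000Z  74f0ff3b-c8f1-452f-8cac-fe0146af2e2e"
        };
        String[] after = {
            "0 */3 * * * *  4  2015-08-01T03:21:00.000Z  1b7bba8a-8181-481d-86e0-aa6a35ab0da7",
            "0 */1 * * * *  6  2015-08-01T03:19:00.000Z  74f0ff3b-c8f1-452f-8cac-fe0146af2e2e"
        };
        System.out.println(progress(before, after));
    }
}
```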