Batch Apex in Salesforce

Salesforce has introduced many declarative automation tools and features in the recent past, which have considerably reduced the need for Apex code. But to process a large number of records, developers still use the humble Batch Apex.

What is Batch Apex?

Batch Apex is used for processing a large number of records. The job runs asynchronously: once it is invoked, the records are split into batches and each batch is processed using the execution logic provided. Each batch is considered a separate Apex transaction, so governor limits are reset for every batch rather than applied to the job as a whole.

Batch Apex can handle large data volumes, allows staying within governor limits, and makes it possible to perform a rollback for only the failed batch of records.

Batch Apex is also used to recalculate Apex Managed Sharing. Normal sharing changes to the organization are recalculated automatically. However, Apex sharing changes need to be updated in bulk for all records, which is best done using batch Apex.

A quick look at the general syntax of Batch Apex

global class classname implements Database.Batchable<sObject> {

   global (Database.QueryLocator | Iterable<sObject>) start(Database.BatchableContext bc) {
      // collect the batches of records or objects to be passed to execute
   }

   global void execute(Database.BatchableContext bc, List<sObject> records){
      // process each batch of records
   }

   global void finish(Database.BatchableContext bc){
      // execute any post-processing operations
   }
}

A batch Apex class must implement the ‘Database.Batchable’ interface and include the following three methods: start, execute, and finish.

Start Method  

  • Defines the scope of the whole process - which records would be considered in the entire batch process
  • Returns either of the two types of objects - Database.QueryLocator or Iterable
  • QueryLocator is the most commonly used return type. It fetches records without pre-processing and can return up to 50 million records.
  • Iterable is helpful when a lot of initial logic has to be applied to the fetched dataset before actual processing. However, the governor limit on the total number of records retrieved by SOQL queries (50,000) still applies.
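
To make the difference concrete, here is a minimal sketch of a start method that returns an Iterable. The class name OpenDeliveryBatch is hypothetical, the Delivery__c object and its Status__c field are borrowed from the scenario later in this article, and the in-memory filter is only a stand-in for pre-processing that cannot be expressed in SOQL.

```apex
// Hypothetical example: start() returns an Iterable (a List) instead of a QueryLocator.
public class OpenDeliveryBatch implements Database.Batchable<sObject> {

    public Iterable<sObject> start(Database.BatchableContext bc) {
        List<sObject> candidates = new List<sObject>();
        // This query counts against the normal SOQL row limit, unlike a QueryLocator.
        for (Delivery__c d : [SELECT Id, Status__c FROM Delivery__c LIMIT 10000]) {
            // Stand-in for pre-processing logic applied before batching.
            if (d.Status__c != 'Delivered') {
                candidates.add(d);
            }
        }
        return candidates; // a List<sObject> is an Iterable<sObject>
    }

    public void execute(Database.BatchableContext bc, List<sObject> scope) {
        // Process one batch of the pre-filtered records here.
    }

    public void finish(Database.BatchableContext bc) {
        // No post-processing in this sketch.
    }
}
```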

Execute Method   

  • Processes the records from the ‘start’ method one batch at a time. The default batch size is 200 records; with a QueryLocator it can be set to anything up to 2,000 through the optional scope parameter of Database.executeBatch
  • Takes the parameter ‘Database.BatchableContext’ - useful to retrieve information on the batch Apex job
  • The second parameter contains the current batch of records (the scope) taken from what the ‘start’ method returned
  • Has no control over the order of execution of batches

Finish Method   

  • Performs post-processing actions - actions that need to be done once all the batches are processed

A real-world scenario to understand Batch Apex processing better

Cityworks is a package delivery company that has delivery stations in multiple states and delivers around 1000 packages every day. The company uses a Salesforce org to manage data about delivery agents and their delivery assignments. Delivery agents are stored in a custom object called ‘Service Agent’. A child object called 'Delivery' is used to store the deliveries assigned to each agent. A secondary Salesforce org is used to manage the delivery assignments of each agent. It exposes a REST-based service that returns new delivery data. At the end of each day, the status of all the new deliveries needs to be updated in the primary Salesforce org.

To meet this requirement, the high-level steps would be:

  1. Connect the external system (secondary Salesforce org) to the primary Salesforce org.
  2. Write a Batch Apex class that executes a callout to the external system to get the required data and update the primary org.
  3. Schedule the Batch Apex class to run every night.

This use case requires the use of batch Apex with some special features and interfaces, namely, the Database.Stateful interface, the Database.AllowsCallouts interface, and the Schedulable interface.

Connecting the external system to the primary Salesforce org

Getting the data from the external system (secondary org)

Since the secondary org provides a REST-based service that returns delivery data, it can be utilized by the primary org to get the required data at the end of each day.
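
The article does not show the service itself, so the following is only a sketch of what such an Apex REST service in the secondary org might look like. The class name DeliveryRestService and the "modified today" filter are assumptions; the URL mapping is inferred from the endpoint path used by the batch class in this article, and the selected fields match the sample response shown there.

```apex
// Hypothetical Apex REST service in the secondary org.
// It would be reachable at /services/apexrest/retrieveDeliveries.
@RestResource(urlMapping='/retrieveDeliveries')
global with sharing class DeliveryRestService {

    @HttpGet
    global static List<Delivery__c> getDeliveries() {
        // Assumption: "new delivery data" means deliveries modified today.
        return [
            SELECT Id, Name, Service_Agent__c, Status__c
            FROM Delivery__c
            WHERE LastModifiedDate = TODAY
        ];
    }
}
```

Returning a list of sObjects from an Apex REST method serializes each record to JSON along with an "attributes" element.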

Note: For the sake of convenience and simplicity, the external system in the scenario is a Salesforce org. The following steps show a simple way to connect to a Salesforce org.

Configuring the external system (secondary org)

Connected App:
The first step is to allow the primary org to connect to the secondary org. This can be done by creating a Connected App in the secondary org by navigating to Setup | Apps | App Manager | New Connected App.

Create a New Connected App

The auto-generated consumer key and secret in the connected app can be used in the primary org to connect to the secondary org.

Manage Connected Apps

Configuring the primary org

An ‘Authentication Provider’ and a ‘Named Credential’ can be set up in the primary org to connect to the secondary org. The Authentication Provider should use the consumer key and secret obtained from the connected app.

Set Up an Auth. Provider

Create a Named Credential

The Batch Apex

A Batch Apex class like the following can be created to execute the callout to the secondary org, retrieve the data, and update the records in the primary org.

global class BatchApexCalloutFOF implements Database.Batchable<sObject>, Database.AllowsCallouts, Database.Stateful {

    // External IDs returned by the secondary org, and the delivery status for each one
    global List<String> ExternalIds = new List<String>();
    global Map<String, String> UpdateDeliveryStatus = new Map<String, String>();
    String currentId;
    String currentStatus;
    String query;

    // Constructor: performs the callout and collects the IDs and statuses
    global BatchApexCalloutFOF() {
        Http http = new Http();
        HttpRequest request = new HttpRequest();
        // Set timeout to 1 minute to avoid a read timed out error (only if it appears)
        request.setTimeout(60000);
        request.setEndpoint('callout:Apex_Rest_Services_Test/services/apexrest/retrieveDeliveries');
        request.setMethod('GET');
        HttpResponse response = http.send(request);
        // Follow redirects, if any
        while (response.getStatusCode() == 302) {
            request.setEndpoint(response.getHeader('Location'));
            response = new Http().send(request);
        }
        // Log the raw response (the status code is not checked before parsing)
        System.debug(response.getBody());
        // Walk the JSON tokens: collect each external ID and map it to its delivery status
        JSONParser parser = JSON.createParser(response.getBody());
        while (parser.nextToken() != null) {
            if (parser.getCurrentToken() == JSONToken.FIELD_NAME) {
                if (parser.getText() == 'Id') {
                    parser.nextToken();
                    currentId = parser.getText();
                    ExternalIds.add(currentId);
                } else if (parser.getText() == 'Status__c') {
                    parser.nextToken();
                    currentStatus = parser.getText();
                    UpdateDeliveryStatus.put(currentId, currentStatus);
                }
            }
        }
        // Verify that the data comes through during runtime
        System.debug('JSON External IDs list: ' + ExternalIds);
        System.debug('The map:' + UpdateDeliveryStatus);
    }

    global Database.QueryLocator start(Database.BatchableContext BC) {
        System.debug('Inside the Start statement');
        // Create the query with the ExternalIds list to limit the scope
        query = 'Select Id, Name, External_ID__c, Service_Agent__c, Status__c from Delivery__c where External_ID__c in :ExternalIds ';
        return Database.getQueryLocator(query);
    }

    global void execute(Database.BatchableContext BC, List<Delivery__c> scope) {
        List<Delivery__c> delivs = new List<Delivery__c>();
        // Verify the start of the process during runtime
        System.debug('Inside the Execute statement');
        System.debug('The map:' + UpdateDeliveryStatus);
        // Loop through the records in scope, batch-wise
        for (Delivery__c s : scope) {
            System.debug('The current record in loop' + s.External_ID__c);
            s.Status__c = UpdateDeliveryStatus.get(s.External_ID__c);
            delivs.add(s);
        }
        update delivs;
    }

    global void finish(Database.BatchableContext BC) {
        // Notify the administrator once all batches are processed
        Messaging.SingleEmailMessage mail = new Messaging.SingleEmailMessage();
        mail.setToAddresses(new String[] {'admin@fof.com'});
        mail.setReplyTo('admin@fof.com');
        mail.setSenderDisplayName('Batch Process');
        mail.setSubject('Delivery Statuses updated successfully');
        mail.setPlainTextBody('Batch Process has completed.');
        Messaging.sendEmail(new Messaging.SingleEmailMessage[] { mail });
    }
}

Note: This code does not handle exceptions.
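
As one possible starting point for error handling, the sketch below replaces a plain update with Database.update in partial-success mode, so that one bad record does not roll back the rest of its batch. The class and method names are hypothetical, and logging with System.debug is only a placeholder for whatever error reporting the org actually uses.

```apex
// Hypothetical helper that could be called from execute() instead of "update delivs;".
public class DeliveryUpdater {

    public static void updateAllowingPartialSuccess(List<Delivery__c> records) {
        // allOrNone = false: valid records are saved even if others fail.
        Database.SaveResult[] results = Database.update(records, false);

        // Results are returned in the same order as the input list.
        for (Integer i = 0; i < results.size(); i++) {
            if (!results[i].isSuccess()) {
                for (Database.Error err : results[i].getErrors()) {
                    System.debug(LoggingLevel.ERROR,
                        'Delivery ' + records[i].Id + ' failed: '
                        + err.getStatusCode() + ' - ' + err.getMessage());
                }
            }
        }
    }
}
```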

The overall idea behind using the batch Apex code here is to perform a REST callout to the external system to ‘GET’ the data that is required in the form of a JSON response, parse the response, store the external IDs and delivery statuses in variables, retrieve the records to be updated, and then update those records in batches.

Here’s a sample response from the external system:

[
{
"attributes": {
"type": "Delivery__c",
"url": "/services/data/v48.0/sobjects/Delivery__c/a032v00006DbH3oAAF"
},
"Id": "a032v00006DbH3oAAF",
"Name": "D-0000002",
"Service_Agent__c": "a022v00001yJFjzAAG",
"Status__c": "Cancelled"
},
{
"attributes": {
"type": "Delivery__c",
"url": "/services/data/v48.0/sobjects/Delivery__c/a032v00006DbH3jAAF"
},
"Id": "a032v00006DbH3jAAF",
"Name": "D-0000001",
"Service_Agent__c": "a022v00001yJFjzAAG",
"Status__c": "Delivered"
},
{
"attributes": {
"type": "Delivery__c",
"url": "/services/data/v48.0/sobjects/Delivery__c/a032v00006DbH3tAAF"
},
"Id": "a032v00006DbH3tAAF",
"Name": "D-0000003",
"Service_Agent__c": "a022v00001yJFk4AAG",
"Status__c": "Delivered"
}
]

The purpose of the code here is to update only the records for which there is an update from the external system. Since there would be a large set of records already in the system, it is necessary to process only the new deliveries, i.e., around 1000 records per day.

Some important considerations are as follows:

  • Since the code is performing a callout, the interface ‘Database.AllowsCallouts’ needs to be implemented.
  • Before the Start method defines the scope of the process, the callout is made in the constructor method. 
  • As the request is to get the data from the external system, the ‘GET’ http method is used here. The response is in JSON format. 
  • With the response received, the external ids and the respective delivery statuses are collected using JSON Parser methods. This collection of external ids is used in the Start method to return only the records that need to be updated.
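  • The class implements the ‘Database.Stateful’ interface, which preserves the values of member variables across the separate transactions of a job. Strictly speaking, values assigned in the constructor are already available in the start and execute methods without it, because the instance is saved when the job is submitted. ‘Database.Stateful’ becomes necessary when the start or execute method changes a member variable (a running counter, for example) and later transactions need to see the change; without it, member variables are reset to their original values for each transaction.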
  • The query is constructed in the start method using the ExternalIds list and returned in the form of a QueryLocator object.
  • The execute method processes the records obtained from the query in batches. The map helps to update the relevant record easily, thereby reducing the number of code lines.
  • The finish method is used to provide an update to the administrator (or any other relevant user) through an email once all batches of records are updated.
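
One caveat about making the callout in the constructor: the constructor runs in the transaction of whatever code creates the batch instance, and Salesforce does not support synchronous callouts from scheduled Apex or from triggers. A possible variant, sketched below under that assumption, moves the callout into the start method, where a class implementing ‘Database.AllowsCallouts’ may make it. In this variant the map is filled in the start method, so ‘Database.Stateful’ is what carries it over to the execute transactions. The class name is hypothetical, and JSON.deserializeUntyped is swapped in for the token-by-token parser purely to keep the sketch short.

```apex
// Hypothetical variant of BatchApexCalloutFOF: the callout happens in start().
global class BatchApexCalloutInStartFOF implements Database.Batchable<sObject>,
        Database.AllowsCallouts, Database.Stateful {

    // Filled in start(); Database.Stateful keeps it for the execute() transactions.
    global Map<String, String> updateDeliveryStatus = new Map<String, String>();

    global Database.QueryLocator start(Database.BatchableContext bc) {
        HttpRequest request = new HttpRequest();
        request.setEndpoint('callout:Apex_Rest_Services_Test/services/apexrest/retrieveDeliveries');
        request.setMethod('GET');
        HttpResponse response = new Http().send(request);

        // The response is a JSON array of records; map each Id to its Status__c.
        for (Object row : (List<Object>) JSON.deserializeUntyped(response.getBody())) {
            Map<String, Object> fields = (Map<String, Object>) row;
            updateDeliveryStatus.put((String) fields.get('Id'), (String) fields.get('Status__c'));
        }

        Set<String> externalIds = updateDeliveryStatus.keySet();
        return Database.getQueryLocator([
            SELECT Id, External_ID__c, Status__c
            FROM Delivery__c
            WHERE External_ID__c IN :externalIds
        ]);
    }

    global void execute(Database.BatchableContext bc, List<Delivery__c> scope) {
        for (Delivery__c d : scope) {
            d.Status__c = updateDeliveryStatus.get(d.External_ID__c);
        }
        update scope;
    }

    global void finish(Database.BatchableContext bc) {
        // Post-processing, such as the notification email, would go here.
    }
}
```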

Invoking the batch Apex class

For the scenario, the batch Apex class can be scheduled to run at a particular time, ideally at a time of day when no more updates are expected.

The code to run the batch Apex job is placed within another Apex class that implements the ‘Schedulable’ interface. This class can then be scheduled to run using the Apex Scheduler in Setup.

Code to schedule the batch Apex job:

global class ScheduleDeliveryUpdateFOF implements Schedulable {
   global void execute(SchedulableContext sc) {
      BatchApexCalloutFOF quickupdate = new BatchApexCalloutFOF();
      Database.executeBatch(quickupdate, 100);
   }
}
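
Besides the Setup page, the schedulable class can also be scheduled from code with System.schedule, for example from the Execute Anonymous window. The snippet below is a small sketch of that; the job name and the 11:00 PM run time are arbitrary choices for illustration.

```apex
// Cron format: Seconds Minutes Hours Day_of_month Month Day_of_week
// '0 0 23 * * ?' means every day at 11:00 PM.
String cronExpression = '0 0 23 * * ?';
String jobId = System.schedule(
    'Nightly delivery status update',   // hypothetical job name
    cronExpression,
    new ScheduleDeliveryUpdateFOF()
);

// Optional check: look up the scheduled job and its next run time.
CronTrigger ct = [SELECT CronExpression, NextFireTime FROM CronTrigger WHERE Id = :jobId];
System.debug('Next run: ' + ct.NextFireTime);
```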

Schedule the Batch Apex Class

However, there are other ways to invoke a batch Apex class:

1. A batch Apex class can be invoked using the ‘Database.executeBatch’ method in the Execute Anonymous Apex window in the Developer Console.

BatchApexCalloutFOF quickupdate = new BatchApexCalloutFOF();
Database.executeBatch(quickupdate, 100);
// 100 is the optional scope parameter: the maximum number of records per batch

Sample Log from Developer Console

2. Batch Apex can be invoked from an Apex trigger. Because a trigger can fire many times, it must take care not to submit more batch jobs than the platform limit on queued or active jobs allows.
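
One way to guard against this, sketched here with a hypothetical helper class, is to count the batch jobs that are currently queued or running and submit a new one only when there is room. The threshold of five comes from the limit mentioned in the best practices in this article; the helper would be called once per trigger invocation, not once per record.

```apex
// Hypothetical helper: submits a batch job only if fewer than five are queued or active.
public class BatchLauncher {

    public static Id runIfCapacity(Database.Batchable<sObject> job, Integer scopeSize) {
        Integer activeJobs = [
            SELECT COUNT()
            FROM AsyncApexJob
            WHERE JobType = 'BatchApex'
            AND Status IN ('Queued', 'Preparing', 'Processing')
        ];
        if (activeJobs >= 5) {
            return null; // no capacity right now; the caller decides what to do
        }
        return Database.executeBatch(job, scopeSize);
    }
}
```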

How can batch Apex be tested?

The batch Apex class can be tested by simply inserting some sample records in a test class and processing them using the batch class. Only one execution of the ‘execute’ method is possible in a test class. Since Apex test methods don’t support callout to an external system, a mock callout can be created to ‘mock’ the actual callout.
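
A test along these lines might look like the sketch below. It assumes that Service_Agent__c and Delivery__c have no other required fields and that ‘Delivered’ is a valid value for Status__c; the mock class and the external ID value are made up for the test. The test data is inserted before Test.startTest() and the batch instance is created after it, because the constructor performs the (mocked) callout.

```apex
@isTest
private class BatchApexCalloutFOFTest {

    // Mock that stands in for the secondary org's REST service.
    private class DeliveryServiceMock implements HttpCalloutMock {
        public HttpResponse respond(HttpRequest request) {
            HttpResponse response = new HttpResponse();
            response.setStatusCode(200);
            response.setHeader('Content-Type', 'application/json');
            response.setBody('[{"Id":"EXT-001","Name":"D-0000001","Status__c":"Delivered"}]');
            return response;
        }
    }

    @isTest
    static void updatesStatusFromCalloutResponse() {
        // Assumption: neither object has additional required fields.
        Service_Agent__c agent = new Service_Agent__c();
        insert agent;
        Delivery__c delivery = new Delivery__c(
            Service_Agent__c = agent.Id,
            External_ID__c = 'EXT-001'
        );
        insert delivery;

        Test.setMock(HttpCalloutMock.class, new DeliveryServiceMock());

        Test.startTest();
        // Fewer records than the scope size, so execute() runs exactly once.
        Database.executeBatch(new BatchApexCalloutFOF(), 100);
        Test.stopTest(); // the asynchronous batch job completes here

        Delivery__c updated = [SELECT Status__c FROM Delivery__c WHERE Id = :delivery.Id];
        System.assertEquals('Delivered', updated.Status__c);
    }
}
```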

How can batch Apex jobs be monitored?

1. The Apex Jobs page in Setup can be used to monitor batch Apex jobs.

Monitor Batch Apex Job

2. The link to the new batch jobs page can be used.

New Batch Jobs Page

The ‘More Info’ section can be clicked to view detailed information.

More Information on Batch Jobs

3. Batch Apex jobs can also be monitored programmatically. Within the batch class, the instance method ‘getJobId’ of ‘Database.BatchableContext’ returns the ID of the current batch job. This ID can be used to query ‘AsyncApexJob’ for additional information about the job. Outside the batch class, Database.executeBatch returns the ID of the job it submits, which can be used in the same way.

global void finish(Database.BatchableContext BC){
   AsyncApexJob a = [SELECT Id, Status, NumberOfErrors, JobItemsProcessed,
      TotalJobItems, CreatedBy.Email
      FROM AsyncApexJob WHERE Id =
      :BC.getJobId()];
      //further finish method logic
}

ID batchprocessid = Database.executeBatch(quickupdate);

AsyncApexJob apexj = [SELECT Id, Status, JobItemsProcessed, TotalJobItems, NumberOfErrors
   FROM AsyncApexJob WHERE ID =: batchprocessid ];

Chaining Batch Apex Jobs

Batch Apex jobs can be chained together; one job can start another job. This is possible by calling Database.executeBatch or System.scheduleBatch in the finish method of the current batch Apex class. The new batch job will commence after the existing batch job ends. This is handy when dealing with large volumes of data.
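
As an illustration, the sketch below chains two steps using a single hypothetical class that takes a step number: when step 1 finishes, its finish method submits step 2. The object API name Service_Agent__c is assumed from the scenario above, and the per-step processing is left empty.

```apex
// Hypothetical two-step chain: step 1 covers Delivery__c, step 2 covers Service_Agent__c.
public class ChainedBatchFOF implements Database.Batchable<sObject> {

    private final Integer step;

    public ChainedBatchFOF(Integer step) {
        this.step = step;
    }

    public Database.QueryLocator start(Database.BatchableContext bc) {
        String objectName = (step == 1) ? 'Delivery__c' : 'Service_Agent__c';
        return Database.getQueryLocator('SELECT Id FROM ' + objectName);
    }

    public void execute(Database.BatchableContext bc, List<sObject> scope) {
        // Per-step processing would go here.
    }

    public void finish(Database.BatchableContext bc) {
        // Start the next job only after this one has processed all of its batches.
        if (step == 1) {
            Database.executeBatch(new ChainedBatchFOF(2));
        }
    }
}
```

The chain is started with Database.executeBatch(new ChainedBatchFOF(1)).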

Best practices to follow while writing a batch class

  • Use efficient SOQL queries to minimize delay and stay within governor limits.
  • Execute Batch Apex jobs as quickly as possible by minimizing the callout times.
  • Use batch Apex only when there is a need to process more than 200 records at a time (consider the need to process in batches vs. normal Apex code)
  • Consider the number of batch jobs that would be waiting in the queue if any automation, such as a trigger, a Process Builder process, or a flow, invokes the batch class. Only five batch jobs can be queued or active at a time; additional jobs wait in the Apex flex queue, which holds up to 100 jobs.
  • Define all methods in the class as global or public.
  • Do not call or declare future methods in the batch Apex class.

Happy ‘batching’!

More information on Batch Apex
