Deleting records in AWS CloudSearch

Deleting All Records through the SDK

Managing documents is sucky in CloudSearch. The CloudSearch Management Console won’t let you see all of your documents; the best you can do is run some test searches and look at the responses. The best way I’ve found to delete all of your records is to build one big JSON batch in which every entry has the type ‘delete’ and the ‘id’ of the document to remove. As long as your record ids are sequential and you know the range you want to delete, you can build that batch programmatically with a for loop and then send it off to AWS CloudSearch to wipe those ids out. Luckily the endpoint doesn’t care whether an id exists or not, so with a big enough batch you can wipe out an entire range of documents without having to look them up. This is so much faster than deleting the domain and starting over.
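
Before the full script, here is a minimal sketch of what such a batch looks like once it is encoded. The two ids are made up for illustration, and they are written as strings on the assumption that your document ids are stored that way.

```php
<?php
// Illustrative only: two delete operations for the hypothetical ids "0" and "1".
$batch = [
    ['type' => 'delete', 'id' => '0'],
    ['type' => 'delete', 'id' => '1'],
];

$json = json_encode($batch);
echo $json, PHP_EOL;
// prints [{"type":"delete","id":"0"},{"type":"delete","id":"1"}]
```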

<?php
require 'library/aws-autoloader.php';

use Aws\CloudSearchDomain\CloudSearchDomainClient;

// Set the range of $i to the ids you want to delete. This is indiscriminate:
// every id in the range is deleted, SO BE CAREFUL!
$batch = array();
for ($i = 0; $i < 100; $i++) {
    $batch[] = array(
        'type' => 'delete',
        'id'   => (string) $i, // cast as a precaution: batch ids are strings
    );
}

// AWS_KEY, AWS_SECRET and CLOUDSEARCH_DOMAIN are constants defined elsewhere;
// CLOUDSEARCH_DOMAIN is assumed to hold the domain's document endpoint hostname.
$domainClient = CloudSearchDomainClient::factory(array(
    'credentials' => array(
        'key'    => AWS_KEY,
        'secret' => AWS_SECRET,
    ),
    'version'  => '2013-01-01',
    'endpoint' => 'https://' . CLOUDSEARCH_DOMAIN,
));

// Send the whole batch to the domain as one JSON upload.
$result = $domainClient->uploadDocuments(array(
    'documents'   => json_encode($batch),
    'contentType' => 'application/json',
));

var_dump($result);
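
A "big enough batch" still has a ceiling: as far as I know, CloudSearch caps a single document batch at 5 MB. For very large ranges, one option is to split the ids into several batches. The helper below is only a sketch of that idea. `buildDeleteBatches` and its `$maxBytes` parameter are names I made up (they are not part of the AWS SDK), and the 5,000,000-byte default is a deliberately conservative assumption.

```php
<?php
// Hypothetical helper, not part of the AWS SDK: turn the id range
// [$start, $end) into JSON delete batches that each stay under $maxBytes.
function buildDeleteBatches(int $start, int $end, int $maxBytes = 5000000): array
{
    $batches = [];
    $current = [];
    $size = 2; // the enclosing "[" and "]"

    for ($i = $start; $i < $end; $i++) {
        $doc = ['type' => 'delete', 'id' => (string) $i];
        // +1 for the separating comma; this slightly overestimates the size.
        $docSize = strlen(json_encode($doc)) + 1;

        // Close the current batch if adding this document would exceed the cap.
        if ($current && $size + $docSize > $maxBytes) {
            $batches[] = json_encode($current);
            $current = [];
            $size = 2;
        }

        $current[] = $doc;
        $size += $docSize;
    }

    // Flush whatever is left over.
    if ($current) {
        $batches[] = json_encode($current);
    }

    return $batches;
}

// Tiny cap so the splitting is visible; ids 0..4 end up in three batches.
$batches = buildDeleteBatches(0, 5, 60);
foreach ($batches as $json) {
    echo $json, PHP_EOL;
}
```

Each string returned can be passed as the 'documents' value of its own uploadDocuments call, the same way the single batch is sent in the script above.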

 
