CHUVASH.eu

CHunky Universe of Vigourous Astonishing SHarepoint :)

Batch remove

To add items to a list in bulk, or remove them in bulk. Well to do that you can use SPLinq, Server Object Model and … web.ProcessBatchData, which is the most effective.

I did an experiment today. I created 1000 tasks in my task list three times and removed all items in three different ways and took time.

First I put 1000 items with ajax and listdata.svc:

function pad(n) {
   if (n >= 10000) return n;
   if (n >= 1000) return '0' + n;
   if (n >= 100) return '00' + n;
   if (n >= 10) return '000' + n;
   return '0000' + n;
 }
var s;
var counter = 0;
var title = "task "
function foo() {
   counter++;
   if (counter > 10000) {
      window.clearInterval(s);
   }
   else {
      var value = {};
      value.title = title + pad(counter);
      createNewTask(value);
   }
}
s = setInterval(foo, 200);

Then I ran three different scenarios. Here is the resut for VMWare machine Windows Server 2008 R2, 4GB RAM, SharePoint 2010 SP1:

1000 items 10 000 items 20 000 items
Batch 00h:00m:28s.950ms (29s) 00h:04m:20s.730ms (261s) 00h:08m:35s.260ms (515s)
Object Model 00h:01m:37s.860ms (98s) 00h:16m:40s.770ms (1001s) 00h:33m:50s.250ms (2030s)
SPLinq 00h:02m:24s.070ms (144s) 02h:17m:02s.570ms (8223s) 11h:51m:08s.130ms (42668s)

This result confirms another comparision. See even advantages and disadvantages of linq2sp vs caml.

The time includes even the instantiation of the SPSite and SPWeb objects and getting of task ids to remove, this to get a fair comparision with SPLinq, where this instantiation is implicit (within the context).

Average time for removing one list item:
1000 items 10 000 items 20 000 items
Batch 29.0ms 26.2ms 25.8ms
Object Model 97.9ms 100.1ms 101.5ms
SPLinq 144.1ms 822.3ms 2133.4ms

Some curious facts:
1. SPLinq method left one list item after removing 10 000 batch (task 08787) and three items while removing 20 000 batch (task 07550, task 12042, task 19663). Why?

2. I opened the list sometimes and I found that deleting of items took very long time in the beginning, and then it became faster and faster. I suppose the whole data context was saved after every submission. I am wondering if it would  be different if I’d set ObjectTrackingEnabled to false, which is true by default.

See the whole blog entry for the code I ran to compare these three ways to remove list items.

internal static void Remove1000()
{

    var stopWatch = new Stopwatch();
    stopWatch.Start();

    //uncomment one of them
    //BatchRemove();//00:00:28.95
    //ObjectModelRemove();//00:01:37.86
    //LinqToSPRemove(); //00:02:24.07

    stopWatch.Stop();
    // Get the elapsed time as a TimeSpan value.
    var ts = stopWatch.Elapsed;

    // Format and display the TimeSpan value.
    var elapsedTime = String.Format("{0:00}:{1:00}:{2:00}.{3:00}",
        ts.Hours, ts.Minutes, ts.Seconds,
        ts.Milliseconds / 10);
    Console.WriteLine("RunTime " + elapsedTime);

}

private static void LinqToSPRemove()
{
    var ctx = new OppgaverDataContext("http://takana/");
    var taskList = ctx.GetList<OppgaverTask>("Oppgaver");
    var tasks = from o in ctx.Oppgaver select o;
    foreach (var t in tasks)
    {
        taskList.DeleteOnSubmit((OppgaverTask) t);
    }
    ctx.SubmitChanges();
}

private static void ObjectModelRemove()
{
    using (var site = new SPSite("http://takana"))
    {
        using (var web = site.OpenWeb())
        {
            var lst = web.Lists["Oppgaver"];
            var query = new SPQuery
            {
                ViewFields = "<FieldRef Name='ID'/>",
                IncludePermissions = false,
                RowLimit = 20000
            };

            // first get all the id's to be deleted
            var al = new ArrayList();
            do
            {
                var myItems = lst.GetItems(query);
                foreach (SPListItem item in myItems) // note use enumeration
                {
                    al.Add(item.ID);
                }
                query.ListItemCollectionPosition = myItems.ListItemCollectionPosition;
            }
            while (query.ListItemCollectionPosition != null);

            for (int i = 0; i < al.Count; i++)
            {
                var item = lst.GetItemById((int) al[i]);
                item.Delete();
            }
        }
    }
}

private static void BatchRemove()
{
    using(var site = new SPSite("http://takana"))
    {
        using (var web = site.OpenWeb())
        {
            var lst = web.Lists["Oppgaver"];
            var query = new SPQuery
                {
                    ViewFields = "<FieldRef Name='ID'/>",
                    IncludePermissions = false,
                    RowLimit = 20000
                };

            // first get all the id's to be deleted
            var al = new ArrayList();
            do
            {
                var myItems = lst.GetItems(query);

                foreach (SPListItem item in myItems) // note use enumeration
                {
                    al.Add(item.ID.ToString());
                }
                query.ListItemCollectionPosition = myItems.ListItemCollectionPosition;
            }
            while (query.ListItemCollectionPosition != null);

            var sbDelete = new StringBuilder();
            sbDelete.Append("<?xml version=\"1.0\" encoding=\"UTF-8\"?><Batch>");
            var listguid = lst.ID.ToString();
            var bcount = 0;

            for (int i = 0; i < al.Count; i++)
            {
                if (bcount > 1000)
                {
                    sbDelete.Append("</Batch>");
                    lst.ParentWeb.ProcessBatchData(sbDelete.ToString());
                    sbDelete = new StringBuilder();
                    sbDelete.Append("<?xml version=\"1.0\" encoding=\"UTF-8\"?><Batch>");
                    bcount = 0;
                }
                bcount++;
                sbDelete.Append("<Method>");
                sbDelete.Append("<SetList Scope=\"Request\">" + listguid + "</SetList>");
                sbDelete.Append("<SetVar Name=\"ID\">" + al[i] + "</SetVar>");
                sbDelete.Append("<SetVar Name=\"Cmd\">Delete</SetVar>");
                sbDelete.Append("</Method>");
            }
            sbDelete.Append("</Batch>");
            web.ProcessBatchData(sbDelete.ToString());
        }
    }
}

To create many items as quickly as possible you can use the same ProcessBatchData, e.g. in a console application:

//takes approx. 02m:35s
internal static void Add20Thousends()
{
    using (var site = new SPSite("http://takana"))
    {
        using (var web = site.OpenWeb())
        {
            var lst = web.Lists["Oppgaver"];

            var sb = new StringBuilder();
            sb.Append("<?xml version=\"1.0\" encoding=\"UTF-8\"?><Batch>");
            var listguid = lst.ID.ToString();

            for (var i = 20000; i < 40000; i++)
            {
                sb.Append("<Method>");
                sb.Append("<SetList>" + listguid + "</SetList>");
                sb.Append("<SetVar Name=\"ID\">New</SetVar>");
                sb.Append("<SetVar Name=\"Cmd\">Save</SetVar>");
                sb.Append("<SetVar Name=\"urn:schemas-microsoft-com:office:office#Title\">" +
                                string.Format("task {0:00000}", i+1) +
                                "</SetVar>");
                sb.Append("</Method>");
            }

            sb.Append("</Batch>");
            Console.WriteLine("Sending batch job");
            web.ProcessBatchData(sb.ToString());
        }
    }
}

Or even better in a powershell script:

if(-not(Get-PSSnapin |
    Where { $_.Name -eq "Microsoft.SharePoint.PowerShell"})) {
    Add-PSSnapin Microsoft.SharePoint.PowerShell
}
$web = get-spweb http://dev
$list = $web.Lists["Oppgaver"]
$listid = $list.ID.ToString()
$b = '<?xml version="1.0" encoding="UTF-8"?><Batch>'
$cmd = '<Method>'
$cmd += '<SetList>{0}</SetList>'
$cmd += '<SetVar Name="ID">New</SetVar>'
$cmd += '<SetVar Name="Cmd">Save</SetVar>'
$cmd += '<SetVar Name="urn:schemas-microsoft-com:office:office#Title">task {1:00000}</SetVar>'
$cmd += '<SetVar Name="urn:schemas-microsoft-com:office:office#DueDate">{2}</SetVar>'
$cmd += '</Method>'
$i = 0
$d = [DateTime]::UtcNow
$d = $d.AddDays(-30)
$iso = [string]::concat($d.ToString("s"), "Z")
while ($i -lt 1000) {
    $b += $cmd -f $listid, ++$i, $iso
}
$b += '</Batch>'
$web.ProcessBatchData($b)

Pay attention to quotation marks, the attributes must be double quoted otherwise you’ll get an exception:

Exception calling "ProcessBatchData" with "1" argument(s): "0x80070057"
At line:1 char:22
+ $web.processbatchdata <<<< ($b)
    + CategoryInfo          : NotSpecified: (:) [], MethodInvocationException
    + FullyQualifiedErrorId : DotNetMethodException

If you wonder how to access specific columns read the RPC documentation on msdn. DueDate field is accepting a date in ISO 8601 format.

This batch functionality can be used for populating lists in development environments. By the way, here I can recommend a nice application from codeplex: SharePointTestDataTool.

sharepointtestdatatool

Advertisements

7 responses to “Batch remove

  1. David 2012-01-03 at 17:04

    Very good stuff, thank you!

  2. Mattias 2012-01-05 at 08:47

    I’ve tried similiar code to delete 40 000 objects and it works great. Make sure to use the rowlimit property, otherwise if the list becomes to big to handle it times out. I use this in a timerjob.

    Also, instead of if(bcount > 1000), do it in the rowlimit instead. Rowlimit = 1000 and run the method without “if(bcount>1000)”

    • Anatoly Mironov 2012-01-05 at 10:13

      Thank you Mattias for your comment and improvement suggestsions. Do you mean a kind of for loop which invokes BatchRemove which only “takes care” of 1000 items at a time?

      • Mattias 2012-01-05 at 15:08

        My code look something like this and it works when i tried delete 40000 items.

        private void DeleteStuff(SPWeb web)
        {
           StringBuilder sbDelete;
           var query = new SPQuery
           query.Query = "Something";
           query.RowLimit = 5000 
           do
           {
                sbDelete = new StringBuilder;
                sbDelete = sbDelete.Append("<?xml version=\"1.0\" encoding=\"UTF-8\"?><Batch  OnError=\"Return\">");
                var myItems = lst.GetItems(query);
        
                foreach (SPListItem item in myItems) // note use enumeration
                {
                        sbDelete.Append("<Method>");
                        sbDelete.Append("<SetList Scope=\"Request\">" + Convert.ToString(item.ParentList.ID) +
                                          "</SetList>");
                        sbDelete.Append("<<SetVar Name=\"ID\">" + Convert.ToString(item.ID) + "</SetVar");
                        sbDelete.Append("<SetVar Name=\"Cmd\">Save</SetVar>");
                        sbDelete.Append("<SetVar Name=\"Cmd\">Delete</SetVar>");
                        sbDelete.Append("</Method>");
                }
                sbDelete.Append("</Batch>");
                web.ProcessBatchData(sbDelete.ToString());
                query.ListItemCollectionPosition = myItems.ListItemCollectionPosition;
           } while (query.ListItemCollectionPosition != null);
        }
  3. ponniprakasam 2012-11-27 at 08:03

    Good Work.Highly Useful Post

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

Вула Чăвашла

VulaCV - Чăвашла вулаттаракан сайт

Discovering SharePoint

And going crazy doing it

Bram de Jager talking Office 365, SharePoint and Azure

My view and thoughts on Productivity and more.

My programming life

and everything in between

SharePoint Development Lab by @avishnyakov

It is a good place to share some SharePoint stories and development practices.

SharePoint Dragons

Nikander & Margriet on SharePoint

Paul J. Swider - RealActivity

RealActivity is a specialized healthcare services and solution advisory firm.

Mai Omar Desouki - Avid SharePointer

Egyptian & Vodafoner - Senior SharePoint Consultant

Cameron Dwyer | Office 365, SharePoint, Outlook, OnePlace Solutions

Office 365, SharePoint, OnePlace Solutions & Life's Other Little Wonders

paul.tavares

Me and My doings!

Share SharePoint Points!!

By Mohit Vashishtha

Jimmy Janlén "Den Scrummande Konsulten"

Erfarenheter, synpunkter och raljerande om Scrum från Jimmy Janlén

SPJoel

SharePoint for everyone

SharePointRyan

Ryan Dennis is a SharePoint Solution Architect with a passion for SharePoint and PowerShell

SharePoint 2020

The Vision for a Future of Clarity

Aharoni in Unicode

Treacle tarts for great justice

... And All That JS

JavaScript, Web Apps and SharePoint

%d bloggers like this: