Recently I wrote a program that, among other tasks, filters files by their last-modified date.
My initial code was something like this:
var procFolder = new DirectoryInfo(ConfigurationManager.AppSettings["ProcessFolder"]);
var mdnFolder = new DirectoryInfo(procFolder.FullName + "\\mdn");
var lastWriteTime = DateTime.Parse(ConfigurationManager.AppSettings["ModifiedDate"]);
var dTarget = new DirectoryInfo(ConfigurationManager.AppSettings["targetFolder"]);
// First look for files whose timestamp matches exactly.
var qMdn = from c in mdnFolder.GetFiles().AsQueryable()
           where c.LastWriteTime == lastWriteTime
           select c;
// Nothing matched exactly: fall back to a window of one second either side.
if (qMdn.Count() == 0)
    qMdn = from c in mdnFolder.GetFiles().AsQueryable()
           where c.LastWriteTime >= lastWriteTime.AddSeconds(-1)
              && c.LastWriteTime <= lastWriteTime.AddSeconds(1)
           select c;
foreach (FileInfo f in qMdn)
{
    f.CopyTo(dTarget.FullName + "\\" + f.Name, true);
}
As I expected, it was slow, because it iterates over the files sequentially. To make it faster, we need to iterate over them in parallel. This is where Parallel LINQ kicks ass. I’d heard about Parallel LINQ (PLINQ) before, but had never actually tried it.
So I visited PLINQ’s webpage to find out more. The page states that PLINQ requires .NET 4 or later. This is sad, as I’m still stuck with Visual Studio 2008.
Is there a way to run PLINQ on .NET 3.5? With that question, Google led me to a discussion on Stack Overflow. There I learned that the Reactive Extensions (Rx) for .NET had backported the Task Parallel Library to .NET 3.5, as Jon Skeet mentioned. To make life even easier, Omer Mor has created a NuGet package called TaskParallelLibrary for easy integration. The package contains System.Threading.dll, which you can add to your project references.
Once you’ve referenced it in your project, you can use .AsParallel() in your code:
var procFolder = new DirectoryInfo(ConfigurationManager.AppSettings["ProcessFolder"]);
var mdnFolder = new DirectoryInfo(procFolder.FullName + "\\mdn");
var lastWriteTime = DateTime.Parse(ConfigurationManager.AppSettings["ModifiedDate"]);
var dTarget = new DirectoryInfo(ConfigurationManager.AppSettings["targetFolder"]);
// Same queries as before, with .AsParallel() added so PLINQ evaluates the filter in parallel.
var qMdn = from c in mdnFolder.GetFiles().AsQueryable().AsParallel()
           where c.LastWriteTime == lastWriteTime
           select c;
// Nothing matched exactly: fall back to a window of one second either side.
if (qMdn.Count() == 0)
    qMdn = from c in mdnFolder.GetFiles().AsQueryable().AsParallel()
           where c.LastWriteTime >= lastWriteTime.AddSeconds(-1)
              && c.LastWriteTime <= lastWriteTime.AddSeconds(1)
           select c;
foreach (FileInfo f in qMdn)
{
    f.CopyTo(dTarget.FullName + "\\" + f.Name, true);
}
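A side note: in the code above only the filter runs in parallel, while the foreach still copies one file at a time. Here is a rough sketch of pushing the copy into the parallel pipeline with ForAll. The class ParallelCopySketch and the method CopyMatching are hypothetical names made up for illustration, not part of my program. The sketch also simplifies the two-step query into a single tolerance window. ForAll is part of PLINQ in .NET 4, and I am assuming the backported System.Threading.dll exposes it too. Whether parallel copying actually helps will depend on your disk.

```csharp
using System;
using System.IO;
using System.Linq;

// Hypothetical helper for illustration only; not part of the original program.
static class ParallelCopySketch
{
    // Copies every file in sourceDir whose LastWriteTime lies within
    // `tolerance` of `target` into targetDir, and returns how many were copied.
    public static int CopyMatching(string sourceDir, string targetDir,
                                   DateTime target, TimeSpan tolerance)
    {
        Directory.CreateDirectory(targetDir);

        // Filter in parallel, then materialise the result so the directory
        // is only scanned once.
        FileInfo[] matches = new DirectoryInfo(sourceDir).GetFiles()
            .AsParallel()
            .Where(f => (f.LastWriteTime - target).Duration() <= tolerance)
            .ToArray();

        // ForAll runs the copy on the PLINQ worker threads instead of
        // handing the files back to a sequential foreach.
        matches.AsParallel()
               .ForAll(f => f.CopyTo(Path.Combine(targetDir, f.Name), true));

        return matches.Length;
    }
}
```

With the variables from my code, a call would look something like ParallelCopySketch.CopyMatching(mdnFolder.FullName, dTarget.FullName, lastWriteTime, TimeSpan.FromSeconds(1)).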
There you go. It’s easy to make Parallel LINQ available on .NET 3.5. But be warned that this method is not supported by Microsoft. If you plan to deploy this to production, Jon Skeet advises upgrading to .NET 4.
I hope it helps, cheers!