2016
02.04

Yesterday, I found myself in need of a TIFF image splitter. The reason: the scanner in my office doesn’t retain its settings after a document scan. Every time I choose ‘End Scanning’, the output image quality setting is reset to the default, which means I need to change the output settings for each document that I want to scan. The only way around this was to scan all the documents in one go, and that led me to needing a TIFF splitter.

A quick Google search brought me to a simple, free and open-source program called Tiff Splitter. It is exactly the kind of program that I needed: it is simple and does its job well.

Because it does the job well, I became interested in finding out how exactly Tiff Splitter works. Since it’s open source, I could dive right in and learn.

[Screenshot: TiffSplit_2016-01-28_15-16-40]

After we click to select an input file, or drop a file onto the window, the event handlers eventually call the processFile method:

private void processFile(string fileName)
{
   // ... SNIP ...
   
   _configs.inputFile = fileName;

   // Configure the extractor
   RetObj retObj = TiffSplitCode.Prepare(_configs, out numOfPages);
   
   // ... SNIP ...
   
   // Let user select pages to extract
   PageSelection ps = new PageSelection(numOfPages);
   ps.ShowDialog();
   _configs.fromPage = ps.PageFrom;
   _configs.toPage = ps.PageTo;
   _configs.doOverwrite = ps.OverwriteFiles;
   
   // ... SNIP ...
   
   // do the work in separate thread
   backgroundWorkerSplit.RunWorkerAsync(_configs);
}

The worker thread then does the splitting:

private void backgroundWorkerSplit_DoWork(object sender, DoWorkEventArgs e)
{
	ConfObj input = e.Argument as ConfObj;  
	RetObj retObj = TiffSplitCode.Split(input, _updateProgress);
	e.Result = retObj;
}
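
To see the round trip between RunWorkerAsync, e.Argument and e.Result in isolation, here is a minimal console sketch. It is not Tiff Splitter's code: the class name, the string payload and the wait handle are my own stand-ins, and I am assuming a plain console app (no UI thread), where the completion event is raised on a thread-pool thread.

```csharp
using System;
using System.ComponentModel;
using System.Threading;

static class WorkerSketch
{
    // Hypothetical helper: runs a fake "split" on a BackgroundWorker
    // and blocks until the completion event hands back the result.
    public static string RunInBackground(string fileName)
    {
        string result = null;
        using (var done = new ManualResetEventSlim(false))
        using (var worker = new BackgroundWorker())
        {
            worker.DoWork += (sender, e) =>
            {
                // Whatever was passed to RunWorkerAsync arrives as e.Argument.
                string input = e.Argument as string;
                // Whatever is stored in e.Result is handed to RunWorkerCompleted.
                e.Result = "processed " + input;
            };
            worker.RunWorkerCompleted += (sender, e) =>
            {
                result = (string)e.Result;
                done.Set();
            };
            worker.RunWorkerAsync(fileName);
            done.Wait();
        }
        return result;
    }

    static void Main()
    {
        Console.WriteLine(RunInBackground("scan.tif")); // prints "processed scan.tif"
    }
}
```

In a WinForms program like Tiff Splitter there is no need to block: the completion handler runs back on the UI thread, which is the whole point of using BackgroundWorker there.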

The actual work is done by TiffSplitCode.Split:

public static RetObj Split(ConfObj input, UpdateProgress updateProgress)
{
	int numOfPages = input.toPage - input.fromPage + 1;                               
	// save each image
	for (int i = 0; i < numOfPages; i++)
	{
		_coder.Save(input.fromPage - 1 + i);
		// ... SNIP ...
	}
	// ... SNIP ...
}
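
The index arithmetic in that loop is easy to misread, so here is a tiny sketch of it. I am assuming that fromPage and toPage are 1-based and inclusive (which is what the `- 1` suggests) and that Save expects a 0-based frame index. The PageRange class and its method are hypothetical helpers, not part of Tiff Splitter.

```csharp
using System;
using System.Collections.Generic;

static class PageRange
{
    // Hypothetical helper: maps an inclusive, 1-based page range to the
    // 0-based frame indices that would be passed to Save(...).
    public static List<int> ToFrameIndices(int fromPage, int toPage)
    {
        var indices = new List<int>();
        int numOfPages = toPage - fromPage + 1;
        for (int i = 0; i < numOfPages; i++)
        {
            indices.Add(fromPage - 1 + i);
        }
        return indices;
    }

    static void Main()
    {
        // Pages 2 to 4 map to frame indices 1, 2 and 3.
        Console.WriteLine(string.Join(", ", ToFrameIndices(2, 4))); // prints "1, 2, 3"
    }
}
```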


Hmm, it looks simple. But who is _coder? How is it able to distinguish between PDF, TIFF and JPG? Different formats surely require different treatment, right (or so I thought)?

The secret is in the Prepare and CoderFactory methods.

public static RetObj Prepare(ConfObj input, out int numOfPages)
{
	// .. Snipped: Validate input file ..	

	// overwrite output type if the input is PDF
	if (ext.ToUpper() == ".PDF")
		input.outputType = OutputType.PDF;

	// create output coder
	_coder = CoderFactory(input.outputType);

	// open file
	numOfPages = _coder.LoadImage(input.inputFile);

	// prepare
	_coder.Prepare(input);

	return retobj;
}

static private ICoder CoderFactory(OutputType type)
{
	switch (type)
	{
		case OutputType.TIF:
			return new TiffCoder();
		case OutputType.JPG:
			return new JpegCoder();
		case OutputType.PDF:
			return new PDFCoder();
		default:
			throw new Exception("Unknown output format.");
	}
}
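
Stripped of everything else, the factory idea can be sketched as follows. The types here are simplified stand-ins that I made up for illustration: the real ICoder exposes LoadImage, Prepare and Save, whereas my stand-in interface only has a hypothetical Describe method so that the example stays short and runnable.

```csharp
using System;

enum SketchOutputType { TIF, JPG, PDF }

// Stand-in interface; the real ICoder has different members.
interface ISketchCoder
{
    string Describe();
}

class SketchTiffCoder : ISketchCoder { public string Describe() { return "TIFF coder"; } }
class SketchJpegCoder : ISketchCoder { public string Describe() { return "JPEG coder"; } }
class SketchPdfCoder : ISketchCoder { public string Describe() { return "PDF coder"; } }

static class CoderFactorySketch
{
    // The caller only ever sees ISketchCoder; the switch is the single
    // place that knows which concrete class handles which format.
    public static ISketchCoder Create(SketchOutputType type)
    {
        switch (type)
        {
            case SketchOutputType.TIF: return new SketchTiffCoder();
            case SketchOutputType.JPG: return new SketchJpegCoder();
            case SketchOutputType.PDF: return new SketchPdfCoder();
            default:
                throw new ArgumentOutOfRangeException(nameof(type), "Unknown output format.");
        }
    }

    static void Main()
    {
        ISketchCoder coder = Create(SketchOutputType.JPG);
        Console.WriteLine(coder.Describe()); // prints "JPEG coder"
    }
}
```

This is why Split can call _coder.Save without caring about the format: the decision was already made once, when the coder was created.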


So now it's clear that the actual loading and splitting happen in the classes that implement the ICoder interface. For TIFF, the work is done by TiffCoder:

public class TiffCoder : ICoder
{
	private Image _image;
	private FrameDimension _dim;
	
	// ... SNIP ...

	public int LoadImage(string fileName)
	{
		_inputImageName = fileName;
		_image = Image.FromFile(fileName);
		Guid guid = _image.FrameDimensionsList[0];
		_dim = new FrameDimension(guid);
		return _image.GetFrameCount(_dim);
	}


	public void Save(int pageNum)
	{
		_image.SelectActiveFrame(_dim, pageNum);
		string outputFileName = null;

		// ... SNIP: Output file name

		if (!_config.doOverwrite)
		{
			outputFileName = HelperMethods.ModifyFileName(outputFileName);
		}

		_image.Save(outputFileName);
	}

	// ... SNIP ...
}
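
Putting LoadImage and Save together, the whole technique fits in a short standalone sketch. This is my own condensed version under a few assumptions, not the program's code: the class name, the output naming scheme (`name_1.tif`, `name_2.tif`, ...) and the command-line handling are made up, and it needs Windows, since System.Drawing sits on top of GDI+.

```csharp
using System;
using System.Drawing;
using System.Drawing.Imaging;
using System.IO;

static class TiffSplitSketch
{
    // Writes every frame of a multi-page TIFF to its own file
    // and returns the number of pages written.
    public static int SplitPages(string inputPath, string outputDir)
    {
        using (Image image = Image.FromFile(inputPath))
        {
            // For a multi-page TIFF, the first frame dimension is the page dimension.
            var dim = new FrameDimension(image.FrameDimensionsList[0]);
            int pageCount = image.GetFrameCount(dim);
            string baseName = Path.GetFileNameWithoutExtension(inputPath);

            for (int i = 0; i < pageCount; i++)
            {
                // Frame indices are 0-based; Save writes only the active frame.
                image.SelectActiveFrame(dim, i);
                string outputPath = Path.Combine(outputDir, baseName + "_" + (i + 1) + ".tif");
                image.Save(outputPath, ImageFormat.Tiff);
            }
            return pageCount;
        }
    }

    static void Main(string[] args)
    {
        if (args.Length != 2)
        {
            Console.WriteLine("usage: TiffSplitSketch <input.tif> <outputDir>");
            return;
        }
        Directory.CreateDirectory(args[1]);
        int pages = SplitPages(args[0], args[1]);
        Console.WriteLine("Wrote " + pages + " page(s).");
    }
}
```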


Well, that's interesting. The program actually uses .NET's System.Drawing.Image instead of some external library to handle TIFF. As I dug deeper, I found out that System.Drawing is actually a managed interface to Windows' native graphics library, GDI+ (mind blown!).

I'll stop my exploration here. Perhaps in the future I'll have the motivation to dig deeper than today. For more reading, please check the following references:

  1. System.Drawing.Image source code
  2. Brief introduction to GDI
  3. Microsoft GDI+ page