Saturday, March 24, 2012

Using Gepsio From C#

A developer new to Gepsio asked for a C# sample that uses it, so I built a simple C# console application using Visual Studio 2010 and the Gepsio Nov 2011 CTP. The code comes first, followed by a few notes on its key points:

using JeffFerguson.Gepsio;
using System;

namespace GepsioConsole
{
    class Program
    {
        /// <summary>
        /// The magic begins here.
        /// </summary>
        /// <param name="args">
        /// A collection of program arguments. One argument should be supplied: the address of the XBRL document
        /// to load.
        /// </param>
        static void Main(string[] args)
        {
            if (args.Length != 1)
            {
                Console.WriteLine("usage: GepsioConsole [XBRL document]");
                return;
            }
            ProcessXbrlWithGepsio(args[0]);
        }

        /// <summary>
        /// Process a named XBRL document.
        /// </summary>
        /// <param name="xbrlFile">
        /// The address of the XBRL document to process.
        /// </param>
        private static void ProcessXbrlWithGepsio(string xbrlFile)
        {
            try
            {
                var xbrlDoc = new XbrlDocument();
                xbrlDoc.Load(xbrlFile);
                foreach (var currentFragment in xbrlDoc.XbrlFragments)
                {
                    DisplayFragmentStatistics(currentFragment);
                    WriteFactValue(currentFragment, "EntityRegistrantName");
                    WriteFactValue(currentFragment, "DocumentPeriodEndDate");
                }
            }
            catch (XbrlException xbrle)
            {
                Console.WriteLine("ERROR: {0}", xbrle.Message);
            }
        }

        /// <summary>
        /// Display statistics relating to the loaded document fragment.
        /// </summary>
        /// <param name="currentFragment">
        /// The XBRL fragment whose statistics should be published.
        /// </param>
        private static void DisplayFragmentStatistics(XbrlFragment currentFragment)
        {
            var factsCollection = currentFragment.Facts;
            Console.WriteLine("Number of facts...: {0}", factsCollection.Count);
            var unitsCollection = currentFragment.Units;
            Console.WriteLine("Number of units...: {0}", unitsCollection.Count);
            var contextsCollection = currentFragment.Contexts;
            Console.WriteLine("Number of contexts: {0}", contextsCollection.Count);
        }

        /// <summary>
        /// Display the value of a fact pulled from an XBRL fragment.
        /// </summary>
        /// <param name="currentFragment">
        /// The fragment containing the fact to be found.
        /// </param>
        /// <param name="factName">
        /// The name of the fact to be found.
        /// </param>
        private static void WriteFactValue(XbrlFragment currentFragment, string factName)
        {
            foreach (var currentFact in currentFragment.Facts)
            {
                if (currentFact.Name.Equals(factName))
                {
                    var currentFactAsItem = currentFact as Item;
                    if (currentFactAsItem == null)
                    {
                        // The cast failed, so this fact carries no single value to display.
                        continue;
                    }
                    Console.WriteLine("{0}: {1}", factName, currentFactAsItem.Value);
                    return;
                }
            }
        }
    }
}



I wanted to offer a few notes regarding this code:



  • This is a console application, and is intended to be run with a command line argument specifying the address of the XBRL document to be loaded, as in GepsioConsole http://www.sec.gov/Archives/edgar/data/21344/000104746911006790/ko-20110701.xml. Gepsio can work with documents stored on the Web, so HTTP-based document addresses are valid.

  • From Gepsio’s point of view, an XBRL document is a collection of fragments. The fragments idea was originally designed to support the notion of Inline XBRL, where a document may consist of multiple XBRL fragments. For standard XBRL documents, however, the entire document is XBRL, so it makes up a single “fragment”. This explains the “for each fragment in document” code in the ProcessXbrlWithGepsio() method.

  • The WriteFactValue() method looks for a fact in the fragment’s collection of facts. Remember that, from an XBRL point of view, an item is a type of fact. In XBRL parlance, facts can be items, which have a single value, or tuples, which group other facts. Gepsio models this by defining a base class called Fact and then deriving both Item and Tuple from Fact. This explains the cast from Fact down to Item in the code for WriteFactValue(), since the value is exposed on Item. The cast may look a bit strange … perhaps I’ll revisit this in a later CTP.
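
The lookup in WriteFactValue() depends on a cast that can fail when the named fact carries no single value. Below is a minimal sketch of a lookup that sidesteps the failed cast. It does not use Gepsio at all: SketchFact, SketchItem, SketchTuple and FactLookup are hypothetical stand-ins invented for this illustration, and the sketch assumes only that the collection holds a mix of single-valued and grouping entries that share a name.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

namespace GepsioSketch
{
    // Hypothetical stand-ins for illustration only; these are NOT Gepsio's classes.
    abstract class SketchFact
    {
        public string Name { get; }
        protected SketchFact(string name) { Name = name; }
    }

    // A single-valued entry.
    sealed class SketchItem : SketchFact
    {
        public string Value { get; }
        public SketchItem(string name, string value) : base(name) { Value = value; }
    }

    // A grouping entry that has no single value of its own.
    sealed class SketchTuple : SketchFact
    {
        public List<SketchFact> Children { get; } = new List<SketchFact>();
        public SketchTuple(string name) : base(name) { }
    }

    static class FactLookup
    {
        // Returns the value of the first single-valued entry with the given name,
        // or null when there is none. OfType<T>() filters out every entry that is
        // not a SketchItem, so no cast can fail.
        public static string FindItemValue(IEnumerable<SketchFact> facts, string factName)
        {
            return facts
                .OfType<SketchItem>()
                .Where(item => item.Name.Equals(factName))
                .Select(item => item.Value)
                .FirstOrDefault();
        }
    }

    static class Program
    {
        static void Main()
        {
            var facts = new List<SketchFact>
            {
                new SketchTuple("EntityRegistrantName"),
                new SketchItem("EntityRegistrantName", "Example Company"),
            };

            // The grouping entry is skipped and the single-valued entry is reported.
            Console.WriteLine(FactLookup.FindItemValue(facts, "EntityRegistrantName") ?? "(not found)");
            // No entry has this name, so the fallback text is printed.
            Console.WriteLine(FactLookup.FindItemValue(facts, "DocumentPeriodEndDate") ?? "(not found)");
        }
    }
}
```

Returning null for a missing value keeps the decision about what to print with the caller. Whether the same filtering works directly against Gepsio's own collection depends on that collection's element type, which this sketch does not assume.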

As I developed this sample, I noticed that Gepsio works as intended but is slower than it should be on larger documents. I’ll be addressing this shortly. Look for a future blog post where I discuss where the time goes and how the improvements will be implemented. I will be using the code in this blog post as the showcase for the performance improvements, so you will be seeing this code again.

1 comment:

  1. great demo!
    one question though, instead of extracting fact items, can we extract some specific blocks of content? Say I want to extract the whole MD&A section?
