Article Options
Premium Sponsor
Premium Sponsor

 »  Home  »  Data Programming  »  Migrating Delimited Files into XML
 »  Home  »  Data Programming  »  XML  »  Migrating Delimited Files into XML
Migrating Delimited Files into XML
by Tiberius OsBurn | Published  05/05/2002 | Data Programming XML | Rating:
Tiberius OsBurn

Tiberius OsBurn is a Senior Developer/System Analyst for The Gallup Organization (http://www.gallup.com). He recently completed a huge data warehousing project that archived data and documents from 1935 to the present - all coded in C#, SQL Server and ASP.NET.

Tiberius has extensive experience in VB, VB.NET, C#, SQL Server, ASP.NET and various other web technologies. Be sure to visit his site for his latest articles of interest to .NET developers.

http://tiberi.us

 

View all articles by Tiberius OsBurn...
Migrating Delimited Files into XML

Article source code: delimit_xml.zip

This article will demonstrate the conversion of a delimited file into a DataSet or an XML file. For the purposes of demonstration, I've created a tab-delimited file for the Roach Motel's room service department.

Why make the migration?

Tab delimited and Comma delimited files once were the standard in representing data outside of a relational database. With the inception of XML, developers can now represent their data in a much richer and more descriptive environment without the overhead or cost of relational databases. Converting legacy delimited files into an XML format allows developers to take advantage of many of the new technologies that have been created around XML.

Why convert to a DataSet first?

The nice thing about .NET's DataSet object is the ability to save the internal data as XML. In the past, Microsoft developers would have to use a Split() function to parse out their delimited files and then piece together the rows in the file into an XML format. With large files, this became a particularly nasty exercise. .NET's DataSet object does most of the work for us, cleanly and quickly. We'll be using the DataSet object's GetXml() method to convert the DataSet representation of the delimited file into an XML representation.

Walk Through

We're going to create a DataSet dynamically with the help of the StreamReader object, the DataTable object and the DataRows object.

Let's take a quick look at a snippet of XML produced from our tab delimited file:

<Kitchen xmlns="The_Roach_Motel">
  <Room_Service>
    <Room_Number>290</Room_Number>
    <Breakfast>Bagel</Breakfast>
    <Lunch>Pizza</Lunch>
    <Dinner>Salmon</Dinner>
  </Room_Service>
</Kitchen>

One of the really sweet things about the XML/DataSet connection is how integrated ADO and XML have become. I can define the namespace of my XML representation though a property in the DataSet object:

oDS.DataSetName = "Kitchen"
oDS.Namespace = "The_Roach_Motel"

Notice in the XML snippet above, the DataSet name becomes our root element name and the Namespace property translates into XML nicely.

Next, we'll add a table called 'Room_Service'. I'd avoid using spaces when you are naming DataSet tables and XML elements.

oDS.Tables.Add("Room_Service")

We're using the StreamReader object to open up the delimited file.

Dim oSR As New StreamReader(strFilePath)
'Go to the top of the file
oSR.BaseStream.Seek(0SeekOrigin.Begin)

Most delimited files have a header, or a group of descriptive names that identify a column of data. We're going to snatch out the first line of our delimited file and use the column header names as the columns in our DataSet. After we add a new table to our DataSet, we're going to add the column names to our new table.

Notice that we're using the Split function on our StreamReader object to break apart the delimited column names.

'Add in the Header Columns
For Each strFields In oSR.ReadLine().Split(strDelimiter)
    oDS.Tables(0).Columns.Add(strFields)
Next

After we've added our columns, we can start to populate the rows with data. We'll need to instantiate our DataTable object by setting a reference to our DataSet's table. After that is done, we'll need to create a DataRow object. The DataRow object holds all of the data for that row and makes sure that the data is mapped to the correct column.

oTable = oDS.Tables(0)
While (oSR.Peek() > -1)
    oRows = oTable.NewRow()

Notice that we're iterating through our StreamReader object again. There's no need to initialize it once more or start from the beginning - we've already gotten our columns and now are just concerned with migrating the data from the delimited file. I've added an intCounter to mark the ordinal location of the columns since we don't really know their names at this point.

    For Each strFields In oSR.ReadLine().Split(strDelimiter)
        oRows(intCounter= strFields
        intCounter = intCounter + 1
    Next
    intCounter = 0

Once we've filled up our DataRows object with data, we're ready to add that row to the DataTable Object.

    oTable.Rows.Add(oRows)
End While

All that's left to do is to is convert the Dataset to XML using the GetXML() function and return the XML as a string. You could save this string to an XML file using the WriteXML() function.

In conclusion, XML and ADO.NET allow developers a wide berth of freedom when converting legacy data formats into usable, query-able documents. I hope that you've found this article helpful and useful.

Full Code

Imports System
Imports System.IO
Imports System.Collections
Imports System.Data
Imports System.Text

Module modXML

    Sub Main()
        Dim strXML As String
        strXML = delimitedDataSet(vbTab"c:/food.tab")
    End Sub

    Function delimitedDataSet(ByVal strDelimiter As String_
        ByVal strFilePath As StringAs String
        Dim oDS As New DataSet()
        Dim strFields As String
        Dim oTable As New DataTable()
        Dim oRows As DataRow
        Dim intCounter As Int32 = 0
        Dim oRow As DataRow()

        oDS.DataSetName = "Kitchen"
        oDS.Namespace = "The_Roach_Motel"
        oDS.Tables.Add("Room_Service")

        Dim oSR As New StreamReader(strFilePath)
        'Go to the top of the file
        oSR.BaseStream.Seek(0SeekOrigin.Begin)
        'Add in the Header Columns
        For Each strFields In oSR.ReadLine().Split(strDelimiter)
            oDS.Tables(0).Columns.Add(strFields)
        Next

        'Now add in the Rows

        oTable = oDS.Tables(0)
        While (oSR.Peek() > -1)
            oRows = oTable.NewRow()
            For Each strFields In oSR.ReadLine().Split(strDelimiter)
                oRows(intCounter= strFields
                intCounter = intCounter + 1
            Next
            intCounter = 0
            oTable.Rows.Add(oRows)
        End While
        Return oDS.GetXml()
        'oDS.WriteXml("c:/food.xml")
    End Function
Generated using PrettyCode.Encoder

Tab File

Here's the tab file I created to represent the room service at an illustrious hotel.

Room_Number  Breakfast   Lunch      Dinner
290          Bagel       Pizza      Salmon
301          Orange      Pizza      Chicken ala King
349          Sweet Roll  Salad      Tofu and Vegetables
500          Omelet      Sausage    Veal
702          Eggs        Tuna fish  Cheese Sandwich

XML Output

After we call GetXml() on our DataSet, we can save out our string as an XML file.

<Kitchen xmlns="The_Roach_Motel">
  <Room_Service>
    <Room_Number>290</Room_Number>
    <Breakfast>Bagel</Breakfast>
    <Lunch>Pizza</Lunch>
    <Dinner>Salmon</Dinner>
  </Room_Service>
  <Room_Service>
    <Room_Number>301</Room_Number>
    <Breakfast>Orange</Breakfast>
    <Lunch>Pizza</Lunch>
    <Dinner>Chicken ala King</Dinner>
  </Room_Service>
  <Room_Service>
    <Room_Number>349</Room_Number>
    <Breakfast>Sweet Roll</Breakfast>
    <Lunch>Salad</Lunch>
    <Dinner>Tofu and Vegetables</Dinner>
  </Room_Service>
  <Room_Service>
    <Room_Number>500</Room_Number>
    <Breakfast>Omelet</Breakfast>
    <Lunch>Sausage</Lunch>
    <Dinner>Veal</Dinner>
  </Room_Service>
  <Room_Service>
    <Room_Number>702</Room_Number>
    <Breakfast>Eggs</Breakfast>
    <Lunch>Tuna fish</Lunch>
    <Dinner>Cheese Sandwich</Dinner>
  </Room_Service>
</Kitchen>

Related devCity.NET articles:

How would you rate the quality of this article?
1 2 3 4 5
Poor Excellent
Tell us why you rated this way (optional):

Article Rating
The average rating is: No-one else has rated this article yet.

Article rating:3.56164383561643 out of 5
 73 people have rated this page
Article Score50712
Sponsored Links