Article Options
Recently Viewed
Premium Sponsor
Premium Sponsor

 »  Home  »  Data Programming  »  Migrating Delimited Files into XML
 »  Home  »  Data Programming  »  XML  »  Migrating Delimited Files into XML
Migrating Delimited Files into XML
by Tiberius OsBurn | Published  05/05/2002 | Data Programming XML | Rating:
Tiberius OsBurn

Tiberius OsBurn is a Senior Developer/System Analyst for The Gallup Organization (http://www.gallup.com). He recently completed a huge data warehousing project that archived data and documents from 1935 to the present - all coded in C#, SQL Server and ASP.NET.

Tiberius has extensive experience in VB, VB.NET, C#, SQL Server, ASP.NET and various other web technologies. Be sure to visit his site for his latest articles of interest to .NET developers.

http://tiberi.us

 

View all articles by Tiberius OsBurn...
Migrating Delimited Files into XML

Article source code: delimit_xml.zip

This article will demonstrate the conversion of a delimited file into a DataSet or an XML file. For the purposes of demonstration, I've created a tab-delimited file for the Roach Motel's room service department.

Why make the migration?

Tab delimited and Comma delimited files once were the standard in representing data outside of a relational database. With the inception of XML, developers can now represent their data in a much richer and more descriptive environment without the overhead or cost of relational databases. Converting legacy delimited files into an XML format allows developers to take advantage of many of the new technologies that have been created around XML.

Why convert to a DataSet first?

The nice thing about .NET's DataSet object is the ability to save the internal data as XML. In the past, Microsoft developers would have to use a Split() function to parse out their delimited files and then piece together the rows in the file into an XML format. With large files, this became a particularly nasty exercise. .NET's DataSet object does most of the work for us, cleanly and quickly. We'll be using the DataSet object's GetXml() method to convert the DataSet representation of the delimited file into an XML representation.

Walk Through

We're going to create a DataSet dynamically with the help of the StreamReader object, the DataTable object and the DataRows object.

Let's take a quick look at a snippet of XML produced from our tab delimited file:

<Kitchen xmlns="The_Roach_Motel">
  <Room_Service>
    <Room_Number>290</Room_Number>
    <Breakfast>Bagel</Breakfast>
    <Lunch>Pizza</Lunch>
    <Dinner>Salmon</Dinner>
  </Room_Service>
</Kitchen>

One of the really sweet things about the XML/DataSet connection is how integrated ADO and XML have become. I can define the namespace of my XML representation though a property in the DataSet object:

oDS.DataSetName = "Kitchen"
oDS.Namespace = "The_Roach_Motel"

Notice in the XML snippet above, the DataSet name becomes our root element name and the Namespace property translates into XML nicely.

Next, we'll add a table called 'Room_Service'. I'd avoid using spaces when you are naming DataSet tables and XML elements.

oDS.Tables.Add("Room_Service")

We're using the StreamReader object to open up the delimited file.

Dim oSR As New StreamReader(strFilePath)
'Go to the top of the file
oSR.BaseStream.Seek(0SeekOrigin.Begin)

Most delimited files have a header, or a group of descriptive names that identify a column of data. We're going to snatch out the first line of our delimited file and use the column header names as the columns in our DataSet. After we add a new table to our DataSet, we're going to add the column names to our new table.

Notice that we're using the Split function on our StreamReader object to break apart the delimited column names.

'Add in the Header Columns
For Each strFields In oSR.ReadLine().Split(strDelimiter)
    oDS.Tables(0).Columns.Add(strFields)
Next

After we've added our columns, we can start to populate the rows with data. We'll need to instantiate our DataTable object by setting a reference to our DataSet's table. After that is done, we'll need to create a DataRow object. The DataRow object holds all of the data for that row and makes sure that the data is mapped to the correct column.

oTable = oDS.Tables(0)
While (oSR.Peek() > -1)
    oRows = oTable.NewRow()

Notice that we're iterating through our StreamReader object again. There's no need to initialize it once more or start from the beginning - we've already gotten our columns and now are just concerned with migrating the data from the delimited file. I've added an intCounter to mark the ordinal location of the columns since we don't really know their names at this point.

    For Each strFields In oSR.ReadLine().Split(strDelimiter)
        oRows(intCounter= strFields
        intCounter = intCounter + 1
    Next
    intCounter = 0

Once we've filled up our DataRows object with data, we're ready to add that row to the DataTable Object.

    oTable.Rows.Add(oRows)
End While

All that's left to do is to is convert the Dataset to XML using the GetXML() function and return the XML as a string. You could save this string to an XML file using the WriteXML() function.

In conclusion, XML and ADO.NET allow developers a wide berth of freedom when converting legacy data formats into usable, query-able documents. I hope that you've found this article helpful and useful.

Full Code

Imports System
Imports System.IO
Imports System.Collections
Imports System.Data
Imports System.Text

Module modXML

    Sub Main()
        Dim strXML As String
        strXML = delimitedDataSet(vbTab"c:/food.tab")
    End Sub

    Function delimitedDataSet(ByVal strDelimiter As String_
        ByVal strFilePath As StringAs String
        Dim oDS As New DataSet()
        Dim strFields As String
        Dim oTable As New DataTable()
        Dim oRows As DataRow
        Dim intCounter As Int32 = 0
        Dim oRow As DataRow()

        oDS.DataSetName = "Kitchen"
        oDS.Namespace = "The_Roach_Motel"
        oDS.Tables.Add("Room_Service")

        Dim oSR As New StreamReader(strFilePath)
        'Go to the top of the file
        oSR.BaseStream.Seek(0SeekOrigin.Begin)
        'Add in the Header Columns
        For Each strFields In oSR.ReadLine().Split(strDelimiter)
            oDS.Tables(0).Columns.Add(strFields)
        Next

        'Now add in the Rows

        oTable = oDS.Tables(0)
        While (oSR.Peek() > -1)
            oRows = oTable.NewRow()
            For Each strFields In oSR.ReadLine().Split(strDelimiter)
                oRows(intCounter= strFields
                intCounter = intCounter + 1
            Next
            intCounter = 0
            oTable.Rows.Add(oRows)
        End While
        Return oDS.GetXml()
        'oDS.WriteXml("c:/food.xml")
    End Function
Generated using PrettyCode.Encoder

Tab File

Here's the tab file I created to represent the room service at an illustrious hotel.

Room_Number  Breakfast   Lunch      Dinner
290          Bagel       Pizza      Salmon
301          Orange      Pizza      Chicken ala King
349          Sweet Roll  Salad      Tofu and Vegetables
500          Omelet      Sausage    Veal
702          Eggs        Tuna fish  Cheese Sandwich

XML Output

After we call GetXml() on our DataSet, we can save out our string as an XML file.

<Kitchen xmlns="The_Roach_Motel">
  <Room_Service>
    <Room_Number>290</Room_Number>
    <Breakfast>Bagel</Breakfast>
    <Lunch>Pizza</Lunch>
    <Dinner>Salmon</Dinner>
  </Room_Service>
  <Room_Service>
    <Room_Number>301</Room_Number>
    <Breakfast>Orange</Breakfast>
    <Lunch>Pizza</Lunch>
    <Dinner>Chicken ala King</Dinner>
  </Room_Service>
  <Room_Service>
    <Room_Number>349</Room_Number>
    <Breakfast>Sweet Roll</Breakfast>
    <Lunch>Salad</Lunch>
    <Dinner>Tofu and Vegetables</Dinner>
  </Room_Service>
  <Room_Service>
    <Room_Number>500</Room_Number>
    <Breakfast>Omelet</Breakfast>
    <Lunch>Sausage</Lunch>
    <Dinner>Veal</Dinner>
  </Room_Service>
  <Room_Service>
    <Room_Number>702</Room_Number>
    <Breakfast>Eggs</Breakfast>
    <Lunch>Tuna fish</Lunch>
    <Dinner>Cheese Sandwich</Dinner>
  </Room_Service>
</Kitchen>

Related devCity.NET articles:

How would you rate the quality of this article?
1 2 3 4 5
Poor Excellent
Tell us why you rated this way (optional):

Article Rating
The average rating is: No-one else has rated this article yet.

Article rating:3.77083333333333 out of 5
 48 people have rated this page
Article Score44214
Comments    Submit Comment

Comment #1  (Posted by john a. bailo on 05/14/2002)

this is good stuff and is helping me with some parsing routines im working on right now... thanks!

here's my questions (trivial though they may be)...

i want to compile this using the line command compiler. i do it with "vbc /references:System.Data.dll modXML.vb"

the hang ups are 1. vbTab doesnt seem to be recognized...i read somewhere that this requires importing the old vb6 namespace... is there a .net way to represent the tab character. for the statements using the DataSet :: error msg returned is that it wants the assembly with the System.ComponentModel.MarshallByValueComponent -- a guess as to which .dll contains this so i can add it to my references would be appreciated :)
 
Comment #2  (Posted by Vindy on 05/23/2002)

Could you please help me with the code for converting an xml file to a comma seperated file.
Thanks,
Vinod.
 
Comment #3  (Posted by Roger on 05/30/2002)

This is really useful
 
Comment #4  (Posted by ARUN on 08/16/2002)

Please help me , I want to convert the MARC21 Concise format Bibliographic Data into XML as well as Text file.
The scenario is Receive the bibliographic Data from a database and convert it to XML and Text file .
Looking forward please reply.
Regards
Arun


 
Comment #5  (Posted by Kevin Forbes on 03/18/2003)

This is great, however, the file I'm trying to convert has multiple consecutive delimiters of the same type (multiple spaces between columns). I can get the data into excel using a query, however, I must check the (treat consecutive delimiters as one). Is there an option like this in the oSR.ReadLine().Split method that I haven't been able to find?
 
Comment #6  (Posted by Tim on 05/16/2003)

Great article! This is exactly the reference I was looking for. I am trying to convert a pipe delimited report file into xml so that I have all kinds of display options on the web. Thanks.
 
Comment #7  (Posted by Harendra Kumar on 10/30/2003)

Their is better , quick and neat way to do it in 3 steps.
Step 1
Define the Text File Schema in schema.ini and store in in same directory where you will store the text files.
Step 2
USE ADO.NET with OleDb.OleDbConnection object to connect to this text file.
Step 3
Use following code do it all for you.
string PathtoTextFile = Server.MapPath("data/");
System.Data.OleDb.OleDbConnection oCon = new
System.Data.OleDb.OleDbConnection(
"Provider=Microsoft.Jet.OLEDB.4.0;" +
"Data Source=" + PathtoTextFile + ";" +
"Extended Properties=\"text;HDR=YES;FMT=Delimited\"");
System.Data.OleDb.OleDbDataAdapter oCmd = new
System.Data.OleDb.OleDbDataAdapter(
"select * from csv.txt",
oCon);

DataSet myDS = new DataSet();
oCmd.Fill(myDS);
oCon.Close();

It's all done.

 
Comment #8  (Posted by Danyelle on 01/21/2004)

Do you know of any tools that convert any or most files to XML? What types of files does this tool convert to XML?
 
Comment #9  (Posted by Padmaja on 04/23/2004)

This is really great stuff.

I want to put this xml data into a sql server 2000 table with a mapping xml schema. For that, I should have xsd="http://www.w3.org/2001/XMLSchema" xsi="http://www.w3.org/2001/XMLSchema-instance" in the root element of xml data.

can I have xsd="http://www.w3.org/2001/XMLSchema" xsi="http://www.w3.org/2001/XMLSchema-instance" in the root element and 'nil=true' attribute in the column nodes, which loos like this:
------------------------
Kitchen xsd="http://www.w3.org/2001/XMLSchema" xsi="http://www.w3.org/2001/XMLSchema-instance"
Room_Service
Room_Number nil="true" 290 Room_Number
Breakfast nil="true" Bagel Breakfast
Lunch nil="true" Pizza Lunch
Dinner nil="true" Salmon Dinner
Room_Service
Kitchen
---------------------
If so, Please explain me how to do that.....

Thanks for all your help

Rgds,
Padmaja

 
Comment #10  (Posted by Himanshu on 05/17/2004)

Hey Harendra,
This code is not workin on my system.....
 
Comment #11  (Posted by Sean Wilkins on 12/24/2004)

Nicely done - simple, yet powerful! I like to see code example like this!


 
Comment #12  (Posted by an unknown user on 04/06/2005)
Rating
Hey man, that is some nice sort and sweet code, thanks.

What about some code clipping into exchange for e-mail disclaimer auto-reply.

Heinrich Glover hg@tiscali.co.za
6-Apr-2004
 
Comment #13  (Posted by an unknown user on 04/18/2005)
Rating
Clean to the point and well written
 
Comment #14  (Posted by an unknown user on 05/17/2005)
Rating
actually i want to convert text or Xls file in xml format . and it helps me for that ..


 
Comment #15  (Posted by an unknown user on 07/22/2005)
Rating
great stuff! i'm on a tight timeline and this will serve as a framework for my conversion tool. thank you!
 
Comment #16  (Posted by an unknown user on 09/29/2005)
Rating
This is exactly what I needed, and it works great! Thank you for posting it.
 
Comment #17  (Posted by an unknown user on 09/29/2005)
Rating
This is exactly what I needed, and it works great! Thank you for posting it.
 
Comment #18  (Posted by an unknown user on 03/20/2006)
Rating
it is very useful since operations with flatfiles and xml is a important task in asp.net


 
Comment #19  (Posted by an unknown user on 04/11/2006)
Rating
Superb.. To the point, accurate and very useful.
 
Comment #20  (Posted by an unknown user on 04/13/2006)
Rating
This code is really great.Thanks a lot Burn.This code really helped me a lot
 
Comment #21  (Posted by an unknown user on 04/16/2006)
Rating
Thank you, this code helped me a lot. Now I just wonder how I can set the encoding for the XML file to utf-8...
 
Comment #22  (Posted by an unknown user on 06/13/2006)
Rating
Very very helpful.Thank you very much.
 
Comment #23  (Posted by an unknown user on 07/05/2006)
Rating
It has a specific goal which is acheived, doesn't deviate from the topic and provides clear examples.
 
Comment #24  (Posted by an unknown user on 11/24/2006)
Rating
it is very usel fOR XML DEVELOPER...... THANK U VERY MUCH.pLEASE CONTINUE IN THE FUTURE ALSO.
THANKS
Sriram
 
Comment #25  (Posted by an unknown user on 01/03/2007)
Rating
Good really helped me soo much.
Thanks for ur help.
 
Comment #26  (Posted by an unknown user on 04/13/2007)
Rating
Really Really Good article explaining the conversion very easy way to follow.
 
Comment #27  (Posted by an unknown user on 04/23/2007)
Rating
really nice code...it was help me allot..thx for help
 
Comment #28  (Posted by an unknown user on 05/12/2007)
Rating
good stuff,
 
Comment #29  (Posted by an unknown user on 06/11/2007)
Rating
Sir, In visual studio 2005 i am getting a error in the following field. Any help would be greatly appreciated.

oRows(intCounter) = strFields
"Make sure that the maximum index on a list is less than the list size.'

I am using same text files and same code.

Thank You

- Kris
 
Comment #30  (Posted by an unknown user on 08/22/2007)
Rating
Bravo !!
 
Comment #31  (Posted by an unknown user on 04/04/2008)
Rating
This was very useful thanks
 
Comment #32  (Posted by an unknown user on 09/16/2008)
Rating
Awesommmmmmmmmme code...really really helpful..
 
Comment #33  (Posted by an unknown user on 11/01/2008)
Rating
This is tremendous, it works even in a web page using ASP.net, this saved my assignment, really easy and comprehensible, you are genius, this is recommendable to everyone dealing with legacy systems and more. All thanks from Birmingham
 
Comment #34  (Posted by an unknown user on 12/01/2008)
Rating
riclira
 
Comment #35  (Posted by an unknown user on 08/19/2009)
Rating
Great work! Thanks!!!
 
Comment #36  (Posted by on 10/11/2009)
Rating

 
Comment #37  (Posted by Cesadolv on 11/09/2009)
Rating
best plavix prices online best plavix prices online best plavix prices online . best plavix prices online best plavix prices online . best plavix prices online best plavix prices online . best plavix prices online free carisoprodol without a perscription free carisoprodol without a perscription . free carisoprodol without a perscription free carisoprodol without a perscription . free carisoprodol without a perscription free carisoprodol without a perscription . free carisoprodol without a perscription free carisoprodol without a perscription free carisoprodol without a perscription . celexa sexual side effects ambien celexa sexual side effects ambien celexa sexual side effects ambien celexa sexual side effects ambien celexa sexual side effects ambien . celexa sexual side effects ambien celexa sexual side effects ambien celexa sexual side effects ambien celexa sexual side effects ambien celexa sexual side effects ambien .
 
Comment #38  (Posted by Gedlopjv on 11/25/2009)
Rating
Here is intresting people… Lets talk!: URLsWithHREF
 
Comment #39  (Posted by SasDero on 11/28/2009)
Rating
I like your diary. Allow to be friends!: URLsWithURL
 
Comment #40  (Posted by an unknown user on 12/16/2009)
Rating
I have been trying to find something that would start me on coverting a text file to xml - and this was the best I came across so far. thank you
 
Sponsored Links