Extracting large XML documents from a database

I had a scenario where I had to extract a large XML document  from a stored procedure. The business requirement was to extract the data from several tables as xml that conforms to a XSD document, write it to a file, then send it via FTP.

The solution was to use SSIS and the SQL Task. This proved to be an issue because of the size of the XML extracted from the database. The process took over 20 minutes to complete and I constantly got out of memory errors.

The final solution was to use a Script task incorporating a xml reader and xml writer to solve the memory issues. The process now runs under 15 seconds.

Here is the code I wrote in the scripting task.

 public void Main()
        {
            // TODO: Add your code here
            bool fireAgain = true;
            Dts.Events.FireInformation(0, Dts.Variables["System::TaskName"].Value.ToString(), "Starting...", string.Empty, 0, ref fireAgain);
            
            ConnectionManager cm = Dts.Connections["Custs_Connection"];

            string dropFolder = (string)Dts.Connections["FILE_Xml_Extract_Folder"].ConnectionString; 
            string requestLogID = Dts.Variables["V_RequestLogID"].Value.ToString();
            string sprocName = Dts.Variables["V_ExtractStoreProcedure"].Value.ToString();    
          
            string connString = cm.ConnectionString;
            //remove the existing provider type
            connString = connString.Replace("Provider=SQLNCLI10.1;","");
            connString = connString.Replace("Auto Translate=False;", "");

            //create filename using the request log id and datetime
            string fileName = "CustomerBulkData_" + requestLogID + "_" + DateTime.Now.ToString("ddMMMyyyy_HHmmss") + ".xml";

            //close the file setting
            var settings = new XmlWriterSettings { CloseOutput = true};

            try
            {

                using (SqlConnection conn = new SqlConnection(connString))
                {
                    SqlCommand cmd = conn.CreateCommand();
                    cmd.CommandText = sprocName;
                    cmd.CommandType = CommandType.StoredProcedure;
                    cmd.Parameters.Add("@RequestID", SqlDbType.Int).Value = requestLogID;
                    conn.Open();

                    using (XmlReader xdr = cmd.ExecuteXmlReader())
                    using (XmlWriter writer = XmlWriter.Create(File.OpenWrite(Path.Combine(dropFolder, fileName)), settings))
                    {
                        writer.WriteNode(xdr, true);
                    }

                }

                Dts.Variables["V_ExtractFileName"].Value = fileName;

                Dts.Events.FireInformation(0, Dts.Variables["System::TaskName"].Value.ToString(), "Completed.", string.Empty, 0, ref fireAgain);
                Dts.TaskResult = (int)ScriptResults.Success;
            }
            catch (Exception ex)
            {
                Dts.Events.FireError(0, (string)Dts.Variables["System::TaskName"].Value,
                    string.Format("Error:{0}", ex),
                    null, 0);

                Dts.TaskResult = (int)DTSExecResult.Failure;
            }

        }

Enjoy.

Advertisements
This entry was posted in SQL, SSIS and tagged . Bookmark the permalink.

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s