(PureBasic) Efficiently Process a Huge XML File

Demonstrates a technique for processing a huge XML file (can be any size, even many gigabytes).

Note: This example requires Chilkat v9.5.0.80 or greater.
IncludeFile "CkXml.pb"
IncludeFile "CkStringBuilder.pb"
IncludeFile "CkFileAccess.pb"

Procedure ChilkatExample()

    ; This example shows a way to efficiently process a gigantic XML file -- one that may be too large
    ; to fit in memory.
    ;
    ; Two types of XML parsers exist: DOM parsers and SAX parsers.
    ; A DOM (Document Object Model) parser loads the entire XML into memory, and the application
    ; has the luxury of interacting with the XML in a convenient, random-access way.
    ; The Chilkat Xml class is a DOM parser. Because the entire XML is loaded into memory,
    ; huge XML files (on the order of gigabytes) usually cannot be loaded because of memory constraints.
    ; A SAX parser parses the XML file as an input stream. No DOM exists.
    ; Using a SAX parser is generally less palatable than using a DOM parser, for many reasons.
    ;
    ; The technique described here is a hybrid. It streams the XML file as unstructured text
    ; to extract fragments, each of which is treated as a separate XML document and loaded into
    ; the Chilkat Xml parser.
    ;
    ; For example, imagine your XML file is several GBs in size, but has a relatively simple structure, such as:
    ;
    ; <Transactions>
    ;     <Transaction id="1">
    ;         ...
    ;     </Transaction>
    ;     <Transaction id="2">
    ;         ...
    ;     </Transaction>
    ;     <Transaction id="3">
    ;         ...
    ;     </Transaction>
    ;     ...
    ; </Transactions>
    ;
    ; In the following code, each <Transaction ...> ... </Transaction>
    ; is extracted and loaded separately into an Xml object, where it can be manipulated
    ; independently. The XML file as a whole is never loaded into memory.

    fac.i = CkFileAccess::ckCreate()
    If fac.i = 0
        Debug "Failed to create object."
        ProcedureReturn
    EndIf

    success.i = CkFileAccess::ckOpenForRead(fac,"qa_data/xml/transactions.xml")
    If success = 0
        Debug CkFileAccess::ckLastErrorText(fac)
        CkFileAccess::ckDispose(fac)
        ProcedureReturn
    EndIf

    xml.i = CkXml::ckCreate()
    If xml.i = 0
        Debug "Failed to create object."
        ; Release the object created earlier before returning.
        CkFileAccess::ckDispose(fac)
        ProcedureReturn
    EndIf

    sb.i = CkStringBuilder::ckCreate()
    If sb.i = 0
        Debug "Failed to create object."
        ; Release the objects created earlier before returning.
        CkFileAccess::ckDispose(fac)
        CkXml::ckDispose(xml)
        ProcedureReturn
    EndIf

    firstIteration.i = 1
    retval.i = 1
    numTransactions.i = 0

    ; The begin marker is "XML tag aware". If the begin marker begins with "<"
    ; and ends with ">", then it is assumed to be an XML tag, and it also matches
    ; opening tags where a whitespace char appears in place of the ">"
    ; (for example, <Transaction id="1">).
    beginMarker.s = "<Transaction>"
    endMarker.s = "</Transaction>"

    While retval = 1
        CkStringBuilder::ckClear(sb)

        ; The retval can have the following values:
        ; 0: No more fragments exist.
        ; 1: Captured the next fragment. The text from beginMarker to endMarker, including the markers, is returned in sb.
        ; -1: Error.
        retval = CkFileAccess::ckReadNextFragment(fac,firstIteration,beginMarker,endMarker,"utf-8",sb)
        firstIteration = 0

        If retval = 1
            numTransactions = numTransactions + 1
            success = CkXml::ckLoadSb(xml,sb,1)
            If success = 0
                ; The fragment could not be parsed as XML.
                Debug CkXml::ckLastErrorText(xml)
            EndIf

            ; Your application may now do what it needs with this particular XML fragment...
        EndIf

    Wend

    If retval < 0
        Debug CkFileAccess::ckLastErrorText(fac)
    EndIf

    Debug "numTransactions: " + Str(numTransactions)

    CkFileAccess::ckDispose(fac)
    CkXml::ckDispose(xml)
    CkStringBuilder::ckDispose(sb)

    ProcedureReturn
EndProcedure
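
The loop above leaves the per-fragment work to the application. As a rough illustration only, the sketch below loads a single stand-in fragment from a string and reads values from it with the Chilkat Xml object. The fragment text, the `Amount` child element, and the procedure name `ProcessFragmentSketch` are hypothetical and are not part of the original example. In the real loop the fragment would already be loaded by ckLoadSb, so only the two lookup calls would apply.

```purebasic
IncludeFile "CkXml.pb"

Procedure ProcessFragmentSketch()

    xml.i = CkXml::ckCreate()
    If xml.i = 0
        Debug "Failed to create object."
        ProcedureReturn
    EndIf

    ; Hypothetical stand-in for one fragment captured by ckReadNextFragment.
    ; Single quotes are used for the attribute to avoid escaping inside the string.
    fragment.s = "<Transaction id='1'><Amount>12.50</Amount></Transaction>"

    success.i = CkXml::ckLoadXml(xml,fragment)
    If success = 0
        Debug CkXml::ckLastErrorText(xml)
        CkXml::ckDispose(xml)
        ProcedureReturn
    EndIf

    ; Read the id attribute of the root <Transaction> element.
    Debug "id: " + CkXml::ckGetAttrValue(xml,"id")

    ; Read the text content of a direct child (the Amount tag is an assumption).
    Debug "Amount: " + CkXml::ckGetChildContent(xml,"Amount")

    CkXml::ckDispose(xml)

    ProcedureReturn
EndProcedure
```

Because each fragment is a small, complete XML document, the usual random-access DOM calls remain available per transaction while the file itself is still read as a stream.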
© 2000-2024 Chilkat Software, Inc. All Rights Reserved.