(PHP ActiveX) Efficiently Process a Huge XML File

Demonstrates a technique for processing a huge XML file (can be any size, even many gigabytes).

Note: This example requires Chilkat v9.5.0.80 or greater.
<?php

// This example shows a way to efficiently process a gigantic XML file -- one that may be too large
// to fit in memory.
//
// Two types of XML parsers exist: DOM parsers and SAX parsers.
// A DOM parser is a Document Object Model parser, where the entire XML is loaded into memory
// and the application has the luxury of interacting with the XML in a convenient, random-access
// way. The Chilkat Xml class is a DOM parser. Because the entire XML is loaded into memory,
// huge XML files (on the order of gigabytes) usually cannot be loaded because of memory constraints.
// A SAX parser parses the XML file as an input stream. No DOM exists.
// Using a SAX parser is generally less palatable than using a DOM parser, for many reasons.
//
// The technique described here is a hybrid. It streams the XML file as unstructured text
// to extract fragments that are individually treated as separate XML documents loaded into
// the Chilkat Xml parser.
//
// For example, imagine your XML file is several GBs in size, but has a relatively simple structure, such as:
//
// <Transactions>
//     <Transaction id="1">
//         ...
//     </Transaction>
//     <Transaction id="2">
//         ...
//     </Transaction>
//     <Transaction id="3">
//         ...
//     </Transaction>
//     ...
// </Transactions>

// In the following code, each <Transaction ...> ... </Transaction>
// is extracted and loaded separately into an Xml object, where it can be manipulated
// independently. The XML file is never loaded into memory in its entirety.

// For versions of Chilkat < 10.0.0, use new COM('Chilkat_9_5_0.Chilkat.FileAccess')
$fac = new COM("Chilkat.FileAccess");

$success = $fac->OpenForRead('qa_data/xml/transactions.xml');
if ($success == 0) {
    print $fac->LastErrorText . "\n";
    exit;
}

// For versions of Chilkat < 10.0.0, use new COM('Chilkat_9_5_0.Chilkat.Xml')
$xml = new COM("Chilkat.Xml");
// For versions of Chilkat < 10.0.0, use new COM('Chilkat_9_5_0.Chilkat.StringBuilder')
$sb = new COM("Chilkat.StringBuilder");

$firstIteration = 1;
$retval = 1;
$numTransactions = 0;

// The begin marker is "XML tag aware". If the begin marker begins with "<"
// and ends with ">", then it is assumed to be an XML tag and it will also match
// substrings where the ">" can be a whitespace char.
$beginMarker = '<Transaction>';
$endMarker = '</Transaction>';

while ($retval == 1) {
    $sb->Clear();

    // The retval can have the following values:
    //  0: No more fragments exist.
    //  1: Captured the next fragment. The text from beginMarker to endMarker, including the markers, is returned in sb.
    // -1: Error.
    $retval = $fac->ReadNextFragment($firstIteration,$beginMarker,$endMarker,'utf-8',$sb);
    $firstIteration = 0;

    if ($retval == 1) {
        $numTransactions = $numTransactions + 1;
        $success = $xml->LoadSb($sb,1);

        // Your application may now do what it needs with this particular XML fragment...
    }
}

if ($retval < 0) {
    print $fac->LastErrorText . "\n";
}

print 'numTransactions: ' . $numTransactions . "\n";

?>
© 2000-2025 Chilkat Software, Inc. All Rights Reserved.