OASIS Mailing List ArchivesView the OASIS mailing list archive below
or browse/search using MarkMail.

 


Help: OASIS Mailing Lists Help | MarkMail Help

 


 

   RE: [xml-dev] Large xml processing into SQL Server

[ Lists Home | Date Index | Thread Index ]
  • To: xml-dev@lists.xml.org
  • Subject: RE: [xml-dev] Large xml processing into SQL Server
  • From: Christopher Milton <cmiltonperl@yahoo.com>
  • Date: Fri, 7 Jan 2005 12:55:09 -0800 (PST)
  • Comment: DomainKeys? See http://antispam.yahoo.com/domainkeys
  • Domainkey-signature: a=rsa-sha1; q=dns; c=nofws; s=s1024; d=yahoo.com; b=iXU2+P8Q+JPa/JTYYQy35ZWM1ViAxH1W04THOe4IWJTh4BNpbnKxrC7E9oXpwafN55uNsfFioRgqlWJQ4R4pQTQcZhDU8XS3hZbbjKFCYPqEfe3RVOzP3aOC1qmE1QkHsJpLEq+CXzEJpuY8+6Qltb1LKI0izFim022goZ1xk8o= ;
  • In-reply-to: <5FA491FA70F4AD43B6C3ECA9B39E441614AF25@AMRXM1106.dir.svc.accenture.com>

These links may help:

"Using the WebLogic XML Streaming API"
http://e-docs.bea.com/wls/docs81/xml/xml_stream.html#1092257
"BEA WebLogic XML/Non-XML Translator"
http://e-docs.bea.com/wlxt/10/index.htm

From experience, I set my java -Xmx# to at least 4 times the
size of the file I'm parsing for in-memory processing like DOM
and at least twice the size for stream processing like SAX.
YMMV

--- william.j.boxall@accenture.com wrote:
> Thank you all who have offered to help!
>  
> I am home sick right now, so don't have access to my work computer so I can't
> check all the settings.. The driver we are using, though, is
> weblogic.jdbc.sqlserver.SQLServerDriver.
> The temp table I am attempting to write the records to is:
>  
> CREATE TABLE [TMP_TAN_TAB] (
>  [TAN_PREFIX] [char] (4) COLLATE SQL_Latin1_General_CP1_CI_AS NOT NULL ,
>  [TAN_NUMBER] [char] (9) COLLATE SQL_Latin1_General_CP1_CI_AS NOT NULL ,
>  [CHANNEL_TYPE] [char] (1) COLLATE SQL_Latin1_General_CP1_CI_AS NOT NULL ,
>  [STATUS] [char] (1) COLLATE SQL_Latin1_General_CP1_CI_AS NOT NULL ,
>  [TIN] [varchar] (10) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
>  [NON_GDS_IND] [bit] NULL ,
>  [ACTIVE_IND] [bit] NOT NULL ,
>  [LAST_UPDATE] [smalldatetime] NOT NULL ,
>  [INTERFACE_NAME] [varchar] (15) COLLATE SQL_Latin1_General_CP1_CI_AS NULL 
> ) ON [PRIMARY]
> GO
> 
> And a sample 3 record xml file is:
> 
> <?xml version="1.0"?> 
> <AMEXTANToACNTAN DateTimeCreated="20040127165815" RecordCount="3"
> xmlns="stsi/amextan" SerialNum="1">
>     <BusinessUnit Code="AAAA" TANCount="2">
>         <ITEM TAN="AAAA00324">
>             <ChannelType>P</ChannelType>
>             <TANStatus>A</TANStatus>
>             <TIN>       </TIN>
>         </ITEM>
>         <ITEM TAN="AAAA98422">
>             <ChannelType>R</ChannelType>
>             <TANStatus>U</TANStatus>
>             <TIN>1234567</TIN>
>         </ITEM>
>     </BusinessUnit>
>     <BusinessUnit Code="BBBB" TANCount="1">
>         <ITEM TAN="BBBB00420">
>             <ChannelType>P</ChannelType>
>             <TANStatus>C</TANStatus>
>             <TIN>       </TIN>
>         </ITEM>
>     </BusinessUnit>
> </AMEXTANToACNTAN>
> 
> 
> But the size of the file IS big.. The sample file we received was 72 meg. 
> That equates into roughly 819,000 records.
>  
> This huge a file won't be an every day occurance, though.  It's a 1 time
> initial load, and thereafter the daily files we receive will be smaller,
> though we do not know exactly what size they will be.
>  
> Currently, I'm attempting to work with a smaller version of the file, which
> is only 3 meg, or about 33,000 records.
>  
> What we're using to process the xml is whatever came packaged with BEA
> Weblogic.. I'll admit I didn't dig too deeply into that.  There is a file
> event generator that picks up the physical file and passes it to my process
> as an xml document.  I then store it in a document type, and do my iteration
> and validation checks on individual fields, then as I iterate through and the
> individual record passes the validation routine, I add it to a batch sql
> statement, which I pass to the database at the end.  
>  
> We do not use DTDs or XSLT, currently, though we may have to.  We do have a
> schema, though, which is in hierarchical format.  It may need to be flattened
> if we are to do a bulk load.
>  
> Once again, thank you for any help you can give!
> Bill
>  
> 
> 
> This message is for the designated recipient only and may contain privileged,
> proprietary, or otherwise private information.  If you have received it in
> error, please notify the sender immediately and delete the original.  Any
> other use of the email by you is prohibited.
> 
> -----------------------------------------------------------------
> The xml-dev list is sponsored by XML.org <http://www.xml.org>, an
> initiative of OASIS <http://www.oasis-open.org>
> 
> The list archives are at http://lists.xml.org/archives/xml-dev/
> 
> To subscribe or unsubscribe from this list use the subscription
> manager: <http://www.oasis-open.org/mlmanage/index.php>
> 
> 





 

News | XML in Industry | Calendar | XML Registry
Marketplace | Resources | MyXML.org | Sponsors | Privacy Statement

Copyright 2001 XML.org. This site is hosted by OASIS