09.17.2007 at 10:24AM PDT, ID: 22833778

Convert XML to CSV (Pipe Delimited) - Large Files 100mb to 1GB

Asked by thyros in Extensible Stylesheet Language Transformation (XSLT), Extensible Markup Language (XML)

I am working with large XML files (ranging from 100 MB to over 1 GB), so loading them into memory is not really possible. The goal is to convert these files to a CSV-style format that is delimited by the pipe character (|) instead of commas.


The XML file looks like this:


<?xml version="1.0" encoding="UTF-8" ?>
<merch_item_feed>
  <item_data>
    <item_basic_data>
      <item_unique_id>0115526102</item_unique_id>
      <item_ean>9780115526107</item_ean>
      <item_sku>0115526102</item_sku>
      <item_upc />
      <item_mpn />
      <item_brand>Stationery Office Books</item_brand>
      <item_name>The Official Learning to Drive Pack (Driving Skills)</item_name>
      <item_model />
      <item_category>Book</item_category>
      <item_short_desc>Paperback, Stationery Office Books</item_short_desc>
      <item_page_url>http://www.amazon.co.uk/exec/obidos/ASIN/0115526102/AssocID/ref=nosim</item_page_url>
      <amzn_page_url>http://www.amazon.co.uk/exec/obidos/ASIN/0115526102/AssocID/ref=nosim</amzn_page_url>
      <offer_page_url>http://www.amazon.co.uk/o/redirect?tag=AssocID&amp;link_code=asm&amp;path=tg/detail/offer-listing/-/0115526102/new/ASIN/0115526102&amp;camp=1634&amp;creative=6738</offer_page_url>
      <offer_used_url>http://www.amazon.co.uk/o/redirect?tag=AssocID&amp;link_code=asm&amp;path=tg/detail/offer-listing/-/0115526102/used/ASIN/0115526102&amp;camp=1634&amp;creative=6738</offer_used_url>
      <item_image_url>http://ec1.images-amazon.com/images/I/31DY01WGJ1L.jpg</item_image_url>
      <item_image_url_small>http://ec1.images-amazon.com/images/I/11FE11DK7YL.jpg</item_image_url_small>
      <item_salesrank>233499</item_salesrank>
      <item_price>21.23</item_price>
      <item_inventory>Usually dispatched within 1-2 business days</item_inventory>
      <item_shipping_charge>Check Site.</item_shipping_charge>
      <amzn_price>24.99</amzn_price>
      <amzn_inventory>Usually dispatched within 24 hours</amzn_inventory>
      <amzn_shipping_charge>Free!</amzn_shipping_charge>
      <fm_price>24.99</fm_price>
      <fm_inventory>Usually dispatched within 24 hours</fm_inventory>
      <fm_shipping_charge>Free!</fm_shipping_charge>
      <tp_new_price>21.23</tp_new_price>
      <tp_new_inventory>Usually dispatched within 1-2 business days</tp_new_inventory>
      <tp_new_shipping_charge>Check Site.</tp_new_shipping_charge>
      <tp_used_price>20.00</tp_used_price>
      <tp_used_inventory>In Stock</tp_used_inventory>
      <tp_used_shipping_charge>Check Site.</tp_used_shipping_charge>
    </item_basic_data>
    <prod_specific_data category="book">
      <known_attr_val_pair attr="book_author" val="Driving Standards Agency" />
      <known_attr_val_pair attr="book_isbn" val="0115526102" />
      <known_attr_val_pair attr="book_format" val="Paperback" />
    </prod_specific_data>
    <merch_cat_list>
      <merch_cat_item>
        <merch_cat_name>277082</merch_cat_name>
        <merch_cat_path>Books/Subjects/Reference/Transport/Automotive/Driving &amp; the Highway Code</merch_cat_path>
      </merch_cat_item>
      <merch_cat_item>
        <merch_cat_name>278131</merch_cat_name>
        <merch_cat_path>Books/Subjects/Science &amp; Nature/Engineering &amp; Technology/Civil Engineering/Road &amp; Transport</merch_cat_path>
      </merch_cat_item>
      <merch_cat_item>
        <merch_cat_name>10834521</merch_cat_name>
        <merch_cat_path>Books/Special Features/34% off Books over £10/Science &amp; Nature</merch_cat_path>
      </merch_cat_item>
      <merch_cat_item>
        <merch_cat_name>10834491</merch_cat_name>
        <merch_cat_path>Books/Special Features/34% off Books over £10/Reference &amp; Languages</merch_cat_path>
      </merch_cat_item>
    </merch_cat_list>
  </item_data>
  <item_data>
    <item_basic_data>
      <item_unique_id>0115528423</item_unique_id>
      <item_ean>9780115528422</item_ean>
      <item_sku>0115528423</item_sku>
      <item_upc />
      <item_mpn />
      <item_brand>The Stationary Office (TSO)</item_brand>
      <item_name>The Official DSA Theory Test for Motorcyclists CD-ROM</item_name>
      <item_model />
      <item_category>Software</item_category>
      <item_short_desc>, Platforms: Windows XP</item_short_desc>
      <item_page_url>http://www.amazon.co.uk/exec/obidos/ASIN/0115528423/AssocID/ref=nosim</item_page_url>
      <amzn_page_url>http://www.amazon.co.uk/exec/obidos/ASIN/0115528423/AssocID/ref=nosim</amzn_page_url>
      <offer_page_url>http://www.amazon.co.uk/o/redirect?tag=AssocID&amp;link_code=asm&amp;path=tg/detail/offer-listing/-/0115528423/new/ASIN/0115528423&amp;camp=1634&amp;creative=6738</offer_page_url>
      <offer_used_url>http://www.amazon.co.uk/o/redirect?tag=AssocID&amp;link_code=asm&amp;path=tg/detail/offer-listing/-/0115528423/used/ASIN/0115528423&amp;camp=1634&amp;creative=6738</offer_used_url>
      <item_image_url>http://ec1.images-amazon.com/images/I/31Au9yM7IZL.jpg</item_image_url>
      <item_image_url_small>http://ec1.images-amazon.com/images/I/11ldivFAIML.jpg</item_image_url_small>
      <item_salesrank>1068</item_salesrank>
      <item_price>16.99</item_price>
      <item_inventory>Not yet released</item_inventory>
      <item_shipping_charge>Free!</item_shipping_charge>
      <amzn_price>16.99</amzn_price>
      <amzn_inventory>Not yet released</amzn_inventory>
      <amzn_shipping_charge>Free!</amzn_shipping_charge>
      <fm_price>16.99</fm_price>
      <fm_inventory>Not yet released</fm_inventory>
      <fm_shipping_charge>Free!</fm_shipping_charge>
    </item_basic_data>
    <prod_specific_data category="software">
      <known_attr_val_pair attr="hardware_platform" val="PC" />
      <known_attr_val_pair attr="software_os" val="Windows XP" />
      <known_attr_val_pair attr="software_format" val="CD-ROM" />
    </prod_specific_data>
    <merch_cat_list>
      <merch_cat_item>
        <merch_cat_name>277082</merch_cat_name>
        <merch_cat_path>Books/Subjects/Reference/Transport/Automotive/Driving &amp; the Highway Code</merch_cat_path>
      </merch_cat_item>
      <merch_cat_item>
        <merch_cat_name>278131</merch_cat_name>
        <merch_cat_path>Books/Subjects/Science &amp; Nature/Engineering &amp; Technology/Civil Engineering/Road &amp; Transport</merch_cat_path>
      </merch_cat_item>
      <merch_cat_item>
        <merch_cat_name>912026</merch_cat_name>
        <merch_cat_path>Software/Categories/Hobbies &amp; Pastimes/Driving Tests</merch_cat_path>
      </merch_cat_item>
      <merch_cat_item>
        <merch_cat_name>16305411</merch_cat_name>
        <merch_cat_path>Software/Categories/Hobbies &amp; Pastimes/All Hobbies &amp; Pastimes</merch_cat_path>
      </merch_cat_item>
      <merch_cat_item>
        <merch_cat_name>317243011</merch_cat_name>
        <merch_cat_path>Software/Categories/Digital Imaging/Creativity Software</merch_cat_path>
      </merch_cat_item>
      <merch_cat_item>
        <merch_cat_name>341610011</merch_cat_name>
        <merch_cat_path>uk-shops/Education Resources/Software/Driving Tests</merch_cat_path>
      </merch_cat_item>
    </merch_cat_list>
  </item_data>
</merch_item_feed>


The objective is to extract the data from the 'item_basic_data' elements and separate the fields with the pipe character.

The output should look something like this (with the field headers on the first line):

item_unique_id|item_ean|item_upc
12345678901|12345678|12345678
12345678901|12345678|12345678
12345678901|12345678|12345678
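
A possible way to produce this output without loading the whole file is sketched below. It uses Python's standard library rather than XSLT, which is a substitution on my part and not something the question asks for. The function name `xml_to_pipe` and the `COLUMNS` list are hypothetical. The column list has to be fixed up front because the items in the sample do not all carry the same child elements (the second item has no 'tp_new_*' or 'tp_used_*' fields).

```python
import csv
import xml.etree.ElementTree as ET

# Hypothetical column selection; extend it with whichever
# item_basic_data children you want in the output.
COLUMNS = ["item_unique_id", "item_ean", "item_upc", "item_name", "item_price"]


def xml_to_pipe(source, dest, columns=COLUMNS):
    """Stream item_basic_data records from source into dest, pipe-delimited.

    source: a filename or a binary file object containing the feed.
    dest:   a text file object opened for writing.
    Returns the number of data rows written.
    """
    writer = csv.writer(dest, delimiter="|", lineterminator="\n")
    writer.writerow(columns)  # header line

    # iterparse yields elements as they are parsed, so the full
    # document tree is never held in memory at once.
    context = ET.iterparse(source, events=("start", "end"))
    _event, root = next(context)  # first event is the start of the root

    rows = 0
    for event, elem in context:
        if event == "end" and elem.tag == "item_data":
            basic = elem.find("item_basic_data")
            if basic is not None:
                # Missing or empty elements (e.g. <item_upc />) become "".
                writer.writerow(
                    [basic.findtext(name, default="") for name in columns]
                )
                rows += 1
            # Drop the finished item so memory use stays flat.
            root.clear()
    return rows
```

For a real feed, the destination would typically be opened with something like `open("items.csv", "w", encoding="utf-8", newline="")` (the file name is only an example). The `csv` module wraps a value in double quotes if it happens to contain a pipe character, so such rows still parse correctly.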

------------------------------------------

Please note that only the information from 'item_basic_data' needs to be extracted; instructions on how to accomplish this are sufficient as an answer.  However, if you know your stuff, I would appreciate a solution that could also extract the first instance of 'merch_cat_path'.  As you can see, each 'item_data' element contains several 'merch_cat_path' elements (four and six in the two sample items), but we only want the first one if possible.
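
To illustrate the "first instance only" requirement, here is a minimal sketch, again in Python with the standard library and not the XSLT the question has in mind. The generator name `first_category_paths` is hypothetical. It relies on `findtext` returning the first match in document order, so only the first 'merch_cat_path' of each item is picked up.

```python
import xml.etree.ElementTree as ET


def first_category_paths(source):
    """Yield (item_unique_id, first merch_cat_path) per item_data element.

    source: a filename or a binary file object containing the feed.
    Items without any merch_cat_path yield an empty string for the path.
    """
    context = ET.iterparse(source, events=("start", "end"))
    _event, root = next(context)  # first event is the start of the root

    for event, elem in context:
        if event == "end" and elem.tag == "item_data":
            item_id = elem.findtext(
                "item_basic_data/item_unique_id", default=""
            )
            # findtext stops at the first matching element, so any
            # later merch_cat_item entries are ignored.
            first_path = elem.findtext(
                "merch_cat_list/merch_cat_item/merch_cat_path", default=""
            )
            yield item_id, first_path
            # Release the processed item to keep memory use flat.
            root.clear()
```

The path value could be appended as one extra column to each pipe-delimited row.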

I am assuming we will need an XSLT file, but I don't know how to write one.  I am experimenting with a program that takes the input XML, applies an XSLT transform, and writes the CSV output, but it does not supply the XSLT file itself.

Also, I would welcome any suggestions for similar programs that can process large XML files, preferably freeware, though commercial is OK too.


About this solution

Zones: Extensible Stylesheet Language Transformation (XSLT), Extensible Markup Language (XML)
Tags: xml, csv, convert, pipe, file
Solution Provided By: abel
Participating Experts: 2
Solution Grade: A
 
 