A reader nicknamed "Aviator" asked for help parsing the iTunes Library XML file using a SAX parser. He did most of the work implementing the parser, so I'll simply build on his sample code, which you can see in the comments here.
First, let's step back and describe a SAX parser. SAX is an approach to parsing an XML document that uses "handlers": simple callback methods that the parser invokes when it encounters specific parsing events, such as the start or end of an XML tag. SAX parsers are considered lightweight because they only know about parsing events, and they typically hold in memory only the portion of the XML document (the document fragment) that corresponds to the tag being parsed. Contrast this with a DOM parser, which reads the whole document into memory before working on it.
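To make the callback idea concrete, here is a minimal sketch of a handler that simply records each event it receives. The class name `SaxEventDemo` and its `trace` method are made up for illustration; the parser classes come from the standard JDK.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical demo class: records SAX events as plain strings.
public class SaxEventDemo {

    /** Parses the given XML string and returns the events in the order they fired. */
    public static List<String> trace(String xml) throws Exception {
        final List<String> events = new ArrayList<String>();
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName, String qName, Attributes attrs) {
                events.add("start " + qName);
            }

            @Override
            public void characters(char[] ch, int start, int length) {
                // Ignore whitespace-only text between tags.
                String text = new String(ch, start, length).trim();
                if (!text.isEmpty()) {
                    events.add("text " + text);
                }
            }

            @Override
            public void endElement(String uri, String localName, String qName) {
                events.add("end " + qName);
            }
        };
        SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        return events;
    }

    public static void main(String[] args) throws Exception {
        for (String event : trace("<person><name>Fred</name></person>")) {
            System.out.println(event);
        }
    }
}
```

Running `main` prints `start person`, `start name`, `text Fred`, `end name`, `end person`, one per line. Notice that the handler never sees the document as a whole, only one event at a time.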
It's also helpful to take a look at Apple's XML plist format, the file format of the iTunes XML file. The plist format goes way back to the NeXT computer platform. It is basically a key-value representation of data, in which each key corresponds to a value in the document.
For example, consider this very simple (old style) plist:
{
name = "Fred";
}
If you squint a little, it looks a lot like another common format, JSON.
The XML plist format came along later, but it retains the same basic format of a key-value representation, with a bit of extra sugar sprinkled on top. This makes it a little weird to work with.
In traditional XML representations, you might expect the following:
<person name="Fred" />
or
<name>Fred</name>
However, in Apple's XML plist format, you would see:
<dict>
<key>name</key><string>Fred</string>
</dict>
This throws a curve ball at a SAX parser, because it traditionally only knows about the "current" tag: the key and its value arrive as two separate sibling elements. Handling this is where Aviator's code fell short. Fortunately, it's easy enough to teach the parser a few tricks by keeping some local state. We just need to keep a reference to the previous tag and tag value in order to provide the necessary context to understand the current tag's meaning. A more complete parser would probably maintain a stack of the previous tags/values to fully represent the parser's current context, but a simple reference to the previous tag and value is enough for our needs.
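For the curious, the stack-based variant could look something like the sketch below. It is only an illustration under simplifying assumptions: it understands just the `dict`, `key`, `string` and `integer` tags, and the class name `PlistDictSketch` and its `parse` method are invented for this example. It keeps one map per open `<dict>` on a stack and remembers the most recent key so the next value can be filed under it.

```java
import java.io.StringReader;
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical sketch: turns nested plist <dict> elements into nested Maps.
public class PlistDictSketch extends DefaultHandler {
    // One map per open <dict>; the top of the stack is the dict being filled.
    private final Deque<Map<String, Object>> dicts = new ArrayDeque<Map<String, Object>>();
    // Text of the element currently being read.
    private final StringBuilder text = new StringBuilder();
    // The most recent <key>, waiting for its value.
    private String pendingKey;
    private Map<String, Object> root;

    @Override
    public void startElement(String uri, String localName, String qName, Attributes attrs) {
        text.setLength(0);
        if ("dict".equals(qName)) {
            Map<String, Object> dict = new LinkedHashMap<String, Object>();
            if (dicts.isEmpty()) {
                root = dict;                          // outermost dict
            } else {
                dicts.peek().put(pendingKey, dict);   // nested dict is the value of the last key
            }
            dicts.push(dict);
        }
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        // Text may arrive in several chunks, so append instead of overwriting.
        text.append(ch, start, length);
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        if (dicts.isEmpty()) {
            return; // outside any dict; nothing to record
        }
        if ("key".equals(qName)) {
            pendingKey = text.toString();
        } else if ("string".equals(qName)) {
            dicts.peek().put(pendingKey, text.toString());
        } else if ("integer".equals(qName)) {
            dicts.peek().put(pendingKey, Long.parseLong(text.toString().trim()));
        } else if ("dict".equals(qName)) {
            dicts.pop();
        }
    }

    /** Parses a plist fragment held in a string and returns the outermost dict. */
    public static Map<String, Object> parse(String xml) throws Exception {
        PlistDictSketch handler = new PlistDictSketch();
        SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        return handler.root;
    }
}
```

Calling `PlistDictSketch.parse("<dict><key>name</key><string>Fred</string></dict>")` should give back a map with the single entry `name` mapped to `Fred`. For a library with many thousands of tracks you may not want the whole structure in memory, which is one reason the simpler previous-tag approach is used in this post.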
Below is my (slight) reworking of Aviator's code. It will successfully parse the songs out of the iTunes XML Library file.
import org.xml.sax.Attributes;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import java.io.File;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

/**
 * A sample iTunes Library file SAX parser.
 */
public class SAXParserExample extends DefaultHandler {

    private static final String LIBRARY_FILE_PATH = "/tmp/iTunes Music Library.xml"; //"C:\\iTunes Music Library.xml";

    private final List<Song> myTracks = new ArrayList<Song>();

    // Text of the element currently being parsed.
    private final StringBuilder tempVal = new StringBuilder();

    // To maintain context
    private Song tempTrack;
    private boolean foundTracks = false;
    private String previousTag;
    private String previousTagVal;

    public void runExample() {
        parseDocument();
        printData();
    }

    private void parseDocument() {
        // Get a factory
        SAXParserFactory spf = SAXParserFactory.newInstance();
        try {
            // Get a new instance of parser
            SAXParser sp = spf.newSAXParser();
            // Parse the file and register this class for callbacks.
            // Passing a File makes it explicit that this is a local path, not a URI.
            sp.parse(new File(LIBRARY_FILE_PATH), this);
        } catch (SAXException se) {
            se.printStackTrace();
        } catch (ParserConfigurationException pce) {
            pce.printStackTrace();
        } catch (IOException ie) {
            ie.printStackTrace();
        }
    }

    /**
     * Iterate through the list and print
     * the contents
     */
    private void printData() {
        System.out.println("No of Tracks '" + myTracks.size() + "'.");
        for (Song song : myTracks) {
            System.out.println(song.getAlbum() + " - " + song.getName());
        }
    }

    // Event Handlers

    @Override
    public void startElement(String uri, String localName, String qName, Attributes attributes) throws SAXException {
        // Reset the text buffer for the new element
        tempVal.setLength(0);
        if (foundTracks) {
            if ("key".equals(previousTag) && "dict".equalsIgnoreCase(qName)) {
                // Each track is a <dict> that follows a <key>, so create a new Song
                tempTrack = new Song();
                myTracks.add(tempTrack);
            }
        } else {
            if ("key".equals(previousTag) && "Tracks".equalsIgnoreCase(previousTagVal) && "dict".equalsIgnoreCase(qName)) {
                foundTracks = true; // We are now inside the Tracks dict.
            }
        }
    }

    @Override
    public void characters(char[] ch, int start, int length) throws SAXException {
        // The parser may deliver one element's text in several chunks
        // (for example around an "&amp;"), so append rather than overwrite.
        tempVal.append(ch, start, length);
    }

    @Override
    public void endElement(String uri, String localName, String qName) throws SAXException {
        String value = tempVal.toString();
        if (foundTracks) {
            // A value only belongs to the current track if it directly follows a <key>.
            if (tempTrack != null && "key".equals(previousTag)) {
                if ("Name".equalsIgnoreCase(previousTagVal) && "string".equals(qName)) {
                    tempTrack.setName(value);
                } else if ("Artist".equalsIgnoreCase(previousTagVal) && "string".equals(qName)) {
                    tempTrack.setArtist(value);
                } else if ("Album".equalsIgnoreCase(previousTagVal) && "string".equals(qName)) {
                    tempTrack.setAlbum(value);
                } else if ("Play Count".equalsIgnoreCase(previousTagVal) && "integer".equals(qName)) {
                    tempTrack.setPlayCount(Integer.parseInt(value.trim()));
                }
            }
            // Mark when we come to the end of the "Tracks" dict.
            if ("key".equals(qName) && "Playlists".equalsIgnoreCase(value)) {
                foundTracks = false;
            }
        }
        // Keep track of the previous tag so we can track the context when we're at the second tag in a key, value pair.
        previousTagVal = value;
        previousTag = qName;
        tempVal.setLength(0);
    }

    /**
     * A simple representation of a song in the iTunes library.
     */
    public static class Song {
        private String name;
        private String artist;
        private String album;
        private int playCount;

        public String getName() {
            return name;
        }

        public void setName(String name) {
            this.name = name;
        }

        public String getArtist() {
            return artist;
        }

        public void setArtist(String artistName) {
            this.artist = artistName;
        }

        public String getAlbum() {
            return album;
        }

        public void setAlbum(String albumName) {
            this.album = albumName;
        }

        public int getPlayCount() {
            return playCount;
        }

        public void setPlayCount(int playCount) {
            this.playCount = playCount;
        }
    }

    public static void main(String[] args) {
        SAXParserExample spe = new SAXParserExample();
        spe.runExample();
    }
}
At March 31, 2010 @ 11:32 p.m. Aviator(Rishupreet Oberoi) said:
Nicely put Travis!
Fortunately I managed to do almost the same as you mentioned there, and parse by storing the previous value of the tag encountered.
Here is my code:
public class ItunesParser extends DefaultHandler {
    Map<String,MUDTrackBean> tracksMap = null;
    private String current_tag;
    private String prev_val;
    private StringBuilder tempVal = new StringBuilder();
    private MUDTrackBean tempTrack;
    private String trackid;
    private boolean isParentArray = false;
    private String fileLoc;

    public Map<String,MUDTrackBean> itunesParser(String fileLoc) throws SAXException {
        tracksMap = new HashMap<String,MUDTrackBean>();
        this.fileLoc = fileLoc;
        parseDocument();
        //printData();
        return tracksMap;
    }

    private void parseDocument() throws SAXException {
        try {
            SAXParserFactory spf = SAXParserFactory.newInstance();
            spf.setValidating(true);
            spf.setFeature("http://apache.org/xml/features/validation/schema", true);
            SAXParser sp = spf.newSAXParser();
            sp.parse(fileLoc, this);
        } catch (ParserConfigurationException pce) {
            pce.printStackTrace();
        } catch (IOException ie) {
            ie.printStackTrace();
        }
    }

    public void startElement(String uri, String localName, String qName, Attributes attributes) throws SAXException {
        current_tag = qName.trim();
        prev_val = tempVal.toString();
        tempVal = new StringBuilder();
        if (qName.equalsIgnoreCase("array")) {
            isParentArray = true;
        }
    }

    public void characters(char[] ch, int start, int length) throws SAXException {
        if (current_tag.equalsIgnoreCase("string")) {
            tempVal.append(new String(ch, start, length));
        } else {
            tempVal = new StringBuilder(new String(ch, start, length));
        }
    }

    public void endElement(String uri, String localName, String qName) throws SAXException {
        String tempValStr = tempVal.toString();
        if (qName.equalsIgnoreCase("dict") && !isParentArray) {
            if (tempTrack != null) {
                if (tempTrack.getTrackid() != null) {
                    tracksMap.put(trackid, tempTrack);
                }
            }
        } else if (qName.equalsIgnoreCase("key") && !isParentArray) {
            if (tempValStr.equalsIgnoreCase("Track ID"))
                tempTrack = new MUDTrackBean();
        } else if (qName.equals("integer") && prev_val.equalsIgnoreCase("Track ID") && !isParentArray) {
            trackid = tempValStr;
            tempTrack.setTrackid(trackid);
        } else if (qName.equals("string") && prev_val.equalsIgnoreCase("Name") && !isParentArray) {
            tempTrack.setTrackTitle(tempValStr);
        } else if (qName.equalsIgnoreCase("string") && prev_val.equalsIgnoreCase("Artist") && !isParentArray) {
            tempTrack.setArtist(tempValStr);
        } else if (qName.equalsIgnoreCase("string") && prev_val.equalsIgnoreCase("Album") && !isParentArray) {
            tempTrack.setAlbum(tempValStr);
        } else if (qName.equalsIgnoreCase("integer") && prev_val.equalsIgnoreCase("Play Count") && !isParentArray) {
            tempTrack.setPlayCount(Integer.parseInt(tempValStr));
        }
    }
}
Regards
Rishupreet Oberoi (Aviator)
At April 27, 2010 @ 1:37 p.m. Lucas said:
I found it fascinating! But I'm having some issues... like ampersands in song titles... the parser can't get them?
At June 10, 2010 @ 11:55 p.m. Zach Caraher said:
yo can you put up a download link for biddang by ratatat? i pre-ordered the album but its not downloading for some reason. can u email me at bigzremixes@gmail.com
thanks a lot,
zach
At June 21, 2010 @ 12:51 p.m. Travis Cripps said:
Zach,
I suggest you get in touch with the iTunes support people about your missing preorder track. Sorry.
At Feb. 1, 2011 @ 5:08 a.m. Moti said:
Hi, and thanks for this post; it helped me a lot.
Actually, I wrote my application in C++ (using Qt, a cross-platform framework).
My question is: how do you handle localized strings in the Location field? For example, if I have an mp3 file located in a folder with French/Hebrew/Arabic characters, the folder appears in the iTunes Music Library.xml as unreadable characters like '/myMusic/%D7%A9%D7%9C%D7%95%D7%9D%20%D7%97%D7%A0%D7%95%D7%9A%20-%20%D7%94%D7%9E%D7%99%D7%98%D7%91/%D7%A9%D7%9C%D7%95%D7%9D%20%D7%97%D7%A0%D7%95%D7%9A%20-%20%20%D7%9B%D7%9B%D7%94%20%D7%95%D7%9B%D7%9B%D7%94.mp3'.
At Feb. 10, 2011 @ 5:49 a.m. Mits said:
Very useful code indeed. Lucas, the problem you have can be solved with the string method String.replace("&","\&"); when you read something you want to use. The backslash will add this like just a string. Thanks again for the really useful code!!
At June 13, 2011 @ 4:28 p.m. Gil said:
I tried this myself. The SAX parser threw an error.
It turns out that the iTunes XML is not well-formed. It contains two root elements.
The first root element is this: <plist version="1.0">
It is never closed. The second element is the "proper" <dict> root element.
How did you get this to run without encountering an error?