Parse RSS feeds with PHP

RSS feeds are very common today, and at times we want to write a simple script to grab some information from a feed.

PHP has got and extensive set of functions that can be used to manipulate XML (RSS feeds or even HTML) files. PHP DOM library is one of the handy libraries that can be used to parse RSS feeds in PHP. DOM (Document Object Model) is a standard way for accessing and manipulating XML documents. XML documents can be represented in tree-structure (a node tree), with the elements, attributes, and text defined as nodes.

Below you can find the script for parsing a standard RSS feeds:

// Create a new DOMDocument object
$doc = new DOMDocument();
// Load the RSS file into the object
// Initialize empty array
$arrFeeds = array();
// Get a list of all the elements with the name 'item'
foreach ($doc->getElementsByTagName('item') as $node) {
	$itemRSS = array (
		'title' => $node->getElementsByTagName('title')->item(0)->nodeValue,
		'desc' => $node->getElementsByTagName('description')->item(0)->nodeValue,
		'link' => $node->getElementsByTagName('link')->item(0)->nodeValue,
		'date' => $node->getElementsByTagName('pubDate')->item(0)->nodeValue
	array_push($arrFeeds, $itemRSS);
// Output

The getElementsByTagName method is used, within the loop of the item nodes, to get the nodeValue for the title, description, link and date tags. The nodeValue is the text within the node. An array is used to store each set of values and each array represents an entry in the big array that holds our structured RSS data. At the end of the script all the data will be hold by the $arrFeeds array, which is well structured and can be used to display or further manipulation.

One drawback of using the DOM library is that it reads the entire XML document into memory, and then we use the functions for manipulating the data. Thus this method is that is not recommended for large XML documents, which would take too much memory to build the model of the document.

Anyway, usually the feeds we are dealing with are of normal size, and this won’t be an issue at most occasions.

Categorized as Tech Notes Tagged

By Nimal

I am a grounded nomad. I love a bit of art and science. #thappillai


Comments are closed.