I have built a node.js microformats parser, based on my previous JavaScript parsing code. It has been packaged so you can easily add it to your projects using npm.
Source code: https://github.com/glennjones/microformat-node
Test server: http://microformat-node.jit.su
npm install microformat-node
or
git clone http://github.com/glennjones/microformat-node.git
cd microformat-node
npm link
with URL
var shiv = require("microformat-node");
shiv.parseUrl('http://glennjones.net/about', {}, function(data){
    // do something with data
});
or with raw html
var shiv = require('microformat-node');
var html = '';
shiv.parseHtml(html, {}, function(data){
    // do something with data
});
with URL for a single format
var shiv = require("microformat-node");
shiv.parseUrl('http://glennjones.net/about', {'format': 'XFN'}, function(data){
    // do something with data
});
Currently microformat-node supports the following formats: hCard, XFN, hReview, hCalendar, hAtom, hResume, geo, adr and tag. It's important to use the right case when specifying the format query string parameter.
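Because the format names are case sensitive, it can help to normalise user input before passing it to the parser. The following is only a minimal sketch: canonicalFormat is a hypothetical helper of my own naming, not part of microformat-node, and the list of spellings is simply copied from the supported formats above.

```javascript
// Canonical spellings, copied from the list of supported formats.
var SUPPORTED_FORMATS = ['hCard', 'XFN', 'hReview', 'hCalendar', 'hAtom',
    'hResume', 'geo', 'adr', 'tag'];

// Hypothetical helper (not part of microformat-node): returns the
// canonical spelling for a format name, or null if it is not in the list.
function canonicalFormat(name) {
    var wanted = String(name).trim().toLowerCase();
    for (var i = 0; i < SUPPORTED_FORMATS.length; i++) {
        if (SUPPORTED_FORMATS[i].toLowerCase() === wanted) {
            return SUPPORTED_FORMATS[i];
        }
    }
    return null;
}

// Build an options object with the correctly cased format name.
var options = { 'format': canonicalFormat('hcard') };
```

The resulting options object could then be passed as the second argument to parseUrl or parseHtml, in the same way as the single-format example.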
The parser returns JSON. This is an example of two geo microformats found in a page.
{
    "microformats": {
        "geo": [{
            "latitude": 37.77,
            "longitude": -122.41
        }, {
            "latitude": 37.77,
            "longitude": -122.41
        }]
    },
    "parser-information": {
        "name": "Microformat Shiv",
        "version": "0.2.4",
        "page-title": "geo 1 - extracting singular and paired values test",
        "time": "-140ms",
        "page-http-status": 200,
        "page-url": "http://ufxtract.com/testsuite/geo/geo1.htm"
    }
}
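To show how a result of this shape might be consumed, here is a small sketch that pulls the coordinates out of the geo array. listGeoPoints is a hypothetical helper, not part of the module, and the sample object is a trimmed copy of the output above; I am assuming the data passed to your callback has the same structure.

```javascript
// Hypothetical helper (not part of microformat-node): collects the
// latitude/longitude pairs from a result shaped like the JSON example.
function listGeoPoints(data) {
    // Fall back to an empty list if no geo microformats were found.
    var found = (data && data.microformats && data.microformats.geo) || [];
    return found.map(function (geo) {
        return geo.latitude + ',' + geo.longitude;
    });
}

// Sample result, trimmed from the example output.
var sample = {
    'microformats': {
        'geo': [
            { 'latitude': 37.77, 'longitude': -122.41 },
            { 'latitude': 37.77, 'longitude': -122.41 }
        ]
    },
    'parser-information': { 'page-http-status': 200 }
};

var points = listGeoPoints(sample);

// Keys containing hyphens need bracket notation rather than dot notation.
var status = sample['parser-information']['page-http-status'];
```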
Start the server binary:
$ bin/microformat-node
Then visit the server URL:
http://localhost:8888/
You need to provide the URL of the web page and the format(s) you wish to parse, either as a single value or as a comma-delimited list:
GET http://localhost:8888/?url=http%3A%2F%2Fufxtract.com%2Ftestsuite%2Fhcard%2Fhcard1.htm&format=hCard
You can also use the hash (#) fragment of a URL to target only part of an HTML page. The hash is used to target the HTML element with the same id.
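The request URL has to be percent-encoded, which is easy to get wrong by hand. As a rough sketch, a helper like the one below could build it for you. buildParseRequest is a hypothetical name of mine, not something the module provides. I am also assuming that a hash fragment travels inside the encoded url parameter, and that the #contact id on the last line exists on the page, which is purely for illustration.

```javascript
// Hypothetical helper (not part of microformat-node): builds the GET URL
// for a running server from a page URL and one or more format names.
function buildParseRequest(serverUrl, pageUrl, formats) {
    // Accept either a single format string or an array of formats.
    var formatList = [].concat(formats).join(',');
    return serverUrl + '?url=' + encodeURIComponent(pageUrl) +
        '&format=' + encodeURIComponent(formatList);
}

// Single format, matching the GET example.
var single = buildParseRequest('http://localhost:8888/',
    'http://ufxtract.com/testsuite/hcard/hcard1.htm', 'hCard');

// Several formats as a comma-delimited list (the comma is encoded as %2C).
var several = buildParseRequest('http://localhost:8888/',
    'http://ufxtract.com/testsuite/hcard/hcard1.htm', ['hCard', 'geo']);

// Assumption: the fragment is sent as part of the encoded url parameter,
// so the # becomes %23 and is not dropped from the request.
var fragment = buildParseRequest('http://localhost:8888/',
    'http://glennjones.net/about#contact', 'hCard');
```

Encoding the page URL with encodeURIComponent keeps its own ?, & and # characters from being read as part of the request to the local server.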
Viewing the unit tests
The module includes a page which runs the ufxtract microformats unit test suite.
http://localhost:8888/unit-tests/
microformat-node uses a module called ‘jsdom’, which in turn uses ‘contextify’, which requires a native code build.
There are a couple of things you normally need to do to compile node code on Windows.
If you have the standard release of node, it will probably be x86 rather than x64. For x64 there is a different Visual Studio shell, but it is usually in the same place.