Monday, August 8, 2011

How to save images from img tags and CSS from stylesheet tags using Regular Expressions in Android?

Regular expressions are one of the best ways to search for a particular pattern using a special syntax. Although they look complex at first, they simplify searching, replacing, and parsing in text processing. Most editors support regex for search and replace, and regex is available in nearly every programming language. I used it for HTML parsing in an Android application, mainly to find the img tags and CSS stylesheet tags whose files I need to store in the local file system for offline use.

The following regex pattern retrieves the CSS path from the HTML page:
".*?(\"text\\/css\").*?((?:\\/[\\w\\.\\-]+)+)"
So the code snippet will look like this:
import java.util.regex.Matcher;
import java.util.regex.Pattern;

String regex = ".*?(\"text\\/css\").*?((?:\\/[\\w\\.\\-]+)+)";
Pattern p = Pattern.compile(regex);
Matcher m = p.matcher(fileName); // fileName holds the text of the document we want to parse
if (m.find())
{
    String word = m.group(1); // the matched type value including its quotes, e.g. "text/css"
    String path = m.group(2); // the relative path that follows it, e.g. /css/xyz.css
}
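The same idea can be applied to img tags. The following is only a minimal sketch: the class name ImgSrcExtractor, the method extractImgSources, and the pattern itself are my own assumptions for illustration and are not from the snippet above. It loops with find() so that every img tag on the page is collected, not just the first one, and it assumes the src value is quoted.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration; the names are assumptions.
class ImgSrcExtractor {
    // Assumed pattern: an <img ...> tag containing a quoted src attribute.
    // Group 1 captures the value between the quotes.
    private static final Pattern IMG_SRC = Pattern.compile(
            "<img\\b[^>]*?\\ssrc\\s*=\\s*[\"']([^\"']+)[\"']",
            Pattern.CASE_INSENSITIVE);

    // Returns the src value of every img tag found in the given HTML text.
    static List<String> extractImgSources(String html) {
        List<String> sources = new ArrayList<String>();
        Matcher m = IMG_SRC.matcher(html);
        while (m.find()) {            // find() moves on to the next match each call
            sources.add(m.group(1));  // the path or URL inside the quotes
        }
        return sources;
    }
}
```

Each returned path can then be resolved against the page URL and downloaded to local storage. Unquoted src values are not handled by this pattern.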
Many people suggest using parser libraries such as jsoup or NekoHTML, which are open-source, but if the requirement is just to extract a few search terms it is easier to use regex, since we don't need to include external jar files.

Happy coding........
