Metadata extraction and text categorization using Universal Resource Locator expansions

dc.contributor.author: Min-Yen KAN (en_US)
dc.date.accessioned: 2004-10-21T14:28:52Z (en_US)
dc.date.accessioned: 2017-01-23T06:59:54Z
dc.date.available: 2004-10-21T14:28:52Z (en_US)
dc.date.available: 2017-01-23T06:59:54Z
dc.date.issued: 2003-10-01T00:00:00Z (en_US)
dc.description.abstract: Uniform resource locators (URLs), which mark the address of a resource on the World Wide Web, are often human-readable and can indicate metadata about a resource. This paper explores the mining of URLs to yield categoric metadata about web resources via a three-phase pipeline of word segmentation, abbreviation expansion and classification. I apply this approach to the problem of subject metadata generation and quantify its performance relative to title- and document-based methods, both of which require the retrieval of the source document. (en_US)
dc.format.extent: 267515 bytes (en_US)
dc.format.mimetype: application/pdf (en_US)
dc.identifier.uri: https://dl.comp.nus.edu.sg/xmlui/handle/1900.100/1436 (en_US)
dc.language.iso: en (en_US)
dc.relation.ispartofseries: TR10/03 (en_US)
dc.title: Metadata extraction and text categorization using Universal Resource Locator expansions (en_US)
dc.type: Technical Report (en_US)
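
The abstract names a three-phase pipeline: word segmentation, abbreviation expansion and classification. The sketch below illustrates only the shape of such a pipeline and is not the report's implementation. The abbreviation table, the keyword-to-subject map, and all function names are hypothetical, and the segmentation step is simplified to splitting on non-letter characters, whereas a fuller segmenter would presumably also split concatenated words.

```python
import re
from urllib.parse import urlparse

# Hypothetical abbreviation table (an assumption for illustration only).
ABBREVIATIONS = {
    "cs": "computer science",
    "dept": "department",
    "lib": "library",
}

# Hypothetical keyword-to-subject map standing in for a trained classifier.
SUBJECT_KEYWORDS = {
    "computer": "Computing",
    "library": "Libraries",
    "physics": "Physics",
}


def segment(url):
    """Phase 1: split the URL's host and path into lowercase letter-only tokens."""
    parts = urlparse(url)
    text = parts.netloc + parts.path
    return re.findall(r"[a-z]+", text.lower())


def expand(tokens):
    """Phase 2: replace known abbreviations with their (possibly multi-word) expansions."""
    expanded = []
    for token in tokens:
        expanded.extend(ABBREVIATIONS.get(token, token).split())
    return expanded


def classify(tokens):
    """Phase 3: pick the subject whose keywords occur most often, or None if no keyword matches."""
    counts = {}
    for token in tokens:
        subject = SUBJECT_KEYWORDS.get(token)
        if subject is not None:
            counts[subject] = counts.get(subject, 0) + 1
    return max(counts, key=counts.get) if counts else None


def url_to_subject(url):
    """Run the three phases in order on a single URL."""
    return classify(expand(segment(url)))
```

For instance, url_to_subject("http://www.cs.example.edu/dept/index.html") segments the URL into tokens such as "cs" and "dept", expands "cs" to "computer science", and then matches "computer" against the hypothetical subject map. Nothing here retrieves the document itself, which is the property the abstract contrasts with title- and document-based methods.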
Files
Original bundle
Name: report.pdf
Size: 261.25 KB
Format: Adobe Portable Document Format
License bundle
Name: license.txt
Size: 1.52 KB
Format: Plain Text