Data Mining


With the phenomenal growth of the Web, an ever-increasing volume of data and information is being published in Web pages. Research in Web data mining aims to develop new techniques to effectively extract and mine useful knowledge and information from these Web sources in order to provide value-added services. Because Web data is heterogeneous and lacks structure, automated discovery of targeted or unexpected knowledge and information is a challenging task. It calls for novel methods that draw from a wide range of fields spanning data mining, machine learning, natural language processing, statistics, databases, and information retrieval. The past few years have seen a rapid expansion of activity in Web data mining, which consists of Web usage mining, Web structure mining, and Web content mining. Web usage mining refers to the mining of usage logs of Web sources. Web structure mining tries to discover useful information from the structure of hyperlinks. Web content mining aims to extract and mine information and knowledge from Web page contents.

For the data mining track, we invite original, high-quality submissions addressing all aspects of Web data mining. Relevant topics include, but are not restricted to, the following:

  • Building user profiles and providing recommendations
  • Extracting structured/unstructured data from the Web
  • Integrated mining of Web content, link structure and usage data
  • Mining for business/competitive intelligence and for security
  • Mining hyperlink structures
  • Mining online opinion sources, such as customer reviews and discussion forums
  • Mining the Web to build topic hierarchies or ontologies
  • Mining the semantic Web
  • Segmenting Web pages and detecting noise
  • Web information integration
  • Web page/search results classification, clustering, etc.
  • Web page/site monitoring
  • Web usage mining and Web analytics
  • Integrating domain knowledge in Web data mining

Vice/Deputy Chairs and PC Members

Vice Chair: Shinichi Morishita (University of Tokyo)
Deputy Chair: Bing Liu (University of Illinois at Chicago)
PC Members:

  • Charu Aggarwal (IBM T. J. Watson Research Center)
  • Ricardo Baeza-Yates (University of Chile)
  • Krishna Bharat (Google)
  • Soumen Chakrabarti (Indian Institute of Technology)
  • Ming-Syan Chen (National Taiwan University)
  • Gary William Flake (Yahoo! Research Labs)
  • Johannes Gehrke (Cornell University)
  • Lee Giles (Pennsylvania State University)
  • Monika Henzinger (Google Inc and EPFL)
  • Masaru Kitsuregawa (University of Tokyo)
  • Wee Sun Lee (National University of Singapore)
  • Wei-Ying Ma (Microsoft Research)
  • Andrew McCallum (University of Massachusetts Amherst)
  • Raymond Ng (University of British Columbia)
  • Raghu Ramakrishnan (University of Wisconsin)
  • Myra Spiliopoulou (Otto-von-Guericke-University Magdeburg)
  • Ramakrishnan Srikant (IBM Almaden)
  • Jaideep Srivastava (University of Minnesota)
  • Katsumi Tanaka (Kyoto University)
  • Philip Yu (IBM T. J. Watson Research Center)