Automatic Estimation of Web Bloggers’ Age Using Regression Models

Research output: Chapter in Book/Report/Conference proceeding > Paper in conference proceeding

Abstract

In this article, we address the problem of automatically estimating the age of web users from their posts. Most studies on age identification treat the issue as a classification problem. Instead of classifying users into age categories, we investigate how well several regression algorithms perform on the task of age estimation. We evaluate a number of well-known and widely used machine learning algorithms for numerical estimation, using a set of 42 text features. The experimental results showed that the Bagging algorithm with a RepTree base learner offered the best performance, estimating web users’ age with a mean absolute error of 5.44 and a root mean squared error of approximately 7.14.
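The two error measures reported in the abstract can be illustrated with a minimal sketch. The function names and the toy ages below are hypothetical and chosen only for illustration; they are not the paper's data, features, or implementation.

```python
import math


def mean_absolute_error(true_ages, predicted_ages):
    """Average absolute difference between true and predicted ages."""
    errors = [abs(t - p) for t, p in zip(true_ages, predicted_ages)]
    return sum(errors) / len(errors)


def root_mean_squared_error(true_ages, predicted_ages):
    """Square root of the average squared difference."""
    squared = [(t - p) ** 2 for t, p in zip(true_ages, predicted_ages)]
    return math.sqrt(sum(squared) / len(squared))


# Hypothetical toy data (not taken from the paper).
true_ages = [20, 30, 40, 50]
predicted_ages = [22, 27, 45, 50]

mae = mean_absolute_error(true_ages, predicted_ages)
rmse = root_mean_squared_error(true_ages, predicted_ages)
print(f"MAE = {mae:.2f}, RMSE = {rmse:.2f}")
```

Because squaring weights large errors more heavily, the root mean squared error is never smaller than the mean absolute error on the same predictions, which is consistent with the reported values of 7.14 and 5.44.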

Details

Authors
External organisations
  • University of Patras
Research areas

Subject classification (UKÄ) – MANDATORY

  • General Language Studies and Linguistics

Keywords

Original language: English
Title of host publication: Speech and Computer
Subtitle of host publication: 17th International Conference, SPECOM 2015, Athens, Greece, September 20-24, 2015, Proceedings
Editors: Andrey Ronzhin, Rodmonga Potapova, Nikos Fakotakis
Publisher: Springer
Pages: 113-120
ISBN (electronic): 978-3-319-23132-7
Status: Published - 2015
Publication category: Research
Peer-reviewed: Yes
Externally published: Yes

Publication series

Name: Lecture Notes in Computer Science
Volume: 9319
ISSN (print): 0302-9743