Web Science 2018
Course start: 10.04.2018, 15:30-17:00, Multimedia-Hörsaal (3703 - 023)
- Responsible Professor: Prof. Dr. techn. Wolfgang Nejdl
- Assistant: Philipp Kemkes
- NOTE: Please put "[WebScienceCourse]" into the subject line when writing an email
- Lecture + Tutorial: Tuesdays 15:30 - 17:45
- Room: Multimedia-Hörsaal (3703 - 023), Appelstraße 4, 30167 Hannover
The oral exam consists of two parts:
- Detailed questions on the papers presented by the student during the course. The presentation of the papers is compulsory!
- More general questions on other papers on the same topic, and some on other topics. As a guideline, you should be able to answer the following questions:
  - What is the problem addressed in the paper?
  - What does the solution look like?
  - How is it evaluated?
Topics for Student Paper Presentation
Below are the Web Science topics that will be addressed in the course. Each student has to pick two papers on the same topic and present them to the other students in the second part of the course.
By 24.04.2018, send an email to Philipp with the following details:
- At least 2 papers on the same topic that you wish to present.
- Any time period (if one exists) during the lecture period in which you absolutely cannot present.
We will try to take the following criteria into account when assigning papers to students:
- Papers will be assigned to students on a first-come, first-served basis.
- The exact presentation date will be fixed as soon as enough topics have been assigned.
- Presentations about the same topic should take place on the same day.
- A similar number of papers per topic should be presented (as far as possible).
- Each topic should have at least one paper presented.
- We have collected hints to help you prepare a good presentation.
- You are highly encouraged to use the provided slide template for your presentation: PowerPoint / LaTeX.
List of available Topic Papers
Below are the papers to be chosen and presented, grouped by topic.
1. Fake news detection
- [Selected by Rui Tang] Popat, Kashyap, et al. "Where the truth lies: Explaining the credibility of emerging claims on the web and social media." WWW, 2017.
- [Selected by Kush Varma] Horne, Benjamin D., and Sibel Adali. "This just in: fake news packs a lot in title, uses simpler, repetitive content in text body, more similar to satire than real news." arXiv preprint arXiv:1703.09398 (2017).
- [Selected by Rui Tang] Popat, Kashyap, et al. "Credibility assessment of textual claims on the web." CIKM, 2016.
- [Selected by Kush Varma] Kumar, Srijan, Robert West, and Jure Leskovec. "Disinformation on the web: Impact, characteristics, and detection of Wikipedia hoaxes." WWW, 2016.
2. Fairness and Transparency for Big Data Analysis
- [Selected by Kabir Firoz] Tien T. Nguyen, Pik-Mai Hui, F. Maxwell Harper, Loren Terveen, Joseph A. Konstan. Exploring the Filter Bubble: The Effect of Using Recommender Systems on Content Diversity. WWW '14. [PDF]
- [Selected by Md Musa] Muhammad Bilal Zafar, Isabel Valera, Manuel Gomez Rodriguez, Krishna P. Gummadi. Fairness Beyond Disparate Treatment & Disparate Impact: Learning Classification without Disparate Mistreatment. WWW '17. [PDF]
- [Selected by Md Musa] Tolga Bolukbasi, Kai-Wei Chang, James Zou, Venkatesh Saligrama, Adam Kalai. Man is to Computer Programmer as Woman is to Homemaker? Debiasing Word Embeddings. NIPS '16. [PDF]
- [Selected by Kabir Firoz] Aylin Caliskan-Islam, Joanna J. Bryson, Arvind Narayanan. Semantics derived automatically from language corpora necessarily contain human biases. 2016. [PDF]
3. Introduction to Deep Learning
- [Selected by Jiang Xuan] Redmon, Joseph, et al. "You only look once: Unified, real-time object detection." Proceedings of the IEEE conference on computer vision and pattern recognition. 2016.
- [Selected by Jiang Xuan] Rajpurkar, Pranav, et al. "CheXNet: Radiologist-Level Pneumonia Detection on Chest X-Rays with Deep Learning." arXiv preprint arXiv:1711.05225 (2017).
- [Selected by Xue Yuan] Felbo, Bjarke, et al. "Using millions of emoji occurrences to learn any-domain representations for detecting sentiment, emotion and sarcasm." arXiv preprint arXiv:1708.00524 (2017).
- [Selected by Xue Yuan] Buolamwini, Joy, and Timnit Gebru. "Gender shades: Intersectional accuracy disparities in commercial gender classification." Conference on Fairness, Accountability and Transparency. 2018.
4. Crowdsourcing
- [Selected by Alexandra Risch] Difallah, Djellel Eddine, et al. The dynamics of micro-task crowdsourcing: The case of Amazon MTurk. WWW '15. [PDF]
- Raykar, Vikas C., et al. Learning from crowds. JMLR '10. [PDF]
- [Selected by Alexandra Risch] Kazai, Gabriella. In search of quality in crowdsourcing for search engine evaluation. ECIR '11. [PDF]
- Bernstein, Michael S., et al. Soylent: a word processor with a crowd inside. UIST '10. [PDF]
5. Accessing Web Archives
- Jure Leskovec, Jon Kleinberg, and Christos Faloutsos. Graphs over time: densification laws, shrinking diameters and possible explanations. KDD '05 [PDF]
- [Selected by Max Kaulmann] Marijn Koolen and Jaap Kamps. The Importance of Anchor Text for Ad Hoc Search Revisited. SIGIR '10 [PDF]
- Avishek Anand, Srikanta Bedathur, Klaus Berberich, and Ralf Schenkel. Index Maintenance for Time-Travel Text Search. SIGIR '12 [PDF]
- [Selected by Max Kaulmann] Liudmila Ostroumova Prokhorenkova et al. Publication Date Prediction through Reverse Engineering of the Web. WSDM '16 [PDF]
6. Semantic Text Mining
- [Selected by Nils Nommensen] Vlad Niculae, Joonsuk Park, Claire Cardie. Argument Mining with Structured SVMs and RNNs. ACL '17. [PDF]
- [Selected by Nils Nommensen] David Tsurel, Dan Pelleg, Ido Guy, Dafna Shahaf. Fun Facts: Automatic Trivia Fact Extraction from Wikipedia. WSDM '17. [PDF]
- [Selected by Max Idahl] Knowledge Base Unification via Sense Embeddings and Disambiguation [PDF]
- [Selected by Max Idahl] Knowledge Graph and Text Jointly Embedding [PDF]
7. Quality Control Mechanisms in Crowdsourcing Systems
- [Selected by Miao Zhengyuan] Rogstadius, J., Kostakos, V., Kittur, A., Smus, B., Laredo, J., & Vukovic, M. (2011). An assessment of intrinsic and extrinsic motivation on task performance in crowdsourcing markets. ICWSM, 11, 17-21
- [Selected by Miao Zhengyuan] Lilly C. Irani and M. Six Silberman. 2013. Turkopticon: interrupting worker invisibility in Amazon Mechanical Turk. In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems (CHI '13). ACM, New York, NY, USA, 611-620.
- [Selected by Clemens Pollak] Edith Law, Ming Yin, Joslin Goh, Kevin Chen, Michael A. Terry, and Krzysztof Z. Gajos. 2016. Curiosity Killed the Cat, but Makes Crowdwork Better. In Proceedings of the 2016 CHI Conference on Human Factors in Computing Systems (CHI '16). ACM, New York, NY, USA, 4098-4110.
- [Selected by Clemens Pollak] Ujwal Gadiraju and Stefan Dietze. 2017. Improving learning through achievement priming in crowdsourced information finding microtasks. In Proceedings of the Seventh International Learning Analytics & Knowledge Conference (LAK '17). ACM, New York, NY, USA, 105-114.
8. Compression
- Nieves R. Brisaboa, Antonio Fariña, Susana Ladra, and Gonzalo Navarro. 2008. Reorganizing compressed text. In Proceedings of the 31st annual international ACM SIGIR conference on Research and development in information retrieval (SIGIR '08). ACM, New York, NY, USA, 139-146.
- Hartmut Liefke and Dan Suciu. 2000. XMill: an efficient compressor for XML data. In Proceedings of the 2000 ACM SIGMOD international conference on Management of data (SIGMOD '00). ACM, New York, NY, USA, 153-164.
- Anh, V. N., & Moffat, A. (2005). Inverted index compression using word-aligned binary codes. Information Retrieval, 8(1), 151-166.
- Kazuhiko Yamamoto, Tatsuhiro Tsujikawa, and Kazuho Oku. 2017. Exploring HTTP/2 Header Compression. In Proceedings of the 12th International Conference on Future Internet Technologies (CFI '17). ACM, New York, NY, USA.
10.04.2018 - Lecture
- Fairness and Transparency for Big Data Analysis (Prof. Dr. Wolfgang Nejdl) - Slides
17.04.2018 - Lecture
- Quality Control Mechanisms in Crowdsourcing Systems (Ujwal Gadiraju) - Slides
- Semantic Text Mining (Besnik Fetahu) - Slides
24.04.2018 - Lecture
- Introduction to Deep Learning (Asmelash Teka)
08.05.2018 - Lecture
- Accessing Web Archives (Helge Holzmann)
- Fake news detection (Vinicius Woloszyn)
15.05.2018 - Lecture
- Crowdsourcing (Markus Rokicki)
29.05.2018 - Lecture
- Compression (Philipp Kemkes)