USC Viterbi - School of Engineering - Department of Computer Science CSCI 586 Database Systems Interoperability

Project Specifications (link)
Samples: presentation, report

Monday Section

Group 1

Topic: We will work with multiple datasets, such as the IMDB top 5000 movies from Kaggle, and mine plot keywords and other attributes so that we can answer questions such as:

  • Famous people associated with a movie, e.g. movies on Abraham Lincoln (Famous people dataset from DBpedia).
  • Cities/countries in the movie e.g. movies about California.
  • Sports associated with the movie e.g. basketball movies.

  • Neha Sachdev
  • Vasini Chandrasekaran
  • Karthik Anantha Prakash
  • Sparshith Nairbalige Rai

Group 2

Topic: Understand and assess the factors that impacted the “Trump effect”. The questions we are trying to answer include:

  • Which religious/ethnic group supports Trump the most?
  • What is the city-wise percentage of homeless people who voted for Trump? (Visualization)
  • What percentage of people were Facebook users but not WhatsApp users and still supported Trump?
  • What percentage of people voted for Trump's 'Lower taxes' propaganda and were in the 24-36 age group?
  • How many people from the LGBTQ community in the 2400-3200 salary range supported him, and what were their main voting issues?

  • Anand, Deepika
  • Kochhar, Sakshi
  • Ramaswamy, Srividya
  • Thuravi Prakash, Roopa

Group 3
Topic: TBD
  • Lu, Guannan
  • Nguyen, Minh, Ngoc Binh
  • Zhang, Zixuan
  • Chen, Dingyuan

Group 4
Topic: TBD
  • Kaur, Gurleen
  • Lohith, Sachin, Keshavaiah
  • Parekh, Disha Yogesh
  • Junaid, Mohammed

Group 5

Topic: We will import a massive collection of JSON documents into Elasticsearch (a powerful data management platform), which will serve as our central data repository. The data stored in Elasticsearch will be used for aggregate analysis and visualization through Kibana (a front-end visualization plugin that works in conjunction with Elasticsearch). This stack will enable us to answer a variety of questions through:

  • Trends in the data via evaluation of specific JSON keys.
  • Construction of bar graphs and charts to find insightful patterns in our data.
  • Historical analysis via timestamps.
  • Prediction of the most frequently occurring values for a JSON key via aggregate counts.
  • Plotting of metrics to enable answering of questions across the whole data set.
  • Data will come from the following domains: World Population, Education, Income, Budget, Literacy, Crime, Drugs/Health, Trading, Military, Natural Resources, Infrastructure, Currency, Wildlife, Weather.
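
The "aggregate counts" idea above can be sketched as follows. This is a minimal illustration, not the group's implementation: the field name `domain`, the aggregation name `top_values`, and the sample documents are all assumptions made for the example. The first function builds a request body in the shape of an Elasticsearch terms aggregation; the second reproduces the same counting locally so the expected result is easy to see without a running cluster.

```python
from collections import Counter


def terms_aggregation_body(field, size=3):
    """Build a search body asking for the most frequent values of `field`."""
    return {
        "size": 0,  # return no documents, only the aggregation result
        "aggs": {
            "top_values": {  # hypothetical aggregation name
                "terms": {"field": field, "size": size}
            }
        },
    }


def local_top_values(docs, field, size=3):
    """Count values of `field` locally, mimicking what a terms aggregation reports."""
    counts = Counter(doc[field] for doc in docs if field in doc)
    return counts.most_common(size)


# Hypothetical sample documents; the real collection would live in Elasticsearch.
sample_docs = [
    {"domain": "Education", "year": 2015},
    {"domain": "Crime", "year": 2015},
    {"domain": "Education", "year": 2016},
    {"domain": "Weather"},
]

body = terms_aggregation_body("domain")
top = local_top_values(sample_docs, "domain", size=1)
print(body)
print(top)
```

In a real deployment the body would be sent to the index's search endpoint, and the bucket counts would typically be charted in Kibana rather than computed by hand.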

  • Jason Christopher Tan
  • Georgios Lydakis
  • Chetan Yadav
  • Manas Mahanta

Group 6

Topic: Sentiment Analysis using Travel Ontology and Social Media Data
Goal: To design and create a travel ontology and perform sentiment analysis using datasets from Twitter, Instagram, Expedia, and Yelp. This will help evaluate popular reviews for any airline, hotel, restaurant, or travel destination/landmark.
Results: Sample results could look like the following:

  1. Query on the best landmarks, leveraging sentiments from Twitter feeds
  2. Restaurants to visit at a location, based on the sentiment of users (travelers)
  3. Best/cheapest airlines associated with the locations
  4. Best time to visit a destination, based on people's feedback

  • Atharva Kale
  • Ganmani Sekar
  • Aman Mathur
  • Puneet Koul

Group 7
Topic: TBD
  • Andrade, Juan, Francisco
  • Muvva, Upendra Sandeep
  • Naik, Chinmay
  • Yesmin Kumar, Mehul

Group 8
Topic: TBD
  • Bino Joseph
  • Nikita Gupta
  • Malvika Nagpal

Group 9

Topic: The key idea of our project is to integrate music data (releases, ratings, artists, events, albums, etc.) from several heterogeneous sources such as Spotify, MusicBrainz, SoundCloud, and iTunes. This will help provide a unified view of the music industry. The resulting ontology can then be used to answer interesting questions like:

  • Which songs have good ratings across different web sites?
  • Which are the most popular songs/albums for a particular artist?
  • Which songs/albums have won Grammy awards?
  • Are there any concerts/events in the vicinity for a specific artist?
  • Which genre has the highest number of subscribers/listeners?

  • Borikar, Shubham, Sudhir
  • Jain, Eshaan
  • Kheberi, Sarabjit Singh, Harwansh Singh
  • Aggarwal, Madhur

Wednesday Section

Group 1

Topic: Construct a system that records information (car data, reviews) about cars/automobiles and answers queries about it. The system will be able to answer queries such as:

  • Details of the cars below a certain price range.
  • The features of a particular car.
  • Review of the car.
  • Ratings.

  • Adarsh Rajanikanth
  • Malatesha Somasundar Anantha
  • Neelima Vangipuram
  • Sahil Wadhwa

Group 2

Topic: The idea is to create a federated, ontology-based tool on films adapted from novels, which can serve as a knowledge base for users. Various features are to be analyzed, and an existing ontology for each feature is to be used to build the framework.

  • The movie dataset will be extracted from DBpedia, and the book dataset will be extracted from the Book-Crossing Dataset from IIF.
  • The model will return the books related to a movie given in the query, and vice versa. Example: given the movie name "Notebook", it will extract the book with the name "Notebook".
  • Searching books by author name will return that author's books sorted by rating.
  • Most popular books read by a given age group.

  • Nisha Kapoor
  • Yash Rajkumar Kedia
  • Shruti Priya
  • Nitisha Pandey

Group 3
Topic: TBD
  • Rahul Chandhoke
  • Viswambharan Kasturi Rangarajan
  • Shruti Anand Kulkarni
  • Riya Bharat Punjab

Group 4
Topic: TBD
  • Aditya Parameshwara
  • Charita Venugopal Etta
  • Ganesh Madhav Raghupathy
  • Xiaoyang Zhang

Group 5

Topic: We are planning to work with food data, using the following datasets:

  • Restaurants on TripAdvisor
  • Open Food Facts
  • What's Cooking

All three datasets are on Kaggle. Using these datasets, we are going to answer a few queries such as:

  • Find the healthiest restaurant.
  • Find the healthiest restaurant in a particular cuisine.
  • Find the restaurant with a high hotness (spiciness) level.

  • Ahuja, Swapnil
  • Chandurkar, Rushi, Nitin
  • Mittal, Saksham
  • Porwal, Raj

Group 6

Topic: We want to create an ontology system that integrates data about art, artists, and museums, obtained either from databases (e.g. museum websites) or by crawling websites (e.g. Wikipedia pages on paintings). With this system, we can create an engine that allows users to query by terms such as artist, time period, or country; the engine could return museums that have the artist's paintings on display, or return related paintings from a time period with detailed descriptions. Our purpose in creating this ontology is to give people who are interested in art a medium through which they can look up information about a given work of art, where it is displayed, and other relevant information.

  • YuLong Pei
  • Jingjing Wang
  • Majid Ghasemi Gol
  • Dan Ma

Group 7
Topic: TBD
  • Abhilash Natraj
  • Rajni Kumari
  • Shruthi Kalkunte Narayanaswamy

Group 8

Topic: The basic idea of the project is to collect data from popular programming-language package management systems (npm for Node.js, NuGet for .NET, gem for Ruby, etc.), then build an ontology-based knowledge base on package metadata, including the package name, package usage (the problems it solves), package dependencies, and author(s), in order to discover relationships between different packages, even across different languages. With information from online source code hosting services like GitHub, we can track the life cycles of these packages to improve our knowledge base.
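
One way to picture the metadata-to-relationships step is sketched below. It is only an illustration under stated assumptions: the package names, the record layout, and the predicate names `inEcosystem` and `dependsOn` are all hypothetical, and plain tuples stand in for a real ontology store. Each package record is flattened into subject-predicate-object triples, and the dependency edges are then inverted to find packages related through a shared dependency.

```python
from collections import defaultdict

# Hypothetical package records; names and dependencies are illustrative only.
packages = [
    {"name": "web-app", "ecosystem": "npm", "depends_on": ["http-lib", "json-lib"]},
    {"name": "cli-tool", "ecosystem": "npm", "depends_on": ["json-lib"]},
    {"name": "report-gen", "ecosystem": "gem", "depends_on": ["pdf-lib"]},
]


def to_triples(records):
    """Flatten package metadata into (subject, predicate, object) triples."""
    triples = []
    for rec in records:
        triples.append((rec["name"], "inEcosystem", rec["ecosystem"]))
        for dep in rec["depends_on"]:
            triples.append((rec["name"], "dependsOn", dep))
    return triples


def dependents_by_package(triples):
    """Invert dependsOn edges: map each dependency to the packages that use it."""
    index = defaultdict(set)
    for subject, predicate, obj in triples:
        if predicate == "dependsOn":
            index[obj].add(subject)
    return index


triples = to_triples(packages)
dependents = dependents_by_package(triples)
print(sorted(dependents["json-lib"]))
```

Packages that appear together under the same key of `dependents` share a dependency, which is one simple kind of relationship such a knowledge base could surface.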

  • Dizheng Wang
  • Huiqing Dai
  • Sijie Chen
  • Tianlei Xu

Group 9
Topic: TBD
  • Rasvitha Kandur
  • Manikanta Kotthapalli
  • Karthik Chindalur Sridhara
  • Karthik Ravindra Rao